diff --git a/.gitignore b/.gitignore index 2417a7f3477ee3d635fb09975cbe0473f2637031..2f15a00811101c8743f981fecb6976c7066fb941 100644 --- a/.gitignore +++ b/.gitignore @@ -2,6 +2,7 @@ __pycache__/ *.py[cod] *$py.class +.idea # C extensions *.so @@ -142,7 +143,4 @@ cython_debug/ att_advisor*.html *.xlsx operator_tuning_file*.cfg -.ipynb_checkpoints/ - -# pycharm settings -.idea \ No newline at end of file +.ipynb_checkpoints/ \ No newline at end of file diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 0000000000000000000000000000000000000000..0c8727a91869b9afe6f5c50ff759ecb5fb45988c --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "msmonitor/third_party/dynolog"] + path = msmonitor/third_party/dynolog + url = https://github.com/facebookincubator/dynolog.git diff --git a/OWNERS b/OWNERS index 2e949debf181a6e75fdb5b1e1e091ce7a39c7e69..1b8f63546de38bc966852e4e1b318ad68a1161af 100644 --- a/OWNERS +++ b/OWNERS @@ -1,7 +1,6 @@ approvers: - leo920320 - wo-wenjie -- ma-dongfang - xhahn - aerfaliang - wangchao285 @@ -11,16 +10,18 @@ approvers: - ly-qianxiao - blian - kun_8 -- binghamhuang +- uniteone reviewers: - lv-kaimeng -- litian_drinksnow -- binghamhuang - wo-wenjie - ly-qianxiao - leo920320 - sunboquan -- stby - Seanesmhxocism - TAJh -- czr9775 \ No newline at end of file +- czr9775 +- kali20gakki +- wjchuee +- chenhao_1209 +- feng123www +- uniteone \ No newline at end of file diff --git a/README.md b/README.md index dd25d20158d7a42bec57efc931d3fad5e838a73b..bb8b30fd23806352de9ceb72ebdbecd716eff67f 100644 --- a/README.md +++ b/README.md @@ -1,96 +1,79 @@ -# Change Notice +# 🚨 Important Notice -The former Ascend Training Tools has been renamed MindStudio Training Tools, the MindStudio training toolchain. The change schedule is as follows: +**1. Ascend Training Tools has been renamed MindStudio Training Tools (mstt).** -1. 2024.06.25: this repository was renamed mstt. -2. 2024.07.04: the URL changed to [https://gitee.com/ascend/mstt](https://gitee.com/ascend/mstt); the original URL still works, but the new URL is recommended. +**2. 
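The repository URL has changed to [https://gitee.com/ascend/mstt](https://gitee.com/ascend/mstt); the original URL still works (2024.07.04).** -# MindStudio Training Tools +--- -MindStudio Training Tools is the MindStudio training toolchain. For training and large-model scenarios, it provides end-to-end command-line and visual debugging and tuning tools that help users quickly improve model development efficiency. +# 🧰 MindStudio Training Tools -## End-to-End Model Training Migration Process -![Image description](debug/resources/model_training_migration_process.png) +![Build Status](https://img.shields.io/badge/build-passing-brightgreen) +![Commit Activity](https://img.shields.io/badge/commit%20activity-high-red) +![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue) -## Usage - -### [Analysis and Migration Tools](https://gitee.com/ascend/mstt/wikis/工具介绍/分析迁移工具/分析迁移工具介绍) +## [Analysis and Migration Tools](https://gitee.com/ascend/mstt/wikis/工具介绍/分析迁移工具/分析迁移工具介绍) 1. [Script analysis tool](https://gitee.com/ascend/mstt/wikis/%E5%B7%A5%E5%85%B7%E4%BB%8B%E7%BB%8D/%E5%88%86%E6%9E%90%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7/%E5%88%86%E6%9E%90%E5%B7%A5%E5%85%B7%E4%BD%BF%E7%94%A8%E6%8C%87%E5%AF%BC) - The script analysis tool provides analysis scripts that help users, before performing a migration, analyze support for operators, third-party library suites, affinity APIs, and dynamic shapes in PyTorch training scripts written for the GPU platform. + The script analysis tool helps users, before performing a migration, analyze support for operators, third-party library suites, API affinity, and dynamic shapes in PyTorch training scripts written for the GPU platform. 2. [(Recommended) Automatic migration tool](https://gitee.com/ascend/mstt/wikis/%E5%B7%A5%E5%85%B7%E4%BB%8B%E7%BB%8D/%E5%88%86%E6%9E%90%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7/%E8%87%AA%E5%8A%A8%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7%E4%BD%BF%E7%94%A8%E6%8C%87%E5%AF%BC) - Automatic migration completes model script migration simply by importing library code into the training script; it is relatively simple to use and requires the fewest changes. + The automatic migration tool completes model script migration simply by importing library code into the training script; it is simple to use and requires few changes. 3. [Script migration tool](https://gitee.com/ascend/mstt/wikis/%E5%B7%A5%E5%85%B7%E4%BB%8B%E7%BB%8D/%E5%88%86%E6%9E%90%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7/%E8%84%9A%E6%9C%AC%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7%E4%BD%BF%E7%94%A8%E6%8C%87%E5%AF%BC) - The script migration tool provides a back-end command line for migrating PyTorch scripts trained on GPU to NPU, producing a new training script for training. - -4. 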
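[Unified training-inference weight conversion tool](https://gitee.com/Ascend/mstt/wikis/%E5%B7%A5%E5%85%B7%E4%BB%8B%E7%BB%8D/%E5%88%86%E6%9E%90%E8%BF%81%E7%A7%BB%E5%B7%A5%E5%85%B7/%E8%AE%AD%E6%8E%A8%E4%B8%80%E4%BD%93%E6%9D%83%E9%87%8D%E8%BD%AC%E6%8D%A2%E5%B7%A5%E5%85%B7%E4%BD%BF%E7%94%A8%E6%8C%87%E5%AF%BC) + The script migration tool uses a back-end command line to migrate PyTorch scripts trained on GPU to NPU, producing a new training script for training. - The unified training-inference weight conversion tool converts models trained on GPU or NPU into formats supported by accelerated inference. +## [Accuracy Tools](./debug/accuracy_tools/) -### [Accuracy Tools](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools) +[MindStudio Probe (msprobe, the MindStudio accuracy debugging tool)](./debug/accuracy_tools/msprobe). -1. [api_accuracy_checker (Ascend model accuracy pre-check tool)](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/api_accuracy_checker) +## [Performance Tools](./profiler/msprof_analyze) - Scans all APIs in the user's training model on Ascend NPU, reproduces each API, and provides diagnosis and analysis of accuracy. +1. [compare_tools (performance comparison tool)](./profiler/msprof_analyze/compare_tools) -2. [ptdbg_ascend (PyTorch accuracy tool)](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/ptdbg_ascend) + Provides NPU and GPU performance breakdown, as well as comparison of operator, communication, and memory performance. - Performs API-granularity data dump, accuracy comparison, and overflow detection across the whole PyTorch network to locate accuracy problems in PyTorch training scenarios. +2. [cluster_analyse (cluster analysis tool)](./profiler/msprof_analyze/cluster_analyse) -### [Performance Tools](https://gitee.com/ascend/mstt/tree/master/profiler) + Provides cluster analysis for multi-node, multi-device setups (communication analysis based on communication domains and iteration time analysis); it currently must be used together with the cluster analysis feature of MindStudio Insight. -1. [compare_tools (performance comparison tool)](https://gitee.com/ascend/mstt/tree/master/profiler/compare_tools) +3. [advisor](./profiler/msprof_analyze/advisor) - Provides NPU and GPU performance breakdown, as well as comparison of operator, communication, and memory performance. + Analyzes PyTorch performance data collected by Ascend PyTorch Profiler or msprof and outputs performance tuning suggestions. -2. [cluster_analyse (cluster analysis tool)](https://gitee.com/ascend/mstt/tree/master/profiler/cluster_analyse) +4. [bind_core](./profiler/affinity_cpu_bind) - Provides cluster analysis for multi-node, multi-device setups (communication analysis based on communication domains and iteration time analysis); it currently must be used together with the cluster analysis feature of MindStudio Insight. + A core-binding script that binds cores in one step without intrusive changes to project code. -3. [affinity_cpu_bind (affinity CPU core-binding tool) ](https://gitee.com/ascend/mstt/tree/master/profiler/affinity_cpu_bind) +5. 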
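[msMonitor](./msmonitor) - Provides affinity-based CPU core binding to mitigate host_bound scheduling problems. + MindStudio's one-stop online monitoring tool. -### [Tensorboard](https://gitee.com/ascend/mstt/tree/master/plugins/tensorboard-plugins/tb_plugin) +## [Tensorboard](./plugins/tensorboard-plugins/tb_plugin) -Tensorboard supports the PyTorch Profiler TensorBoard NPU Plugin for visualizing NPU performance data. +Tensorboard supports the PyTorch Profiler TensorBoard NPU Plugin for visualizing NPU performance data. -Visualizes Pytorch Profiling data collected and parsed on the Ascend platform, and is also compatible with collecting, parsing, and visualizing GPU data. +Visualizes PyTorch Profiling data collected and parsed on the Ascend platform, and is also compatible with collecting, parsing, and visualizing GPU data. ## Branch Maintenance Policy -The maintenance phases of MindStudio Training Tools version branches are as follows: - -| **Status** | **Duration** | **Description** | -| ------------------- | -------- | ------------------------------------------------ | -| Planning | 1-3 months | Plan features | -| Development | 3 months | Develop features | -| Maintained | 6-12 months | Merge all resolved issues and publish releases | -| Unmaintained | 0-3 months | Merge all resolved issues; no dedicated maintainers, no releases | -| End of life (EOL) | N/A | The branch no longer accepts any changes | - -## Maintenance Status of Existing Branches - -MindStudio Training Tools branch version numbers follow this naming rule: - -The mstt repository publishes 4 releases per year, each with its own branch; taking v6.0 as an example, it corresponds to the four releases v6.0.RC1, v6.0.RC2, v6.0.RC3, and v6.0.0, each of which has a matching branch in the repository. - -| **Branch** | **Status** | **Release date** | **Next status** | **EOL date** | -| ------------- | -------- | ------------ | ------------------------ | ----------- | -| **v6.0.0** | Maintained | 2023/12/12 | Unmaintained from 2024/12/12 (expected) | | +1. The maintenance phases of MindStudio Training Tools version branches are as follows: -## Contributing + | **Status** | **Duration** | **Description** | + | ------------------- | -------- | ------------------------------------------------ | + | Planning | 1-3 months | Plan features | + | Development | 3 months | Develop features | + | Maintained | 6-12 months | Merge all resolved issues and publish releases | + | Unmaintained | 0-3 months | Merge all resolved issues; no dedicated maintainers, no releases | + | End of life (EOL) | N/A | The branch no longer accepts any changes | -1. Fork this repository -2. Create an xxx branch -3. Commit your code -4. Open a Pull Request +2. 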
MindStudio Training Tools branch version numbers follow this naming rule: -## Version Transition Notice + The mstt repository publishes 4 releases per year, each with its own branch; taking v6.0 as an example, it corresponds to the four releases v6.0.RC1, v6.0.RC2, v6.0.RC3, and v6.0.0, each of which has a matching branch in the repository. -The current pre-check and ptdbg tools are maintained until 2024/09/30 and are scheduled to be retired on that date; the related directories mstt/debug/accuracy_tools/api_accuracy_checker and mstt/debug/accuracy_tools/ptdbg_ascend will be deleted on 2024/09/30. The new versions of pre-check and ptdbg have been merged into the mstt/debug/accuracy_tools/atat directory. + | **Branch** | **Status** | **Release date** | **Next status** | **EOL date** | + | ------------- | -------- | ------------ | ------------------------ | ----------- | + | **v6.0.0** | Maintained | 2023.12.12 | Unmaintained from 2024.12.12 (expected) | | diff --git a/debug/accuracy_tools/msprobe/README.md b/debug/accuracy_tools/msprobe/README.md index 0232923c19ad649485f31dc9641589572ee68806..3fa0db1657da25a023d73f689e79a5967cee8391 100644 --- a/debug/accuracy_tools/msprobe/README.md +++ b/debug/accuracy_tools/msprobe/README.md @@ -161,13 +161,7 @@ MindSpore dynamic-graph scenario [offline pre-check](./docs/09.accuracy_checker_MindSpore. This tool is mainly for comparing configuration differences between two environments that may affect training accuracy; it is recommended to use it before accuracy comparison. -[PyTorch pre-training configuration check](./docs/31.config_checking.md) - -### 14 Weight Comparison - -The weight comparison feature works on checkpoints saved during training, computing metrics such as cosine similarity and Euclidean distance between corresponding parameters. It currently supports comparing weights across different model-parallel strategies for megatron/mindspeed under pytorch. - -[Megatron weight comparison](./docs/32.checkpoint_compare.md) +[PyTorch pre-training configuration check](./docs/31.config_check.md) ## 📑 Supplementary Materials diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.cpp b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.cpp index 15d17cab98ee4bde6f4fbefd7c3777fb04253fc3..a23f53f030c924770ca0c635bad933f0db831334 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.cpp @@ -19,18 +19,18 @@ #include #include -#include "include/ErrorCode.hpp" -#include "include/Macro.hpp" -#include "utils/FileUtils.hpp" -#include "base/ErrorInfos.hpp" -#include "DebuggerConfigFieldMap.hpp" -#include "DebuggerConfig.hpp" +#include "include/ErrorCode.h" +#include "include/Macro.h" +#include "utils/FileUtils.h" +#include "base/ErrorInfosManager.h" +#include "DebuggerConfigFieldMap.h" +#include 
"DebuggerConfig.h" namespace MindStudioDebugger { template DebuggerErrno ParseJsonBaseObj2Var(const nlohmann::json& content, const std::string& field, T& output, - bool mandatory=false) + bool mandatory = false) { nlohmann::json::const_iterator iter = content.find(field); if (iter == content.end()) { @@ -52,7 +52,8 @@ DebuggerErrno ParseJsonBaseObj2Var(const nlohmann::json& content, const std::str template DebuggerErrno ParseJsonStringAndTrans(const nlohmann::json& content, const std::string& field, - const std::map& enum2name, T& output, bool mandatory=false) { + const std::map& enum2name, T& output, bool mandatory = false) +{ DebuggerErrno ret; std::string value; @@ -66,7 +67,7 @@ DebuggerErrno ParseJsonStringAndTrans(const nlohmann::json& content, const std:: } int32_t enumId = GetEnumIdFromName(enum2name, value); - if (enumId == debuggerInvalidEnum) { + if (enumId == DEBUGGER_INVALID_ENUM) { return DebuggerErrno::ERROR_UNKNOWN_VALUE; } @@ -93,19 +94,21 @@ DebuggerErrno ParseJsonStringAndTrans(const nlohmann::json& content, const std:: static bool DebuggerCfgParseUIntRangeGetBorder(const std::string& exp, uint32_t& left, uint32_t& right) { if (std::count(exp.begin(), exp.end(), '-') != 1) { - LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, "When using a range expression, it should be formatted as \"a-b\"."); + LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, + "When using a range expression, it should be formatted as \"a-b\"."); return false; } std::istringstream iss(exp); char dash; iss >> left >> dash >> right; if (iss.fail() || dash != '-') { - LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, "When using a range expression, it should be formatted as \"a-b\"."); + LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, + "When using a range expression, it should be formatted as \"a-b\"."); return false; } if (left >= right) { LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, - "When using a range expression, the left border should be smaller than the right."); + "When using a 
range expression, the left border should be smaller than the right."); return false; } return true; @@ -135,7 +138,8 @@ void DebuggerCfgParseUIntRange(const nlohmann::json& content, const std::string& realLen++; } else if (element.is_string()) { std::string exp = element.get(); - uint32_t begin, end; + uint32_t begin; + uint32_t end; if (!DebuggerCfgParseUIntRangeGetBorder(exp, begin, end)) { LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, "Failed to parse " + name + "."); return; @@ -153,7 +157,7 @@ void DebuggerCfgParseUIntRange(const nlohmann::json& content, const std::string& constexpr uint32_t maxEleNum = 65536; if (realLen > maxEleNum) { LOG_ERROR(DebuggerErrno::ERROR_INVALID_FORMAT, - "When using a range expression in " + name + ", maximum of 65536 elements can be expressed."); + "When using a range expression in " + name + ", maximum of 65536 elements can be expressed."); return; } @@ -175,9 +179,9 @@ void CommonCfgParseTasks(const nlohmann::json& content, std::vector(content, kTask, taskName, true); + ret = ParseJsonBaseObj2Var(content, TASK, taskName, true); if (ret == DebuggerErrno::ERROR_FIELD_NOT_EXISTS) { - ret = ParseJsonBaseObj2Var>(content, kTasks, taskNameList, true); + ret = ParseJsonBaseObj2Var>(content, TASKS, taskNameList, true); } else { taskNameList.emplace_back(taskName); } @@ -188,8 +192,8 @@ void CommonCfgParseTasks(const nlohmann::json& content, std::vector& expressions) { for (auto& expression : expressions) { size_t len = expression.size(); - if (strncmp(expression.c_str(), kRegexPrefix, kRegexPrefixLen) == 0 && - strncmp(expression.c_str() + (len - kRegexSuffixLen), kRegexSuffix, kRegexSuffixLen) == 0) { - /* name-regex(xxx) denotes a regular expression*/ - regexList.emplace_back(expression.substr(kRegexPrefixLen, len - kRegexPrefixLen - kRegexSuffixLen)); + if (strncmp(expression.c_str(), REGEX_PREFIX, REGEX_PREFIX_LEN) == 0 && + strncmp(expression.c_str() + (len - REGEX_SUFFIX_LEN), REGEX_SUFFIX, REGEX_SUFFIX_LEN) == 0) { + /* name-regex(xxx) denotes a regular expression */ + 
regexList.emplace_back(expression.substr(REGEX_PREFIX_LEN, len - REGEX_PREFIX_LEN - REGEX_SUFFIX_LEN)); } else { /* otherwise treat it as a full scope name */ fullNameList.emplace_back(expression); @@ -224,7 +228,7 @@ std::vector KernelListMatcher::GenRealKernelList(const char** fullK { std::vector output; /* an empty list means dump everything; a list holding one empty string means nothing matched, so dump nothing */ - if (this->empty() || fullKernelList == nullptr) { + if (this->Empty() || fullKernelList == nullptr) { return output; } output = fullNameList; @@ -252,34 +256,38 @@ void CommonCfg::Parse(const nlohmann::json& content) return; } - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kOutputPath, outputPath); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, OUTPUT_PATH, outputPath); outputPath = FileUtils::GetAbsPath(outputPath); - DebuggerCfgParseUIntRange(content, kRank, rank); - DebuggerCfgParseUIntRange(content, kStep, step); - PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, kLevel, DebuggerLevelEnum2Name, level); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kSeed, seed); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kIsDeterministic, isDeterministic); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kEnableDataloader, enableDataloader); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kAclConfig, aclConfig); + DebuggerCfgParseUIntRange(content, RANK, rank); + DebuggerCfgParseUIntRange(content, STEP, step); + PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, LEVEL, DEBUGGER_LEVEL_ENUM_2_NAME, level); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, SEED, seed); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, IS_DETERMINISTIC, isDeterministic); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, ENABLE_DATALOADER, enableDataloader); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, ACL_CONFIG, aclConfig); } void DebuggerCfgParseDataMode(const nlohmann::json& content, DebuggerDataDirection& direction, DebuggerDataInOut& inout) { std::vector buf; - bool fw, bw, in, out, all; + bool fw; + bool bw; + bool in; + bool out; + bool all; direction = DebuggerDataDirection::DIRECTION_BOTH; inout = 
DebuggerDataInOut::INOUT_BOTH; - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kDataMode, buf); - all = static_cast(std::find(buf.begin(), buf.end(), kDataModeAll) != buf.end()); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, DATA_MODE, buf); + all = static_cast(std::find(buf.begin(), buf.end(), DATA_MODE_ALL) != buf.end()); if (buf.empty() || all) { return; } - fw = static_cast(std::find(buf.begin(), buf.end(), kDirectionForward) != buf.end()); - bw = static_cast(std::find(buf.begin(), buf.end(), kDirectionBackward) != buf.end()); - in = static_cast(std::find(buf.begin(), buf.end(), kInOutInput) != buf.end()); - out = static_cast(std::find(buf.begin(), buf.end(), kInOutOutput) != buf.end()); + fw = static_cast(std::find(buf.begin(), buf.end(), DIRECTION_FORWARD) != buf.end()); + bw = static_cast(std::find(buf.begin(), buf.end(), DIRECTION_BACKWARD) != buf.end()); + in = static_cast(std::find(buf.begin(), buf.end(), INOUT_INPUT) != buf.end()); + out = static_cast(std::find(buf.begin(), buf.end(), INOUT_OUTPUT) != buf.end()); /* configuring both or neither of a complementary pair means both, so only the case where they differ matters */ if (fw != bw) { @@ -303,18 +311,18 @@ void StatisticsCfgParseSummary(const nlohmann::json& content, std::vector modeListName; /* if the field is absent, treat it as statistic, so give mode a default value here */ - ret = ParseJsonBaseObj2Var(content, kSummaryMode, mode); + ret = ParseJsonBaseObj2Var(content, SUMMARY_MODE, mode); if (ret == DebuggerErrno::OK) { - if (mode == kStatistics) { + if (mode == STATISTICS) { summaryOption.push_back(DebuggerSummaryOption::MAX); summaryOption.push_back(DebuggerSummaryOption::MIN); summaryOption.push_back(DebuggerSummaryOption::MEAN); summaryOption.push_back(DebuggerSummaryOption::L2NORM); - } else if (mode == kMd5) { + } else if (mode == MD5) { summaryOption.push_back(DebuggerSummaryOption::MD5); } else { LOG_ERROR(DebuggerErrno::ERROR_UNKNOWN_VALUE, "Summary mode " + mode + " is unknown."); @@ -322,7 +330,7 @@ void StatisticsCfgParseSummary(const nlohmann::json& content, std::vector>(content, kSummaryMode, modeListName); + ret 
= ParseJsonBaseObj2Var>(content, SUMMARY_MODE, modeListName); if (ret != DebuggerErrno::OK) { LOG_ERROR(ret, "Value of field summary_mode should be string or list."); return; @@ -338,8 +346,8 @@ void StatisticsCfgParseSummary(const nlohmann::json& content, std::vector filter; - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kScope, scope); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kList, filter); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, SCOPE, scope); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, LIST, filter); filter.erase(std::remove_if(filter.begin(), filter.end(), [](const std::string& s) { return s.find_first_not_of(' ') == std::string::npos; }), - filter.end()); + filter.end()); list = std::move(filter); if (DebuggerConfig::GetInstance().GetDebugLevel() == DebuggerLevel::L2) { matcher.Parse(list); @@ -368,24 +376,24 @@ void StatisticsCfg::Parse(const nlohmann::json& content) void DumpTensorCfg::Parse(const nlohmann::json& content) { std::vector filter; - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kScope, scope); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kList, filter); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, SCOPE, scope); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, LIST, filter); filter.erase(std::remove_if(filter.begin(), filter.end(), [](const std::string& s) { return s.find_first_not_of(' ') == std::string::npos; }), - filter.end()); + filter.end()); list = std::move(filter); if (DebuggerConfig::GetInstance().GetDebugLevel() == DebuggerLevel::L2) { matcher.Parse(list); } DebuggerCfgParseDataMode(content, direction, inout); - PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, kFileFormat, DumpFileFormatEnum2Name, fileFormat); - PARSE_OPTIONAL_FIELD_CHECK_RET(content, kBackwardInput, backwardInput); + PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, FILE_FORMAT, DUMP_FILE_FORMAT_ENUM_2_NAME, fileFormat); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, BACKWARD_INPUT, backwardInput); } void OverflowCheckCfg::Parse(const nlohmann::json& content) { - 
PARSE_OPTIONAL_FIELD_CHECK_RET(content, kOverflowNums, overflowNums); - PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, kCheckMode, OpCheckLevelEnum2Name, checkMode); + PARSE_OPTIONAL_FIELD_CHECK_RET(content, OVERFLOW_NUMS, overflowNums); + PARSE_OPTIONAL_FIELD_TRANS_CHECK_RET(content, CHECK_MODE, OP_CHECK_LEVEL_ENUM_2_NAME, checkMode); } void DebuggerConfig::Reset() @@ -424,14 +432,14 @@ void DebuggerConfig::Parse() iter = content.find(name); \ if (iter != content.end()) { \ member = std::make_shared(); \ - member->Parse(*(iter)); \ + ((member)->Parse(*(iter))); \ } \ } \ } while (0) - PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_DUMP_STATISTICS, kTaskStatistics, statisticCfg, StatisticsCfg); - PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_DUMP_TENSOR, kTaskDumpTensor, dumpTensorCfg, DumpTensorCfg); - PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_OVERFLOW_CHECK, kTaskOverflowCheck, overflowCheckCfg, OverflowCheckCfg); + PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_DUMP_STATISTICS, TASK_STATISTICS, statisticCfg, StatisticsCfg); + PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_DUMP_TENSOR, TASK_DUMP_TENSOR, dumpTensorCfg, DumpTensorCfg); + PARSE_SUBTASK_CONFIG(DebuggerTaskType::TASK_OVERFLOW_CHECK, TASK_OVERFLOW_CHECK, overflowCheckCfg, OverflowCheckCfg); #undef PARSE_SUBTASK_CONFIG return; @@ -456,8 +464,8 @@ int32_t DebuggerConfig::LoadConfig(const std::string& framework, const std::stri return -1; } - int32_t enumId = GetEnumIdFromName(FrameworkEnum2Name, framework); - if (enumId == debuggerInvalidEnum) { + int32_t enumId = GetEnumIdFromName(FRAMEWORK_ENUM_2_NAME, framework); + if (enumId == DEBUGGER_INVALID_ENUM) { LOG_ERROR(DebuggerErrno::ERROR_UNKNOWN_VALUE, "Unknown framework " + framework + "."); return -1; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.hpp b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.h similarity index 91% rename from debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.hpp rename to 
debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.h index d56191443f8e6a7819c2bfbf402a5937bacd92ff..e9390ffe461e2586b518384ed0650fb18453b6c0 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfig.h @@ -26,11 +26,11 @@ #include #include -#include "include/Macro.hpp" +#include "include/Macro.h" namespace MindStudioDebugger { -constexpr int debuggerInvalidEnum = -1; +constexpr int DEBUGGER_INVALID_ENUM = -1; enum class DebuggerFramework { FRAMEWORK_PYTORCH, @@ -47,7 +47,7 @@ enum class DebuggerTaskType { TASK_RUN_UT, TASK_GRAD_PROBE, - TASK_BUTT = debuggerInvalidEnum, + TASK_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerDevType { @@ -55,7 +55,7 @@ enum class DebuggerDevType { DEVICE_TYPE_GPU, DEVICE_TYPE_CPU, - DEVICE_TYPE_BUTT = debuggerInvalidEnum, + DEVICE_TYPE_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerLevel { @@ -64,7 +64,7 @@ enum class DebuggerLevel { L2, MIX, - LEVEL_BUTT = debuggerInvalidEnum, + LEVEL_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerDataDirection { @@ -72,7 +72,7 @@ enum class DebuggerDataDirection { DIRECTION_BACKWARD, DIRECTION_BOTH, - DIRECTION_BUTT = debuggerInvalidEnum, + DIRECTION_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerDataInOut { @@ -80,14 +80,14 @@ enum class DebuggerDataInOut { INOUT_OUTPUT, INOUT_BOTH, - INOUT_BUTT = debuggerInvalidEnum, + INOUT_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerDumpFileFormat { FILE_FORMAT_BIN, FILE_FORMAT_NPY, - FILE_FORMAT_BUTT = debuggerInvalidEnum, + FILE_FORMAT_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerOpCheckLevel { @@ -95,7 +95,7 @@ enum class DebuggerOpCheckLevel { CHECK_LEVEL_ATOMIC, CHECK_LEVEL_ALL, - CHECK_LEVEL_BUTT = debuggerInvalidEnum, + CHECK_LEVEL_BUTT = DEBUGGER_INVALID_ENUM, }; enum class DebuggerSummaryOption { @@ -108,7 +108,7 @@ enum class DebuggerSummaryOption { POS_INF_CNT, MD5, - SUMMARY_BUTT = debuggerInvalidEnum, + SUMMARY_BUTT = 
DEBUGGER_INVALID_ENUM, }; class KernelListMatcher { @@ -119,8 +119,8 @@ public: void Parse(const std::vector& expressions); std::vector GenRealKernelList(const char** fullKernelList) const; - inline bool empty() const {return fullNameList.empty() && regexList.empty();} - inline bool needAllKernels() const {return !regexList.empty();} + inline bool Empty() const {return fullNameList.empty() && regexList.empty();} + inline bool NeedAllKernels() const {return !regexList.empty();} private: std::vector fullNameList; @@ -208,11 +208,11 @@ private: class DebuggerConfig { - public: - static DebuggerConfig& GetInstance() { - static DebuggerConfig instance_; - return instance_; + static DebuggerConfig& GetInstance() + { + static DebuggerConfig configInstance; + return configInstance; } int32_t LoadConfig(const std::string& framework, const std::string& cfgFilePath); diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.hpp b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.h similarity index 30% rename from debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.hpp rename to debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.h index 8ebef4206b42b702712edccc5b19d9611370c63b..95954ecd417275c6e38fc37f01a6f8bb18c939e4 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/DebuggerConfigFieldMap.h @@ -19,129 +19,129 @@ #include #include -#include "DebuggerConfig.hpp" +#include "DebuggerConfig.h" namespace MindStudioDebugger { -constexpr const char* kFramework = "framework"; -constexpr const char* kFrameworkPyTorch = "PyTorch"; -constexpr const char* kFrameworkMindSpore = "MindSpore"; - -constexpr const char* kTaskStatistics = "statistics"; -constexpr const char* kTaskDumpTensor = "tensor"; -constexpr const char* kTaskOverflowCheck = "overflow_check"; -constexpr const char* kFreeBenchmark = "free_benchmark"; -constexpr const char* kRunUT = "run_ut"; 
-constexpr const char* kGradProbe = "grad_probe"; - -constexpr const char* kLevel0 = "L0"; -constexpr const char* kLevel1 = "L1"; -constexpr const char* kLevel2 = "L2"; -constexpr const char* kLevelMix = "mix"; - -constexpr const char* kDirectionForward = "forward"; -constexpr const char* kDirectionBackward = "backward"; -constexpr const char* kDirectionBoth = "both"; -constexpr const char* kInOutInput = "input"; -constexpr const char* kInOutOutput = "output"; -constexpr const char* kInOutBoth = "both"; -constexpr const char* kDataModeAll = "all"; - -constexpr const char* kFreeBenchmarkHandlerCheck = "check"; -constexpr const char* kFreeBenchmarkHandlerFix = "fix"; - -constexpr const char* kDumpFileFormatBin = "bin"; -constexpr const char* kDumpFileFormatNpy = "npy"; - -constexpr const char* kOpCheckLevelAiCore = "aicore"; -constexpr const char* kOpCheckLevelAtomic = "atomic"; -constexpr const char* kOpCheckLevelAll = "all"; - -constexpr const char* kTask = "task"; -constexpr const char* kTasks = "tasks"; -constexpr const char* kOutputPath = "dump_path"; -constexpr const char* kRank = "rank"; -constexpr const char* kStep = "step"; -constexpr const char* kLevel = "level"; -constexpr const char* kSeed = "seed"; -constexpr const char* kIsDeterministic = "is_deterministic"; -constexpr const char* kEnableDataloader = "enable_dataloader"; -constexpr const char* kAclConfig = "acl_config"; - -constexpr const char* kScope = "scope"; -constexpr const char* kList = "list"; - -constexpr const char* kDataMode = "data_mode"; -constexpr const char* kSummaryMode = "summary_mode"; -constexpr const char* kFileFormat = "file_format"; -constexpr const char* kOverflowNums = "overflow_nums"; -constexpr const char* kCheckMode = "check_mode"; -constexpr const char* kBackwardInput = "backward_input"; - -constexpr const char* kStatistics = "statistics"; -constexpr const char* kMd5 = "md5"; -constexpr const char* kMax = "max"; -constexpr const char* kMin = "min"; -constexpr const char* kMean 
= "mean"; -constexpr const char* kL2Norm = "l2norm"; -constexpr const char* kNanCount = "nan count"; -constexpr const char* kNegativeInfCount = "negative inf count"; -constexpr const char* kPositiveInfCount = "positive inf count"; - -const std::map FrameworkEnum2Name = { - {static_cast(DebuggerFramework::FRAMEWORK_PYTORCH), kFrameworkPyTorch}, - {static_cast(DebuggerFramework::FRAMEWORK_MINDSPORE), kFrameworkMindSpore}, +constexpr const char* FRAMEWORK = "framework"; +constexpr const char* FRAMEWORK_PYTORCH = "PyTorch"; +constexpr const char* FRAMEWORK_MINDSPORE = "MindSpore"; + +constexpr const char* TASK_STATISTICS = "statistics"; +constexpr const char* TASK_DUMP_TENSOR = "tensor"; +constexpr const char* TASK_OVERFLOW_CHECK = "overflow_check"; +constexpr const char* TASK_FREE_BENCHMARK = "free_benchmark"; +constexpr const char* TASK_RUN_UT = "run_ut"; +constexpr const char* TASK_GRAD_PROBE = "grad_probe"; + +constexpr const char* LEVEL0 = "L0"; +constexpr const char* LEVEL1 = "L1"; +constexpr const char* LEVEL2 = "L2"; +constexpr const char* LEVEL_MIX = "mix"; + +constexpr const char* DIRECTION_FORWARD = "forward"; +constexpr const char* DIRECTION_BACKWARD = "backward"; +constexpr const char* DIRECTION_BOTH = "both"; +constexpr const char* INOUT_INPUT = "input"; +constexpr const char* INOUT_OUTPUT = "output"; +constexpr const char* INOUT_BOTH = "both"; +constexpr const char* DATA_MODE_ALL = "all"; + +constexpr const char* FREE_BENCHMARK_HANDLER_CHECK = "check"; +constexpr const char* FREE_BENCHMARK_HANDLER_FIX = "fix"; + +constexpr const char* DUMP_FILE_FORMAT_BIN = "bin"; +constexpr const char* DUMP_FILE_FORMAT_NPY = "npy"; + +constexpr const char* OP_CHECK_LEVEL_AICORE = "aicore"; +constexpr const char* OP_CHECK_LEVEL_ATOMIC = "atomic"; +constexpr const char* OP_CHECK_LEVEL_ALL = "all"; + +constexpr const char* TASK = "task"; +constexpr const char* TASKS = "tasks"; +constexpr const char* OUTPUT_PATH = "dump_path"; +constexpr const char* RANK = "rank"; 
+constexpr const char* STEP = "step"; +constexpr const char* LEVEL = "level"; +constexpr const char* SEED = "seed"; +constexpr const char* IS_DETERMINISTIC = "is_deterministic"; +constexpr const char* ENABLE_DATALOADER = "enable_dataloader"; +constexpr const char* ACL_CONFIG = "acl_config"; + +constexpr const char* SCOPE = "scope"; +constexpr const char* LIST = "list"; + +constexpr const char* DATA_MODE = "data_mode"; +constexpr const char* SUMMARY_MODE = "summary_mode"; +constexpr const char* FILE_FORMAT = "file_format"; +constexpr const char* OVERFLOW_NUMS = "overflow_nums"; +constexpr const char* CHECK_MODE = "check_mode"; +constexpr const char* BACKWARD_INPUT = "backward_input"; + +constexpr const char* STATISTICS = "statistics"; +constexpr const char* MD5 = "md5"; +constexpr const char* MAX = "max"; +constexpr const char* MIN = "min"; +constexpr const char* MEAN = "mean"; +constexpr const char* L2_NORM = "l2norm"; +constexpr const char* NAN_COUNT = "nan count"; +constexpr const char* NEGATIVE_INF_COUNT = "negative inf count"; +constexpr const char* POSITIVE_INF_COUNT = "positive inf count"; + +const std::map FRAMEWORK_ENUM_2_NAME = { + {static_cast(DebuggerFramework::FRAMEWORK_PYTORCH), FRAMEWORK_PYTORCH}, + {static_cast(DebuggerFramework::FRAMEWORK_MINDSPORE), FRAMEWORK_MINDSPORE}, }; -const std::map TaskTypeEnum2Name = { - {static_cast(DebuggerTaskType::TASK_DUMP_TENSOR), kTaskDumpTensor}, - {static_cast(DebuggerTaskType::TASK_DUMP_STATISTICS), kTaskStatistics}, - {static_cast(DebuggerTaskType::TASK_OVERFLOW_CHECK), kTaskOverflowCheck}, - {static_cast(DebuggerTaskType::TASK_FREE_BENCHMARK), kFreeBenchmark}, - {static_cast(DebuggerTaskType::TASK_RUN_UT), kRunUT}, - {static_cast(DebuggerTaskType::TASK_GRAD_PROBE), kGradProbe}, +const std::map TASK_TYPE_ENUM_2_NAME = { + {static_cast(DebuggerTaskType::TASK_DUMP_TENSOR), TASK_DUMP_TENSOR}, + {static_cast(DebuggerTaskType::TASK_DUMP_STATISTICS), TASK_STATISTICS}, + 
{static_cast(DebuggerTaskType::TASK_OVERFLOW_CHECK), TASK_OVERFLOW_CHECK}, + {static_cast(DebuggerTaskType::TASK_FREE_BENCHMARK), TASK_FREE_BENCHMARK}, + {static_cast(DebuggerTaskType::TASK_RUN_UT), TASK_RUN_UT}, + {static_cast(DebuggerTaskType::TASK_GRAD_PROBE), TASK_GRAD_PROBE}, }; -const std::map DebuggerLevelEnum2Name = { - {static_cast(DebuggerLevel::L0), kLevel0}, - {static_cast(DebuggerLevel::L1), kLevel1}, - {static_cast(DebuggerLevel::L2), kLevel2}, - {static_cast(DebuggerLevel::MIX), kLevelMix}, +const std::map DEBUGGER_LEVEL_ENUM_2_NAME = { + {static_cast(DebuggerLevel::L0), LEVEL0}, + {static_cast(DebuggerLevel::L1), LEVEL1}, + {static_cast(DebuggerLevel::L2), LEVEL2}, + {static_cast(DebuggerLevel::MIX), LEVEL_MIX}, }; -const std::map DataDirectionEnum2Name = { - {static_cast(DebuggerDataDirection::DIRECTION_FORWARD), kDirectionForward}, - {static_cast(DebuggerDataDirection::DIRECTION_BACKWARD), kDirectionBackward}, - {static_cast(DebuggerDataDirection::DIRECTION_BOTH), kDirectionBoth}, +const std::map DATA_DIRECTION_ENUM_2_NAME = { + {static_cast(DebuggerDataDirection::DIRECTION_FORWARD), DIRECTION_FORWARD}, + {static_cast(DebuggerDataDirection::DIRECTION_BACKWARD), DIRECTION_BACKWARD}, + {static_cast(DebuggerDataDirection::DIRECTION_BOTH), DIRECTION_BOTH}, }; -const std::map DataInOutEnum2Name = { - {static_cast(DebuggerDataInOut::INOUT_INPUT), kInOutInput}, - {static_cast(DebuggerDataInOut::INOUT_OUTPUT), kInOutOutput}, - {static_cast(DebuggerDataInOut::INOUT_BOTH), kInOutBoth}, +const std::map DATA_INOUT_ENUM_2_NAME = { + {static_cast(DebuggerDataInOut::INOUT_INPUT), INOUT_INPUT}, + {static_cast(DebuggerDataInOut::INOUT_OUTPUT), INOUT_OUTPUT}, + {static_cast(DebuggerDataInOut::INOUT_BOTH), INOUT_BOTH}, }; -const std::map DumpFileFormatEnum2Name = { - {static_cast(DebuggerDumpFileFormat::FILE_FORMAT_BIN), kDumpFileFormatBin}, - {static_cast(DebuggerDumpFileFormat::FILE_FORMAT_NPY), kDumpFileFormatNpy}, +const std::map DUMP_FILE_FORMAT_ENUM_2_NAME = { 
+ {static_cast(DebuggerDumpFileFormat::FILE_FORMAT_BIN), DUMP_FILE_FORMAT_BIN}, + {static_cast(DebuggerDumpFileFormat::FILE_FORMAT_NPY), DUMP_FILE_FORMAT_NPY}, }; -const std::map OpCheckLevelEnum2Name = { - {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_AICORE), kOpCheckLevelAiCore}, - {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_ATOMIC), kOpCheckLevelAtomic}, - {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_ALL), kOpCheckLevelAll}, +const std::map OP_CHECK_LEVEL_ENUM_2_NAME = { + {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_AICORE), OP_CHECK_LEVEL_AICORE}, + {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_ATOMIC), OP_CHECK_LEVEL_ATOMIC}, + {static_cast(DebuggerOpCheckLevel::CHECK_LEVEL_ALL), OP_CHECK_LEVEL_ALL}, }; -const std::map SummaryOptionEnum2Name = { - {static_cast(DebuggerSummaryOption::MAX), kMax}, - {static_cast(DebuggerSummaryOption::MIN), kMin}, - {static_cast(DebuggerSummaryOption::MEAN), kMean}, - {static_cast(DebuggerSummaryOption::NAN_CNT), kNanCount}, - {static_cast(DebuggerSummaryOption::NEG_INF_CNT), kNegativeInfCount}, - {static_cast(DebuggerSummaryOption::POS_INF_CNT), kPositiveInfCount}, - {static_cast(DebuggerSummaryOption::L2NORM), kL2Norm}, +const std::map SUMMARY_OPTION_ENUM_2_NAME = { + {static_cast(DebuggerSummaryOption::MAX), MAX}, + {static_cast(DebuggerSummaryOption::MIN), MIN}, + {static_cast(DebuggerSummaryOption::MEAN), MEAN}, + {static_cast(DebuggerSummaryOption::NAN_CNT), NAN_COUNT}, + {static_cast(DebuggerSummaryOption::NEG_INF_CNT), NEGATIVE_INF_COUNT}, + {static_cast(DebuggerSummaryOption::POS_INF_CNT), POSITIVE_INF_COUNT}, + {static_cast(DebuggerSummaryOption::L2NORM), L2_NORM}, - {static_cast(DebuggerSummaryOption::MD5), kMd5}, + {static_cast(DebuggerSummaryOption::MD5), MD5}, }; inline int32_t GetEnumIdFromName(const std::map& enum2name, const std::string& name) @@ -151,7 +151,7 @@ inline int32_t GetEnumIdFromName(const std::map& enum2name return iter->first; } } - return debuggerInvalidEnum; + return 
DEBUGGER_INVALID_ENUM; } inline std::string GetNameFromEnumId(const std::map& enum2name, int32_t id) diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/Environment.cpp b/debug/accuracy_tools/msprobe/ccsrc/base/Environment.cpp index 3a31e03cf898901767e3c658b993edc14b76e35a..cfc4c4b164ccdbae3a7cf4173d64a1b180c8b87a 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/Environment.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/Environment.cpp @@ -14,14 +14,14 @@ * limitations under the License. */ -#include "utils/CPythonUtils.hpp" -#include "DebuggerConfig.hpp" -#include "Environment.hpp" +#include "utils/CPythonUtils.h" +#include "DebuggerConfig.h" +#include "Environment.h" namespace MindStudioDebugger { namespace Environment { -static int32_t GetRankID_PT() +static int32_t GetPTRankID() { /* if torch.distributed.is_initialized(): * return torch.distributed.get_rank() @@ -48,10 +48,10 @@ static int32_t GetRankID_PT() return id; } -static int32_t GetRankID_MS() +static int32_t GetMSRankID() { - constexpr const char* kRankId = "RANK_ID"; - const char* rankIdEnv = getenv(kRankId); + constexpr const char* RANK_ID = "RANK_ID"; + const char* rankIdEnv = getenv(RANK_ID); if (rankIdEnv == nullptr) { return -1; } @@ -78,9 +78,9 @@ int32_t GetRankID() } if (DebuggerConfig::GetInstance().GetFramework() == DebuggerFramework::FRAMEWORK_PYTORCH) { - id = GetRankID_PT(); + id = GetPTRankID(); } else if (DebuggerConfig::GetInstance().GetFramework() == DebuggerFramework::FRAMEWORK_MINDSPORE) { - id = GetRankID_MS(); + id = GetMSRankID(); } return id; diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/Environment.hpp b/debug/accuracy_tools/msprobe/ccsrc/base/Environment.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/base/Environment.hpp rename to debug/accuracy_tools/msprobe/ccsrc/base/Environment.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.cpp b/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.cpp similarity index 89% 
rename from debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.cpp rename to debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.cpp index b07554a9fe10609ab4fa03357877b2f7630bd55e..755be22eac060c150aa9bdd508888ae2879a5d90 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.cpp @@ -22,13 +22,12 @@ #include #include -#include "utils/FileUtils.hpp" -#include "ErrorInfos.hpp" +#include "utils/FileUtils.h" +#include "ErrorInfosManager.h" namespace MindStudioDebugger { -static std::mutex errInfoMtx; -static std::ofstream logOfs; +static std::mutex g_errInfoMtx; DebuggerErrLevel ErrorInfosManager::topLevel = DebuggerErrLevel::LEVEL_NONE; DebuggerErrLevel ErrorInfosManager::threshold = DebuggerErrLevel::LEVEL_INFO; @@ -84,8 +83,8 @@ void ErrorInfosManager::LogErrorInfo(DebuggerErrLevel level, DebuggerErrno errId return; } - std::lock_guard lk(errInfoMtx); - std::ostream& output = logOfs.is_open() ? logOfs : std::cout; + std::lock_guard lk(g_errInfoMtx); + std::ostream& output = std::cout; output << "[" << ErrorLevelString[level] << "]"; if (errId != DebuggerErrno::NONE) { output << "[" << ErrnoString[errId] << "]"; @@ -101,26 +100,12 @@ void ErrorInfosManager::LogErrorInfo(DebuggerErrLevel level, DebuggerErrno errId DebuggerErrLevel ErrorInfosManager::GetTopErrLevelInDuration() { - std::lock_guard lk(errInfoMtx); + std::lock_guard lk(g_errInfoMtx); DebuggerErrLevel ret = topLevel; topLevel = DebuggerErrLevel::LEVEL_NONE; return ret; } -void ErrorInfosManager::SetLogPath(const std::string& path) -{ - std::lock_guard lk(errInfoMtx); - if (logOfs.is_open()) { - logOfs.close(); - } - - if (path.empty()) { - return; - } - - FileUtils::OpenFile(path, logOfs); -} - __attribute__((constructor)) void InitDebuggerThreshold() { const char* msprobeLogLevelEnv = getenv("MSPROBE_LOG_LEVEL"); diff --git a/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.hpp 
b/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.h similarity index 96% rename from debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.hpp rename to debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.h index 6c740a6a36cfd7692b793dfa7625789771731289..62d1a1e8902da59ebeef90e7c1fd2dd4ce188f21 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfos.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/base/ErrorInfosManager.h @@ -18,7 +18,7 @@ #include #include -#include "include/ErrorCode.hpp" +#include "include/ErrorCode.h" namespace MindStudioDebugger { @@ -35,14 +35,14 @@ class ErrorInfosManager { public: static void LogErrorInfo(DebuggerErrLevel level, DebuggerErrno errId, const std::string& info); static DebuggerErrLevel GetTopErrLevelInDuration(); - static void SetLogPath(const std::string& path); static void SetLogThreshold(DebuggerErrLevel t) { threshold = t; } private: static DebuggerErrLevel topLevel; static DebuggerErrLevel threshold; }; -inline void CleanErrorInfoCache() { +inline void CleanErrorInfoCache() +{ ErrorInfosManager::GetTopErrLevelInDuration(); } diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.cpp index c40d7806ca065f037d64aa4b15d0d1f12024c9f3..aa33fee61dd4c06f75c7d723491ec9e32ace812a 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.cpp @@ -25,56 +25,56 @@ #include #include -#include "include/Macro.hpp" -#include "utils/FileUtils.hpp" -#include "utils/FileOperation.hpp" -#include "utils/DataUtils.hpp" -#include "utils/MathUtils.hpp" -#include "core/AclTensor.hpp" -#include "base/ErrorInfos.hpp" +#include "include/Macro.h" +#include "utils/FileUtils.h" +#include "utils/FileOperation.h" +#include "utils/DataUtils.h" +#include "utils/MathUtils.h" +#include "core/AclTensor.h" +#include "base/ErrorInfosManager.h" #include "proto/AclDumpMsg.pb.h" 
-#include "AclDumpDataProcessor.hpp" +#include "AclDumpDataProcessor.h" namespace MindStudioDebugger { namespace AclDumpMsg = toolkit::dumpdata; -constexpr size_t kDhaAtomicAddInfoSize = 128; -constexpr size_t kL2AtomicAddInfoSize = 128; -constexpr size_t kAiCoreInfoSize = 256; -constexpr size_t kDhaAtomicAddStatusSize = 256; -constexpr size_t kL2AtomicAddStatusSize = 256; -constexpr size_t kUint64Size = sizeof(uint64_t); -constexpr const char* debugFileSign = "Opdebug.Node_OpDebug."; - -constexpr const char* kStatsHeaderInout = "Input/Output"; -constexpr const char* kStatsHeaderId = "Index"; -constexpr const char* kStatsHeaderDataSize = "Data Size"; -constexpr const char* kStatsHeaderDataType = "Data Type"; -constexpr const char* kStatsHeaderFormat = "Format"; -constexpr const char* kStatsHeaderShape = "Shape"; -constexpr const char* kStatsHeaderMax = "Max Value"; -constexpr const char* kStatsHeaderMin = "Min Value"; -constexpr const char* kStatsHeaderAvg = "Avg Value"; -constexpr const char* kStatsHeaderL2Norm = "l2norm"; -constexpr const char* kStatsHeaderL2NormInCsv = "L2Norm Value"; -constexpr const char* kStatsHeaderMD5 = "MD5 Value"; -constexpr const char* kStatsHeaderNan = "Nan Count"; -constexpr const char* kStatsHeaderNanInCsv = "NaN Count"; -constexpr const char* kStatsHeaderNegInf = "Negative Inf Count"; -constexpr const char* kStatsHeaderPosInf = "Positive Inf Count"; -constexpr const char* kRankId = "RANK_ID"; -constexpr const char* kDigitalNumbers = "0123456789"; - -static const std::map> summaryOptionHeaderStrMap = { - {DebuggerSummaryOption::MAX, {kStatsHeaderMax, kStatsHeaderMax}}, - {DebuggerSummaryOption::MIN, {kStatsHeaderMin, kStatsHeaderMin}}, - {DebuggerSummaryOption::MEAN, {kStatsHeaderAvg, kStatsHeaderAvg}}, - {DebuggerSummaryOption::L2NORM, {kStatsHeaderL2Norm, kStatsHeaderL2NormInCsv}}, - {DebuggerSummaryOption::NAN_CNT, {kStatsHeaderNan, kStatsHeaderNanInCsv}}, - {DebuggerSummaryOption::NEG_INF_CNT, {kStatsHeaderNegInf, 
kStatsHeaderNegInf}}, - {DebuggerSummaryOption::POS_INF_CNT, {kStatsHeaderPosInf, kStatsHeaderPosInf}}, - {DebuggerSummaryOption::MD5, {kStatsHeaderMD5, kStatsHeaderMD5}}, +constexpr size_t DHA_ATOMIC_ADD_INFO_SIZE = 128; +constexpr size_t L2_ATOMIC_ADD_INFO_SIZE = 128; +constexpr size_t AICORE_INFO_SIZE = 256; +constexpr size_t DHA_ATOMIC_ADD_STATUS_SIZE = 256; +constexpr size_t L2_ATOMIC_ADD_STATUS_SIZE = 256; +constexpr size_t UINT64_SIZE = sizeof(uint64_t); +constexpr const char* DEBUG_FILE_SIGN = "Opdebug.Node_OpDebug."; + +constexpr const char* STATS_HEADER_INOUT = "Input/Output"; +constexpr const char* STATS_HEADER_ID = "Index"; +constexpr const char* STATS_HEADER_DATA_SIZE = "Data Size"; +constexpr const char* STATS_HEADER_DATA_TYPE = "Data Type"; +constexpr const char* STATS_HEADER_FORMAT = "Format"; +constexpr const char* STATS_HEADER_SHAPE = "Shape"; +constexpr const char* STATS_HEADER_MAX = "Max Value"; +constexpr const char* STATS_HEADER_MIN = "Min Value"; +constexpr const char* STATS_HEADER_AVG = "Avg Value"; +constexpr const char* STATS_HEADER_L2NORM = "l2norm"; +constexpr const char* STATS_CSV_HEADER_L2NORM = "L2Norm Value"; +constexpr const char* STATS_HEADER_MD5 = "MD5 Value"; +constexpr const char* STATS_HEADER_NAN = "Nan Count"; +constexpr const char* STATS_CSV_HEADER_NAN = "NaN Count"; +constexpr const char* STATS_HEADER_NEG_INF = "Negative Inf Count"; +constexpr const char* STATS_HEADER_POS_INF = "Positive Inf Count"; +constexpr const char* RANK_ID = "RANK_ID"; +constexpr const char* DIGITAL_NUMBERS = "0123456789"; + +static const std::map> SUMMARY_OPTION_HEADER_STR_MAP = { + {DebuggerSummaryOption::MAX, {STATS_HEADER_MAX, STATS_HEADER_MAX}}, + {DebuggerSummaryOption::MIN, {STATS_HEADER_MIN, STATS_HEADER_MIN}}, + {DebuggerSummaryOption::MEAN, {STATS_HEADER_AVG, STATS_HEADER_AVG}}, + {DebuggerSummaryOption::L2NORM, {STATS_HEADER_L2NORM, STATS_CSV_HEADER_L2NORM}}, + {DebuggerSummaryOption::NAN_CNT, {STATS_HEADER_NAN, STATS_CSV_HEADER_NAN}}, + 
{DebuggerSummaryOption::NEG_INF_CNT, {STATS_HEADER_NEG_INF, STATS_HEADER_NEG_INF}}, + {DebuggerSummaryOption::POS_INF_CNT, {STATS_HEADER_POS_INF, STATS_HEADER_POS_INF}}, + {DebuggerSummaryOption::MD5, {STATS_HEADER_MD5, STATS_HEADER_MD5}}, }; const static std::map kDtypeTransMap = { @@ -91,7 +91,7 @@ public: std::string GetCsvHeader() const; std::string GetCsvValue() const; std::string GetPath() const {return path;} - bool empty() const {return stats.empty();}; + bool Empty() const {return stats.empty();}; static AclTensorStats CalTensorSummary(const AclTensorInfo& tensor, const std::vector& opt); static AclTensorStats ParseTensorSummary(const std::string& dumpPath, const std::string& input); @@ -114,13 +114,13 @@ private: void ParseInfoFromDumpPath(const std::string& dumpPath); std::string& operator[](DebuggerSummaryOption opt) { return stats[opt]; } - static constexpr const size_t bufferLen = 1024; + static constexpr const size_t BUFFER_LEN = 1024; }; void AclTensorStats::ParseInfoFromDumpPath(const std::string& dumpPath) { std::string filename; - if (FileUtils::GetFileSuffix(filename) == "csv") { + if (FileUtils::GetFileSuffix(dumpPath) == "csv") { filename = FileUtils::GetFileBaseName(dumpPath); } else { filename = FileUtils::GetFileName(dumpPath); @@ -159,7 +159,8 @@ AclTensorStats::AclTensorStats(const AclTensorInfo& tensor, const std::map& opt) +AclTensorStats AclTensorStats::CalTensorSummary(const AclTensorInfo& tensor, + const std::vector& opt) { DEBUG_FUNC_TRACE(); std::map summary; @@ -174,9 +175,9 @@ AclTensorStats AclTensorStats::CalTensorSummary(const AclTensorInfo& tensor, con static std::map ParseTensorSummaryHeaderOrder(const std::vector& segs) { std::map ret; - for (uint32_t pos = 0; pos < segs.size(); ++pos) { + for (size_t pos = 0; pos < segs.size(); ++pos) { const std::string& opt = segs[pos]; - for (auto it = summaryOptionHeaderStrMap.begin(); it != summaryOptionHeaderStrMap.end(); ++it) { + for (auto it = 
SUMMARY_OPTION_HEADER_STR_MAP.begin(); it != SUMMARY_OPTION_HEADER_STR_MAP.end(); ++it) { if (opt == it->second.first) { ret[pos] = it->first; break; @@ -188,14 +189,14 @@ static std::map ParseTensorSummaryHeaderOrder(c AclTensorStats AclTensorStats::ParseTensorSummary(const std::string& dumpPath, const std::string& input) { - constexpr const uint32_t optPosBase = 7; + constexpr const size_t optPosBase = 7; static std::map order; static uint32_t headerLen = 0; std::vector segs = FileUtils::SplitPath(input, ','); /* When statistics are computed on the device, every kernel reports its statistics in the same order, so the order only needs to be parsed once */ if (order.empty()) { - if (segs.size() <= optPosBase || segs[0] != kStatsHeaderInout) { + if (segs.size() <= optPosBase || segs[0] != STATS_HEADER_INOUT) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_FORMAT, "Summary data miss header, some data may lose."); return AclTensorStats(); } @@ -211,7 +212,7 @@ AclTensorStats AclTensorStats::ParseTensorSummary(const std::string& dumpPath, c } /* Do not parse the header row again */ - if (segs[0] == kStatsHeaderInout) { + if (segs[0] == STATS_HEADER_INOUT) { return AclTensorStats(); } @@ -236,11 +237,11 @@ std::string AclTensorStats::GetCsvHeader() const return std::string(); } std::string ret; - ret.reserve(bufferLen); + ret.reserve(BUFFER_LEN); ret.append("Op Type,Op Name,Task ID,Stream ID,Timestamp,Input/Output,Slot,Data Size,Data Type,Format,Shape"); for (auto it = stats.begin(); it != stats.end(); it++) { ret.append(","); - ret.append(summaryOptionHeaderStrMap.at(it->first).second); + ret.append(SUMMARY_OPTION_HEADER_STR_MAP.at(it->first).second); } ret.append("\n"); @@ -254,7 +255,7 @@ std::string AclTensorStats::GetCsvValue() const } std::string ret; - ret.reserve(bufferLen); + ret.reserve(BUFFER_LEN); ret.append(opType).append(",").append(opName).append(",").append(taskID).append(",").append(streamID).append(",") \ .append(timestamp).append(",").append(inout).append(",").append(slot).append(",") .append(dataSize) \
.append(",").append(dataType).append(",").append(format).append(",").append(shape); @@ -282,7 +283,7 @@ std::string AclDumpDataProcessor::ToString() const std::to_string(totalLen) + ")"; } -DebuggerErrno AclDumpDataProcessor::PushData(const acldumpChunk *chunk) +DebuggerErrno AclDumpDataProcessor::PushData(const AclDumpChunk *chunk) { DEBUG_FUNC_TRACE(); if (completed) { @@ -305,7 +306,7 @@ DebuggerErrno AclDumpDataProcessor::PushData(const acldumpChunk *chunk) } /* Guard against integer overflow (wraparound) */ - if (SIZE_MAX - len < totalLen || totalLen + len > kMaxDataLen) { + if (SIZE_MAX - len < totalLen || totalLen + len > MAX_DATA_LEN) { LOG_ERROR(DebuggerErrno::ERROR_BUFFER_OVERFLOW, ToString() + ": buffer overflow(cached size " + std::to_string(totalLen) + ", receiving size " + std::to_string(len) + ")."); errorOccurred = true; @@ -417,17 +418,17 @@ static nlohmann::json ParseOverflowInfo(const uint8_t* data) DEBUG_FUNC_TRACE(); uint32_t index = 0; nlohmann::json overflowInfo; - uint64_t modelId = DataUtils::UnpackUint64Value_Le(data); - index += kUint64Size; - uint64_t streamId = DataUtils::UnpackUint64Value_Le(data + index); - index += kUint64Size; - uint64_t taskId = DataUtils::UnpackUint64Value_Le(data + index); - index += kUint64Size; - uint64_t taskType = DataUtils::UnpackUint64Value_Le(data + index); - index += kUint64Size; - uint64_t pcStart = DataUtils::UnpackUint64Value_Le(data + index); - index += kUint64Size; - uint64_t paraBase = DataUtils::UnpackUint64Value_Le(data + index); + uint64_t modelId = DataUtils::UnpackUint64ValueLe(data); + index += UINT64_SIZE; + uint64_t streamId = DataUtils::UnpackUint64ValueLe(data + index); + index += UINT64_SIZE; + uint64_t taskId = DataUtils::UnpackUint64ValueLe(data + index); + index += UINT64_SIZE; + uint64_t taskType = DataUtils::UnpackUint64ValueLe(data + index); + index += UINT64_SIZE; + uint64_t pcStart = DataUtils::UnpackUint64ValueLe(data + index); + index += UINT64_SIZE; + uint64_t paraBase = DataUtils::UnpackUint64ValueLe(data +
index); overflowInfo["model_id"] = modelId; overflowInfo["stream_id"] = streamId; @@ -443,30 +444,30 @@ static DebuggerErrno DumpOpDebugDataToDisk(const std::string& dumpPath, AclDumpM { DEBUG_FUNC_TRACE(); std::string outPath = dumpPath + ".output."; - uint32_t num = dumpData.output().size(); + uint32_t num = static_cast(dumpData.output().size()); for (uint32_t slot = 0; slot < num; slot++) { uint32_t offset = 0; // parse DHA Atomic Add info nlohmann::json dhaAtomicAddInfo = ParseOverflowInfo(data + offset); - offset += kDhaAtomicAddInfoSize; + offset += DHA_ATOMIC_ADD_INFO_SIZE; // parse L2 Atomic Add info nlohmann::json l2AtomicAddInfo = ParseOverflowInfo(data + offset); - offset += kL2AtomicAddInfoSize; + offset += L2_ATOMIC_ADD_INFO_SIZE; // parse AICore info nlohmann::json aiCoreInfo = ParseOverflowInfo(data + offset); - offset += kAiCoreInfoSize; + offset += AICORE_INFO_SIZE; // parse DHA Atomic Add status - dhaAtomicAddInfo["status"] = DataUtils::UnpackUint64Value_Le(data + offset); - offset += kDhaAtomicAddStatusSize; + dhaAtomicAddInfo["status"] = DataUtils::UnpackUint64ValueLe(data + offset); + offset += DHA_ATOMIC_ADD_STATUS_SIZE; // parse L2 Atomic Add status - l2AtomicAddInfo["status"] = DataUtils::UnpackUint64Value_Le(data + offset); - offset += kL2AtomicAddStatusSize; + l2AtomicAddInfo["status"] = DataUtils::UnpackUint64ValueLe(data + offset); + offset += L2_ATOMIC_ADD_STATUS_SIZE; // parse AICore status - uint64_t kernelCode = DataUtils::UnpackUint64Value_Le(data + offset); - offset += kUint64Size; - uint64_t blockIdx = DataUtils::UnpackUint64Value_Le(data + offset); - offset += kUint64Size; - uint64_t status = DataUtils::UnpackUint64Value_Le(data + offset); + uint64_t kernelCode = DataUtils::UnpackUint64ValueLe(data + offset); + offset += UINT64_SIZE; + uint64_t blockIdx = DataUtils::UnpackUint64ValueLe(data + offset); + offset += UINT64_SIZE; + uint64_t status = DataUtils::UnpackUint64ValueLe(data + offset); aiCoreInfo["kernel_code"] = 
DataUtils::U64ToHexString(kernelCode); aiCoreInfo["block_idx"] = blockIdx; aiCoreInfo["status"] = status; @@ -542,8 +543,7 @@ static std::string MappingFilePath(const std::string& originPath) return std::string(); } - DebuggerErrno ret; - ret = FileUtils::CreateDir(dir); + DebuggerErrno ret = FileUtils::CreateDir(dir); if (ret != DebuggerErrno::OK) { LOG_ERROR(DebuggerErrno::ERROR, "Failed to create directory " + dir + "."); return std::string(); @@ -586,7 +586,8 @@ static DebuggerErrno StandardizedDumpPath(std::string& originPath) return DebuggerErrno::OK; } -static std::string GenDataPath(const std::string& path) { +static std::string GenDataPath(const std::string& path) +{ LOG_DEBUG("Original acl data path is " + path); std::string outputPath = DebuggerConfig::GetInstance().GetOutputPath(); std::string dataPath; @@ -608,7 +609,8 @@ static std::string GenDataPath(const std::string& path) { } /* * The ACL interface returns data paths in the following format - * {dump_path}/rank_{rank_id}/{time stamp}/step_{step_id}/{time}/{device_id}/{model_name}/{model_id}/{iteration_id}/{data name} + * {dump_path}/rank_{rank_id}/{time stamp}/step_{step_id}/{time} + /{device_id}/{model_name}/{model_id}/{iteration_id}/{data name} * items[0] is rank_{rank_id} * items[1] is {time stamp} * items[2] is step_{step_id} @@ -668,15 +670,15 @@ static DebuggerErrno DumpOneAclTensorFmtNpy(AclTensorInfo& tensor) AclDtype dstDtype = it->second; ret = AclTensor::TransDtype(tensor, dstDtype); if (ret != DebuggerErrno::OK) { - LOG_ERROR(ret, tensor + ": Failed to transform dtype from " + DataUtils::GetDTypeString(it->first) + " to " + - DataUtils::GetDTypeString(it->second)+ "."); + LOG_ERROR(ret, tensor + ": Failed to transform dtype from " + + DataUtils::GetDTypeString(it->first) + " to " + + DataUtils::GetDTypeString(it->second)+ "."); return ret; } } // dump_path: dump_dir/op_type.op_name.task_id.stream_id.timestamp std::string dumpPathSlot = tensor.dumpPath + GetTensorInfoSuffix(tensor) + "."
+ NPY_SUFFIX; - if (StandardizedDumpPath(dumpPathSlot) != DebuggerErrno::OK) { LOG_ERROR(DebuggerErrno::ERROR, "Failed to standardize path " + dumpPathSlot + "."); return DebuggerErrno::ERROR; @@ -702,7 +704,7 @@ static DebuggerErrno DumpOneAclTensorFmtNpy(AclTensorInfo& tensor) static DebuggerErrno WriteOneTensorStatToDisk(const AclTensorStats& stat) { DEBUG_FUNC_TRACE(); - if (stat.empty()) { + if (stat.Empty()) { return DebuggerErrno::OK; } @@ -903,7 +905,7 @@ DebuggerErrno AclDumpDataProcessor::DumpToDisk() const std::string dataPath = GenDataPath(dumpPath); DebuggerErrno ret; - if (FileUtils::GetFileName(dumpPath).find(debugFileSign) == 0 && + if (FileUtils::GetFileName(dumpPath).find(DEBUG_FILE_SIGN) == 0 && DebuggerConfig::GetInstance().GetOverflowCheckCfg() != nullptr) { ret = DumpOpDebugDataToDisk(dataPath, dumpData, msg + dataSegOffset, dataSegLen); } else if (DebuggerConfig::GetInstance().GetStatisticsCfg() != nullptr && diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.h similarity index 82% rename from debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.h index 4ce2ab6e8c8709437791aba9699ec76184cb6761..227f0f45dc3fb621cb687c5199f14576d9a1699e 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumpDataProcessor.h @@ -20,23 +20,23 @@ #include #include -#include "include/ErrorCode.hpp" -#include "base/DebuggerConfig.hpp" -#include "third_party/ACL/AclApi.hpp" +#include "include/ErrorCode.h" +#include "base/DebuggerConfig.h" +#include "third_party/ACL/AclApi.h" namespace MindStudioDebugger { -constexpr size_t kMaxDataLen = 4ULL * 1024 * 1024 * 1024; +constexpr size_t MAX_DATA_LEN = 4ULL * 1024 * 1024 * 1024; class AclDumpDataProcessor { public: - AclDumpDataProcessor(const std::string& path, const std::vector& 
opts) : - dumpPath{path}, hostAnalysisOpts{opts} {}; + AclDumpDataProcessor(const std::string& path, const std::vector& opts) + : dumpPath{path}, hostAnalysisOpts{opts} {}; ~AclDumpDataProcessor(); bool IsCompleted() const {return completed;} bool ErrorOccurred() const {return errorOccurred;} - DebuggerErrno PushData(const acldumpChunk *chunk); + DebuggerErrno PushData(const AclDumpChunk *chunk); DebuggerErrno DumpToDisk(); std::string ToString() const; diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.cpp index 805a6a7a0a24bb1fee1472698511d53beb7a35a6..7c103e42b8a8177f95c4936b94ff1192bb5cf696 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.cpp @@ -19,51 +19,51 @@ #include #include -#include "include/Macro.hpp" -#include "utils/FileUtils.hpp" -#include "utils/FileOperation.hpp" -#include "third_party/ACL/AclApi.hpp" -#include "base/Environment.hpp" -#include "base/ErrorInfos.hpp" -#include "AclDumper.hpp" +#include "include/Macro.h" +#include "utils/FileUtils.h" +#include "utils/FileOperation.h" +#include "third_party/ACL/AclApi.h" +#include "base/Environment.h" +#include "base/ErrorInfosManager.h" +#include "AclDumper.h" namespace MindStudioDebugger { -constexpr const char* kAclDumpScene = "dump_scene"; -constexpr const char* kSceneNormal = "normal"; -constexpr const char* kSceneException ="lite_exception"; +constexpr const char* ACL_DUMP_SCENE = "dump_scene"; +constexpr const char* SCENE_NORMAL = "normal"; +constexpr const char* SCENE_EXCEPTION = "lite_exception"; -constexpr const char* kAclDumpPath = "dump_path"; -constexpr const char* kAclDumpStep = "dump_step"; +constexpr const char* ACL_DUMP_PATH = "dump_path"; +constexpr const char* ACL_DUMP_STEP = "dump_step"; -constexpr const char* kAclDumpList = "dump_list"; -constexpr const char* kAclDumpLayer = "layer"; -constexpr const char* kAclDumpModel = "model_name"; 
+constexpr const char* ACL_DUMP_LIST = "dump_list"; +constexpr const char* ACL_DUMP_LAYER = "layer"; +constexpr const char* ACL_DUMP_MODEL_NAME = "model_name"; -constexpr const char* kAclDumpMode = "dump_mode"; -constexpr const char* kAclModeInput = "input"; -constexpr const char* kAclModeOutput = "output"; -constexpr const char* kAclModeAll = "all"; +constexpr const char* ACL_DUMP_MODE = "dump_mode"; +constexpr const char* ACL_MODE_INPUT = "input"; +constexpr const char* ACL_MODE_OUTPUT = "output"; +constexpr const char* ACL_MODE_ALL = "all"; -constexpr const char* kAclDumpOpSwitch = "dump_op_switch"; -constexpr const char* kAclDumpDebug = "dump_debug"; -constexpr const char* kAclSwitchOn = "on"; -constexpr const char* kAclSwitchOff = "off"; +constexpr const char* DUMP_OP_SWITCH = "dump_op_switch"; +constexpr const char* ACL_DUMP_DEBUG = "dump_debug"; +constexpr const char* ACL_SWITCH_ON = "on"; +constexpr const char* ACL_SWITCH_OFF = "off"; -constexpr const char* kAclDumpData = "dump_data"; -constexpr const char* kAclDumpTensor = "tensor"; -constexpr const char* kAclDumpStats = "stats"; +constexpr const char* ACL_DUMP_DATA = "dump_data"; +constexpr const char* ACL_DUMP_TENSOR = "tensor"; +constexpr const char* ACL_DUMP_STATS = "stats"; -constexpr const char* kAclDumpStatsOpt = "dump_stats"; -constexpr const char* kAclDumpStatsMax = "Max"; -constexpr const char* kAclDumpStatsMin = "Min"; -constexpr const char* kAclDumpStatsAvg = "Avg"; -constexpr const char* kAclDumpStatsNorn = "L2norm"; -constexpr const char* kAclDumpStatsNan = "Nan"; -constexpr const char* kAclDumpStatsNegInf = "Negative Inf"; -constexpr const char* kAclDumpStatsPosInf = "Positive Inf"; +constexpr const char* ACL_DUMP_STATS_OPT = "dump_stats"; +constexpr const char* ACL_DUMP_STATS_MAX = "Max"; +constexpr const char* ACL_DUMP_STATS_MIN = "Min"; +constexpr const char* ACL_DUMP_STATS_AVG = "Avg"; +constexpr const char* ACL_DUMP_STATS_NORM = "L2norm"; +constexpr const char* ACL_DUMP_STATS_NAN = 
"Nan"; +constexpr const char* ACL_DUMP_STATS_NEG_INF = "Negative Inf"; +constexpr const char* ACL_DUMP_STATS_POS_INF = "Positive Inf"; -constexpr const size_t kProcessorNumMax = 100; +constexpr const size_t PROCESSOR_NUM_MAX = 100; inline std::string GenAclJsonPath(const std::string& dumpPath, uint32_t rank) { @@ -74,14 +74,14 @@ inline std::string GenAclJsonPath(const std::string& dumpPath, uint32_t rank) static std::string GenDumpInoutString(DebuggerDataInOut mode) { static std::map dumpModeMap = { - {DebuggerDataInOut::INOUT_INPUT, kAclModeInput}, - {DebuggerDataInOut::INOUT_OUTPUT, kAclModeOutput}, - {DebuggerDataInOut::INOUT_BOTH, kAclModeAll}, + {DebuggerDataInOut::INOUT_INPUT, ACL_MODE_INPUT}, + {DebuggerDataInOut::INOUT_OUTPUT, ACL_MODE_OUTPUT}, + {DebuggerDataInOut::INOUT_BOTH, ACL_MODE_ALL}, }; auto it = dumpModeMap.find(mode); if (it == dumpModeMap.end()) { - return kAclModeAll; + return ACL_MODE_ALL; } else { return it->second; } @@ -90,13 +90,13 @@ static std::string GenDumpInoutString(DebuggerDataInOut mode) static std::vector GenStatsOptions(const std::vector& options) { static std::map summaryOptMap = { - {DebuggerSummaryOption::MAX, kAclDumpStatsMax}, - {DebuggerSummaryOption::MIN, kAclDumpStatsMin}, - {DebuggerSummaryOption::MEAN, kAclDumpStatsAvg}, - {DebuggerSummaryOption::L2NORM, kAclDumpStatsNorn}, - {DebuggerSummaryOption::NAN_CNT, kAclDumpStatsNan}, - {DebuggerSummaryOption::NEG_INF_CNT, kAclDumpStatsNegInf}, - {DebuggerSummaryOption::POS_INF_CNT, kAclDumpStatsPosInf}, + {DebuggerSummaryOption::MAX, ACL_DUMP_STATS_MAX}, + {DebuggerSummaryOption::MIN, ACL_DUMP_STATS_MIN}, + {DebuggerSummaryOption::MEAN, ACL_DUMP_STATS_AVG}, + {DebuggerSummaryOption::L2NORM, ACL_DUMP_STATS_NORM}, + {DebuggerSummaryOption::NAN_CNT, ACL_DUMP_STATS_NAN}, + {DebuggerSummaryOption::NEG_INF_CNT, ACL_DUMP_STATS_NEG_INF}, + {DebuggerSummaryOption::POS_INF_CNT, ACL_DUMP_STATS_POS_INF}, }; std::vector output; @@ -156,7 +156,7 @@ bool AclDumper::IsOverflowCompleted() 
return overflowNums != -1 && realOverflowNums > overflowNums; } -void AclDumper::CountOverflowNumbers(const acldumpChunk* chunk) +void AclDumper::CountOverflowNumbers(const AclDumpChunk* chunk) { if (IsOverflowCompleted() || !isOverflowDump || !chunk->isLastChunk) { return; @@ -194,19 +194,19 @@ DebuggerErrno AclDumper::AclDumpGenTensorJson(std::shared_ptrinout); - aclDumpJson[kAclDumpData] = kAclDumpTensor; - aclDumpJson[kAclDumpList] = nlohmann::json::array(); - aclDumpJson[kAclDumpOpSwitch] = kAclSwitchOn; + aclDumpJson[ACL_DUMP_PATH] = fullDumpPath; + aclDumpJson[ACL_DUMP_MODE] = GenDumpInoutString(dumpTensorCfg->inout); + aclDumpJson[ACL_DUMP_DATA] = ACL_DUMP_TENSOR; + aclDumpJson[ACL_DUMP_LIST] = nlohmann::json::array(); + aclDumpJson[DUMP_OP_SWITCH] = ACL_SWITCH_ON; if (!needDump) { /* This follows the MindSpore framework's approach: a large value, 0x7FFFFFFF, indicates that no dump is needed. The approach is rather odd and may be worth optimizing later */ - aclDumpJson[kAclDumpStep] = std::to_string(INT_MAX); + aclDumpJson[ACL_DUMP_STEP] = std::to_string(INT_MAX); } else { std::vector kernelsList = dumpTensorCfg->matcher.GenRealKernelList(kernels); if (!kernelsList.empty()) { - aclDumpJson[kAclDumpList].push_back({{kAclDumpLayer, kernelsList}}); + aclDumpJson[ACL_DUMP_LIST].push_back({{ACL_DUMP_LAYER, kernelsList}}); } } @@ -230,25 +230,26 @@ DebuggerErrno AclDumper::AclDumpGenStatJson(std::shared_ptr fullDumpPath = dumpPath; } - aclDumpJson[kAclDumpPath] = fullDumpPath; - aclDumpJson[kAclDumpMode] = GenDumpInoutString(statisticsCfg->inout); - aclDumpJson[kAclDumpList] = nlohmann::json::array(); - aclDumpJson[kAclDumpOpSwitch] = kAclSwitchOn; + aclDumpJson[ACL_DUMP_PATH] = fullDumpPath; + aclDumpJson[ACL_DUMP_MODE] = GenDumpInoutString(statisticsCfg->inout); + aclDumpJson[ACL_DUMP_LIST] = nlohmann::json::array(); + aclDumpJson[DUMP_OP_SWITCH] = ACL_SWITCH_ON; /* If host-side analysis is required, the task issued to ACL is still a tensor dump; the data is then converted to statistics on the host */ if (!hostAnalysisOpt.empty()) { - aclDumpJson[kAclDumpData] = kAclDumpTensor; + aclDumpJson[ACL_DUMP_DATA] = ACL_DUMP_TENSOR; } else { -
aclDumpJson[kAclDumpData] = kAclDumpStats; - aclDumpJson[kAclDumpStatsOpt] = GenStatsOptions(statisticsCfg->summaryOption); + aclDumpJson[ACL_DUMP_DATA] = ACL_DUMP_STATS; + aclDumpJson[ACL_DUMP_STATS_OPT] = GenStatsOptions(statisticsCfg->summaryOption); } if (!needDump) { - aclDumpJson[kAclDumpStep] = std::to_string(INT_MAX); + aclDumpJson[ACL_DUMP_STEP] = std::to_string(INT_MAX); } else { std::vector kernelsList = statisticsCfg->matcher.GenRealKernelList(kernels); - if (!kernelsList.empty()){ - aclDumpJson[kAclDumpList].push_back({{kAclDumpLayer, kernelsList}}); + if (!kernelsList.empty()) + { + aclDumpJson[ACL_DUMP_LIST].push_back({{ACL_DUMP_LAYER, kernelsList}}); } } @@ -277,10 +278,10 @@ DebuggerErrno AclDumper::AclDumpGenOverflowJson(std::shared_ptrfileName); auto it = dataProcessors.find(dumpPath); if (it == dataProcessors.end()) { - if (dataProcessors.size() > kProcessorNumMax) { + if (dataProcessors.size() > PROCESSOR_NUM_MAX) { LOG_ERROR(DebuggerErrno::ERROR_BUFFER_OVERFLOW, "The number of processors has reached the upper limit."); return; } @@ -429,7 +430,7 @@ void AclDumper::SetDump(uint32_t rank, uint32_t curStep, ExtArgs& args) if (!initialized) { ret = Initialize(); - if(ret != DebuggerErrno::OK) { + if (ret != DebuggerErrno::OK) { LOG_ERROR(ret, "AclDumper initialization failed."); return; } @@ -458,8 +459,7 @@ void AclDumper::SetDump(uint32_t rank, uint32_t curStep, ExtArgs& args) return; } - aclError aclRet; - aclRet = CALL_ACL_API(aclmdlInitDump); + aclError aclRet = CALL_ACL_API(AclmdlInitDump); if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to init acldump(" + std::to_string(aclRet) + ")."); @@ -467,7 +467,7 @@ void AclDumper::SetDump(uint32_t rank, uint32_t curStep, ExtArgs& args) } const std::string& dumpPath = DebuggerConfig::GetInstance().GetOutputPath(); - aclRet = CALL_ACL_API(aclmdlSetDump, GenAclJsonPath(dumpPath, rank).c_str()); + aclRet = CALL_ACL_API(AclmdlSetDump, GenAclJsonPath(dumpPath, 
rank).c_str()); if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to enable acldump(" + std::to_string(aclRet) + ")."); @@ -485,51 +485,53 @@ void AclDumper::FinalizeDump(ExtArgs& args) return; } - CALL_ACL_API(aclrtSynchronizeDevice); - aclError aclRet = CALL_ACL_API(aclmdlFinalizeDump); + CALL_ACL_API(AclrtSynchronizeDevice); + aclError aclRet = CALL_ACL_API(AclmdlFinalizeDump); if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to finalize acldump(" + std::to_string(aclRet) + ")."); - } aclDumpHasSet = false; } -void KernelInitDump() { - if (AscendCLApi::LoadAclApi() != DebuggerErrno::OK) { - return; - } +void KernelInitDump() +{ + if (AscendCLApi::LoadAclApi() != DebuggerErrno::OK) { + return; + } - DebuggerErrno ret = InitAcl(); - if (ret != DebuggerErrno::OK) { - LOG_ERROR(ret, "Failed to call InitAcl."); - return; - } - auto aclRet = CALL_ACL_API(aclmdlInitDump); - if (aclRet != ACL_SUCCESS) { + DebuggerErrno ret = InitAcl(); + if (ret != DebuggerErrno::OK) { + LOG_ERROR(ret, "Failed to call InitAcl."); + return; + } + auto aclRet = CALL_ACL_API(AclmdlInitDump); + if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to init acldump(" + std::to_string(aclRet) + ")."); return; - } + } } -void KernelSetDump(const std::string &filePath) { - std::string dumpPath = FileUtils::GetAbsPath(filePath); - auto aclRet = CALL_ACL_API(aclmdlSetDump, dumpPath.c_str()); - if (aclRet != ACL_SUCCESS) { +void KernelSetDump(const std::string &filePath) +{ + std::string dumpPath = FileUtils::GetAbsPath(filePath); + auto aclRet = CALL_ACL_API(AclmdlSetDump, dumpPath.c_str()); + if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to enable acldump(" + std::to_string(aclRet) + ")."); return; - } + } } -void KernelFinalizeDump() { - CALL_ACL_API(aclrtSynchronizeDevice); - auto aclRet = CALL_ACL_API(aclmdlFinalizeDump); - if (aclRet != 
ACL_SUCCESS) { +void KernelFinalizeDump() +{ + CALL_ACL_API(AclrtSynchronizeDevice); + auto aclRet = CALL_ACL_API(AclmdlFinalizeDump); + if (aclRet != ACL_SUCCESS) { LOG_ERROR(DebuggerErrno::ERROR_EXTERNAL_API_ERROR, "Failed to finalize acldump(" + std::to_string(aclRet) + ")."); - } + } } } \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.h similarity index 87% rename from debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.h index 6985df65e166101c08501e5e206e003bda494b9a..b4316b18418acf343987f5443d230ea4039c1612 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/AclDumper.h @@ -21,17 +21,18 @@ #include #include -#include "include/ExtArgs.hpp" -#include "base/DebuggerConfig.hpp" -#include "AclDumpDataProcessor.hpp" +#include "include/ExtArgs.h" +#include "base/DebuggerConfig.h" +#include "AclDumpDataProcessor.h" namespace MindStudioDebugger { class AclDumper { public: - static AclDumper& GetInstance() { - static AclDumper instance_; - return instance_; + static AclDumper& GetInstance() + { + static AclDumper dumperInstance; + return dumperInstance; } static bool IsIterNeedDump(uint32_t iterId); @@ -39,7 +40,7 @@ public: void SetDump(uint32_t rank, uint32_t curStep, ExtArgs& args); void FinalizeDump(ExtArgs& args); - void OnAclDumpCallBack(const acldumpChunk* chunk, int32_t len); + void OnAclDumpCallBack(const AclDumpChunk* chunk, int32_t len); std::string GetDumpPath(uint32_t curStep) const; @@ -58,7 +59,7 @@ private: uint32_t curStep, const char** kernels); DebuggerErrno AclDumpGenOverflowJson(std::shared_ptr overflowCfg, uint32_t rank, uint32_t curStep); - void CountOverflowNumbers(const acldumpChunk* chunk); + void CountOverflowNumbers(const AclDumpChunk* chunk); bool IsOverflowCompleted(); bool initialized{false}; diff --git 
a/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.cpp index e2fc4a62da8e24b5a0ae1fd272c0fabf2c8022de..2ff83ee8d8b5bbcc9f7cd486dfa41b2e1c756480 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.cpp @@ -22,10 +22,10 @@ #include #include -#include "utils/DataUtils.hpp" -#include "utils/MathUtils.hpp" -#include "base/ErrorInfos.hpp" -#include "AclTensor.hpp" +#include "utils/DataUtils.h" +#include "utils/MathUtils.h" +#include "base/ErrorInfosManager.h" +#include "AclTensor.h" namespace MindStudioDebugger { namespace AclDumpMsg = toolkit::dumpdata; @@ -33,21 +33,21 @@ namespace AclTensor { using namespace MathUtils; -constexpr int64_t kCubeSize = 16; -constexpr int64_t kCube16 = kCubeSize; -constexpr int64_t kCube32 = 32; -constexpr int64_t kCube64 = 64; -constexpr int64_t kCubeSize_C04 = 4; - -constexpr size_t hwH = 1; -constexpr size_t hwW = 2; -constexpr size_t fnzW1 = 4; -constexpr size_t fnzH1 = 3; -constexpr size_t fnzH0 = 2; -constexpr size_t fnzW0 = 1; -constexpr size_t fzN0 = 1; -constexpr size_t fzNi = 2; -constexpr size_t fzC0 = 3; +constexpr int64_t CUBE_SIZE = 16; +constexpr int64_t CUBE_16 = CUBE_SIZE; +constexpr int64_t CUBE_32 = 32; +constexpr int64_t CUBE_64 = 64; +constexpr int64_t CUBE_SIZE_C04 = 4; + +constexpr size_t HW_H = 1; +constexpr size_t HW_W = 2; +constexpr size_t FNZ_W1 = 4; +constexpr size_t FNZ_H1 = 3; +constexpr size_t FNZ_H0 = 2; +constexpr size_t FNZ_W0 = 1; +constexpr size_t FZ_N0 = 1; +constexpr size_t FZ_NI = 2; +constexpr size_t FZ_C0 = 3; using TensorTransFunc = DebuggerErrno (*)(AclTensorInfo &); @@ -94,21 +94,20 @@ const static std::unordered_set kSupportedFormat = { AclFormat::FORMAT_DHWNC, AclFormat::FORMAT_NDC1HWC0, AclFormat::FORMAT_FRACTAL_Z_3D, - AclFormat::FORMAT_C1HWNCoC0, + AclFormat::FORMAT_C1HWNCOC0, AclFormat::FORMAT_FRACTAL_NZ, AclFormat::FORMAT_FRACTAL_ZN_LSTM, AclFormat::FORMAT_NCL, 
}; const static std::map, TensorTransFunc> formatTransFuncMap = { - /* {{from, to}, function} */ {{AclFormat::FORMAT_HWCN, AclFormat::FORMAT_NCHW}, nullptr}, {{AclFormat::FORMAT_NHWC, AclFormat::FORMAT_NCHW}, nullptr}, {{AclFormat::FORMAT_FRACTAL_Z, AclFormat::FORMAT_NCHW}, FRAC_Z_TO_NCHW}, {{AclFormat::FORMAT_FRACTAL_NZ, AclFormat::FORMAT_NCHW}, FRAC_NZ_TO_NCHW}, {{AclFormat::FORMAT_NC1HWC0, AclFormat::FORMAT_NCHW}, NC1HWC0_TO_NCHW}, {{AclFormat::FORMAT_NDC1HWC0, AclFormat::FORMAT_NCHW}, NDC1HWC0_TO_NCDHW}, - {{AclFormat::FORMAT_C1HWNCoC0, AclFormat::FORMAT_NCHW}, C1HWNCoC0_TO_NCHW}, + {{AclFormat::FORMAT_C1HWNCOC0, AclFormat::FORMAT_NCHW}, C1HWNCoC0_TO_NCHW}, {{AclFormat::FORMAT_NC1HWC0_C04, AclFormat::FORMAT_NCHW}, NC1HWC0_C04_TO_NCHW}, {{AclFormat::FORMAT_FRACTAL_Z_3D, AclFormat::FORMAT_NCHW}, FRAC_Z3D_TO_NCDHW}, }; @@ -164,7 +163,8 @@ const static std::unordered_map formatTrans {AclDumpMsg::OutputFormat::FORMAT_NC1HWC0_C04, AclFormat::FORMAT_NC1HWC0_C04}, {AclDumpMsg::OutputFormat::FORMAT_FRACTAL_Z_C04, AclFormat::FORMAT_FRACTAL_Z_C04}, {AclDumpMsg::OutputFormat::FORMAT_CHWN, AclFormat::FORMAT_CHWN}, - {AclDumpMsg::OutputFormat::FORMAT_FRACTAL_DECONV_SP_STRIDE8_TRANS, AclFormat::FORMAT_FRACTAL_DECONV_SP_STRIDE8_TRANS}, + {AclDumpMsg::OutputFormat::FORMAT_FRACTAL_DECONV_SP_STRIDE8_TRANS, + AclFormat::FORMAT_FRACTAL_DECONV_SP_STRIDE8_TRANS}, {AclDumpMsg::OutputFormat::FORMAT_HWCN, AclFormat::FORMAT_HWCN}, {AclDumpMsg::OutputFormat::FORMAT_NC1KHKWHWC0, AclFormat::FORMAT_NC1KHKWHWC0}, {AclDumpMsg::OutputFormat::FORMAT_BN_WEIGHT, AclFormat::FORMAT_BN_WEIGHT}, @@ -174,7 +174,7 @@ const static std::unordered_map formatTrans {AclDumpMsg::OutputFormat::FORMAT_HASHTABLE_LOOKUP_VALUE, AclFormat::FORMAT_HASHTABLE_LOOKUP_VALUE}, {AclDumpMsg::OutputFormat::FORMAT_HASHTABLE_LOOKUP_OUTPUT, AclFormat::FORMAT_HASHTABLE_LOOKUP_OUTPUT}, {AclDumpMsg::OutputFormat::FORMAT_HASHTABLE_LOOKUP_HITS, AclFormat::FORMAT_HASHTABLE_LOOKUP_HITS}, - {AclDumpMsg::OutputFormat::FORMAT_C1HWNCoC0, 
AclFormat::FORMAT_C1HWNCoC0}, + {AclDumpMsg::OutputFormat::FORMAT_C1HWNCoC0, AclFormat::FORMAT_C1HWNCOC0}, {AclDumpMsg::OutputFormat::FORMAT_MD, AclFormat::FORMAT_MD}, {AclDumpMsg::OutputFormat::FORMAT_NDHWC, AclFormat::FORMAT_NDHWC}, {AclDumpMsg::OutputFormat::FORMAT_FRACTAL_ZZ, AclFormat::FORMAT_FRACTAL_ZZ}, @@ -201,20 +201,20 @@ const static std::unordered_map formatTrans {AclDumpMsg::OutputFormat::FORMAT_C1HWC0, AclFormat::FORMAT_C1HWC0}, }; -enum kAxis4D : int { kN = 0, kC, kH, kW, kNchwDims }; +enum Axis4D : int { AXIS_N = 0, AXIS_C, AXIS_H, AXIS_W, NCHW_DIMS }; enum Axis5D : int { - N_ncdhw = 0, - C_ncdhw, - D_ncdhw, - H_ncdhw, - W_ncdhw, - kNcdhw, - N_ndc1hwc0 = 0, - D_ndc1hwc0, - C1_ndc1hwc0, - H_ndc1hwc0, - W_ndc1hwc0, - C0_ndc1hwc0 + N_NCDHW, + C_NCDHW, + D_NCDHW, + H_NCDHW, + W_NCDHW, + NCDHW, + N_NDC1HWC0, + D_NDC1HWC0, + C1_NDC1HWC0, + H_NDC1HWC0, + W_NDC1HWC0, + C0_NDC1HWC0 }; static inline AclDtype transAclDtype2MS(AclDumpMsg::OutputDataType dt) @@ -235,7 +235,8 @@ static inline AclFormat transAclFormat2MS(AclDumpMsg::OutputFormat fmt) return AclFormat::FORMAT_MAX; } -static size_t EleNumOfTensor(const AclTensorInfo& tensor, bool host = true) { +static size_t EleNumOfTensor(const AclTensorInfo& tensor, bool host = true) +{ size_t num = 1; const AclShape& shape = host ? 
tensor.hostShape : tensor.deviceShape; for (auto dim : shape) { @@ -244,23 +245,26 @@ static size_t EleNumOfTensor(const AclTensorInfo& tensor, bool host = true) { return 0; } - if (SIZE_MAX / dim < num) { + if (SIZE_MAX / dim < static_cast(num)) { throw std::out_of_range(tensor + ": Count of element over size_t."); } num *= static_cast(dim); } - return num; + return num; } -static inline size_t SizeOfAclDType(const AclTensorInfo& tensor) { +static inline size_t SizeOfAclDType(const AclTensorInfo& tensor) +{ return DataUtils::SizeOfDType(tensor.dtype); } -static inline size_t SizeOfAclDType(const AclDtype& dtype) { +static inline size_t SizeOfAclDType(const AclDtype& dtype) +{ return DataUtils::SizeOfDType(dtype); } -size_t SizeOfTensor(const AclTensorInfo& tensor, bool host) { +size_t SizeOfTensor(const AclTensorInfo& tensor, bool host) +{ size_t num = EleNumOfTensor(tensor, host); size_t eleSize = SizeOfAclDType(tensor); if (num != 0 && SIZE_MAX / num < eleSize) { @@ -269,16 +273,17 @@ size_t SizeOfTensor(const AclTensorInfo& tensor, bool host) { return num * eleSize; } -static inline int64_t GetCubeSizeByType(const AclDtype& dtype) { +static inline int64_t GetCubeSizeByType(const AclDtype& dtype) +{ if (dtype == AclDtype::DT_UINT8 || dtype == AclDtype::DT_INT8) { - return kCube32; + return CUBE_32; } if (dtype == AclDtype::DT_INT4) { - return kCube64; + return CUBE_64; } - return kCube16; + return CUBE_16; } static inline void AssertDim(const AclShape& shape, size_t dim) @@ -291,11 +296,14 @@ static inline void AssertDim(const AclShape& shape, size_t dim) static inline void AssertConsis(const AclTensorInfo& tensor) { - size_t tensor_size = EleNumOfTensor(tensor, false) * SizeOfAclDType(tensor); + size_t tensorSize = EleNumOfTensor(tensor, false) * SizeOfAclDType(tensor); // Processing dtype whose size < 1 // The ele num of quantization type(qint4*2) in MindSpore must be even. 
- if (tensor.dtype == AclDtype::DT_INT4) tensor_size = EleNumOfTensor(tensor, false) / 2; - if (tensor_size != tensor.dataSize) { + int int4_size_factor = 2; + if (tensor.dtype == AclDtype::DT_INT4) { + tensorSize = EleNumOfTensor(tensor, false) / int4_size_factor; + } + if (tensorSize != tensor.dataSize) { throw std::runtime_error(tensor + ": The internal data of Tensor is inconsistent."); } } @@ -325,8 +333,8 @@ AclTensorInfo ParseAttrsFromDumpData(const std::string& dumpPath, const uint8_t* for (auto d : tensor.original_shape().dim()) { if (d > INT64_MAX) { LOG_WARNING(DebuggerErrno::ERROR_VALUE_OVERFLOW, - "The value(" + std::to_string(d) + ") exceeds the max value of int64_t, " + - "this maybe caused by the unfixed shape operaters."); + "The value(" + std::to_string(d) + ") exceeds the max value of int64_t, " + + "this maybe caused by the unfixed shape operaters."); hShape.clear(); break; } @@ -335,7 +343,7 @@ AclTensorInfo ParseAttrsFromDumpData(const std::string& dumpPath, const uint8_t* // convert format to host format. It can be either NCHW or ND (non 4-dimensions).
AclFormat hFmt; - if (hShape.size() == kDim4) { + if (hShape.size() == DIM_4) { hFmt = AclFormat::FORMAT_NCHW; } else if (hShape.empty()) { hFmt = dFmt; @@ -347,7 +355,8 @@ AclTensorInfo ParseAttrsFromDumpData(const std::string& dumpPath, const uint8_t* } int32_t subFormat = tensor.sub_format(); - return AclTensorInfo{dumpPath, data, dtype, dtype, dFmt, hFmt, dShape, hShape, dataSize, subFormat, io, slot, dumpOriginData}; + return AclTensorInfo{dumpPath, data, dtype, dtype, dFmt, hFmt, + dShape, hShape, dataSize, subFormat, io, slot, dumpOriginData}; } template AclTensorInfo ParseAttrsFromDumpData( @@ -364,14 +373,14 @@ static inline void AllocTensorTransBuf(AclTensorInfo& tensor) static DebuggerErrno FRAC_Z_TO_NCHW_WITH_GROUPS(AclTensorInfo& tensor) { - AssertDim(tensor.hostShape, kDim4); + AssertDim(tensor.hostShape, DIM_4); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto nDim = tensor.hostShape[kN]; - auto cDim = tensor.hostShape[kC]; - auto hDim = tensor.hostShape[kH]; - auto wDim = tensor.hostShape[kW]; + auto nDim = tensor.hostShape[AXIS_N]; + auto cDim = tensor.hostShape[AXIS_C]; + auto hDim = tensor.hostShape[AXIS_H]; + auto wDim = tensor.hostShape[AXIS_W]; auto groups = tensor.subFormat; auto cinOri = cDim; auto coutOri = nDim / groups; @@ -382,7 +391,7 @@ static DebuggerErrno FRAC_Z_TO_NCHW_WITH_GROUPS(AclTensorInfo& tensor) } auto cubeK = GetCubeSizeByType(tensor.dtype); - auto eMult = std::min(Lcm(Lcm(cinOri, cubeK) / cinOri, Lcm(coutOri, kCubeSize) / cinOri), + auto eMult = std::min(Lcm(Lcm(cinOri, cubeK) / cinOri, Lcm(coutOri, CUBE_SIZE) / cinOri), static_cast(groups)); if (eMult == 0) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_VALUE, @@ -391,12 +400,12 @@ static DebuggerErrno FRAC_Z_TO_NCHW_WITH_GROUPS(AclTensorInfo& tensor) } auto cinOpt = AlignCeil(eMult * cinOri, cubeK); - auto coutOpt = AlignCeil(eMult * coutOri, kCubeSize); + auto coutOpt = AlignCeil(eMult * coutOri, CUBE_SIZE); auto c1Dim = cinOpt / cubeK; const uint8_t* src = 
tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t g = 0; g < groups; ++g) { for (int64_t c = 0; c < cDim; ++c) { @@ -433,17 +442,17 @@ static DebuggerErrno FRAC_Z_TO_NCHW(AclTensorInfo& tensor) return FRAC_Z_TO_NCHW_WITH_GROUPS(tensor); } - AssertDim(tensor.hostShape, kDim4); + AssertDim(tensor.hostShape, DIM_4); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto n0 = tensor.deviceShape.at(fzN0); - auto ni = tensor.deviceShape.at(fzNi); - auto c0 = tensor.deviceShape.at(fzC0); - auto n = tensor.hostShape[kN]; - auto c = tensor.hostShape[kC]; - auto h = tensor.hostShape[kH]; - auto w = tensor.hostShape[kW]; + auto n0 = tensor.deviceShape.at(FZ_N0); + auto ni = tensor.deviceShape.at(FZ_NI); + auto c0 = tensor.deviceShape.at(FZ_C0); + auto n = tensor.hostShape[AXIS_N]; + auto c = tensor.hostShape[AXIS_C]; + auto h = tensor.hostShape[AXIS_H]; + auto w = tensor.hostShape[AXIS_W]; auto nc = ni * n0; auto ncc0 = nc * c0; auto wncc0 = w * ncc0; @@ -457,8 +466,8 @@ static DebuggerErrno FRAC_Z_TO_NCHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t nIdx = 0; nIdx < n; nIdx++) { int64_t nHeadAddr = nIdx * chw; for (int64_t cIdx = 0; cIdx < c; cIdx++) { @@ -487,7 +496,7 @@ static DebuggerErrno FRAC_Z_TO_NCHW(AclTensorInfo& tensor) static void TransShapeToHwNz(const AclShape &hostShape, AclShape& hwShape) { - if (hostShape.size() == kDim1) { + if (hostShape.size() == DIM_1) { hwShape.push_back(1); hwShape.push_back(1); hwShape.push_back(hostShape[0]); @@ -495,12 +504,12 @@ static void 
TransShapeToHwNz(const AclShape &hostShape, AclShape& hwShape) } auto size = hostShape.size(); int64_t times = 1; - for (size_t i = 0; i != size - kDim2; i++) { + for (size_t i = 0; i != size - DIM_2; i++) { times *= hostShape[i]; } hwShape.push_back(times); - hwShape.push_back(hostShape[size - kDim2]); - hwShape.push_back(hostShape[size - kDim1]); + hwShape.push_back(hostShape[size - DIM_2]); + hwShape.push_back(hostShape[size - DIM_1]); } static DebuggerErrno FRAC_NZ_TO_NCHW(AclTensorInfo& tensor) @@ -511,20 +520,20 @@ static DebuggerErrno FRAC_NZ_TO_NCHW(AclTensorInfo& tensor) AclShape hwShape; TransShapeToHwNz(tensor.hostShape, hwShape); auto times = hwShape.at(0); - auto h = hwShape.at(hwH); - auto w = hwShape.at(hwW); + auto h = hwShape.at(HW_H); + auto w = hwShape.at(HW_W); auto hw = h * w; auto shapeSize = tensor.deviceShape.size(); - if (shapeSize < kDim4) { + if (shapeSize < DIM_4) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_VALUE, tensor + ": Invalid shape size."); return DebuggerErrno::ERROR_INVALID_VALUE; } - auto w1 = tensor.deviceShape[shapeSize - fnzW1]; - auto h1 = tensor.deviceShape[shapeSize - fnzH1]; - auto h0 = tensor.deviceShape[shapeSize - fnzH0]; - auto w0 = tensor.deviceShape[shapeSize - fnzW0]; + auto w1 = tensor.deviceShape[shapeSize - FNZ_W1]; + auto h1 = tensor.deviceShape[shapeSize - FNZ_H1]; + auto h0 = tensor.deviceShape[shapeSize - FNZ_H0]; + auto w0 = tensor.deviceShape[shapeSize - FNZ_W0]; auto h1h0w0 = h1 * h0 * w0; auto w1h1h0w0 = w1 * h1h0w0; if (w0 == 0) { @@ -535,8 +544,8 @@ static DebuggerErrno FRAC_NZ_TO_NCHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t timesIdx = 0; timesIdx < times; timesIdx++) { auto timesHead = timesIdx * w1h1h0w0; @@ -576,16 +585,16 @@ static 
DebuggerErrno FRAC_NZ_TO_NCHW(AclTensorInfo& tensor) static DebuggerErrno NC1HWC0_TO_NCHW(AclTensorInfo& tensor) { - AssertDim(tensor.hostShape, kDim4); + AssertDim(tensor.hostShape, DIM_4); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto n = tensor.hostShape[kN]; - auto c = tensor.hostShape[kC]; - auto h = tensor.hostShape[kH]; - auto w = tensor.hostShape[kW]; - auto c1 = tensor.deviceShape[kDim1]; - auto c0 = tensor.deviceShape[kDim4]; + auto n = tensor.hostShape[AXIS_N]; + auto c = tensor.hostShape[AXIS_C]; + auto h = tensor.hostShape[AXIS_H]; + auto w = tensor.hostShape[AXIS_W]; + auto c1 = tensor.deviceShape[DIM_1]; + auto c0 = tensor.deviceShape[DIM_4]; if (c0 == 0) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_VALUE, tensor + ": Invalid shape size."); return DebuggerErrno::ERROR_INVALID_VALUE; @@ -599,8 +608,8 @@ static DebuggerErrno NC1HWC0_TO_NCHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t nIndex = 0; nIndex < n; nIndex++) { int64_t nHeadAddr = nIndex * chw; for (int64_t cIndex = 0; cIndex < c; cIndex++) { @@ -628,17 +637,17 @@ static DebuggerErrno NC1HWC0_TO_NCHW(AclTensorInfo& tensor) static DebuggerErrno NDC1HWC0_TO_NCDHW(AclTensorInfo& tensor) { - AssertDim(tensor.hostShape, kDim5); + AssertDim(tensor.hostShape, DIM_5); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto n = tensor.hostShape[N_ncdhw]; - auto c = tensor.hostShape[C_ncdhw]; - auto d = tensor.hostShape[D_ncdhw]; - auto h = tensor.hostShape[H_ncdhw]; - auto w = tensor.hostShape[W_ncdhw]; - auto c1 = tensor.deviceShape[C1_ndc1hwc0]; - auto c0 = tensor.deviceShape[C0_ndc1hwc0]; + auto n = tensor.hostShape[N_NCDHW]; + auto c = tensor.hostShape[C_NCDHW]; + auto d = tensor.hostShape[D_NCDHW]; + auto h = 
tensor.hostShape[H_NCDHW]; + auto w = tensor.hostShape[W_NCDHW]; + auto c1 = tensor.deviceShape[C1_NDC1HWC0]; + auto c0 = tensor.deviceShape[C0_NDC1HWC0]; if (c0 == 0) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_VALUE, tensor + ": Invalid shape size."); return DebuggerErrno::ERROR_INVALID_VALUE; @@ -654,8 +663,8 @@ static DebuggerErrno NDC1HWC0_TO_NCDHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t nIndex = 0; nIndex < n; nIndex++) { int64_t nHead = nIndex * cdhw; for (int64_t cIndex = 0; cIndex < c; cIndex++) { @@ -687,14 +696,14 @@ static DebuggerErrno NDC1HWC0_TO_NCDHW(AclTensorInfo& tensor) static DebuggerErrno C1HWNCoC0_TO_NCHW(AclTensorInfo& tensor) { - AssertDim(tensor.hostShape, kDim4); + AssertDim(tensor.hostShape, DIM_4); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto n = tensor.hostShape[kN]; - auto c = tensor.hostShape[kC]; - auto h = tensor.hostShape[kH]; - auto w = tensor.hostShape[kW]; + auto n = tensor.hostShape[AXIS_N]; + auto c = tensor.hostShape[AXIS_C]; + auto h = tensor.hostShape[AXIS_H]; + auto w = tensor.hostShape[AXIS_W]; const int coIdx = 4; const int c0Idx = 5; auto co = tensor.deviceShape[coIdx]; @@ -703,8 +712,8 @@ static DebuggerErrno C1HWNCoC0_TO_NCHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t nIndex = 0; nIndex < n; nIndex++) { for (int64_t cIndex = 0; cIndex < c; cIndex++) { for (int64_t hIndex = 0; hIndex < h; hIndex++) { @@ -736,17 +745,17 @@ static DebuggerErrno NC1HWC0_C04_TO_NCHW(AclTensorInfo& 
tensor) static DebuggerErrno FRAC_Z3D_TO_NCDHW(AclTensorInfo& tensor) { - AssertDim(tensor.hostShape, kDim5); + AssertDim(tensor.hostShape, DIM_5); AssertConsis(tensor); AllocTensorTransBuf(tensor); - auto n = tensor.hostShape[N_ncdhw]; - auto c = tensor.hostShape[C_ncdhw]; - auto d = tensor.hostShape[D_ncdhw]; - auto h = tensor.hostShape[H_ncdhw]; - auto w = tensor.hostShape[W_ncdhw]; - constexpr int kFZ3D_C0 = 3; - auto c0 = tensor.deviceShape[kFZ3D_C0]; + auto n = tensor.hostShape[N_NCDHW]; + auto c = tensor.hostShape[C_NCDHW]; + auto d = tensor.hostShape[D_NCDHW]; + auto h = tensor.hostShape[H_NCDHW]; + auto w = tensor.hostShape[W_NCDHW]; + constexpr int FZ3D_C0 = 3; + auto c0 = tensor.deviceShape[FZ3D_C0]; if (c0 == 0) { LOG_WARNING(DebuggerErrno::ERROR_INVALID_VALUE, tensor + ": Invalid shape size."); return DebuggerErrno::ERROR_INVALID_VALUE; @@ -765,8 +774,8 @@ static DebuggerErrno FRAC_Z3D_TO_NCDHW(AclTensorInfo& tensor) const uint8_t* src = tensor.aclData; auto dst = tensor.transBuf.begin(); - auto dtypeSize = SizeOfAclDType(tensor); - auto dstSize = tensor.transBuf.size(); + int64_t dtypeSize = static_cast(SizeOfAclDType(tensor)); + int64_t dstSize = static_cast(tensor.transBuf.size()); for (int64_t nIdx = 0; nIdx < n; nIdx++) { int64_t nHead = nIdx * cdhw; for (int64_t cIdx = 0; cIdx < c; cIdx++) { @@ -879,7 +888,6 @@ static DebuggerErrno TransInt4ToInt8(const uint8_t* input, size_t elemNums, uint DebuggerErrno TransDtype(AclTensorInfo& tensor, AclDtype to) { - if (tensor.dtype == to) { return DebuggerErrno::OK; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.h similarity index 83% rename from debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.h index f2ac429a7f14370ea1721369c7f9089cb971bb6e..301da55ef7686255c12c8fb52dcfaf3e8314e3a8 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.hpp +++ 
b/debug/accuracy_tools/msprobe/ccsrc/core/AclTensor.h @@ -19,9 +19,9 @@ #include #include -#include "include/ErrorCode.hpp" +#include "include/ErrorCode.h" #include "proto/AclDumpMsg.pb.h" -#include "utils/DataUtils.hpp" +#include "utils/DataUtils.h" namespace MindStudioDebugger { @@ -29,12 +29,12 @@ using AclShape = DataUtils::TensorShape; using AclDtype = DataUtils::DataType; using AclFormat = DataUtils::TensorFormat; -constexpr uint8_t kDim1 = 1; -constexpr uint8_t kDim2 = 2; -constexpr uint8_t kDim3 = 3; -constexpr uint8_t kDim4 = 4; -constexpr uint8_t kDim5 = 5; -constexpr uint8_t kDim6 = 6; +constexpr uint8_t DIM_1 = 1; +constexpr uint8_t DIM_2 = 2; +constexpr uint8_t DIM_3 = 3; +constexpr uint8_t DIM_4 = 4; +constexpr uint8_t DIM_5 = 5; +constexpr uint8_t DIM_6 = 6; struct AclTensorInfo { std::string dumpPath; @@ -52,21 +52,24 @@ struct AclTensorInfo { bool dumpOriginData; std::vector transBuf; - std::string ToString() const { + std::string ToString() const + { return "AclTensor(path=" + dumpPath + ",dtype=" + DataUtils::GetDTypeString(dtype) + ",inout=" + inout + ")"; } }; -inline std::string operator+(const std::string& s, const AclTensorInfo& tensor) { +inline std::string operator+(const std::string& s, const AclTensorInfo& tensor) +{ return s + tensor.ToString(); } -inline std::string operator+(const AclTensorInfo& tensor, const std::string& s) { +inline std::string operator+(const AclTensorInfo& tensor, const std::string& s) +{ return tensor.ToString() + s; } namespace AclTensor { -size_t SizeOfTensor(const AclTensorInfo& tensor, bool host=true); +size_t SizeOfTensor(const AclTensorInfo& tensor, bool host = true); template AclTensorInfo ParseAttrsFromDumpData(const std::string &dumpPath, const uint8_t* data, const T& tensor, const std::string& io, uint32_t slot); diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.cpp index 
d4d74f1962222558c88c576b8ffbd8c474e152f2..abdee8a0f7b44b0bc1d0a9dd9b75e2b9304e8220 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.cpp @@ -16,10 +16,10 @@ #include -#include "base/ErrorInfos.hpp" -#include "base/DebuggerConfig.hpp" -#include "third_party/ACL/AclApi.hpp" -#include "PrecisionDebugger.hpp" +#include "base/ErrorInfosManager.h" +#include "base/DebuggerConfig.h" +#include "third_party/ACL/AclApi.h" +#include "PrecisionDebugger.h" namespace MindStudioDebugger { @@ -83,12 +83,12 @@ int32_t PrecisionDebugger::Initialize(const std::string& framework, const std::s return ret; } - if(AscendCLApi::LoadAclApi() != DebuggerErrno::OK) { + if (AscendCLApi::LoadAclApi() != DebuggerErrno::OK) { return -1; } const DebuggerConfig& cfg = DebuggerConfig::GetInstance(); - for (auto iter = subDebuggers.begin(); iter != subDebuggers.end(); ) { + for (auto iter = subDebuggers.begin(); iter != subDebuggers.end();) { if (!(*iter)->Condition(cfg)) { iter = subDebuggers.erase(iter); } else { @@ -124,7 +124,7 @@ void PrecisionDebugger::Stop() } enable = false; - CALL_ACL_API(aclrtSynchronizeDevice); + CALL_ACL_API(AclrtSynchronizeDevice); for (auto task : subDebuggers) { task->OnStop(); @@ -147,7 +147,7 @@ void PrecisionDebugger::Step(uint32_t step) throw std::runtime_error("Step over upper limit(4294967295)."); } curStep += step; - CALL_ACL_API(aclrtSynchronizeDevice); + CALL_ACL_API(AclrtSynchronizeDevice); for (auto task : subDebuggers) { task->OnStep(curStep); diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.h similarity index 92% rename from debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.h index fbc22c016c40285a90a3de5989684098639256c9..939992d8151b620b4a6225ce912e97bd61a84cfa 100644 --- 
a/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/PrecisionDebugger.h @@ -19,7 +19,7 @@ #include #include -#include "base/DebuggerConfig.hpp" +#include "base/DebuggerConfig.h" namespace MindStudioDebugger { @@ -43,9 +43,10 @@ protected: class PrecisionDebugger { public: - static PrecisionDebugger& GetInstance() { - static PrecisionDebugger instance_; - return instance_; + static PrecisionDebugger& GetInstance() + { + static PrecisionDebugger debuggerInstance; + return debuggerInstance; } int32_t Initialize(const std::string& framework, const std::string& cfgFile); diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.cpp index 2d80ed3ce1ab11ee5ddf9bad18583a6813f32529..23f327f6c32e7234bce1da3ac795e915cf0f0004 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.cpp @@ -16,11 +16,11 @@ #include -#include "base/ErrorInfos.hpp" -#include "base/DebuggerConfig.hpp" -#include "base/Environment.hpp" -#include "core/AclDumper.hpp" -#include "MSAclDumper.hpp" +#include "base/ErrorInfosManager.h" +#include "base/DebuggerConfig.h" +#include "base/Environment.h" +#include "core/AclDumper.h" +#include "MSAclDumper.h" namespace MindStudioDebugger { diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.h similarity index 84% rename from debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.h index cd09bf51af0dac67065d51b8ce60c20f011cd585..9482714520430caf52e915678963540380b5bbc2 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MSAclDumper.h @@ -18,20 +18,22 @@ #include -#include 
"include/ExtArgs.hpp" -#include "core/PrecisionDebugger.hpp" +#include "include/ExtArgs.h" +#include "core/PrecisionDebugger.h" namespace MindStudioDebugger { class MSAclDumper : public PrecisionDbgTaskBase { public: - static MSAclDumper& GetInstance() { - static MSAclDumper instance_; - return instance_; + static MSAclDumper& GetInstance() + { + static MSAclDumper dumperInstance; + return dumperInstance; } std::string Name() const override {return "MindSpore AclDumper";} - bool Condition(const DebuggerConfig& cfg) const override { + bool Condition(const DebuggerConfig& cfg) const override + { return cfg.GetFramework() == DebuggerFramework::FRAMEWORK_MINDSPORE && cfg.GetDebugLevel() == DebuggerLevel::L2; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.cpp b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.cpp index 631ea7c4acf4666b911a3bb5f28a3c6cc4fe0d54..395b8b6846427e558ceae544439bdfff2da633ea 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.cpp @@ -14,10 +14,10 @@ * limitations under the License. 
*/ -#include "include/Macro.hpp" -#include "base/ErrorInfos.hpp" -#include "MindSporeTrigger.hpp" -#include "MSAclDumper.hpp" +#include "include/Macro.h" +#include "base/ErrorInfosManager.h" +#include "MSAclDumper.h" +#include "MindSporeTrigger.h" namespace MindStudioDebugger { diff --git a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.hpp b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.h similarity index 97% rename from debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.hpp rename to debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.h index 022e5d7d4c14a9771681840b967b2ec3aebb811b..d5048925bf58a1e4414b2983d796e598ac56c17b 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/core/mindspore/MindSporeTrigger.h @@ -18,7 +18,7 @@ #include -#include "include/ExtArgs.hpp" +#include "include/ExtArgs.h" namespace MindStudioDebugger { diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/mindspore/MindSporeDbgHook.cpp b/debug/accuracy_tools/msprobe/ccsrc/if/mindspore/MindSporeDbgHook.cpp index db279a33f17311a7c3681e7d899c2fa85a6fdcc8..2d744282d4eb2e741ae0e4afa7081a1a65738d61 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/if/mindspore/MindSporeDbgHook.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/if/mindspore/MindSporeDbgHook.cpp @@ -19,9 +19,9 @@ #include #include -#include "include/Macro.hpp" -#include "include/ExtArgs.hpp" -#include "core/mindspore/MindSporeTrigger.hpp" +#include "include/Macro.h" +#include "include/ExtArgs.h" +#include "core/mindspore/MindSporeTrigger.h" EXPORT_SYMBOL void MS_DbgOnStepBegin(uint32_t device, int32_t curStep, std::map exts) @@ -38,7 +38,7 @@ EXPORT_SYMBOL void MS_DbgOnStepBegin(uint32_t device, int32_t curStep, continue; } std::vector* ss = reinterpret_cast*>(ext.second); - strBuf = new const char*[(*ss).size() + 1]; + strBuf = new const char* [(*ss).size() + 1]; strBuf[(*ss).size()] = nullptr; 
size_t i = 0; for (std::string& s : *ss) { @@ -69,6 +69,4 @@ EXPORT_SYMBOL void MS_DbgOnStepEnd(std::map& exts) args[static_cast(ext.first)] = ext.second; } return MindStudioDebugger::MindSporeTrigger::TriggerOnStepEnd(args); -} - - +} \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.cpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.cpp index 1c380ed3f505795eb622f7f401558f72a54db557..2bb73e34200216d77c5f884a343cd02b1250af7b 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.cpp @@ -18,37 +18,40 @@ #include #include -#include "base/ErrorInfos.hpp" -#include "core/AclDumper.hpp" -#include "utils/CPythonUtils.hpp" +#include "base/ErrorInfosManager.h" +#include "core/AclDumper.h" +#include "utils/CPythonUtils.h" namespace MindStudioDebugger { -static PyObject *CPythonKernelInitDump(PyObject *module, PyObject *args) { - PyGILState_STATE gstate = PyGILState_Ensure(); - KernelInitDump(); - PyGILState_Release(gstate); - Py_RETURN_NONE; +static PyObject *CPythonKernelInitDump(PyObject *module, PyObject *args) +{ + PyGILState_STATE gstate = PyGILState_Ensure(); + KernelInitDump(); + PyGILState_Release(gstate); + Py_RETURN_NONE; } -static PyObject *CPythonKernelSetDump(PyObject *module, PyObject *args) { - const char *path; - if (!PyArg_ParseTuple(args, "s", &path)) { +static PyObject *CPythonKernelSetDump(PyObject *module, PyObject *args) +{ + const char *path; + if (!PyArg_ParseTuple(args, "s", &path)) { LOG_ERROR(DebuggerErrno::ERROR_INVALID_VALUE, "npu set dump error, cfg_file must string"); return nullptr; - } - PyGILState_STATE gstate = PyGILState_Ensure(); - KernelSetDump(std::string(path)); - PyGILState_Release(gstate); - Py_RETURN_NONE; + } + PyGILState_STATE gstate = PyGILState_Ensure(); + KernelSetDump(std::string(path)); + PyGILState_Release(gstate); + Py_RETURN_NONE; } -static PyObject *CPythonKernelFinalizeDump(PyObject *module, 
PyObject *args) { - PyGILState_STATE gstate = PyGILState_Ensure(); - KernelFinalizeDump(); - PyGILState_Release(gstate); - Py_RETURN_NONE; +static PyObject *CPythonKernelFinalizeDump(PyObject *module, PyObject *args) +{ + PyGILState_STATE gstate = PyGILState_Ensure(); + KernelFinalizeDump(); + PyGILState_Release(gstate); + Py_RETURN_NONE; } static PyMethodDef DumpMethods[] = { diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.hpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.hpp rename to debug/accuracy_tools/msprobe/ccsrc/if/python/ACLDump.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.cpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.cpp index 4b8fc03491e2c0792c3c707c272e7b587d60c7ad..e41243aa8d3c27b92c275dcd098e983083328d8e 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.cpp @@ -18,7 +18,7 @@ #include #include -#include "utils/CPythonUtils.hpp" +#include "utils/CPythonUtils.h" namespace MindStudioDebugger { @@ -29,8 +29,12 @@ PyDoc_STRVAR(CPythonAgentModuleDoc, static PyObject* CPythonAgentRegister(PyObject *module, PyObject *args) { + if (args == nullptr || !PyTuple_Check(args)) { + PyErr_SetString(PyExc_TypeError, "Expect a tuple."); + Py_RETURN_NONE; + } /* 预期2个参数,name和obj */ - if (args == nullptr || PyTuple_GET_SIZE(args) != 2) { + if (PyTuple_GET_SIZE(args) != 2) { PyErr_SetString(PyExc_TypeError, "\'register_context\' expects 2 arguments."); Py_RETURN_NONE; } @@ -56,7 +60,7 @@ static PyObject* CPythonAgentRegister(PyObject *module, PyObject *args) static PyObject* CPythonAgentUnRegister(PyObject *module, PyObject *obj) { CPythonUtils::PythonStringObject name(obj); - if(name.IsNone()) { + if (name.IsNone()) { PyErr_SetString(PyExc_TypeError, "\"name\" should be a string."); Py_RETURN_NONE; } @@ -68,7 +72,7 @@ static 
PyObject* CPythonAgentUnRegister(PyObject *module, PyObject *obj) static PyObject* CPythonAgentGetContext(PyObject *module, PyObject *obj) { CPythonUtils::PythonStringObject name(obj); - if(name.IsNone()) { + if (name.IsNone()) { PyErr_SetString(PyExc_TypeError, "\"name\" should be a string."); Py_RETURN_NONE; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.hpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.hpp rename to debug/accuracy_tools/msprobe/ccsrc/if/python/CPythonAgent.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/MsProbeIfPython.cpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/MsProbeIfPython.cpp index a18c54a146f7d676d6b3c7f760e50f9e7eebe56c..fa3f65cc5fc4d211f9608cadc115601598232d05 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/if/python/MsProbeIfPython.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/if/python/MsProbeIfPython.cpp @@ -16,9 +16,9 @@ #include -#include "PrecisionDebuggerIfPython.hpp" -#include "CPythonAgent.hpp" -#include "ACLDump.hpp" +#include "PrecisionDebuggerIfPython.h" +#include "CPythonAgent.h" +#include "ACLDump.h" namespace MindStudioDebugger { @@ -27,7 +27,7 @@ PyDoc_STRVAR(MsProbeCModuleDoc, class _PrecisionDebugger: PrecisionDebugger in CXX \n\ class _DebuggerConfig: Configuration data of PrecisionDebugger \n\ class CPythonAgent: Used for front-end and back-end code interactions \n\ - \n\ + \n\ ..."); static struct PyModuleDef g_MsProbeCModule = { diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.cpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.cpp index 627b5399c573c20e4c19dd737ba4dc332cd09d6e..90977cc3c22c66d272cb7b4c51d38ecc689c6436 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.cpp @@ -18,8 +18,8 @@ 
#include #include -#include "utils/CPythonUtils.hpp" -#include "core/PrecisionDebugger.hpp" +#include "utils/CPythonUtils.h" +#include "core/PrecisionDebugger.h" namespace MindStudioDebugger { @@ -53,7 +53,6 @@ static int InitPrecisionDebugger(PyObject *self, PyObject *args, PyObject *kws) CPythonUtils::PythonDictObject kwArgs(kws); std::string framework = kwArgs.GetItem("framework"); std::string cfgFile = kwArgs.GetItem("config_path"); - if (PrecisionDebugger::GetInstance().Initialize(framework, cfgFile) != 0) { PyErr_SetString(PyExc_RuntimeError, "Failed to load config, read log for more details."); return -1; @@ -101,7 +100,11 @@ static PyObject* PrecisionDebuggerStop(PyObject *self) static PyObject* PrecisionDebuggerStep(PyObject *self, PyObject *args) { - if (args == nullptr || PyTuple_GET_SIZE(args) == 0) { + if (args == nullptr || !PyTuple_Check(args)) { + PrecisionDebugger::GetInstance().Step(); + Py_RETURN_NONE; + } + if (PyTuple_GET_SIZE(args) == 0) { PrecisionDebugger::GetInstance().Step(); Py_RETURN_NONE; } @@ -182,5 +185,4 @@ PyTypeObject* GetPyPrecisionDebuggerType() } return &PyPrecisionDebuggerType; } - } \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.hpp b/debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.hpp rename to debug/accuracy_tools/msprobe/ccsrc/if/python/PrecisionDebuggerIfPython.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/include/ErrorCode.hpp b/debug/accuracy_tools/msprobe/ccsrc/include/ErrorCode.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/include/ErrorCode.hpp rename to debug/accuracy_tools/msprobe/ccsrc/include/ErrorCode.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/include/ExtArgs.hpp b/debug/accuracy_tools/msprobe/ccsrc/include/ExtArgs.h similarity index 100% rename from 
debug/accuracy_tools/msprobe/ccsrc/include/ExtArgs.hpp rename to debug/accuracy_tools/msprobe/ccsrc/include/ExtArgs.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/include/Macro.hpp b/debug/accuracy_tools/msprobe/ccsrc/include/Macro.h similarity index 100% rename from debug/accuracy_tools/msprobe/ccsrc/include/Macro.hpp rename to debug/accuracy_tools/msprobe/ccsrc/include/Macro.h diff --git a/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.cpp b/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.cpp index ff249f8e6ad02f1fa6f2201a24c41eb9bdb962cb..c79f2820a9f693afb93cdf68b3f2f7c751d7e389 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.cpp @@ -18,30 +18,30 @@ #include #include -#include "base/ErrorInfos.hpp" -#include "AclApi.hpp" +#include "base/ErrorInfosManager.h" +#include "AclApi.h" namespace MindStudioDebugger { namespace AscendCLApi { using namespace MindStudioDebugger; -constexpr const char* kLibAscendclName = "libascendcl.so"; -constexpr const char* kLibMSAscendName = "libmindspore_ascend.so.2"; +constexpr const char* LIB_ASCEND_CL_NAME = "libascendcl.so"; +constexpr const char* LIB_MS_ASCEND_NAME = "libmindspore_ascend.so.2"; -using aclInitFuncType = aclError (*)(const char *); -using aclmdlInitDumpFuncType = aclError (*)(); -using aclmdlSetDumpFuncType = aclError (*)(const char *); -using aclmdlFinalizeDumpFuncType = aclError (*)(); -using acldumpRegCallbackFuncType = aclError (*)(AclDumpCallbackFuncType, int32_t); -using aclrtSynchronizeDeviceFuncType = aclError (*)(); +using AclInitFuncType = aclError (*)(const char *); +using AclmdlInitDumpFuncType = aclError (*)(); +using AclmdlSetDumpFuncType = aclError (*)(const char *); +using AclmdlFinalizeDumpFuncType = aclError (*)(); +using AcldumpRegCallbackFuncType = aclError (*)(AclDumpCallbackFuncType, int32_t); +using AclrtSynchronizeDeviceFuncType = aclError (*)(); -static aclInitFuncType 
aclInitFunc = nullptr; -static aclmdlInitDumpFuncType aclmdlInitDumpFunc = nullptr; -static aclmdlSetDumpFuncType aclmdlSetDumpFunc = nullptr; -static aclmdlFinalizeDumpFuncType aclmdlFinalizeDumpFunc = nullptr; -static acldumpRegCallbackFuncType acldumpRegCallbackFunc = nullptr; -static aclrtSynchronizeDeviceFuncType aclrtSynchronizeDeviceFunc = nullptr; +static AclInitFuncType g_aclInitFunc = nullptr; +static AclmdlInitDumpFuncType g_aclmdlInitDumpFunc = nullptr; +static AclmdlSetDumpFuncType g_aclmdlSetDumpFunc = nullptr; +static AclmdlFinalizeDumpFuncType g_aclmdlFinalizeDumpFunc = nullptr; +static AcldumpRegCallbackFuncType g_acldumpRegCallbackFunc = nullptr; +static AclrtSynchronizeDeviceFuncType g_aclrtSynchronizeDeviceFunc = nullptr; DebuggerErrno LoadAclApi() { @@ -52,7 +52,7 @@ DebuggerErrno LoadAclApi() return DebuggerErrno::OK; } - hLibAscendcl = dlopen(kLibAscendclName, RTLD_LAZY | RTLD_NOLOAD); + hLibAscendcl = dlopen(LIB_ASCEND_CL_NAME, RTLD_LAZY | RTLD_NOLOAD); if (hLibAscendcl == nullptr) { LOG_ERROR(DebuggerErrno::ERROR_DEPENDENCY_NOT_FIND, "Failed to search libascendcl.so." 
+ std::string(dlerror())); @@ -60,11 +60,11 @@ DebuggerErrno LoadAclApi() } static const std::map functionMap = { - {"aclInit", reinterpret_cast(&aclInitFunc)}, - {"aclmdlInitDump", reinterpret_cast(&aclmdlInitDumpFunc)}, - {"aclmdlSetDump", reinterpret_cast(&aclmdlSetDumpFunc)}, - {"aclmdlFinalizeDump", reinterpret_cast(&aclmdlFinalizeDumpFunc)}, - {"aclrtSynchronizeDevice", reinterpret_cast(&aclrtSynchronizeDeviceFunc)}, + {"aclInit", reinterpret_cast(&g_aclInitFunc)}, + {"aclmdlInitDump", reinterpret_cast(&g_aclmdlInitDumpFunc)}, + {"aclmdlSetDump", reinterpret_cast(&g_aclmdlSetDumpFunc)}, + {"aclmdlFinalizeDump", reinterpret_cast(&g_aclmdlFinalizeDumpFunc)}, + {"aclrtSynchronizeDevice", reinterpret_cast(&g_aclrtSynchronizeDeviceFunc)}, }; for (auto& iter : functionMap) { @@ -83,15 +83,15 @@ DebuggerErrno LoadAclApi() } /* 规避adump的bug,mindspore场景优先使用libmindspore_ascend.so中的符号 */ - void* handler = dlopen(kLibMSAscendName, RTLD_LAZY | RTLD_NOLOAD); - std::string libName = kLibMSAscendName; + void* handler = dlopen(LIB_MS_ASCEND_NAME, RTLD_LAZY | RTLD_NOLOAD); + std::string libName = LIB_MS_ASCEND_NAME; if (handler == nullptr) { handler = hLibAscendcl; - libName = kLibAscendclName; + libName = LIB_ASCEND_CL_NAME; } - acldumpRegCallbackFunc = reinterpret_cast(dlsym(handler, "acldumpRegCallback")); - if (acldumpRegCallbackFunc == nullptr) { + g_acldumpRegCallbackFunc = reinterpret_cast(dlsym(handler, "acldumpRegCallback")); + if (g_acldumpRegCallbackFunc == nullptr) { LOG_ERROR(DebuggerErrno::ERROR_DEPENDENCY_NOT_FIND, "Failed to load function acldumpRegCallback from " + libName + "."); } @@ -104,53 +104,53 @@ DebuggerErrno LoadAclApi() return DebuggerErrno::OK; } -aclError ACLAPI_aclInit(const char* cfg) +aclError AclApiAclInit(const char* cfg) { - if (aclInitFunc == nullptr) { + if (g_aclInitFunc == nullptr) { throw std::runtime_error("API aclInit does not have a definition."); } - return aclInitFunc(cfg); + return g_aclInitFunc(cfg); } -aclError 
ACLAPI_aclmdlInitDump() +aclError AclApiAclmdlInitDump() { - if (aclmdlInitDumpFunc == nullptr) { + if (g_aclmdlInitDumpFunc == nullptr) { throw std::runtime_error("API aclmdlInitDump does not have a definition."); } - return aclmdlInitDumpFunc(); + return g_aclmdlInitDumpFunc(); } -aclError ACLAPI_aclmdlSetDump(const char* cfg) +aclError AclApiAclmdlSetDump(const char* cfg) { - if (aclmdlSetDumpFunc == nullptr) { + if (g_aclmdlSetDumpFunc == nullptr) { throw std::runtime_error("API aclmdlSetDump does not have a definition."); } - return aclmdlSetDumpFunc(cfg); + return g_aclmdlSetDumpFunc(cfg); } -aclError ACLAPI_aclmdlFinalizeDump() +aclError AclApiAclmdlFinalizeDump() { - if (aclmdlFinalizeDumpFunc == nullptr) { + if (g_aclmdlFinalizeDumpFunc == nullptr) { throw std::runtime_error("API aclmdlFinalizeDump does not have a definition."); } - return aclmdlFinalizeDumpFunc(); + return g_aclmdlFinalizeDumpFunc(); } -aclError ACLAPI_acldumpRegCallback(AclDumpCallbackFuncType messageCallback, int32_t flag) +aclError AclApiAcldumpRegCallback(AclDumpCallbackFuncType messageCallback, int32_t flag) { - if (acldumpRegCallbackFunc == nullptr) { + if (g_acldumpRegCallbackFunc == nullptr) { throw std::runtime_error("API acldumpRegCallback does not have a definition."); } - return acldumpRegCallbackFunc(messageCallback, flag); + return g_acldumpRegCallbackFunc(messageCallback, flag); } -aclError ACLAPI_aclrtSynchronizeDevice() +aclError AclApiAclrtSynchronizeDevice() { - if (aclrtSynchronizeDeviceFunc == nullptr) { + if (g_aclrtSynchronizeDeviceFunc == nullptr) { throw std::runtime_error("API aclrtSynchronizeDevice does not have a definition."); } - return aclrtSynchronizeDeviceFunc(); + return g_aclrtSynchronizeDeviceFunc(); } -} +} } diff --git a/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.hpp b/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.h similarity index 78% rename from debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.hpp rename to 
debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.h index 731ae2e2caacaa345605ec572c8dcd6dba091488..366826fac943e622da58d8a136e66f18253b40c1 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/third_party/ACL/AclApi.h @@ -18,24 +18,23 @@ #include -#include "include/ErrorCode.hpp" +#include "include/ErrorCode.h" extern "C" { - -typedef int aclError; +using aclError = int; constexpr int ACL_SUCCESS = 0; constexpr int ACL_ERROR_NONE = 0; constexpr int ACL_ERROR_REPEAT_INITIALIZE = 100002; #define ACL_DUMP_MAX_FILE_PATH_LENGTH 4096 -typedef struct acldumpChunk { +typedef struct AclDumpChunk { char fileName[ACL_DUMP_MAX_FILE_PATH_LENGTH]; // Name of the dump data file to be written to disk; ACL_DUMP_MAX_FILE_PATH_LENGTH is the maximum file name length, currently 4096 uint32_t bufLen; // Length of the data in dataBuf, in bytes uint32_t isLastChunk; // Whether this is the last chunk of the dump data: 0 means not the last chunk, 1 means the last chunk int64_t offset; // Offset into the dump data file content; -1 means append to the file int32_t flag; // Reserved dump data flag; currently unused uint8_t dataBuf[0]; // Memory address of the dump data -} acldumpChunk; +} AclDumpChunk; } @@ -44,16 +43,16 @@ namespace AscendCLApi { DebuggerErrno LoadAclApi(); -using AclDumpCallbackFuncType = int32_t (*)(const acldumpChunk*, int32_t); -aclError ACLAPI_aclInit(const char* cfg); -aclError ACLAPI_aclmdlInitDump(); -aclError ACLAPI_aclmdlSetDump(const char* cfg); -aclError ACLAPI_aclmdlFinalizeDump(); -aclError ACLAPI_acldumpRegCallback(AclDumpCallbackFuncType messageCallback, int32_t flag); +using AclDumpCallbackFuncType = int32_t (*)(const AclDumpChunk*, int32_t); +aclError AclApiAclInit(const char* cfg); +aclError AclApiAclmdlInitDump(); +aclError AclApiAclmdlSetDump(const char* cfg); +aclError AclApiAclmdlFinalizeDump(); +aclError AclApiAcldumpRegCallback(AclDumpCallbackFuncType messageCallback, int32_t flag); -aclError ACLAPI_aclrtSynchronizeDevice(); +aclError AclApiAclrtSynchronizeDevice(); -#define CALL_ACL_API(func, ...) MindStudioDebugger::AscendCLApi::ACLAPI_##func(__VA_ARGS__) +#define CALL_ACL_API(func, ...)
MindStudioDebugger::AscendCLApi::AclApi##func(__VA_ARGS__) } } diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.cpp b/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.cpp index 7a6b7b36bff9983b7262d5919e95b15ed4ed4d04..932a2adbc74fd71091c629adbe289c779ca8288e 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.cpp @@ -18,7 +18,7 @@ #include #include -#include "CPythonUtils.hpp" +#include "CPythonUtils.h" namespace MindStudioDebugger { namespace CPythonUtils { @@ -77,7 +77,6 @@ PythonObject PythonObject::From(const uint32_t& input) PythonObject PythonObject::From(const double& input) { return PythonNumberObject::From(input); - } PythonObject PythonObject::From(const std::string& input) { @@ -203,7 +202,7 @@ PythonObject PythonObject::Call(PythonTupleObject& args, PythonDictObject& kwarg if (args.IsNone() || kwargs.IsNone()) { if (!ignore) { PyErr_SetString(PyExc_TypeError, "Call python object with invalid parameters."); - } + } return PythonObject(); } @@ -227,7 +226,6 @@ PythonObject PythonObject::GetGlobal(const std::string& name, bool ignore) } return PythonObject(PyDict_GetItemString(globals, name.c_str())); - } PythonObject PythonObject::Import(const std::string& name, bool ignore) noexcept @@ -483,7 +481,7 @@ PythonTupleObject::PythonTupleObject() : PythonObject() PythonTupleObject::PythonTupleObject(PyObject* o) : PythonObject() { - if (!PyTuple_Check(o)) { + if (!o || !PyTuple_Check(o)) { return; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.hpp b/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.h similarity index 91% rename from debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.hpp rename to debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.h index f000bb11d6adc7a0d98433c9dd730c7d9ccb75de..db5153139362c07548e39fc6d17047673e2a4dd6 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.hpp +++ 
b/debug/accuracy_tools/msprobe/ccsrc/utils/CPythonUtils.h @@ -40,14 +40,14 @@ namespace CPythonUtils { * | tuple | PythonTupleObject | * | dict | PythonDictObject | * ------------------------------------------- - * + * * 创建对象的方式: * 1、通过原生PyObject*类型创建,PythonObject生命周期内会持有原生对象的一个引用 * 2、通过From方法从c++对象创建 * 3、通过GetGlobal、Import等方法从解释器上下文获取 * 4、通过GetRegisteredPyObj获取到上下文的python对象 * 5、通过已有PythonObject对象的Get、GetItem等方法获取子对象 - * + * * 对象转换: * 1、对于转换成PyObject*、bool、string的场景,支持隐式转换 * 2、对于非通用类型转换,调用To方法,返回0表示成功 @@ -56,7 +56,7 @@ namespace CPythonUtils { * python维度支持bool()的都可以转bool(即并非只有bool类型支持转换,下同) * 支持str()的都可以转string * 可迭代对象(且元素支持转换)都可以转vector - * + * * 对象传递: * 1、子类可以安全传递或拷贝给PythonObject对象 * 2、PythonObject传给子类时,若类型匹配,可以安全转递,否则会转为None @@ -81,7 +81,8 @@ PythonObject GetRegisteredPyObj(const std::string& name); class PythonObject { public: - PythonObject() { + PythonObject() + { Py_INCREF(Py_None); ptr = Py_None; } @@ -91,19 +92,21 @@ public: } Py_XINCREF(ptr); } - ~PythonObject() { + ~PythonObject() + { Py_XDECREF(ptr); } explicit PythonObject(const PythonObject &obj) : PythonObject(static_cast(obj)) {} - PythonObject& operator=(const PythonObject &obj) { + PythonObject& operator=(const PythonObject &obj) + { SetPtr(static_cast(obj)); return *this; } /* 获取全局对象 */ - static PythonObject GetGlobal(const std::string& name, bool ignore=true); + static PythonObject GetGlobal(const std::string& name, bool ignore = true); /* 获取模块对象;若其还未加载至缓存,则加载一遍 */ - static PythonObject Import (const std::string& name, bool ignore=true) noexcept; + static PythonObject Import (const std::string& name, bool ignore = true) noexcept; /* From/To转换,统一放一份在基类,用于遍历迭代器等场景 */ static PythonObject From(const PythonObject& input); @@ -136,17 +139,19 @@ public: bool IsCallable() const {return PyCallable_Check(ptr);} /* 用于调用可调用对象,相当于python代码中的obj(),为了简单只实现了args+kwargs参数形式 */ - PythonObject Call(bool ignore=true) noexcept; - PythonObject Call(PythonTupleObject& args, bool ignore=true) noexcept; - PythonObject 
Call(PythonTupleObject& args, PythonDictObject& kwargs, bool ignore=true) noexcept; + PythonObject Call(bool ignore = true) noexcept; + PythonObject Call(PythonTupleObject& args, bool ignore = true) noexcept; + PythonObject Call(PythonTupleObject& args, PythonDictObject& kwargs, bool ignore = true) noexcept; /* 用于获取对象属性,相当于python代码中的obj.xx */ - PythonObject Get(const std::string& name, bool ignore=true) const; - PythonObject& NewRef() { + PythonObject Get(const std::string& name, bool ignore = true) const; + PythonObject& NewRef() + { Py_XINCREF(ptr); return *this; } - std::string ToString() const { + std::string ToString() const + { std::string ret; if (To(ret) == 0) { return ret; @@ -156,21 +161,24 @@ public: operator PyObject*() const {return ptr;} operator bool() const {return static_cast(PyObject_IsTrue(ptr));} - operator std::string() const { + operator std::string() const + { return ToString(); } - PythonObject operator()(bool ignore=true) {return Call(ignore);} - PythonObject operator()(PythonTupleObject& args, bool ignore=true) {return Call(args, ignore);} - PythonObject operator()(PythonTupleObject& args, PythonDictObject& kwargs, bool ignore=true) { + PythonObject operator()(bool ignore = true) {return Call(ignore);} + PythonObject operator()(PythonTupleObject& args, bool ignore = true) {return Call(args, ignore);} + PythonObject operator()(PythonTupleObject& args, PythonDictObject& kwargs, bool ignore = true) + { return Call(args, kwargs, ignore); } protected: - void SetPtr(PyObject* o) { + void SetPtr(PyObject* o) + { Py_XDECREF(ptr); if (o == nullptr) { o = Py_None; - } + } Py_INCREF(o); ptr = o; } @@ -220,11 +228,11 @@ public: size_t Size() const; template - PythonListObject& Append(T value, bool ignore=true); - PythonObject GetItem(size_t pos, bool ignore=true); - PythonListObject& SetItem(size_t pos, PythonObject& item, bool ignore=true); - PythonListObject& Insert(int64_t pos, PythonObject& item, bool ignore=true); - PythonTupleObject ToTuple(bool 
ignore=true); + PythonListObject& Append(T value, bool ignore = true); + PythonObject GetItem(size_t pos, bool ignore = true); + PythonListObject& SetItem(size_t pos, PythonObject& item, bool ignore = true); + PythonListObject& Insert(int64_t pos, PythonObject& item, bool ignore = true); + PythonTupleObject ToTuple(bool ignore = true); }; class PythonTupleObject : public PythonObject { @@ -236,7 +244,7 @@ public: static PythonTupleObject From(const std::vector& input); size_t Size() const; - PythonObject GetItem(size_t pos, bool ignore=true); + PythonObject GetItem(size_t pos, bool ignore = true); }; class PythonDictObject : public PythonObject { @@ -248,11 +256,11 @@ public: static PythonDictObject From(const std::map& input); template - PythonDictObject& Add(T1 key, T2 value, bool ignore=true); + PythonDictObject& Add(T1 key, T2 value, bool ignore = true); template - PythonDictObject& Delete(T key, bool ignore=true); + PythonDictObject& Delete(T key, bool ignore = true); template - PythonObject GetItem(T key, bool ignore=true); + PythonObject GetItem(T key, bool ignore = true); }; /**************************************************************************************************/ diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.cpp b/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.cpp index 23088d48e31a15af69bdc19939e490b27c4a50a6..c6744b68e67285e3cabebde1462f3305d2f863a4 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.cpp @@ -21,19 +21,21 @@ #include #include -#include "DataUtils.hpp" +#include "DataUtils.h" namespace MindStudioDebugger { namespace DataUtils { -int64_t SizeToS64(size_t v) { +int64_t SizeToS64(size_t v) +{ if (v > static_cast(INT64_MAX)) { throw std::runtime_error("Value " + std::to_string(v) + "exceeds the maximum value of int64."); } return static_cast(v); } -std::string U64ToHexString(uint64_t v) { +std::string U64ToHexString(uint64_t v) +{ 
std::stringstream ss; ss << "0x" << std::hex << std::uppercase << v; return std::move(ss.str()); @@ -42,33 +44,33 @@ std::string U64ToHexString(uint64_t v) { BFloat16::BFloat16(float f32) { if (std::isnan(f32)) { - value_ = BFloat16::nan_value; + value_ = BFloat16::NAN_VALUE; } else { + constexpr uint8_t offsetSize = 16; union { - uint32_t U32; - float F32; + uint32_t u32Value; + float f32Value; }; - F32 = f32; - uint32_t rounding_bias = ((U32 >> 16) & 1) + UINT32_C(0x7FFF); - value_ = static_cast((U32 + rounding_bias) >> 16); + f32Value = f32; + uint32_t rounding_bias = ((u32Value >> offsetSize) & 1) + UINT32_C(0x7FFF); + value_ = static_cast((u32Value + rounding_bias) >> offsetSize); } } BFloat16::operator float() const { /* 为了兼容性,不要用c++20的bit_cast */ - union - { + constexpr uint8_t offsetSize = 16; + union { float f32; uint32_t ui32; }; - ui32 = static_cast(value_); - ui32 <<= 16; // 将ui32左移16位 + ui32 <<= offsetSize; // 将ui32左移16位 return f32; } -const static std::unordered_map kTypeSizeMap = { +constexpr std::pair TYPE_SIZE_ARRAY[] = { {DataType::DT_BOOL, 1}, {DataType::DT_INT8, 1}, {DataType::DT_UINT8, 1}, @@ -88,15 +90,16 @@ const static std::unordered_map kTypeSizeMap = { size_t SizeOfDType(DataType type) { - auto it = kTypeSizeMap.find(type); - if (it == kTypeSizeMap.end()) { - return 0; + for (const auto& pair : TYPE_SIZE_ARRAY) { + if (pair.first == type) { + return pair.second; + } } - return it->second; + return 0; } -constexpr auto kOpDType_UNKNOWN = "UNKNOWN"; -const static std::unordered_map kDDTypeToStringMap = { +constexpr auto OP_DTYPE_UNKNOWN = "UNKNOWN"; +const std::pair DTYPE_TO_STRING_ARRAY[] = { {DataType::DT_UNDEFINED, "UNDEFINED"}, {DataType::DT_FLOAT, "FLOAT"}, {DataType::DT_FLOAT16, "FLOAT16"}, @@ -133,15 +136,16 @@ const static std::unordered_map kDDTypeToStringMap = { std::string GetDTypeString(DataType dtype) { - auto it = kDDTypeToStringMap.find(dtype); - if (it != kDDTypeToStringMap.end()) { - return it->second; + for (const auto& 
pair : DTYPE_TO_STRING_ARRAY) { + if (pair.first == dtype) { + return std::string(pair.second); + } } - return kOpDType_UNKNOWN; + return OP_DTYPE_UNKNOWN; } -constexpr auto kOpFormat_UNKNOWN = "UNKNOWN"; -const static std::unordered_map kFormatToStringMap = { +constexpr auto OP_FORMAT_UNKNOWN = "UNKNOWN"; +const std::pair FORMAT_TO_STRING_ARRAY[] = { {TensorFormat::FORMAT_NCHW, "NCHW"}, {TensorFormat::FORMAT_NHWC, "NHWC"}, {TensorFormat::FORMAT_ND, "ND"}, @@ -167,7 +171,7 @@ const static std::unordered_map kFormatToStringMap = {TensorFormat::FORMAT_HASHTABLE_LOOKUP_VALUE, "HASHTABLE_LOOKUP_VALUE"}, {TensorFormat::FORMAT_HASHTABLE_LOOKUP_OUTPUT, "HASHTABLE_LOOKUP_OUTPUT"}, {TensorFormat::FORMAT_HASHTABLE_LOOKUP_HITS, "HASHTABLE_LOOKUP_HITS"}, - {TensorFormat::FORMAT_C1HWNCoC0, "C1HWNCoC0"}, + {TensorFormat::FORMAT_C1HWNCOC0, "C1HWNCoC0"}, {TensorFormat::FORMAT_MD, "MD"}, {TensorFormat::FORMAT_NDHWC, "NDHWC"}, {TensorFormat::FORMAT_FRACTAL_ZZ, "FRACTAL_ZZ"}, @@ -196,11 +200,12 @@ const static std::unordered_map kFormatToStringMap = std::string GetFormatString(TensorFormat fmt) { - auto it = kFormatToStringMap.find(fmt); - if (it != kFormatToStringMap.end()) { - return it->second; + for (const auto& pair : FORMAT_TO_STRING_ARRAY) { + if (pair.first == fmt) { + return std::string(pair.second); + } } - return kOpFormat_UNKNOWN; + return OP_FORMAT_UNKNOWN; } std::string GetShapeString(const TensorShape& shape) diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.hpp b/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.h similarity index 90% rename from debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.hpp rename to debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.h index f58e15a8c77719f62ddeef8ebbcd25a5b5ebf624..35f9ae4f242f8575ea98d86a3da95b381c63fbef 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/DataUtils.h @@ -24,11 +24,11 @@ namespace MindStudioDebugger { namespace DataUtils { -inline 
uint64_t UnpackUint64Value_Le(const void* data) +inline uint64_t UnpackUint64ValueLe(const void* data) { return le64toh(*reinterpret_cast(data)); } -inline uint64_t UnpackUint64Value_Be(const void* data) +inline uint64_t UnpackUint64ValueBe(const void* data) { return be64toh(*reinterpret_cast(data)); } @@ -38,11 +38,11 @@ std::string U64ToHexString(uint64_t v); class BFloat16 { public: - static constexpr uint16_t value_mask = 0x7fff; - static constexpr uint16_t inf_value = 0x7f80; - static constexpr uint16_t nan_value = 0x7fc0; - static constexpr uint16_t true_value = 0x3c00; - static constexpr uint32_t f32_inf_value = 0x7f800000; + static constexpr uint16_t VALUE_MASK = 0x7fff; + static constexpr uint16_t INF_VALUE = 0x7f80; + static constexpr uint16_t NAN_VALUE = 0x7fc0; + static constexpr uint16_t TRUE_VALUE = 0x3c00; + static constexpr uint32_t F32_INF_VALUE = 0x7f800000; BFloat16() = default; ~BFloat16() = default; @@ -51,7 +51,7 @@ public: BFloat16 &operator=(const BFloat16 &other) noexcept = default; BFloat16 &operator=(BFloat16 &&other) noexcept = default; - explicit BFloat16(float f); + explicit BFloat16(float f32); explicit operator float() const; BFloat16 operator+(const BFloat16& other) const { return BFloat16(static_cast(*this) + static_cast(other)); } @@ -131,7 +131,7 @@ enum TensorFormat : int { FORMAT_HASHTABLE_LOOKUP_VALUE = 22, FORMAT_HASHTABLE_LOOKUP_OUTPUT = 23, FORMAT_HASHTABLE_LOOKUP_HITS = 24, - FORMAT_C1HWNCoC0 = 25, + FORMAT_C1HWNCOC0 = 25, FORMAT_MD = 26, FORMAT_NDHWC = 27, FORMAT_FRACTAL_ZZ = 28, diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.cpp b/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.cpp index 7f025e568abdfe95830902d1e72bdb77300f7de5..d8861e5b0c766f1254063bdda568c6f9b2e21ef4 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.cpp @@ -18,9 +18,9 @@ #include #include -#include "FileUtils.hpp" -#include "DataUtils.hpp" 
-#include "FileOperation.hpp" +#include "FileUtils.h" +#include "DataUtils.h" +#include "FileOperation.h" namespace MindStudioDebugger { namespace FileOperation { @@ -34,7 +34,8 @@ struct NpyDtypeDescr { char type; size_t length; - std::string str() const { + std::string Str() const + { std::ostringstream buffer; buffer << "\'" << byteorder << type << length << "\'"; return buffer.str(); @@ -42,9 +43,9 @@ struct NpyDtypeDescr { }; // npy file header start information -constexpr char kNpyMagicPrefix[] = "\x93NUMPY"; -constexpr size_t kNpyMagicLen = sizeof(kNpyMagicPrefix) - 1; -constexpr size_t kNpyArrayAlign = 64; +constexpr char NPY_MAGIC_PREFIX[] = "\x93NUMPY"; +constexpr size_t NPY_MAGIC_LEN = sizeof(NPY_MAGIC_PREFIX) - 1; +constexpr size_t NPY_ARRAY_ALIGN = 64; static const std::unordered_map npyTypeDescMap = { {DataType::DT_BOOL, NpyDtypeDescr{'|', 'b', 1}}, {DataType::DT_INT8, NpyDtypeDescr{'|', 'i', 1}}, {DataType::DT_INT16, NpyDtypeDescr{'<', 'i', 2}}, {DataType::DT_INT32, NpyDtypeDescr{'<', 'i', 4}}, @@ -90,7 +91,8 @@ inline static std::string NpyTransShapeToStr(const DataUtils::TensorShape &shape return buffer.str(); } -inline static std::vector NpyLen2Bytes(size_t length, size_t lengthLen) { +inline static std::vector NpyLen2Bytes(size_t length, size_t lengthLen) +{ std::vector buff; lengthLen = std::min(lengthLen, static_cast(sizeof(length))); for (size_t i = 0; i < lengthLen; i++) { @@ -100,7 +102,8 @@ inline static std::vector NpyLen2Bytes(size_t length, size_t lengthLen) { return buff; } -static std::string GenerateNpyHeader(const DataUtils::TensorShape &shape, DataUtils::DataType dt, bool fortranOrder=false) +static std::string GenerateNpyHeader(const DataUtils::TensorShape &shape, + DataUtils::DataType dt, bool fortranOrder = false) { auto typeDesc = npyTypeDescMap.find(dt); if (typeDesc == npyTypeDescMap.end()) { @@ -111,7 +114,7 @@ static std::string GenerateNpyHeader(const DataUtils::TensorShape &shape, DataUt std::string fortranOrderStr = 
fortranOrder ? "True" : "False" ; buffer << "{"; - buffer << "'descr': " << typeDesc->second.str() << ", "; + buffer << "'descr': " << typeDesc->second.Str() << ", "; buffer << "'fortran_order': " << fortranOrderStr << ", "; buffer << "'shape': " << NpyTransShapeToStr(shape) << ", "; buffer << "}"; @@ -125,19 +128,19 @@ static std::string GenerateNpyHeader(const DataUtils::TensorShape &shape, DataUt constexpr const size_t lengthLenV2 = 4; size_t lengthLen = lengthLenV1; - size_t totalLen = kNpyMagicLen + versionLen + lengthLen + headerLen + 1; + size_t totalLen = NPY_MAGIC_LEN + versionLen + lengthLen + headerLen + 1; if (totalLen > maxLen) { version = {2, 0}; lengthLen = lengthLenV2; - totalLen = kNpyMagicLen + versionLen + lengthLen + headerLen + 1; + totalLen = NPY_MAGIC_LEN + versionLen + lengthLen + headerLen + 1; } - const size_t padLen = kNpyArrayAlign - totalLen % kNpyArrayAlign; + const size_t padLen = NPY_ARRAY_ALIGN - totalLen % NPY_ARRAY_ALIGN; const size_t paddingHeaderLen = headerLen + padLen + 1; const std::string padding(padLen, ' '); std::vector lengthBytes = NpyLen2Bytes(paddingHeaderLen, lengthLen); std::ostringstream out; - out.write(kNpyMagicPrefix, DataUtils::SizeToS64(kNpyMagicLen)); + out.write(NPY_MAGIC_PREFIX, DataUtils::SizeToS64(NPY_MAGIC_LEN)); out.put(version.first); out.put(version.second); out.write(lengthBytes.data(), DataUtils::SizeToS64(lengthBytes.size())); diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.hpp b/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.h similarity index 95% rename from debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.hpp rename to debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.h index 3f89263ae3621d33f5bbc8a67e86887d8063067e..1560a1a6dba353f2e0122a639e46fa4c87195bba 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/FileOperation.h @@ -18,8 +18,8 @@ #include -#include "include/ErrorCode.hpp" 
-#include "DataUtils.hpp" +#include "include/ErrorCode.h" +#include "DataUtils.h" namespace MindStudioDebugger { diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.cpp b/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.cpp index 8c3cd20883d26d68a5e3504bec47a9c3d76d3023..fddd4e28721c1cb9c0a8e61503743ced27022012 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.cpp @@ -27,8 +27,8 @@ #include #include -#include "include/ErrorCode.hpp" -#include "FileUtils.hpp" +#include "include/ErrorCode.h" +#include "FileUtils.h" /* 部分环境上c++版本比较老,这里不用filesystem库实现 */ @@ -38,7 +38,8 @@ namespace FileUtils { using namespace MindStudioDebugger; /********************* 基础检查函数库,不做过多校验,路径有效性由调用者保证 ******************/ -bool IsPathExist(const std::string& path) { +bool IsPathExist(const std::string& path) +{ struct stat buffer; return (stat(path.c_str(), &buffer) == 0); } @@ -60,7 +61,7 @@ static std::string GetFullPath(const std::string &originPath) } cwd = cwdBuf; - std::string fullPath = std::move(cwd + pathSeparator + originPath); + std::string fullPath = std::move(cwd + PATH_SEPARATOR + originPath); return fullPath; } @@ -84,7 +85,8 @@ std::vector SplitPath(const std::string &path, char separator) return tokens; } -std::string GetAbsPath(const std::string &originPath) { +std::string GetAbsPath(const std::string &originPath) +{ std::string fullPath = GetFullPath(originPath); if (fullPath.empty()) { return ""; @@ -118,7 +120,8 @@ std::string GetAbsPath(const std::string &originPath) { return resolvedPath; } -bool IsDir(const std::string& path) { +bool IsDir(const std::string& path) +{ struct stat buffer; if (stat(path.c_str(), &buffer) == 0) { return (buffer.st_mode & S_IFDIR) != 0; @@ -126,15 +129,17 @@ bool IsDir(const std::string& path) { return false; } -bool IsRegularFile(const std::string& path) { - struct stat path_stat; - if (stat(path.c_str(), &path_stat) == 0) { - return 
S_ISREG(path_stat.st_mode); +bool IsRegularFile(const std::string& path) +{ + struct stat pathStat; + if (stat(path.c_str(), &pathStat) == 0) { + return S_ISREG(pathStat.st_mode); } return false; } -bool IsFileSymbolLink(const std::string& path) { +bool IsFileSymbolLink(const std::string& path) +{ struct stat buffer; if (lstat(path.c_str(), &buffer) == 0) { if (S_ISLNK(buffer.st_mode)) { @@ -144,7 +149,8 @@ bool IsFileSymbolLink(const std::string& path) { return false; } -bool IsPathCharactersValid(const std::string& path) { +bool IsPathCharactersValid(const std::string& path) +{ for (const char& ch : path) { if (!std::isalnum(ch) && ch != '_' && ch != '.' && ch != ':' && ch != '/' && ch != '-') { return false; @@ -243,14 +249,14 @@ bool IsPathLengthLegal(const std::string& path) bool IsPathDepthValid(const std::string& path) { - return std::count(path.begin(), path.end(), pathSeparator) <= PATH_DEPTH_MAX; + return std::count(path.begin(), path.end(), PATH_SEPARATOR) <= PATH_DEPTH_MAX; } bool IsFileOwner(const std::string& path) { - struct stat file_stat; - if (stat(path.c_str(), &file_stat) == 0) { - if (file_stat.st_uid == getuid()) { + struct stat fileStat; + if (stat(path.c_str(), &fileStat) == 0) { + if (fileStat.st_uid == getuid()) { return true; } } @@ -306,7 +312,6 @@ static DebuggerErrno DeleteDirRec(const std::string &path, uint32_t depth) closedir(dir); return DebuggerErrno::ERROR_ILLEGAL_FILE_TYPE; } - } closedir(dir); @@ -321,7 +326,8 @@ static DebuggerErrno DeleteDirRec(const std::string &path, uint32_t depth) return DebuggerErrno::OK; } -DebuggerErrno DeleteDir(const std::string &path, bool recursion) { +DebuggerErrno DeleteDir(const std::string &path, bool recursion) +{ if (!IsPathExist(path)) { return DebuggerErrno::OK; } @@ -340,7 +346,8 @@ DebuggerErrno DeleteDir(const std::string &path, bool recursion) { return DebuggerErrno::OK; } -static DebuggerErrno CreateDirAux(const std::string& path, bool recursion, mode_t mode) { +static DebuggerErrno 
CreateDirAux(const std::string& path, bool recursion, mode_t mode) +{ std::string parent = GetParentDir(path); DebuggerErrno ret; @@ -404,16 +411,17 @@ DebuggerErrno Chmod(const std::string& path, const mode_t& mode) return chmod(absPath.c_str(), mode) == 0 ? DebuggerErrno::OK : DebuggerErrno::ERROR_SYSCALL_FAILED; } -DebuggerErrno GetFileSize(const std::string &path, size_t& size) { - struct stat path_stat; - if (stat(path.c_str(), &path_stat) != 0) { +DebuggerErrno GetFileSize(const std::string &path, size_t& size) +{ + struct stat pathStat; + if (stat(path.c_str(), &pathStat) != 0) { return DebuggerErrno::ERROR_FILE_NOT_EXISTS; } - if (!S_ISREG(path_stat.st_mode)) { + if (!S_ISREG(pathStat.st_mode)) { return DebuggerErrno::ERROR_ILLEGAL_FILE_TYPE; } - size = static_cast(path_stat.st_size); + size = static_cast(pathStat.st_size); return DebuggerErrno::OK; } diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.hpp b/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.h similarity index 85% rename from debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.hpp rename to debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.h index f1defd092c9630a4e32878b520a26242bae0116d..e3814ad7cdf2a6f41e9193849f10c6a2fd6e0d92 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/FileUtils.h @@ -23,11 +23,11 @@ #include #include -#include "include/ErrorCode.hpp" +#include "include/ErrorCode.h" namespace MindStudioDebugger { -constexpr const char pathSeparator = '/'; +constexpr const char PATH_SEPARATOR = '/'; constexpr const uint32_t FULL_PATH_LENGTH_MAX = 4096; constexpr const uint32_t FILE_NAME_LENGTH_MAX = 255; constexpr const uint32_t PATH_DEPTH_MAX = 32; @@ -64,8 +64,8 @@ constexpr const uint32_t FILE_NAME_MAX = 255; /* 基础检查函数库,不做过多校验,路径有效性由调用者保证 */ bool IsPathExist(const std::string& path); -std::vector SplitPath(const std::string &path, char separator=pathSeparator); -std::string GetAbsPath(const std::string 
&path); +std::vector SplitPath(const std::string &path, char separator = PATH_SEPARATOR); +std::string GetAbsPath(const std::string &originpath); bool IsDir(const std::string& path); bool IsRegularFile(const std::string& path); bool IsFileSymbolLink(const std::string& path); @@ -85,19 +85,19 @@ bool IsFileOwner(const std::string& path); /* 文件操作函数库,会对入参做基本检查 */ DebuggerErrno DeleteFile(const std::string &path); -DebuggerErrno DeleteDir(const std::string &path, bool recursion=false); -DebuggerErrno CreateDir(const std::string &path, bool recursion=false, mode_t mode=NORMAL_DIR_MODE_DEFAULT); +DebuggerErrno DeleteDir(const std::string &path, bool recursion = false); +DebuggerErrno CreateDir(const std::string &path, bool recursion = false, mode_t mode = NORMAL_DIR_MODE_DEFAULT); DebuggerErrno Chmod(const std::string& path, const mode_t& mode); DebuggerErrno GetFileSize(const std::string &path, size_t& size); -DebuggerErrno OpenFile(const std::string& path, std::ifstream& ifs, std::ios::openmode mode=std::ios::in); -DebuggerErrno OpenFile(const std::string& path, std::ofstream& ofs, std::ios::openmode mode=std::ios::out, - mode_t permission=NORMAL_FILE_MODE_DEFAULT); +DebuggerErrno OpenFile(const std::string& path, std::ifstream& ifs, std::ios::openmode mode = std::ios::in); +DebuggerErrno OpenFile(const std::string& path, std::ofstream& ofs, std::ios::openmode mode = std::ios::out, + mode_t permission = NORMAL_FILE_MODE_DEFAULT); /* 通用检查函数 */ DebuggerErrno CheckFileSuffixAndSize(const std::string &path, FileType type); DebuggerErrno CheckDirCommon(const std::string &path); DebuggerErrno CheckFileBeforeRead(const std::string &path, const std::string& authority="r", - FileType type=FileType::COMMON); -DebuggerErrno CheckFileBeforeCreateOrWrite(const std::string &path, bool overwrite=false); + FileType type = FileType::COMMON); +DebuggerErrno CheckFileBeforeCreateOrWrite(const std::string &path, bool overwrite = false); } } \ No newline at end of file diff --git 
a/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.cpp b/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.cpp index 27111d60c9f86f2ae9b2b2a00b804ab886917755..1c1a4e96965e014f038d85636627c7b2ec185814 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.cpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.cpp @@ -68,13 +68,13 @@ std::string CalculateMD5(const uint8_t* data, size_t length) unsigned char digest[MD5_DIGEST_LENGTH]; MD5_Final(digest, &md5ctx); - static const char hexchar[] = "0123456789abcdef"; + static const char HEX_CHAR[] = "0123456789abcdef"; constexpr const uint8_t hexbase = 16; constexpr const size_t byteToStrWidth = 2; char md5string[MD5_DIGEST_LENGTH * byteToStrWidth + 1]; for (int i = 0; i < MD5_DIGEST_LENGTH; i++) { - md5string[i * byteToStrWidth] = hexchar[digest[i] / hexbase]; - md5string[i * byteToStrWidth + 1] = hexchar[digest[i] % hexbase]; + md5string[i * byteToStrWidth] = HEX_CHAR[digest[i] / hexbase]; + md5string[i * byteToStrWidth + 1] = HEX_CHAR[digest[i] % hexbase]; } md5string[sizeof(md5string) - 1] = '\0'; diff --git a/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.hpp b/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.h similarity index 88% rename from debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.hpp rename to debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.h index 141471ac8ce284ac1a7ab4b6db59f5d0da9a9fe2..accbee3187f02ba81e2eaf4e550a36200f52ec58 100644 --- a/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.hpp +++ b/debug/accuracy_tools/msprobe/ccsrc/utils/MathUtils.h @@ -23,7 +23,8 @@ namespace MindStudioDebugger { namespace MathUtils { template -T Gcd(T a, T b) { +T Gcd(T a, T b) +{ if (a == 0 || b == 0) { return 0; } @@ -37,7 +38,8 @@ T Gcd(T a, T b) { } template -T Lcm(T a, T b) { +T Lcm(T a, T b) +{ if (a == 0 || b == 0) { return 0; } @@ -46,7 +48,8 @@ T Lcm(T a, T b) { } template -T DivCeil(T v, T divisor) { +T DivCeil(T v, T divisor) +{ if (divisor == 0) { return 0; } @@ -56,13 
+59,13 @@ T DivCeil(T v, T divisor) { template T AlignCeil(T v, T block) { - return DivCeil(v, block) * block; + return DivCeil(v, block) * block; } float Random(); float Random(float floor, float ceil); int32_t RandomInt(int32_t floor, int32_t ceil); -std::string RandomString(uint32_t len, char min=' ', char max='~'); +std::string RandomString(uint32_t len, char min = ' ', char max = '~'); std::string CalculateMD5(const uint8_t* data, size_t length); diff --git a/debug/accuracy_tools/msprobe/config.json b/debug/accuracy_tools/msprobe/config.json index 9bf9579b80770210bdda668b782a41540e7cb763..553b7f9ee3b89215647b00fb14b70af44ea5f00c 100644 --- a/debug/accuracy_tools/msprobe/config.json +++ b/debug/accuracy_tools/msprobe/config.json @@ -25,9 +25,7 @@ "run_ut": { "white_list": [], "black_list": [], - "error_data_path": "./", - "master_ip": "127.0.0.1", - "master_port": "8888" + "error_data_path": "./" }, "grad_probe": { "grad_level": "L1", diff --git a/debug/accuracy_tools/msprobe/core/common/const.py b/debug/accuracy_tools/msprobe/core/common/const.py index 9d2532767b9b87fa83fbe63fa2b4dda512cddeff..582e0dd2a854be147a81865a2b0a39bc13e55fef 100644 --- a/debug/accuracy_tools/msprobe/core/common/const.py +++ b/debug/accuracy_tools/msprobe/core/common/const.py @@ -27,8 +27,6 @@ class Const: ipv4_pattern = "([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])(\.([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){3}$" SEP = "." - COLON = ":" - DOUBLE_SLASH = "//" REGEX_PREFIX_MAX_LENGTH = 20 REGEX_PREFIX_PATTERN = r"^[a-zA-Z0-9_-]+$" REGEX_FORWARD_BACKWARD = r'\.(forward|backward)\.' 
@@ -72,7 +70,7 @@ class Const: SUMMARY = "summary" MD5 = "md5" VALUE = "value" - SUMMARY_MODE = [ALL, SUMMARY, MD5] + SUMMARY_MODE = ["statistics", "md5"] WRITE_FLAGS = os.O_WRONLY | os.O_CREAT WRITE_MODES = stat.S_IWUSR | stat.S_IRUSR @@ -99,6 +97,7 @@ class Const: GRAD_OUTPUT = 'grad_output' PARAMS = 'parameters' PARAMS_GRAD = 'parameters_grad' + DEBUG = 'debug' START = "start" STOP = "stop" ENV_ENABLE = "1" @@ -136,6 +135,7 @@ class Const: NPU = 'NPU' NPU_LOWERCASE = 'npu' CPU_LOWERCASE = 'cpu' + GPU_LOWERCASE = 'gpu' CUDA_LOWERCASE = 'cuda' DEVICE = 'device' DISTRIBUTED = 'Distributed' @@ -170,7 +170,6 @@ class Const: LEFT_MOVE_INDEX = -1 RIGHT_MOVE_INDEX = 1 LAST_INDEX = -1 - MAX_TRAVERSAL_DEPTH = 5 TOP_LAYER = "TopLayer" CELL = "Cell" @@ -196,7 +195,11 @@ class Const: FILL_CHAR_NUMS = 50 TOOL_ENDS_SUCCESSFULLY = f"{TOOL_NAME} ends successfully." + WITHOUT_CALL_STACK = "The call stack retrieval failed." + STACK_FILTER_KEYWORDS = ["msprobe/core", "msprobe/pytorch", "msprobe/mindspore"] + CALL_STACK_FLAG = "data_dump/api_registry" + NEW_STACK_FLAG = "0" STEP = "step" RANK = "rank" @@ -217,6 +220,7 @@ class Const: TYPE = 'type' DTYPE = 'dtype' SHAPE = 'shape' + STACK_INFO = 'stack_info' MAX = 'Max' MIN = 'Min' MEAN = 'Mean' @@ -338,6 +342,35 @@ class Const: } } + def _fused_adamw_( + self, + grads, + exp_avgs, + exp_avg_sqs, + max_exp_avg_sqs, + state_steps, + *, + lr, + beta1, + beta2, + weight_decay, + eps, + amsgrad, + maximize, + grad_scale=None, + found_inf=None + ): + pass + + API_WITH_SELF_ARG = { + 'Torch._fused_adamw_': _fused_adamw_ + } + + ASCEND = "ASCEND" + MATCH_MODE_NAME = "pure name" + MATCH_MODE_MAPPING = "mapping" + MATCH_MODE_SIMILARITY = "similarity" + class CompareConst: """ @@ -496,13 +529,6 @@ class CompareConst: Const.PARAMS_GRAD: PARAMS_GRAD_STRUCT } - STRUCT_COMPARE_KEY = [ - INPUT_STRUCT, - OUTPUT_STRUCT, - PARAMS_STRUCT, - PARAMS_GRAD_STRUCT - ] - # compare standard HUNDRED_RATIO_THRESHOLD = 0.01 THOUSAND_RATIO_THRESHOLD = 0.001 @@ 
-581,15 +607,35 @@ class CompareConst: MAX_DIFF: None, MIN_DIFF: None, MEAN_DIFF: None, NORM_DIFF: None, MAX_RELATIVE_ERR: None, MIN_RELATIVE_ERR: None, MEAN_RELATIVE_ERR: None, NORM_RELATIVE_ERR: None } + + API_MAPPING_KEYS_TO_COMPARE = [ + ('ms_args', 'pt_args'), + ('ms_outputs', 'pt_outputs'), + ('ms_parameters', 'pt_parameters'), + ('ms_parameters_grad', 'pt_parameters_grad') + ] + INPUT_PATTERN = Const.SEP + Const.INPUT + Const.SEP KWARGS_PATTERN = Const.SEP + Const.KWARGS + Const.SEP OUTPUT_PATTERN = Const.SEP + Const.OUTPUT + Const.SEP PARAMS_PATTERN = Const.SEP + Const.PARAMS + Const.SEP PARAMS_GRAD_PATTERN = Const.SEP + Const.PARAMS_GRAD + Const.SEP - COMPARE_KEY = 'compare_key' - COMPARE_SHAPE = 'compare_shape' + + CMP_KEY = 'compare_key' + CMP_SHAPE = 'compare_shape' + + OP_NAME_X = 'op_name_x' + MATCH_RESULT_COLUMNS = [ + OP_NAME_X, 'dtype_x', 'shape_x', 'summary_x', 'stack_info_x', 'data_name_x', + CMP_KEY, CMP_SHAPE, + 'op_name_y', 'dtype_y', 'shape_y', 'summary_y', 'stack_info_y', 'data_name_y', + ] + INTERNAL_API_MAPPING_FILE = 'ms_to_pt_api.yaml' UNREADABLE = 'unreadable data' + NPU_DUMP_DATA_DIR = 'npu_dump_data_dir' + BENCH_DUMP_DATA_DIR = 'bench_dump_data_dir' + NO_REAL_DATA_FLAG = '-1' class FileCheckConst: @@ -612,6 +658,7 @@ class FileCheckConst: YAML_SUFFIX = ".yaml" IR_SUFFIX = ".ir" ZIP_SUFFIX = ".zip" + SHELL_SUFFIX = ".sh" MAX_PKL_SIZE = 1073741824 # 1 * 1024 * 1024 * 1024 MAX_NUMPY_SIZE = 10737418240 # 10 * 1024 * 1024 * 1024 MAX_JSON_SIZE = 1073741824 # 1 * 1024 * 1024 * 1024 @@ -706,7 +753,7 @@ class MonitorConst: "DeepSpeedZeroOptimizer_Stage3" ) DEEPSPEED_ZERO_OPT_FILTER = "DeepSpeedZeroOptimizer" - RULE_NAME = ['AnomalyTurbulence'] + RULE_NAME = ['AnomalyTurbulence', 'AnomalyNan'] SLICE_SIZE = 20480 # used for name @@ -723,12 +770,13 @@ class MonitorConst: ACTVGRAD = "actv_grad" POST_GRAD = "post_grad" PRE_GRAD = "pre_grad" + PRE_PARAM = "param_origin" + POST_PARAM = "param_updated" ACC_GRAD = "acc_grad" PREFIX_POST = "post" 
PREFIX_PRE = "pre" EXP_AVG = "exp_avg" EXP_AVG_SQ = "exp_avg_sq" - PARAM = "param" CSV_HEADER = ["vpp_stage", "name", "step"] CSV_HEADER_XY = ["vpp_stage", "name", "step", "micro_step"] @@ -741,65 +789,3 @@ class MonitorConst: HEADER_NAME = 'name' MAX_NDIGITS = 20 - - -class DistributedCheckConst: - API_FULL_NAME = "api_full_name" - API_NAME = "api_name" - GROUP = "group" - GROUP_RANKS = "group_ranks" - GROUP_INDEX = "group_index" - SRC = "src" - SRC_INDEX = "src_index" - OP = "op" - SCATTER_LIST = "scatter_list" - TORCH_PROCESS_GROUP = "torch.ProcessGroup" - ALL_ARGS = "all_args" - ALL_KWARGS = "all_kwargs" - RESULT_FILE_PATH = "result_file_path" - BENCHMARK_RESULT = "benchmark_result" - MASTER_IP = "master_ip" - MASTER_PORT = "master_port" - WORLD_SIZE = "world_size" - HCCL = "hccl" - TCP = "tcp" - BROADCAST = "broadcast" - REDUCE = "reduce" - ALL_REDUCE = "all_reduce" - SCATTER = "scatter" - GATHER = "gather" - ALL_GATHER = "all_gather" - ALL_TO_ALL = "all_to_all" - ALL_TO_ALL_SINGLE = "all_to_all_single" - BROADCAST_SRC_INDEX = 1 - FIRST_TENSOR_INDEX = 0 - MAX_CUMSUM_CHECK_NUM = 1000 - - REDOPTYPE_SUM = "RedOpType.SUM" - REDOPTYPE_PRODUCT = "RedOpType.PRODUCT" - REDOPTYPE_MIN = "RedOpType.MIN" - REDOPTYPE_MAX = "RedOpType.MAX" - REDOPTYPE_BAND = "RedOpType.BAND" - REDOPTYPE_BOR = "RedOpType.BOR" - REDOPTYPE_BXOR = "RedOpType.BXOR" - - API_ARGS_INDEX = { - "broadcast": { - "group": 2, - "src": 1 - }, - "reduce": { - "op": 2, - "dst": 1 - }, - "all_reduce": { - "reduce_op": 2 - }, - "scatter": { - "src": 2, - "scatter_list": 1 - }, - "gather": { - "dst": 2 - } - } diff --git a/debug/accuracy_tools/msprobe/core/common/file_utils.py b/debug/accuracy_tools/msprobe/core/common/file_utils.py index 87f2fc52cda4e5a1965e92e28c9e9d4975d1b78a..933c8af0acd36f3d25580c3a676d92cfc37fd561 100644 --- a/debug/accuracy_tools/msprobe/core/common/file_utils.py +++ b/debug/accuracy_tools/msprobe/core/common/file_utils.py @@ -12,10 +12,13 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY 
KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. - +import atexit import csv import fcntl +import io import os +import pickle +from multiprocessing import shared_memory import stat import json import re @@ -33,6 +36,7 @@ from msprobe.core.common.decorator import recursion_depth_decorator from msprobe.core.common.log import logger from msprobe.core.common.exceptions import FileCheckException from msprobe.core.common.const import FileCheckConst +from msprobe.core.common.global_lock import global_lock, is_main_process proc_lock = multiprocessing.Lock() @@ -304,12 +308,13 @@ def check_path_before_create(path): def check_dirpath_before_read(path): path = os.path.realpath(path) dirpath = os.path.dirname(path) - if check_others_writable(dirpath): - logger.warning(f"The directory is writable by others: {dirpath}.") - try: - check_path_owner_consistent(dirpath) - except FileCheckException: - logger.warning(f"The directory {dirpath} is not yours.") + if dedup_log('check_dirpath_before_read', dirpath): + if check_others_writable(dirpath): + logger.warning(f"The directory is writable by others: {dirpath}.") + try: + check_path_owner_consistent(dirpath) + except FileCheckException: + logger.warning(f"The directory {dirpath} is not yours.") def check_file_or_directory_path(path, isdir=False): @@ -820,3 +825,99 @@ def split_zip_file_path(zip_file_path): check_file_suffix(zip_file_path, FileCheckConst.ZIP_SUFFIX) zip_file_path = os.path.realpath(zip_file_path) return os.path.dirname(zip_file_path), os.path.basename(zip_file_path) + + +def dedup_log(func_name, filter_name): + with SharedDict() as shared_dict: + exist_names = shared_dict.get(func_name, set()) + if filter_name in exist_names: + return False + exist_names.add(filter_name) + shared_dict[func_name] = exist_names + return True + + +class SharedDict: + def __init__(self): + self._changed = False + self._dict = None + self._shm = None + 
+ def __enter__(self): + self._load_shared_memory() + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + try: + if self._changed: + data = pickle.dumps(self._dict) + global_lock.acquire() + self._shm.buf[0:len(data)] = bytearray(data) + global_lock.release() + self._shm.close() + except FileNotFoundError: + name = self.get_shared_memory_name() + logger.warning(f'close shared memory {name} failed, shared memory has already been destroyed.') + + def __setitem__(self, key, value): + self._dict[key] = value + self._changed = True + + def __contains__(self, item): + return item in self._dict + + @classmethod + def destroy_shared_memory(cls): + if is_main_process(): + name = cls.get_shared_memory_name() + try: + shm = shared_memory.SharedMemory(create=False, name=name) + shm.close() + shm.unlink() + logger.debug(f'destroy shared memory, name: {name}') + except FileNotFoundError: + logger.warning(f'destroy shared memory {name} failed, shared memory has already been destroyed.') + + @classmethod + def get_shared_memory_name(cls): + if is_main_process(): + return f'shared_memory_{os.getpid()}' + return f'shared_memory_{os.getppid()}' + + def get(self, key, default=None): + return self._dict.get(key, default) + + def _load_shared_memory(self): + name = self.get_shared_memory_name() + try: + self._shm = shared_memory.SharedMemory(create=False, name=name) + except FileNotFoundError: + try: + self._shm = shared_memory.SharedMemory(create=True, name=name, size=1024 * 1024) + data = pickle.dumps({}) + self._shm.buf[0:len(data)] = bytearray(data) + logger.debug(f'create shared memory, name: {name}') + except FileExistsError: + self._shm = shared_memory.SharedMemory(create=False, name=name) + self._safe_load() + + def _safe_load(self): + with io.BytesIO(self._shm.buf[:]) as buff: + try: + self._dict = SafeUnpickler(buff).load() + except Exception as e: + logger.warning(f'shared dict is unreadable, reason: {e}, create new dict.') + self._dict = {} + self._changed = 
True + + +class SafeUnpickler(pickle.Unpickler): + WHITELIST = {'builtins': {'str', 'bool', 'int', 'float', 'list', 'set', 'dict'}} + + def find_class(self, module, name): + if module in self.WHITELIST and name in self.WHITELIST[module]: + return super().find_class(module, name) + raise pickle.PicklingError(f'Unpickling {module}.{name} is illegal!') + + +atexit.register(SharedDict.destroy_shared_memory) diff --git a/debug/accuracy_tools/msprobe/core/common/framework_adapter.py b/debug/accuracy_tools/msprobe/core/common/framework_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..331bd362ad173c12446662d8f11fac5cd423119d --- /dev/null +++ b/debug/accuracy_tools/msprobe/core/common/framework_adapter.py @@ -0,0 +1,140 @@ +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
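Reviewer note on the `SafeUnpickler` added to `file_utils.py` above: the class restricts unpickling to a whitelist of builtin types by overriding `find_class`. The following is a minimal standalone sketch of that technique, not the project's code. The names `ALLOWED`, `RestrictedUnpickler`, and `safe_loads` are hypothetical, and the sketch raises `pickle.UnpicklingError`, whereas the diff raises `pickle.PicklingError`.

```python
import io
import pickle

# Hypothetical whitelist, mirroring the module -> names shape used in the diff.
ALLOWED = {"builtins": {"str", "bool", "int", "float", "list", "set", "dict"}}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only whitelisted globals."""

    def find_class(self, module, name):
        # find_class is called whenever the pickle stream references a global.
        if name in ALLOWED.get(module, set()):
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"Unpickling {module}.{name} is not allowed")


def safe_loads(data):
    """Deserialize `data` (bytes) using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers of whitelisted types round-trip normally.
payload = pickle.dumps({"check_dirpath_before_read": {"/tmp/a", "/tmp/b"}})
restored = safe_loads(payload)

# A reference to a non-whitelisted global (here the builtin `print`) is rejected.
try:
    safe_loads(pickle.dumps(print))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

Raising `UnpicklingError` here is a choice made for the sketch, since the failure occurs while loading rather than while dumping; whether the project should switch exception types is a question for the authors.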
+# See the License for the specific language governing permissions and +# limitations under the License. +import functools +from msprobe.core.common.const import Const + + +class FrameworkDescriptor: + def __get__(self, instance, owner): + if owner._framework is None: + owner.import_framework() + return owner._framework + + +class FmkAdp: + fmk = Const.PT_FRAMEWORK + supported_fmk = [Const.PT_FRAMEWORK, Const.MS_FRAMEWORK] + supported_dtype_list = ["bfloat16", "float16", "float32", "float64"] + _framework = None + framework = FrameworkDescriptor() + + @classmethod + def import_framework(cls): + if cls.fmk == Const.PT_FRAMEWORK: + import torch + cls._framework = torch + elif cls.fmk == Const.MS_FRAMEWORK: + import mindspore + cls._framework = mindspore + else: + raise Exception(f"init framework adapter error, not in {cls.supported_fmk}") + + @classmethod + def set_fmk(cls, fmk=Const.PT_FRAMEWORK): + if fmk not in cls.supported_fmk: + raise Exception(f"init framework adapter error, not in {cls.supported_fmk}") + cls.fmk = fmk + cls._framework = None # reset the framework so it is re-imported on next access + + @classmethod + def get_rank(cls): + if cls.fmk == Const.PT_FRAMEWORK: + return cls.framework.distributed.get_rank() + return cls.framework.communication.get_rank() + + @classmethod + def get_rank_id(cls): + if cls.is_initialized(): + return cls.get_rank() + return 0 + + @classmethod + def is_initialized(cls): + if cls.fmk == Const.PT_FRAMEWORK: + return cls.framework.distributed.is_initialized() + return cls.framework.communication.GlobalComm.INITED + + @classmethod + def is_nn_module(cls, module): + if cls.fmk == Const.PT_FRAMEWORK: + return isinstance(module, cls.framework.nn.Module) + return isinstance(module, cls.framework.nn.Cell) + + @classmethod + def is_tensor(cls, tensor): + if cls.fmk == Const.PT_FRAMEWORK: + return isinstance(tensor, cls.framework.Tensor) + return isinstance(tensor, cls.framework.Tensor) + + @classmethod + def process_tensor(cls, tensor, func): + if
cls.fmk == Const.PT_FRAMEWORK: + if not tensor.is_floating_point() or tensor.dtype == cls.framework.float64: + tensor = tensor.float() + return float(func(tensor)) + return float(func(tensor).asnumpy()) + + @classmethod + def tensor_max(cls, tensor): + return cls.process_tensor(tensor, lambda x: x.max()) + + @classmethod + def tensor_min(cls, tensor): + return cls.process_tensor(tensor, lambda x: x.min()) + + @classmethod + def tensor_mean(cls, tensor): + return cls.process_tensor(tensor, lambda x: x.mean()) + + @classmethod + def tensor_norm(cls, tensor): + return cls.process_tensor(tensor, lambda x: x.norm()) + + @classmethod + def dtype(cls, dtype_str): + if dtype_str not in cls.supported_dtype_list: + raise Exception(f"{dtype_str} is not supported by adapter, not in {cls.supported_dtype_list}") + return getattr(cls.framework, dtype_str) + + @classmethod + def named_parameters(cls, module): + if cls.fmk == Const.PT_FRAMEWORK: + if not isinstance(module, cls.framework.nn.Module): + raise Exception(f"{module} is not a torch.nn.Module") + return module.named_parameters() + if not isinstance(module, cls.framework.nn.Cell): + raise Exception(f"{module} is not a mindspore.nn.Cell") + return module.parameters_and_names() + + @classmethod + def register_forward_pre_hook(cls, module, hook, with_kwargs=False): + if cls.fmk == Const.PT_FRAMEWORK: + if not isinstance(module, cls.framework.nn.Module): + raise Exception(f"{module} is not a torch.nn.Module") + module.register_forward_pre_hook(hook, with_kwargs=with_kwargs) + else: + if not isinstance(module, cls.framework.nn.Cell): + raise Exception(f"{module} is not a mindspore.nn.Cell") + original_construct = module.construct + + @functools.wraps(original_construct) + def new_construct(*args, **kwargs): + if with_kwargs: + hook(module, args, kwargs) + else: + hook(module, args) + return original_construct(*args, **kwargs) + + module.construct = new_construct diff --git 
a/debug/accuracy_tools/msprobe/core/common/global_lock.py b/debug/accuracy_tools/msprobe/core/common/global_lock.py new file mode 100644 index 0000000000000000000000000000000000000000..2090f009ea5a78a7c5fbda61c12b6c0a842b7d25 --- /dev/null +++ b/debug/accuracy_tools/msprobe/core/common/global_lock.py @@ -0,0 +1,86 @@ +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import multiprocessing +from multiprocessing.shared_memory import SharedMemory +import random +import time +import atexit +import os + +from msprobe.core.common.log import logger + + +def is_main_process(): + return multiprocessing.current_process().name == 'MainProcess' + + +class GlobalLock: + def __init__(self): + self.name = self.get_lock_name() + try: + self._shm = SharedMemory(create=False, name=self.name) + time.sleep(random.randint(0, 500) / 10000) # wait a random duration to avoid acquiring the lock simultaneously + except FileNotFoundError: + try: + self._shm = SharedMemory(create=True, name=self.name, size=1) + self._shm.buf[0] = 0 + logger.debug(f'{self.name} is created.') + except FileExistsError: + self.__init__() + + @classmethod + def get_lock_name(cls): + if is_main_process(): + return f'global_lock_{os.getpid()}' + return f'global_lock_{os.getppid()}' + + @classmethod + def is_lock_exist(cls): + try: + SharedMemory(create=False, name=cls.get_lock_name()).close() + return True + except FileNotFoundError: + return False + + def cleanup(self): +
self._shm.close() + if is_main_process(): + try: + self._shm.unlink() + logger.debug(f'{self.name} is unlinked.') + except FileNotFoundError: + logger.warning(f'{self.name} has already been unlinked.') + + def acquire(self, timeout=180): + """ + acquire global lock, default timeout is 3 minutes. + + :param float timeout: timeout(seconds), default value is 180. + """ + start = time.time() + while time.time() - start < timeout: + if self._shm.buf[0] == 0: + self._shm.buf[0] = 1 + return + time.sleep(random.randint(10, 500) / 10000) # spin, waiting 1-50 ms + self._shm.buf[0] = 1 + + def release(self): + self._shm.buf[0] = 0 + + +global_lock = GlobalLock() +atexit.register(global_lock.cleanup) diff --git a/debug/accuracy_tools/msprobe/core/common/utils.py b/debug/accuracy_tools/msprobe/core/common/utils.py index 400ff8b3fbf0edbdeeb8311e8e262ac298b76e56..c30bd160bf9d9f5795afc72417be643a893ffaee 100644 --- a/debug/accuracy_tools/msprobe/core/common/utils.py +++ b/debug/accuracy_tools/msprobe/core/common/utils.py @@ -285,8 +285,8 @@ def set_dump_path(input_param): if not npu_path_valid or not bench_path_valid: logger.error(f"Please check the json path is valid and ensure that neither npu_path nor bench_path is None.") raise CompareException(CompareException.INVALID_PATH_ERROR) - input_param['npu_dump_data_dir'] = os.path.join(os.path.dirname(npu_path), Const.DUMP_TENSOR_DATA) - input_param['bench_dump_data_dir'] = os.path.join(os.path.dirname(bench_path), Const.DUMP_TENSOR_DATA) + input_param[CompareConst.NPU_DUMP_DATA_DIR] = os.path.join(os.path.dirname(npu_path), Const.DUMP_TENSOR_DATA) + input_param[CompareConst.BENCH_DUMP_DATA_DIR] = os.path.join(os.path.dirname(bench_path), Const.DUMP_TENSOR_DATA) def get_dump_mode(input_param): @@ -506,4 +506,28 @@ def is_save_variable_valid(variable, valid_special_types, depth=0): return all(isinstance(key, str) and is_save_variable_valid(value, valid_special_types, depth + 1) for key, value in variable.items()) else: - return False \ No
newline at end of file + return False + + +def replace_last_occurrence(text, old, new): + if text is None: + return text + index = text.rfind(old) + if index != -1: + return text[:index] + text[index:].replace(old, new, 1) + return text + + +def load_stack_json(stack_path): + stack_dict = load_json(stack_path) + if not stack_dict.get(Const.NEW_STACK_FLAG): + return stack_dict + + new_stack_dict = {} + for stack_info in stack_dict.values(): + if len(stack_info) != 2: + continue + api_list, stack_str = stack_info + for api_name in api_list: + new_stack_dict.update({api_name: stack_str}) + return new_stack_dict diff --git a/debug/accuracy_tools/msprobe/core/common_config.py b/debug/accuracy_tools/msprobe/core/common_config.py index b9a717c0c52f11e52ac055e3cfe6a0e77fe7e44c..836a7b89d3008c8e2fc34053eddd186e875279d6 100644 --- a/debug/accuracy_tools/msprobe/core/common_config.py +++ b/debug/accuracy_tools/msprobe/core/common_config.py @@ -111,3 +111,10 @@ class BaseConfig: f"The element '{mode}' of data_mode {self.data_mode} is not in {Const.DUMP_DATA_MODE_LIST}.", MsprobeException(MsprobeException.INVALID_PARAM_ERROR) ) + + def _check_summary_mode(self): + if self.summary_mode and self.summary_mode not in Const.SUMMARY_MODE: + logger.error_log_with_exp( + f"summary_mode is invalid, summary_mode is not in {Const.SUMMARY_MODE}.", + MsprobeException(MsprobeException.INVALID_PARAM_ERROR) + ) diff --git a/debug/accuracy_tools/msprobe/core/compare/acc_compare.py b/debug/accuracy_tools/msprobe/core/compare/acc_compare.py index 3765b91ae95d3af01aeb9b746af321481a8e142b..801d8abfca9c1346ada796b45b9af8e432bbb903 100644 --- a/debug/accuracy_tools/msprobe/core/compare/acc_compare.py +++ b/debug/accuracy_tools/msprobe/core/compare/acc_compare.py @@ -13,111 +13,224 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-import multiprocessing import os import re -from copy import deepcopy +from dataclasses import dataclass +from collections import defaultdict +import numpy as np import pandas as pd from tqdm import tqdm from msprobe.core.advisor.advisor import Advisor from msprobe.core.common.const import CompareConst, Const from msprobe.core.common.exceptions import FileCheckException -from msprobe.core.common.file_utils import load_json, remove_path +from msprobe.core.common.file_utils import load_json, remove_path, create_directory from msprobe.core.common.log import logger -from msprobe.core.common.utils import CompareException, add_time_with_xlsx, check_op_str_pattern_valid, safe_get_value -from msprobe.core.compare.check import check_dump_json_str, check_graph_mode, check_stack_json_str, \ - check_struct_match, fuzzy_check_op -from msprobe.core.compare.highlight import find_compare_result_error_rows, highlight_rows_xlsx -from msprobe.core.compare.multiprocessing_compute import ComparisonResult, _handle_multi_process, _save_cmp_result -from msprobe.core.compare.npy_compare import compare_ops_apply, get_error_flag_and_msg -from msprobe.core.compare.utils import get_accuracy, get_rela_diff_summary_mode, get_un_match_accuracy, merge_tensor, \ - print_compare_ends_info, read_op, get_name_and_state, reorder_op_x_list +from msprobe.core.common.utils import CompareException, add_time_with_xlsx, check_op_str_pattern_valid, \ + set_dump_path, get_dump_mode, check_compare_param, check_configuration_param, load_stack_json +from msprobe.core.compare.check import check_dump_json_str, check_stack_json_str, cross_dtype_mapping +from msprobe.core.compare.utils import merge_tensor, print_compare_ends_info, read_op, \ + reorder_op_x_list, set_stack_json_path +from msprobe.core.compare.config import ModeConfig, MappingConfig, MappingDict +from msprobe.core.compare.multiprocessing_compute import CompareRealData +from msprobe.core.compare.highlight import HighLight + + +@dataclass +class 
ComparisonConfig: + dump_mode: str + stack_mode: bool + auto_analyze: bool + fuzzy_match: bool + data_mapping: dict + suffix: str + cell_mapping: dict + api_mapping: dict + layer_mapping: dict -class ModeConfig: - def __init__(self, stack_mode=False, auto_analyze=True, fuzzy_match=False, dump_mode=None): - self.stack_mode = stack_mode - self.auto_analyze = auto_analyze - self.fuzzy_match = fuzzy_match - self.dump_mode = dump_mode +class Comparator: + def __init__(self, file_reader, mode_config: ModeConfig, mapping_config: MappingConfig, is_cross_framework=False): + self.file_reader = file_reader + self.mode_config = mode_config + self.mapping_config = mapping_config + if self.mapping_config.data_mapping: + self.cross_frame = is_cross_framework + else: + self.cross_frame = (self.mapping_config.cell_mapping is not None or + self.mapping_config.api_mapping is not None) -class Comparator: - def __init__(self, mode_config: ModeConfig): - self.stack_mode = mode_config.stack_mode - self.auto_analyze = mode_config.auto_analyze - self.fuzzy_match = mode_config.fuzzy_match - self.dump_mode = mode_config.dump_mode + self.mapping_dict = MappingDict(mapping_config) @staticmethod - def get_result_md5_compare(ms_op_name, bench_op_name, npu_ops_all, bench_ops_all, *args): - npu_struct = npu_ops_all.get(ms_op_name).get('struct', []) - bench_struct = bench_ops_all.get(bench_op_name).get('struct', []) + def process_output_file(output_path, suffix): + file_name = add_time_with_xlsx("compare_result" + suffix) + file_path = os.path.join(os.path.realpath(output_path), file_name) + if os.path.exists(file_path): + logger.warning(f"{file_path} will be deleted.") + remove_path(file_path) + return file_path - if len(npu_struct) < 3 or len(bench_struct) < 3: - logger.error(f"The length of npu_struct and bench_struct must be >= 3, " - f"but got npu_struct={len(npu_struct)} and bench_struct={len(bench_struct)}. 
Please check!") - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) + def compare_core(self, input_param, output_path, **kwargs): + """ + Compares data from multiple JSON files and generates a comparison report. - result_item = [ms_op_name, bench_op_name, npu_struct[0], bench_struct[0], - npu_struct[1], bench_struct[1], npu_struct[2], bench_struct[2], - CompareConst.PASS if npu_struct[2] == bench_struct[2] else CompareConst.DIFF] + Args: + input_param (dict): A dictionary containing paths to JSON files ("npu_path", "bench_path", + "stack_path"). + output_path (str): The path where the output Excel report will be saved. + **kwargs: Additional keyword arguments including: + - stack_mode (bool, optional): Enables stack mode comparison. Defaults to False. + - auto_analyze (bool, optional): If True, triggers automatic analysis after comparison. Defaults to True. + - suffix (str, optional): Suffix to append to the output file name. Defaults to ''. + - fuzzy_match (bool, optional): Enables fuzzy matching during comparison. Defaults to False. + - dump_mode (str): ALL, SUMMARY, MD5. - if len(args) >= 2 and args[0]: - result_item.extend(args[1]) - else: - result_item.append(CompareConst.NONE) - return result_item + Returns: + """ + logger.info("Please check whether the input data belongs to you. 
If not, there may be security risks.") - @staticmethod - def calculate_summary_data(npu_summary_data, bench_summary_data, result_item): - err_msg = "" - result_item, accuracy_check, err_msg = get_rela_diff_summary_mode(result_item, npu_summary_data, - bench_summary_data, err_msg) - result_item.append(accuracy_check) - result_item.append(err_msg) + # get kwargs or set default value + suffix = kwargs.get('suffix', '') - @staticmethod - def _generate_na_data(ops_all): - if not ops_all: - return {} - key = next(iter(ops_all)) - value = deepcopy(ops_all[key]) - for k, v in value.items(): - if isinstance(v, tuple): - value[k] = tuple(CompareConst.N_A for _ in range(len(v))) - elif isinstance(v, list): - value[k] = [CompareConst.N_A] * len(v) - else: - value[k] = CompareConst.N_A - return value + # process output file + file_path = self.process_output_file(output_path, suffix) + + # initialize the compare result table and compare general data(name, dtype, shape, statistics/md5, etc.) + npu_json = input_param.get("npu_json_path") + bench_json = input_param.get("bench_json_path") + stack_json = input_param.get("stack_json_path") + result_df = self.compare_statistics([npu_json, bench_json, stack_json]) + if not result_df.values.tolist(): + logger.warning("Can`t match any op.") + return - def make_result_table(self, result): - header = CompareConst.HEAD_OF_COMPARE_MODE[self.dump_mode][:] + # compare real data + if self.mode_config.dump_mode == Const.ALL: + compare_real_data = CompareRealData(self.file_reader, self.mode_config, self.cross_frame) + result_df = compare_real_data.do_multi_process(input_param, result_df) - if self.stack_mode: - header.append(CompareConst.STACK) - if self.dump_mode == Const.ALL: - header.append(CompareConst.DATA_NAME) - else: - if self.dump_mode == Const.ALL: - for row in result: - del row[-2] # 输出结果不要堆栈信息时,删除中间结果result中的stack info,真实数据时为倒数第2列 - header.append(CompareConst.DATA_NAME) - else: - for row in result: - del row[-1] # 
输出结果不要堆栈信息时,删除中间结果result中的stack info,非真实数据时为倒数第1列 - result_df = pd.DataFrame(result, columns=header, dtype='object') - return result_df + # highlight suspicious API + highlight_dict = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} + highlight = HighLight(self.mode_config) + highlight.find_compare_result_error_rows(result_df, highlight_dict) + highlight.highlight_rows_xlsx(result_df, highlight_dict, file_path) + + # output compare analysis suggestions + if self.mode_config.auto_analyze: + advisor = Advisor(result_df, output_path, suffix) + advisor.analysis() + + print_compare_ends_info() + + def compare_statistics(self, file_list): + # load and parse json data + parse_data = ParseData(self.mode_config) + npu_df, bench_df = parse_data.parse(file_list) + + npu_df[[Const.DTYPE, Const.SHAPE]] = npu_df[[Const.DTYPE, Const.SHAPE]].astype(str) + bench_df[[Const.DTYPE, Const.SHAPE]] = bench_df[[Const.DTYPE, Const.SHAPE]].astype(str) + + # create new columns for compare op_name and shape + # process npu_df's COMPARE_KEY whether same or different framework + process_df = ProcessDf(self.mode_config, self.mapping_config, self.mapping_dict) + npu_df, bench_df = process_df.process_compare_key_and_shape(npu_df, bench_df) + + # match npu and bench, match_result contains both npu_info and bench_info + match = Match(self.mode_config, self.mapping_config, self.cross_frame) + match_result = match.match_api_infos(npu_df, bench_df) + # 筛选出npu_name存在的行并填充筛选出行中的缺失值为N/A + match_result = match_result[match_result['op_name_x'].notna()].fillna(CompareConst.N_A) + bench_columns = [i + '_y' for i in bench_df.columns] + match_result.loc[~match.gen_dtype_condition(match_result), bench_columns] = CompareConst.N_A + + # organize compare result table by renaming columns + create_table = CreateTable(self.mode_config) + result_df, header = create_table.make_result_df(match_result) + + # calculate statistics diff + calc_stats_diff = CalcStatsDiff(self.mode_config) + 
return calc_stats_diff.calc_accuracy(result_df, header) + + +class ParseData: + def __init__(self, mode_config: ModeConfig): + self.mode_config = mode_config + + def parse(self, file_list): + npu_json_path, bench_json_path, stack_json_path = file_list + npu_json_data = load_json(npu_json_path) + bench_json_data = load_json(bench_json_path) + stack_json_data = load_stack_json(stack_json_path) if self.mode_config.stack_mode else None + + # parse json data and generate df + npu_df = self.gen_data_df(npu_json_data, stack_json_data) + bench_df = self.gen_data_df(bench_json_data, stack_json_data) + + return npu_df, bench_df + + def gen_data_df(self, data_json, stack_json_data): + result = { + CompareConst.OP_NAME: [], + Const.DTYPE: [], + Const.SHAPE: [], + Const.SUMMARY: [], + Const.STACK_INFO: [] + } + if self.mode_config.dump_mode == Const.ALL: + result['data_name'] = [] + elif self.mode_config.dump_mode == Const.MD5: + result[Const.MD5] = [] + + api_nums = len(data_json['data']) + progress_bar = tqdm(total=api_nums, desc="API/Module Read Progress", unit="api/module", ncols=100) + + # 从json中循环解析API数据,遍历所有API + for data_name in data_json['data']: + check_op_str_pattern_valid(data_name) + merge_list = self.gen_merge_list(data_json, data_name, stack_json_data) + if not merge_list: + continue + + op_name_list = merge_list.get(CompareConst.OP_NAME) + summary_list = merge_list.get(Const.SUMMARY) + data_name_list = merge_list.get('data_name') + op_name_reorder, summary_reorder, data_name_reorder = reorder_op_x_list(op_name_list, + summary_list, + data_name_list) + # 遍历单个API的所有item + for index, op_name in enumerate(op_name_reorder): + result[CompareConst.OP_NAME].append(op_name) + if (CompareConst.INPUT_PATTERN in op_name) or (CompareConst.KWARGS_PATTERN in op_name): + struct = merge_list[CompareConst.INPUT_STRUCT].pop(0) + elif CompareConst.OUTPUT_PATTERN in op_name: + struct = merge_list[CompareConst.OUTPUT_STRUCT].pop(0) + elif CompareConst.PARAMS_PATTERN in op_name: + 
struct = merge_list[CompareConst.PARAMS_STRUCT].pop(0) + else: + struct = merge_list[CompareConst.PARAMS_GRAD_STRUCT].pop(0) + result[Const.DTYPE].append(struct[0]) + result[Const.SHAPE].append(struct[1]) + if self.mode_config.dump_mode == Const.MD5: + result[Const.MD5].append(struct[2]) + result[Const.SUMMARY].append(summary_reorder.pop(0)) + result[Const.STACK_INFO].append( + merge_list[Const.STACK_INFO][0] if index == 0 and self.mode_config.stack_mode else None) + if self.mode_config.dump_mode == Const.ALL: + result['data_name'].append(data_name_reorder.pop(0)) + + progress_bar.update(1) + progress_bar.close() + return pd.DataFrame(result) def gen_merge_list(self, json_data, op_name, stack_json_data): op_data = json_data['data'][op_name] check_dump_json_str(op_data, op_name) op_parsed_list = read_op(op_data, op_name) - if self.stack_mode: + if self.mode_config.stack_mode: stack_info = stack_json_data.get(op_name) if stack_info is not None: check_stack_json_str(stack_info, op_name) @@ -127,392 +240,481 @@ class Comparator: 'full_info': stack_info }) - merge_list = merge_tensor(op_parsed_list, self.dump_mode) + merge_list = merge_tensor(op_parsed_list, self.mode_config.dump_mode) return merge_list - def check_op(self, npu_dict, bench_dict): - npu_op_name = npu_dict[CompareConst.OP_NAME] - bench_op_name = bench_dict[CompareConst.OP_NAME] - graph_mode = check_graph_mode(safe_get_value(npu_op_name, 0, "npu_op_name"), - safe_get_value(bench_op_name, 0, "bench_op_name")) - - frame_name = getattr(self, "frame_name") - if frame_name == "PTComparator": - from msprobe.pytorch.compare.match import graph_mapping - if graph_mode: - return graph_mapping.match(npu_op_name[0], bench_op_name[0]) - struct_match = check_struct_match(npu_dict, bench_dict) - if not self.fuzzy_match: - name_match = npu_op_name == bench_op_name - return name_match and struct_match - try: - name_match = fuzzy_check_op(npu_op_name, bench_op_name) - except Exception as err: - logger.warning("%s and %s can 
not fuzzy match." % (npu_op_name, bench_op_name)) - name_match = False - return name_match and struct_match - def match_op(self, npu_queue, bench_queue): - for b_index, b_op in enumerate(bench_queue[0: -1]): - if self.check_op(npu_queue[-1], b_op): - return len(npu_queue) - 1, b_index - if self.check_op(npu_queue[-1], bench_queue[-1]): - return len(npu_queue) - 1, len(bench_queue) - 1 - for n_index, n_op in enumerate(npu_queue[0: -1]): - if self.check_op(n_op, bench_queue[-1]): - return n_index, len(bench_queue) - 1 - return -1, -1 +class ProcessDf: + def __init__(self, mode_config: ModeConfig, mapping_config: MappingConfig, mapping_dict: MappingDict): + self.mode_config = mode_config + self.mapping_config = mapping_config + self.mapping_dict = mapping_dict - def compare_process(self, file_lists): - npu_json_path, bench_json_path, stack_json_path = file_lists - npu_json_data = load_json(npu_json_path) - bench_json_data = load_json(bench_json_path) - stack_json_data = load_json(stack_json_path) if self.stack_mode else None + @staticmethod + def get_api_name(api_list): + try: + api_name = api_list[0] + Const.SEP + api_list[1] + except IndexError as error: + logger.error('Failed to retrieve API name, please check if the dump data is reasonable') + raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error + return api_name + + def process_compare_key_and_shape(self, npu_df, bench_df): + npu_df = self.assign_npu_df_compare_key(npu_df, bench_df) + npu_df[CompareConst.CMP_SHAPE] = npu_df[Const.SHAPE] + bench_df[CompareConst.CMP_KEY] = bench_df[CompareConst.OP_NAME] + bench_df[CompareConst.CMP_SHAPE] = bench_df[Const.SHAPE] + return npu_df, bench_df + + def assign_npu_df_compare_key(self, npu_df, bench_df): + """ + 处理 npu_df 的 COMPARE_KEY 赋值逻辑 - if self.fuzzy_match: - logger.warning("This task uses fuzzy matching, which may affect the accuracy of the comparison.") + :param npu_df: DataFrame,NPU 对比数据 + :param bench_df: DataFrame,Bench 对比数据 + :return: 
compare_key(name)处理后的 npu_df + """ + # 处理api_mapping映射 + if self.mapping_config.api_mapping: + # 如果用户不传api_mapping.yaml,先使用内置api_mapping.yaml替换npu_op_name + npu_df[CompareConst.CMP_KEY] = npu_df[CompareConst.OP_NAME].apply(self.process_internal_api_mapping) + # 如果用户传入api_mapping.yaml,再使用传入api_mapping.yaml进一步替换npu_op_name + if isinstance(self.mapping_config.api_mapping, str): + self.modify_compare_data_with_user_mapping(npu_df, bench_df) + # 处理cell_mapping映射 + elif self.mapping_config.cell_mapping: + npu_df[CompareConst.CMP_KEY] = npu_df[CompareConst.OP_NAME].apply(self.process_cell_mapping) + # 处理data_mapping映射 + elif self.mapping_config.data_mapping: + npu_df[CompareConst.CMP_KEY] = npu_df[CompareConst.OP_NAME].apply(self.process_data_mapping) + else: + npu_df[CompareConst.CMP_KEY] = npu_df[CompareConst.OP_NAME] + return npu_df + + def process_internal_api_mapping(self, npu_op_name): + # get api name & class name from op_name + ms_api_name = self.get_api_name(npu_op_name.split(Const.SEP)) + class_name = ms_api_name.split(Const.SEP)[0] + if class_name == "Mint": + return npu_op_name.replace("Mint", "Torch") + elif class_name == "MintFunctional": + return npu_op_name.replace("MintFunctional", "Functional") + elif self.mapping_dict.ms_to_pt_mapping.get(ms_api_name): + return npu_op_name.replace(ms_api_name, self.mapping_dict.ms_to_pt_mapping.get(ms_api_name)) + else: + return npu_op_name + + def modify_compare_data_with_user_mapping(self, npu_df, bench_df): + def gen_input_compare_key(pattern, term): + is_unmatched = True + for i, prefix in enumerate(mapping_dict.get(f'ms_{term}')): + if op_name.split(pattern)[1].startswith(str(prefix)): + npu_df.loc[index, CompareConst.CMP_KEY] = ( + op_name.replace(pattern + str(prefix), + pattern + str(mapping_dict.get(f'pt_{term}')[i]))) + is_unmatched = False + return is_unmatched + + ms_api_indices_dict = self.get_api_indices_dict(npu_df) + pt_api_indices_dict = self.get_api_indices_dict(bench_df) + + for mapping_dict in 
self.mapping_dict.api_mapping_dict: + all_length_equal = True + for k1, k2 in CompareConst.API_MAPPING_KEYS_TO_COMPARE: + if len(mapping_dict.get(k1, [])) != len(mapping_dict.get(k2, [])): + all_length_equal = False + if not all_length_equal: + logger.warning('The user-defined mapping table is incorrect,\ + make sure that the number of parameters is equal') + continue - npu_ops_queue = [] - bench_ops_queue = [] - result = [] + ms_api, pt_api = mapping_dict.get('ms_api'), mapping_dict.get('pt_api') + if ms_api not in ms_api_indices_dict or pt_api not in pt_api_indices_dict: + continue + for index in ms_api_indices_dict.get(ms_api): + op_name = npu_df.loc[index, CompareConst.OP_NAME].replace(ms_api, pt_api, 1) + if CompareConst.INPUT_PATTERN in op_name: + is_abandoned = gen_input_compare_key(CompareConst.INPUT_PATTERN, 'args') + elif CompareConst.KWARGS_PATTERN in op_name: + is_abandoned = gen_input_compare_key(CompareConst.KWARGS_PATTERN, 'args') + elif CompareConst.OUTPUT_PATTERN in op_name: + is_abandoned = gen_input_compare_key(CompareConst.OUTPUT_PATTERN, 'output') + elif CompareConst.PARAMS_PATTERN in op_name: + is_abandoned = gen_input_compare_key(CompareConst.PARAMS_PATTERN, 'parameters') + elif CompareConst.PARAMS_GRAD_PATTERN in op_name: + is_abandoned = gen_input_compare_key(CompareConst.PARAMS_GRAD_PATTERN, 'parameters_grad') + else: + logger.error(f'Excepted op_name: {op_name}') + raise CompareException(CompareException.INVALID_DATA_ERROR) + if is_abandoned: + npu_df.loc[index, CompareConst.CMP_KEY] = op_name + 'abandoned' - ops_npu_iter = iter(npu_json_data['data']) - ops_bench_iter = iter(bench_json_data['data']) - read_err_npu = True - read_err_bench = True - last_npu_ops_len = 0 - last_bench_ops_len = 0 + def get_api_indices_dict(self, op_name_df): + """ + 生成多个api对应的各自的所有的input、output等的index的键值对字典 + 示例: + {'Functional.conv2d': [0, 1, 2, 3], + 'Functional.batch_norm': [4, 5, 6, 7, 8] + } + """ + api_indices_dict = defaultdict(list) + for op_index, 
name in enumerate(op_name_df[CompareConst.OP_NAME]): + api_name = self.get_api_name(name.split(Const.SEP)) + api_indices_dict[api_name].append(op_index) + return api_indices_dict + + def process_cell_mapping(self, npu_op_name): + if not npu_op_name: + return CompareConst.N_A + param_grad_flag = Const.PARAMS_GRAD in npu_op_name.split(Const.SEP) + if not param_grad_flag and not re.search(Const.REGEX_FORWARD_BACKWARD, npu_op_name): + return CompareConst.N_A + npu_op_name = npu_op_name.replace("Cell", "Module", 1) + if self.mapping_dict.cell_mapping_dict: + # get cell name & class name from op_name + # Cell.fc1.Dense.forward.0.input.0 + cell_name = re.split(r'\.(?:forward|backward|parameters_grad)\.', npu_op_name.split(Const.SEP, 1)[-1])[0] + if cell_name in self.mapping_dict.cell_mapping_dict: + npu_op_name = npu_op_name.replace(cell_name, self.mapping_dict.cell_mapping_dict[cell_name], 1) + return npu_op_name + + def process_data_mapping(self, npu_op_name): + return self.mapping_dict.data_mapping_dict.get(npu_op_name, npu_op_name) + + +class Match: + def __init__(self, mode_config: ModeConfig, mapping_config: MappingConfig, cross_frame): + self.mode_config = mode_config + self.mapping_config = mapping_config + self.cross_frame = cross_frame - npu_api_nums = len(npu_json_data['data']) - progress_bar = tqdm(total=npu_api_nums, desc="API/Module Read Progress", unit="item", ncols=100) + @staticmethod + def put_unmatched_in_table(match_result, npu_op_item): + npu_columns = npu_op_item.index.tolist()[:-2] + new_columns = [name[:-1] + 'y' for name in npu_columns] + na_series = pd.Series([CompareConst.N_A] * len(new_columns), index=new_columns) + new_result_item = pd.concat([npu_op_item, na_series]).to_frame().T + new_result_item.columns = CompareConst.MATCH_RESULT_COLUMNS + match_result = pd.concat([match_result, new_result_item]) + return match_result - while True: - if not read_err_npu and not read_err_bench: - break - try: - last_npu_ops_len = len(npu_ops_queue) - 
op_name_npu = next(ops_npu_iter) - check_op_str_pattern_valid(op_name_npu) - npu_merge_list = self.gen_merge_list(npu_json_data, op_name_npu, stack_json_data) - if npu_merge_list: - npu_ops_queue.append(npu_merge_list) - except StopIteration: - read_err_npu = False - try: - last_bench_ops_len = len(bench_ops_queue) - op_name_bench = next(ops_bench_iter) - check_op_str_pattern_valid(op_name_bench) - bench_merge_list = self.gen_merge_list(bench_json_data, op_name_bench, stack_json_data) - if bench_merge_list: - bench_ops_queue.append(bench_merge_list) - except StopIteration: - read_err_bench = False + @staticmethod + def put_matched_in_table(match_result, npu_op_item, bench_op_item): + head_len = len(CompareConst.MATCH_RESULT_COLUMNS) + new_result_item = pd.concat([npu_op_item, bench_op_item]).head(head_len).to_frame().T + new_result_item.columns = CompareConst.MATCH_RESULT_COLUMNS + match_result = pd.concat([match_result, new_result_item]) + return match_result - progress_bar.update(1) + @staticmethod + def rename_api(op_name): + """ + 原api: {api_type}.{api_name}.{API调用次数}.{前向反向}.{input/output}.{参数序号} + rename后: {api_type}.{api_name}.{API调用次数}.{input/output}.{参数序号} + """ + if Const.FORWARD not in op_name and Const.BACKWARD not in op_name: + return op_name + process = Const.FORWARD if Const.FORWARD in op_name else Const.BACKWARD + name_split = op_name.split(process) + try: + torch_func_index, in_out = name_split[0], name_split[1] + except IndexError as error: + logger.error(f'{op_name} can not be split with {process}, please check!') + raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error + torch_func_split = torch_func_index.rsplit(Const.SEP, 2) + torch_func = str(torch_func_split[0]) + Const.SEP + process + str(in_out) + return torch_func + + def check_op_item(self, npu_op_item, bench_op_item): + name_match = self.rename_api(npu_op_item[CompareConst.CMP_KEY]) == self.rename_api( + bench_op_item[CompareConst.CMP_KEY]) + shape_match = 
npu_op_item[CompareConst.CMP_SHAPE] == bench_op_item[CompareConst.CMP_SHAPE] + if name_match and shape_match: + return True + else: + npu_op_name = npu_op_item[CompareConst.OP_NAME] + bench_op_name = bench_op_item[CompareConst.OP_NAME] + check_op_str_pattern_valid(npu_op_name) + check_op_str_pattern_valid(bench_op_name) + logger.warning(f"{npu_op_name} and {bench_op_name} can not fuzzy match") + return False - # merge all boolean expressions - both_empty = not npu_ops_queue and not bench_ops_queue - no_change = (len(npu_ops_queue) == last_npu_ops_len) and (len(bench_ops_queue) == last_bench_ops_len) - if both_empty or no_change: - continue + def match_api_infos(self, npu_df, bench_df): + """ + 正常匹配和模糊匹配 + """ + if self.mapping_config.data_mapping: + match_result = pd.merge(npu_df, bench_df, on=[CompareConst.CMP_KEY], how='left') + + # reorder match_result by op_name of npu + op_name_order = npu_df[CompareConst.OP_NAME].tolist() + match_result[CompareConst.OP_NAME_X] = pd.Categorical(match_result[CompareConst.OP_NAME_X], + categories=op_name_order, ordered=True) + match_result = match_result.sort_values(CompareConst.OP_NAME_X).reset_index(drop=True) + match_result[CompareConst.OP_NAME_X] = match_result[CompareConst.OP_NAME_X].astype('object') + elif not self.mode_config.fuzzy_match: + match_result = pd.merge(npu_df, bench_df, on=[CompareConst.CMP_KEY, CompareConst.CMP_SHAPE], + how='outer') + else: + match_result = self.process_fuzzy_match(npu_df, bench_df) + return match_result - # APIs in NPU and Bench models unconsistent judgment + def process_fuzzy_match(self, npu_df, bench_df): + """ + 模糊匹配通过循环方式匹配api + """ + npu_ops_queue = [] + bench_ops_queue = [] + match_result = pd.DataFrame(columns=CompareConst.MATCH_RESULT_COLUMNS) + + max_len = max(len(npu_df), len(bench_df)) + min_len = min(len(npu_df), len(bench_df)) + for i in range(max_len): + if i < min_len: + npu_ops_queue.append(npu_df.iloc[i]) + bench_ops_queue.append(bench_df.iloc[i]) + else: + try: + 
npu_ops_queue.append(npu_df.iloc[i]) + except IndexError: + pass + try: + bench_ops_queue.append(bench_df.iloc[i]) + except IndexError: + pass + + # 如果append之后queue状态不一致,则判断结束 if bool(npu_ops_queue) ^ bool(bench_ops_queue): - logger.info("Please check whether the number and calls of APIs in NPU and Bench models are consistent.") break - n_match_point, b_match_point = self.match_op(npu_ops_queue, bench_ops_queue) + npu_match_point, bench_match_point = self.match_op(npu_ops_queue, bench_ops_queue) - # 如果没有匹配到,数据放到队列中,跳过,直到后面匹配到,把匹配之前的api放到不匹配中 - if n_match_point == -1 and b_match_point == -1: + # 如果没有匹配到,数据放到队列中,跳过。直到后面匹配到,把匹配之前的api放到不匹配中 + if npu_match_point == -1 and bench_match_point == -1: continue - n_match_data = npu_ops_queue[n_match_point] - b_match_data = bench_ops_queue[b_match_point] - un_match_data = npu_ops_queue[0: n_match_point] - for npu_data in un_match_data: - get_un_match_accuracy(result, npu_data, self.dump_mode) - get_accuracy(result, n_match_data, b_match_data, self.dump_mode) - del npu_ops_queue[0: n_match_point + 1] - del bench_ops_queue[0: b_match_point + 1] - progress_bar.close() + npu_op_item = npu_ops_queue[npu_match_point] + bench_op_item = bench_ops_queue[bench_match_point] + unmatched_data = npu_ops_queue[0: npu_match_point] + for op_item in unmatched_data: + match_result = self.put_unmatched_in_table(match_result, op_item) + match_result = self.put_matched_in_table(match_result, npu_op_item, bench_op_item) + del npu_ops_queue[0: npu_match_point + 1] + del bench_ops_queue[0: bench_match_point + 1] + if npu_ops_queue: - for npu_data in npu_ops_queue: - get_un_match_accuracy(result, npu_data, self.dump_mode) - - result_df = self.make_result_table(result) - return result_df - - def merge_data(self, json_data, stack_json_data): - ops_all = {} - for op_name in json_data.get('data', {}): - merge_list = self.gen_merge_list(json_data, op_name, stack_json_data) - if merge_list: - struct_to_index_mapping = { - CompareConst.INPUT_STRUCT: 0, - 
CompareConst.OUTPUT_STRUCT: 0, - CompareConst.PARAMS_STRUCT: 0, - CompareConst.PARAMS_GRAD_STRUCT: 0 - } - - op_name_list = merge_list.get(CompareConst.OP_NAME) - summary_list = merge_list.get(Const.SUMMARY) - data_name_list = merge_list.get('data_name') - op_name_reorder, summary_reorder, data_name_reorder = reorder_op_x_list(op_name_list, - summary_list, - data_name_list) - for index, op_full_name in enumerate(op_name_reorder): - data_name = data_name_reorder[index] if data_name_reorder else None - - _, state = get_name_and_state(op_full_name) - struct_key = CompareConst.STATE_TO_STRUCT_MAPPING.get(state) - if not struct_key: - continue - ops_all[op_full_name] = { - CompareConst.STRUCT: safe_get_value(merge_list, struct_to_index_mapping.get(struct_key), - "merge_list", key=struct_key), - CompareConst.SUMMARY: safe_get_value(summary_reorder, index, "summary_reorder"), - 'data_name': data_name, - 'stack_info': merge_list.get('stack_info') - } - struct_to_index_mapping[struct_key] += 1 - return ops_all - - def get_accuracy(self, npu_ops_all, bench_ops_all): - result = [] - bench_ops_all[CompareConst.N_A] = self._generate_na_data(bench_ops_all) - for ms_op_name, bench_op_name in self.data_mapping_dict.items(): - check_op_str_pattern_valid(ms_op_name) - check_op_str_pattern_valid(bench_op_name) - if ms_op_name in npu_ops_all and bench_op_name in bench_ops_all: - npu_stack_info = npu_ops_all.get(ms_op_name).get("stack_info", None) - bench_stack_info = bench_ops_all.get(bench_op_name).get("stack_info", None) - has_stack = npu_stack_info and bench_stack_info - if self.dump_mode == Const.MD5: - result.append(self.get_result_md5_compare(ms_op_name, bench_op_name, npu_ops_all, - bench_ops_all, has_stack, npu_stack_info)) - continue - - npu_struct = npu_ops_all.get(ms_op_name).get('struct', []) - bench_struct = bench_ops_all.get(bench_op_name).get('struct', []) - - if len(npu_struct) < 2 or len(bench_struct) < 2: - logger.error( - f"The length of npu_struct and bench_struct 
must be >= 2, " - f"but got npu_struct={len(npu_struct)} and bench_struct={len(bench_struct)}. " - f"Please check!" - ) - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) - - base_result_item = [ - ms_op_name, bench_op_name, - npu_struct[0], - bench_struct[0], - npu_struct[1], - bench_struct[1] - ] - - if self.dump_mode == Const.SUMMARY: - result_item = base_result_item + [" "] * 8 # 8个统计量数据情况的比对指标 - else: - result_item = base_result_item + [" "] * 6 # 6个真实数据情况的比对指标 - - npu_summary_data = npu_ops_all.get(ms_op_name).get("summary") - result_item.extend(npu_summary_data) - bench_summary_data = bench_ops_all.get(bench_op_name).get("summary") - result_item.extend(bench_summary_data) - if self.dump_mode == Const.SUMMARY: - self.calculate_summary_data(npu_summary_data, bench_summary_data, result_item) - else: - result_item.append(CompareConst.ACCURACY_CHECK_YES) - result_item.append("") - if has_stack: - result_item.extend(npu_stack_info) - else: - result_item.append(CompareConst.NONE) - if self.dump_mode == Const.ALL: - ms_data_name = npu_ops_all.get(ms_op_name).get("data_name", None) - pt_data_name = bench_ops_all.get(bench_op_name).get("data_name", None) - result_item.append([ms_data_name, pt_data_name]) - result.append(result_item) - logger.info(f"{ms_op_name}, {bench_op_name} compared.") - elif ms_op_name not in npu_ops_all: - logger.warning(f'Can not find npu op name : `{ms_op_name}` in npu dump json file.') - elif bench_op_name not in npu_ops_all: - logger.warning(f'Can not find bench op name : `{bench_op_name}` in bench dump json file.') - return result + for op_item in npu_ops_queue: + match_result = self.put_unmatched_in_table(match_result, op_item) - def compare_process_custom(self, file_lists): - npu_json_path, bench_json_path, stack_json_path = file_lists - npu_json_data = load_json(npu_json_path) - bench_json_data = load_json(bench_json_path) - stack_json_data = load_json(stack_json_path) if self.stack_mode else None - npu_ops_all = 
self.merge_data(npu_json_data, stack_json_data) - bench_ops_all = self.merge_data(bench_json_data, stack_json_data) + match_result.reset_index(drop=True, inplace=True) + return match_result - result = self.get_accuracy(npu_ops_all, bench_ops_all) - result_df = self.make_result_table(result) - return result_df + def match_op(self, npu_queue, bench_queue): + for b_index, b_op in enumerate(bench_queue[0: -1]): + if self.check_op_item(npu_queue[-1], b_op): + return len(npu_queue) - 1, b_index + if self.check_op_item(npu_queue[-1], bench_queue[-1]): + return len(npu_queue) - 1, len(bench_queue) - 1 + for n_index, n_op in enumerate(npu_queue[0: -1]): + if self.check_op_item(n_op, bench_queue[-1]): + return n_index, len(bench_queue) - 1 + return -1, -1 - def compare_by_op(self, npu_op_name, bench_op_name, op_name_mapping_dict, input_param): + def gen_dtype_condition(self, match_result): """ - :param npu_op_name: excel中的NPU_Name,例如:MintFunctional.conv2d.0.forward.input.3.0 - :param bench_op_name: excel中的Bench_Name,例如:Functional.conv2d.0.forward.input.3.0 - :param op_name_mapping_dict: op_name和npy或pt文件的映射关系 - :param input_param: npu_json_path/bench_json_path/stack_json_path等参数 - :return: result_list,包含余弦相似度、最大绝对误差、最大相对误差、千分之一误差率、千分之五误差率和错误信息 - 用于读取excel中的NPU_Name和Bench_Name,根据映射关系找到npy或pt文件,然后读取文件中的数据进行比较,计算余弦相似度、欧式距离 - 最大绝对误差、最大相对误差、千分之一误差率、千分之五误差率并生成错误信息 + dtype匹配条件为npu、bench的dtype一致或属于规定的映射关系 """ - error_file, relative_err, error_flag = None, None, False + # 如果使用了data_mapping,不校验dtype,返回全True的DataFrame + if self.mapping_config.data_mapping: + return pd.Series(True, index=match_result.index) + + npu_dtype = match_result['dtype_x'] + bench_dtype = match_result['dtype_y'] + npu_dtype = self.process_cross_frame_dtype(npu_dtype) + bench_dtype = self.process_cross_frame_dtype(bench_dtype) + + equal_condition = npu_dtype == bench_dtype + match_condition = ( + (npu_dtype.isin(CompareConst.DTYPE_MATCH_GROUPS[0]) & bench_dtype.isin( + CompareConst.DTYPE_MATCH_GROUPS[0])) | + 
(npu_dtype.isin(CompareConst.DTYPE_MATCH_GROUPS[1]) & bench_dtype.isin( + CompareConst.DTYPE_MATCH_GROUPS[1])) + ) + return equal_condition | match_condition - data_name_pair = op_name_mapping_dict.get(npu_op_name) - npu_data_name = data_name_pair[0] - bench_data_name = data_name_pair[1] + def process_cross_frame_dtype(self, dtype): + if self.cross_frame: + dtype = dtype.map(cross_dtype_mapping).fillna(dtype) + return dtype - if str(npu_data_name) == '-1': # 没有npu真实数据 - n_value, b_value, error_flag = CompareConst.READ_NONE, CompareConst.READ_NONE, True - elif str(bench_data_name) == '-1': # 没有bench真实数据 - n_value, b_value, error_flag = CompareConst.READ_NONE, CompareConst.READ_NONE, True - error_file = 'no_bench_data' - else: - npu_dir = input_param.get("npu_dump_data_dir") - bench_dir = input_param.get("bench_dump_data_dir") - try: - frame_name = getattr(self, "frame_name") - read_npy_data = getattr(self, "read_npy_data") - if frame_name == "MSComparator": - n_value = read_npy_data(npu_dir, npu_data_name) - if self.cross_frame: - b_value = read_npy_data(bench_dir, bench_data_name, load_pt_file=True) - else: - b_value = read_npy_data(bench_dir, bench_data_name) - else: - n_value = read_npy_data(npu_dir, npu_data_name) - b_value = read_npy_data(bench_dir, bench_data_name) - except IOError as error: - error_file = error.filename - n_value, b_value = CompareConst.READ_NONE, CompareConst.READ_NONE - error_flag = True - except (FileCheckException, CompareException): - error_file = npu_data_name - n_value, b_value = CompareConst.READ_NONE, CompareConst.READ_NONE - error_flag = True - - # 通过n_value, b_value同时得到错误标志和错误信息 - n_value, b_value, error_flag, err_msg = get_error_flag_and_msg(n_value, b_value, - error_flag=error_flag, error_file=error_file) - - result_list, err_msg = compare_ops_apply(n_value, b_value, error_flag, err_msg) - - if self.fuzzy_match and npu_op_name != bench_op_name and bench_op_name != CompareConst.N_A: - err_msg += " Fuzzy matching data, the 
comparison accuracy may be affected." - result_list.append(err_msg) - return result_list - def compare_core(self, input_param, output_path, **kwargs): - """ - Compares data from multiple JSON files and generates a comparison report. - - Args: - input_param (dict): A dictionary containing paths to JSON files ("npu_path", "bench_path", - "stack_path"). - output_path (str): The path where the output Excel report will be saved. - **kwargs: Additional keyword arguments including: - - stack_mode (bool, optional): Enables stack mode comparison. Defaults to False. - - auto_analyze (bool, optional): If True, triggers automatic analysis after comparison. Defaults to True. - - suffix (str, optional): Suffix to append to the output file name. Defaults to ''. - - fuzzy_match (bool, optional): Enables fuzzy matching during comparison. Defaults to False. - - dump_mode (str): ALL, SUMMARY, MD5. - - Returns: - """ - # get kwargs or set default value - suffix = kwargs.get('suffix', '') +class CreateTable: + def __init__(self, mode_config: ModeConfig): + self.mode_config = mode_config - logger.info("Please check whether the input data belongs to you. 
If not, there may be security risks.") - file_name = add_time_with_xlsx("compare_result" + suffix) - file_path = os.path.join(os.path.realpath(output_path), file_name) - if os.path.exists(file_path): - logger.warning(f"{file_path} will be deleted.") - remove_path(file_path) - highlight_dict = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} + @staticmethod + def process_data_name(result): + result['data_name_x'] = result.apply(lambda row: [row['data_name_x'], row['data_name_y']], axis=1) + return result - npu_json = input_param.get("npu_json_path") - bench_json = input_param.get("bench_json_path") - stack_json = input_param.get("stack_json_path") - if self.data_mapping: - result_df = self.compare_process_custom([npu_json, bench_json, stack_json]) - else: - result_df = self.compare_process([npu_json, bench_json, stack_json]) + @staticmethod + def set_summary(summary): + if summary == CompareConst.N_A: + return [CompareConst.N_A] * 4 # 4为统计值个数 + summary_list = [] + for i in summary: + if str(i).lower() == 'nan': + summary_list.append(CompareConst.NAN) + else: + summary_list.append(i) + return summary_list - if not result_df.values.tolist(): - logger.warning("Can`t match any op.") - return + def make_result_df(self, result): + # get header + header = CompareConst.HEAD_OF_COMPARE_MODE[self.mode_config.dump_mode][:] + if self.mode_config.stack_mode: + header.append(CompareConst.STACK) + if self.mode_config.dump_mode == Const.ALL: + header.append(CompareConst.DATA_NAME) + result = self.process_data_name(result) + + # rename match_result columns + result.rename(columns={'op_name_x': CompareConst.NPU_NAME, + 'op_name_y': CompareConst.BENCH_NAME, + 'dtype_x': CompareConst.NPU_DTYPE, + 'dtype_y': CompareConst.BENCH_DTYPE, + 'shape_x': CompareConst.NPU_SHAPE, + 'shape_y': CompareConst.BENCH_SHAPE, + 'md5_x': CompareConst.NPU_MD5, + 'md5_y': CompareConst.BENCH_MD5, + 'data_name_x': CompareConst.DATA_NAME, + 'stack_info_x': CompareConst.STACK}, 
inplace=True) + + # process summary data + npu_summary = [CompareConst.NPU_MAX, CompareConst.NPU_MIN, CompareConst.NPU_MEAN, CompareConst.NPU_NORM] + bench_summary = [CompareConst.BENCH_MAX, CompareConst.BENCH_MIN, CompareConst.BENCH_MEAN, + CompareConst.BENCH_NORM] + result[npu_summary] = result['summary_x'].apply(self.set_summary).tolist() + result[bench_summary] = result['summary_y'].apply(self.set_summary).tolist() + + result_df = pd.DataFrame(columns=header) + for h in header: + if h in result.columns: + result_df[h] = result[h] + return result_df, header + + +class CalcStatsDiff: + def __init__(self, mode_config: ModeConfig): + self.mode_config = mode_config - if self.dump_mode == Const.ALL: - result_df = self.do_multi_process(input_param, result_df) + @staticmethod + def type_check(val): + """ + 检查是否为数值或字符串形式的nan, 如果是返回True + """ + check_series = pd.Series(False, index=val.index) + val_str = val.astype(str) + check_series[pd.to_numeric(val_str, errors='coerce').notna() | val_str.str.lower().eq('nan')] = True + return check_series - find_compare_result_error_rows(result_df, highlight_dict, self.dump_mode) - highlight_rows_xlsx(result_df, highlight_dict, file_path) + @staticmethod + def get_number(val): + return pd.to_numeric(val.astype(str), errors='coerce') + + def calc_summary_diff(self, result_df, cond_no_bench, stats_index: str): + npu_val = result_df['NPU ' + stats_index] + bench_val = result_df['Bench ' + stats_index] + diff_name = stats_index.capitalize() + ' diff' + rel_err_name = ('norm' if stats_index == 'l2norm' else stats_index).capitalize() + 'RelativeErr' + + # npu、bench中统计量均为数字或nan + cond_num_nan = self.type_check(npu_val) & self.type_check(bench_val) + + # 如果统计量不是数字或nan,就赋值统计量差异为N/A + result_df.loc[~cond_num_nan, [diff_name, rel_err_name]] = CompareConst.N_A + cond_valid_stat = ~cond_no_bench & cond_num_nan # 有效统计条件:bench_name不是N/A,并且NPU和bench的统计量都是数字或nan + result_df.loc[cond_valid_stat, diff_name] = self.get_number(npu_val) - 
self.get_number(bench_val) + + cond_diff_nan = result_df[diff_name].isna() # 统计量差异是nan + cond_nan_diff = cond_valid_stat & cond_diff_nan + result_df.loc[cond_nan_diff, [diff_name, rel_err_name]] = CompareConst.NAN + + cond_not_nan_diff = cond_valid_stat & ~cond_diff_nan + condition_pt_zero = bench_val == 0 + result_df.loc[cond_not_nan_diff & condition_pt_zero, rel_err_name] = CompareConst.N_A + + # 相对误差转成百分比字符串 + cond_ref_err = cond_not_nan_diff & ~condition_pt_zero + result_df.loc[cond_ref_err, rel_err_name] = ( + result_df.loc[cond_ref_err, diff_name] / bench_val[cond_ref_err] * 100) + result_df.loc[cond_ref_err, rel_err_name] = (result_df.loc[cond_ref_err, rel_err_name].abs().astype(str) + '%') + + magnitude = self.get_number(result_df[diff_name]).abs() / (pd.Series( + np.maximum(self.get_number(npu_val), self.get_number(bench_val))).abs() + CompareConst.EPSILON) + return magnitude > CompareConst.MAGNITUDE + + def calc_accuracy(self, result_df, header): + # bench name N/A represents no bench data, err_msg adds "No bench data matched." 
+ condition_no_bench = result_df[CompareConst.BENCH_NAME] == CompareConst.N_A + result_df[condition_no_bench] = result_df[condition_no_bench].fillna(CompareConst.N_A) + result_df.loc[condition_no_bench, CompareConst.ERROR_MESSAGE] = CompareConst.NO_BENCH + + if self.mode_config.dump_mode == Const.MD5: + condition_md5_equal = result_df[CompareConst.NPU_MD5] == result_df[CompareConst.BENCH_MD5] + result_df.loc[condition_md5_equal, CompareConst.RESULT] = CompareConst.PASS + result_df.loc[~condition_md5_equal & ~condition_no_bench, CompareConst.RESULT] = CompareConst.DIFF + elif self.mode_config.dump_mode == Const.SUMMARY: + warning_list = [ + self.calc_summary_diff(result_df, condition_no_bench, stats_index) + for stats_index in ['max', 'min', 'mean', 'l2norm'] + ] + warning_flag = pd.DataFrame(warning_list).any() + result_df.loc[~condition_no_bench, [CompareConst.RESULT, CompareConst.ERROR_MESSAGE]] = '' + result_df.loc[warning_flag, CompareConst.RESULT] = CompareConst.WARNING + result_df.loc[warning_flag, CompareConst.ERROR_MESSAGE] = 'Need double check api accuracy.' 
+ else: + fill_cols = [CompareConst.COSINE, CompareConst.EUC_DIST, + CompareConst.MAX_ABS_ERR, CompareConst.MAX_RELATIVE_ERR, + CompareConst.ONE_THOUSANDTH_ERR_RATIO, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO, + CompareConst.ERROR_MESSAGE] + result_df.loc[~condition_no_bench, fill_cols] = '' + result_df.loc[~condition_no_bench, CompareConst.ACCURACY] = CompareConst.ACCURACY_CHECK_YES + + return result_df[header] + + +def setup_comparison(input_param, output_path, **kwargs) -> ComparisonConfig: + """公共的前置处理逻辑,返回封装后的 ComparisonConfig 对象""" + try: + config = ComparisonConfig( + dump_mode='', + stack_mode=False, + auto_analyze=kwargs.get('auto_analyze', True), + fuzzy_match=kwargs.get('fuzzy_match', False), + data_mapping=kwargs.get('data_mapping', {}), + suffix=kwargs.get('suffix', ''), + cell_mapping=kwargs.get('cell_mapping', {}), + api_mapping=kwargs.get('api_mapping', {}), + layer_mapping=kwargs.get('layer_mapping', {}), + ) - if self.auto_analyze: - advisor = Advisor(result_df, output_path, suffix) - advisor.analysis() + set_dump_path(input_param) + config.dump_mode = get_dump_mode(input_param) - print_compare_ends_info() + # set stack_mode and set "stack_json_path" in input_param + if 'stack_json_path' in input_param: + config.stack_mode = kwargs.get('stack_mode', False) + else: + config.stack_mode = set_stack_json_path(input_param) - def compare_ops(self, idx, dump_path_dict, result_df, lock, input_param): - cos_result = [] - euc_dist_result = [] - max_err_result = [] - max_relative_err_result = [] - one_thousand_err_ratio_result = [] - five_thousand_err_ratio_result = [] - err_mess = [] - - is_print_compare_log = input_param.get("is_print_compare_log") - - for i in range(len(result_df)): - npu_op_name = result_df.iloc[i, 0] - bench_op_name = result_df.iloc[i, 1] - if is_print_compare_log: - logger.info("start compare: {}".format(npu_op_name)) - - cos_sim, euc_dist, max_abs_err, max_relative_err, one_thousand_err_ratio, five_thousand_err_ratio, err_msg \ - = 
self.compare_by_op(npu_op_name, bench_op_name, dump_path_dict, input_param) - - if is_print_compare_log: - logger.info( - "[{}] Compare result: cosine {}, max_abs_err {}, max_relative_err {}, {}, \ - one_thousand_err_ratio {}, " - "five_thousand_err_ratio {}".format(npu_op_name, cos_sim, max_abs_err, max_relative_err, - err_msg, one_thousand_err_ratio, five_thousand_err_ratio)) - cos_result.append(cos_sim) - euc_dist_result.append(euc_dist) - max_err_result.append(max_abs_err) - max_relative_err_result.append(max_relative_err) - one_thousand_err_ratio_result.append(one_thousand_err_ratio) - five_thousand_err_ratio_result.append(five_thousand_err_ratio) - err_mess.append(err_msg) - - cr = ComparisonResult( - cos_result=cos_result, - euc_dist_result=euc_dist_result, - max_err_result=max_err_result, - max_relative_err_result=max_relative_err_result, - one_thousand_err_ratio_result=one_thousand_err_ratio_result, - five_thousand_err_ratio_result=five_thousand_err_ratio_result, - err_msgs=err_mess - ) + check_configuration_param(config.stack_mode, config.auto_analyze, config.fuzzy_match, + input_param.get('is_print_compare_log', True)) + create_directory(output_path) + check_compare_param(input_param, output_path, config.dump_mode, config.stack_mode) - return _save_cmp_result(idx, cr, result_df, lock) + return config - def do_multi_process(self, input_param, result_df): - try: - result_df = _handle_multi_process(self.compare_ops, input_param, result_df, - multiprocessing.Manager().RLock()) - return result_df - except ValueError as e: - logger.error('result dataframe is not found.') - raise CompareException(CompareException.INVALID_DATA_ERROR) from e + except (CompareException, FileCheckException) as error: + logger.error('Compare failed. 
Please check the arguments and do it again!') + raise CompareException(error.code) from error diff --git a/debug/accuracy_tools/msprobe/core/compare/check.py b/debug/accuracy_tools/msprobe/core/compare/check.py index 9429d7ffa1a3c1feffb0bc68f5cde777e5f8d460..a88ddb8f5e088a9f72ef2d2b721b03dbc539c385 100644 --- a/debug/accuracy_tools/msprobe/core/compare/check.py +++ b/debug/accuracy_tools/msprobe/core/compare/check.py @@ -14,113 +14,46 @@ # limitations under the License. from msprobe.core.common.log import logger -from msprobe.core.compare.utils import rename_api from msprobe.core.common.utils import check_op_str_pattern_valid, CompareException -from msprobe.core.common.const import CompareConst, Const - -dtype_mapping = { - "Int8": "torch.int8", - "UInt8": "torch.uint8", - "Int16": "torch.int16", - "UInt16": "torch.uint16", - "Int32": "torch.int32", - "UInt32": "torch.uint32", - "Int64": "torch.int64", - "UInt64": "torch.uint64", - "Float16": "torch.float16", - "Float32": "torch.float32", - "Float64": "torch.float64", - "Bool": "torch.bool", - "BFloat16": "torch.bfloat16", - "Complex64": "torch.complex64", - "Complex128": "torch.complex128" +from msprobe.core.common.const import Const + +cross_dtype_mapping = { + "Int8": "int", + "torch.int8": "int", + "UInt8": "int", + "torch.uint8": "int", + "Int16": "int", + "torch.int16": "int", + "UInt16": "int", + "torch.uint16": "int", + "Int32": "int", + "torch.int32": "int", + "UInt32": "int", + "torch.uint32": "int", + "Int64": "int", + "torch.int64": "int", + "UInt64": "int", + "torch.uint64": "int", + + "Float16": "float", + "torch.float16": "float", + "Float32": "float", + "torch.float32": "float", + "Float64": "float", + "torch.float64": "float", + "BFloat16": "float", + "torch.bfloat16": "float", + + "Bool": "bool", + "torch.bool": "bool", + + "Complex64": "complex", + "torch.complex64": "complex", + "Complex128": "complex", + "torch.complex128": "complex", } -def compare_op_dict_struct(npu_dict, bench_dict): - 
return all(npu_dict.get(key) == bench_dict.get(key) for key in CompareConst.STRUCT_COMPARE_KEY) - - -def check_struct_match(npu_dict, bench_dict): - is_match = compare_op_dict_struct(npu_dict, bench_dict) - if not is_match: - struct_match_list = [] - try: - for i, key in enumerate(CompareConst.STRUCT_COMPARE_KEY): - # 首先额外检查input_struct是否空,input_struct不可能为空 - if i == 0 and (not npu_dict.get(key, []) or not bench_dict.get(key, [])): - return False - struct_match_list.append(check_type_shape_match(npu_dict.get(key, []), bench_dict.get(key, []))) - except CompareException as error: - err_msg = f'index out of bounds error occurs in npu or bench api, please check!\n' \ - f'npu_dict: {npu_dict}' \ - f'bench_dict: {bench_dict}' - logger.error(err_msg) - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error - is_match = all(struct_match_list) - return is_match - - -def check_type_shape_match(npu_struct, bench_struct): - """ - further check dtypes with a dtype mapping list when dtypes are not entirely consistent. 
- """ - if len(npu_struct) != len(bench_struct): - return False - if not npu_struct and not bench_struct: - return True - - struct_match = False - for npu_type_shape, bench_type_shape in zip(npu_struct, bench_struct): - try: - npu_type = npu_type_shape[0] - npu_shape = npu_type_shape[1] - bench_type = bench_type_shape[0] - bench_shape = bench_type_shape[1] - except IndexError as error: - logger.error(f'length of npu_type_shape: {npu_type_shape} and bench_type_shape: {bench_type_shape} ' - f'should both be 2, please check!') - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error - shape_match = npu_shape == bench_shape - type_match = ((npu_type == bench_type) or - any(npu_type in group and bench_type in group for group in CompareConst.DTYPE_MATCH_GROUPS)) - struct_match = shape_match and type_match - if not struct_match: - return False - return struct_match - - -def check_graph_mode(a_op_name, b_op_name): - if Const.ATEN in a_op_name and Const.ATEN not in b_op_name: - return True - if Const.ATEN not in a_op_name and Const.ATEN in b_op_name: - return True - return False - - -def fuzzy_check_op(npu_name_list, bench_name_list): - # 先检查api里的item长度是否相等,如果不是parameters_grad, 必然有input或者output,长度不可能为0 - # 如果是parameters_grad, "parameters_grad"字段的字典不会是空字典,因此len>=1 - if len(npu_name_list) == 0 or len(bench_name_list) == 0 or len(npu_name_list) != len(bench_name_list): - return False - is_match = True - for npu_name, bench_name in zip(npu_name_list, bench_name_list): - is_match = fuzzy_check_name(npu_name, bench_name) - if not is_match: - break - return is_match - - -def fuzzy_check_name(npu_name, bench_name): - if Const.FORWARD in npu_name and Const.FORWARD in bench_name: - is_match = rename_api(npu_name, Const.FORWARD) == rename_api(bench_name, Const.FORWARD) - elif Const.BACKWARD in npu_name and Const.BACKWARD in bench_name: - is_match = rename_api(npu_name, Const.BACKWARD) == rename_api(bench_name, Const.BACKWARD) - else: - is_match = npu_name == 
bench_name - return is_match - - def check_dump_json_str(op_data, op_name): input_list = op_data.get(Const.INPUT_ARGS, None) if op_data.get(Const.INPUT_ARGS, None) else op_data.get( Const.INPUT, None) diff --git a/debug/accuracy_tools/msprobe/core/compare/config.py b/debug/accuracy_tools/msprobe/core/compare/config.py new file mode 100644 index 0000000000000000000000000000000000000000..d50aa0ab7e869bd534ebbf22f9c93ce97e67110a --- /dev/null +++ b/debug/accuracy_tools/msprobe/core/compare/config.py @@ -0,0 +1,70 @@ +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os + +from msprobe.core.common.const import Const, CompareConst +from msprobe.core.common.file_utils import load_yaml + + +class ModeConfig: + def __init__(self, stack_mode=False, auto_analyze=True, fuzzy_match=False, dump_mode=Const.SUMMARY): + self.stack_mode = stack_mode + self.auto_analyze = auto_analyze + self.fuzzy_match = fuzzy_match + self.dump_mode = dump_mode + + +class MappingConfig: + def __init__(self, cell_mapping=None, api_mapping=None, data_mapping=None): + self.cell_mapping = cell_mapping + self.api_mapping = api_mapping + self.data_mapping = data_mapping + + +class MappingDict: + def __init__(self, mapping_config: MappingConfig): + self.cell_mapping_dict = self.load_mapping_file(mapping_config.cell_mapping) + self.api_mapping_dict = self.load_mapping_file(mapping_config.api_mapping) + if mapping_config.api_mapping is not None: + self.ms_to_pt_mapping = self.load_internal_api() + self.data_mapping_dict = self.init_data_mapping(mapping_config.data_mapping) + + @staticmethod + def load_internal_api(): + cur_path = os.path.dirname(os.path.realpath(__file__)) + yaml_path = os.path.abspath(os.path.join(cur_path, CompareConst.INTERNAL_API_MAPPING_FILE)) + return load_yaml(yaml_path) + + @staticmethod + def load_mapping_file(mapping_file): + if isinstance(mapping_file, str): + mapping_dict = load_yaml(mapping_file) + else: + mapping_dict = {} + return mapping_dict + + def init_data_mapping(self, data_mapping): + """ + 初始化data_mapping_dict + """ + if isinstance(data_mapping, str) or data_mapping is None: + data_mapping_dict = self.load_mapping_file(data_mapping) + elif isinstance(data_mapping, dict): + data_mapping_dict = data_mapping + else: + raise TypeError(f"The type of parameter `data_mapping` must be dict, str or None, but got " + f"{type(data_mapping)}") + return data_mapping_dict \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/core/compare/highlight.py b/debug/accuracy_tools/msprobe/core/compare/highlight.py index 
1983313249f34680a8f25c3a2466d8871fe0a693..71959d77d1ad3f3e293b103c6844d9641c9e51be 100644 --- a/debug/accuracy_tools/msprobe/core/compare/highlight.py +++ b/debug/accuracy_tools/msprobe/core/compare/highlight.py @@ -30,12 +30,7 @@ from msprobe.core.common.file_utils import save_workbook from msprobe.core.common.log import logger from msprobe.core.common.utils import get_header_index, safe_get_value from msprobe.core.compare.utils import table_value_is_valid, get_name_and_state, CompareException - - -class HighlightCheck(abc.ABC): - @abc.abstractmethod - def apply(self, info, color_columns, dump_mode): - raise NotImplementedError +from msprobe.core.compare.config import ModeConfig def add_highlight_row_info(color_list, num, highlight_err_msg): @@ -46,6 +41,12 @@ def add_highlight_row_info(color_list, num, highlight_err_msg): color_list.append((num, [highlight_err_msg])) +class HighlightCheck(abc.ABC): + @abc.abstractmethod + def apply(self, info, color_columns, dump_mode): + raise NotImplementedError + + class CheckOrderMagnitude(HighlightCheck): """检查Max diff的数量级差异""" @@ -75,12 +76,12 @@ class CheckOneThousandErrorRatio(HighlightCheck): if (api_in[one_thousand_index] > CompareConst.ONE_THOUSAND_ERROR_IN_RED and api_out[one_thousand_index] < CompareConst.ONE_THOUSAND_ERROR_OUT_RED): add_highlight_row_info(color_columns.red, num, - "The input/parameters's one thousandth err ratio exceeds 0.9, " + "The input/parameter's one thousandth err ratio exceeds 0.9, " "while the output's is below 0.6") elif api_in[one_thousand_index] - api_out[one_thousand_index] > CompareConst.ONE_THOUSAND_ERROR_DIFF_YELLOW: add_highlight_row_info(color_columns.yellow, num, "The output's one thousandth err ratio decreases by more than 0.1 " - "compared to the input/parameters's") + "compared to the input/parameter's") class CheckCosineSimilarity(HighlightCheck): @@ -94,7 +95,7 @@ class CheckCosineSimilarity(HighlightCheck): if api_in[cosine_index] - api_out[cosine_index] > 
CompareConst.COSINE_DIFF_YELLOW: add_highlight_row_info(color_columns.yellow, num, "The output's cosine decreases by more than 0.1 " - "compared to the input/parameters's") + "compared to the input/parameter's") class CheckMaxRelativeDiff(HighlightCheck): @@ -117,7 +118,7 @@ class CheckMaxRelativeDiff(HighlightCheck): input_max_relative_diff < CompareConst.MAX_RELATIVE_IN_YELLOW): add_highlight_row_info(color_columns.yellow, num, "The output's maximum relative error exceeds 0.1, " - "while the input/parameters's is below 0.01") + "while the input/parameter's is below 0.01") class CheckOverflow(HighlightCheck): @@ -159,73 +160,6 @@ class HighlightRules: } -def check_indices_numeric(api_items, indices: list): - """检查指定索引处的值是否都为数字类型(int 或 float)""" - return all(isinstance(api_items[i], (float, int)) for i in indices) - - -def apply_comparison_rules(api_info, dump_mode, color_columns): - """output与input/params的比较""" - if dump_mode == Const.SUMMARY: - for rule in HighlightRules.summary_compare_rules.values(): - rule.apply(api_info, color_columns, dump_mode) - else: - for rule in HighlightRules.compare_rules.values(): - rule.apply(api_info, color_columns, dump_mode) - - -def find_error_rows(result, api_batch, highlight_dict, dump_mode): - """找到单个API中需要高亮的行""" - if dump_mode == Const.MD5: - return - npu_max_index = get_header_index(CompareConst.NPU_MAX, dump_mode) - bench_max_index = get_header_index(CompareConst.BENCH_MAX, dump_mode) - max_diff_index = get_header_index(CompareConst.MAX_DIFF if dump_mode == Const.SUMMARY - else CompareConst.MAX_ABS_ERR, dump_mode) - - red_lines, yellow_lines = [], [] - LineInfo = namedtuple('LineInfo', ['line_data', 'num_pointer']) - ApiInfo = namedtuple('ApiInfo', ['api_input', 'api_output', 'num_pointer']) - ColorColumns = namedtuple('ColorColumns', ['red', 'yellow']) - color_columns = ColorColumns(red=red_lines, yellow=yellow_lines) - - api_batch_start = api_batch.start # result_df的input起始全局索引 - api_batch_params_end_index = 
api_batch.params_end_index # result_df的params结束全局索引 + 1 - api_batch_output_end_index = api_batch.output_end_index # result_df的output结束全局索引 + 1 - api_batch_params_slice_index_local = api_batch_params_end_index - api_batch_start # result的params结束局部切片索引 - api_batch_output_slice_index_local = api_batch_output_end_index - api_batch_start # result的output结束局部切片索引 - - # 对单行API的输入或输出进行误差判断 - for i, line in enumerate(result): - index = api_batch_start + i - line_info = LineInfo(line_data=line, num_pointer=index) - for rule in HighlightRules.basic_rules.values(): - rule.apply(line_info, color_columns, dump_mode) - - # 对API的输出与输入比较,进行误差判断 - for n, api_out in enumerate(result[api_batch_params_slice_index_local: api_batch_output_slice_index_local]): - index = api_batch_start + api_batch_params_slice_index_local + n - # 单行检查只有溢出检查(红色),如果已经溢出,不进一步检查 - if index in red_lines: - continue - if not check_indices_numeric(api_out, [npu_max_index, bench_max_index, max_diff_index]): - continue - - # input/parameters的比较检查, 这里api_in包括input、parameters - for _, api_in in enumerate(result[0: api_batch_params_slice_index_local]): - if not check_indices_numeric(api_in, [npu_max_index, bench_max_index, max_diff_index]): - continue - api_info = ApiInfo(api_input=api_in, api_output=api_out, num_pointer=index) - apply_comparison_rules(api_info, dump_mode, color_columns) - - red_lines_num_set = {x[0] for x in red_lines} - yellow_lines_num_set = {x[0] for x in yellow_lines} - highlight_dict.get('red_rows', set()).update(red_lines_num_set) - highlight_dict.get('yellow_rows', set()).update(yellow_lines_num_set - red_lines_num_set) - highlight_dict.get('red_lines', []).extend(red_lines) - highlight_dict.get('yellow_lines', []).extend(yellow_lines) - - class ApiBatch: def __init__(self, api_name: str, start: int): self.api_name = api_name @@ -259,159 +193,225 @@ class ApiBatch: self.params_grad_end_index += 1 -def api_batches_update(api_batches, api_name, state, index): - """ - 当一个api的所有item更新完后,input, 
output的索引范围: - input: [start: start+input_len] - output: [start+input_len: output_end_index] - params: [output_end_index: params_end_index] - """ - if not api_batches: - api_batches.append(ApiBatch(api_name, index)) - else: - api_batch = api_batches[-1] - if api_batch.api_name == api_name or ( - not re.search(Const.REGEX_FORWARD_BACKWARD, api_name) and api_name in api_batch.api_name): - try: - api_batch.increment(state) - except ValueError as e: - logger.error(f"api_batch: {api_batch} with invalid state, please check! {e}") - raise CompareException(CompareException.INVALID_STATE_ERROR) from e - else: - api_batches.append(ApiBatch(api_name, index)) +class HighLight: + def __init__(self, mode_config: ModeConfig): + self.mode_config = mode_config - -def find_compare_result_error_rows(result_df, highlight_dict, dump_mode): - """将dataframe根据API分组,并找到有误差的算子用于高亮""" - result = result_df.values - api_batches = [] - for i, res_i in enumerate(result): - api_full_name = safe_get_value(res_i, 0, "res_i") - api_name, state = get_name_and_state(api_full_name) - api_batches_update(api_batches, api_name, state, i) - with tqdm(total=len(api_batches), desc="API/Module Analyse Progress", unit="item", ncols=100) as progress_bar: - for api_batch in api_batches: - find_error_rows(result[api_batch.start: api_batch.params_grad_end_index], api_batch, highlight_dict, - dump_mode) - progress_bar.update(1) - - -def value_check(value, api_name=None, i=None, result_df_columns=None): - if not table_value_is_valid(value): - if result_df_columns: - logger.error(f"Malicious value [{value}] at api_name [{api_name}], column [{result_df_columns[i]}], " - f"is not allowed to be written into the compare result xlsx.") + @staticmethod + def api_batches_update(api_batches, api_name, state, index): + """ + 当一个api的所有item更新完后,input, output的索引范围: + input: [start: start+input_len] + output: [start+input_len: output_end_index] + params: [output_end_index: params_end_index] + """ + if not api_batches: + 
api_batches.append(ApiBatch(api_name, index)) else: - logger.error(f"Malicious value [{value}] is not allowed to be written into the compare result xlsx.") - - -def df_malicious_value_check(df_chunk, result_df_columns): - for row in df_chunk.itertuples(index=False): - api_name = row[0] - for i, value in enumerate(row): - value_check(value, api_name, i, result_df_columns) - - -def handle_multi_process_malicious_value_check(func, result_df): - result_total_nums = len(result_df) - process_num = int((multiprocessing.cpu_count() + 1) / 2) - - if result_total_nums <= process_num: - process_num = 1 - chunks = [result_df] - else: - chunk_size = result_total_nums // process_num - chunks = [result_df.iloc[i: i + chunk_size] for i in range(0, result_total_nums, chunk_size)] - - pool = multiprocessing.Pool(process_num) - - def err_call(args): - logger.error("Multiprocessing malicious value check failed! Reason: {}".format(args)) - try: - pool.terminate() - except OSError: - logger.error("Pool terminate failed") - - result_df_columns = result_df.columns.tolist() - for column in result_df_columns: - value_check(column) - for df_chunk in chunks: - pool.apply_async(func, args=(df_chunk, result_df_columns,), error_callback=err_call) - - pool.close() - pool.join() - - -def compare_result_df_convert(value): - if not isinstance(value, (float, int)) or isinstance(value, bool): # bool类型或者非数字类型转str - value = f"{str(value)}\t" if str(value) in ("inf", "-inf", "nan") else str(value) - if isinstance(value, float): - value = f"{str(value)}\t" if str(value) in ("inf", "-inf", "nan") else value - return value - - -def highlight_rows_xlsx(result_df, highlight_dict, file_path): - """Write and highlight results in Excel""" + api_batch = api_batches[-1] + if api_batch.api_name == api_name or ( + not re.search(Const.REGEX_FORWARD_BACKWARD, api_name) and api_name in api_batch.api_name): + try: + api_batch.increment(state) + except ValueError as e: + logger.error(f"api_batch: {api_batch} with invalid 
state, please check! {e}") + raise CompareException(CompareException.INVALID_STATE_ERROR) from e + else: + api_batches.append(ApiBatch(api_name, index)) + + @staticmethod + def check_indices_numeric(api_items, indices: list): + """检查指定索引处的值是否都为数字类型(int 或 float)""" + return all(isinstance(api_items[i], (float, int)) for i in indices) + + @staticmethod + def update_highlight_err_msg(result_df, highlight_dict): + if result_df.shape[1] <= 1: + return - update_highlight_err_msg(result_df, highlight_dict) # add highlight err_msg + if CompareConst.NPU_MD5 in result_df.columns: + return - wb = openpyxl.Workbook() - ws = wb.active + err_msg = result_df.get(CompareConst.ERROR_MESSAGE) + red_lines_num_set = highlight_dict.get('red_rows') + + for color in ['red', 'yellow']: + line_key = f'{color}_lines' + lines = highlight_dict.get(line_key, []) + for line_index, messages in lines: + if color == 'yellow' and line_index in red_lines_num_set: + continue # 如果是 yellow 行,且已被 red 行覆盖,跳过 + + for msg in messages: + if err_msg[line_index] == '': + err_msg[line_index] = msg + else: + err_msg[line_index] += '\n' + msg + + if color == 'red': + red_lines_num_set.add(line_index) + + result_df[CompareConst.ERROR_MESSAGE] = err_msg + + @staticmethod + def compare_result_df_convert(value): + if not isinstance(value, (float, int)) or isinstance(value, bool): # bool类型或者非数字类型转str + value = f"{str(value)}\t" if str(value) in ("inf", "-inf", "nan") else str(value) + if isinstance(value, float): + value = f"{str(value)}\t" if str(value) in ("inf", "-inf", "nan") else value + return value + + @staticmethod + def value_check(value, api_name=None, i=None, result_df_columns=None): + if not table_value_is_valid(value): + if result_df_columns: + logger.error(f"Malicious value [{value}] at api_name [{api_name}], column [{result_df_columns[i]}], " + f"is not allowed to be written into the compare result xlsx.") + else: + logger.error(f"Malicious value [{value}] is not allowed to be written into the compare 
result xlsx.") + + def find_compare_result_error_rows(self, result_df, highlight_dict): + """将dataframe根据API分组,并找到有误差的算子用于高亮""" + result = result_df.values + api_batches = [] + for i, res_i in enumerate(result): + api_full_name = safe_get_value(res_i, 0, "res_i") + api_name, state = get_name_and_state(api_full_name) + self.api_batches_update(api_batches, api_name, state, i) + with tqdm(total=len(api_batches), desc="API/Module Analyse Progress", unit="item", ncols=100) as progress_bar: + for api_batch in api_batches: + self.find_error_rows(result[api_batch.start: api_batch.params_grad_end_index], api_batch, + highlight_dict) + progress_bar.update(1) + + def find_error_rows(self, result, api_batch, highlight_dict): + """找到单个API中需要高亮的行""" + if self.mode_config.dump_mode == Const.MD5: + return + npu_max_index = get_header_index(CompareConst.NPU_MAX, self.mode_config.dump_mode) + bench_max_index = get_header_index(CompareConst.BENCH_MAX, self.mode_config.dump_mode) + max_diff_index = get_header_index(CompareConst.MAX_DIFF if self.mode_config.dump_mode == Const.SUMMARY + else CompareConst.MAX_ABS_ERR, self.mode_config.dump_mode) + + red_lines, yellow_lines = [], [] + LineInfo = namedtuple('LineInfo', ['line_data', 'num_pointer']) + ApiInfo = namedtuple('ApiInfo', ['api_input', 'api_output', 'num_pointer']) + ColorColumns = namedtuple('ColorColumns', ['red', 'yellow']) + color_columns = ColorColumns(red=red_lines, yellow=yellow_lines) + + api_batch_start = api_batch.start # result_df的input起始全局索引 + api_batch_params_end_index = api_batch.params_end_index # result_df的params结束全局索引 + 1 + api_batch_output_end_index = api_batch.output_end_index # result_df的output结束全局索引 + 1 + api_batch_params_slice_index_local = api_batch_params_end_index - api_batch_start # result的params结束局部切片索引 + api_batch_output_slice_index_local = api_batch_output_end_index - api_batch_start # result的output结束局部切片索引 + + # 对单行API的输入或输出进行误差判断 + for i, line in enumerate(result): + index = api_batch_start + i + 
line_info = LineInfo(line_data=line, num_pointer=index) + for rule in HighlightRules.basic_rules.values(): + rule.apply(line_info, color_columns, self.mode_config.dump_mode) + + # 对API的输出与输入比较,进行误差判断 + for n, api_out in enumerate(result[api_batch_params_slice_index_local: api_batch_output_slice_index_local]): + index = api_batch_start + api_batch_params_slice_index_local + n + # 单行检查只有溢出检查(红色),如果已经溢出,不进一步检查 + if index in red_lines: + continue + if not self.check_indices_numeric(api_out, [npu_max_index, bench_max_index, max_diff_index]): + continue - # write header - logger.info('Initializing Excel file.') + # input/parameters的比较检查, 这里api_in包括input、parameters + for api_in in result[0: api_batch_params_slice_index_local]: + if not self.check_indices_numeric(api_in, [npu_max_index, bench_max_index, max_diff_index]): + continue + api_info = ApiInfo(api_input=api_in, api_output=api_out, num_pointer=index) + self.apply_comparison_rules(api_info, color_columns) + + red_lines_num_set = {x[0] for x in red_lines} + yellow_lines_num_set = {x[0] for x in yellow_lines} + highlight_dict.get('red_rows', set()).update(red_lines_num_set) + highlight_dict.get('yellow_rows', set()).update(yellow_lines_num_set - red_lines_num_set) + highlight_dict.get('red_lines', []).extend(red_lines) + highlight_dict.get('yellow_lines', []).extend(yellow_lines) + + def apply_comparison_rules(self, api_info, color_columns): + """output与input/params的比较""" + if self.mode_config.dump_mode == Const.SUMMARY: + for rule in HighlightRules.summary_compare_rules.values(): + rule.apply(api_info, color_columns, self.mode_config.dump_mode) + else: + for rule in HighlightRules.compare_rules.values(): + rule.apply(api_info, color_columns, self.mode_config.dump_mode) - handle_multi_process_malicious_value_check(df_malicious_value_check, result_df) + def highlight_rows_xlsx(self, result_df, highlight_dict, file_path): + """Write and highlight results in Excel""" - result_df_convert = 
result_df.applymap(compare_result_df_convert) + self.update_highlight_err_msg(result_df, highlight_dict) # add highlight err_msg - for row in dataframe_to_rows(result_df_convert, index=False, header=True): - ws.append(row) + wb = openpyxl.Workbook() + ws = wb.active - # 对可疑数据标色 - logger.info('Coloring Excel in progress.') - col_len = len(result_df.columns) - red_fill = PatternFill( - start_color=CompareConst.RED, end_color=CompareConst.RED, fill_type="solid" - ) - yellow_fill = PatternFill( - start_color=CompareConst.YELLOW, end_color=CompareConst.YELLOW, fill_type="solid", - ) - for i in highlight_dict.get("red_rows", []): - for j in range(1, col_len + 1): - ws.cell(row=i + 2, column=j).fill = red_fill # 2因为ws.cell中的row或column需要>=1,数据从第2行开始 - for i in highlight_dict.get("yellow_rows", []): - for j in range(1, col_len + 1): - ws.cell(row=i + 2, column=j).fill = yellow_fill + # write header + logger.info('Initializing Excel file.') - logger.info('Saving Excel file to disk: %s' % file_path) - save_workbook(wb, file_path) + self.handle_multi_process_malicious_value_check(self.df_malicious_value_check, result_df) + result_df_convert = result_df.applymap(self.compare_result_df_convert) -def update_highlight_err_msg(result_df, highlight_dict): - if result_df.shape[1] <= 1: - return + for row in dataframe_to_rows(result_df_convert, index=False, header=True): + ws.append(row) - if CompareConst.NPU_MD5 in result_df.columns: - return + # 对可疑数据标色 + logger.info('Coloring Excel in progress.') + col_len = len(result_df.columns) + red_fill = PatternFill( + start_color=CompareConst.RED, end_color=CompareConst.RED, fill_type="solid" + ) + yellow_fill = PatternFill( + start_color=CompareConst.YELLOW, end_color=CompareConst.YELLOW, fill_type="solid", + ) + for i in highlight_dict.get("red_rows", []): + for j in range(1, col_len + 1): + ws.cell(row=i + 2, column=j).fill = red_fill # 2因为ws.cell中的row或column需要>=1,数据从第2行开始 + for i in highlight_dict.get("yellow_rows", []): + for j in 
range(1, col_len + 1): + ws.cell(row=i + 2, column=j).fill = yellow_fill - err_msg = result_df.get(CompareConst.ERROR_MESSAGE) - red_lines_num_set = highlight_dict.get('red_rows') + logger.info('Saving Excel file to disk: %s' % file_path) + save_workbook(wb, file_path) - for color in ['red', 'yellow']: - line_key = f'{color}_lines' - lines = highlight_dict.get(line_key, []) - for line_index, messages in lines: - if color == 'yellow' and line_index in red_lines_num_set: - continue # 如果是 yellow 行,且已被 red 行覆盖,跳过 + def handle_multi_process_malicious_value_check(self, func, result_df): + result_total_nums = len(result_df) + process_num = int((multiprocessing.cpu_count() + 1) / 2) - for msg in messages: - if err_msg[line_index] == '': - err_msg[line_index] = msg - else: - err_msg[line_index] += '\n' + msg + if result_total_nums <= process_num: + process_num = 1 + chunks = [result_df] + else: + chunk_size = result_total_nums // process_num + chunks = [result_df.iloc[i: i + chunk_size] for i in range(0, result_total_nums, chunk_size)] - if color == 'red': - red_lines_num_set.add(line_index) + pool = multiprocessing.Pool(process_num) - result_df[CompareConst.ERROR_MESSAGE] = err_msg + def err_call(args): + logger.error("Multiprocessing malicious value check failed! 
Reason: {}".format(args)) + try: + pool.close() + except OSError: + logger.error("Pool terminate failed") + + result_df_columns = result_df.columns.tolist() + for column in result_df_columns: + self.value_check(column) + for df_chunk in chunks: + pool.apply_async(func, args=(df_chunk, result_df_columns,), error_callback=err_call) + + pool.close() + pool.join() + + def df_malicious_value_check(self, df_chunk, result_df_columns): + for row in df_chunk.itertuples(index=False): + api_name = row[0] + for i, value in enumerate(row): + self.value_check(value, api_name, i, result_df_columns) diff --git a/debug/accuracy_tools/msprobe/core/compare/merge_result/merge_result.py b/debug/accuracy_tools/msprobe/core/compare/merge_result/merge_result.py index dff60ec7288a532e21aaa5357c93454604f72dd2..286fd4622180f4028838f7f8d30bd718d702e6f1 100644 --- a/debug/accuracy_tools/msprobe/core/compare/merge_result/merge_result.py +++ b/debug/accuracy_tools/msprobe/core/compare/merge_result/merge_result.py @@ -238,7 +238,7 @@ def handle_multi_process(func, func_args, lock): def err_call(args): logger.error('Multiprocess merge result failed! 
Reason: {}'.format(args)) try: - pool.terminate() + pool.close() except OSError: logger.error("Pool terminate failed") diff --git a/debug/accuracy_tools/msprobe/mindspore/compare/ms_to_pt_api.yaml b/debug/accuracy_tools/msprobe/core/compare/ms_to_pt_api.yaml similarity index 100% rename from debug/accuracy_tools/msprobe/mindspore/compare/ms_to_pt_api.yaml rename to debug/accuracy_tools/msprobe/core/compare/ms_to_pt_api.yaml diff --git a/debug/accuracy_tools/msprobe/core/compare/multiprocessing_compute.py b/debug/accuracy_tools/msprobe/core/compare/multiprocessing_compute.py index 71b0f29d64f717adc87b74cf48e891652e9e753f..510e9fd01be89c6f9c64657c7c45774f010226e2 100644 --- a/debug/accuracy_tools/msprobe/core/compare/multiprocessing_compute.py +++ b/debug/accuracy_tools/msprobe/core/compare/multiprocessing_compute.py @@ -23,48 +23,20 @@ from tqdm import tqdm from msprobe.core.common.log import logger from msprobe.core.common.utils import CompareException from msprobe.core.common.const import CompareConst +from msprobe.core.common.exceptions import FileCheckException +from msprobe.core.compare.npy_compare import compare_ops_apply, get_error_flag_and_msg +from msprobe.core.compare.config import ModeConfig -def _handle_multi_process(func, input_param, result_df, lock): - process_num = max(int((multiprocessing.cpu_count() + 1) // 4), 1) - op_name_mapping_dict = read_dump_data(result_df) - - df_chunk_size = len(result_df) // process_num - if df_chunk_size > 0: - df_chunks = [result_df.iloc[i:i + df_chunk_size] for i in range(0, len(result_df), df_chunk_size)] - else: - df_chunks = [result_df] - - results = [] - pool = multiprocessing.Pool(process_num) - - def err_call(args): - logger.error('multiprocess compare failed! 
Reason: {}'.format(args)) - try: - pool.terminate() - except OSError as e: - logger.error("pool terminate failed") - - progress_bar = tqdm(total=len(result_df), desc="API/Module Item Compare Process", unit="row", ncols=100) - - def update_progress(size, progress_lock, extra_param=None): - with progress_lock: - progress_bar.update(size) - - for process_idx, df_chunk in enumerate(df_chunks): - idx = df_chunk_size * process_idx - chunk_size = len(df_chunk) - result = pool.apply_async(func, - args=(idx, op_name_mapping_dict, df_chunk, lock, input_param), - error_callback=err_call, - callback=partial(update_progress, chunk_size, lock) - ) - results.append(result) - - final_results = [r.get() for r in results] - pool.close() - pool.join() - return pd.concat(final_results, ignore_index=True) +@dataclass +class ComparisonResult: + cos_result: list + euc_dist_result: list + max_err_result: list + max_relative_err_result: list + one_thousand_err_ratio_result: list + five_thousand_err_ratio_result: list + err_msgs: list def _ms_graph_handle_multi_process(func, result_df, mode): @@ -81,9 +53,9 @@ def _ms_graph_handle_multi_process(func, result_df, mode): def err_call(args): logger.error('multiprocess compare failed! 
Reason: {}'.format(args)) try: - pool.terminate() + pool.close() except OSError as e: - logger.error("pool terminate failed") + logger.error(f'pool terminate failed: {str(e)}') for df_chunk in df_chunks: result = pool.apply_async(func, args=(df_chunk, mode), error_callback=err_call) @@ -94,74 +66,6 @@ def _ms_graph_handle_multi_process(func, result_df, mode): return pd.concat(final_results, ignore_index=True) -def read_dump_data(result_df): - try: - npu_dump_name_list = result_df.iloc[0:, 0].tolist() - dump_tensor_pair_list = result_df.iloc[0:, -1].tolist() - op_name_mapping_dict = {} - for index, _ in enumerate(npu_dump_name_list): - npu_dump_name = npu_dump_name_list[index] - dump_tensor_pair = dump_tensor_pair_list[index] - op_name_mapping_dict[npu_dump_name] = dump_tensor_pair - return op_name_mapping_dict - except ValueError as e: - logger.error('result dataframe is not found.') - raise CompareException(CompareException.INVALID_DATA_ERROR) from e - except IndexError as e: - logger.error('result dataframe elements can not be access.') - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from e - - -@dataclass -class ComparisonResult: - cos_result: list - euc_dist_result: list - max_err_result: list - max_relative_err_result: list - one_thousand_err_ratio_result: list - five_thousand_err_ratio_result: list - err_msgs: list - - -def _save_cmp_result(offset, result: ComparisonResult, result_df, lock): - """ - Save comparison results into the result DataFrame with thread safety. 
- Args: - offset: offset for index - result: data struct of ComparisonResult - result_df: result of DataFrame - lock: thread lock - - Returns: - comparison results in DataFrame - """ - - lock.acquire() - try: - for i, _ in enumerate(result.cos_result): - process_index = i + offset - result_df.loc[process_index, CompareConst.COSINE] = result.cos_result[i] - result_df.loc[process_index, CompareConst.EUC_DIST] = result.euc_dist_result[i] - result_df.loc[process_index, CompareConst.MAX_ABS_ERR] = result.max_err_result[i] - result_df.loc[process_index, CompareConst.MAX_RELATIVE_ERR] = result.max_relative_err_result[i] - result_df.loc[process_index, CompareConst.ONE_THOUSANDTH_ERR_RATIO] = ( - result.one_thousand_err_ratio_result)[i] - result_df.loc[process_index, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO] = ( - result.five_thousand_err_ratio_result)[i] - result_df.loc[process_index, CompareConst.ACCURACY] = ( - check_accuracy(result.cos_result[i], result.max_err_result[i])) - result_df.loc[process_index, CompareConst.ERROR_MESSAGE] = result.err_msgs[i] - return result_df - except ValueError as e: - logger.error('result dataframe is not found.') - raise CompareException(CompareException.INVALID_DATA_ERROR) from e - except IndexError as e: - logger.error('result dataframe elements can not be access.') - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from e - finally: - lock.release() - - def check_accuracy(cos, max_abs_err): if cos == CompareConst.SHAPE_UNMATCH: return CompareConst.ACCURACY_CHECK_UNMATCH @@ -179,3 +83,212 @@ def check_accuracy(cos, max_abs_err): if cos < CompareConst.COS_MAX_THRESHOLD or max_abs_err > CompareConst.MAX_ABS_ERR_MAX_THRESHOLD: return CompareConst.ACCURACY_CHECK_NO return CompareConst.ACCURACY_CHECK_YES + + +class CompareRealData: + def __init__(self, file_reader, mode_config: ModeConfig, cross_frame): + self.file_reader = file_reader + self.mode_config = mode_config + self.cross_frame = cross_frame + + @staticmethod + def 
read_dump_data(result_df): + try: + npu_dump_name_list = result_df.iloc[0:, 0].tolist() + dump_tensor_pair_list = result_df.iloc[0:, -1].tolist() + op_name_mapping_dict = {} + for index, npu_dump_name in enumerate(npu_dump_name_list): + dump_tensor_pair = dump_tensor_pair_list[index] + op_name_mapping_dict[npu_dump_name] = dump_tensor_pair + return op_name_mapping_dict + except ValueError as e: + logger.error('result dataframe is not found.') + raise CompareException(CompareException.INVALID_DATA_ERROR) from e + except IndexError as e: + logger.error('result dataframe elements can not be access.') + raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from e + + @staticmethod + def _save_cmp_result(offset, result: ComparisonResult, result_df, lock): + """ + Save comparison results into the result DataFrame with thread safety. + Args: + offset: offset for index + result: data struct of ComparisonResult + result_df: result of DataFrame + lock: thread lock + + Returns: + comparison results in DataFrame + """ + + lock.acquire() + try: + for i, cos_item in enumerate(result.cos_result): + process_index = i + offset + result_df.loc[process_index, CompareConst.COSINE] = cos_item + result_df.loc[process_index, CompareConst.EUC_DIST] = result.euc_dist_result[i] + result_df.loc[process_index, CompareConst.MAX_ABS_ERR] = result.max_err_result[i] + result_df.loc[process_index, CompareConst.MAX_RELATIVE_ERR] = result.max_relative_err_result[i] + result_df.loc[process_index, CompareConst.ONE_THOUSANDTH_ERR_RATIO] = ( + result.one_thousand_err_ratio_result)[i] + result_df.loc[process_index, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO] = ( + result.five_thousand_err_ratio_result)[i] + result_df.loc[process_index, CompareConst.ACCURACY] = ( + check_accuracy(result.cos_result[i], result.max_err_result[i])) + result_df.loc[process_index, CompareConst.ERROR_MESSAGE] = result.err_msgs[i] + return result_df + except ValueError as e: + logger.error('result dataframe is not 
found.') + raise CompareException(CompareException.INVALID_DATA_ERROR) from e + except IndexError as e: + logger.error('result dataframe elements can not be access.') + raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from e + finally: + lock.release() + + def compare_by_op(self, npu_op_name, bench_op_name, op_name_mapping_dict, input_param): + """ + :param npu_op_name: excel中的NPU_Name,例如:MintFunctional.conv2d.0.forward.input.3.0 + :param bench_op_name: excel中的Bench_Name,例如:Functional.conv2d.0.forward.input.3.0 + :param op_name_mapping_dict: op_name和npy或pt文件的映射关系 + :param input_param: npu_json_path/bench_json_path/stack_json_path等参数 + :return: result_list,包含余弦相似度、最大绝对误差、最大相对误差、千分之一误差率、千分之五误差率和错误信息 + 用于读取excel中的NPU_Name和Bench_Name,根据映射关系找到npy或pt文件,然后读取文件中的数据进行比较,计算余弦相似度、欧式距离 + 最大绝对误差、最大相对误差、千分之一误差率、千分之五误差率并生成错误信息 + """ + error_file, relative_err, error_flag = None, None, False + + data_name_pair = op_name_mapping_dict.get(npu_op_name) + npu_data_name = data_name_pair[0] + bench_data_name = data_name_pair[1] + + if str(npu_data_name) == CompareConst.NO_REAL_DATA_FLAG: # 没有npu真实数据 + n_value, b_value, error_flag = CompareConst.READ_NONE, CompareConst.READ_NONE, True + elif str(bench_data_name) == CompareConst.NO_REAL_DATA_FLAG: # 没有bench真实数据 + n_value, b_value, error_flag = CompareConst.READ_NONE, CompareConst.READ_NONE, True + error_file = 'no_bench_data' + elif str(bench_data_name) == CompareConst.N_A: # bench没匹配 + n_value, b_value, error_flag = CompareConst.READ_NONE, CompareConst.READ_NONE, True + error_file = None + else: + npu_dir = input_param.get(CompareConst.NPU_DUMP_DATA_DIR) + bench_dir = input_param.get(CompareConst.BENCH_DUMP_DATA_DIR) + try: + n_value, b_value = self.file_reader(npu_dir, npu_data_name, bench_dir, bench_data_name, + self.cross_frame) + except IOError as error: + error_file = error.filename + n_value, b_value = CompareConst.READ_NONE, CompareConst.READ_NONE + error_flag = True + except (FileCheckException, 
CompareException): + error_file = npu_data_name + n_value, b_value = CompareConst.READ_NONE, CompareConst.READ_NONE + error_flag = True + + # 通过n_value, b_value同时得到错误标志和错误信息 + n_value, b_value, error_flag, err_msg = get_error_flag_and_msg(n_value, b_value, + error_flag=error_flag, error_file=error_file) + + result_list, err_msg = compare_ops_apply(n_value, b_value, error_flag, err_msg) + + if self.mode_config.fuzzy_match and npu_op_name != bench_op_name and bench_op_name != CompareConst.N_A: + err_msg += " Fuzzy matching data, the comparison accuracy may be affected." + result_list.append(err_msg) + return result_list + + def compare_ops(self, idx, dump_path_dict, result_df, lock, input_param): + cos_result = [] + euc_dist_result = [] + max_err_result = [] + max_relative_err_result = [] + one_thousand_err_ratio_result = [] + five_thousand_err_ratio_result = [] + err_mess = [] + + is_print_compare_log = input_param.get("is_print_compare_log") + + for i in range(len(result_df)): + npu_op_name = result_df.iloc[i, 0] + bench_op_name = result_df.iloc[i, 1] + if is_print_compare_log: + logger.info("start compare: {}".format(npu_op_name)) + + cos_sim, euc_dist, max_abs_err, max_relative_err, one_thousand_err_ratio, five_thousand_err_ratio, err_msg \ + = self.compare_by_op(npu_op_name, bench_op_name, dump_path_dict, input_param) + + if is_print_compare_log: + logger.info( + "[{}] Compare result: cosine {}, max_abs_err {}, max_relative_err {}, {}, \ + one_thousand_err_ratio {}, " + "five_thousand_err_ratio {}".format(npu_op_name, cos_sim, max_abs_err, max_relative_err, + err_msg, one_thousand_err_ratio, five_thousand_err_ratio)) + cos_result.append(cos_sim) + euc_dist_result.append(euc_dist) + max_err_result.append(max_abs_err) + max_relative_err_result.append(max_relative_err) + one_thousand_err_ratio_result.append(one_thousand_err_ratio) + five_thousand_err_ratio_result.append(five_thousand_err_ratio) + err_mess.append(err_msg) + + cr = ComparisonResult( + 
cos_result=cos_result, + euc_dist_result=euc_dist_result, + max_err_result=max_err_result, + max_relative_err_result=max_relative_err_result, + one_thousand_err_ratio_result=one_thousand_err_ratio_result, + five_thousand_err_ratio_result=five_thousand_err_ratio_result, + err_msgs=err_mess + ) + + return self._save_cmp_result(idx, cr, result_df, lock) + + def do_multi_process(self, input_param, result_df): + try: + result_df = self._handle_multi_process(self.compare_ops, input_param, result_df, + multiprocessing.Manager().RLock()) + return result_df + except ValueError as e: + logger.error('result dataframe is not found.') + raise CompareException(CompareException.INVALID_DATA_ERROR) from e + + def _handle_multi_process(self, func, input_param, result_df, lock): + process_num = max(int((multiprocessing.cpu_count() + 1) // 4), 1) + op_name_mapping_dict = self.read_dump_data(result_df) + + df_chunk_size = len(result_df) // process_num + if df_chunk_size > 0: + df_chunks = [result_df.iloc[i:i + df_chunk_size] for i in range(0, len(result_df), df_chunk_size)] + else: + df_chunks = [result_df] + + results = [] + pool = multiprocessing.Pool(process_num) + + def err_call(args): + logger.error('multiprocess compare failed! 
Reason: {}'.format(args)) + try: + pool.close() + except OSError: + logger.error("pool terminate failed") + + progress_bar = tqdm(total=len(result_df), desc="API/Module Item Compare Process", unit="row", ncols=100) + + def update_progress(size, progress_lock, extra_param=None): + with progress_lock: + progress_bar.update(size) + + for process_idx, df_chunk in enumerate(df_chunks): + idx = df_chunk_size * process_idx + chunk_size = len(df_chunk) + result = pool.apply_async(func, + args=(idx, op_name_mapping_dict, df_chunk, lock, input_param), + error_callback=err_call, + callback=partial(update_progress, chunk_size, lock) + ) + results.append(result) + + final_results = [r.get() for r in results] + pool.close() + pool.join() + return pd.concat(final_results, ignore_index=True) diff --git a/debug/accuracy_tools/msprobe/core/compare/npy_compare.py b/debug/accuracy_tools/msprobe/core/compare/npy_compare.py index 4103d361fec14284fc38f97e1418e5405e939cd9..b58d2854ef1c3ffcb62f144a1c3101f38efbd55b 100644 --- a/debug/accuracy_tools/msprobe/core/compare/npy_compare.py +++ b/debug/accuracy_tools/msprobe/core/compare/npy_compare.py @@ -290,10 +290,8 @@ class CompareOps: def error_value_process(n_value): - if n_value == CompareConst.READ_NONE or n_value == CompareConst.UNREADABLE: + if n_value in [CompareConst.READ_NONE, CompareConst.UNREADABLE, CompareConst.NONE]: return CompareConst.UNSUPPORTED, "" - if n_value == CompareConst.NONE: - return 0, "" if n_value == CompareConst.SHAPE_UNMATCH: return CompareConst.SHAPE_UNMATCH, "" if n_value == CompareConst.NAN: diff --git a/debug/accuracy_tools/msprobe/core/compare/utils.py b/debug/accuracy_tools/msprobe/core/compare/utils.py index 035e7d1cda05968834429d6a6399b7044c3573b1..951a9d8a51d32808c492c52763157485de73f353 100644 --- a/debug/accuracy_tools/msprobe/core/compare/utils.py +++ b/debug/accuracy_tools/msprobe/core/compare/utils.py @@ -20,6 +20,7 @@ import zlib from dataclasses import dataclass import numpy as np +import pandas 
as pd from msprobe.core.common.const import Const, CompareConst, FileCheckConst from msprobe.core.common.utils import CompareException, check_regex_prefix_format_valid, logger, safe_get_value @@ -81,22 +82,6 @@ def check_and_return_dir_contents(dump_dir, prefix): return contents -def rename_api(npu_name, process): - """ - 原api: {api_type}.{api_name}.{API调用次数}.{前向反向}.{input/output}.{参数序号} - rename后: {api_type}.{api_name}.{input/output}.{参数序号} - """ - npu_split = npu_name.split(process) - try: - torch_func_index, in_out = npu_split[0], npu_split[1] - except IndexError as error: - logger.error(f'{npu_name} can not be split with {process}, please check!') - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error - torch_func_split = torch_func_index.rsplit(Const.SEP, 2) - torch_func = str(torch_func_split[0]) + str(in_out) - return torch_func - - def read_op(op_data, op_name): if Const.PARAMS_GRAD in op_name.split(Const.SEP): op_parsed_list = op_item_parse(op_data, op_name) @@ -191,35 +176,146 @@ def gen_op_item(op_data, op_name): return op_item -def resolve_api_special_parameters(data_dict, full_op_name, item_list): +@dataclass +class ApiItemInfo: + name: str + struct: tuple + stack_info: list + + +def merge_tensor(tensor_list, dump_mode): + keys = [ + CompareConst.OP_NAME, + CompareConst.INPUT_STRUCT, + CompareConst.KWARGS_STRUCT, + CompareConst.OUTPUT_STRUCT, + CompareConst.PARAMS_STRUCT, + CompareConst.PARAMS_GRAD_STRUCT, + Const.SUMMARY, + Const.STACK_INFO + ] + op_dict = {key: [] for key in keys} + + if dump_mode == Const.ALL: + op_dict["data_name"] = [] + + for tensor in tensor_list: + # A dict(len=2) with 'full_op_name' and 'full_info' is added to the tensor only if self.stack_mode is True + if len(tensor) == 2: + op_dict[Const.STACK_INFO].append(tensor['full_info']) + break + + op_dict[CompareConst.OP_NAME].append(tensor['full_op_name']) + + _, state = get_name_and_state(tensor['full_op_name']) + struct_key = 
CompareConst.STATE_TO_STRUCT_MAPPING.get(state) + if not struct_key: + continue + if dump_mode == Const.MD5: + op_dict.get(struct_key).append((tensor[Const.DTYPE], tensor[Const.SHAPE], tensor[Const.MD5])) + else: + op_dict.get(struct_key).append((tensor[Const.DTYPE], tensor[Const.SHAPE])) + op_dict[Const.SUMMARY].append([tensor[Const.MAX], tensor[Const.MIN], tensor[Const.MEAN], tensor[Const.NORM]]) + + if dump_mode == Const.ALL: + op_dict["data_name"].append(tensor['data_name']) + + if not op_dict[CompareConst.KWARGS_STRUCT]: + del op_dict[CompareConst.KWARGS_STRUCT] + return op_dict if op_dict[CompareConst.OP_NAME] else {} + + +def print_compare_ends_info(): + total_len = len(CompareConst.COMPARE_ENDS_SUCCESSFULLY) + Const.FILL_CHAR_NUMS + logger.info('*' * total_len) + logger.info(f"*{CompareConst.COMPARE_ENDS_SUCCESSFULLY.center(total_len - 2)}*") + logger.info('*' * total_len) + + +def table_value_is_valid(value: str) -> bool: + if not isinstance(value, str): + return True + try: + # -1.00 or +1.00 should be considered as digit numbers + float(value) + except ValueError: + # otherwise, they will be considered as formular injections + return not bool(re.compile(FileCheckConst.CSV_BLACK_LIST).search(value)) + return True + + +def get_name_and_state(name): """ - Function Description: - 解析下面格式的数据, 是api参数的一种特殊格式 - { - "last_hidden_state": { - "type": "torch.Tensor", - "dtype": "torch.bfloat16", - ... - }, - "loss": { - "type": "torch.Tensor", - "dtype": "torch.float32", - ... 
- } - } - Parameter: - data_dict: 字典格式的数据 - full_op_name: 参数的全名字符串 - item_list: 参数信息集合 + Get api/module name and state + example: + name = 'conv2d.forward.1.input.0' + return: ('conv2d.forward.1.', 'input') + + name = 'Functional.pad.0.backward.output.0' + return: ('Functional.pad.0.backward.', 'output') + + state type: input, output, kwargs, parameters, parameters_grad """ - for key, value in data_dict.items(): - if isinstance(value, dict): - parsed_item = value - parts = full_op_name.split(Const.SEP) - parts.insert(-1, key) - full_op_name_new = ".".join(parts) - parsed_item['full_op_name'] = full_op_name_new - item_list.append(parsed_item) + if not isinstance(name, str): + logger.error(f'Invalid name: {name}, type should be string, please check.') + raise CompareException(CompareException.INVALID_API_NAME_ERROR) + + if Const.PARAMS_GRAD in name.split(Const.SEP): + return name.split(Const.PARAMS_GRAD)[0], Const.PARAMS_GRAD + + split = re.split(Const.REGEX_FORWARD_BACKWARD, name) + if len(split) < 3: + logger.error(f'Invalid name string: {name}, can not be split by forward/backward, please check.') + raise CompareException(CompareException.INVALID_API_NAME_ERROR) + api = f'{split[0]}.{split[1]}.' 
+ state_str = split[2] + match = re.match(r'^(\d+\.)?(input|output|kwargs|parameters)\..+$', state_str) + if not match: + raise CompareException(f'Invalid name string: {name}') + if match.group(1): + api = f'{api}{match.group(1)}' + state = match.group(2) + return api, state + + +def reorder_op_name_list(op_name_list): + if not op_name_list: + return op_name_list + + parameters = [] + output = [] + parameters_grad = [] + others = [] + for x in op_name_list: + state = get_name_and_state(x)[1] + if state == Const.PARAMS: + parameters.append(x) + elif state == Const.OUTPUT: + output.append(x) + elif state == Const.PARAMS_GRAD: + parameters_grad.append(x) + else: + others.append(x) + # 合并others, parameters, 和output,确保parameters排在output前面 + op_name_reorder = others + parameters + output + parameters_grad + return op_name_reorder + + +def reorder_op_x_list(op_name_list, summary_list, data_name_list): + """对op_name, summary, data_name重新排序,把parameters放到input后output前,data_name由于统计量比对时,为None,单独处理""" + if not op_name_list or not summary_list: + return op_name_list, summary_list, data_name_list + + index_map = {name: index for index, name in enumerate(op_name_list)} + + op_name_reorder = reorder_op_name_list(op_name_list) + summary_reorder = [summary_list[index_map.get(name)] for name in op_name_reorder] + if data_name_list: + data_name_reorder = [data_name_list[index_map.get(name)] for name in op_name_reorder] + else: + data_name_reorder = data_name_list + + return op_name_reorder, summary_reorder, data_name_reorder def process_summary_data(summary_data): @@ -407,204 +503,23 @@ def get_accuracy(result, n_dict, b_dict, dump_mode): CompareConst.PARAMS_GRAD_STRUCT) -def append_stack_info(result_item, npu_stack_info, index): - """添加堆栈信息到 result_item""" - if npu_stack_info and index == 0: - result_item.extend(npu_stack_info) - else: - result_item.append(CompareConst.NONE) - - -def get_un_match_accuracy(result, n_dict, dump_mode): - npu_stack_info = n_dict.get("stack_info", None) - 
bench_name, bench_type, bench_shape = CompareConst.N_A, CompareConst.N_A, CompareConst.N_A +def make_result_table(result, dump_mode, stack_mode): + header = CompareConst.HEAD_OF_COMPARE_MODE[dump_mode][:] - struct_to_index_mapping = { - CompareConst.INPUT_STRUCT: 0, - CompareConst.OUTPUT_STRUCT: 0, - CompareConst.PARAMS_STRUCT: 0, - CompareConst.PARAMS_GRAD_STRUCT: 0 - } - - op_name_list = n_dict.get(CompareConst.OP_NAME) - summary_list = n_dict.get(Const.SUMMARY) - data_name_list = n_dict.get('data_name') - op_name_reorder, summary_reorder, _ = reorder_op_x_list(op_name_list, - summary_list, - data_name_list) - for index, n_name in enumerate(op_name_reorder): - _, state = get_name_and_state(n_name) - struct_key = CompareConst.STATE_TO_STRUCT_MAPPING.get(state) - if not struct_key: - continue - n_struct = safe_get_value(n_dict, struct_to_index_mapping.get(struct_key), "n_dict", key=struct_key) - struct_to_index_mapping[struct_key] += 1 - - try: - result_item = [n_name, bench_name, n_struct[0], bench_type, n_struct[1], bench_shape] - except IndexError as e: - err_msg = "index out of bounds error occurs, please check!\n" \ - f"op_name of n_dict is {n_dict['op_name']}\n" \ - f"input_struct of n_dict is {n_dict[CompareConst.INPUT_STRUCT]}\n" \ - f"output_struct of n_dict is {n_dict[CompareConst.OUTPUT_STRUCT]}" - logger.error(err_msg) - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from e - - if dump_mode == Const.MD5: - result_item.extend([CompareConst.N_A] * 3) - append_stack_info(result_item, npu_stack_info, index) - result.append(result_item) - continue - if dump_mode == Const.SUMMARY: - result_item.extend([CompareConst.N_A] * 8) # 8个统计量数据情况的比对指标 + if stack_mode: + header.append(CompareConst.STACK) if dump_mode == Const.ALL: - result_item.extend([CompareConst.N_A] * 6) # 6个真实数据情况的比对指标 - - npu_summary_data = safe_get_value(summary_reorder, index, "summary_reorder") - bench_summary_data = [CompareConst.N_A] * 4 - 
result_item.extend(npu_summary_data) - result_item.extend(bench_summary_data) - err_msg = CompareConst.NO_BENCH - accuracy_check_res = CompareConst.N_A - result_item.append(accuracy_check_res) - result_item.append(err_msg) - append_stack_info(result_item, npu_stack_info, index) - if dump_mode == Const.ALL and result_item[1] == CompareConst.N_A: - result_item.extend([["-1", "-1"]]) - result.append(result_item) - - -def merge_tensor(tensor_list, dump_mode): - op_dict = {} - op_dict["op_name"] = [] - op_dict[CompareConst.INPUT_STRUCT] = [] - op_dict[CompareConst.KWARGS_STRUCT] = [] - op_dict[CompareConst.OUTPUT_STRUCT] = [] - op_dict[CompareConst.PARAMS_STRUCT] = [] - op_dict[CompareConst.PARAMS_GRAD_STRUCT] = [] - op_dict[Const.SUMMARY] = [] - op_dict["stack_info"] = [] - - if dump_mode == Const.ALL: - op_dict["data_name"] = [] - - for tensor in tensor_list: - # A dict(len=2) with 'full_op_name' and 'full_info' is added to the tensor only if self.stack_mode is True - if len(tensor) == 2: - op_dict['stack_info'].append(tensor['full_info']) - break - - op_dict["op_name"].append(tensor['full_op_name']) - - _, state = get_name_and_state(tensor['full_op_name']) - struct_key = CompareConst.STATE_TO_STRUCT_MAPPING.get(state) - if not struct_key: - continue - if dump_mode == Const.MD5: - op_dict.get(struct_key).append((tensor[Const.DTYPE], tensor[Const.SHAPE], tensor[Const.MD5])) - else: - op_dict.get(struct_key).append((tensor[Const.DTYPE], tensor[Const.SHAPE])) - op_dict[Const.SUMMARY].append([tensor[Const.MAX], tensor[Const.MIN], tensor[Const.MEAN], tensor[Const.NORM]]) - + header.append(CompareConst.DATA_NAME) + else: if dump_mode == Const.ALL: - op_dict["data_name"].append(tensor['data_name']) - - if not op_dict[CompareConst.KWARGS_STRUCT]: - del op_dict[CompareConst.KWARGS_STRUCT] - return op_dict if op_dict["op_name"] else {} - - -def print_compare_ends_info(): - total_len = len(CompareConst.COMPARE_ENDS_SUCCESSFULLY) + Const.FILL_CHAR_NUMS - logger.info('*' * 
total_len) - logger.info(f"*{CompareConst.COMPARE_ENDS_SUCCESSFULLY.center(total_len - 2)}*") - logger.info('*' * total_len) - - -def table_value_is_valid(value: str) -> bool: - if not isinstance(value, str): - return True - try: - # -1.00 or +1.00 should be considered as digit numbers - float(value) - except ValueError: - # otherwise, they will be considered as formular injections - return not bool(re.compile(FileCheckConst.CSV_BLACK_LIST).search(value)) - return True - - -def get_name_and_state(name): - """ - Get api/module name and state - example: - name = 'conv2d.forward.1.input.0' - return: ('conv2d.forward.1.', 'input') - - name = 'Functional.pad.0.backward.output.0' - return: ('Functional.pad.0.backward.', 'output') - - state type: input, output, kwargs, parameters, parameters_grad - """ - if not isinstance(name, str): - logger.error(f'Invalid name: {name}, type should be string, please check.') - raise CompareException(CompareException.INVALID_API_NAME_ERROR) - - if Const.PARAMS_GRAD in name.split(Const.SEP): - return name.split(Const.PARAMS_GRAD)[0], Const.PARAMS_GRAD - - split = re.split(Const.REGEX_FORWARD_BACKWARD, name) - if len(split) < 3: - logger.error(f'Invalid name string: {name}, can not be split by forward/backward, please check.') - raise CompareException(CompareException.INVALID_API_NAME_ERROR) - api = f'{split[0]}.{split[1]}.' 
- state_str = split[2] - match = re.match(r'^(\d+\.)?(input|output|kwargs|parameters)\..+$', state_str) - if not match: - raise CompareException(f'Invalid name string: {name}') - if match.group(1): - api = f'{api}{match.group(1)}' - state = match.group(2) - return api, state - - -def reorder_op_name_list(op_name_list): - if not op_name_list: - return op_name_list - - parameters = [] - output = [] - parameters_grad = [] - others = [] - for x in op_name_list: - state = get_name_and_state(x)[1] - if state == Const.PARAMS: - parameters.append(x) - elif state == Const.OUTPUT: - output.append(x) - elif state == Const.PARAMS_GRAD: - parameters_grad.append(x) + for row in result: + del row[-2] # 输出结果不要堆栈信息时,删除中间结果result中的stack info,真实数据时为倒数第2列 + header.append(CompareConst.DATA_NAME) else: - others.append(x) - # 合并others, parameters, 和output,确保parameters排在output前面 - op_name_reorder = others + parameters + output + parameters_grad - return op_name_reorder - - -def reorder_op_x_list(op_name_list, summary_list, data_name_list): - """对op_name, summary, data_name重新排序,把parameters放到input后output前,data_name由于统计量比对时,为None,单独处理""" - if not op_name_list or not summary_list: - return op_name_list, summary_list, data_name_list - - index_map = {name: index for index, name in enumerate(op_name_list)} - - op_name_reorder = reorder_op_name_list(op_name_list) - summary_reorder = [summary_list[index_map.get(name)] for name in op_name_reorder] - if data_name_list: - data_name_reorder = [data_name_list[index_map.get(name)] for name in op_name_reorder] - else: - data_name_reorder = data_name_list - - return op_name_reorder, summary_reorder, data_name_reorder + for row in result: + del row[-1] # 输出结果不要堆栈信息时,删除中间结果result中的stack info,非真实数据时为倒数第1列 + result_df = pd.DataFrame(result, columns=header, dtype='object') + return result_df def _compare_parser(parser): diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/__init__.py b/debug/accuracy_tools/msprobe/core/config_check/__init__.py 
similarity index 85% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/__init__.py rename to debug/accuracy_tools/msprobe/core/config_check/__init__.py index 7d60f07881d378bb0a7a9c6faf6147af07a915b2..621122ffa00ba40a868853ccb46ff582c3e5fdda 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/__init__.py +++ b/debug/accuracy_tools/msprobe/core/config_check/__init__.py @@ -13,4 +13,5 @@ # See the License for the specific language governing permissions and # limitations under the License. -import msprobe.pytorch.config_checking.checkers +import msprobe.core.config_check.checkers +from msprobe.core.config_check.config_checker import ConfigChecker diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/__init__.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/__init__.py similarity index 55% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/__init__.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/__init__.py index c1bed99b3a091f7163431e4cc3e6c02cad69530d..9b9024b862f1f60655d2f71a47ab401546a86076 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/__init__.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/__init__.py @@ -15,13 +15,11 @@ __all__ = ['BaseChecker', 'apply_patches'] -import msprobe.pytorch.config_checking.checkers.env_args_checker -import msprobe.pytorch.config_checking.checkers.pip_checker -import msprobe.pytorch.config_checking.checkers.dataset_checker -import msprobe.pytorch.config_checking.checkers.weights_checker -import msprobe.pytorch.config_checking.checkers.hyperparameter_checker -import msprobe.pytorch.config_checking.checkers.random_checker +import msprobe.core.config_check.checkers.env_args_checker +import msprobe.core.config_check.checkers.pip_checker +import msprobe.core.config_check.checkers.dataset_checker +import msprobe.core.config_check.checkers.weights_checker +import 
msprobe.core.config_check.checkers.hyperparameter_checker +import msprobe.core.config_check.checkers.random_checker -from msprobe.pytorch.config_checking.checkers.random_checker import apply_patches - -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker +from msprobe.core.config_check.checkers.base_checker import BaseChecker diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/base_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/base_checker.py similarity index 77% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/base_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/base_checker.py index 26a7275aaf0a724b0d30053cd33cee761adcec8d..7f17e7c14c63eb767ae5819098499b2a2ee202c5 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/base_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/base_checker.py @@ -14,10 +14,8 @@ # limitations under the License. 
import os -from abc import ABC, abstractmethod - -import torch +from msprobe.core.common.framework_adapter import FmkAdp from msprobe.core.common.const import FileCheckConst @@ -30,31 +28,33 @@ class PackInput: self.check_input_params() def check_input_params(self): - if self.model and not isinstance(self.model, torch.nn.Module): - raise Exception(f"model is not torch.nn.Module or module list.") + if self.model and not FmkAdp.is_nn_module(self.model): + raise Exception(f"model is not torch.nn.Module/mindspore.nn.Cell or module list.") if not isinstance(self.output_zip_path, str) or not self.output_zip_path.endswith(FileCheckConst.ZIP_SUFFIX): raise Exception(f"output zip path must be a string and ends with '.zip'") -class BaseChecker(ABC): +class BaseChecker: input_needed = None target_name_in_zip = None multi_rank = False @staticmethod - @abstractmethod def pack(pack_input): pass @staticmethod - @abstractmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, cmp_dir, output_path, fmk): + pass + + @staticmethod + def apply_patches(fmk): pass @classmethod - def compare_ex(cls, bench_dir, cmp_dir, output_path): + def compare_ex(cls, bench_dir, cmp_dir, output_path, fmk): bench_filepath = os.path.join(bench_dir, cls.target_name_in_zip) cmp_filepath = os.path.join(cmp_dir, cls.target_name_in_zip) if not os.path.exists(bench_filepath) or not os.path.exists(cmp_filepath): return None, None, None - return cls.compare(bench_dir, cmp_dir, output_path) + return cls.compare(bench_dir, cmp_dir, output_path, fmk) diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/dataset_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/dataset_checker.py similarity index 77% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/dataset_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/dataset_checker.py index 
89217ac18ba4fb2fffe0eaaf325a634ee5d32eb3..96ff4809f81b8db20bc5bb26ecbf1d2e8f6e874b 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/dataset_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/dataset_checker.py @@ -15,29 +15,19 @@ import os import json -import torch import pandas as pd from msprobe.core.common.file_utils import create_file_in_zip, load_json -from msprobe.pytorch.common.utils import get_rank_id -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.config_checker import register_checker_item, register_pre_forward_fun_list -from msprobe.pytorch.config_checking.utils.utils import config_checking_print +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.config_checker import register_checker_item, register_pre_forward_fun_list +from msprobe.core.config_check.utils.utils import config_checking_print, get_tensor_features from msprobe.core.common.decorator import recursion_depth_decorator - - -def process_tensor(tensor): - return { - 'max': float(tensor.max().item()), - 'min': float(tensor.min().item()), - 'mean': float(tensor.mean().item()), - 'norm': float(torch.norm(tensor).item()) - } +from msprobe.core.common.framework_adapter import FmkAdp @recursion_depth_decorator("config_check: process_obj") def process_obj(obj): - if isinstance(obj, torch.Tensor): - return process_tensor(obj) + if FmkAdp.is_tensor(obj): + return get_tensor_features(obj) elif isinstance(obj, (tuple, list)): return {i: process_obj(x) for i, x in enumerate(obj)} elif isinstance(obj, dict): @@ -59,24 +49,34 @@ def parse_args_and_kargs(args, kwargs): @recursion_depth_decorator("config_check: compare_dataset_dicts") def compare_dataset_dicts(dict1, dict2, tag=''): results = [] - for key, value1 in dict1.items(): + # 处理 dict1 中的键 + for key in dict1: new_tag = f"{tag}.{key}" if tag else key + if key not in dict2: + result = 
{'tag': new_tag, 'equal': False, 'status': 'delete'} + results.append(result) + continue + value1 = dict1[key] value2 = dict2[key] - # 若为包含四个指定键的字典,不再递归 if not isinstance(value1, dict): continue if set(value1.keys()) == {'max', 'min', 'mean', 'norm'}: equal = value1 == value2 relative_diffs = { - f"{k}_relative_diff": (abs(value1[k] - value2[k]) / value1[k]) \ - if value1[k] != 0 else None \ + f"{k}_relative_diff": (abs(value1[k] - value2[k]) / value1[k]) if value1[k] != 0 else None for k in ['max', 'min', 'mean', 'norm'] } - result = {'tag': new_tag, 'equal': equal} + result = {'tag': new_tag, 'equal': equal, 'status': 'unchanged'} result.update(relative_diffs) results.append(result) else: results.extend(compare_dataset_dicts(value1, value2, new_tag)) + # 处理 dict2 中独有的键 + for key in dict2: + if key not in dict1: + new_tag = f"{tag}.{key}" if tag else key + result = {'tag': new_tag, 'equal': False, 'status': 'added'} + results.append(result) return results @@ -97,8 +97,8 @@ def compare_dataset(bench_dir, cmp_dir): dict2 = load_json(rank_path_cmp) results = compare_dataset_dicts(dict1, dict2) for result in results: - result['step'] = step - result['rank'] = rank + result['step'] = int(step.replace("step", "")) + result['rank'] = int(rank.replace("rank", "")) all_results.extend(results) df = pd.DataFrame(all_results, columns=DatasetChecker.result_header) @@ -122,14 +122,14 @@ class DatasetChecker(BaseChecker): def collect_input(model, args, kwargs, step): features = parse_args_and_kargs(args, kwargs) dataset_filepath = os.path.join(DatasetChecker.target_name_in_zip, - f"step{step}", f"rank{get_rank_id()}", "dataset.json") + f"step{step}", f"rank{FmkAdp.get_rank_id()}", "dataset.json") create_file_in_zip(output_zip_path, dataset_filepath, json.dumps(features, indent=4)) config_checking_print(f"add first dataset input features to zip") register_pre_forward_fun_list(collect_input) @staticmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, 
cmp_dir, output_path, fmk): bench_dataset_pack_path = os.path.join(bench_dir, DatasetChecker.target_name_in_zip) cmp_dataset_pack_path = os.path.join(cmp_dir, DatasetChecker.target_name_in_zip) diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/env_args_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/env_args_checker.py similarity index 57% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/env_args_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/env_args_checker.py index 9eaeb1a05729f7e9ae4ff8c9727d8f3589b3a166..d4f72a6b26850322aa5c7685745cfe5b54bdb8a1 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/env_args_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/env_args_checker.py @@ -19,10 +19,10 @@ import json import pandas as pd from msprobe.core.common.file_utils import load_json, load_yaml, create_file_with_content, create_file_in_zip -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.config_checker import register_checker_item -from msprobe.pytorch.config_checking.utils.utils import config_checking_print -from msprobe.core.common.file_utils import save_excel +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.config_checker import register_checker_item +from msprobe.core.config_check.utils.utils import config_checking_print +from msprobe.core.common.const import Const dirpath = os.path.dirname(__file__) @@ -36,21 +36,40 @@ def collect_env_data(): return result +def get_device_type(env_json): + for key in env_json.keys(): + if Const.ASCEND in key: + return Const.NPU_LOWERCASE + return Const.GPU_LOWERCASE + + def compare_env_data(npu_path, bench_path): necessary_env = load_yaml(env_yaml_path) - npu_data = load_json(npu_path) + cmp_data = load_json(npu_path) + cmp_type = get_device_type(cmp_data) 
bench_data = load_json(bench_path) + bench_type = get_device_type(bench_data) data = [] for _, value in necessary_env.items(): - npu_env_name = value[0]["name"] - npu_value = npu_data.get(npu_env_name) if npu_data.get(npu_env_name) else value[0]["default_value"] - if len(value) == 1: - data.append([npu_env_name, "only npu has this env", npu_value, "", "warning"]) + cmp_env = value.get(cmp_type) + bench_env = value.get(bench_type) + if not bench_env and not cmp_env: continue - bench_env_name = value[1]["name"] - bench_value = bench_data.get(bench_env_name) if bench_data.get(bench_env_name) else value[1]["default_value"] - if npu_value != bench_value: - data.append([npu_env_name, bench_env_name, npu_value, bench_value, "error"]) + elif cmp_env: + cmp_env_name = cmp_env["name"] + cmp_value = cmp_data.get(cmp_env_name, value[cmp_type]["default_value"]) + if not bench_env: + data.append(["only cmp has this env", cmp_env["name"], "", cmp_value, "warning"]) + continue + bench_env_name = bench_env["name"] + bench_value = bench_data.get(bench_env_name, value[bench_type]["default_value"]) + if cmp_value != bench_value: + data.append([bench_env_name, cmp_env_name, bench_value, cmp_value, "error"]) + else: + bench_env_name = bench_env["name"] + bench_value = bench_data.get(bench_env_name) if bench_data.get(bench_env_name) else value[bench_type][ + "default_value"] + data.append([bench_env_name, "only bench has this env", bench_value, "", "warning"]) df = pd.DataFrame(data, columns=EnvArgsChecker.result_header) return df @@ -69,7 +88,7 @@ class EnvArgsChecker(BaseChecker): config_checking_print(f"add env args to zip") @staticmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, cmp_dir, output_path, fmk): bench_env_data = os.path.join(bench_dir, EnvArgsChecker.target_name_in_zip) cmp_env_data = os.path.join(cmp_dir, EnvArgsChecker.target_name_in_zip) df = compare_env_data(bench_env_data, cmp_env_data) diff --git 
a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/hyperparameter_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/hyperparameter_checker.py similarity index 30% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/hyperparameter_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/hyperparameter_checker.py index 9ac1cd61fc5483c1a002bf0109d56a341aeab120..774abef4877786268bf700bbb695586800ef64d0 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/hyperparameter_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/hyperparameter_checker.py @@ -15,190 +15,144 @@ import os import json -import re -import tempfile from difflib import SequenceMatcher from typing import Union, List, Dict, Any +import pandas as pd -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.config_checker import register_checker_item -from msprobe.pytorch.config_checking.utils.utils import compare_dict, config_checking_print +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.config_checker import register_checker_item +from msprobe.core.config_check.utils.utils import compare_dict, config_checking_print, update_dict +from msprobe.core.config_check.utils.hyperparameter_parser import ParserFactory from msprobe.core.common.file_utils import (os_walk_for_files, create_file_in_zip, load_json, create_file_with_list, - FileOpen) -from msprobe.core.common.const import FileCheckConst, Const + FileOpen, load_yaml) +from msprobe.core.common.const import Const + + +dirpath = os.path.dirname(__file__) +hyperparameters_path = os.path.join(dirpath, "../resource/hyperparameter.yaml") +parameter_name_mapping = load_yaml(os.path.realpath(hyperparameters_path)) +hyperparameters_dict = {} @register_checker_item("hyperparameter") class HyperparameterChecker(BaseChecker): - 
input_needed = "shell_path" target_name_in_zip = "hyperparameters" - - PARAMETER_NAME_MAPPING = { - "learning_rate": ["lr", "learningrate"], - "batch_size": ["batch", "bs", "batch_size_per_gpu"], - "epochs": ["num_epochs", "max_epochs", "epoch"], - "weight_decay": ["wd", "weightdecay"], - "dropout_rate": ["dropout", "drop_rate"], - } + result_header = ["file_name", "bench_para", "cmp_para", "bench_value", "cmp_value", "matched_with", "level"] + hyperparameters_file_list = ["hyperparameters_static.json", "hyperparameters_dynamic.json"] @staticmethod def pack(pack_input): shell_path = pack_input.shell_path output_zip_path = pack_input.output_zip_path - if not isinstance(shell_path, list): - raise TypeError("shell_path should be a list of file paths.") - - for index, script_path in enumerate(shell_path): - if os.path.isfile(script_path): - hyperparameters = HyperparameterChecker._extract_hyperparameters_from_script(script_path) - if hyperparameters: - create_file_in_zip(output_zip_path, os.path.join(HyperparameterChecker.target_name_in_zip, - HyperparameterChecker.target_name_in_zip + - Const.REPLACEMENT_CHARACTER + str(index) - + FileCheckConst.JSON_SUFFIX), - json.dumps(hyperparameters, indent=4)) - config_checking_print(f"add hyperparameters args to zip") + if shell_path: + if not isinstance(shell_path, list): + raise TypeError("shell_path should be a list of file paths.") + + hyperparameters = {} + parser_factory = ParserFactory() + for script_path in shell_path: + if os.path.isfile(script_path): + parser = parser_factory.get_parser(os.path.splitext(script_path)[1]) + update_dict(hyperparameters, parser.run(os.path.realpath(script_path))) else: - config_checking_print(f"Warning: Failed to extract hyperparameters from script {script_path}") + config_checking_print(f"Warning: Script path {script_path} is not a file.") + if hyperparameters: + create_file_in_zip(output_zip_path, + os.path.join(HyperparameterChecker.target_name_in_zip, + 
HyperparameterChecker.hyperparameters_file_list[0]), + json.dumps(hyperparameters, indent=4)) + config_checking_print(f"add static hyperparameters args to zip") else: - config_checking_print(f"Warning: Script path {script_path} is not a file.") + config_checking_print(f"Warning: Failed to extract hyperparameters from script {shell_path}") + if hyperparameters_dict: + create_file_in_zip(output_zip_path, + os.path.join(HyperparameterChecker.target_name_in_zip, + HyperparameterChecker.hyperparameters_file_list[1]), + json.dumps(vars(hyperparameters_dict), default=lambda x: None, indent=4)) + config_checking_print(f"add dynamic hyperparameters args to zip") @staticmethod - def compare(bench_dir, cmp_dir, output_path): - bench_model_dir = os.path.join(bench_dir, HyperparameterChecker.target_name_in_zip) - cmp_model_dir = os.path.join(cmp_dir, HyperparameterChecker.target_name_in_zip) - bench_hyperparameters = HyperparameterChecker.load_hyperparameters(bench_model_dir) - cmp_hyperparameters = HyperparameterChecker.load_hyperparameters(cmp_model_dir) - - if len(bench_hyperparameters) != len(cmp_hyperparameters): - config_checking_print("The shell path length dose not match!") - raise Exception("The shell path length dose not match!") - + def compare(bench_dir, cmp_dir, output_path, fmk): all_diffs = [] - all_files = set(bench_hyperparameters.keys()) | set(cmp_hyperparameters.keys()) - - for filename in all_files: - bench_params = bench_hyperparameters.get(filename, {}) - cmp_params = cmp_hyperparameters.get(filename, {}) - - if bench_params and cmp_params: - all_diffs.extend(HyperparameterChecker.compare_param(bench_params, cmp_params, filename)) - - elif bench_params is not None: - all_diffs.append(f"[Only in benchmark] File: {filename}") - else: - all_diffs.append(f"[Only in compare] File: {filename}") - return HyperparameterChecker.target_name_in_zip, True, None + for file_name in HyperparameterChecker.hyperparameters_file_list: + bench_model_dir = 
os.path.join(bench_dir, HyperparameterChecker.target_name_in_zip, file_name) + cmp_model_dir = os.path.join(cmp_dir, HyperparameterChecker.target_name_in_zip, file_name) + if os.path.isfile(bench_model_dir) and os.path.isfile(cmp_model_dir): + bench_hyperparameters = load_json(bench_model_dir) + cmp_hyperparameters = load_json(cmp_model_dir) + all_diffs.extend( + HyperparameterChecker.compare_param(bench_hyperparameters, cmp_hyperparameters, file_name)) + df = pd.DataFrame(all_diffs, columns=HyperparameterChecker.result_header) + pass_check = "error" not in df["level"].values + return HyperparameterChecker.target_name_in_zip, pass_check, df @staticmethod - def compare_param(bench_params, cmp_params, filename): + def compare_param(bench_params, cmp_params, file_name): all_diffs = [] - file_diffs = [] bench_param_names = bench_params.keys() for bench_param_name in bench_param_names: - matched_cmp_param_name = HyperparameterChecker._fuzzy_match_parameter(bench_param_name, cmp_params) + matched_cmp_param_name, matched_with = HyperparameterChecker._fuzzy_match_parameter(bench_param_name, + cmp_params) + bench_param_value = bench_params[bench_param_name] if matched_cmp_param_name: - bench_param_value = bench_params[bench_param_name] cmp_param_value = cmp_params[matched_cmp_param_name] if bench_param_value != cmp_param_value: - diff = compare_dict({bench_param_name: bench_param_value}, - {matched_cmp_param_name: cmp_param_value}) - if diff: - file_diffs.extend( - [f" Parameter '{bench_param_name}' (matched with '{matched_cmp_param_name}'): {d}" - for d in diff]) + all_diffs.append( + [file_name, bench_param_name, matched_cmp_param_name, bench_param_value, cmp_param_value, + matched_with, "error"]) del cmp_params[matched_cmp_param_name] else: - file_diffs.append( - f" [Only in benchmark] Parameter: '{bench_param_name}': {bench_params[bench_param_name]}") + all_diffs.append( + [file_name, bench_param_name, "Only in benchmark", bench_param_value, "", "", "warning"]) for 
cmp_param_name, cmp_param_value in cmp_params.items(): - file_diffs.append(f" [Only in compare] Parameter: '{cmp_param_name}': {cmp_param_value}") - if file_diffs: - file_diffs.sort() - all_diffs.append(f"File: {filename}") - all_diffs.extend(file_diffs) + all_diffs.append([file_name, "Only in comparison", cmp_param_name, "", cmp_param_value, "", "warning"]) + all_diffs.sort() return all_diffs @staticmethod - def load_hyperparameters(model_dir): - hyperparameters = {} - if not os.path.exists(model_dir): - return hyperparameters - subfiles = os_walk_for_files(model_dir, Const.MAX_TRAVERSAL_DEPTH) - for files in subfiles: - if files["file"].endswith(FileCheckConst.JSON_SUFFIX): - filepath = os.path.join(files["root"], files["file"]) - relative_filepath = os.path.relpath(filepath, model_dir) - params = load_json(filepath) - if params: - hyperparameters[relative_filepath] = params - return hyperparameters - - @staticmethod - def _extract_hyperparameters_from_script(script_path: str) -> Dict[str, Any]: - """ - Extracts arguments from bash script used to run a model training. 
- """ - hyperparameters = {} - script_content_list = [] - with FileOpen(script_path, 'r') as file: - for line in file: - stripped_line = line.lstrip() - if not stripped_line.startswith('#'): - line = line.split('#')[0].rstrip() + '\n' - if line.strip(): - script_content_list.append(line) - script_content = ''.join(script_content_list) - - command_line = re.search(r'torchrun\s[^|]*|python -m torch.distributed.launch\s[^|]*', script_content, - re.DOTALL) - if command_line: - command_line = command_line.group() - - blocks = re.findall(r'([a-zA-Z0-9_]{1,20}_ARGS)="(.*?)"', script_content, re.DOTALL) - block_contents = {} - for block_name, block_content in blocks: - block_content = block_content.replace('\n', ' ') - block_contents[block_name] = block_content - command_line = command_line.replace(f"${block_name}", block_content) - - matches = re.findall(r'--([\w-]+)(?:\s+([^\s\\]+))?', command_line) - for match in matches: - key, value = match - args_key = re.match(r'\$\{?(\w+)}?', value) - if args_key: - env_vars = re.findall(rf'{args_key.group(1)}=\s*(.+)', script_content) - if env_vars: - value = env_vars[-1] - hyperparameters[key] = value if value else True - - return hyperparameters + def apply_patches(fmk): + try: + from megatron import training + + def collect_hyperparameter_wrapper(func): + def wrapper(*args, **kwargs): + global hyperparameters_dict + result = func(*args, **kwargs) + if not hyperparameters_dict: + hyperparameters_dict = result + return result + return wrapper + training.get_args = collect_hyperparameter_wrapper(training.get_args) + except ImportError: + config_checking_print("No megatron find.") + except Exception as e: + config_checking_print(f"Patch megatron method failed, detail:{str(e)}") @staticmethod - def _fuzzy_match_parameter(param_name: str, available_params: Dict[str, Any]) -> Union[str, None]: + def _fuzzy_match_parameter(param_name: str, available_params: Dict[str, Any]): """ Fuzzy matches a parameter name against available parameter 
names using predefined mappings and string similarity. """ if param_name in available_params: - return param_name + return param_name, Const.MATCH_MODE_NAME canonical_name = None - for standard_name, aliases in HyperparameterChecker.PARAMETER_NAME_MAPPING.items(): + for standard_name, aliases in parameter_name_mapping.items(): if param_name == standard_name or param_name in aliases: canonical_name = standard_name break if canonical_name: if canonical_name in available_params: - return canonical_name - for alias in HyperparameterChecker.PARAMETER_NAME_MAPPING[canonical_name]: + return canonical_name, Const.MATCH_MODE_MAPPING + for alias in parameter_name_mapping[canonical_name]: if alias in available_params: config_checking_print( f"Matched '{param_name}' to alias '{alias}' via canonical name '{canonical_name}'") - return alias + return alias, Const.MATCH_MODE_MAPPING best_match_name = None best_match_ratio = 0.8 @@ -211,6 +165,6 @@ class HyperparameterChecker(BaseChecker): if best_match_name: config_checking_print( f"Fuzzy matched parameter '{param_name}' to '{best_match_name}' (similarity: {best_match_ratio:.2f})") - return best_match_name + return best_match_name, f"{Const.MATCH_MODE_SIMILARITY}:{best_match_ratio}" - return None + return None, None diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/pip_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/pip_checker.py similarity index 86% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/pip_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/pip_checker.py index db2bbe1881a4cfe379215279b786861a85b07c33..ef3fb68e592e4c5fe4bc7e0b0bcc7175fe1b16aa 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/pip_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/pip_checker.py @@ -21,9 +21,9 @@ except ImportError: import importlib_metadata as metadata from msprobe.core.common.file_utils import 
load_yaml, create_file_in_zip -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.config_checker import register_checker_item -from msprobe.pytorch.config_checking.utils.utils import config_checking_print +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.config_checker import register_checker_item +from msprobe.core.config_check.utils.utils import config_checking_print from msprobe.core.common.file_utils import FileOpen, save_excel dirpath = os.path.dirname(__file__) @@ -48,8 +48,9 @@ def collect_pip_data(): return result -def compare_pip_data(bench_pip_path, cmp_pip_path): +def compare_pip_data(bench_pip_path, cmp_pip_path, fmk): necessary_dependency = load_yaml(depend_path)["dependency"] + necessary_dependency.append(fmk) bench_data = load_pip_txt(bench_pip_path) cmp_data = load_pip_txt(cmp_pip_path) data = [] @@ -80,9 +81,9 @@ class PipPackageChecker(BaseChecker): config_checking_print(f"add pip info to zip") @staticmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, cmp_dir, output_path, fmk): bench_pip_path = os.path.join(bench_dir, PipPackageChecker.target_name_in_zip) cmp_pip_path = os.path.join(cmp_dir, PipPackageChecker.target_name_in_zip) - df = compare_pip_data(bench_pip_path, cmp_pip_path) + df = compare_pip_data(bench_pip_path, cmp_pip_path, fmk) pass_check = "error" not in df['level'].values return PipPackageChecker.target_name_in_zip, pass_check, df diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/random_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/random_checker.py similarity index 72% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/random_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/random_checker.py index 883144d617d8ea33afbedf8d510f7181450fae6c..1d1d0a7e79feb63116dc40c139b74c9d5778a8f0 100644 
--- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/random_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/random_checker.py @@ -23,13 +23,13 @@ import json from collections import defaultdict import numpy as np -import torch import pandas as pd -from msprobe.pytorch.config_checking.config_checker import register_checker_item, register_pre_forward_fun_list -from msprobe.pytorch.common.utils import get_rank_id -from msprobe.core.common.file_utils import create_file_in_zip, load_json, save_excel -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.utils.utils import config_checking_print +from msprobe.core.config_check.config_checker import register_checker_item, register_pre_forward_fun_list +from msprobe.core.common.file_utils import create_file_in_zip, load_json +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.utils.utils import config_checking_print +from msprobe.core.common.framework_adapter import FmkAdp +from msprobe.core.common.const import Const random_log_dict = defaultdict(dict) @@ -106,25 +106,8 @@ def track_random_call(func: Callable, name: str): return wrapper -def apply_patches(): - random_patches = { - 'random': random.random, - 'randint': random.randint, - 'uniform': random.uniform, - 'choice': random.choice - } - for name, func in random_patches.items(): - setattr(random, name, track_random_call(func, f"random.{name}")) - - np_random_patches = { - 'rand': np.random.rand, - 'randint': np.random.randint, - 'choice': np.random.choice, - 'normal': np.random.normal - } - for name, func in np_random_patches.items(): - setattr(np.random, name, track_random_call(func, f"np.random.{name}")) - +def torch_patchs(): + import torch torch_patches = { 'rand': torch.rand, 'randint': torch.randint, @@ -136,7 +119,7 @@ def apply_patches(): } for name, func in torch_patches.items(): setattr(torch, name, 
track_random_call(func, f"torch.{name}")) - + tensor_patches = { 'exponential_': torch.Tensor.exponential_, 'geometric_': torch.Tensor.geometric_, @@ -145,9 +128,26 @@ def apply_patches(): } for name, func in tensor_patches.items(): setattr(torch.Tensor, name, track_random_call(func, f"torch.Tensor.{name}")) - +def mindspore_patchs(): + import mindspore + + mindspore_ops_patches = { + 'rand': mindspore.ops.uniform, + 'randint': mindspore.ops.randint, + 'randn': mindspore.ops.normal + } + for name, func in mindspore_ops_patches.items(): + setattr(mindspore.ops, name, track_random_call(func, f"mindspore.ops.{name}")) + + mindspore_patches = { + 'manual_seed': mindspore.set_seed + } + for name, func in mindspore_patches.items(): + setattr(mindspore, name, track_random_call(func, f"mindspore.{name}")) + + @register_checker_item("random") class RandomChecker(BaseChecker): input_needed = None @@ -164,7 +164,7 @@ class RandomChecker(BaseChecker): if RandomChecker.write_once: return - random_log_filepath = os.path.join(RandomChecker.target_name_in_zip, f"rank{get_rank_id()}.json") + random_log_filepath = os.path.join(RandomChecker.target_name_in_zip, f"rank{FmkAdp.get_rank_id()}.json") create_file_in_zip(output_zip_path, random_log_filepath, json.dumps(random_log_dict, indent=4)) config_checking_print(f"add first random_log input features to zip") RandomChecker.write_once = True @@ -172,11 +172,37 @@ class RandomChecker(BaseChecker): register_pre_forward_fun_list(collect_input) @staticmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, cmp_dir, output_path, fmk): bench_random_log_pack_path = os.path.join(bench_dir, RandomChecker.target_name_in_zip) cmp_random_log_pack_path = os.path.join(cmp_dir, RandomChecker.target_name_in_zip) df = compare_random(bench_random_log_pack_path, cmp_random_log_pack_path) pass_check = False not in df['equal'].values return RandomChecker.target_name_in_zip, pass_check, df - + + @staticmethod + def 
apply_patches(fmk=Const.PT_FRAMEWORK): + random_patches = { + 'random': random.random, + 'randint': random.randint, + 'uniform': random.uniform, + 'choice': random.choice + } + for name, func in random_patches.items(): + setattr(random, name, track_random_call(func, f"random.{name}")) + + np_random_patches = { + 'rand': np.random.rand, + 'randint': np.random.randint, + 'choice': np.random.choice, + 'normal': np.random.normal + } + for name, func in np_random_patches.items(): + setattr(np.random, name, track_random_call(func, f"np.random.{name}")) + + if fmk == Const.PT_FRAMEWORK: + torch_patchs() + elif fmk == Const.MS_FRAMEWORK: + mindspore_patchs() + else: + raise Exception(f"apply patches framework error, not in {FmkAdp.supported_fmk}") diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/weights_checker.py b/debug/accuracy_tools/msprobe/core/config_check/checkers/weights_checker.py similarity index 86% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/weights_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/checkers/weights_checker.py index 859df770cd3de36524aa6f071fd0b8391f3bb0bd..876e68ef029993918704003d6369ce06d2c84bd3 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/checkers/weights_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/checkers/weights_checker.py @@ -15,20 +15,19 @@ import os import json -import torch import pandas as pd -from msprobe.core.common.file_utils import create_file_in_zip, save_excel, os_walk_for_files, load_json -from msprobe.pytorch.common.utils import get_rank_id -from msprobe.pytorch.config_checking.checkers.base_checker import BaseChecker -from msprobe.pytorch.config_checking.config_checker import register_checker_item, register_pre_forward_fun_list -from msprobe.pytorch.config_checking.utils.utils import config_checking_print, get_tensor_features +from msprobe.core.common.file_utils import create_file_in_zip, os_walk_for_files, 
load_json +from msprobe.core.config_check.checkers.base_checker import BaseChecker +from msprobe.core.config_check.config_checker import register_checker_item, register_pre_forward_fun_list +from msprobe.core.config_check.utils.utils import config_checking_print, get_tensor_features +from msprobe.core.common.framework_adapter import FmkAdp def collect_weights_data(model): weights_data = {} - for name, param in model.named_parameters(): - if param.dtype == torch.bfloat16: + for name, param in FmkAdp.named_parameters(model): + if param.dtype != FmkAdp.dtype("float32"): param = param.float() weights_data[name] = get_tensor_features(param) return weights_data @@ -86,8 +85,8 @@ def compare_weight(bench_dir, cmp_dir): cmp_root = os.path.join(cmp_dir, relative_path) cmp_file = os.path.join(cmp_root, info["file"]) - step = relative_path.split(os.sep)[0].replace("step", "") - rank = relative_path.split(os.sep)[1].replace("rank", "") + step = int(relative_path.split(os.sep)[0].replace("step", "")) + rank = int(relative_path.split(os.sep)[1].replace("rank", "")) if not os.path.exists(cmp_file): bench_data = load_json(bench_file) @@ -131,13 +130,13 @@ class WeightsChecker(BaseChecker): def collect_weights(model, args, kwargs, step): weights_data_dict = collect_weights_data(model) weights_data_filepath = os.path.join(WeightsChecker.target_name_in_zip, - f"step{step}", f"rank{get_rank_id()}", "weight.json") + f"step{step}", f"rank{FmkAdp.get_rank_id()}", "weight.json") create_file_in_zip(output_zip_path, weights_data_filepath, json.dumps(weights_data_dict, indent=4)) config_checking_print(f"add weights info to zip") register_pre_forward_fun_list(collect_weights) @staticmethod - def compare(bench_dir, cmp_dir, output_path): + def compare(bench_dir, cmp_dir, output_path, fmk): bench_weight_pack_path = os.path.join(bench_dir, WeightsChecker.target_name_in_zip) cmp_weight_pack_path = os.path.join(cmp_dir, WeightsChecker.target_name_in_zip) df = compare_weight(bench_weight_pack_path, 
cmp_weight_pack_path) diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/config_checking.py b/debug/accuracy_tools/msprobe/core/config_check/config_check_cli.py similarity index 47% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/config_checking.py rename to debug/accuracy_tools/msprobe/core/config_check/config_check_cli.py index a8cc15ab6ee36907bdc7a061cd04359b4b83ebf8..1715d580ad857a534a9a7db8afd639d79f0d3e9d 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/config_checking.py +++ b/debug/accuracy_tools/msprobe/core/config_check/config_check_cli.py @@ -13,37 +13,31 @@ # See the License for the specific language governing permissions and # limitations under the License. -from msprobe.pytorch.common.log import logger -from msprobe.pytorch.config_checking.config_checker import ConfigChecker -from msprobe.pytorch.config_checking.ckpt_compare.compare_weight import compare_checkpoints +from msprobe.core.config_check.config_checker import ConfigChecker +from msprobe.core.common.log import logger -def pack(config_filepath): - ConfigChecker(config_filepath) +def pack(shell_path, output_path, framework): + ConfigChecker(shell_path=shell_path, output_zip_path=output_path, fmk=framework) -def compare(bench_zip_path, cmp_zip_path, outpath): - ConfigChecker.compare(bench_zip_path, cmp_zip_path, outpath) +def compare(bench_zip_path, cmp_zip_path, output_path, framework): + ConfigChecker.compare(bench_zip_path, cmp_zip_path, output_path, framework) def _config_checking_parser(parser): - parser.add_argument('-pack', '--pack', help='Pack a directory into a zip file') - parser.add_argument('-c', '--compare', nargs=2, help='Compare two zip files or ckpt dir') - parser.add_argument('-s', '--ckpt-sim', default=False, action='store_true', - help='Calculate the similarity of two ckpt') + parser.add_argument('-d', '--dump', nargs='*', help='Collect the train config into a zip file') + parser.add_argument('-c', '--compare', nargs=2, 
help='Compare two zip files') parser.add_argument('-o', '--output', help='output path, default is current directory') def _run_config_checking_command(args): - if args.pack: - pack(args.pack) + if args.dump is not None: + output_dirpath = args.output if args.output else "./config_check_pack.zip" + pack(args.dump, output_dirpath, args.framework) elif args.compare: - if args.ckpt_sim: - output_path = args.output if args.output else "./ckpt_compare_out.json" - compare_checkpoints(args.compare[0], args.compare[1], output_path) - else: - output_dirpath = args.output if args.output else "./config_check_result" - compare(args.compare[0], args.compare[1], output_dirpath) + output_dirpath = args.output if args.output else "./config_check_result" + compare(args.compare[0], args.compare[1], output_dirpath, args.framework) else: - logger.error("The param is not correct, you need to give '-pack' for pack or '-c' for compare.") - raise Exception("The param is not correct, you need to give '-pack' for pack or '-c' for compare.") + logger.error("The param is not correct, you need to give '-d' for dump or '-c' for compare.") + raise Exception("The param is not correct, you need to give '-d' for dump or '-c' for compare.") diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/config_checker.py b/debug/accuracy_tools/msprobe/core/config_check/config_checker.py similarity index 74% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/config_checker.py rename to debug/accuracy_tools/msprobe/core/config_check/config_checker.py index fa5d2ff3b036f0f150b19345ccad51e4b58e95f4..da930eabb4f4185b420c0aadb11c55fb1f1e0bd3 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/config_checker.py +++ b/debug/accuracy_tools/msprobe/core/config_check/config_checker.py @@ -16,15 +16,14 @@ import os import shutil -import torch -import torch.distributed as dist import pandas as pd from msprobe.core.common.file_utils import save_excel, split_zip_file_path, \ 
create_directory, extract_zip, make_dir -from msprobe.pytorch.config_checking.checkers.base_checker import PackInput -from msprobe.pytorch.config_checking.utils.utils import config_checking_print - +from msprobe.core.common.framework_adapter import FmkAdp +from msprobe.core.config_check.checkers.base_checker import PackInput +from msprobe.core.config_check.utils.utils import config_checking_print +from msprobe.core.common.const import Const class ConfigChecker: @@ -34,7 +33,8 @@ class ConfigChecker: result_header = ["filename", "pass_check"] step = 0 - def __init__(self, model=None, shell_path=None, output_zip_path="./config_check_pack.zip"): + def __init__(self, model=None, shell_path=None, output_zip_path="./config_check_pack.zip", fmk="pytorch"): + FmkAdp.set_fmk(fmk) self.pack_input = PackInput(output_zip_path, model, shell_path) file_path, file_name = split_zip_file_path(self.pack_input.output_zip_path) if not os.path.exists(file_path): @@ -45,13 +45,12 @@ class ConfigChecker: raise Exception("The output file path already exist!") self.pack() - @staticmethod - def compare(bench_zip_path, cmp_zip_path, outpath): - if os.path.exists(outpath): - shutil.rmtree(outpath) - bench_dir = os.path.join(outpath, "bench") - cmp_dir = os.path.join(outpath, "cmp") + def compare(bench_zip_path, cmp_zip_path, output_path, fmk=Const.PT_FRAMEWORK): + if os.path.exists(output_path): + shutil.rmtree(output_path) + bench_dir = os.path.join(output_path, "bench") + cmp_dir = os.path.join(output_path, "cmp") extract_zip(bench_zip_path, bench_dir) config_checking_print(f"extract zip file {bench_zip_path} to {bench_dir}") extract_zip(cmp_zip_path, cmp_dir) @@ -60,23 +59,23 @@ class ConfigChecker: result = [] summary_result = [] for checker in ConfigChecker.checkers.values(): - checker_name, pass_check, df = checker.compare_ex(bench_dir, cmp_dir, outpath) + checker_name, pass_check, df = checker.compare_ex(bench_dir, cmp_dir, output_path, fmk) if checker_name: 
summary_result.append([checker_name, pass_check]) if df is not None: result.append((df, checker_name)) summary_result_df = pd.DataFrame(summary_result, columns=ConfigChecker.result_header) result.insert(0, (summary_result_df, "summary")) - save_excel(os.path.join(outpath, ConfigChecker.result_filename), result) - config_checking_print(f"config checking result save to {os.path.realpath(outpath)}") + save_excel(os.path.join(output_path, ConfigChecker.result_filename), result) + config_checking_print(f"config checking result save to {os.path.realpath(output_path)}") + @staticmethod + def apply_patches(fmk=Const.PT_FRAMEWORK): + for checker in ConfigChecker.checkers.values(): + checker.apply_patches(fmk) def pack(self): config_checking_print(f"pack result zip path {os.path.realpath(self.pack_input.output_zip_path)}") - if dist.is_initialized() and dist.get_rank() == 0: - config_checking_print(f"pack result zip path {self.pack_input.output_zip_path}") - if os.path.exists(self.pack_input.output_zip_path): - os.remove(self.pack_input.output_zip_path) def hook(model, args, kwargs): for collect_func in self.pre_forward_fun_list: @@ -84,11 +83,11 @@ class ConfigChecker: ConfigChecker.step += 1 if self.pack_input.model: - self.pack_input.model.register_forward_pre_hook(hook, with_kwargs=True) + FmkAdp.register_forward_pre_hook(self.pack_input.model, hook, with_kwargs=True) for checker in ConfigChecker.checkers.values(): if checker.input_needed and not getattr(self.pack_input, checker.input_needed): continue - if dist.is_initialized() and dist.get_rank() != 0 and not checker.multi_rank: + if FmkAdp.is_initialized() and FmkAdp.get_rank() != 0 and not checker.multi_rank: continue checker.pack(self.pack_input) diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/resource/dependency.yaml b/debug/accuracy_tools/msprobe/core/config_check/resource/dependency.yaml similarity index 96% rename from 
debug/accuracy_tools/msprobe/pytorch/config_checking/resource/dependency.yaml rename to debug/accuracy_tools/msprobe/core/config_check/resource/dependency.yaml index f4f73a5fce97f20608a3c9bacb92e53f1747f092..02c0b565bf59b1b220f16ae17a47f5f4d5b13c1f 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/resource/dependency.yaml +++ b/debug/accuracy_tools/msprobe/core/config_check/resource/dependency.yaml @@ -19,6 +19,4 @@ dependency: - megatron - numpy - datasets - - torch - - torchversion - peft \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/resource/env.yaml b/debug/accuracy_tools/msprobe/core/config_check/resource/env.yaml similarity index 63% rename from debug/accuracy_tools/msprobe/pytorch/config_checking/resource/env.yaml rename to debug/accuracy_tools/msprobe/core/config_check/resource/env.yaml index 13ea0e39f89b4807b72a6322ddc865145d9fde9d..87d663b9d94976c24feb88b181b3ead98905eb5a 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/resource/env.yaml +++ b/debug/accuracy_tools/msprobe/core/config_check/resource/env.yaml @@ -14,25 +14,44 @@ # limitations under the License. 
HCCL_DETERMINISTIC: - - name: HCCL_DETERMINISTIC + npu: + name: HCCL_DETERMINISTIC + default_value: False + gpu: + name: NCCL_DETERMINISTIC default_value: False -HCCL_ALGO: - - name: HCCL_ALGO +HCCL_ALGO: + npu: + name: HCCL_ALGO + default_value: None + gpu: + name: NCCL_ALGO default_value: None HCCL_INTRA_ROCE_ENABLE: - - name: HCCL_INTRA_ROCE_ENABLE + npu: + name: HCCL_INTRA_ROCE_ENABLE default_value: 0 + HCCL_INTRA_PICE_ENABLE: - - name: HCCL_INTRA_PICE_ENABLE + npu: + name: HCCL_INTRA_PCIE_ENABLE default_value: 1 ASCEND_LAUNCH_BLOCKING: - - name: ASCEND_LAUNCH_BLOCKING - default_value: False + npu: + name: ASCEND_LAUNCH_BLOCKING + default_value: 0 + gpu: + name: CUDA_LAUNCH_BLOCKING + default_value: 0 -ASCEND_RT_VISIBLE_DEVICE: - - name: ASCEND_RT_VISIBLE_DEVICE +ASCEND_RT_VISIBLE_DEVICES: + npu: + name: ASCEND_RT_VISIBLE_DEVICES + default_value: None + gpu: + name: CUDA_VISIBLE_DEVICES default_value: None \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/core/config_check/resource/hyperparameter.yaml b/debug/accuracy_tools/msprobe/core/config_check/resource/hyperparameter.yaml new file mode 100644 index 0000000000000000000000000000000000000000..5cff815717fc5b668bdd5f99de1a18e0373760fe --- /dev/null +++ b/debug/accuracy_tools/msprobe/core/config_check/resource/hyperparameter.yaml @@ -0,0 +1,21 @@ +learning_rate: + - lr + - learningrate + +batch_size: + - batch + - bs + - batch_size_per_gpu + +epochs: + - num_epochs + - max_epochs + - epoch + +weight_decay: + - wd + - weightdecay + +dropout_rate: + - dropout + - drop_rate \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/core/config_check/utils/hyperparameter_parser.py b/debug/accuracy_tools/msprobe/core/config_check/utils/hyperparameter_parser.py new file mode 100644 index 0000000000000000000000000000000000000000..6cb540ee49951652b6094f80229da099cfc5afdf --- /dev/null +++ b/debug/accuracy_tools/msprobe/core/config_check/utils/hyperparameter_parser.py @@ -0,0 +1,115 @@ +#
Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import re +from abc import ABC, abstractmethod + +from msprobe.core.config_check.utils.utils import config_checking_print +from msprobe.core.common.file_utils import FileOpen, load_yaml +from msprobe.core.common.const import Const, FileCheckConst + + +class Parser(ABC): + @abstractmethod + def parse(self, file_path: str) -> dict: + pass + + def run(self, file_path: str) -> dict: + """ + Unified public entry point for all parsers. + :param file_path: path of the file to parse + :return: the parsed hyperparameters, or an empty dict if parsing fails + """ + try: + result = self.parse(file_path) + except Exception as exc: + config_checking_print(f"{self.__class__} parsing error, skip file path: {file_path}, error: {exc}") + result = {} + return result + + +class ShellParser(Parser): + def parse(self, file_path: str) -> dict: + """ + Extracts arguments from the bash script used to launch model training. + """ + hyperparameters = {} + script_content_list = [] + with FileOpen(file_path, 'r') as file: + for line in file: + stripped_line = line.lstrip() + if not stripped_line.startswith('#'): + line = line.split('#')[0].rstrip() + '\n' + if line.strip(): + script_content_list.append(line) + script_content = ''.join(script_content_list) + + command_line = re.search(r'msrun\s[^|]*|torchrun\s[^|]*|python\d?
-m torch.distributed.launch\s[^|]*', + script_content, + re.DOTALL) + if command_line: + command_line = command_line.group() + + blocks = re.findall(r'([a-zA-Z0-9_]{1,20}_ARGS)="(.*?)"', script_content, re.DOTALL) + block_contents = {} + for block_name, block_content in blocks: + block_content = block_content.replace('\n', ' ') + block_contents[block_name] = block_content + command_line = command_line.replace(f"${block_name}", block_content) + + matches = re.findall(r'--([\w-]+)(?:\s+([^\s\\]+))?', command_line) + for match in matches: + key, value = match + args_key = re.match(r'\$\{?(\w+)}?', value) + if args_key: + env_vars = re.findall(rf'{args_key.group(1)}=\s*(.+)', script_content) + if env_vars: + value = env_vars[-1] + hyperparameters[key] = value if value else True + + return hyperparameters + + +class YamlParser(Parser): + hyperparameters = {} + + def parse(self, file_path: str) -> dict: + ori_hyper = load_yaml(file_path) + self.recursive_parse_parameters(ori_hyper, "") + return self.hyperparameters + + def recursive_parse_parameters(self, parameters, prefix): + if isinstance(parameters, dict): + for key, value in parameters.items(): + new_prefix = prefix + Const.SEP + key if prefix else key + self.recursive_parse_parameters(value, new_prefix) + elif isinstance(parameters, list): + for value in parameters: + self.recursive_parse_parameters(value, prefix) + elif isinstance(parameters, (int, str, bool)): + self.hyperparameters.update({prefix: parameters}) + + +class ParserFactory: + __ParserDict = { + FileCheckConst.SHELL_SUFFIX: ShellParser(), + FileCheckConst.YAML_SUFFIX: YamlParser() + } + + def get_parser(self, file_type: str) -> Parser: + parser = self.__ParserDict.get(file_type, None) + if not parser: + raise ValueError(f'Invalid parser type: {file_type}') + return parser diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/utils/utils.py b/debug/accuracy_tools/msprobe/core/config_check/utils/utils.py similarity index 80% rename from 
debug/accuracy_tools/msprobe/pytorch/config_checking/utils/utils.py rename to debug/accuracy_tools/msprobe/core/config_check/utils/utils.py index 3f8cef378ef3479aa0892786e835a66861eb6637..8c3c329cf20e2b6fb890437b3ba9950f14cc8878 100644 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/utils/utils.py +++ b/debug/accuracy_tools/msprobe/core/config_check/utils/utils.py @@ -17,9 +17,8 @@ import os import re import hashlib -import torch - -from msprobe.pytorch.common.log import logger +from msprobe.core.common.framework_adapter import FmkAdp +from msprobe.core.common.log import logger def merge_keys(dir_0, dir_1): @@ -53,15 +52,13 @@ def tensor_to_hash(tensor): def get_tensor_features(tensor): features = { - "max": lambda x: torch.max(x).item(), - "min": lambda x: torch.min(x).item(), - "mean": lambda x: torch.mean(x).item(), - "norm": lambda x: torch.norm(x).item(), + "max": FmkAdp.tensor_max(tensor), + "min": FmkAdp.tensor_min(tensor), + "mean": FmkAdp.tensor_mean(tensor), + "norm": FmkAdp.tensor_norm(tensor), } - if not tensor.is_floating_point() or tensor.dtype == torch.float64: - tensor = tensor.float() - return {key: features.get(key)(tensor) for key in features} + return features def compare_dicts(dict1, dict2, path=''): @@ -97,3 +94,14 @@ def bytes_hash(obj: bytes): hex_dig = hashlib.sha256(obj).hexdigest() short_hash = int(hex_dig, 16) % (2 ** 16) return short_hash + + +def update_dict(ori_dict, new_dict): + for key, value in new_dict.items(): + if key in ori_dict and ori_dict[key] != value: + if isinstance(ori_dict[key], dict) and "values" in ori_dict[key]: + ori_dict[key]["values"].append(new_dict[key]) + else: + ori_dict[key] = {"description": "duplicate_value", "values": [ori_dict[key], new_dict[key]]} + else: + ori_dict[key] = value diff --git a/debug/accuracy_tools/msprobe/core/data_dump/api_registry.py b/debug/accuracy_tools/msprobe/core/data_dump/api_registry.py index 375ed7f7446bac31adf2c8041e0a6216f180a9da..9656830b0f67eb891eb5dea796fa6b05ea59b194 100644 ---
a/debug/accuracy_tools/msprobe/core/data_dump/api_registry.py +++ b/debug/accuracy_tools/msprobe/core/data_dump/api_registry.py @@ -13,10 +13,12 @@ # See the License for the specific language governing permissions and # limitations under the License. +import inspect from typing import Dict, Any, Optional, Callable, Union, List, Tuple from msprobe.core.common.const import Const from msprobe.core.common.file_utils import load_yaml +from msprobe.core.common.log import logger def _get_attr(module, attr_name): @@ -44,6 +46,38 @@ class ApiWrapper: self.api_names = self._get_api_names() self.wrapped_api_functions = dict() + @staticmethod + def deal_with_self_kwargs(api_name, api_func, args, kwargs): + if kwargs and 'self' in kwargs: + func_params = None + try: + func_params = inspect.signature(api_func).parameters + except Exception: + if api_name in Const.API_WITH_SELF_ARG: + func_params = inspect.signature(Const.API_WITH_SELF_ARG.get(api_name)).parameters + if func_params is None: + return False, args, kwargs + + for name, param in func_params.items(): + if name == 'self' and param.kind == inspect.Parameter.KEYWORD_ONLY: + return False, args, kwargs + args_ = list(args) + names_and_values = [] + self_index = 0 + for i, item in enumerate(func_params.items()): + names_and_values.append((item[0], item[1].default)) + if item[0] == 'self': + self_index = i + break + for i in range(len(args), self_index + 1): + if names_and_values[i][0] in kwargs: + args_.append(kwargs.pop(names_and_values[i][0])) + else: + args_.append(names_and_values[i][1]) + args = tuple(args_) + + return True, args, kwargs + def wrap_api( self, api_templates, hook_build_func: Optional[Callable] ): @@ -68,6 +102,14 @@ class ApiWrapper: if callable(ori_api): def wrap_api_func(api_name, api_func, prefix, hook_build_func, api_template): def api_function(*args, **kwargs): + api_name_with_prefix = prefix + Const.SEP + str(api_name.split(Const.SEP)[-1]) + enable_wrap, args, kwargs = 
self.deal_with_self_kwargs(api_name_with_prefix, + api_func, args, kwargs) + if not enable_wrap: + logger.warning(f'Cannot collect precision data of {api_name_with_prefix}. ' + 'It may be fixed by passing the value of "self" ' + 'as a positional argument instead of a keyword argument. ') + return api_func(*args, **kwargs) return api_template(api_name, api_func, prefix, hook_build_func)(*args, **kwargs) api_function.__name__ = api_name return api_function @@ -132,6 +174,18 @@ class ApiRegistry: else: setattr(api_group, api, api_attr) + @staticmethod + def register_custom_api(module, api_name, api_prefix, hook_build_func, api_template): + def wrap_api_func(api_name, api_func, prefix, hook_build_func, api_template): + def api_function(*args, **kwargs): + return api_template(api_name, api_func, prefix, hook_build_func)(*args, **kwargs) + + api_function.__name__ = api_name + return api_function + + setattr(module, api_name, + wrap_api_func(api_name, getattr(module, api_name), api_prefix, hook_build_func, api_template)) + def register_all_api(self): self.all_api_registered = True for framework, api_types in self.api_types.items(): diff --git a/debug/accuracy_tools/msprobe/core/data_dump/data_collector.py b/debug/accuracy_tools/msprobe/core/data_dump/data_collector.py index 622c441a27db9fa7a88eb23265c6176163d2aa92..01bebcabcfe69e1de49e6425e88696f7ac093eea 100644 --- a/debug/accuracy_tools/msprobe/core/data_dump/data_collector.py +++ b/debug/accuracy_tools/msprobe/core/data_dump/data_collector.py @@ -41,7 +41,7 @@ class DataCollector: self.backward_module_names = {} self.optimizer_status = "" self.optimizer_status_first_start = {Const.OPTIMIZER: True, Const.CLIP_GRAD: True} - atexit.register(self.write_json) + atexit.register(self.write_json_at_exit) @property def dump_data_dir(self): @@ -78,6 +78,11 @@ class DataCollector: def write_json(self): self.data_writer.write_json() + def write_json_at_exit(self): + if self.config.async_dump and self.config.task == Const.TENSOR: + 
self.data_processor.dump_async_data() + self.data_writer.write_json() + def update_data(self, name, data_info): msg = f"msprobe is collecting data on {name}." if self.config.task == Const.OVERFLOW_CHECK: @@ -89,6 +94,10 @@ class DataCollector: logger.debug(msg) self.data_writer.update_data(data_info) + def call_stack_collect(self, name): + stack_info = self.data_processor.analyze_api_call_stack(name) + self.data_writer.update_stack(name, stack_info) + def forward_input_data_collect(self, name, module, pid, module_input_output, is_recompute=None): if self.config.task == Const.FREE_BENCHMARK: backward_name = name.replace(Const.FORWARD, Const.BACKWARD) @@ -118,9 +127,16 @@ class DataCollector: self.set_is_recomputable(data_info, is_recompute) if self.config.level == Const.LEVEL_L2: return - self.data_writer.update_stack(self.data_processor.analyze_api_call_stack(name)) + self.call_stack_collect(name) self.handle_data(name, data_info, flush=self.data_processor.is_terminated) + def forward_data_collect_only_tensor(self, name, module, pid, module_input_output): + if not self.check_scope_and_pid(self.scope, name, pid): + return + + self.data_processor.analyze_forward(name, module, module_input_output) + + def forward_data_collect(self, name, module, pid, module_input_output, is_recompute=None): self.update_construct(name) if not self.check_scope_and_pid(self.scope, name, pid): @@ -130,9 +146,15 @@ class DataCollector: if self.config.task != Const.STRUCTURE: data_info = self.data_processor.analyze_forward(name, module, module_input_output) self.set_is_recomputable(data_info, is_recompute) - self.data_writer.update_stack(self.data_processor.analyze_api_call_stack(name)) + self.call_stack_collect(name) self.handle_data(name, data_info, flush=self.data_processor.is_terminated) + def backward_data_collect_only_tensor(self, name, module, pid, module_input_output, is_recompute=None): + if not self.check_scope_and_pid(self.scope, name, pid): + return + + 
self.data_processor.analyze_backward(name, module, module_input_output) + def backward_data_collect(self, name, module, pid, module_input_output, is_recompute=None): self.update_construct(name) if not self.check_scope_and_pid(self.scope, name, pid): @@ -180,7 +202,10 @@ class DataCollector: self.optimizer_status_first_start[self.optimizer_status] = False self.data_writer.update_construct({name: self.optimizer_status}) else: - self.data_writer.update_construct({name: self.module_processor.api_parent_node}) + if self.config.level == Const.LEVEL_MIX and \ + not (name.startswith(Const.MODULE) or name.startswith(Const.CELL)): + self.data_writer.update_construct({name: self.module_processor.api_parent_node}) + self.data_writer.update_construct(self.module_processor.module_node) def handle_data(self, name, data_info, flush=False): @@ -204,6 +229,7 @@ class DataCollector: def params_data_collect(self, name, param_name, pid, data): grad_name = name + Const.SEP + Const.PARAMS_GRAD + self.update_api_or_module_name(grad_name) # 校验scope和pid,以及当前name是否有过反向计算 if not self.check_scope_and_pid(self.scope, name, pid) and not self.backward_module_names.get(name): # 如果没有反向计算,则需要清除之前占位写入的grad数据 @@ -213,15 +239,19 @@ class DataCollector: data_info = self.data_processor.analyze_params(grad_name, param_name, data) self.handle_data(grad_name, data_info, flush=self.data_processor.is_terminated) + def debug_data_collect_forward(self, variable, name_with_count): data_info = self.data_processor.analyze_debug_forward(variable, name_with_count) - self.data_writer.update_debug({name_with_count: data_info}) + name_with_count_category = name_with_count + Const.SEP + Const.DEBUG + self.data_writer.update_debug({name_with_count_category: data_info}) def debug_data_collect_backward(self, variable, grad_name_with_count): # prepare all None nested data structure all_none_data_info = self.data_processor.analyze_element_to_all_none(variable) - self.data_writer.update_debug({grad_name_with_count: 
all_none_data_info}) + grad_name_with_count_category = grad_name_with_count + Const.SEP + Const.DEBUG + self.data_writer.update_debug({grad_name_with_count_category: all_none_data_info}) # register tensor backward hook - self.data_processor.analyze_debug_backward(variable, grad_name_with_count, self.data_writer.cache_debug['data']) + self.data_processor.analyze_debug_backward(variable, grad_name_with_count_category, + self.data_writer.cache_debug['data']) diff --git a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/base.py b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/base.py index 71d77eb0bbaefcad3f6ac1b1e7cb8ac15732eaa3..60257b14b2ec2a5958d771e36e10c349f79aaaac 100644 --- a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/base.py +++ b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/base.py @@ -13,17 +13,17 @@ # See the License for the specific language governing permissions and # limitations under the License. +import copy import inspect import os from dataclasses import dataclass, is_dataclass -from typing import Tuple, Dict, Optional, Any from functools import partial -import copy -from typing import Union +from typing import Tuple, Dict, Optional, Any, Union import numpy as np from msprobe.core.common.const import Const +from msprobe.core.common.file_utils import save_npy from msprobe.core.common.log import logger from msprobe.core.common.utils import convert_tuple, CompareException @@ -116,7 +116,10 @@ class BaseDataProcessor: @staticmethod def analyze_api_call_stack(name): try: - api_stack = inspect.stack()[5:] + if name.startswith("Primitive"): + api_stack = inspect.stack()[4:] + else: + api_stack = inspect.stack()[5:] except Exception as e: logger.warning(f"The call stack of <{name}> failed to retrieve, {e}.") api_stack = None @@ -125,12 +128,14 @@ class BaseDataProcessor: for (_, path, line, func, code, _) in api_stack: if not code: continue + if any(filter_path in path for filter_path in 
Const.STACK_FILTER_KEYWORDS) and \ + Const.CALL_STACK_FLAG not in path: + continue stack_line = f"File {path}, line {str(line)}, in {func}, \n {code[0].strip()}" stack_str.append(stack_line) else: stack_str.append(Const.WITHOUT_CALL_STACK) - stack_info_struct = {name: stack_str} - return stack_info_struct + return tuple(stack_str) @staticmethod def transfer_type(data): @@ -172,7 +177,7 @@ class BaseDataProcessor: else: raise ValueError("set_value_into_nested_structure failed: " "invalid data_structure type or invalid index") - + @staticmethod def is_distributed_op(module): return getattr(module, "op_is_distributed", False) @@ -418,6 +423,7 @@ class BaseDataProcessor: api_info_struct = {} self.save_name = name + Const.SEP + param_name data_info = self.analyze_element(grad) + self.save_name = None grad_info_dict = {param_name: [data_info]} api_info_struct[name] = grad_info_dict return api_info_struct @@ -426,10 +432,10 @@ class BaseDataProcessor: file_format = Const.PT_SUFFIX if self.config.framework == Const.PT_FRAMEWORK else Const.NUMPY_SUFFIX if self.save_name is not None: dump_data_name = (self.save_name + file_format) - self.save_name = None else: - dump_data_name = (self.current_api_or_module_name + Const.SEP + self.api_data_category + Const.SEP + - suffix + file_format) + suffix_with_seq = (Const.SEP + suffix) if suffix else "" + dump_data_name = (self.current_api_or_module_name + Const.SEP + self.api_data_category + suffix_with_seq + + file_format) file_path = os.path.join(self.data_writer.dump_tensor_data_dir, dump_data_name) return dump_data_name, file_path @@ -438,24 +444,32 @@ class BaseDataProcessor: def analyze_debug_forward(self, variable, name_with_count): self.current_api_or_module_name = name_with_count - self.api_data_category = Const.TENSOR - # these two attributes are used to construct tensor file name {name_with_count}.tensor.{indexes}.npy/pt + self.api_data_category = Const.DEBUG + # these two attributes are used to construct tensor file name 
{name_with_count}.debug.{indexes}.npy/pt data_info = self.analyze_element(variable) return data_info - def analyze_debug_backward(self, variable, grad_name_with_count, nested_data_structure): + def analyze_debug_backward(self, variable, grad_name_with_count_category, nested_data_structure): def hook_fn(grad, indexes): suffix = Const.SEP.join([str(index) for index in indexes]) - self.save_name = grad_name_with_count + Const.SEP + Const.TENSOR + Const.SEP + suffix + suffix_with_sep = (Const.SEP + suffix) if suffix else "" + self.save_name = grad_name_with_count_category + suffix_with_sep grad_data_info = self.analyze_element(grad) self.save_name = None - full_index = [grad_name_with_count] + indexes + full_index = [grad_name_with_count_category] + indexes try: self.set_value_into_nested_structure(nested_data_structure, full_index, grad_data_info) except (ValueError, IndexError) as e: - logger.warning(f"error occurred while recording statistics of {grad_name_with_count} variable, " + logger.warning(f"error occurred while recording statistics of {grad_name_with_count_category} variable," f"skip current recording, detailed information: {e}") return grad wrap_register_hook_single_element = partial(self.register_hook_single_element, hook_fn=hook_fn) self.recursive_apply_transform(variable, wrap_register_hook_single_element) + + def _analyze_and_save_ndarray(self, ndarray, suffix): + dump_data_name, file_path = self.get_save_file_path(suffix) + save_npy(ndarray, file_path) + ndarray_json = BaseDataProcessor._analyze_ndarray(ndarray, suffix) + ndarray_json.update({"data_name": dump_data_name}) + return ndarray_json diff --git a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/mindspore_processor.py b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/mindspore_processor.py index 4e61725e2a6260e397be260c422dceaf989aa988..1ba5378779079c450596ad571e1b6034f9b626eb 100644 --- a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/mindspore_processor.py 
+++ b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/mindspore_processor.py @@ -24,7 +24,7 @@ import numpy as np from msprobe.core.common.const import Const from msprobe.core.data_dump.data_processor.base import (BaseDataProcessor, TensorStatInfo, ModuleForwardInputsOutputs, ModuleBackwardInputsOutputs) -from msprobe.core.common.file_utils import path_len_exceeds_limit, save_npy +from msprobe.core.common.file_utils import path_len_exceeds_limit from msprobe.mindspore.common.utils import convert_bf16_to_fp32, save_tensor_as_npy from msprobe.mindspore.common.log import logger from msprobe.mindspore.dump.hook_cell.api_register import get_api_register @@ -62,8 +62,9 @@ class MindsporeDataProcessor(BaseDataProcessor): def get_stat_info_sync(data): tensor_stat = TensorStatInfo() if data.dtype == ms.bool_: - tensor_stat.max = mint.any(data) - tensor_stat.min = mint.all(data) + data_np = data.asnumpy() + tensor_stat.max = np.max(data_np).item() + tensor_stat.min = np.min(data_np).item() elif not data.shape: tensor_stat.max = tensor_stat.min = tensor_stat.mean = tensor_stat.norm = data elif data.dtype == ms.complex64 or data.dtype == ms.complex128: @@ -117,6 +118,11 @@ class MindsporeDataProcessor(BaseDataProcessor): def get_special_types(cls): return super().get_special_types() + cls.mindspore_special_type + def dump_async_data(self): + for file_path, tensor in self._async_dump_cache.items(): + save_tensor_as_npy(tensor, file_path) + self._async_dump_cache.clear() + def get_stat_info(self, data): self.api_register.restore_inner_used_api() tensor_stat = TensorStatInfo() @@ -187,20 +193,9 @@ class MindsporeDataProcessor(BaseDataProcessor): tensor_json.update({Const.MD5: tensor_md5}) return tensor_json - -class StatisticsDataProcessor(MindsporeDataProcessor): - pass - - -class TensorDataProcessor(MindsporeDataProcessor): - def dump_async_data(self): - for file_path, tensor in self._async_dump_cache.items(): - save_tensor_as_npy(tensor, file_path) - 
self._async_dump_cache.clear() - - def _analyze_tensor(self, tensor, suffix): + def _analyze_and_save_tensor(self, tensor, suffix): dump_data_name, file_path = self.get_save_file_path(suffix) - single_arg = super()._analyze_tensor(tensor, suffix) + single_arg = MindsporeDataProcessor._analyze_tensor(self, tensor, suffix) single_arg.update({"data_name": dump_data_name}) if self.config.async_dump: self._async_dump_cache[file_path] = tensor.copy() @@ -208,12 +203,27 @@ class TensorDataProcessor(MindsporeDataProcessor): save_tensor_as_npy(tensor, file_path) return single_arg + +class StatisticsDataProcessor(MindsporeDataProcessor): + def _analyze_tensor(self, tensor, suffix): + if any(item in self.current_api_or_module_name for item in self.config.tensor_list): + return self._analyze_and_save_tensor(tensor, suffix) + else: + return super()._analyze_tensor(tensor, suffix) + def _analyze_ndarray(self, ndarray, suffix): - dump_data_name, file_path = self.get_save_file_path(suffix) - save_npy(ndarray, file_path) - ndarray_json = super()._analyze_ndarray(ndarray, suffix) - ndarray_json.update({"data_name": dump_data_name}) - return ndarray_json + if any(item in self.current_api_or_module_name for item in self.config.tensor_list): + return self._analyze_and_save_ndarray(ndarray, suffix) + else: + return super()._analyze_ndarray(ndarray, suffix) + + +class TensorDataProcessor(MindsporeDataProcessor): + def _analyze_tensor(self, tensor, suffix): + return self._analyze_and_save_tensor(tensor, suffix) + + def _analyze_ndarray(self, ndarray, suffix): + return self._analyze_and_save_ndarray(ndarray, suffix) class OverflowCheckDataProcessor(MindsporeDataProcessor): @@ -260,7 +270,7 @@ class OverflowCheckDataProcessor(MindsporeDataProcessor): api_info_struct = super().analyze_backward(name, module, module_input_output) self.maybe_save_overflow_data() return api_info_struct if self.has_overflow else None - + def analyze_params(self, name, param_name, grad): self.has_overflow = False 
api_info_struct = super().analyze_params(name, param_name, grad) @@ -287,11 +297,17 @@ class OverflowCheckDataProcessor(MindsporeDataProcessor): if max_tensor is None or min_tensor is None: return - if mint.isinf(max_tensor) or mint.isnan(max_tensor): + def check_inf_nan(value): + # Use .item() if it's a tensor-like structure + if hasattr(value, "item"): + value = value.item() + return np.isinf(value) or np.isnan(value) + + if check_inf_nan(max_tensor): self.has_overflow = True return - if mint.isinf(min_tensor) or mint.isnan(min_tensor): + if check_inf_nan(min_tensor): self.has_overflow = True def _analyze_tensor(self, tensor, suffix): diff --git a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/pytorch_processor.py b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/pytorch_processor.py index 333d05b00f28eeb7162942350045ccea22f1374e..2e66fe29ee55de1c15de06f288fe2b45679e0c50 100644 --- a/debug/accuracy_tools/msprobe/core/data_dump/data_processor/pytorch_processor.py +++ b/debug/accuracy_tools/msprobe/core/data_dump/data_processor/pytorch_processor.py @@ -79,21 +79,22 @@ class PytorchDataProcessor(BaseDataProcessor): def analyze_device_in_kwargs(element): single_arg = {} single_arg.update({'type': "torch.device"}) - if not isinstance(element, str): + if isinstance(element, (int, str)): + single_arg.update({"value": element}) + elif isinstance(element, torch.device): if hasattr(element, "index"): device_value = element.type + ":" + str(element.index) else: device_value = element.type single_arg.update({"value": device_value}) else: - single_arg.update({"value": element}) + logger.debug(f"Device type {type(element)} is not supported.") return single_arg @staticmethod def analyze_dtype_in_kwargs(element): return {"type": "torch.dtype", "value": str(element)} - @staticmethod def get_stat_info_async(data): tensor_stat = TensorStatInfo() @@ -226,6 +227,11 @@ class PytorchDataProcessor(BaseDataProcessor): def get_special_types(cls): return 
super().get_special_types() + cls.pytorch_special_type + def dump_async_data(self): + for file_path, tensor in self._async_dump_cache.items(): + save_pt(tensor.contiguous(), file_path) + self._async_dump_cache.clear() + def analyze_single_element(self, element, suffix_stack): if suffix_stack and suffix_stack[-1] in self.torch_object_key: return self.torch_object_key[suffix_stack[-1]](element) @@ -286,20 +292,9 @@ class PytorchDataProcessor(BaseDataProcessor): tensor_json.update({Const.MD5: tensor_md5}) return tensor_json - -class StatisticsDataProcessor(PytorchDataProcessor): - pass - - -class TensorDataProcessor(PytorchDataProcessor): - def dump_async_data(self): - for file_path, tensor in self._async_dump_cache.items(): - save_pt(tensor.contiguous(), file_path) - self._async_dump_cache.clear() - - def _analyze_tensor(self, tensor, suffix): + def _analyze_and_save_tensor(self, tensor, suffix): dump_data_name, file_path = self.get_save_file_path(suffix) - single_arg = super()._analyze_tensor(tensor, suffix) + single_arg = PytorchDataProcessor._analyze_tensor(self, tensor, suffix) single_arg.update({"data_name": dump_data_name}) tensor, _ = self._cast_to_float_if_fp8(tensor) if self.config.async_dump: @@ -309,14 +304,36 @@ class TensorDataProcessor(PytorchDataProcessor): save_pt(saved_tensor, file_path) return single_arg - def _analyze_ndarray(self, ndarray, suffix): + def _analyze_and_save_ndarray(self, ndarray, suffix): dump_data_name, file_path = self.get_save_file_path(suffix) save_pt(torch.tensor(ndarray), file_path) - ndarray_json = super()._analyze_ndarray(ndarray, suffix) + ndarray_json = PytorchDataProcessor._analyze_ndarray(ndarray, suffix) ndarray_json.update({"data_name": dump_data_name}) return ndarray_json +class StatisticsDataProcessor(PytorchDataProcessor): + def _analyze_tensor(self, tensor, suffix): + if any(item in self.current_api_or_module_name for item in self.config.tensor_list): + return self._analyze_and_save_tensor(tensor, suffix) + else: + 
return super()._analyze_tensor(tensor, suffix) + + def _analyze_ndarray(self, ndarray, suffix): + if any(item in self.current_api_or_module_name for item in self.config.tensor_list): + return self._analyze_and_save_ndarray(ndarray, suffix) + else: + return super()._analyze_ndarray(ndarray, suffix) + + +class TensorDataProcessor(PytorchDataProcessor): + def _analyze_tensor(self, tensor, suffix): + return self._analyze_and_save_tensor(tensor, suffix) + + def _analyze_ndarray(self, ndarray, suffix): + return self._analyze_and_save_ndarray(ndarray, suffix) + + class OverflowCheckDataProcessor(PytorchDataProcessor): __slots__ = ["cached_tensors_and_file_paths"] diff --git a/debug/accuracy_tools/msprobe/core/data_dump/json_writer.py b/debug/accuracy_tools/msprobe/core/data_dump/json_writer.py index d519275d02218e98b872246febb4f74aac751b5a..ea42e7d6772125a37d792fbebbb2681b6189662c 100644 --- a/debug/accuracy_tools/msprobe/core/data_dump/json_writer.py +++ b/debug/accuracy_tools/msprobe/core/data_dump/json_writer.py @@ -36,6 +36,7 @@ class DataWriter: self.dump_tensor_data_dir = None self.debug_file_path = None self.flush_size = 1000 + self.larger_flush_size = 20000 self.cache_data = {} self.cache_stack = {} self.cache_construct = {} @@ -57,27 +58,27 @@ class DataWriter: if is_new_file: change_mode(file_path, FileCheckConst.DATA_FILE_AUTHORITY) - @recursion_depth_decorator("JsonWriter: DataWriter._replace_stat_placeholders", max_depth=Const.DUMP_MAX_DEPTH) + @recursion_depth_decorator("JsonWriter: DataWriter._replace_stat_placeholders") def _replace_stat_placeholders(self, data, stat_result): if isinstance(data, dict): keys = list(data.keys()) # 获取当前所有键 for key in keys: # 递归所有变量 value = data[key] if key == Const.TENSOR_STAT_INDEX and isinstance(value, int): - if value > 0: + if value >= 0: idx = value else: return stat_values = stat_result[idx] if idx < len(stat_result) else [None] * 4 - # 构建新字段并删除旧键 + new_entries = { - "type": data["type"], - "dtype": data["dtype"], - 
"shape": data["shape"], - "Max": stat_values[0], - "Min": stat_values[1], - "Mean": stat_values[2], - "Norm": stat_values[3] + Const.TYPE: data["type"], + Const.DTYPE: data["dtype"], + Const.SHAPE: data["shape"], + Const.MAX: stat_values[0], + Const.MIN: stat_values[1], + Const.MEAN: stat_values[2], + Const.NORM: stat_values[3], } del data[key] @@ -101,6 +102,7 @@ class DataWriter: self.cache_data = {} self.cache_stack = {} self.cache_construct = {} + self.cache_debug = {} def initialize_json_file(self, **kwargs): if self.debug_file_path and not self.cache_debug: @@ -129,8 +131,20 @@ class DataWriter: def flush_data_periodically(self): dump_data = self.cache_data.get(Const.DATA) - if dump_data and isinstance(dump_data, dict) and len(dump_data) % self.flush_size == 0: - self.write_json() + + if not dump_data or not isinstance(dump_data, dict): + return + + length = len(dump_data) + + # 小于大阈值时,使用小阈值落盘 + if length < self.larger_flush_size: + if length % self.flush_size == 0: + self.write_json() + # 大于等于大阈值时,使用大阈值落盘 + else: + if length % self.larger_flush_size == 0: + self.write_json() def update_data(self, new_data): with lock: @@ -148,9 +162,13 @@ class DataWriter: else: dump_data.update(new_data) - def update_stack(self, new_data): + def update_stack(self, name, stack_data): with lock: - self.cache_stack.update(new_data) + api_list = self.cache_stack.get(stack_data) + if api_list is None: + self.cache_stack.update({stack_data: [name]}) + else: + api_list.append(name) def update_construct(self, new_data): with lock: @@ -165,7 +183,11 @@ class DataWriter: save_json(file_path, self.cache_data, indent=1) def write_stack_info_json(self, file_path): - save_json(file_path, self.cache_stack, indent=1) + num, new_cache_stack = 0, {} + for key, value in self.cache_stack.items(): + new_cache_stack[num] = [value, key] + num += 1 + save_json(file_path, new_cache_stack, indent=1) def write_construct_info_json(self, file_path): save_json(file_path, self.cache_construct, indent=1) 
@@ -231,3 +253,4 @@ class DataWriter: self.write_construct_info_json(self.construct_file_path) if self.cache_debug: self.write_debug_info_json(self.debug_file_path) + diff --git a/debug/accuracy_tools/msprobe/docs/01.installation.md b/debug/accuracy_tools/msprobe/docs/01.installation.md index a40059b1c86552cc1234d22502d69cb9f12108b3..b5077228919c713c5e7910703678339c0b809326 100644 --- a/debug/accuracy_tools/msprobe/docs/01.installation.md +++ b/debug/accuracy_tools/msprobe/docs/01.installation.md @@ -16,7 +16,9 @@ pip install mindstudio-probe |版本|发布日期|支持 PyTorch 版本|支持 MindSpore 版本|下载链接|校验码| |:--:|:--:|:--:|:--:|:--:|:--:| -|1.2.2|2025.2.26|1.11/2.0/2.1/2.2|2.4.0|[mindstudio_probe-1.2.2-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.2/mindstudio_probe-1.2.2-py3-none-any.whl)|1db0cf4572bc0305c68705b74775f652c6cb2c2bedb6c6e57f43e31ab273b288| +|8.0.0|2025.5.07|1.11/2.0/2.1/2.2|2.4.0/2.5.0/2.6.0|[mindstudio_probe-8.0.0-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/8.0/mindstudio_probe-8.0.0-py3-none-any.whl)|6810eade7ae99e3b24657d5cab251119882decd791aa76a7aeeb94dea767daec| +|1.3.0|2025.4.17|1.11/2.0/2.1/2.2|2.4.0/2.5.0/2.6.0|[mindstudio_probe-1.3.0-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.3/mindstudio_probe-1.3.0-py3-none-any.whl)|85dbc5518b5c23d29c67d7b85d662517d0318352f372891f8d91e73e71b439c3| +|1.2.2|2025.3.03|1.11/2.0/2.1/2.2|2.4.0|[mindstudio_probe-1.2.2-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.2/mindstudio_probe-1.2.2-py3-none-any.whl)|961411bb460d327ea51d6ca4d0c8e8c5565f07c0852d7b8592b781ca35b87212| |1.2.1|2025.2.07|1.11/2.0/2.1/2.2|2.4.0|[mindstudio_probe-1.2.1-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.2/mindstudio_probe-1.2.1-py3-none-any.whl)|b64b342118558e0339b39237f88a49b93fd24551b0cb202c872fbfef4260c86b| 
|1.2.0|2025.1.13|1.11/2.0/2.1/2.2|2.4.0|[mindstudio_probe-1.2.0-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.2/mindstudio_probe-1.2.0-py3-none-any.whl)|1e3aeea1706112f6ee52fd1165037936bb209138f0b9ec42ea21e2c1c8942cdc| |1.1.1|2024.12.09|1.11/2.0/2.1/2.2|2.4.0|[mindstudio_probe-1.1.1-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/msprobe/1.1/mindstudio_probe-1.1.1-py3-none-any.whl)|577b597555dc155b76ba1a62d575c3546004644e140a456c3ba0824d46283735| @@ -80,8 +82,6 @@ pip install ./mindstudio_probe*.whl ## 1.1.1 -## 1.1.1 - 【数据采集】 - dump 支持 processgroup、namedtuple、slice 等数据类型 diff --git a/debug/accuracy_tools/msprobe/docs/02.config_introduction.md b/debug/accuracy_tools/msprobe/docs/02.config_introduction.md index 3b53f4e3c21a31426cb435e03460f7367340af89..85d895c277763a3ded1bdb19bda608299e96ba69 100644 --- a/debug/accuracy_tools/msprobe/docs/02.config_introduction.md +++ b/debug/accuracy_tools/msprobe/docs/02.config_introduction.md @@ -10,20 +10,18 @@ ### 1.1 通用配置 -| 参数 | 解释 | 是否必选 | -| ----------------- |------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -------- | -| task | dump 的任务类型,str 类型。可选参数:
"statistics":仅采集统计信息,默认值;
"tensor":采集统计信息和完全复刻整网的真实数据;
"run_ut":精度预检,仅 PyTorch 场景支持,采集数据时勿选;
"overflow_check":溢出检测;
"free_benchmark":无标杆比对,不支持 MSAdapter 场景;
"grad_probe":梯度监控, 不支持 MSAdapter 场景;
"structure":仅采集模型结构以及调用栈信息,不采集具体数据。
根据 task 参数取值的不同,可以配置不同场景参数,详见:
[1.2 task 配置为 statistics](#12-task-配置为-statistics),
[1.3 task 配置为 tensor](#13-task-配置为-tensor),
[1.4 task 配置为 run_ut](#14-task-配置为-run_ut),
[1.5 task 配置为 overflow_check](#15-task-配置为-overflow_check),
[1.6 task 配置为 free_benchmark](#16-task-配置为-free_benchmark),
[1.7 task 配置为 grad_probe](#17-task-配置为-grad_probe)。
[1.8 task 配置为 structure](#18-task-配置为-structure)。
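**Configuration example**: "task": "tensor". | No | -| dump_path | Directory path for the dump data, str.
**Configuration example**: "dump_path": "./dump_path". | Yes | -| rank | Restricts collection to the data on specific cards, list[Union[int, str]]. Not configured by default, which means data is collected from all cards. Each element should be an integer ≥ 0 or a string such as "4-6", and must be a Rank ID that is actually available.
PyTorch scenarios: Rank IDs are counted from 0, and the maximum value is the total number of available cards across all nodes minus 1. If a configured value is greater than the Rank IDs of the cards that training actually runs on, the dump data is empty. For example, if the Rank IDs in the current environment are 0 to 7 and training actually runs on cards 0 to 3, then configuring Rank ID 4, or a nonexistent value such as 10, produces empty dump data.
MindSpore scenarios: Rank IDs on every node are counted from 0, and the maximum value is the number of available cards on each node minus 1. A rank parameter configured once in config.json takes effect on all nodes at the same time. Static-graph L0-level dump does not yet support specifying rank.
Note: for single-card training, rank must be [], that is, an empty list; a rank cannot be specified.
**Configuration example**: "rank": [1, "4-6"]. | No | -| step | Restricts collection to specific steps, list[Union[int, str]]. Not configured by default, which means data is collected for all steps. When collecting specific steps, they must be steps that exist in the training script; they can be listed one by one or given as a range.
**Configuration example**: "step": [0, 1, 2, "4-6"]. | No | -| level | The dump level, str; different data is collected at different levels. Options:
"L0": dump module-level accuracy data; for background see [1.1.1 Notes on module-level accuracy data dump](#111-模块级精度数据-dump-说明);
"L1": dump API-level accuracy data (default); supported by PyTorch, MSAdapter, and MindSpore;
"L2": dump kernel-level accuracy data. For PyTorch scenarios see [kernel dump in PyTorch scenarios](./04.kernel_dump_PyTorch.md); for MindSpore dynamic-graph scenarios see [kernel dump in MindSpore dynamic-graph scenarios](./28.kernel_dump_MindSpore.md); for MindSpore static-graph scenarios see the ["**8.1 Static-graph scenarios**"](./06.data_dump_MindSpore.md#81-静态图场景) subsection of "Data collection in MindSpore scenarios";
"mix": dump both module-level and API-level accuracy data, that is, "L0" + "L1"; supported only in PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios.
"debug": single-point save; for details see the [single-point save tool README](./28.debugger_save_instruction.md)
**Configuration example**: "level": "L1". | No | -| enable_dataloader | Automatic control switch, bool, supported only in PyTorch scenarios. Options are true (on) or false (off); the default is false. When set to true, the tool automatically identifies the iterations given by the step parameter and exits training after those iterations finish, so the start, stop, and step functions need not be called. Enabling this switch requires that the training script loads data through torch.utils.data.dataloader. It supports only PyTorch single-card training; in distributed training the dump data may be incomplete. **This feature will be deprecated in the next release.** | No | -| async_dump | Asynchronous dump switch, bool. Options are true (on) or false (off); the default is false. When set to true, asynchronous dump is enabled: the collected accuracy data is written to disk all at once after the current training step ends, and the tool triggers no synchronization during training. Because this mode carries a risk of **running out of device memory**, when task is set to tensor (asynchronous dump of real data) the [list](#13-task-配置为-tensor) parameter must be configured to specify which tensors to dump. This mode does not yet support complex-typed tensors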
的统计量计算。 | 否 | +| 参数 | 解释 | 是否必选 | +| ----------------- |--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -------- | +| task | dump 的任务类型,str 类型。可选参数:
"statistics":仅采集统计信息,默认值;
"tensor":采集统计信息和完全复刻整网的真实数据;
"run_ut":精度预检,仅 PyTorch 场景支持,采集数据时勿选;
"overflow_check":溢出检测;
"free_benchmark":无标杆比对,不支持 MSAdapter 场景;
"grad_probe":梯度监控, 不支持 MSAdapter 场景;
"structure":仅采集模型结构以及调用栈信息,不采集具体数据。
根据 task 参数取值的不同,可以配置不同场景参数,详见:
[1.2 task 配置为 statistics](#12-task-配置为-statistics),
[1.3 task 配置为 tensor](#13-task-配置为-tensor),
[1.4 task 配置为 run_ut](#14-task-配置为-run_ut),
[1.5 task 配置为 overflow_check](#15-task-配置为-overflow_check),
[1.6 task 配置为 free_benchmark](#16-task-配置为-free_benchmark),
[1.7 task 配置为 grad_probe](#17-task-配置为-grad_probe),
[1.8 task 配置为 structure](#18-task-配置为-structure)。
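**Configuration example**: "task": "tensor". | No | +| dump_path | Directory path for the dump data, str.
**Configuration example**: "dump_path": "./dump_path". | Yes | +| rank | Restricts collection to the data on specific cards, list[Union[int, str]]. Not configured by default, which means data is collected from all cards. Each element should be an integer ≥ 0 or a string such as "4-6", and must be a Rank ID that is actually available.
PyTorch scenarios: Rank IDs are counted from 0, and the maximum value is the total number of available cards across all nodes minus 1. If a configured value is greater than the Rank IDs of the cards that training actually runs on, the dump data is empty. For example, if the Rank IDs in the current environment are 0 to 7 and training actually runs on cards 0 to 3, then configuring Rank ID 4, or a nonexistent value such as 10, produces empty dump data.
MindSpore scenarios: Rank IDs on every node are counted from 0, and the maximum value is the number of available cards on each node minus 1. A rank parameter configured once in config.json takes effect on all nodes at the same time. Static-graph L0-level dump does not yet support specifying rank.
Note: for single-card training, rank must be [], that is, an empty list; a rank cannot be specified.
**Configuration example**: "rank": [1, "4-6"]. | No | +| step | Restricts collection to specific steps, list[Union[int, str]]. Not configured by default, which means data is collected for all steps. When collecting specific steps, they must be steps that exist in the training script; they can be listed one by one or given as a range.
**Configuration example**: "step": [0, 1, 2, "4-6"]. | No | +| level | The dump level, str; different data is collected at different levels. Options:
"L0": dump module-level accuracy data; for background see [1.1.1 Notes on module-level accuracy data dump](#111-模块级精度数据-dump-说明).
"L1": dump API-level accuracy data (default); supported only in PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios.
"L2": dump kernel-level accuracy data. For PyTorch scenarios see [kernel dump in PyTorch scenarios](./04.kernel_dump_PyTorch.md); for MindSpore dynamic-graph scenarios see [kernel dump in MindSpore dynamic-graph scenarios](./28.kernel_dump_MindSpore.md); for MindSpore static-graph scenarios see the ["**8.1 Static-graph scenarios**"](./06.data_dump_MindSpore.md#81-静态图场景) subsection of "Data collection in MindSpore scenarios".
"mix": dump both module-level and API-level accuracy data, that is, "L0" + "L1"; supported only in PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios.
"debug": single-point save; for details see the [single-point save tool](./28.debugger_save_instruction.md).
**Configuration example**: "level": "L1". | No | +| enable_dataloader | Automatic control switch, bool, supported only in PyTorch scenarios. Options are true (on) or false (off); the default is false. When set to true, the tool automatically identifies the iterations given by the step parameter and exits training after those iterations finish, so the start, stop, and step functions need not be called. Enabling this switch requires that the training script loads data through torch.utils.data.dataloader. It supports only PyTorch single-card training; in distributed training the dump data may be incomplete. **This feature will be deprecated in the next release.** | No | +| async_dump | Asynchronous dump switch, bool. Options are true (on) or false (off); the default is false. When set to true, asynchronous dump is enabled: the collected accuracy data is written to disk all at once after the current training step ends, and the tool triggers no synchronization during training. Because this mode carries a risk of **running out of device memory**, when task is set to tensor (asynchronous dump of real data) the [list](#13-task-配置为-tensor) parameter must be configured to specify which tensors to dump. This mode does not yet support complex-typed tensors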
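when computing statistics. | No | #### 1.1.1 Notes on module-level accuracy data dump -Supported by both PyTorch and MindSpore. - In large-model scenarios, migrating a training script from GPU to NPU is usually not a simple matter of using the automatic migration capability; the NPU network also undergoes a series of targeted adaptations. As a result, some substructures of the migrated NPU model often cannot be matched exactly to the original GPU model. The inconsistent model structure leads to inconsistent API call types and counts, so if accuracy data is dumped and compared directly at API granularity, not every API can be compared. The feature described in this subsection dumps data for the coarse-grained modules in the model, so that modules that cannot be compared at API granularity can be compared directly at module granularity. @@ -32,6 +30,7 @@ Supported by both PyTorch and MindSpore. In particular, in PyTorch scenarios, to work around the framework restriction that the output of a BackwardHook function cannot be modified in place, the tool uses the `torch._C._autograd._set_creation_meta` interface to reset the attributes of the BackwardHook function's output tensors. This may cause the dump data to lack the backward data of in-place modules (nn.ReLU(inplace=True)) and of the module immediately preceding them. + ### 1.2 task set to statistics @@ -44,9 +43,11 @@ Supported by both PyTorch and MindSpore.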
Configuration example: "list": ["Module.module.language_model.encoder.layers.0.mlp.ParallelMlp.forward.0"], or "list": ["Cell.network_with_loss.language_model.encoder.layers.0.mlp.ParallelMlp.forward.0"] + + - + @@ -54,6 +55,7 @@ Supported by both PyTorch and MindSpore. **Notes**: + 1. When "summary_mode" is set to "md5", the checksum algorithm used is CRC-32. **Example**: @@ -68,6 +70,7 @@ Supported by both PyTorch and MindSpore. "statistics": { "scope": [], "list": [], + "tensor_list":[], "data_mode": ["all"], "summary_mode": "statistics" } @@ -82,6 +85,7 @@ Supported by both PyTorch and MindSpore. | list | Same as described in [ 1.2 task set to statistics ](#12-task-配置为-statistics). | No | | data_mode | Same as described in [ 1.2 task set to statistics ](#12-task-配置为-statistics). | No | | file_format | Storage format of the tensor data, str. This field can be configured only at the L2 level in MindSpore static-graph scenarios and has no effect elsewhere. Options:
"bin": the dumped tensor files are in binary format;
"npy": the dumped tensor files have the .npy suffix (default). | No | +| summary_mode | Controls the output mode of the dump files, str; supported in PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios. Options:
md5: dump outputs a dump.json file containing CRC-32 values together with API statistics, used to verify data integrity;
statistics: dump outputs only a dump.json file containing API statistics (default). | No | | online_run_ut<sup>a</sup> | Online pre-check mode switch, bool; options are true (on) and false (off). Not configured by default, which means off. Setting it to true enables online pre-check. | No | | nfs_path<sup>a</sup> | Shared storage directory path for online pre-check mode, str, used for communication between the GPU device and the NPU device. Takes effect only when online_run_ut is set to true; once this parameter is configured, host and port have no effect. | No | | host<sup>a</sup> | IP address of the receiving end in the LAN scenario of online pre-check mode, str, used for communication between the GPU device and the NPU device; on the NPU side it must be set to the LAN IP address of the GPU side. Takes effect only when online_run_ut is set to true. In the LAN scenario, nfs_path must not be configured, otherwise the LAN scenario does not take effect. | No | diff --git a/debug/accuracy_tools/msprobe/docs/03.config_examples.md b/debug/accuracy_tools/msprobe/docs/03.config_examples.md index 542250fac243f3ab2f1d0aff87bc509ac7c1a675..0d29a4eb1a824bba2c1bda1a214c9add2e87bdba 100644 --- a/debug/accuracy_tools/msprobe/docs/03.config_examples.md +++ b/debug/accuracy_tools/msprobe/docs/03.config_examples.md @@ -17,6 +17,7 @@ "statistics": { "scope": [], "list": [], + "tensor_list": [], "data_mode": ["all"], "summary_mode": "statistics" } diff --git a/debug/accuracy_tools/msprobe/docs/04.kernel_dump_PyTorch.md b/debug/accuracy_tools/msprobe/docs/04.kernel_dump_PyTorch.md index ce3fd54f5a6741b262f6248f70a9f1166ca0b4a6..346481aad12c42994669b7b3ea794843e49c1618 100644 --- a/debug/accuracy_tools/msprobe/docs/04.kernel_dump_PyTorch.md +++ b/debug/accuracy_tools/msprobe/docs/04.kernel_dump_PyTorch.md @@ -6,7 +6,7 @@ ## 1 kernel dump configuration example -When using kernel dump, list must contain one API name; kernel dump currently supports collecting data for only one API per step. +When using kernel dump, task must be set to tensor and list must contain one API name; kernel dump currently supports collecting data for only one API per step. For the API name, refer to the API names in dump.json, the result file of an L1 dump; the naming format is `{api_type}.{api_name}.{API call count}.{forward/backward}`. ```json diff --git a/debug/accuracy_tools/msprobe/docs/05.data_dump_PyTorch.md b/debug/accuracy_tools/msprobe/docs/05.data_dump_PyTorch.md index 31d5e305b114e168d0aa2ed9c8d4e50e5b11383e..15abb958bfd57779481fa93134d8aa237a0331ed 100644 --- a/debug/accuracy_tools/msprobe/docs/05.data_dump_PyTorch.md +++ b/debug/accuracy_tools/msprobe/docs/05.data_dump_PyTorch.md @@ -249,6 +249,42 @@ debugger.set_init_step(step)
1.step: 指定的起始step数。 +### 1.11 register_custom_api + +**功能说明**:注册用户自定义的api到工具用于 L1 dump 。 + +**原型**: + +```Python +debugger.register_custom_api(module, api_name, api_prefix) +``` +**参数说明**: + +以 torch.matmul api 为例 + +1.module: api 所属的包,即传入 torch。 + +2.api_name: api 名,string类型,即传入 "matmul"。 + +3.api_prefix: [dump.json](./27.dump_json_instruction.md) 中 api 名的前缀,可选,默认为包名的字符串格式, 即 "torch"。 + +### 1.12 restore_custom_api + +**功能说明**:恢复用户原有的自定义的api,取消 dump 。 + +**原型**: + +```Python +debugger.restore_custom_api(module, api_name) +``` +**参数说明**: + +以 torch.matmul api 为例 + +1.module: api 所属的包,即传入 torch。 + +2.api_name: api 名,string类型,即传入 "matmul"。 + ## 2 示例代码 @@ -404,8 +440,8 @@ class ModuleOP(nn.Module): | | | | ├── Functional.linear.5.backward.output.pt # 命名格式为{api_type}.{api_name}.{API调用次数}.{forward/backward}.{input/output}.{参数序号}, 其中,“参数序号”表示该API的第n个输入或输出,例如1,则为第一个参数,若该参数为list格式,则根据list继续排序,例如1.1,表示该API的第1个参数的第1个元素。 | | | | ... | | | | ├── Module.conv1.Conv2d.forward.0.input.0.pt # 命名格式为{Module}.{module_name}.{class_name}.{forward/backward}.{调用次数}.{input/output}.{参数序号}, 其中,“参数序号”表示该Module的第n个参数,例如1,则为第一个参数,若该参数为list格式,则根据list继续排序,例如1.1,表示该Module的第1个参数的第1个元素。 -| | | | ├── Module.conv1.Conv2D.forward.0.parameters.bias.pt # 模块参数数据:命名格式为{Module}.{module_name}.{class_name}.forward.{调用次数}.parameters.{parameter_name}。 -| | | | └── Module.conv1.Conv2D.parameters_grad.weight.pt # 模块参数梯度数据:命名格式为{Module}.{module_name}.{class_name}.parameters_grad.{parameter_name}。因为同一模块的参数使用同一梯度进行更新,所以参数梯度文件名不包含调用次数。 +| | | | ├── Module.conv1.Conv2d.forward.0.parameters.bias.pt # 模块参数数据:命名格式为{Module}.{module_name}.{class_name}.forward.{调用次数}.parameters.{parameter_name}。 +| | | | └── Module.conv1.Conv2d.parameters_grad.weight.pt # 模块参数梯度数据:命名格式为{Module}.{module_name}.{class_name}.parameters_grad.{parameter_name}。因为同一模块的参数使用同一梯度进行更新,所以参数梯度文件名不包含调用次数。 | | | | # 
当dump时传入的model参数为List[torch.nn.Module]或Tuple[torch.nn.Module]时,模块级数据的命名中包含该模块在列表中的索引index,命名格式为{Module}.{index}.*,*表示以上三种模块级数据的命名格式,例如:Module.0.conv1.Conv2d.forward.0.input.0.pt。 │ | | ├── dump.json │ | | ├── stack.json diff --git a/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md b/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md index eee3eb4b1ba9871a197b6f6e680fd13da3492b96..185f375a78022b37a032037caa2e475987230722 100644 --- a/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md +++ b/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md @@ -30,8 +30,8 @@ dump 的"tensor"模式采集数据量大小,可以参考[数据量基线](data ## 5. 场景介绍 -### 5.1 静态图场景 -在静态图场景下,msprobe 支持 **L0 Level** 和 **L2 Level** 的数据采集。且当 MindSpore 版本高于 2.5.0 时,若需采集 **L2 Level** 数据,必须使用编包时添加了`--include-mod=adump`选项的 mindstudio-probe whl 包进行 msprobe 工具安装。 +### 5.1 静态图场景 +在静态图场景下,msprobe 支持 **L0 Level** 和 **L2 Level** 的数据采集。且当 MindSpore 版本高于 2.5.0 时,若需采集 **L2 Level** 数据,必须使用编包时添加了`--include-mod=adump`选项的 mindstudio-probe whl 包进行 msprobe 工具安装。 - **L0 Level(Cell 级)** :采集 `Cell` 对象的数据,适用于需要分析特定网络模块的情况。 - **L2 Level(Kernel 级)** :采集底层算子的输入输出数据,适用于深入分析算子级别的精度问题。 @@ -150,6 +150,7 @@ save(variable, name, save_backward=True) | name | 指定的名称 | str | 是 | | save_backward | 是否保存反向数据 | boolean | 否 | + #### 6.1.6 set_init_step **功能说明**:设置起始step数,step数默认从0开始计数,使用该接口后step从指定值开始计数。该函数需要写在训练迭代的循环开始前,不能写在循环内。 @@ -165,6 +166,43 @@ set_init_step(step) 1.step: 指定的起始step数。 +#### 6.1.7 register_custom_api + +**功能说明**:注册用户自定义的api到工具用于 L1 dump 。 + +**原型**: + +```Python +debugger.register_custom_api(module, api_name, api_prefix) +``` +**参数说明**: + +以 torch.matmul api 为例 + +1.module: api 所属的包,即传入 torch。 + +2.api_name: api 名,string类型,即传入 "matmul"。 + +3.api_prefix: [dump.json](./27.dump_json_instruction.md) 中 api 名的前缀,可选,默认为包名的字符串格式, 即 "torch"。 + +#### 6.1.8 restore_custom_api + +**功能说明**:恢复用户原有的自定义的api,取消 dump 。 + +**原型**: + +```Python +debugger.restore_custom_api(module, api_name) +``` +**参数说明**: + +以 torch.matmul api 为例 
+ +1.module: api 所属的包,即传入 torch。 + +2.api_name: api 名,string类型,即传入 "matmul"。 + + ### 6.2 msprobe.mindspore.MsprobeStep **功能说明**:MindSpore Callback类,自动在每个step开始时调用start()接口,在每个step结束时调用stop()、step()接口。实现使用 Model 高阶 API 的动态图场景下 L0、L1、mix 级别,和静态图场景下 L0级别的精度数据采集控制,控制粒度为单个 **Step** ,而 PrecisionDebugger.start, PrecisionDebugger.stop 接口的控制粒度任意训练代码段。 @@ -214,7 +252,7 @@ seed_all(seed=1234, mode=False, rm_dropout=True) **说明**: 静态图 L0 级别的Dump功能是基于mindspore.ops.TensorDump算子实现。在Ascend平台上的Graph模式下,可以通过设置环境变量 [MS_DUMP_SLICE_SIZE 和 MS_DUMP_WAIT_TIME](https://www.mindspore.cn/docs/zh-CN/r2.5.0/api_python/env_var_list.html) 解决在输出大Tesnor或输出Tensor比较密集场景下算子执行失败的问题。 -##### 7.1.1.1 未使用 Model 高阶 API +##### 7.1.1.1 未使用 Model 高阶 API ```python @@ -236,7 +274,7 @@ for data, label in data_loader: debugger.step() # 更新迭代数 ``` -##### 7.1.1.2 使用 Model 高阶 API +##### 7.1.1.2 使用 Model 高阶 API ```python @@ -407,7 +445,6 @@ L2 级别 dump 的目录结构如下所示: 3. acl_dump_{device_id}.json 为在 Dump 接口调用过程中生成的中间文件,一般情况下无需关注。 其他场景下,除 kernel_kbyk_dump.json(jit_level=O0/O1)、kernel_graph_dump.json(jit_level=O2)等无需关注的中间文件外的其他 dump 结果文件请参见 MindSpore 官方文档中的[ Ascend 下 O0/O1 模式 Dump 数据对象目录和数据文件介绍](https://www.mindspore.cn/docs/zh-CN/r2.5.0/model_train/debug/dump.html#%E6%95%B0%E6%8D%AE%E5%AF%B9%E8%B1%A1%E7%9B%AE%E5%BD%95%E5%92%8C%E6%95%B0%E6%8D%AE%E6%96%87%E4%BB%B6%E4%BB%8B%E7%BB%8D)与[ Ascend 下 O2 模式 Dump 数据对象目录和数据文件介绍](https://www.mindspore.cn/docs/zh-CN/r2.5.0/model_train/debug/dump.html#%E6%95%B0%E6%8D%AE%E5%AF%B9%E8%B1%A1%E7%9B%AE%E5%BD%95%E5%92%8C%E6%95%B0%E6%8D%AE%E6%96%87%E4%BB%B6%E4%BB%8B%E7%BB%8D-1)。 - ### 8.2 动态图场景 dump 结果目录结构示例如下: @@ -423,9 +460,9 @@ dump 结果目录结构示例如下: | | | | ├── Tensor.__add__.0.forward.output.0.npy | | | | ... 
| | | | ├── Jit.AlexNet.0.forward.input.0.npy -| | | | ├── Primitive.conv2d.Conv2D.0.forward.input.0.npy -| | | | ├── Cell.conv1.Conv2D.forward.0.parameters.weight.npy # 模块参数数据:命名格式为{Cell}.{cell_name}.{class_name}.forward.{调用次数}.parameters.{parameter_name}。 -| | | | ├── Cell.conv1.Conv2D.parameters_grad.weight.npy # 模块参数梯度数据:命名格式为{Cell}.{cell_name}.{class_name}.parameters_grad.{parameter_name}。因为同一模块的参数使用同一梯度进行更新,所以参数梯度文件名不包含调用次数。 +| | | | ├── Primitive.conv2d.Conv2d.0.forward.input.0.npy +| | | | ├── Cell.conv1.Conv2d.forward.0.parameters.weight.npy # 模块参数数据:命名格式为{Cell}.{cell_name}.{class_name}.forward.{调用次数}.parameters.{parameter_name}。 +| | | | ├── Cell.conv1.Conv2d.parameters_grad.weight.npy # 模块参数梯度数据:命名格式为{Cell}.{cell_name}.{class_name}.parameters_grad.{parameter_name}。因为同一模块的参数使用同一梯度进行更新,所以参数梯度文件名不包含调用次数。 | | | | └── Cell.relu.ReLU.forward.0.input.0.npy # 命名格式为{Cell}.{cell_name}.{class_name}.{forward/backward}.{调用次数}.{input/output}.{参数序号}, 其中,“参数序号”表示该Cell的第n个参数,例如1,则为第一个参数,若该参数为list格式,则根据list继续排序,例如1.1,表示该Cell的第1个参数的第1个元素。 | | | | # 当dump时传入的model参数为List[mindspore.nn.Cell]或Tuple[mindspore.nn.Cell]时,模块级数据的命名中包含该模块在列表中的索引index,命名格式为{Cell}.{index}.*,*表示以上三种模块级数据的命名格式,例如:Cell.0.relu.ReLU.forward.0.input.0.npy。 │ | | ├── dump.json @@ -487,6 +524,3 @@ ops: - adaptive_avg_pool2d - adaptive_avg_pool3d ``` -### 9.2 不支持模型 - -静态图场景L0级暂不支持Yi模型。 \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/docs/07.accuracy_checker_PyTorch.md b/debug/accuracy_tools/msprobe/docs/07.accuracy_checker_PyTorch.md index ba7f978b09a4ffbc25d6525492aeeda43a698279..55223d493ae4bf5d761750559ef53d8581c08fde 100644 --- a/debug/accuracy_tools/msprobe/docs/07.accuracy_checker_PyTorch.md +++ b/debug/accuracy_tools/msprobe/docs/07.accuracy_checker_PyTorch.md @@ -107,7 +107,7 @@ msprobe -f pytorch multi_run_ut -api_info ./dump_path/step{step_number}/rank{ran | -save_error_data | 保存精度未达标的 API 输入输出数据。 | 否 | | -o 或 --out_path | 指定 run_ut 执行结果存盘路径,默认“./”。 | 否 | | -j 或 --jit_compile | 
开启 jit 编译。 | 否 | -| -n 或 --num_splits | 同时执行 run_ut 线程的数量,默认为 8,最大支持 64,但每个 Device 最大支持 8 个线程。当指定多个线程和多个 Device 时,线程数在每张卡上均分。 | 否 | +| -n 或 --num_splits | 同时执行 run_ut 线程的数量,默认为 8,最大支持 64,但每个 Device 最大支持 8 个线程。当指定多个线程和多个 Device 时,线程数在每张卡上均分。 | 否 | | -d 或 --device | 指定 Device ID,选择 UT 代码运行所在的卡,默认值为 0,支持同时指定 0~7,共 8 个 Device。 | 否 | | -csv_path 或 --result_csv_path | 指定本次运行中断时生成的 `accuracy_checking_result_{timestamp}.csv` 文件路径,执行 run_ut 中断时,若想从中断处继续执行,配置此参数即可。需要指定为上次中断的 `accuracy_checking_result_{timestamp}.csv` 文件。详见 [3.3 断点续检](#33-断点续检)。 | run_ut 操作中断后继续执行场景下必须配置 | | -f 或 --filter_api | 过滤模型中除最大值和最小值以外其他参数和结构相同的 API。适用于模型较大且重复 API 较多的场景。 | 否 | diff --git a/debug/accuracy_tools/msprobe/docs/08.accuracy_checker_online_PyTorch.md b/debug/accuracy_tools/msprobe/docs/08.accuracy_checker_online_PyTorch.md index 7dfc71e8369ac906c434e10d372210d49c847bf9..ebfe92d2f5464f2d8cda88351f8d53e818b32ecd 100644 --- a/debug/accuracy_tools/msprobe/docs/08.accuracy_checker_online_PyTorch.md +++ b/debug/accuracy_tools/msprobe/docs/08.accuracy_checker_online_PyTorch.md @@ -65,7 +65,7 @@ Host 与 GPU Host 设备间建立连接,将 NPU 上对应 API 的输入数据 以下秘钥生成方法仅为简单示例,客户应使用与自己需求相符的秘钥生成和存储机制并保证秘钥安全性与机密性,必要时可采用分层秘钥机制。 ```shell # 创建私钥文件server.key -openssl genrsa -out server.key 2048 +openssl genrsa -out server.key 3072 # 创建签名请求文件server.csr openssl req -new -key server.key -out server.csr diff --git a/debug/accuracy_tools/msprobe/docs/09.accuracy_checker_MindSpore.md b/debug/accuracy_tools/msprobe/docs/09.accuracy_checker_MindSpore.md index 3bf65032edae2b8e35c5818d5c030c9ce4c79e95..d2f938459410a3a1cc4c363975b9b10939d9e7fe 100644 --- a/debug/accuracy_tools/msprobe/docs/09.accuracy_checker_MindSpore.md +++ b/debug/accuracy_tools/msprobe/docs/09.accuracy_checker_MindSpore.md @@ -34,9 +34,18 @@ msprobe -f mindspore run_ut -api_info ./dump.json -o ./checker_result | -api_info 或 --api_info_file | 指定 API 信息文件 dump.json。对其中的mint api以及部分Tensor api进行预检,预检支持的Tensor api列表详见 [ 
预检支持列表](../mindspore/api_accuracy_checker/checker_support_api.yaml)。 | str | 是 | | -o 或 --out_path | 指定预检结果存盘路径,默认“./”。 | str | 否 | | -csv_path 或 --result_csv_path | 指定本次运行中断时生成的 `accuracy_checking_result_{timestamp}.csv` 文件路径,执行 run_ut 中断时,若想从中断处继续执行,配置此参数即可。需要指定为上次中断的 `accuracy_checking_result_{timestamp}.csv` 文件。详见 [3.3 断点续检](#33-断点续检)。 | str | 否 | +| -save_error_data | 保存(随机数据模式)精度未达标的 API 输入输出数据。 | 空 | 否 | 预检执行结果包括 `accuracy_checking_result_{timestamp}.csv` 和 `accuracy_checking_details_{timestamp}.csv` 两个文件。`accuracy_checking_result_{timestamp}.csv` 属于 API 级,标明每个 API 是否通过测试。建议用户先查看 `accuracy_checking_result_{timestamp}.csv` 文件,对于其中没有通过测试的或者特定感兴趣的 API,根据其 API Name 字段在 `accuracy_checking_details_{timestamp}.csv` 中查询其各个输出的达标情况以及比较指标。详细介绍请参见 [4 预检结果](#4-预检结果)。 +随机数据模式下,如果需要保存比对不达标的输入和输出数据,可以在 run_ut 执行命令结尾添加 `-save_error_data`,例如: + +```bash +msprobe -f mindspore run_ut -api_info ./dump.json -o ./checker_result -save_error_data +``` + +数据默认会存盘到 '{out_path}/error_data' 路径下。 + ### 3.2 使用 multi_run_ut 执行多线程预检 multi_run_ut 脚本,可以并行在多个Device执行 run_ut 操作,从而减少预检耗时。示例如下: @@ -45,16 +54,19 @@ multi_run_ut 脚本,可以并行在多个Device执行 run_ut 操作,从而 msprobe -f mindspore multi_run_ut -api_info ./dump.json -d 0 1 2 3 ``` -| 参数名称 | 说明 |参数类型 | 是否必选 | -| ---------------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------- | ---------------------------------- | -| -api_info 或 --api_info_file | 指定 API 信息文件 dump.json。对其中的mint api以及部分Tensor api进行预检,预检支持的Tensor api列表详见 [ 预检支持列表](../mindspore/api_accuracy_checker/checker_support_api.yaml)。 | str | 是 | -| -o 或 --out_path | 指定预检结果存盘路径,默认“./”。 | str | 否 | -| -csv_path 或 --result_csv_path | 指定本次运行中断时生成的 `accuracy_checking_result_{timestamp}.csv` 文件路径,执行 run_ut 中断时,若想从中断处继续执行,配置此参数即可。需要指定为上次中断的 `accuracy_checking_result_{timestamp}.csv` 文件。详见 [3.3 断点续检](#33-断点续检)。 | str | 否 | -| -d 或 --device | 
指定 Device ID,选择 UT 代码运行所在的卡,默认值为 0,支持同时指定 0 ~ Device数量 - 1 ,例如 0 1 2 3 4。 | List[int] | 否 | +| 参数名称 | 说明 | 参数类型 | 是否必选 | +| ---------------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------| ---------------------------------- | +| -api_info 或 --api_info_file | 指定 API 信息文件 dump.json。对其中的mint api以及部分Tensor api进行预检,预检支持的Tensor api列表详见 [ 预检支持列表](../mindspore/api_accuracy_checker/checker_support_api.yaml)。 | str | 是 | +| -o 或 --out_path | 指定预检结果存盘路径,默认“./”。 | str | 否 | +| -csv_path 或 --result_csv_path | 指定本次运行中断时生成的 `accuracy_checking_result_{timestamp}.csv` 文件路径,执行 run_ut 中断时,若想从中断处继续执行,配置此参数即可。需要指定为上次中断的 `accuracy_checking_result_{timestamp}.csv` 文件。详见 [3.3 断点续检](#33-断点续检)。 | str | 否 | +| -d 或 --device | 指定 Device ID,选择 UT 代码运行所在的卡,默认值为 0,支持同时指定 0 ~ Device数量 - 1 ,例如 0 1 2 3 4。 | List[int] | 否 | +| -save_error_data | 保存(随机数据模式)精度未达标的 API 输入输出数据。 | 空 | 否 | 在不同卡数下,使用38B语言大模型的预检耗时基线参考 [multi_run_ut耗时基线](accuracy_checker_MindSpore/accuracy_checker_MindSpore_baseline.md) +数据默认会存盘到 './ut_error_data{timestamp}' 路径下 + ### 3.3 断点续检 断点续检操作通过如下命令执行: diff --git a/debug/accuracy_tools/msprobe/docs/11.accuracy_compare_MindSpore.md b/debug/accuracy_tools/msprobe/docs/11.accuracy_compare_MindSpore.md index 06bec97673443ffa7765e1d8a87effb83dc995ba..47ba0e8b16e3ac76a4400f87b79ae7ba145bf501 100644 --- a/debug/accuracy_tools/msprobe/docs/11.accuracy_compare_MindSpore.md +++ b/debug/accuracy_tools/msprobe/docs/11.accuracy_compare_MindSpore.md @@ -16,10 +16,8 @@ msprobe精度比对工具主要用于如下场景: - MindSpore与PyTorch跨框架比对 - 通过对同一个网络模型,在整网环境下分别在MindSpore动态图和PyTorch环境下获得API dump数据,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 - 通过对同一个网络模型,在整网环境下分别在MindSpore动态图和PyTorch环境下获得cell dump数据,由用户指定可以比对的cell list,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 - - 通过对同一个网络模型,在整网环境下分别在MindSpore静态图和PyTorch环境下获得cell dump数据,由用户指定可以比对的cell list,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 - 
通过对同一个网络模型,在整网环境下分别在MindSpore动态图和PyTorch环境下获得API或模块dump数据,由用户指定可以比对的API或模块,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 - 通过对同一个网络模型,在整网环境下分别在MindSpore动态图和PyTorch环境下获得API或模块dump数据,由用户指定可以比对的模型代码中的Layer层,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 - - 通过对同一个网络模型,在整网环境下分别在MindSpore静态图和PyTorch环境下获得模块dump数据,由用户指定可以比对的模型代码中的Layer层,以PyTorch数据作为标杆,进行自动比对,从而实现跨框架的精度对比。 执行精度比对操作需要安装msprobe工具。详见《[MindStudio精度调试工具](../README.md)》的“工具安装”章节。 @@ -43,7 +41,7 @@ msprobe -f mindspore compare -i ./compare.json -o ./output -s | -o或--output_path | 配置比对结果文件存盘目录,默认会在当前目录创建output目录。文件名称基于时间戳自动生成,格式为:
`compare_result_{timestamp}.xlsx`
`compare_result_{rank_id}_{step_id}_{timestamp}.xlsx`(仅[不同版本下的全量kernel比对](#23-不同版本下的全量kernel比对)场景支持)。
提示:output目录下与结果件同名文件将被删除覆盖。 | 否 | | -s或--stack_mode | 比对结果展示调用栈信息(NPU_Stack_Info)的开关,bool 类型。单卡场景开启时,需要使用[比对文件](#41-比对文件)的单卡场景配置stack_path指定stack.json文件,才能生成详细调用栈信息,否则在比对时会报错;暂不支持多卡场景。通过直接配置该参数开启,默认未配置,表示关闭。 | 否 | | -c或--compare_only | 仅比对开关,bool 类型。该参数默认未配置,会启用自动精度分析,工具自动针对比对结果进行分析,识别到第一个精度可能不达标节点(在比对结果文件中的 Accuracy Reached or Not 列显示为 No),并给出问题可能产生的原因(打屏展示并生成 `advisor_{timestamp}.txt` 文件)。通过配置该参数取消自动精度分析,仅输出比对结果表格。 | 否 | -| -f或--fuzzy_match | 模糊匹配。开启后,对于跨框架比对场景不再校验dtype与pytorch侧的一致性,可匹配并进行比对。通过直接配置该参数开启,默认未配置,表示关闭。 | 否 | +| -f或--fuzzy_match | 模糊匹配。开启后,对于网络中同一层级且命名仅调用次数不同的API,可匹配并进行比对。通过直接配置该参数开启,默认未配置,表示关闭。 | 否 | | -am或--api_mapping | 跨框架比对。配置该参数时表示开启跨框架API比对功能,可以指定自定义映射文件*.yaml,不指定映射文件时按照msprobe定义的默认映射关系进行比对。自定义映射文件的格式请参见[自定义映射文件(api_mapping)](#43-自定义映射文件api_mapping)。仅[跨框架的API比对](#25-跨框架的api比对)场景需要配置。 | 否 | | -cm或--cell_mapping | 跨框架比对。配置该参数时表示开启跨框架cell模块比对功能,可以指定自定义映射文件*.yaml,不指定映射文件时按照msprobe定义的默认映射关系进行比对。自定义映射文件的格式请参见[自定义映射文件(cell_mapping)](#44-自定义映射文件cell_mapping)。仅[跨框架的cell模块比对](#26-跨框架的cell模块比对)场景需要配置。 | 否 | | -dm或--data_mapping | 同框架或跨框架比对。通过映射文件指定两个具体参数的对应关系,可以在L0、L1或mix采集场景下使用。配置该参数的同时需要指定自定义映射文件*.yaml。自定义映射文件的格式请参见[自定义映射文件(data_mapping)](#45-自定义映射文件data_mapping)。 | 否 | @@ -151,11 +149,6 @@ msprobe -f mindspore compare -i ./compare.json -o ./output -s cell_mapping.yaml文件配置请参见[自定义映射文件(cell_mapping)](#44-自定义映射文件cell_mapping)。 不传入cell_mapping.yaml的情况下仅将Cell改成Module后进行匹配;传入cell_mapping.yaml的情况下将按照cell_mapping.yaml的内容进行匹配。 - 如果跨框架比对场景不需要考虑dtype与pytorch侧的一致性,匹配并进行比对,可以开启-f或--fuzzy_match选项,例: - ```shell - msprobe -f mindspore compare -i ./compare.json -o ./output -s -f -cm cell_mapping.yaml - ``` - 此外,也可以通过data_mapping.yaml文件实现具体参数的匹配,例: ```shell msprobe -f mindspore compare -i ./compare.json -o ./output -s -dm data_mapping.yaml diff --git a/debug/accuracy_tools/msprobe/docs/19.monitor.md b/debug/accuracy_tools/msprobe/docs/19.monitor.md index c5f7cff85d662aa26ba9e59a8eebe3bca321cd1c..e0ac9f7220ba29bc3b79c7e82aa7c9307b554b66 100644 --- 
a/debug/accuracy_tools/msprobe/docs/19.monitor.md +++ b/debug/accuracy_tools/msprobe/docs/19.monitor.md @@ -23,10 +23,8 @@ | [优化器状态监控](#优化器状态监控) | 开启优化器状态监控 | PyTorch、MindSpore | | [指定监控对象](#指定监控对象) | 指定监控的nn.Module(nn.Cell)及对应的输入输出 | PyTorch、MindSpore | | [打印模型结构](#打印模型结构) | 打印模型结构 | PyTorch | -| [Module全量监控](#Module全量监控) | 对全量module的输入输出做监控 | PyTorch、MindSpore | -| [Parameter全量监控](#Parameter全量监控) | 对全量Parameter的输入输出做监控 | PyTorch、MindSpore | | [输出格式和统计量](#输出格式和统计量) | format PyTorch支持`csv`、`tensorboard`和`api`,MindSpore仅支持`csv`,`ops`、`ndigits`均支持 | PyTorch、MindSpore | -| [梯度异常时序判断](#梯度异常时序判断) | 梯度异常时自动梯度落盘 | PyTorch | +| [异常告警](#异常告警) | 监控对象指标异常时自动告警,支持异常数据落盘 | PyTorch、MindSpore | | [csv格式数据转tensorboard可视化显示](#csv格式数据转tensorboard可视化显示) | 将csv转为tensorboard文件显示 | PyTorch | | [动态启停](#动态启停) | 训练过程中动态修改配置开启监控 | PyTorch、MindSpore | | [功能重载](#功能重载) | 训练中开启激活值监控。待废弃,请使用动态启停功能代替。 | PyTorch | @@ -208,9 +206,12 @@ monitor.monitor_gnorm_with_ad( ## 高阶功能 + ### 指定监控对象 -工具支持对nn.Module(**激活值监控**)和nn.Parameter(**权重监控**、**权重梯度监控、优化器监控**)对象实现相应的监控行为,在配置文件的"targets"(dict)字段指定,targets格式为{module_name/param_name: {filed: format}}。 +工具支持对指定nn.Module进行状态监控,在配置文件的`targets`字段中指定,`targets`格式为{module_name: {}}。 + +module_name可以通过nn.Module的接口named_modules()获取。 #### 打印模型结构 工具提供可选项`print_struct`打印模型结构,帮助配置targets。工具会在在第一个step后打印结构并停止训练进程,模型结构默认打印在`$MONITOR_OUTPUT_DIR/module_struct.json`。 @@ -221,7 +222,6 @@ monitor.monitor_gnorm_with_ad( ``` 输出样例: -字段`config`用于配置文件中指定module target。其余为各个元素的shape和dtype。 ```json "0:63.mlp.linear_fc2": { @@ -245,40 +245,30 @@ monitor.monitor_gnorm_with_ad( } }, ``` +对于module对象,通常关心前向/反向传播的输入和输出: -- Module - 对于module对象,通常关心其前向的输入(input)输出(output)和反向的输入--前向输出的梯度(output_grad)和输出--前向输入的梯度(input_grad)。同时需要声明这些对象的类型,通常为"tensor"或"tuple\[length]"。 +- 前向的输入(input) +- 前向的输出(output) +- 反向的输入,表示前向输出的梯度(output_grad) +- 反向的输出,表示前向输入的梯度(input_grad) - "tensor"可以直接用来计算统计量,"tuple"需要进一步指定监控的索引。如"tuple[2]:0",表示该对象为长度2的tuple,对第0元素进行监控;不指定索引时,默认对第0元素进行监控。 - 
module_name可以通过nn.Module的接口`named_modules()`获取。 -```json -// 示例:对一个名为"module.encoder.layers.0.mlp"的module,监控其前向输入第0元素和输出。 -{ - "targets": { - "module.encoder.layers.0.mlp": { - "input": "tuple[2]:0", - "output": "tensor" - } - } -} -``` -#### Module全量监控 -工具提供简便的全量module监控方式。或不配置targets、all_xy字段,同样表示全量监控。 +#### 指定监控对象 + +targets字段指定监控对象示例如下: ```json -{ - "targets": {}, - "all_xy": true +// 示例:对一个名为"module.encoder.layers.0.mlp"的module。 +"targets": { + "module.encoder.layers.0.mlp": {} } ``` +对于parameter对象,通常会关注其在一个训练迭代中的梯度(weight grad)、adam类优化器中的动量(1st moment, 2nd moment)。 +parameter归属于某一module,可以通过指定module_name来监控包含在这一module中的**所有**parameter。 -- Parameter - 对于parameter对象,通常会关注其在一个训练迭代中的梯度(weight grad)、adam类优化器中的动量(1st moment, 2nd moment)。 - parameter归属于某一module,也可以通过指定module_name来监控包含在这一module中的**所有**parameter。 +param_name可以通过nn.Module的接口`named_parameters()`获取。 - param_name可以通过nn.Module的接口`named_parameters()`获取。 ```json // 示例:监控"module.encoder.layers.0.mlp"的所有参数和"module.embedding.word_embedding.weight"这一参数 { @@ -289,8 +279,9 @@ monitor.monitor_gnorm_with_ad( } ``` -#### Parameter全量监控 -工具提供简便的全量parameter监控方式。或不配置targets,同样表示全量监控。 +#### 全量监控 + +工具提供简便的全量module对象监控方式。 ```json { @@ -298,7 +289,9 @@ monitor.monitor_gnorm_with_ad( } ``` + ### 输出格式和统计量 + 工具配置示例: ```json { @@ -333,7 +326,7 @@ export MONITOR_OUTPUT_DIR=/xxx/output_dir 监控结果写入csv文件中,可以通过`ndigits`字段设置小数位数。 表头为 vpp_stage | name | step | micro_step(optional) | *ops |。 仅在激活值监控的输出文件中包含micor_step。 - 激活值监控的name为.\, 其他任务的name为> + 激活值监控的name为.\, 其他任务的name为 - **api** 监控结果不落盘,在训练过程中可以通过`generate_wgrad_metrics`、`generate_xy_metrics`等接口获取,使用方式参考[公开接口](#公开接口) 。 @@ -349,16 +342,36 @@ export MONITOR_OUTPUT_DIR=/xxx/output_dir ![step_count_per_record](img/monitor/step_count_per_record.png) -### 梯度异常时序判断 +### 异常告警 +工具的异常告警功能旨在自动判断训练过程中的异常现象,用户可通过在配置文件中配置alert字段来指定告警规则,并在训练过程中根据该规则及时打屏对用户发出告警。 + + 1. 
训练前配置相关参数 -工具支持自动判断训练过程中的梯度异常,需要在配置文件中设置alert相关字段。"AnomalyTurbulence"会将当前数值与历史均值比较,如果相对偏差超过阈值,会在打屏信息中提示用户。如果打开"`dump`"选项,则会将异常梯度相关信息落盘到目录`monitor_output/anomaly_detected`,用于后续时序判断。 +当前支持的异常告警规则如下: + +| 异常告警 |解释| rule_name | args是否可选 | +|--------------|----|-----------|---------------------------------------------------------------------| +| 历史均值偏离告警 |将当前数值与历史均值比较。如果相对偏差超过阈值,会在打屏信息中提示用户| AnomalyTurbulence | 否,必须传入threshold | +| nan值/极大值告警 |根据是否提供threshold来判断nan值或极大值| AnomalyNan | 是, 若未配置args或未配置threshold,则默认检测nan,若提供threshold,则检测nan值以及绝对值超过阈值的极大值 | + +除此之外,我们在alert中支持dump配置项,如果打开"`dump`"选项,则会将异常信息落盘到目录`monitor_output/anomaly_detected`。 + +- 历史均值偏离告警案例如下: ```json "alert": { "rules": [{"rule_name": "AnomalyTurbulence", "args": {"threshold": 0.5}}], "dump": true }, ``` +- nan值/极大值告警案例如下: +```json + "alert": { + "rules": [{"rule_name": "AnomalyNan", "args": {"threshold": 1e10}}], + "dump": true + }, +``` + 2. 实例化工具时传入流水线并行group ```python monitor = TrainerMon( @@ -395,9 +408,9 @@ python3 -m msprobe.pytorch.monitor.anomaly_analyse -d $MONITOR_OUTPUT_DIR/anomal ``` 异常事件分析结束,将topk事件写入文件`anomaly_detected/anomaly_analyse.json`。异常分析支持以下参数配置: -| 字段名 | 解释 | 是否必选 | -| ----------------- | ------------------------------------------------------------ | -------- | -| -d 或 --data_path | 指定梯度异常落盘文件夹,梯度监控功能输出,一般为$MONITOR_OUTPUT_DIR/anomaly_detected。 | 是 | +| 字段名 | 解释 | 是否必选 | +| ----------------- | --------------------------------------------------------- | -------- | +| -d 或 --data_path | 指定异常落盘文件夹,监控功能输出,一般为$MONITOR_OUTPUT_DIR/anomaly_detected。 | 是 | | -o 或 --out_path | 排序后的异常落盘文件地址,默认在--data_path路径下落盘一个anomaly_analyse.json文件。 | 否 | | -k 或 --topk | 指定保留前topk个异常,默认为8。 | 否 | | -s 或 --step_list | 指定分析的step范围,默认为[]。 | 否 | @@ -412,7 +425,7 @@ from msprobe.pytorch.monitor.csv2tb import csv2tensorboard_by_step # 前三个参数用来指定需要转换的一批文件,指定monitor输出目录及一个时间范围,会对这个范围内的文件进行转换 # process_num指定拉起的进程个数,默认为1,更多的进程个数可以加速转换 # data_type_list是一个列表,指定需要转换的数据类型,默认转换全部数据,数据类型应来自输出件文件前缀,所有类型数据: -# ["actv", 
"actv_grad", "exp_avg", "exp_avg_sq", "grad_unreduced", "grad_reduced", "param"] +# ["actv", "actv_grad", "exp_avg", "exp_avg_sq", "grad_unreduced", "grad_reduced", "param_origin", "param_updated"] # output_dirpath可指定输出目录,默认保存到"{curtime}_csv2tensorboard_by_step"文件夹,其中curtime为自动获取的当前时间戳 csv2tensorboard_by_step( monitor_path="~/monitor_output", # 必填 @@ -507,7 +520,7 @@ csv2tensorboard_by_step(monitor_path, time_start, time_end, process_num=1, data_ | time_start | 起始时间戳。搭配time_end一起使用。指定一个时间范围,会对这个范围内的文件进行转换。左闭右闭的区间。 | 是 | | time_end | 结束时间戳。搭配time_start一起使用。指定一个时间范围,会对这个范围内的文件进行转换。左闭右闭的区间。 | 是 | | process_num | 指定拉起的进程个数,默认为1,更多的进程个数可以加速转换。 | 否 | -| data_type_list | 指定需要转换的数据类型, 数据类型应来自输出件文件前缀,所有类型数据:
["actv", "actv_grad", "exp_avg", "exp_avg_sq", "grad_unreduced", "grad_reduced", "param"]。
不指定就转换全部数据。 | 否 | +| data_type_list | 指定需要转换的数据类型, 数据类型应来自输出件文件前缀,所有类型数据:
["actv", "actv_grad", "exp_avg", "exp_avg_sq", "grad_unreduced", "grad_reduced", "param_origin", "param_updated"]。
不指定就转换全部数据。 | 否 | | output_dirpath | 指定转换后的输出路径,默认输出到"{curtime}_csv2tensorboard_by_step"文件夹,其中curtime为自动获取的当前时间戳。 | 否 | - 在模型任意位置获取当前参数**梯度**统计量 ```python diff --git a/debug/accuracy_tools/msprobe/docs/21.visualization_PyTorch.md b/debug/accuracy_tools/msprobe/docs/21.visualization_PyTorch.md index 15824858a43c87ac6a04d64e8243b6418a8e5ac7..daf81c004ab7f431cccc5e34b16d061d3e916acd 100644 --- a/debug/accuracy_tools/msprobe/docs/21.visualization_PyTorch.md +++ b/debug/accuracy_tools/msprobe/docs/21.visualization_PyTorch.md @@ -57,15 +57,14 @@ msprobe -f pytorch graph -i ./compare.json -o ./output ``` **命令行参数说明**: -| 参数名 | 说明 | 是否必选 | -|------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| -i 或 --input_path | 指定比对文件,参考[比对文件说明](#313-比对文件说明) | 是 | -| -o 或 --output_path | 配置比对结果文件存盘目录,str 类型。文件名称基于时间戳自动生成,格式为:`compare_{timestamp}.vis或build_{timestamp}.vis`。 | 是 | -| -lm 或 --layer_mapping | 跨套件比对,例如同一个模型分别使用了DeepSpeed和Megatron套件的比对场景。配置该参数时表示开启跨套件Layer层的比对功能,指定模型代码中的Layer层后,可以识别对应dump数据中的模块或API。需要指定自定义映射文件*.yaml。自定义映射文件的格式请参见[自定义映射文件(Layer)](#71-自定义映射文件layer),如何配置自定义映射文件请参考[模型分级可视化如何配置layer mapping映射文件](./visualization/layer_mapping_example.md)。配置该参数后,将仅按节点名称进行比对,忽略节点的 type 和 shape。如果调试侧和标杆侧有名称不同的节点,则需要配置自定义映射文件,-lm参数传入自定义映射文件路径;如果调试侧和标杆侧节点名称相同,则仅指定-lm即可。 | 否 | -| -oc 或 --overflow_check | 是否开启溢出检测模式,开启后会在输出vis文件中(`compare_{timestamp}.vis或build_{timestamp}.vis`)对每个溢出节点进行标记溢出等级,溢出等级说明参考[溢出等级说明](#312-溢出等级说明) | 否 | -| -f 或 --fuzzy_match | 是否开启模糊匹配,bool类型。模糊匹配说明参考[匹配说明](#311-匹配说明) | 否 | -| -cs 或 --complete_stack | 是否使用完整的堆栈信息,bool类型。默认使用精简的堆栈信息,数据量小有助于增加流畅度。完整堆栈和精简堆栈信息参考[堆栈信息说明](#72-堆栈信息说明) | 否 | -| -mm 或 --multi_mapping 
| 一对一、一对多、多对一、多对多节点映射,例如待调试侧若干小算子与标杆侧融合算子比对等场景,需要指定自定义映射文件*.yaml。自定义映射文件的格式请参见[自定义映射文件(multi)](#73-自定义映射文件multi) | 否 | +| 参数名 | 说明 | 是否必选 | +|------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| +| -i 或 --input_path | 指定比对文件,参考[比对文件说明](#313-比对文件说明) | 是 | +| -o 或 --output_path | 配置比对结果文件存盘目录,str 类型。文件名称基于时间戳自动生成,格式为:`compare_{timestamp}.vis或build_{timestamp}.vis`。 | 是 | +| -lm 或 --layer_mapping | 跨套件比对,例如同一个模型分别使用了DeepSpeed和Megatron套件的比对场景。配置该参数时表示开启跨套件Layer层的比对功能,指定模型代码中的Layer层后,可以识别对应dump数据中的模块或API。需要指定自定义映射文件*.yaml。自定义映射文件的格式请参见[自定义映射文件(Layer)](#71-自定义映射文件layer),如何配置自定义映射文件请参考[模型分级可视化如何配置layer mapping映射文件](./visualization/layer_mapping_example.md)。 配置该参数后,将仅按节点名称进行比对,忽略节点的 type 和 shape。如果调试侧和标杆侧有名称不同的节点,则需要配置自定义映射文件,-lm参数传入自定义映射文件路径;如果调试侧和标杆侧节点名称相同,则仅指定-lm即可。| 否 | +| -oc 或 --overflow_check | 是否开启溢出检测模式,开启后会在输出vis文件中(`compare_{timestamp}.vis或build_{timestamp}.vis`)对每个溢出节点进行标记溢出等级,溢出等级说明参考[溢出等级说明](#312-溢出等级说明) | 否 | +| -f 或 --fuzzy_match | 是否开启模糊匹配,bool类型。模糊匹配说明参考[匹配说明](#311-匹配说明) | 否 | +| -cs 或 --complete_stack | 是否使用完整的堆栈信息,bool类型。默认使用精简的堆栈信息,数据量小有助于增加流畅度。完整堆栈和精简堆栈信息参考[堆栈信息说明](#72-堆栈信息说明) | 否 | #### 3.1.1 匹配说明 @@ -394,7 +393,6 @@ tensorboard --logdir out_path ![vis_precision_info.png](./img/visualization/vis_precision_info.png) ### 5.5 未匹配节点筛选 -节点匹配规则: 参考[匹配说明](#311-匹配说明) ,不符合匹配规则的节点为无匹配节点,颜色标灰。适用于排查两个模型结构差异的场景。 @@ -408,25 +406,56 @@ tensorboard --logdir out_path ## 6.图比对说明 -### 颜色 +### 6.1 颜色 颜色越深,精度比对差异越大,越可疑,具体信息可见浏览器页面左下角颜色图例。 -### 疑似有精度问题判定 - -#### 真实数据模式 +#### 6.1.1 真实数据模式 节点中所有输入的最小双千指标和所有输出的最小双千分之一指标的差值,反映了双千指标的下降情况,**值越大精度差距越大,颜色标记越深**。 ``One Thousandth Err Ratio(双千分之一)精度指标:Tensor中的元素逐个与对应的标杆数据对比,相对误差小于千分之一的比例占总元素个数的比例,比例越接近1越好`` -#### 统计信息模式 
+如果调试侧(NPU)节点的output指标中的最大值(MAX)或最小值(MIN)中存在 nan/inf/-inf,直接标记为最深颜色。 + +#### 6.1.2 统计信息模式 节点中输出的统计量相对误差,**值越大精度差距越大,颜色标记越深**。 ``相对误差:abs((npu统计值 - bench统计值) / bench统计值)`` -#### md5模式 +如果调试侧(NPU)节点的output指标中的最大值(MAX)或最小值(MIN)中存在 nan/inf/-inf,直接标记为最深颜色。 + +#### 6.1.3 md5模式 节点中任意输入输出的md5值不同。 +### 6.2 指标说明 + +精度比对从三个层面评估 API 的精度,依次是:真实数据模式、统计数据模式和 MD5 模式。比对结果分别有不同的指标。 + +**公共指标**: +- name: 参数名称,例如input.0 +- type: 类型,例如torch.Tensor +- dtype: 数据类型,例如torch.float32 +- shape: 张量形状,例如[32, 1, 32] +- Max: 最大值 +- Min: 最小值 +- Mean: 平均值 +- Norm: L2-范数 + +**真实数据模式指标**: +- Cosine: tensor 余弦相似度 +- EucDist: tensor 欧式距离 +- MaxAbsErr: tensor 最大绝对误差 +- MaxRelativeErr: tensor 最大相对误差 +- One Thousandth Err Ratio: tensor 相对误差小于千分之一的比例(双千分之一) +- Five Thousandth Err Ratio: tensor 相对误差小于千分之五的比例(双千分之五) + +**统计数据模式指标** +- (Max, Min, Mean, Norm) diff: 统计量绝对误差 +- (Max, Min, Mean, Norm) RelativeErr: 统计量相对误差 + +**MD5模式指标** +- md5: CRC-32 值 + ## 7.附录 ### 7.1 自定义映射文件(Layer) @@ -504,27 +533,6 @@ yaml文件中只需配置待调试侧与标杆侧模型代码中功能一致但 ] } -``` -### 7.3 自定义映射文件(multi) -支持一对一、一对多、多对一、多对多节点映射配置,**多个节点使用英文逗号,分隔开**。 - -配置多个节点时,如果待配置节点为Module.layer3.Linear.forward.0、Module.layer4.Linear.forward.0和Module.layer5.Linear.forward.0,则Module.layer4.Linear.forward.0无需配置,仅取首尾节点配置即可(Module.layer3.Linear.forward.0,Module.layer5.Linear.forward.0)。注意,**配置节点的先后顺序不能乱(construct.json中的节点名称顺序代表先后顺序,请参考[dump结果文件介绍](./05.data_dump_PyTorch.md#3-dump-结果文件介绍))**,Module.layer3.Linear.forward.0在前,就不能配置成Module.layer5.Linear.forward.0,Module.layer3.Linear.forward.0,会导致配置无效。 - -```yaml -# 一对一 -Module.layer.Linear.forward.0: Module.layer1.Linear.forward.0 -``` -```yaml -# 一对多 -Module.layer.Linear.forward.0: Module.layer1.Linear.forward.0,Module.layer2.Linear.forward.0 -``` -```yaml -# 多对一 -Module.layer1.Linear.forward.0,Module.layer2.Linear.forward.0: Module.layer.Linear.forward.0 -``` -```yaml -# 多对多 -Module.layer3.Linear.forward.0,Module.layer5.Linear.forward.0: Module.layer1.Linear.forward.0,Module.layer2.Linear.forward.0 ``` # FAQ 1. 
图比对场景,节点呈现灰色,且没有精度比对数据,怎么处理? diff --git a/debug/accuracy_tools/msprobe/docs/22.visualization_MindSpore.md b/debug/accuracy_tools/msprobe/docs/22.visualization_MindSpore.md index aad82503a9451b8e42d454927f99567f5c344792..2b8a87c5069a15b710a9b8323fb9224227f20a09 100644 --- a/debug/accuracy_tools/msprobe/docs/22.visualization_MindSpore.md +++ b/debug/accuracy_tools/msprobe/docs/22.visualization_MindSpore.md @@ -2,7 +2,7 @@ 分级可视化工具将msprobe工具dump的精度数据进行解析,还原模型图结构,实现模型各个层级的精度数据比对,方便用户理解模型结构、分析精度问题。 -工具支持MindSpore版本:2.4.0 +工具支持MindSpore版本:>=2.4.0 ## 更新通知 @@ -65,7 +65,6 @@ msprobe -f mindspore graph -i ./compare.json -o ./output | -oc 或 --overflow_check | 是否开启溢出检测模式,开启后会在输出vis文件中(`compare_{timestamp}.vis或build_{timestamp}.vis`)对每个溢出节点进行标记溢出等级,溢出等级说明参考[溢出等级说明](#312-溢出等级说明) | 否 | | -f 或 --fuzzy_match | 是否开启模糊匹配,bool类型。模糊匹配说明参考[匹配说明](#311-匹配说明) | 否 | | -cs 或 --complete_stack | 是否使用完整的堆栈信息,bool类型。默认使用精简的堆栈信息,数据量小有助于增加流畅度。完整堆栈和精简堆栈信息参考[堆栈信息说明](#72-堆栈信息说明) | 否 | -| -mm 或 --multi_mapping | 一对一、一对多、多对一、多对多节点映射,例如待调试侧若干小算子与标杆侧融合算子比对等场景,需要指定自定义映射文件*.yaml。自定义映射文件的格式请参见[自定义映射文件(multi)](#73-自定义映射文件multi) | 否 | #### 3.1.1 匹配说明 @@ -396,7 +395,6 @@ tensorboard --logdir out_path ![vis_precision_info.png](./img/visualization/vis_precision_info.png) ### 5.5 未匹配节点筛选 -节点匹配规则: 参考[匹配说明](#311-匹配说明) ,不符合匹配规则的节点为无匹配节点,颜色标灰。适用于排查两个模型结构差异的场景。 @@ -410,25 +408,56 @@ tensorboard --logdir out_path ## 6.图比对说明 -### 颜色 +### 6.1 颜色 颜色越深,精度比对差异越大,越可疑,具体信息可见浏览器页面左下角颜色图例。 -### 疑似有精度问题判定 - -#### 真实数据模式 +#### 6.1.1 真实数据模式 节点中所有输入的最小双千指标和所有输出的最小双千分之一指标的差值,反映了双千指标的下降情况,**值越大精度差距越大,颜色标记越深**。 ``One Thousandth Err Ratio(双千分之一)精度指标:Tensor中的元素逐个与对应的标杆数据对比,相对误差小于千分之一的比例占总元素个数的比例,比例越接近1越好`` -#### 统计信息模式 +如果调试侧(NPU)节点的output指标中的最大值(MAX)或最小值(MIN)中存在 nan/inf/-inf,直接标记为最深颜色。 + +#### 6.1.2 统计信息模式 节点中输出的统计量相对误差,**值越大精度差距越大,颜色标记越深**。 ``相对误差:abs((npu统计值 - bench统计值) / bench统计值)`` -#### md5模式 +如果调试侧(NPU)节点的output指标中的最大值(MAX)或最小值(MIN)中存在 nan/inf/-inf,直接标记为最深颜色。 + +#### 6.1.3 md5模式 节点中任意输入输出的md5值不同。 +### 6.2 指标说明 + +精度比对从三个层面评估 API 
的精度,依次是:真实数据模式、统计数据模式和 MD5 模式。比对结果分别有不同的指标。 + +**公共指标**: +- name: 参数名称,例如input.0 +- type: 类型,例如mindspore.Tensor +- dtype: 数据类型,例如BFloat32 +- shape: 张量形状,例如[32, 1, 32] +- Max: 最大值 +- Min: 最小值 +- Mean: 平均值 +- Norm: L2-范数 + +**真实数据模式指标**: +- Cosine: tensor 余弦相似度 +- EucDist: tensor 欧式距离 +- MaxAbsErr: tensor 最大绝对误差 +- MaxRelativeErr: tensor 最大相对误差 +- One Thousandth Err Ratio: tensor 相对误差小于千分之一的比例(双千分之一) +- Five Thousandth Err Ratio: tensor 相对误差小于千分之五的比例(双千分之五) + +**统计数据模式指标** +- (Max, Min, Mean, Norm) diff: 统计量绝对误差 +- (Max, Min, Mean, Norm) RelativeErr: 统计量相对误差 + +**MD5模式指标** +- md5: CRC-32 值 + ## 7.附录 ### 7.1 自定义映射文件(Layer) @@ -521,27 +550,6 @@ yaml文件中只需配置MindSpore与PyTorch模型代码中功能一致但名称 ] } ``` -### 7.3 自定义映射文件(multi) -支持一对一、一对多、多对一、多对多节点映射配置,**多个节点使用英文逗号,分隔开**。 - -配置多个节点时,如果待配置节点为Cell.layer3.Linear.forward.0、Cell.layer4.Linear.forward.0和Cell.layer5.Linear.forward.0,则Cell.layer4.Linear.forward.0无需配置,仅取首尾节点配置即可(Cell.layer3.Linear.forward.0,Cell.layer5.Linear.forward.0)。注意,**配置节点的先后顺序不能乱(construct.json中的节点名称顺序代表先后顺序,请参考[dump结果文件介绍](./06.data_dump_MindSpore.md#82-动态图场景))**,Cell.layer3.Linear.forward.0在前,就不能配置成Cell.layer5.Linear.forward.0,Cell.layer3.Linear.forward.0,会导致配置无效。 - -```yaml -# 一对一 -Cell.layer.Linear.forward.0: Cell.layer1.Linear.forward.0 -``` -```yaml -# 一对多 -Cell.layer.Linear.forward.0: Cell.layer1.Linear.forward.0,Cell.layer2.Linear.forward.0 -``` -```yaml -# 多对一 -Cell.layer1.Linear.forward.0,Cell.layer2.Linear.forward.0: Cell.layer.Linear.forward.0 -``` -```yaml -# 多对多 -Cell.layer3.Linear.forward.0,Cell.layer5.Linear.forward.0: Cell.layer1.Linear.forward.0,Cell.layer2.Linear.forward.0 -``` # FAQ 1. 图比对场景,节点呈现灰色,且没有精度比对数据,怎么处理? 
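Note on the metrics in section 6 of the two visualization documents above (this is separate from the FAQ entry): the colour rules rest on two formulas quoted in the docs, the statistics-mode relative error `abs((npu - bench) / bench)` and the One Thousandth Err Ratio. The sketch below is a minimal pure-Python illustration of those formulas as the docs word them. It is not msprobe's implementation. The function names, the `threshold` parameter and the handling of a zero benchmark value are assumptions made for illustration only.

```python
import math


def relative_error(npu_value, bench_value):
    """Statistics-mode relative error: abs((npu - bench) / bench).

    Assumption: a zero benchmark yields 0.0 when both values are zero and
    infinity otherwise; the docs do not say how the tool handles this case.
    """
    if bench_value == 0:
        return 0.0 if npu_value == 0 else math.inf
    return abs((npu_value - bench_value) / bench_value)


def err_ratio(npu, bench, threshold=1e-3):
    """Share of elements whose relative error is below `threshold`.

    threshold=1e-3 corresponds to the One Thousandth Err Ratio and
    threshold=5e-3 to the Five Thousandth Err Ratio; closer to 1 is better.
    """
    if len(npu) != len(bench) or not npu:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for n, b in zip(npu, bench) if relative_error(n, b) < threshold)
    return hits / len(npu)


def has_nan_or_inf(values):
    """True if any value is nan/inf/-inf (the docs give such nodes the deepest colour)."""
    return any(math.isnan(v) or math.isinf(v) for v in values)


# Hypothetical flattened tensors: only the last element deviates (by about 10%).
npu = [1.0, 2.0, 3.0, 4.4]
bench = [1.0, 2.0, 3.0, 4.0]
print(err_ratio(npu, bench))                # share of elements within 0.1%
print(has_nan_or_inf([0.5, float("nan")]))  # nan in the Max/Min statistics
```

Keeping the element-wise ratio separate from the nan/inf check mirrors the wording of the docs, where a nan/inf in the NPU-side output Max or Min overrides the ratio-based colouring.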
diff --git a/debug/accuracy_tools/msprobe/docs/28.debugger_save_instruction.md b/debug/accuracy_tools/msprobe/docs/28.debugger_save_instruction.md index 6f4d519d5f61d5efaaffe54a1bde4f140b539f72..f275dc9cfec309f4860f315ec88435cefb0b440c 100644 --- a/debug/accuracy_tools/msprobe/docs/28.debugger_save_instruction.md +++ b/debug/accuracy_tools/msprobe/docs/28.debugger_save_instruction.md @@ -1,12 +1,14 @@ -# 单点保存工具 README +# 单点保存工具 ## 简介 -L0, L1, mix dump存在盲区,网络中的非api/module的输入输出不会被批量dump下来。单点保存提供类似np.save和print的功能和使用体验,可以保存指定的变量。同时针对大模型场景进行了增强,具备以下特性: +L0, L1, mix级别的dump能力存在盲区,网络中的非API或module的输入输出不会被批量dump下来。单点保存提供类似np.save和print的功能和使用体验,可以保存指定的变量。同时针对大模型场景进行了增强,具备以下特性: - 可保存变量的反向梯度结果。 - 能直接保存嵌套结构数据(如 list、dict),无需手动遍历。 -- 自动分 rank 保存。 +- 自动分 Rank 保存。 +- 可分 Step 保存数据。 - 多次调用时会自动计数。 - 可配置保存统计值或者张量。 +- 支持异步保存。 ## 支持场景 仅支持 PyTorch 与 MindSpore 的动态图场景。 @@ -15,14 +17,16 @@ L0, L1, mix dump存在盲区,网络中的非api/module的输入输出不会被 ### 配置文件说明 -通用配置: +通用配置 (细节详见[通用配置说明](./02.config_introduction.md#11-通用配置) ): | 参数 | 解释 | 是否必选 | | -------- |-------------------------------------------| -------- | | task | dump 的任务类型,str 类型。 单点保存场景仅支持传入"statistics", "tensor"。 | 是 | | level | dump 级别,str 类型,根据不同级别采集不同数据。单点保存场景传入"debug"。 | 是 | -| dump_path | 设置 dump 数据目录路径,str 类型。细节详见[通用配置说明](./02.config_introduction.md#11-通用配置) | 是 | -| rank | 指定对某张卡上的数据进行采集,list[Union[int, str]] 类型。细节详见[通用配置说明](./02.config_introduction.md#11-通用配置) | 否 | +| dump_path | 设置 dump 数据目录路径,str 类型。 | 是 | +| rank | 指定对某张卡上的数据进行采集,list[Union[int, str]] 类型。 | 否 | +| step | 指定采集某个 Step 的数据,list[Union[int, str]] 类型。 | 否 | +| async_dump | 异步 dump 开关,bool 类型。 | 否 | "statistics" 任务子配置项: | 参数 | 解释 | 是否必选 | @@ -33,9 +37,9 @@ L0, L1, mix dump存在盲区,网络中的非api/module的输入输出不会被 ### 接口调用说明 -调用PrecisionDebugger.save,传入需要保存的变量,指定变量名称以及是否需要保存反向数据。接口入参说明详见[pytorch单点保存接口](./05.data_dump_PyTorch.md#19-save),[mindspore单点保存接口](./06.data_dump_MindSpore.md#615-save) 
+调用PrecisionDebugger.save,传入需要保存的变量,指定变量名称以及是否需要保存反向数据。接口入参说明详见[PyTorch单点保存接口](./05.data_dump_PyTorch.md#19-save),[MindSpore单点保存接口](./06.data_dump_MindSpore.md#615-save) -### 实例(以pytorch场景为例) +### 实例(以PyTorch场景为例,MindSpore场景只需要从msprobe.mindspore模块导包即可) 配置文件 ```json @@ -43,7 +47,9 @@ L0, L1, mix dump存在盲区,网络中的非api/module的输入输出不会被 "task": "statistics", "dump_path": "./dump_path", "rank": [], + "step": [], "level": "debug", + "async_dump": false, "statistics": { "summary_mode": "statistics" } @@ -53,7 +59,7 @@ L0, L1, mix dump存在盲区,网络中的非api/module的输入输出不会被 初始化 ```python # 训练启动py脚本 -from mindspore.pytorch import PrecisionDebugger +from msprobe.pytorch import PrecisionDebugger debugger = PrecisionDebugger("./config.json") for data, label in data_loader: # 执行模型训练 @@ -64,7 +70,7 @@ for data, label in data_loader: 初始化(无配置文件) ```python # 训练启动py脚本 -from mindspore.pytorch import PrecisionDebugger +from msprobe.pytorch import PrecisionDebugger debugger = PrecisionDebugger(dump_path="dump_path", level="debug") for data, label in data_loader: # 执行模型训练 @@ -75,7 +81,7 @@ for data, label in data_loader: 调用保存接口 ```python # 训练过程中被调用py文件 -from mindspore.pytorch import PrecisionDebugger +from msprobe.pytorch import PrecisionDebugger dict_variable = {"key1": "value1", "key2": [1, 2]} PrecisionDebugger.save(dict_variable, "dict_variable", save_backward=False) @@ -83,12 +89,13 @@ PrecisionDebugger.save(dict_variable, "dict_variable", save_backward=False) ## 输出结果 * **"task" 配置为 "statistics" 场景** :在 dump 目录下会生成包含变量统计值信息的 `debug.json` 文件。 - * **"task" 配置为 "tensor" 场景** :除了在 dump 目录下生成包含变量统计值信息的 `debug.json` 文件外,还会在 dump 子目录 `dump_tensor_data` 中保存张量二进制文件,文件名称格式为 `{variable_name}{grad_flag}.{count}.tensor.{indexes}.{file_suffix}`。 + `debug.json` 中统计值的key命名格式为 `{variable_name}{grad_flag}.{count}.debug`。 + * **"task" 配置为 "tensor" 场景** :除了在 dump 目录下生成包含变量统计值信息的 `debug.json` 文件外,还会在 dump 子目录 `dump_tensor_data` 中保存张量二进制文件,文件名称格式为 `{variable_name}{grad_flag}.{count}.debug.{indexes}.{file_suffix}`。 - 
variable_name: 传入save接口的变量名称。 - grad_flag: 反向数据标识,反向数据为"_grad",正向数据为""。 - count: 调用计数,多次以相同变量名称调用时的计数。 - - indexes: 索引,在保存嵌套结构数据时的索引。例如:嵌套结构为`{"key1": "value1", "key2": ["value2", "value3"]}`,"value2"的索引为"key2.0" - - file_suffix:文件后缀,pytorch场景为"pt",mindspore场景为"npy" + - indexes: 索引,在保存嵌套结构数据时的索引。例如:嵌套结构为`{"key1": "value1", "key2": ["value2", "value3"]}`,"value2"的索引为"key2.0"。 + - file_suffix:文件后缀,PyTorch场景为"pt",MindSpore场景为"npy"。 diff --git a/debug/accuracy_tools/msprobe/docs/31.config_checking.md b/debug/accuracy_tools/msprobe/docs/31.config_check.md similarity index 34% rename from debug/accuracy_tools/msprobe/docs/31.config_checking.md rename to debug/accuracy_tools/msprobe/docs/31.config_check.md index 6d47c744b2dfd578a8c15d71d3d7fe1ae000746f..4bbe9162c8e98b630887b77a0f7657ec4d686aba 100644 --- a/debug/accuracy_tools/msprobe/docs/31.config_checking.md +++ b/debug/accuracy_tools/msprobe/docs/31.config_check.md @@ -1,8 +1,8 @@ -# config checking +# config check ## 介绍 -该工具主要适用于对比两个环境下可能影响训练精度的配置差异,目前只支持pytorch,包括: +该工具主要适用于对比两个环境下可能影响训练精度的配置差异,支持mindspore和pytorch两个框架,包括: - 环境变量 - 三方库版本 @@ -18,44 +18,69 @@ ## 使用说明 -首先需要有两个用来训练的环境, 工具会采集两个环境下影响精度的配置,并支持比对。 +用户需要在两个待比对的训练的环境上分别进行数据采集, 工具会采集两个环境下影响精度的配置,采集结果上传到同一机器进行比对。 -1、数据采集。在其中一个环境执行如下操作: +### 数据采集 -在训练脚本开始处插入如下代码: +#### 静态数据采集 + +静态数据采集仅支持环境变量,三方库版本及训练超参采集,其中环境变量,三方库版本默认采集,训练超参采集需要用户传入启动训练的 shell 脚本路径或 yaml 配置文件, +支持多个输入,不传入表示不采集。 + +启动命令如下 +```shell +msprobe -f pytorch/mindspore config_check -d **.sh **.yaml -o output_path ``` -from msprobe.pytorch.config_checking.checkers.random_checker import apply_patches -apply_patches() +-f 代表训练框架,传入pytorch或mindspore,必选。 + +-d 代表数据采集模式,可传入启动训练的 shell 脚本路径或 yaml 配置文件路径,可选,不传入代表不采集。 + +-o 代表输出路径,可选,默认为 config_check_pack.zip。 + +#### 动态数据采集 + + +在训练流程执行到的第一个python脚本开始处插入如下代码: ``` +from msprobe.core.config_check import ConfigChecker +ConfigChecker.apply_patches(fmk) +``` + +说明: + +- fmk:训练框架。可选 pytorch 和 mindspore ,不传默认为 pytorch。 在模型初始化好之后插入如下代码: ``` -from 
msprobe.pytorch.config_checking.config_checker import ConfigChecker -ConfigChecker(model, shell_path, output_zip_path) +from msprobe.core.config_check import ConfigChecker +ConfigChecker(model, shell_path, output_zip_path, fmk) ``` 说明: - model:初始化好的模型。不传或缺省就不会采集权重和数据集。 -- shell_path:训练脚本路径,类型为列表,传入一个或多个训练配置/启动脚本。不传或缺省就不会采集超参。 +- shell_path:动态采集模式下支持 **megatron** 训练超参自动捕获,使用 **megatron** 时推荐不传入,其他情况下可传入训练脚本路径,类型为列表,传入一个或多个训练配置/启动脚本。不传或缺省就不会采集超参。 - output_zip_path:输出zip包的路径,不传默认为"./config_check_pack.zip"。 +- fmk:当前是什么框架。可选 pytorch 和 mindspore ,不传默认为 pytorch。 -采集完成后会得到一个zip包,里面包括各项[影响精度的配置](#介绍)。 +采集完成后会得到一个zip包,里面包括各项[影响精度的配置](#介绍)。会分rank和step存储,其中step为micro_step。 -2、在另一个环境上执行上述操作,得到另一个zip包 +在另一个环境上执行上述操作,得到另一个zip包 -3、将两个zip包传到同一个环境下,使用如下命令进行比对: +### 数据比对 -``` -msprobe -f pytorch config_checking -c bench_zip_path cmp_zip_path -o output_path +将两个zip包传到同一个环境下,使用如下命令进行比对: + +```shell +msprobe -f pytorch config_check -c bench_zip_path cmp_zip_path -o output_path ``` 其中**bench_zip_path** 为标杆侧采集到的数据, **cmp_zip_path** 为待对比侧采集到的数据。 **output_path 会被删掉再新建**,不传默认为"./config_check_result", 在 **output_path** 里会生成2个目录和1个文件: -- bench:bench_zip_path里打包的数据 -- cmp:cmp_zip_path里打包的数据 -- result.xlsx:比对结果。里面会有多个sheet页,其中**summary**总览通过情况,其余页是具体检查项的详情 +- bench:bench_zip_path里打包的数据。 +- cmp:cmp_zip_path里打包的数据。 +- result.xlsx:比对结果。里面会有多个sheet页,其中**summary**总览通过情况,其余页是具体检查项的详情。其中step为micro_step。 ## 通过标准 diff --git a/debug/accuracy_tools/msprobe/docs/32.checkpoint_compare.md b/debug/accuracy_tools/msprobe/docs/32.checkpoint_compare.md deleted file mode 100644 index c49b4bfc8ee079cfdf2583c0c84372fe74aec6a7..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/docs/32.checkpoint_compare.md +++ /dev/null @@ -1,60 +0,0 @@ -# 权重比对 - -msprobe 工具提供大模型权重比对功能。当前支持pytorch下megatron/mindspeed不同模型并行策略下的权重互相比对。 - -> **Attention:** Ensure megatron in the PYTHONPATH to load a megatron checkpoint. - -## 1. 
工具安装 - -[msprobe工具安装](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/01.installation.md) - -## 2. 工具使用 -```shell -msprobe -f pytorch config_checking -c PATH/TO/A/CHECKPOINT PATH/TO/THE/OTHER/CHECKPOINT -s -o PATH/FOR/OUTPUT -``` - -**命令行参数说明**: - -| 参数名 | 说明 | 是否必选 | -|------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------| -| -c --compare | 需要比较的两个checkpoint路径 | 是 | -| -s --ckpt-sim | store_true。使能权重比对功能,否则为配置比对 | 是 | -| -o 或 --out | 权重比对结果文件存盘目录,默认为'ckpt_compare_out.json' | 否 | - - - - -Sample stdout: -```txt -Loaded checkpoint from iteration x -Found xxx total parameters across all ranks -Loaded checkpoint from iteration x -Found xxx total parameters across all ranks -2025-03-25 08:24:48 (552546) [WARNING] Parameters not in ckpt2: set() -2025-03-25 08:24:48 (552546) [WARNING] Parameters not in ckpt1: set() -... 
-[INFO] Comparison results written to ckpt_compare_out.json -``` - -Sample result: -```json -{ - "embedding.word_embeddings.weight": { - "l2": 0.0, - "cos": 1.0, - "numel": 25755648, - "shape": [ - 50304, - 512 - ] - }, - "decoder.layers.0.input_layernorm.bias": { - "l2": 0.0, - "cos": 0.9999999403953552, - "numel": 512, - "shape": [ - 512 - ] - } -} -``` \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/docs/data_dump_MindSpore/dynamic_graph_quick_start_example.md b/debug/accuracy_tools/msprobe/docs/data_dump_MindSpore/dynamic_graph_quick_start_example.md index a05d7aaaefec6eec9ae70c5c3d1b7f2b6da2a220..14bb2cd2c54793b5a61af5e106bcfcd484e8ecef 100644 --- a/debug/accuracy_tools/msprobe/docs/data_dump_MindSpore/dynamic_graph_quick_start_example.md +++ b/debug/accuracy_tools/msprobe/docs/data_dump_MindSpore/dynamic_graph_quick_start_example.md @@ -51,6 +51,7 @@ debugger = PrecisionDebugger(config_path=config_path) # 设置 MindSpore 设备上下文 context.set_context(mode=ms.PYNATIVE_MODE, device_target="Ascend", device_id=0) +print("Context set successfully. 
Please wait for the training task.") # 定义卷积层 def conv_layer(in_channels, out_channels, kernel_size, stride=1, padding=0, pad_mode="valid", has_bias=True): diff --git a/debug/accuracy_tools/msprobe/docs/img/compare_result.png b/debug/accuracy_tools/msprobe/docs/img/compare_result.png index b6d7ec6dfcbc44b4b7056e1297a481f495ceb86e..b321ebed8c7ea04357b57da81cc31ee038d4b94f 100644 Binary files a/debug/accuracy_tools/msprobe/docs/img/compare_result.png and b/debug/accuracy_tools/msprobe/docs/img/compare_result.png differ diff --git a/debug/accuracy_tools/msprobe/mindspore/__init__.py b/debug/accuracy_tools/msprobe/mindspore/__init__.py index cbdab34f0446ee12c07b2aba8b4f75018496eda6..d92f9f2413fda84a05d1fe6aeedf5d9dbca24f2f 100644 --- a/debug/accuracy_tools/msprobe/mindspore/__init__.py +++ b/debug/accuracy_tools/msprobe/mindspore/__init__.py @@ -25,5 +25,3 @@ except ImportError: from msprobe.mindspore.debugger.precision_debugger import PrecisionDebugger from msprobe.mindspore.common.utils import seed_all, MsprobeStep, MsprobeInitStep from msprobe.mindspore.monitor.module_hook import TrainerMon - -os.environ["MS_HOOK_ENABLE"] = "on" diff --git a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_accuracy_checker.py b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_accuracy_checker.py index 557d731e042913da3a622035219ec8dea0409ab4..701d2cb3270db9721121c3605d02122f46a96edc 100644 --- a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_accuracy_checker.py +++ b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_accuracy_checker.py @@ -14,8 +14,10 @@ # limitations under the License. 
import os +from dataclasses import dataclass +from typing import Any, Optional from tqdm import tqdm - +import numpy as np from msprobe.core.common.const import Const, CompareConst from msprobe.core.common.file_utils import FileOpen, create_directory, write_csv, load_json, load_yaml from msprobe.core.common.utils import add_time_as_suffix @@ -28,6 +30,13 @@ from msprobe.mindspore.api_accuracy_checker.utils import (check_and_get_from_jso from msprobe.mindspore.common.const import MsCompareConst from msprobe.mindspore.common.log import logger from msprobe.mindspore.api_accuracy_checker import torch_mindtorch_importer +from msprobe.mindspore import PrecisionDebugger +from msprobe.core.data_dump.data_processor.base import (ModuleBackwardInputs, ModuleBackwardOutputs, + ModuleForwardInputsOutputs) +from msprobe.core.common.file_utils import create_directory +from msprobe.core.data_dump.data_collector import build_data_collector +from msprobe.core.common.utils import Const, print_tools_ends_info, DumpPathAggregation +from msprobe.core.data_dump.data_processor.base import ModuleForwardInputsOutputs, ModuleBackwardInputsOutputs cur_path = os.path.dirname(os.path.realpath(__file__)) yaml_path = os.path.join(cur_path, MsCompareConst.SUPPORTED_API_LIST_FILE) @@ -59,13 +68,128 @@ class ProcessResultPacket: self.err_msg = err_msg +@dataclass +class Config: + execution_mode: str + dump_path: str + task: str + level: str + scope: Optional[Any] + list: Optional[Any] + framework: str + data_mode: str + file_format: str + dump_tensor_data_dir: str + async_dump: bool + summary_mode: Optional[Any] = None + + class ApiAccuracyChecker: def __init__(self, args): self.api_infos = dict() self.data_manager = DataManager(args.out_path, args.result_csv_path) # 在初始化时实例化 DataManager + self.save_error_data = args.save_error_data + if self.save_error_data: + config, dump_path_aggregation = self.init_save_error_data(args) + self.data_collector = build_data_collector(config) + 
self.data_collector.update_dump_paths(dump_path_aggregation) @staticmethod - def run_and_compare_helper(api_info, api_name_str, api_input_aggregation, forward_or_backward): + def init_save_error_data(args): + config = Config( + execution_mode="pynative", + dump_path=f"{args.out_path}", + dump_tensor_data_dir=f"{args.out_path}", + task="tensor", # 任务类型,模拟保存tensor数据 + level="L1", # 级别 + scope=None, # 作用域 (None) + list=None, # API 列表 (None) + framework=Const.MS_FRAMEWORK, # 框架类型 + data_mode="all", + file_format="npy", + async_dump=False + ) + + dump_dir = f"{args.out_path}" + dump_data_dir = os.path.join(dump_dir, "error_data") + create_directory(dump_data_dir) + dump_path_aggregation = DumpPathAggregation() + dump_path_aggregation.dump_file_path = os.path.join(dump_dir, "dump.json") + dump_path_aggregation.stack_file_path = os.path.join(dump_dir, "stack.json") + dump_path_aggregation.dump_tensor_data_dir = dump_data_dir + return config, dump_path_aggregation + + @staticmethod + def prepare_api_input_aggregation(api_info, forward_or_backward=Const.FORWARD): + """ + Args: + api_info: ApiInfo + forward_or_backward: str + Returns: + ApiInputAggregation + """ + forward_inputs = api_info.get_compute_element_list(Const.FORWARD, Const.INPUT) + kwargs = api_info.get_kwargs() + if forward_or_backward == Const.FORWARD: + gradient_inputs = None + else: + gradient_inputs = api_info.get_compute_element_list(Const.BACKWARD, Const.INPUT) + return ApiInputAggregation(forward_inputs, kwargs, gradient_inputs) + + @staticmethod + def is_api_checkable(api_name_str): + ''' + Args: + api_name_str: str, e.g. 
"MintFunctional.relu.0.forward", key in data field of api_info.json + Returns: + is_checkable: bool + Description: + tell whether this api is checkable based on the key in "data" dict in api_info.json + ''' + api_name_str_list = api_name_str.split(Const.SEP) + if len(api_name_str_list) < MsCompareConst.API_NAME_STR_LENGTH: + return False + api_type_str = api_name_str_list[0] + real_api_str = Const.SEP.join(api_name_str_list[1:-2]) + api_list = load_yaml(yaml_path) + supported_tensor_api_list = api_list.get(MsCompareConst.SUPPORTED_TENSOR_LIST_KEY) + supported_fusion_api_list = MsCompareConst.SUPPORTED_FUSION_LIST + if api_type_str in (MsCompareConst.MINT, MsCompareConst.MINT_FUNCTIONAL) \ + and global_context.get_framework() == Const.MS_FRAMEWORK: + return True + if api_type_str in MsCompareConst.MT_VALID_API_TYPES \ + and global_context.get_framework() == Const.MT_FRAMEWORK: + return True + if api_type_str == MsCompareConst.TENSOR_API and real_api_str in supported_tensor_api_list \ + and global_context.get_framework() == Const.MS_FRAMEWORK: + return True + if api_type_str == MsCompareConst.FUNCTIONAL_API and real_api_str in supported_fusion_api_list \ + and global_context.get_framework() == Const.MS_FRAMEWORK: + return True + return False + + def post_forward_hook(self, api_or_module_name, primitive_instance, args, kwargs, output): + self.data_collector.update_api_or_module_name(api_or_module_name) + module_input_output = ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=output) + self.data_collector.forward_data_collect_only_tensor( + api_or_module_name, + primitive_instance, + os.getpid(), + module_input_output + ) + + def backward_hook(self, api_or_module_name, module, grad_input, grad_output): + self.data_collector.update_api_or_module_name(api_or_module_name) + + module_input_output = ModuleBackwardInputsOutputs(grad_input=grad_output, grad_output=grad_input) + self.data_collector.backward_data_collect_only_tensor( + api_or_module_name, + module, + 
os.getpid(), + module_input_output + ) + + def run_and_compare_helper(self, api_info, api_name_str, api_input_aggregation, forward_or_backward): """ Args: api_info: ApiInfo @@ -83,13 +207,22 @@ class ApiAccuracyChecker: """ # get output if global_context.get_is_constructed(): - # constructed situation, need use constructed input to run mindspore api getting tested_output - tested_outputs = api_runner(api_input_aggregation, api_name_str, - forward_or_backward, global_context.get_framework()) + if forward_or_backward == Const.FORWARD: + tested_outputs, inputs, kwargs, forward_result_tuple = api_runner(api_input_aggregation, api_name_str, + forward_or_backward, + global_context.get_framework()) + elif forward_or_backward == Const.BACKWARD: + tested_outputs, gradient_inputs, backward_result_tuple = api_runner(api_input_aggregation, api_name_str, + forward_or_backward, + global_context.get_framework()) + else: + tested_outputs = api_runner(api_input_aggregation, api_name_str, + forward_or_backward, global_context.get_framework()) else: tested_outputs = api_info.get_compute_element_list(forward_or_backward, Const.OUTPUT) bench_outputs = api_runner(api_input_aggregation, api_name_str, forward_or_backward, Const.PT_FRAMEWORK) + tested_outputs = trim_output_compute_element_list(tested_outputs, forward_or_backward) bench_outputs = trim_output_compute_element_list(bench_outputs, forward_or_backward) if len(tested_outputs) != len(bench_outputs): @@ -114,64 +247,26 @@ class ApiAccuracyChecker: compare_result_dict.get(CompareConst.MAX_ABS_ERR).pass_status == CompareConst.PASS: status = CompareConst.PASS err_msg = "" + else: status = CompareConst.ERROR err_msg = (compare_result_dict.get(CompareConst.COSINE).err_msg + compare_result_dict.get(CompareConst.MAX_ABS_ERR).err_msg) + if forward_or_backward == Const.FORWARD and self.save_error_data \ + and global_context.get_is_constructed(): + api_name_str_backward = f"{api_name_str}{Const.SEP}{Const.FORWARD}" + 
self.post_forward_hook(api_name_str_backward, None, inputs, kwargs, forward_result_tuple) + + if forward_or_backward == Const.BACKWARD and self.save_error_data \ + and global_context.get_is_constructed(): + api_name_str_backward = f"{api_name_str}{Const.SEP}{Const.BACKWARD}" + self.backward_hook(api_name_str_backward, None, gradient_inputs, backward_result_tuple) + basic_info_status = \ BasicInfoAndStatus(api_name_with_slot, bench_dtype, tested_dtype, shape, status, err_msg) output_list.append(tuple([api_name_str, forward_or_backward, basic_info_status, compare_result_dict])) return output_list - @staticmethod - def prepare_api_input_aggregation(api_info, forward_or_backward=Const.FORWARD): - """ - Args: - api_info: ApiInfo - forward_or_backward: str - Returns: - ApiInputAggregation - """ - forward_inputs = api_info.get_compute_element_list(Const.FORWARD, Const.INPUT) - kwargs = api_info.get_kwargs() - if forward_or_backward == Const.FORWARD: - gradient_inputs = None - else: - gradient_inputs = api_info.get_compute_element_list(Const.BACKWARD, Const.INPUT) - return ApiInputAggregation(forward_inputs, kwargs, gradient_inputs) - - @staticmethod - def is_api_checkable(api_name_str): - ''' - Args: - api_name_str: str, e.g. 
"MintFunctional.relu.0.forward", key in data field of api_info.json - Returns: - is_checkable: bool - Description: - tell whether this api is checkable based on the key in "data" dict in api_info.json - ''' - api_name_str_list = api_name_str.split(Const.SEP) - if len(api_name_str_list) < MsCompareConst.API_NAME_STR_LENGTH: - return False - api_type_str = api_name_str_list[0] - real_api_str = Const.SEP.join(api_name_str_list[1:-2]) - api_list = load_yaml(yaml_path) - supported_tensor_api_list = api_list.get(MsCompareConst.SUPPORTED_TENSOR_LIST_KEY) - supported_fusion_api_list = MsCompareConst.SUPPORTED_FUSION_LIST - if api_type_str in (MsCompareConst.MINT, MsCompareConst.MINT_FUNCTIONAL) \ - and global_context.get_framework() == Const.MS_FRAMEWORK: - return True - if api_type_str in MsCompareConst.MT_VALID_API_TYPES \ - and global_context.get_framework() == Const.MT_FRAMEWORK: - return True - if api_type_str == MsCompareConst.TENSOR_API and real_api_str in supported_tensor_api_list \ - and global_context.get_framework() == Const.MS_FRAMEWORK: - return True - if api_type_str == MsCompareConst.FUNCTIONAL_API and real_api_str in supported_fusion_api_list \ - and global_context.get_framework() == Const.MS_FRAMEWORK: - return True - return False - def parse(self, api_info_path): api_info_dict = load_json(api_info_path) @@ -183,9 +278,9 @@ class ApiAccuracyChecker: MsCompareConst.TENSOR_TASK)) try: framework = check_and_get_from_json_dict(api_info_dict, MsCompareConst.FRAMEWORK, - "framework field in api_info.json", accepted_type=str, - accepted_value=(Const.MS_FRAMEWORK, - Const.MT_FRAMEWORK)) + "framework field in api_info.json", accepted_type=str, + accepted_value=(Const.MS_FRAMEWORK, + Const.MT_FRAMEWORK)) except Exception as e: framework = Const.MS_FRAMEWORK logger.warning(f"JSON parsing error in framework field: {e}") @@ -301,4 +396,4 @@ class ApiAccuracyChecker: elif process_result_packet.process_status == MsCompareConst.ProcessStatus.EXCEPTION_SKIP: 
self.data_manager.record_exception_skip(api_name_str, Const.BACKWARD, process_result_packet.err_msg) - self.data_manager.save_results(api_name_str) + self.data_manager.save_results(api_name_str) \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_runner.py b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_runner.py index 82c2790452776733f924eccb82da3dc6a339594f..e1640aab9e038d075f6879693e97352d5f3eb001 100644 --- a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_runner.py +++ b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/api_runner.py @@ -13,6 +13,8 @@ # See the License for the specific language governing permissions and # limitations under the License. +import os +import numpy as np import mindspore from mindspore import ops from msprobe.core.common.const import Const @@ -38,7 +40,6 @@ else: import torch - class ApiInputAggregation: def __init__(self, inputs, kwargs, gradient_inputs) -> None: """ @@ -189,6 +190,8 @@ class ApiRunner: forward_result = api_instance(*inputs, **kwargs) # can be single tensor or tuple forward_result_tuple = convert_to_tuple(forward_result) res_compute_element_list = [ComputeElement(parameter=api_res) for api_res in forward_result_tuple] + if api_platform == Const.MS_FRAMEWORK or api_platform == Const.MT_FRAMEWORK: + return res_compute_element_list, inputs, kwargs, forward_result_tuple else: if gradient_inputs is None: err_msg = f"ApiRunner.run_api failed: run backward api but gradient_inputs is missing" @@ -206,6 +209,7 @@ class ApiRunner: backward_result = grad_func(*inputs, gradient_inputs) # can be single tensor or tuple backward_result_tuple = convert_to_tuple(backward_result) res_compute_element_list = [ComputeElement(parameter=api_res) for api_res in backward_result_tuple] + return res_compute_element_list, gradient_inputs, backward_result_tuple else: # set requires_grad requires_grad_index = [] diff --git 
a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/cmd_parser.py b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/cmd_parser.py index 4af92bfa1002c419d0bd84e5dfd250b712b57136..a55df65a3772c99a6b63ebff171adb710714ab90 100644 --- a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/cmd_parser.py +++ b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/cmd_parser.py @@ -39,6 +39,8 @@ def add_api_accuracy_checker_argument(parser): help=" The ut task result out path.") parser.add_argument("-csv_path", "--result_csv_path", dest="result_csv_path", default="", type=str, required=False, help=" the exit csv for continue") + parser.add_argument('-save_error_data', dest="save_error_data", action="store_true", + help=" Save compare failed api output.", required=False) def multi_add_api_accuracy_checker_argument(parser): @@ -49,6 +51,8 @@ def multi_add_api_accuracy_checker_argument(parser): help=" The ut task result out path.") parser.add_argument("-csv_path", "--result_csv_path", dest="result_csv_path", default="", type=str, required=False, help=" the exit csv for continue") + parser.add_argument('-save_error_data', dest="save_error_data", action="store_true", + help=" Save compare failed api output.", required=False) #以下属于多线程参数 parser.add_argument("-d", "--device", dest="device_id", nargs='+', type=int, help=" set device id to run ut, must be unique and in range 0-7", diff --git a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/multi_api_accuracy_checker.py b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/multi_api_accuracy_checker.py index 1913675ad162bf690fc0aed5fc84c245ae4f73ca..c1991c078f10194b84ec4fe5b7c39992827bf5cb 100644 --- a/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/multi_api_accuracy_checker.py +++ b/debug/accuracy_tools/msprobe/mindspore/api_accuracy_checker/multi_api_accuracy_checker.py @@ -33,6 +33,14 @@ from msprobe.mindspore.api_accuracy_checker.multi_data_manager import 
MultiDataM from msprobe.mindspore.common.log import logger from msprobe.mindspore.common.const import MsCompareConst +from msprobe.mindspore import PrecisionDebugger +from msprobe.core.data_dump.data_processor.base import (ModuleBackwardInputs, ModuleBackwardOutputs, + ModuleForwardInputsOutputs) +from msprobe.core.common.file_utils import create_directory +from msprobe.core.data_dump.data_collector import build_data_collector +from msprobe.core.common.utils import Const, print_tools_ends_info, DumpPathAggregation +from msprobe.core.data_dump.data_processor.base import ModuleForwardInputsOutputs, ModuleBackwardInputsOutputs + class MultiApiAccuracyChecker(ApiAccuracyChecker): def __init__(self, args): @@ -51,6 +59,12 @@ class MultiApiAccuracyChecker(ApiAccuracyChecker): # 初始化一个属性来存储当前的设备ID(用于日志中显示) self.current_device_id = None + self.save_error_data = args.save_error_data + if self.save_error_data: + config, dump_path_aggregation = self.init_save_error_data(args) + self.data_collector = build_data_collector(config) + self.data_collector.update_dump_paths(dump_path_aggregation) + def process_on_device(self, device_id, api_infos, progress_queue): """ 在特定设备上处理一部分API。 diff --git a/debug/accuracy_tools/msprobe/mindspore/cell_processor.py b/debug/accuracy_tools/msprobe/mindspore/cell_processor.py index 3ca9fb4951e0257b968cead790ba96c9f3ccd0f3..f959d2aef12a41c1ad7f776536bbac44aa2b8551 100644 --- a/debug/accuracy_tools/msprobe/mindspore/cell_processor.py +++ b/debug/accuracy_tools/msprobe/mindspore/cell_processor.py @@ -13,8 +13,22 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-from msprobe.core.data_dump.scope import ModuleRangeScope, MixRangeScope +from collections import OrderedDict + +from mindspore import Tensor +from mindspore.common.hook_handle import HookHandle +from mindspore.ops.operations import _inner_ops as inner + from msprobe.core.common.const import Const +from msprobe.core.common.exceptions import MsprobeException +from msprobe.core.data_dump.scope import ModuleRangeScope, MixRangeScope, BaseScope +from msprobe.mindspore.common.const import Const as MsConst +from msprobe.mindspore.common.log import logger +from msprobe.mindspore.common.utils import ( + is_mindtorch, + get_cells_and_names, + has_kwargs_in_forward_hook +) def get_cell_construct(construct): @@ -28,14 +42,17 @@ def get_cell_construct(construct): class CellProcessor: cell_count = {} cell_stack = [] - api_parent_node = "" + api_parent_node = None module_node = {} + cell_bw_hook_kernels = {} + cell_backward_pre_hook = [] + cell_backward_hook = [] def __init__(self, scope): self.scope = scope if isinstance(scope, (ModuleRangeScope, MixRangeScope)) else None @staticmethod - def set_cell_count(cell_name): + def set_and_get_calls_number(cell_name): if cell_name not in CellProcessor.cell_count: CellProcessor.cell_count[cell_name] = 0 else: @@ -46,42 +63,171 @@ class CellProcessor: def reset_cell_stats(cls): cls.cell_count = {} cls.cell_stack = [] - cls.api_parent_node = "" + cls.api_parent_node = None cls.module_node = {} + cls.cell_bw_hook_kernels = {} + cls.cell_backward_pre_hook = [] + cls.cell_backward_hook = [] - def node_hook(self, name_prefix, start_or_stop, **kwargs): - def begin_hook(cell, input_data): - full_name = self.set_and_get_reserved_name(cell, name_prefix, is_called_by_pre_hook=True) - if CellProcessor.cell_stack: - CellProcessor.module_node[full_name] = CellProcessor.cell_stack[-1] - else: - CellProcessor.module_node[full_name] = None + def register_cell_hook(self, models, build_hook): + if not models: + raise 
MsprobeException(MsprobeException.INVALID_PARAM_ERROR, + 'The model cannot be None, when level is "L0" or "mix"') + + is_registered = False + model_type = Const.MODULE if is_mindtorch() else Const.CELL + cells_and_names_with_index = get_cells_and_names(models) + construct_name = '_call_impl' if is_mindtorch() else '_run_construct' + + for index, cells_and_names in cells_and_names_with_index.items(): + model = models if index == "-1" else models[int(index)] + for name, cell in cells_and_names: + if cell == model: + continue + + if not has_kwargs_in_forward_hook(): + if not hasattr(cell.__class__, 'msprobe_construct'): + setattr(cell.__class__, 'msprobe_construct', True) + if hasattr(cell.__class__, construct_name): + setattr(cell.__class__, construct_name, + get_cell_construct(getattr(cell.__class__, construct_name))) + setattr(cell, 'msprobe_hook', True) + + cell_index = (index + Const.SEP) if index != "-1" else "" + prefix = f'{model_type}{Const.SEP}{cell_index}{name}{Const.SEP}{cell.__class__.__name__}{Const.SEP}' + + forward_pre_hook = self.build_cell_hook(prefix, build_hook) + cell.register_forward_pre_hook(forward_pre_hook) + + if not is_registered: + logger.info("The cell hook function is successfully mounted to the model.") + is_registered = True + + def build_cell_hook(self, cell_name, build_data_hook): + def forward_pre_hook(cell, args): + index = CellProcessor.set_and_get_calls_number(cell_name) + full_forward_name = f'{cell_name}{Const.FORWARD}{Const.SEP}{index}' + full_backward_name = f'{cell_name}{Const.BACKWARD}{Const.SEP}{index}' + + self.set_construct_info_in_pre_hook(full_forward_name) - CellProcessor.cell_stack.append(full_name) - CellProcessor.api_parent_node = full_name + if not hasattr(cell, 'msprobe_forward_hook'): + if is_mindtorch(): + cell.register_forward_hook(forward_hook, prepend=True, with_kwargs=True) + else: + forward_hook_dict = getattr(cell, '_forward_hook', OrderedDict()) + if has_kwargs_in_forward_hook(): + 
forward_hook_with_kwargs_dict = getattr(cell, '_forward_hook_with_kwargs', OrderedDict()) + handle = HookHandle(forward_hook_dict, extra_dict=forward_hook_with_kwargs_dict) + forward_hook_with_kwargs_dict[handle.handle_id] = True + else: + handle = HookHandle(forward_hook_dict) + forward_hook_dict[handle.handle_id] = forward_hook + forward_hook_dict.move_to_end(handle.handle_id, last=False) - if self.scope: - self.scope.begin_module(full_name) + setattr(cell, 'msprobe_forward_hook', True) - def end_hook(cell, input_data, output_data): - if CellProcessor.cell_stack: - CellProcessor.cell_stack.pop() - if CellProcessor.cell_stack: - CellProcessor.api_parent_node = CellProcessor.cell_stack[-1] + def get_backward_hook(backward_data_hook, full_backward_name): + def backward_hook_fn(cell, grad_input, grad_output): + new_output = backward_data_hook(cell, grad_input, grad_output) + self.set_construct_info_in_hook(full_backward_name) + cell.has_pre_hook_called = False + return new_output + return backward_hook_fn + + enable_hooked = sum( + [isinstance(ele, Tensor) and ele.dtype not in MsConst.NonDifferentiableType for ele in args] + ) + if enable_hooked: + backward_hook = OrderedDict() + _, _, backward_data_hook, _ = build_data_hook(BaseScope.Module_Type_Module, full_forward_name) + backward_hook[full_backward_name] = get_backward_hook(backward_data_hook, full_backward_name) + CellProcessor.cell_backward_hook.append(backward_hook) + bw_hook = inner.CellBackwardHook(full_backward_name, cell, + self.cell_backward_hook[-1]) + bw_hook.register_backward_hook() + CellProcessor.cell_bw_hook_kernels[full_forward_name] = bw_hook + + args = bw_hook(*args) + + return args + + def forward_hook(cell, args, kwargs_or_output, output_or_kwargs=None): + index = CellProcessor.cell_count.get(cell_name, 0) + full_forward_name = f'{cell_name}{Const.FORWARD}{Const.SEP}{index}' + full_backward_name = f'{cell_name}{Const.BACKWARD}{Const.SEP}{index}' + + 
self.set_construct_info_in_hook(full_forward_name) + + _, forward_data_hook, backward_data_hook, _ = build_data_hook(BaseScope.Module_Type_Module, + full_forward_name) + hook_result = forward_data_hook(cell, args, kwargs_or_output, output_or_kwargs) + if hook_result is not None: + outputs = hook_result else: - CellProcessor.api_parent_node = None + outputs = output_or_kwargs if has_kwargs_in_forward_hook() else kwargs_or_output - if self.scope: - self.scope.end_module(cell.mindstudio_reserved_name) + bw_hook = CellProcessor.cell_bw_hook_kernels.get(full_forward_name) + if bw_hook: + if not isinstance(outputs, (Tensor, tuple)): + logger.warning("For backward hooks to be called," + " cell output should be a Tensor or a tuple of Tensors" + f" but received {type(outputs)}") + if isinstance(outputs, tuple): + new_outputs = bw_hook(*outputs) + else: + new_outputs = bw_hook(outputs) + if isinstance(outputs, tuple) and len(outputs) == 1: + new_outputs = (new_outputs,) + outputs = new_outputs - return begin_hook if Const.START == start_or_stop else end_hook + def get_backward_pre_hook(full_backward_name, backward_data_hook): + def backward_pre_hook_fn(cell, grad_output): + cell.has_pre_hook_called = True + self.set_construct_info_in_pre_hook(full_backward_name) + if backward_data_hook: + backward_data_hook(cell, (), grad_output) + self.set_construct_info_in_hook(full_backward_name) + cell.has_pre_hook_called = False + return backward_pre_hook_fn - def set_and_get_reserved_name(self, cell, cell_name, is_called_by_pre_hook=False): - if not is_called_by_pre_hook and hasattr(cell, 'has_pre_hook_called') and cell.has_pre_hook_called: - cell.has_pre_hook_called = False + backward_pre_hook = OrderedDict() + backward_data_hook = None if bw_hook else backward_data_hook + backward_pre_hook[full_backward_name] = get_backward_pre_hook(full_backward_name, backward_data_hook) + CellProcessor.cell_backward_pre_hook.append(backward_pre_hook) + bw_pre_hook = 
inner.CellBackwardHook(full_backward_name, cell, + self.cell_backward_pre_hook[-1]) + bw_pre_hook.register_backward_pre_hook() + + if isinstance(outputs, tuple): + result = bw_pre_hook(*outputs) + else: + result = bw_pre_hook(outputs) + if isinstance(outputs, tuple): + if len(outputs) == 1: + result = (result,) + if len(result) != len(outputs): + raise TypeError( + f"The backward pre hook return value size is {len(result)} " + f"not equal to output size {len(outputs)}" + ) + return result + + return forward_pre_hook + + def set_construct_info_in_pre_hook(self, full_name): + if self.cell_stack: + CellProcessor.module_node[full_name] = self.cell_stack[-1] else: - if is_called_by_pre_hook: - cell.has_pre_hook_called = True - index = self.set_cell_count(cell_name) - cell.mindstudio_reserved_name = cell_name + Const.SEP + str(index) - return cell.mindstudio_reserved_name + CellProcessor.module_node[full_name] = None + CellProcessor.cell_stack.append(full_name) + CellProcessor.api_parent_node = full_name + if self.scope: + self.scope.begin_module(full_name) + + def set_construct_info_in_hook(self, full_name): + if self.cell_stack: + CellProcessor.cell_stack.pop() + CellProcessor.api_parent_node = CellProcessor.cell_stack[-1] if self.cell_stack else None + if self.scope: + self.scope.end_module(full_name) diff --git a/debug/accuracy_tools/msprobe/mindspore/code_mapping/graph_parser.py b/debug/accuracy_tools/msprobe/mindspore/code_mapping/graph_parser.py index 262e3b6fe0703f9ea1bc757ae3873e4fc3c99fac..e09178d6dce5da7adc382f7ee62e8e32fca4aac4 100644 --- a/debug/accuracy_tools/msprobe/mindspore/code_mapping/graph_parser.py +++ b/debug/accuracy_tools/msprobe/mindspore/code_mapping/graph_parser.py @@ -111,8 +111,9 @@ class Parser: scope_match = scope_pattern.search(text, end_pos) scope = scope_match.group(1) if scope_match else "" - id_pattern = re.compile(r'.*cnode_primal_attrs:' - r'\s*\{.*\b(?:forward_unique_id|unique_id):\s*\"(\d+)\".*', re.IGNORECASE) + id_pattern = 
re.compile( + r'cnode_primal_attrs:'r'\s*\{[\w+]{1, 10000}\b(?:forward_unique_id|unique_id):\s*\"(\d+)\"', + re.IGNORECASE) unique_id_match = id_pattern.search(text, end_pos, scope_match.start()) unique_id = unique_id_match.group(1) if unique_id_match else None @@ -173,7 +174,7 @@ class Parser: node_info.var_inputs.append(callee_name) def parse_subgraphs(self, text: str) -> None: - subgraph_pattern = re.compile(r'subgraph\s+@(\S+)(\([^\)]*\))?\s+.*\{') + subgraph_pattern = re.compile(r'/subgraph\s+@([\w+]{1,1000)(\([^\)]{1,100}\))?\s+\S[^\{]\{/+') matches = list(subgraph_pattern.finditer(text)) end_pos = 0 for match in matches: diff --git a/debug/accuracy_tools/msprobe/mindspore/common/const.py b/debug/accuracy_tools/msprobe/mindspore/common/const.py index b41dc5ce012dc5353a2f62607eabc604fda4eb3a..ff72ea74d5cbf57a9c82bebec3bcb29ac8cf0545 100644 --- a/debug/accuracy_tools/msprobe/mindspore/common/const.py +++ b/debug/accuracy_tools/msprobe/mindspore/common/const.py @@ -15,6 +15,7 @@ import numpy as np import mindspore as ms +from mindspore import dtype as mstype from msprobe.core.common.const import Const as CoreConst @@ -70,6 +71,13 @@ class Const: MINT_NN_FUNC_DATA_PREFIX: MINT_NN_FUNC_PREFIX } + NonDifferentiableType = ( + mstype.bool_, mstype.int8, mstype.byte, mstype.uint8, mstype.ubyte, + mstype.int16, mstype.short, mstype.uint16, mstype.ushort, + mstype.int32, mstype.intc, mstype.uint32, mstype.uintc, + mstype.int64, mstype.intp, mstype.uint64, mstype.uintp + ) + class MsCompareConst: # api_info field @@ -89,14 +97,11 @@ class MsCompareConst: MINDTORCH_NPU = "NPU" MINDTORCH_DIST = "Distributed" - - MT_VALID_API_TYPES = [ MINDTORCH, MINDTORCH_FUNC, MINDTORCH_TENSOR ] SUPPORTED_FUSION_LIST = ["flash_attention_score"] - TASK_FIELD = "task" STATISTICS_TASK = "statistics" FRAMEWORK = "framework" @@ -130,8 +135,6 @@ class MsCompareConst: EXCEPTION_SKIP = "exception_skip" - - class FreeBenchmarkConst: ADD_NOISE = "add_noise" BIT_NOISE = "bit_noise" diff --git 
a/debug/accuracy_tools/msprobe/mindspore/common/utils.py b/debug/accuracy_tools/msprobe/mindspore/common/utils.py index 453af63641e8deb49b9d91b7b4330a504ad137f0..0f6f81351a0b7cdc3f7cdf7075768b48ee0d6de4 100644 --- a/debug/accuracy_tools/msprobe/mindspore/common/utils.py +++ b/debug/accuracy_tools/msprobe/mindspore/common/utils.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. +import inspect import os import random @@ -28,6 +29,11 @@ from msprobe.core.common.const import Const from msprobe.core.common.utils import CompareException, check_seed_all, is_save_variable_valid +mindtorch_check_result = None +register_backward_hook_functions = {} +kwargs_exist_in_forward_hook = None + + class MsprobeStep(ms.train.Callback): def __init__(self, debugger): super(MsprobeStep, self).__init__() @@ -152,9 +158,6 @@ def remove_dropout(): nn.functional.dropout = dropout_ext -mindtorch_check_result = None - - def is_mindtorch(): global mindtorch_check_result if mindtorch_check_result is None: @@ -169,11 +172,11 @@ def is_mindtorch(): return mindtorch_check_result -register_backward_hook_functions = {} - - def set_register_backward_hook_functions(): global register_backward_hook_functions + if register_backward_hook_functions: + return + if is_mindtorch(): import torch from msprobe.mindspore.mindtorch import (_call_impl, @@ -209,3 +212,34 @@ def check_save_param(variable, name, save_backward): "should be bool. 
" "Skip current save process.") raise ValueError + + +def get_cells_and_names(models): + cells_and_names_with_index = {} + + def get_cell_or_module(model): + return model.named_modules() if is_mindtorch() else model.cells_and_names() + + if isinstance(models, (list, tuple)): + for index, model in enumerate(models): + cells_and_names_with_index[str(index)] = get_cell_or_module(model) + else: + cells_and_names_with_index["-1"] = get_cell_or_module(models) + return cells_and_names_with_index + + +def has_kwargs_in_forward_hook(): + global kwargs_exist_in_forward_hook + + if kwargs_exist_in_forward_hook is None: + if is_mindtorch(): + kwargs_exist_in_forward_hook = True + return kwargs_exist_in_forward_hook + + try: + func_params = inspect.signature(nn.Cell.register_forward_hook).parameters + kwargs_exist_in_forward_hook = 'with_kwargs' in func_params + except Exception: + kwargs_exist_in_forward_hook = False + + return kwargs_exist_in_forward_hook diff --git a/debug/accuracy_tools/msprobe/mindspore/compare/ms_compare.py b/debug/accuracy_tools/msprobe/mindspore/compare/ms_compare.py index 7eae4c3ac469ff4975a644193d9ca08ad01181de..0c8bb42a4faef2c62b76a12dbf483d43d73c426e 100644 --- a/debug/accuracy_tools/msprobe/mindspore/compare/ms_compare.py +++ b/debug/accuracy_tools/msprobe/mindspore/compare/ms_compare.py @@ -13,420 +13,29 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-import os -import re -from collections import defaultdict - -import numpy as np -import pandas as pd - -from msprobe.core.common.const import CompareConst, Const -from msprobe.core.common.exceptions import FileCheckException -from msprobe.core.common.file_utils import create_directory, load_json, load_npy, load_yaml -from msprobe.core.common.log import logger -from msprobe.core.common.utils import CompareException, check_compare_param, check_configuration_param, \ - check_op_str_pattern_valid, get_dump_mode, set_dump_path, detect_framework_by_dump_json -from msprobe.core.compare.acc_compare import Comparator, ModeConfig -from msprobe.core.compare.check import dtype_mapping +from msprobe.core.compare.acc_compare import Comparator, ModeConfig, MappingConfig, setup_comparison from msprobe.core.compare.layer_mapping import generate_data_mapping_by_layer_mapping -from msprobe.core.compare.utils import set_stack_json_path, reorder_op_x_list - - -class MappingConfig: - def __init__(self, cell_mapping=None, api_mapping=None, data_mapping=None): - self.cell_mapping = cell_mapping - self.api_mapping = api_mapping - self.data_mapping = data_mapping - - -class MSComparator(Comparator): - """ - Accuracy comparison for MindSpore dynamic graphs, within one framework or across frameworks; supports the md5/summary/all modes. - cell_mapping: mapping between MindSpore cell-level (L0) dump data and PyTorch modules; - api_mapping: mapping between MindSpore API-level (L1) dump data and PyTorch APIs; - data_mapping: mapping between the inputs/outputs of MindSpore cells or APIs and their PyTorch counterparts; - is_cross_framework: whether the comparison is cross-framework. - """ - def __init__(self, mode_config, mapping_config=None, is_cross_framework=False): - super().__init__(mode_config) - self.frame_name = MSComparator.__name__ - - self.stack_mode = mode_config.stack_mode - self.auto_analyze = mode_config.auto_analyze - self.fuzzy_match = mode_config.fuzzy_match - self.dump_mode = mode_config.dump_mode - - if mapping_config: - self.cell_mapping = mapping_config.cell_mapping - self.api_mapping = mapping_config.api_mapping - self.data_mapping = mapping_config.data_mapping - - if self.data_mapping:
- self.cross_frame = is_cross_framework - else: - self.cross_frame = self.cell_mapping is not None or self.api_mapping is not None - self.cell_mapping_dict = self.load_mapping_file(self.cell_mapping) - self.api_mapping_dict = self.load_mapping_file(self.api_mapping) - if self.api_mapping is not None: - self.ms_to_pt_mapping = self.load_internal_api() - - if isinstance(self.data_mapping, str) or self.data_mapping is None: - self.data_mapping_dict = self.load_mapping_file(self.data_mapping) - elif isinstance(self.data_mapping, dict): - self.data_mapping_dict = self.data_mapping - else: - raise TypeError(f"The type of parameter `data_mapping` must be dict, str or None, but got " - f"{type(self.data_mapping)}") - - @staticmethod - def process_data_name(result): - result['data_name_x'] = result.apply(lambda row: [row['data_name_x'], row['data_name_y']], axis=1) - return result - - def calc_accuracy(self, result_df, header): - condition_no_bench = result_df[CompareConst.BENCH_NAME] == CompareConst.N_A - result_df[condition_no_bench] = result_df[condition_no_bench].fillna(CompareConst.N_A) - result_df.loc[condition_no_bench, CompareConst.ERROR_MESSAGE] = CompareConst.NO_BENCH - - def calc_summary_diff(data_type: str): - def type_check(val): - check_series = pd.Series(False, index=val.index) - val_str = val.astype(str) - check_series[pd.to_numeric(val_str, errors='coerce').notna() | val_str.str.lower().eq('nan')] = True - return check_series - - def get_number(val): - return pd.to_numeric(val.astype(str), errors='coerce') - - ms_val = result_df['NPU ' + data_type] - pt_val = result_df['Bench ' + data_type] - diff_name = data_type.capitalize() + ' diff' - rel_err_name = ('norm' if data_type == 'l2norm' else data_type).capitalize() + 'RelativeErr' - condition_na = ~type_check(ms_val) | ~type_check(pt_val) - result_df.loc[condition_na, [diff_name, rel_err_name]] = CompareConst.N_A - result_df.loc[~(condition_no_bench | condition_na), diff_name] = get_number(ms_val) - 
get_number(pt_val) - condition_nan_diff = ~condition_no_bench & ~condition_na & result_df[diff_name].isna() - condition_not_nan_diff = ~condition_no_bench & ~condition_na & result_df[diff_name].notna() - result_df.loc[condition_nan_diff, [diff_name, rel_err_name]] = CompareConst.NAN - condition_pt_zero = pt_val == 0 - result_df.loc[condition_not_nan_diff & condition_pt_zero, rel_err_name] = CompareConst.NAN - condition_ref_err = condition_not_nan_diff & ~condition_pt_zero - result_df.loc[condition_ref_err, rel_err_name] = (result_df.loc[condition_ref_err, diff_name] / - pt_val[condition_ref_err] * 100) - result_df.loc[condition_ref_err, rel_err_name] = (result_df.loc[condition_ref_err, rel_err_name] - .abs().astype(str) + '%') - magnitude = get_number(result_df[diff_name]).abs() / ( - pd.Series(np.maximum(get_number(ms_val), get_number(pt_val))).abs() + CompareConst.EPSILON) - return magnitude > CompareConst.MAGNITUDE - - if self.dump_mode == Const.MD5: - condition_md5_equal = result_df[CompareConst.NPU_MD5] == result_df[CompareConst.BENCH_MD5] - result_df.loc[condition_md5_equal, CompareConst.RESULT] = CompareConst.PASS - result_df.loc[~condition_md5_equal & ~condition_no_bench, CompareConst.RESULT] = CompareConst.DIFF - elif self.dump_mode == Const.SUMMARY: - warning_list = [calc_summary_diff(data_type) for data_type in ['max', 'min', 'mean', 'l2norm']] - warning_flag = pd.DataFrame(warning_list).any() - result_df.loc[~condition_no_bench, [CompareConst.RESULT, CompareConst.ERROR_MESSAGE]] = '' - result_df.loc[warning_flag, CompareConst.RESULT] = CompareConst.WARNING - result_df.loc[warning_flag, CompareConst.ERROR_MESSAGE] = 'Need double check api accuracy.' 
- else: - fill_cols = [CompareConst.COSINE, CompareConst.EUC_DIST, - CompareConst.MAX_ABS_ERR, CompareConst.MAX_RELATIVE_ERR, - CompareConst.ONE_THOUSANDTH_ERR_RATIO, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO, - CompareConst.ERROR_MESSAGE] - result_df.loc[~condition_no_bench, fill_cols] = '' - result_df.loc[~condition_no_bench, CompareConst.ACCURACY] = CompareConst.ACCURACY_CHECK_YES - return result_df[header] - - def make_result_df(self, result): - header = CompareConst.HEAD_OF_COMPARE_MODE[self.dump_mode][:] - - if self.stack_mode: - header.append(CompareConst.STACK) - if self.dump_mode == Const.ALL: - header.append(CompareConst.DATA_NAME) - result = self.process_data_name(result) - - result.rename(columns={'op_name_x': CompareConst.NPU_NAME, - 'op_name_y': CompareConst.BENCH_NAME, - 'dtype_x': CompareConst.NPU_DTYPE, - 'dtype_y': CompareConst.BENCH_DTYPE, - 'shape_x': CompareConst.NPU_SHAPE, - 'shape_y': CompareConst.BENCH_SHAPE, - 'md5_x': CompareConst.NPU_MD5, - 'md5_y': CompareConst.BENCH_MD5, - 'data_name_x': CompareConst.DATA_NAME, - 'stack_info_x': CompareConst.STACK}, inplace=True) - - npu_summary = [CompareConst.NPU_MAX, CompareConst.NPU_MIN, CompareConst.NPU_MEAN, CompareConst.NPU_NORM] - bench_summary = [CompareConst.BENCH_MAX, CompareConst.BENCH_MIN, CompareConst.BENCH_MEAN, - CompareConst.BENCH_NORM] - - def set_summary(summary): - if summary == CompareConst.N_A: - return [CompareConst.N_A] * 4 - summary_list = [] - for i in summary: - if i is None: - summary_list.append(CompareConst.N_A) - elif str(i).lower() == 'nan': - summary_list.append(CompareConst.NAN) - else: - summary_list.append(i) - return summary_list - - result[npu_summary] = result['summary_x'].apply(set_summary).tolist() - result[bench_summary] = result['summary_y'].apply(set_summary).tolist() - - result_df = pd.DataFrame(columns=header) - for h in header: - if h in result.columns: - result_df[h] = result[h] - return self.calc_accuracy(result_df, header) - - def load_internal_api(self): 
- cur_path = os.path.dirname(os.path.realpath(__file__)) - yaml_path = os.path.abspath(os.path.join(cur_path, CompareConst.INTERNAL_API_MAPPING_FILE)) - return load_yaml(yaml_path) - - def load_mapping_file(self, mapping_file): - if isinstance(mapping_file, str): - mapping_dict = load_yaml(mapping_file) - else: - mapping_dict = {} - return mapping_dict - - def process_cell_mapping(self, npu_op_name): - if not npu_op_name: - return CompareConst.N_A - param_grad_flag = Const.PARAMS_GRAD in npu_op_name.split(Const.SEP) - if not param_grad_flag and not re.search(Const.REGEX_FORWARD_BACKWARD, npu_op_name): - return CompareConst.N_A - npu_op_name = npu_op_name.replace("Cell", "Module", 1) - if self.cell_mapping_dict: - # get cell name & class name from op_name - # Cell.fc1.Dense.forward.0.input.0 - cell_name = re.split(r'\.(?:forward|backward|parameters_grad)\.', npu_op_name.split(Const.SEP, 1)[-1])[0] - if cell_name in self.cell_mapping_dict: - npu_op_name = npu_op_name.replace(cell_name, self.cell_mapping_dict[cell_name], 1) - return npu_op_name - - def read_npy_data(self, dir_path, file_name, load_pt_file=False): - if not file_name: - return None - data_path = os.path.join(dir_path, file_name) - if load_pt_file: - import torch - from msprobe.pytorch.common.utils import load_pt - data_value = load_pt(data_path, True).detach() - if data_value.dtype == torch.bfloat16: - data_value = data_value.to(torch.float32) - data_value = data_value.numpy() - else: - data_value = load_npy(data_path) - return data_value - - def process_internal_api_mapping(self, npu_op_name): - # get api name & class name from op_name - # Functional.addcmul.0.forward.input.0 - ms_api_name = self.get_api_name(npu_op_name.split(Const.SEP)) - class_name = ms_api_name.split(Const.SEP)[0] - if class_name == "Mint": - return npu_op_name.replace("Mint", "Torch") - elif class_name == "MintFunctional": - return npu_op_name.replace("MintFunctional", "Functional") - elif self.ms_to_pt_mapping.get(ms_api_name): - 
return npu_op_name.replace(ms_api_name, self.ms_to_pt_mapping.get(ms_api_name)) - else: - return npu_op_name - - def get_api_name(self, api_list): - try: - api_name = api_list[0] + Const.SEP + api_list[1] - except IndexError as error: - logger.error(f'Failed to retrieve API name, please check if the dump data is reasonable') - raise CompareException(CompareException.INDEX_OUT_OF_BOUNDS_ERROR) from error - return api_name - - def compare_process(self, file_lists): - npu_json_path, bench_json_path, stack_json_path = file_lists - npu_json_data = load_json(npu_json_path) - bench_json_data = load_json(bench_json_path) - stack_json_data = load_json(stack_json_path) if self.stack_mode else None - - npu_df = self.gen_data_df(npu_json_data, stack_json_data) - bench_df = self.gen_data_df(bench_json_data, stack_json_data) - if self.cell_mapping: - npu_df[CompareConst.COMPARE_KEY] = npu_df[CompareConst.OP_NAME].apply(self.process_cell_mapping) - elif self.api_mapping: - npu_df[CompareConst.COMPARE_KEY] = npu_df[CompareConst.OP_NAME].apply(self.process_internal_api_mapping) - if isinstance(self.api_mapping, str): - self.modify_compare_data_with_user_mapping(npu_df, bench_df) - else: - npu_df[CompareConst.COMPARE_KEY] = npu_df[CompareConst.OP_NAME] - npu_df[[Const.DTYPE, Const.SHAPE]] = npu_df[[Const.DTYPE, Const.SHAPE]].astype(str) - bench_df[[Const.DTYPE, Const.SHAPE]] = bench_df[[Const.DTYPE, Const.SHAPE]].astype(str) - npu_df[CompareConst.COMPARE_SHAPE] = npu_df[Const.SHAPE] - bench_df[CompareConst.COMPARE_KEY] = bench_df[CompareConst.OP_NAME] - bench_df[CompareConst.COMPARE_SHAPE] = bench_df[Const.SHAPE] - match_result = pd.merge(npu_df, bench_df, on=([CompareConst.COMPARE_KEY] if self.fuzzy_match - else [CompareConst.COMPARE_KEY, CompareConst.COMPARE_SHAPE]), - how='outer') - match_result = match_result[match_result['op_name_x'].notna()].fillna(CompareConst.N_A) - - def gen_dtype_condition(): - npu_dtype = match_result['dtype_x'] - bench_dtype = match_result['dtype_y'] - 
if self.cross_frame: - npu_dtype = npu_dtype.map(dtype_mapping).fillna(npu_dtype) - - equal_condition = npu_dtype == bench_dtype - match_condition = ( - (npu_dtype.isin(CompareConst.DTYPE_MATCH_GROUPS[0]) & bench_dtype.isin( - CompareConst.DTYPE_MATCH_GROUPS[0])) | - (npu_dtype.isin(CompareConst.DTYPE_MATCH_GROUPS[1]) & bench_dtype.isin( - CompareConst.DTYPE_MATCH_GROUPS[1])) - ) - return equal_condition | match_condition - - if not self.fuzzy_match: - match_result.loc[~gen_dtype_condition(), [i + '_y' for i in bench_df.columns]] = CompareConst.N_A - return self.make_result_df(match_result) - - def modify_compare_data_with_user_mapping(self, npu_df, bench_df): - def get_api_indices_dict(op_name_df): - api_indices_dict = defaultdict(list) - for op_index, name in enumerate(op_name_df[CompareConst.OP_NAME]): - api = self.get_api_name(name.split(Const.SEP)) - api_indices_dict[api].append(op_index) - return api_indices_dict - - ms_api_indices_dict = get_api_indices_dict(npu_df) - pt_api_indices_dict = get_api_indices_dict(bench_df) - - def gen_input_compare_key(pattern, term): - flag = True - for i, prefix in enumerate(mapping_dict.get(f'ms_{term}')): - if op_name.split(pattern)[1].startswith(str(prefix)): - npu_df.loc[index, CompareConst.COMPARE_KEY] = ( - op_name.replace(pattern + str(prefix), - pattern + str(mapping_dict.get(f'pt_{term}')[i]))) - flag = False - return flag - - for mapping_dict in self.api_mapping_dict: - keys_to_compare = [ - ('ms_args', 'pt_args'), - ('ms_outputs', 'pt_outputs'), - ('ms_parameters', 'pt_parameters'), - ('ms_parameters_grad', 'pt_parameters_grad'), - ] - if not all(len(mapping_dict.get(k1, [])) == len(mapping_dict.get(k2, [])) for k1, k2 in keys_to_compare): - logger.warning('The user-defined mapping table is incorrect,\ - make sure that the number of parameters is equal') - continue - - ms_api, pt_api = mapping_dict.get('ms_api'), mapping_dict.get('pt_api') - if ms_api not in ms_api_indices_dict or pt_api not in pt_api_indices_dict: 
- continue - for index in ms_api_indices_dict.get(ms_api): - op_name = npu_df.loc[index, CompareConst.OP_NAME].replace(ms_api, pt_api, 1) - if CompareConst.INPUT_PATTERN in op_name: - is_abandoned = gen_input_compare_key(CompareConst.INPUT_PATTERN, 'args') - elif CompareConst.KWARGS_PATTERN in op_name: - is_abandoned = gen_input_compare_key(CompareConst.KWARGS_PATTERN, 'args') - elif CompareConst.OUTPUT_PATTERN in op_name: - is_abandoned = gen_input_compare_key(CompareConst.OUTPUT_PATTERN, 'output') - elif CompareConst.PARAMS_PATTERN in op_name: - is_abandoned = gen_input_compare_key(CompareConst.PARAMS_PATTERN, 'parameters') - elif CompareConst.PARAMS_GRAD_PATTERN in op_name: - is_abandoned = gen_input_compare_key(CompareConst.PARAMS_GRAD_PATTERN, 'parameters_grad') - else: - logger.error(f'Excepted op_name: {op_name}') - raise CompareException(CompareException.INVALID_DATA_ERROR) - if is_abandoned: - npu_df.loc[index, CompareConst.COMPARE_KEY] = op_name + 'abandoned' - - def gen_data_df(self, data_json, stack_json_data): - result = { - CompareConst.OP_NAME: [], - Const.DTYPE: [], - Const.SHAPE: [], - Const.SUMMARY: [], - 'stack_info': [] - } - if self.dump_mode == Const.ALL: - result['data_name'] = [] - elif self.dump_mode == Const.MD5: - result[Const.MD5] = [] - for data_name in data_json['data']: - check_op_str_pattern_valid(data_name) - merge_list = self.gen_merge_list(data_json, data_name, stack_json_data) - if not merge_list: - continue - - op_name_list = merge_list.get(CompareConst.OP_NAME) - summary_list = merge_list.get(Const.SUMMARY) - data_name_list = merge_list.get('data_name') - op_name_reorder, summary_reorder, data_name_reorder = reorder_op_x_list(op_name_list, - summary_list, - data_name_list) - for op_name in op_name_reorder: - result[CompareConst.OP_NAME].append(op_name) - if (CompareConst.INPUT_PATTERN in op_name) or (CompareConst.KWARGS_PATTERN in op_name): - struct = merge_list[CompareConst.INPUT_STRUCT].pop(0) - elif 
CompareConst.OUTPUT_PATTERN in op_name: - struct = merge_list[CompareConst.OUTPUT_STRUCT].pop(0) - elif CompareConst.PARAMS_PATTERN in op_name: - struct = merge_list[CompareConst.PARAMS_STRUCT].pop(0) - else: - struct = merge_list[CompareConst.PARAMS_GRAD_STRUCT].pop(0) - result[Const.DTYPE].append(struct[0]) - result[Const.SHAPE].append(struct[1]) - if self.dump_mode == Const.MD5: - result[Const.MD5].append(struct[2]) - result[Const.SUMMARY].append(summary_reorder.pop(0)) - result['stack_info'].append(merge_list['stack_info'][0] if self.stack_mode else None) - if self.dump_mode == Const.ALL: - result['data_name'].append(data_name_reorder.pop(0)) - return pd.DataFrame(result) +from msprobe.mindspore.compare.utils import read_npy_data, check_cross_framework +from msprobe.pytorch.compare.utils import read_pt_data -def check_cross_framework(bench_json_path): - framework = detect_framework_by_dump_json(bench_json_path) - if framework == Const.PT_FRAMEWORK: - return True +def read_real_data(npu_dir, npu_data_name, bench_dir, bench_data_name, cross_frame) -> tuple: + n_value = read_npy_data(npu_dir, npu_data_name) + if cross_frame: + b_value = read_pt_data(bench_dir, bench_data_name) else: - return False + b_value = read_npy_data(bench_dir, bench_data_name) + return n_value, b_value def ms_compare(input_param, output_path, **kwargs): - try: - auto_analyze = kwargs.get('auto_analyze', True) - fuzzy_match = kwargs.get('fuzzy_match', False) - cell_mapping = kwargs.get('cell_mapping', None) - api_mapping = kwargs.get('api_mapping', None) - data_mapping = kwargs.get('data_mapping', None) - layer_mapping = kwargs.get('layer_mapping', None) - suffix = kwargs.get('suffix', '') + config = setup_comparison(input_param, output_path, **kwargs) - set_dump_path(input_param) - dump_mode = get_dump_mode(input_param) - if 'stack_json_path' in input_param: - stack_mode = kwargs.get('stack_mode', False) - else: - stack_mode = set_stack_json_path(input_param) # set stack_mode and set 
"stack_json_path" in input_param - check_configuration_param(stack_mode, auto_analyze, fuzzy_match, input_param.get('is_print_compare_log', True)) - create_directory(output_path) - check_compare_param(input_param, output_path, dump_mode, stack_mode) - except (CompareException, FileCheckException) as error: - logger.error('Compare failed. Please check the arguments and do it again!') - raise CompareException(error.code) from error - if layer_mapping: - data_mapping = generate_data_mapping_by_layer_mapping(input_param, layer_mapping, output_path) + if config.layer_mapping: + config.data_mapping = generate_data_mapping_by_layer_mapping(input_param, config.layer_mapping, output_path) - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(cell_mapping, api_mapping, data_mapping) is_cross_framework = check_cross_framework(input_param.get('bench_json_path')) - ms_comparator = MSComparator(mode_config, mapping_config, is_cross_framework) - ms_comparator.compare_core(input_param, output_path, suffix=suffix) + mode_config = ModeConfig(config.stack_mode, config.auto_analyze, config.fuzzy_match, config.dump_mode) + mapping_config = MappingConfig(config.cell_mapping, config.api_mapping, config.data_mapping) + ms_comparator = Comparator(read_real_data, mode_config, mapping_config, is_cross_framework) + ms_comparator.compare_core(input_param, output_path, suffix=config.suffix) diff --git a/debug/accuracy_tools/msprobe/mindspore/compare/utils.py b/debug/accuracy_tools/msprobe/mindspore/compare/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..7a9c78e8f74426c23982723fcf90f729fc9e694c --- /dev/null +++ b/debug/accuracy_tools/msprobe/mindspore/compare/utils.py @@ -0,0 +1,37 @@ +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os + +from msprobe.core.common.const import Const +from msprobe.core.common.file_utils import load_npy, FileChecker, FileCheckConst +from msprobe.core.common.utils import detect_framework_by_dump_json + + +def read_npy_data(dir_path, file_name): + if not file_name: + return None + + data_path = os.path.join(dir_path, file_name) + path_checker = FileChecker(data_path, FileCheckConst.FILE, FileCheckConst.READ_ABLE, + FileCheckConst.NUMPY_SUFFIX, False) + data_path = path_checker.common_check() + data_value = load_npy(data_path) + return data_value + + +def check_cross_framework(bench_json_path): + framework = detect_framework_by_dump_json(bench_json_path) + return framework == Const.PT_FRAMEWORK diff --git a/debug/accuracy_tools/msprobe/mindspore/debugger/debugger_config.py b/debug/accuracy_tools/msprobe/mindspore/debugger/debugger_config.py index 789dc9ef2077bec826392abea4d5725b68ad98a0..1e9721c72afb5c195cc968a71385187ddf5f0f37 100644 --- a/debug/accuracy_tools/msprobe/mindspore/debugger/debugger_config.py +++ b/debug/accuracy_tools/msprobe/mindspore/debugger/debugger_config.py @@ -42,11 +42,12 @@ class DebuggerConfig: self.framework = Const.MS_FRAMEWORK self.summary_mode = task_config.summary_mode self.async_dump = common_config.async_dump if common_config.async_dump else False - if hasattr(task_config, 'td_config_path'): - self.td_config_path = "" if not task_config.td_config_path else task_config.td_config_path + if hasattr(task_config, 'td_config_path') and task_config.td_config_path: + self.td_config_path = task_config.td_config_path else: 
self.td_config_path = "" self.check() + self._check_statistics_config(task_config) create_directory(self.dump_path) if self.task == Const.FREE_BENCHMARK: @@ -80,8 +81,12 @@ class DebuggerConfig: self.check_mode = "all" if not isinstance(self.async_dump, bool): raise Exception("The parameters async_dump should be bool.") - if self.async_dump and self.task == Const.TENSOR and not self.list: - raise Exception("The parameters async_dump is true in tensor task, the parameters list cannot be empty.") + if self.async_dump and self.task == Const.TENSOR: + if self.level_ori == Const.LEVEL_DEBUG: + self.list = [] # async_dump + debug level case ignore list + if not self.list and self.level_ori != Const.LEVEL_DEBUG: + raise Exception("The parameters async_dump is true in tensor task," + " the parameters list cannot be empty.") if self.task == Const.STRUCTURE and self.level_ori not in [Const.LEVEL_L0, Const.LEVEL_MIX]: logger.warning_on_rank_0( f"When the task is set to structure, the level should be one of {[Const.LEVEL_L0, Const.LEVEL_MIX]}. 
" @@ -102,3 +107,14 @@ class DebuggerConfig: if not self.list or len(self.list) != 1: raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, f"When level is set to L2, the list must be configured as a list with one api name.") + + def _check_statistics_config(self, task_config): + if self.task != Const.STATISTICS: + return + self.tensor_list = [] + if not hasattr(task_config, "tensor_list"): + return + if self.level_ori == Const.LEVEL_DEBUG and task_config.tensor_list: + logger.warning_on_rank_0("When level is set to debug, the tensor_list will be invalid.") + return + self.tensor_list = task_config.tensor_list diff --git a/debug/accuracy_tools/msprobe/mindspore/debugger/precision_debugger.py b/debug/accuracy_tools/msprobe/mindspore/debugger/precision_debugger.py index bfad585662ec0b9cdedfee34b77d67b53f935646..74d61101cd07bdb86aefb853457455d264233806 100644 --- a/debug/accuracy_tools/msprobe/mindspore/debugger/precision_debugger.py +++ b/debug/accuracy_tools/msprobe/mindspore/debugger/precision_debugger.py @@ -86,6 +86,7 @@ class PrecisionDebugger: self.config = DebuggerConfig(common_config, task_config) if self._need_msprobe_c() and _msprobe_c: + os.environ["MS_HOOK_ENABLE"] = "on" _msprobe_c._PrecisionDebugger(framework="MindSpore", config_path=config_path) self.config.execution_mode = self._get_execution_mode() @@ -242,6 +243,31 @@ class PrecisionDebugger: instance.service.init_step = step instance.service.loop = 0 + @classmethod + def register_custom_api(cls, module, api_name, api_prefix=None): + if not api_prefix: + api_prefix = getattr(module, "__name__", "Custom") + if not isinstance(api_prefix, str): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, "api_prefix must be string") + if not hasattr(module, api_name): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, f"module {str(module)} does not have {api_name}") + instance = cls._instance + if not instance: + raise Exception(MsgConst.NOT_CREATED_INSTANCE) + 
instance.service.register_custom_api(module, api_name, api_prefix) + + @classmethod + def restore_custom_api(cls, module, api): + if not hasattr(module, api): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, f"module {str(module)} does not have {api}") + instance = cls._instance + if not instance: + raise Exception(MsgConst.NOT_CREATED_INSTANCE) + instance.service.restore_custom_api(module, api) + @classmethod def _need_service(cls): instance = cls._instance diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/cell_dump_process.py b/debug/accuracy_tools/msprobe/mindspore/dump/cell_dump_process.py index a77f3d4fe3747f378365a63027e57e479a4eae24..cf8266fe28c00311582c2dcb0cdb79ab7c5bc935 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/cell_dump_process.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/cell_dump_process.py @@ -15,7 +15,6 @@ import os import time import re -import json import atexit from multiprocessing import Pool @@ -81,7 +80,7 @@ def gen_file_path(dump_path, cell_prefix, suffix, io_type, index): step_path = os.path.join(dump_path, "{step}") rank_path = os.path.join(step_path, "{rank}") data_path = os.path.join(rank_path, CoreConst.DUMP_TENSOR_DATA) - file_name = CoreConst.SEP.join([cell_prefix, suffix, io_type, str(index)]) + file_name = cell_prefix + CoreConst.SEP + suffix + CoreConst.SEP + io_type + CoreConst.SEP + str(index) return os.path.join(data_path, file_name) @@ -95,7 +94,7 @@ def clip_gradient(dump_path, cell_prefix, index, io_type, dx): if io_type == KEY_OUTPUT: temp = td(gen_file_path(dump_path, cell_prefix, KEY_BACKWARD, io_type, index), dx) dx = ops.depend(dx, temp) - if io_type == KEY_INPUT: + elif io_type == KEY_INPUT: temp = td_in(gen_file_path(dump_path, cell_prefix, KEY_BACKWARD, io_type, index), dx) dx = ops.depend(dx, temp) return dx @@ -112,24 +111,24 @@ def cell_construct_wrapper(func, self): index = 0 item = None + backward_or_all = self.data_mode in ["backward", "all"] + forward_or_all = 
self.data_mode in ["forward", "all"] # The inputs of the cell. for index, item in enumerate(args): - if self.data_mode == "backward" or self.data_mode == "all": - if ops.is_tensor(item): - item = self.output_clips[index](item) - if self.data_mode == "forward" or self.data_mode == "all": - if ops.is_tensor(item): - if need_tensordump_in(self, 'input_dump_mode'): - temp = td_in( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_INPUT, index), - item - ) - else: - temp = td( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_INPUT, index), - item - ) - item = ops.depend(item, temp) + if backward_or_all and ops.is_tensor(item): + item = self.output_clips[index](item) + if forward_or_all and ops.is_tensor(item): + if need_tensordump_in(self, 'input_dump_mode'): + temp = td_in( + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_INPUT, index), + item + ) + else: + temp = td( + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_INPUT, index), + item + ) + item = ops.depend(item, temp) new_args.append(item) out = func(*new_args, **kwargs) @@ -137,43 +136,40 @@ def cell_construct_wrapper(func, self): # The outputs of the cell. 
if isinstance(out, tuple): for index, item in enumerate(out): - if self.data_mode == "backward" or self.data_mode == "all": - if ops.is_tensor(item): - item = self.input_clips[index](item) - if self.data_mode == "forward" or self.data_mode == "all": - if ops.is_tensor(item): - if need_tensordump_in(self, 'output_dump_mode'): - temp = td_in( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, index), - item - ) - else: - temp = td( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, index), - item - ) - item = ops.depend(item, temp) - out_list.append(item) - else: - out_list.append(item) - out_list = tuple(out_list) - return out_list - else: - if self.data_mode == "backward" or self.data_mode == "all": - out = self.input_clips[0](out) - if self.data_mode == "forward" or self.data_mode == "all": - if ops.is_tensor(out): + if backward_or_all and ops.is_tensor(item): + item = self.input_clips[index](item) + if forward_or_all and ops.is_tensor(item): if need_tensordump_in(self, 'output_dump_mode'): temp = td_in( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, 0), - out + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, index), + item ) else: temp = td( - gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, 0), - out + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, index), + item ) - out = ops.depend(out, temp) + item = ops.depend(item, temp) + out_list.append(item) + elif forward_or_all and not ops.is_tensor(item): + out_list.append(item) + out_list = tuple(out_list) + return out_list + else: + if backward_or_all: + out = self.input_clips[0](out) + if forward_or_all and ops.is_tensor(out): + if need_tensordump_in(self, 'output_dump_mode'): + temp = td_in( + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, KEY_OUTPUT, 0), + out + ) + else: + temp = td( + gen_file_path(self.dump_path, self.cell_prefix, KEY_FORWARD, 
KEY_OUTPUT, 0), + out + ) + out = ops.depend(out, temp) return out return new_construct.__get__(self, type(self)) @@ -258,16 +254,17 @@ def get_data_mode(str): def check_relation(cell_name, parent_cell_name): layers_pattern = rf"{CoreConst.SEP}{KEY_LAYERS}{CoreConst.SEP}\d+$" last_dot_index = cell_name.rfind(CoreConst.SEP) - if last_dot_index != -1: - # 如果cell_name最后一个'.'之前的字段等于parent_cell_name,则判定存在父子关系 - sub_cell_name = cell_name[:last_dot_index] + if last_dot_index == -1: + return False + # 如果cell_name最后一个'.'之前的字段等于parent_cell_name,则判定存在父子关系 + sub_cell_name = cell_name[:last_dot_index] + if sub_cell_name == parent_cell_name: + return True + elif re.search(layers_pattern, cell_name): + # 如果cell_name以".layer.{layer_id}"结尾,且去掉该字段后等于parent_cell_name,则判定存在父子关系 + sub_cell_name = re.sub(layers_pattern, '', cell_name) if sub_cell_name == parent_cell_name: return True - elif re.search(layers_pattern, cell_name): - # 如果cell_name以".layer.{layer_id}"结尾,且去掉该字段后等于parent_cell_name,则判定存在父子关系 - sub_cell_name = re.sub(layers_pattern, '', cell_name) - if sub_cell_name == parent_cell_name: - return True return False @@ -554,7 +551,7 @@ def start(net=None, dump_path="./", data_mode=CoreConst.ALL, td_config_path=''): else: #Format: Cell.{cell_name}.{class_name} cell.cell_prefix = CoreConst.SEP.join([CoreConst.CELL, name, cell.__class__.__name__]) - + # 根据yaml配置文件设置cell的TensorDump模式 if class_name in first_layer_key: layer_data = yaml_data.get(class_name) diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/dump_tool_factory.py b/debug/accuracy_tools/msprobe/mindspore/dump/dump_tool_factory.py index f3030b4cdc63772f362dbdec2206e720e69bb26a..beb5eec518555e4a67146f1c004aaae6f5207ca3 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/dump_tool_factory.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/dump_tool_factory.py @@ -25,7 +25,7 @@ class DumpToolFactory: tools = { Const.CELL: { Const.GRAPH_KBYK_MODE: GraphModeCellDump, - Const.GRAPH_GE_MODE: GraphModeCellDump, + 
Const.GRAPH_GE_MODE: None, Const.PYNATIVE_MODE: None }, Const.API: { diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/graph_mode_cell_dump.py b/debug/accuracy_tools/msprobe/mindspore/dump/graph_mode_cell_dump.py index 52e2d57af22940807d361272273e1d82f7c8520c..868ae5e4679f69a7e584fbd43919656676208787 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/graph_mode_cell_dump.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/graph_mode_cell_dump.py @@ -13,11 +13,12 @@ # See the License for the specific language governing permissions and # limitations under the License. import os -from msprobe.mindspore.common.log import logger -from msprobe.mindspore.debugger.debugger_config import DebuggerConfig import mindspore as ms from mindspore.ops.primitive import _run_op from mindspore import hal, ops + +from msprobe.mindspore.common.log import logger +from msprobe.mindspore.debugger.debugger_config import DebuggerConfig import msprobe.mindspore.dump.cell_dump_process as cellDumper from msprobe.mindspore.common.const import Const @@ -53,11 +54,11 @@ class GraphModeCellDump: ops.tensordump(step_flag, temp_tensor) def check_config(self): - if self.rank != []: + if self.rank: raise Exception("In graph mode, cell dump does not currently support specifying rank.") - if self.scope != []: + if self.scope: raise Exception("In graph mode, cell dump does not currently support specifying scope.") - if self.list != []: + if self.list: raise Exception("In graph mode, cell dump does not currently support specifying list.") if len(self.data_mode) != 1 or self.data_mode[0] not in Const.GRAPH_CELL_DUMP_DATA_MODE_LIST: raise Exception("In graph mode and cell dump, data_mode must be one of all, forword, backword.") @@ -83,4 +84,4 @@ class GraphModeCellDump: dump_path=self.dump_path, data_mode=self.data_mode[0], td_config_path=self.td_config_path - ) \ No newline at end of file + ) diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/api_register.py 
b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/api_register.py index 432e7ac35134412edbfa88dd1151ef62b48cb241..9718132acc5ef85b7a617dd6f29540c24b2aec6c 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/api_register.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/api_register.py @@ -14,6 +14,7 @@ # limitations under the License. import os +import inspect from mindspore import Tensor, ops, mint from mindspore.mint import distributed @@ -23,6 +24,7 @@ from mindspore.communication import comm_func from msprobe.core.common.file_utils import load_yaml from msprobe.core.common.utils import Const from msprobe.core.data_dump.api_registry import ApiRegistry +from msprobe.mindspore.common.log import logger from msprobe.mindspore.common.const import Const as MsConst from msprobe.mindspore.common.utils import is_mindtorch from msprobe.mindspore.dump.hook_cell.hook_cell import HOOKCell @@ -72,7 +74,7 @@ _inner_used_api = { ops, "norm", "square", "sqrt", "is_complex", "stack", "is_floating_point" ), Const.MS_FRAMEWORK + Const.SEP + Const.MS_API_TYPE_TENSOR: ( - Tensor, "to", "numel" + Tensor, "to", "numel", 'sum' ), Const.MS_FRAMEWORK + Const.SEP + Const.MS_API_TYPE_MINT: ( mint, "max", "min", "mean", "norm" @@ -86,7 +88,8 @@ class ApiTemplate(HOOKCell): self.api_func = api_func self.prefix_api_name = prefix + Const.SEP + str(api_name.split(Const.SEP)[-1]) + Const.SEP super().__init__(hook_build_func) - if prefix == Const.MINT_DIST_API_TYPE_PREFIX: + distributed_prefix = Const.DIST_API_TYPE_PREFIX if is_mindtorch() else Const.MINT_DIST_API_TYPE_PREFIX + if prefix == distributed_prefix: self.op_is_distributed = True @staticmethod @@ -110,7 +113,15 @@ class ApiTemplate(HOOKCell): if self.prefix_api_name.startswith( (MsConst.DISTRIBUTED_DATA_PREFIX, Const.MINT_DIST_API_TYPE_PREFIX) ): - if kwargs.get("async_op") or self.api_name in ["isend", "irecv"]: + try: + bound = inspect.signature(self.api_func).bind(*args, **kwargs) + 
bound.apply_defaults() + use_asyn_op_flag = bound.arguments.get("asyn_op", False) + except Exception as e: + use_asyn_op_flag = False + logger.warning(f"fail to get dist api's func signature because {e}, no wait") + + if use_asyn_op_flag or self.api_name in ["isend", "irecv"]: output = self.async_to_sync(output) if self.api_name == "batch_isend_irecv" and isinstance(output, list): output = [self.async_to_sync(handle) for handle in output] diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/hook_cell.py b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/hook_cell.py index 868d71bfc20d5a058a7dcb42ff05b44be3ba3862..f19e1e9e4c4aaccce9558fd5a2dce8cebc417cb7 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/hook_cell.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/hook_cell.py @@ -31,7 +31,7 @@ def get_cell_count(name): def __init__(self, hook_build_func) -> None: super(HOOKCell, self).__init__() self.changed_status = False - self.input_kwargs = {} + self.msprobe_input_kwargs = {} if not HOOKCell.g_stop_hook: HOOKCell.g_stop_hook = True self.changed_status = True @@ -49,7 +49,7 @@ def __init__(self, hook_build_func) -> None: # 重载call,加全局标志。 def __call__(self, *args, **kwargs): try: - setattr(self, 'msprobe_input_kwargs', kwargs) + self.msprobe_input_kwargs = kwargs out = super(HOOKCell, self).__call__(*args, **kwargs) except Exception as e: raise e diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/primitive_hooks.py b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/primitive_hooks.py index 656e48c678956563a6f2d1d5f5ab8a4d03f074e7..4b187e13148b06d0429983522cbca443c29ec5d7 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/primitive_hooks.py +++ b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/primitive_hooks.py @@ -58,7 +58,7 @@ class PrimitiveHookService: def backward_hook(grad): captured_grads.extend(grad) backward_primitive_name = 
f"{updated_primitive_name}{Const.SEP}{Const.BACKWARD}" - + self.service_instance.inner_switch = True try: if hook_type == Const.INPUT: self.service_instance.data_collector.update_api_or_module_name(backward_primitive_name) @@ -77,6 +77,7 @@ class PrimitiveHookService: logger.error(f"This is a primitive op {hook_type}_backward dump error: {exception}, " f"updated_primitive_name: {updated_primitive_name}") raise DumpException(DumpException.BACKWARD_DATA_COLLECTION_ERROR) from exception + self.service_instance.inner_switch = False return backward_hook @@ -137,6 +138,7 @@ class PrimitiveHookService: def pre_forward_hook(primitive_name, primitive_instance, args, kwargs): module_input_output = ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=None) + self.service_instance.inner_switch = True try: self.service_instance.data_collector.forward_input_data_collect( primitive_name, @@ -148,9 +150,11 @@ class PrimitiveHookService: logger.error(f"This is a primitive op dump error during forward input data collection: {exception}, " f"primitive_name: {primitive_name}") raise DumpException(DumpException.FORWARD_DATA_COLLECTION_ERROR) from exception + self.service_instance.inner_switch = False def post_forward_hook(primitive_name, primitive_instance, args, kwargs, output): module_input_output = ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=output) + self.service_instance.inner_switch = True try: self.service_instance.data_collector.forward_output_data_collect( primitive_name, @@ -162,6 +166,7 @@ class PrimitiveHookService: logger.error(f"This is a primitive op dump error during forward output data collection: {exception}, " f"primitive_name: {primitive_name}") raise DumpException(DumpException.FORWARD_DATA_COLLECTION_ERROR) from exception + self.service_instance.inner_switch = False def wrapped_primitive_call(instance_self, *args, **kwargs): """ @@ -179,7 +184,7 @@ class PrimitiveHookService: current_count = self.primitive_counters.get(primitive_name, 0) 
updated_primitive_name = f"{Const.PRIMITIVE_PREFIX}{Const.SEP}{primitive_name}{Const.SEP}{current_count}" - if not self.service_instance.primitive_switch: + if not self.service_instance.primitive_switch or self.service_instance.inner_switch: return origin_func(*args, **kwargs) captured_grads_input, captured_grads_output = [], [] diff --git a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/support_wrap_ops.yaml b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/support_wrap_ops.yaml index b4f8b114c7e77e2f9f2e377e4c57ab7bedd6a610..eae8f85a87fb2b0986cefb2e6faae7399a86f367 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/support_wrap_ops.yaml +++ b/debug/accuracy_tools/msprobe/mindspore/dump/hook_cell/support_wrap_ops.yaml @@ -564,15 +564,15 @@ tensor: - all - amax - amin + - angle - any - arccos - arccosh - - argmax - - angle - arcsin - arcsinh - arctan - arctanh + - argmax - argmin - argsort - asin @@ -582,19 +582,23 @@ tensor: - atanh - baddbmm - bernoulli + - bfloat16 - bincount - bitwise_and - bitwise_or - bitwise_xor - bmm - bool + - bool astype - broadcast_to + - byte - ceil - - cholesky_solve - cholesky + - cholesky_solve - clamp - clip - conj + - copy - copysign - cos - cosh @@ -606,11 +610,13 @@ tensor: - deg2rad - diag - diagflat + - diagonal - diff - digamma - div - div_ - divide + - double - equal - erf - erfc @@ -618,13 +624,16 @@ tensor: - exp - expand_as - expm1 + - flatten - flip - fliplr - flipud + - float - float_power - floor - fmod - frac + - from_numpy - gather_elements - ge - geqrf @@ -648,12 +657,12 @@ tensor: - inner - int - inverse + - is_complex + - is_signed - isclose - isfinite - isinf - isnan - - is_complex - - is_signed - isneginf - isposinf - isreal @@ -704,28 +713,27 @@ tensor: - new_ones - new_zeros - nextafter - - norm - nonzero + - norm - not_equal - ormqr - permute - pow - prod - qr + - rad2deg - ravel - real - reciprocal - remainder - renorm - - rad2deg - - tile - repeat_interleave - reshape - reshape - 
- round + - resize - rot90 + - round - rsqrt - - sum_to_size - scatter - sgn - short @@ -745,7 +753,8 @@ tensor: - sub - sub_ - subtract - - subtract + - sum + - sum_to_size - svd - swapaxes - swapdims @@ -753,13 +762,13 @@ tensor: - take - tan - tanh - - trace - - swapaxes + - tensor_split - tile + - to - topk - - tril - - tensor_split + - trace - transpose + - tril - true_divide - trunc - unbind @@ -769,17 +778,6 @@ tensor: - view - where - xlogy - - from_numpy - - std - - take - - var - - all - - any - - copy - - diagonal - - flatten - - resize - - sum mint.ops: - abs diff --git a/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cc b/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cc deleted file mode 100644 index b72d68741da491fc450c2d697a3ebfec895a3447..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cc +++ /dev/null @@ -1,140 +0,0 @@ -/** - * Copyright 2024 Huawei Technologies Co., Ltd - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ - -#include "hook_dynamic_loader.h" -#include -#include -#include -#include "utils/log_adapter.h" - -namespace { - -// Utility function to check if a file path is valid -bool IsValidPath(const std::string &path) { - struct stat fileStat; - if (stat(path.c_str(), &fileStat) != 0) { - MS_LOG(ERROR) << "File does not exist or cannot be accessed: " << path; - return false; - } - - if (S_ISLNK(fileStat.st_mode)) { - MS_LOG(ERROR) << "File is a symbolic link, which is not allowed: " << path; - return false; - } - - if (!S_ISREG(fileStat.st_mode)) { - MS_LOG(ERROR) << "File is not a regular file: " << path; - return false; - } - - if (path.substr(path.find_last_of(".")) != ".so") { - MS_LOG(ERROR) << "File is not a .so file: " << path; - return false; - } - - return true; -} - -} // namespace - -HookDynamicLoader &HookDynamicLoader::GetInstance() { - static HookDynamicLoader instance; - return instance; -} - -bool HookDynamicLoader::loadFunction(void *handle, const std::string &functionName) { - void *func = dlsym(handle, functionName.c_str()); - if (!func) { - MS_LOG(WARNING) << "Could not load function: " << functionName << ", error: " << dlerror(); - return false; - } - funcMap_[functionName] = func; - return true; -} - -bool HookDynamicLoader::validateLibraryPath(const std::string &libPath) { - char *realPath = realpath(libPath.c_str(), nullptr); - if (!realPath) { - MS_LOG(WARNING) << "Failed to resolve realpath for the library: " << libPath; - return false; - } - - bool isValid = IsValidPath(realPath); - free(realPath); // Free memory allocated by realpath - return isValid; -} - -bool HookDynamicLoader::LoadLibrary() { - const char *libPath = std::getenv("HOOK_TOOL_PATH"); - if (!libPath) { - MS_LOG(WARNING) << "HOOK_TOOL_PATH is not set!"; - return false; - } - - std::string resolvedLibPath(libPath); - if (!validateLibraryPath(resolvedLibPath)) { - MS_LOG(WARNING) << "Library path validation failed."; - return false; - } - - std::lock_guard lock(mutex_); - if 
(handle_) { - MS_LOG(WARNING) << "Hook library already loaded!"; - return false; - } - - handle_ = dlopen(resolvedLibPath.c_str(), RTLD_LAZY | RTLD_LOCAL); - if (!handle_) { - MS_LOG(WARNING) << "Failed to load Hook library: " << dlerror(); - return false; - } - - for (const auto &functionName : functionList_) { - if (!loadFunction(handle_, functionName)) { - MS_LOG(WARNING) << "Failed to load function: " << functionName; - dlclose(handle_); - handle_ = nullptr; - return false; - } - } - - MS_LOG(INFO) << "Hook library loaded successfully."; - return true; -} - -bool HookDynamicLoader::UnloadLibrary() { - std::lock_guard lock(mutex_); - if (!handle_) { - MS_LOG(WARNING) << "Hook library hasn't been loaded."; - return false; - } - - dlclose(handle_); - handle_ = nullptr; - funcMap_.clear(); - MS_LOG(INFO) << "Library unloaded successfully."; - return true; -} - -void *HookDynamicLoader::GetHooker(const std::string &funcName) { - std::lock_guard lock(mutex_); - auto iter = funcMap_.find(funcName); - if (iter == funcMap_.end()) { - MS_LOG(WARNING) << "Function not found: " << funcName; - return nullptr; - } - return iter->second; -} diff --git a/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cpp b/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cpp new file mode 100644 index 0000000000000000000000000000000000000000..6cd3d0c75b4e9e2ca8000db0866bfeaa5958a66f --- /dev/null +++ b/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.cpp @@ -0,0 +1,110 @@ +/** + * Copyright 2024 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include "hook_dynamic_loader.h" +#include +#include +#include +#include +#include "utils/log_adapter.h" + +namespace py = pybind11; + +HookDynamicLoader &HookDynamicLoader::GetInstance() +{ + static HookDynamicLoader instance; + return instance; +} + +bool HookDynamicLoader::LoadFunction(void *handle, const std::string &functionName) { + void *func = dlsym(handle, functionName.c_str()); + if (!func) { + MS_LOG(WARNING) << "Could not load function: " << functionName << ", error: " << dlerror(); + return false; + } + funcMap_[functionName] = func; + return true; +} + +bool HookDynamicLoader::LoadLibrary() +{ + std::string msprobePath = ""; + // 获取gil锁 + py::gil_scoped_acquire acquire; + try { + py::module msprobeMod = py::module::import("msprobe.lib._msprobe_c"); + if (!py::hasattr(msprobeMod, "__file__")) { + MS_LOG(WARNING) << "Adump mod not found"; + return false; + } + msprobePath = msprobeMod.attr("__file__").cast(); + } catch (const std::exception& e) { + MS_LOG(WARNING) << "Adump mod path unable to get: " << e.what(); + return false; + } + std::lock_guard lock(mutex_); + if (handle_) { + MS_LOG(WARNING) << "Hook library already loaded!"; + return false; + } + if (msprobePath == "") { + MS_LOG(WARNING) << "Adump path not loaded"; + return false; + } + handle_ = dlopen(msprobePath.c_str(), RTLD_LAZY | RTLD_LOCAL); + if (!handle_) { + MS_LOG(WARNING) << "Failed to load Hook library: " << dlerror(); + return false; + } + + for (const auto &functionName : functionList_) { + if (!LoadFunction(handle_, functionName)) { + MS_LOG(WARNING) << 
"Failed to load adump function"; + dlclose(handle_); + handle_ = nullptr; + return false; + } + } + + MS_LOG(INFO) << "Hook library loaded successfully."; + return true; +} + +bool HookDynamicLoader::UnloadLibrary() +{ + std::lock_guard lock(mutex_); + if (!handle_) { + MS_LOG(WARNING) << "Hook library hasn't been loaded."; + return false; + } + + dlclose(handle_); + handle_ = nullptr; + funcMap_.clear(); + MS_LOG(INFO) << "Library unloaded successfully."; + return true; +} + +void *HookDynamicLoader::GetHooker(const std::string &funcName) +{ + std::lock_guard lock(mutex_); + auto iter = funcMap_.find(funcName); + if (iter == funcMap_.end()) { + MS_LOG(WARNING) << "Function not found: " << funcName; + return nullptr; + } + return iter->second; +} diff --git a/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.h b/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.h index 6309e60b662a03d7f77cb450986ded5329fd8960..f1bcd84e70bc4a1e3bf4e164eb3da6374a60b3b6 100644 --- a/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.h +++ b/debug/accuracy_tools/msprobe/mindspore/dym_loader/hook_dynamic_loader.h @@ -27,27 +27,26 @@ constexpr auto kHookBegin = "MS_DbgOnStepBegin"; constexpr auto kHookEnd = "MS_DbgOnStepEnd"; class HookDynamicLoader { - public: - static HookDynamicLoader &GetInstance(); +public: + static HookDynamicLoader &GetInstance(); - HookDynamicLoader(const HookDynamicLoader &) = delete; - HookDynamicLoader &operator=(const HookDynamicLoader &) = delete; + HookDynamicLoader(const HookDynamicLoader &) = delete; + HookDynamicLoader &operator=(const HookDynamicLoader &) = delete; - bool LoadLibrary(); - bool UnloadLibrary(); - void *GetHooker(const std::string &funcName); + bool LoadLibrary(); + bool UnloadLibrary(); + void *GetHooker(const std::string &funcName); - private: - // Helper functions - bool loadFunction(void *handle, const std::string &functionName); - bool validateLibraryPath(const std::string 
&libPath); +private: + // Helper functions + bool LoadFunction(void *handle, const std::string &functionName); - HookDynamicLoader() = default; + HookDynamicLoader() = default; - void *handle_ = nullptr; - std::vector functionList_ = {kHookBegin, kHookEnd}; - std::map funcMap_; - std::mutex mutex_; + void *handle_ = nullptr; + std::vector functionList_ = {kHookBegin, kHookEnd}; + std::map funcMap_; + std::mutex mutex_; }; #endif // HOOK_DYNAMIC_LOADER_H diff --git a/debug/accuracy_tools/msprobe/mindspore/free_benchmark/api_pynative_self_check.py b/debug/accuracy_tools/msprobe/mindspore/free_benchmark/api_pynative_self_check.py index da4821b3ac45a689fab5ba5c63515f88bd6e17c3..8a2f5d3b6b35843801baad30c17acb4debb50760 100644 --- a/debug/accuracy_tools/msprobe/mindspore/free_benchmark/api_pynative_self_check.py +++ b/debug/accuracy_tools/msprobe/mindspore/free_benchmark/api_pynative_self_check.py @@ -75,7 +75,7 @@ class ApiPyNativeSelfCheck: ret = None if not need_wrapper_func(): - del cell.input_kwargs + del cell.msprobe_input_kwargs return ret api_name_with_id = api_name_with_id[:-1] @@ -84,9 +84,9 @@ class ApiPyNativeSelfCheck: api_name_with_id[api_name_with_id.find(Const.SEP) + 1:api_name_with_id.rfind(Const.SEP)]) if api_name in self.api_list: ret = check_self(api_name_with_id, output_data, self.ori_func.get(api_name), - *input_data, **cell.input_kwargs) + *input_data, **cell.msprobe_input_kwargs) - del cell.input_kwargs + del cell.msprobe_input_kwargs return ret def backward_hook(cell, grad_input, grad_output): diff --git a/debug/accuracy_tools/msprobe/mindspore/grad_probe/global_context.py b/debug/accuracy_tools/msprobe/mindspore/grad_probe/global_context.py index bbdff002201a6588860ed9ab437e280913a7a7c6..ca032e61e5b5cc0d98732ac0bca2d14f377ebfb1 100644 --- a/debug/accuracy_tools/msprobe/mindspore/grad_probe/global_context.py +++ b/debug/accuracy_tools/msprobe/mindspore/grad_probe/global_context.py @@ -69,6 +69,7 @@ class GlobalContext: 
create_directory(self._setting.get(GradConst.OUTPUT_PATH)) else: logger.warning("The output_path exists, the data will be covered.") + self._setting[GradConst.TIME_STAMP] = str(int(time.time())) def get_context(self, key: str): diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_analyse.py b/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_analyse.py new file mode 100644 index 0000000000000000000000000000000000000000..d9331d2ba9e2f8ae16d33a7daa5b0335faf39e9c --- /dev/null +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_analyse.py @@ -0,0 +1,63 @@ +# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os + +from msprobe.core.common.log import logger +from msprobe.core.common.const import MonitorConst +from msprobe.core.common.file_utils import save_json, create_directory, remove_path, \ + check_file_or_directory_path, load_json + + +class AnomalyDataWriter: + """ + 异常数据写入类,负责将异常数据写入到JSON文件中。 + """ + + def __init__(self, dump_path, rank) -> None: + self.dump_path = dump_path + self.dump_rank_dir = os.path.join(self.dump_path, f"rank{rank}") + self.json_path = os.path.join(self.dump_rank_dir, MonitorConst.ANOMALY_JSON) + + @staticmethod + def get_anomaly_dict(anomalies): + """将GradAnomalyData列表转换为json""" + anomalies_json = {} + for anomaly in anomalies: + anomalies_json.update({anomaly.get_key(): anomaly.to_dict()}) + return anomalies_json + + def init_detected_json(self): + """初始化落盘文件""" + create_directory(self.dump_rank_dir) + + if os.path.exists(self.json_path): + check_file_or_directory_path(self.json_path, isdir=False) + logger.warning(f"The existing file will be deleted: {self.json_path}.") + remove_path(self.json_path) + save_json(self.json_path, {}, indent=1) + + def write_detected_json(self, anomalies): + """ + 落盘异常数据 + Args: + anomalies: GradAnomalyData对象列表 + """ + anomalies_json = self.get_anomaly_dict(anomalies) + logger.info(f"{MonitorConst.ANOMALY_JSON} is at {self.dump_rank_dir}.") + + data_to_write = load_json(self.json_path) if os.path.exists(self.json_path) else {} + data_to_write.update(anomalies_json) + save_json(self.json_path, data_to_write, indent=1) diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_detect.py b/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_detect.py index 3544ebbd025614349585bc799b15e00a5c2c7956..bcd902b24de5e2b7441741017394f8ba4ebebb71 100644 --- a/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_detect.py +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/anomaly_detect.py @@ -16,6 +16,7 @@ import itertools import os import sys +import math import statistics as st from abc 
import ABC from dataclasses import dataclass, field @@ -25,6 +26,7 @@ from collections import defaultdict import pandas as pd from mindspore import ops +from mindspore import Tensor from mindspore import _no_grad from msprobe.core.common.log import logger from msprobe.core.common.file_utils import change_mode, create_directory, write_df_to_csv @@ -34,7 +36,7 @@ from msprobe.core.common.const import FileCheckConst, MonitorConst class ScanRule(ABC): name = "ScanRule" - def apply(self, history, cur): + def apply(self, cur, history=None): raise NotImplementedError("abstract method apply is not implemented") @@ -44,7 +46,7 @@ class AnomalyTurbulence(ScanRule): def __init__(self, threshold) -> None: self.threshold = threshold - def apply(self, history, cur): + def apply(self, cur, history=None): baseline = st.mean(history) if isinstance(history, list) else history up_bound = baseline + baseline * self.threshold @@ -54,6 +56,16 @@ class AnomalyTurbulence(ScanRule): return cur < up_bound +class AnomalyNan(ScanRule): + name = "AnomalyNan" + + def __init__(self, threshold=None) -> None: + self.threshold = threshold + + def apply(self, cur, history=None): + return math.isnan(cur) or (self.threshold is not None and abs(cur) > self.threshold) + + class AnomalyScanner: @staticmethod @@ -70,7 +82,7 @@ class AnomalyScanner: rule_args = spec.get("args") # 检查必要的键是否存在 - if rule_cls_name is None or rule_args is None: + if rule_cls_name is None or (rule_cls_name == "AnomalyTurbulence" and rule_args is None): logger.warning(f"Spec is missing required keys: {spec}") continue @@ -82,7 +94,7 @@ class AnomalyScanner: continue try: - rule_instance = rule_cls(**rule_args) + rule_instance = rule_cls(**rule_args) if rule_args is not None else rule_cls() alert_rules.append(rule_instance) except Exception as e: logger.error(f"Error creating instance of rule '{rule_cls_name}': {e}") @@ -94,7 +106,7 @@ class AnomalyScanner: def scan(scan_rules: List[ScanRule], history, cur): anomaly = False for 
rule in scan_rules: - anomaly = rule.apply(history, cur) + anomaly = rule.apply(cur, history=history) if anomaly: return anomaly, rule.name return anomaly, None @@ -162,9 +174,8 @@ class TrainStage: OPTIMIZER_STAGE = 2 -FORWARD_KEY = [MonitorConst.ACTV_IN, MonitorConst.ACTV_OUT] -BACKWARD_KEY = [MonitorConst.ACTVGRAD_IN, MonitorConst.ACTVGRAD_OUT, - MonitorConst.PRE_GRAD, MonitorConst.POST_GRAD, MonitorConst.ACC_GRAD] +FORWARD_KEY = [MonitorConst.ACTV] +BACKWARD_KEY = [MonitorConst.ACTVGRAD, MonitorConst.PRE_GRAD, MonitorConst.POST_GRAD, MonitorConst.ACC_GRAD] OPTIMIZER_KEY = [MonitorConst.EXP_AVG, MonitorConst.EXP_AVG_SQ] TRAIN_STAGE = { **{key_: TrainStage.FORWARD_STAGE for key_ in FORWARD_KEY}, @@ -222,7 +233,7 @@ class GradAnomalyData: @staticmethod def get_train_stage(tag_name): """ - :param tag_name: "0:fc2_0/rank0/input", "0:fc1.weight/rank0/post_grad", "0:fc2.weight/rank0/exp_avg_sq" + :param tag_name: "0:fc2.input:0/rank0/actv", "0:fc1.weight/rank0/post_grad", "0:fc2.weight/rank0/exp_avg_sq" :return: int, if forward return 0; if backward return 1; if optimizer return 2 """ key_ = tag_name.split("/")[-1] @@ -255,6 +266,40 @@ class BaseWriterWithAD: self.anomalies = [] self.ndigits = writer_input.ndigits + @staticmethod + def stack_tensors(tensor_list): + """ + Torch not support stack cpu and xpu tensors. Group the tensors into cpu_group and xpu_group, + stack them separately, migrate xpu_group to cpu, and then restore in the order of input. 
+ + :param tensor_list: [tensor(-1.6165), tensor(-1.0985), tensor(-1.7777), tensor(-1.8408, device='npu:0')] + :return: result: list of float + """ + cpu_tensors = [] + xpu_tensors = [] + + for tensor in tensor_list: + if isinstance(tensor, Tensor): + # 将device上的tensor先stack后to cpu + xpu_tensors.append(tensor) + else: + cpu_tensors.append(tensor) + + xpu_stack = ops.stack(xpu_tensors).tolist() if xpu_tensors else ops.tensor([]) + + # 按照输入的顺序恢复 + result = [] + cpu_tensors_idx, xpu_tensors_idx = 0, 0 + for tensor in tensor_list: + if isinstance(tensor, Tensor): + result.append(xpu_stack[xpu_tensors_idx]) + xpu_tensors_idx += 1 + else: + result.append(cpu_tensors[cpu_tensors_idx]) + cpu_tensors_idx += 1 + + return result + def get_anomalies(self): """返回已检测到的异常列表 """ @@ -290,8 +335,12 @@ class BaseWriterWithAD: tags = list(itertools.product(metric_value.keys(), op_list)) for op2tensor in metric_value.values(): tensors.extend(op2tensor.values()) + + if not tensors: + return + with _no_grad(): - metric_list = ops.stack(tensors).tolist() if tensors else [] + metric_list = self.stack_tensors(tensors) for tag, metric in zip(tags, metric_list): self.add_scalar(tag, metric, step, need_explain) @@ -353,10 +402,9 @@ class CSVWriterWithAD(BaseWriterWithAD): new_data = [] for name, metric_value in self.context_dict.items(): - if MonitorConst.NAME_SEP not in name: - new_data.append([name] + [step] + metric_value) - else: - new_data.append(name.split(MonitorConst.NAME_SEP) + [step] + metric_value) + new_line = name.split(MonitorConst.NAME_SEP) + metric_value + new_line.insert(2, step) + new_data.append(new_line) new_data = pd.DataFrame(new_data).round(self.ndigits) write_df_to_csv(new_data, filepath, mode='a+', header=False) self.context_dict = defaultdict(list) @@ -379,26 +427,11 @@ class CSVWriterWithAD(BaseWriterWithAD): need_explain = prefix == 'other' super().write_metrics(op_list, metric_value, step, prefix='', need_explain=need_explain) - # generate csv headers - # set 
hashmap to reduce the number of headers generated. - # 前向的norm用input.ops_和output.ops_,反向的用input_grad.ops_和output_grad.ops_ - if prefix in {"actv", "actv_grad"}: - if prefix == "actv": - input_and_output = [MonitorConst.ACTV_IN, MonitorConst.ACTV_OUT] - else: - input_and_output = [MonitorConst.ACTVGRAD_IN, MonitorConst.ACTVGRAD_OUT] - ops_ = [MonitorConst.DOT.join(i) for i in itertools.product(input_and_output, op_list)] - csv_header = ["module_name", "step", *ops_] + if prefix in [MonitorConst.ACTV, MonitorConst.ACTVGRAD]: + self.header = MonitorConst.CSV_HEADER_XY + op_list else: - csv_header = ["param_name", "step", *op_list] - - keys = list(metric_value.keys()) - if keys and MonitorConst.NAME_SEP in keys[0]: - csv_header.insert(0, "vpp_stage") - - self.header = csv_header + self.header = MonitorConst.CSV_HEADER + op_list self.write_csv(prefix, step) - self.header = [] def close(self): pass diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/common_func.py b/debug/accuracy_tools/msprobe/mindspore/monitor/common_func.py new file mode 100644 index 0000000000000000000000000000000000000000..ef72a75ca246a8943bf580ba490465d2cca2c09b --- /dev/null +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/common_func.py @@ -0,0 +1,91 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +from mindspore import nn +from mindspore import communication +from msprobe.mindspore.monitor.utils import logger +from msprobe.mindspore.common.utils import is_mindtorch +if is_mindtorch(): + import torch + + +def is_valid_instance(model): + return isinstance(model, torch.nn.Module) if is_mindtorch() else isinstance(model, nn.Cell) + + +def get_submodules(model): + if not is_valid_instance(model): + logger.info("Encountered invalid model, nothing to hook") + return {} + return model.named_modules() if is_mindtorch() else model.cells_and_names() + + +def get_parameters(model): + if not is_valid_instance(model): + return {} + if is_mindtorch(): + return model.named_parameters() + else: + return model.parameters_and_names() + + +def get_rank(): + if comm_is_initialized(): + return communication.get_rank() + return 0 + + +def comm_is_initialized(): + return communication.GlobalComm.INITED + + +def optimizer_pre_hook(optimizer, fn): + """ + fn should be fn(optimizer, args, kwargs) + """ + if is_mindtorch(): + origin_api = optimizer.__class__.step + + def patch_step(func, optimizer): + def wrapper(*args, **kwargs): + fn(optimizer, args, kwargs) + out = func(*args, **kwargs) + return out + return wrapper + optimizer.__class__.step = patch_step(optimizer.__class__.step, optimizer) + return (optimizer.__class__.step, origin_api) + + else: + handle = optimizer.register_forward_pre_hook(fn) + return handle + + +def optimizer_post_hook(optimizer, fn): + if is_mindtorch(): + origin_api = optimizer.__class__.step + + def patch_step(func, optimizer): + def wrapper(*args, **kwargs): + out = func(*args, **kwargs) + fn(optimizer, args, kwargs) + return out + return wrapper + optimizer.__class__.step = patch_step(optimizer.__class__.step, optimizer) + return (optimizer.__class__.step, origin_api) + + else: + handle = optimizer.register_forward_hook(fn) + return handle diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/features.py 
b/debug/accuracy_tools/msprobe/mindspore/monitor/features.py index 0eae3eda7395b4b9725f6f97f5259a2804748b9c..997e39f1ecb6346fcbcd33c5bcf63d9c6731ec63 100644 --- a/debug/accuracy_tools/msprobe/mindspore/monitor/features.py +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/features.py @@ -56,10 +56,20 @@ def get_nans(t): return ops.isnan(t.astype(mstype.float32)).sum() +def get_shape(t): + return t.shape + + +def get_dtype(t): + return t.dtype + + FUNC_MAP = {"min" : get_min, "max" : get_max, "mean" : get_mean, "norm" : get_norm, "nans" : get_nans, - "zeros": get_zeros + "zeros": get_zeros, + "shape": get_shape, + "dtype": get_dtype } \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/module_hook.py b/debug/accuracy_tools/msprobe/mindspore/monitor/module_hook.py index 807c3a9046a2c56bc68dfd060405b6df60486a24..6da4b8616e7d8986a7bf38dd2bcfce1e15bc3c95 100644 --- a/debug/accuracy_tools/msprobe/mindspore/monitor/module_hook.py +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/module_hook.py @@ -20,21 +20,21 @@ from collections import defaultdict from datetime import datetime import pytz -import mindspore as ms from mindspore import Tensor, mint from mindspore import nn, _no_grad -from mindspore.communication import get_rank from msprobe.core.common.log import logger -from msprobe.core.common.const import MonitorConst +from msprobe.core.common.const import MonitorConst, Const from msprobe.core.common.file_utils import load_json, save_json +from msprobe.mindspore.common.utils import is_mindtorch +from msprobe.mindspore.monitor.common_func import is_valid_instance, get_parameters, get_submodules, get_rank from msprobe.mindspore.monitor.utils import get_summary_writer_tag_name, validate_config, step_accumulates_one, \ - is_skip_step, get_metrics, get_single_metrics, get_target_output_dir -from msprobe.mindspore.monitor.module_spec_verifier import validate_config_spec + is_skip_step, get_metrics, get_target_output_dir +from 
msprobe.mindspore.monitor.optimizer_collect import OptimizerMonFactory from msprobe.mindspore.monitor.anomaly_detect import AnomalyScanner, AnomalyDataFactory, \ CSVWriterWithAD, BaseWriterWithAD, WriterInput -from msprobe.mindspore.monitor.distributed.wrap_distributed import api_register, create_hooks, op_aggregate, \ - get_process_group +from msprobe.mindspore.monitor.anomaly_analyse import AnomalyDataWriter +from msprobe.mindspore.monitor.distributed.wrap_distributed import api_register, create_hooks, op_aggregate FORMAT_MAPPING = { MonitorConst.CSV: CSVWriterWithAD, @@ -88,24 +88,6 @@ class ModuleHookContext: self.actvgrad = [] self.module_name = module_name self.struct = {} - self.format_by_arg = {} - self.verified = False - self.focused_in_col = 0 - self.focused_out_col = 0 - self.ignore_in = False # no need to care when no key 'input' or 'input_grad' found - - def set_format_by_arg(self, key_name: str, target_config: dict): - cared = target_config.get(self.module_name, self.struct) - if key_name in cared: - if isinstance(cared[key_name], dict): - # current cared is self.struct - config = cared[key_name].get('config') - self.format_by_arg[key_name] = config - else: - # current cared is target_config[self.module_name] - self.format_by_arg[key_name] = cared[key_name] - elif key_name in ['input', 'input_grad']: - self.ignore_in = True def reset(self): self.actv.clear() @@ -186,6 +168,7 @@ class TrainerMon: self.config_file_path = config_file_path self.process_group = process_group self.params_have_main_grad = params_have_main_grad + self.is_mindtorch = is_mindtorch() self.config_timestamp = 0 # 后面有校验时间戳, 首次监控无需为了更新config文件时间戳而去改, 可通过dynamic_on开关直接打开 self.config = load_json(config_file_path) validate_config(self.config) @@ -218,6 +201,7 @@ class TrainerMon: self.dp_group = None self.tp_group = None self.micro_batch_number = 1 + self.optimizer_mon = None # TYPE3: 会随着训练中途config配置更新或监控状态改变而重置的变量 self.module_fwd_hook_context_by_module = defaultdict(ModuleHookContext) 
@@ -240,6 +224,8 @@ class TrainerMon: self.optimizer_hooked = False self.param_registered = False self.struct_printed = False + self.pre_step_hooks = [] + self.post_step_hooks = [] # 动静态区分 self.dynamic_enable = os.getenv("DYNAMIC_MONITOR", 'False').lower() == 'true' @@ -296,18 +282,25 @@ class TrainerMon: if self.format not in FORMAT_MAPPING: logger.error(f"Unsupported format: {self.format}, use default format: {MonitorConst.CSV}") self.format = MonitorConst.CSV - writer = FORMAT_MAPPING[self.format] self.step_count_per_record = self.config.get('step_count_per_record', 1) - self.summary_writer = writer( - WriterInput( - self.tensorboard_dir, - self.alert_rules, - self.unique_id, - self.anomaly_data_factory, - self.ndigits, - self.step_count_per_record + if not self.module_rank_list or (self.rank in self.module_rank_list): + writer = FORMAT_MAPPING[self.format] + self.summary_writer = writer( + WriterInput( + self.tensorboard_dir, + self.alert_rules, + self.unique_id, + self.anomaly_data_factory, + self.ndigits, + self.step_count_per_record + ) ) - ) + + # 初始化anomaly detected文件目录 + if self.anomaly_data_factory: + self.anomaly_data_writer = AnomalyDataWriter(os.path.join(self.output_base_dir, "anomaly_detected"), + self.rank) + self.anomaly_data_writer.init_detected_json() def common_info(self): if not self.xy_distribution: @@ -339,6 +332,7 @@ class TrainerMon: self.micro_batch_number = grad_acc_steps self.dp_group = dp_group self.tp_group = tp_group + self.optimizer_mon = OptimizerMonFactory.create_optimizer_mon(optimizer) self.hook_step_final(optimizer) if not isinstance(model, list): model = [model] @@ -359,6 +353,9 @@ class TrainerMon: context.step - self.start_step) % self.step_interval == 0) if module_rank_valid and step_condition: self.has_collect_times += 1 + + if self.anomaly_data_factory: + self.anomaly_data_factory.set_call_id(self.param_name_call_id) self.write_xy_tb(context.step) self.write_grad_tb(context.step) self.write_mv_tb(context) @@ -368,7 
+365,10 @@ class TrainerMon: self.summary_writer.write_metrics(self.ops, context.metric_dict, context.step, 'other') context.metric_dict.clear() + if self.anomaly_data_factory: + self.anomaly_data_writer.write_detected_json(self.summary_writer.get_anomalies()) self.summary_writer.clear_anomalies() + self.call_id = 0 self.param_name_call_id.clear() @@ -378,7 +378,23 @@ class TrainerMon: context.step += 1 self.dynamic_monitor(optimizer) - optimizer.register_forward_hook(step_final_hook) + + def patch_step(func, optimizer): + def wrapper(*args, **kwargs): + for hook in self.pre_step_hooks: + hook(optimizer, args, kwargs) + out = func(*args, **kwargs) + for hook in self.post_step_hooks: + hook(optimizer, args, kwargs) + step_final_hook(optimizer, args, kwargs) + return out + return wrapper + + if self.is_mindtorch: + optimizer.__class__.step = patch_step(optimizer.__class__.step, optimizer) + else: + optimizer.__class__.construct = patch_step(optimizer.__class__.construct, optimizer) + return def dynamic_monitor(self, optimizer): @@ -413,7 +429,7 @@ class TrainerMon: logger.error(f"set config wrong because {e}, not updated, please check!!!") return - self._remove_all_hooks() + self._remove_all_hooks(optimizer) self.register_hooks(optimizer) def register_hooks(self, optimizer): @@ -438,45 +454,36 @@ class TrainerMon: hooked_count = 0 for vpp_stage, model_chunk in enumerate(self.model): - if not isinstance(model_chunk, nn.Cell): + if not is_valid_instance(model_chunk): logger.info("Target Model is not Cell") continue vpp_stage = f'{vpp_stage}{MonitorConst.NAME_SEP}' - targets = [x for x, _ in model_chunk.cells_and_names()] if self.print_struct else self.targets.keys() + targets = [x for x, _ in get_submodules(model_chunk)] if self.print_struct else self.targets.keys() hooked_count += self._hook_module(targets, model_chunk, vpp_stage) logger.info(f"> {hooked_count} modules are monitored.") def hook_optimizer(self, optimizer): - def optimizer_pre_hook_function(opt, 
grad_names, gradients): + def optimizer_pre_step_hook(opt, *args, **kwargs): context = self.optimizer_context[opt] if is_skip_step(context.step, self.start_step, self.step_interval, self.has_collect_times, self.collect_times): return - gradient_list = gradients[0] if isinstance(gradients, tuple) else gradients - is_select = self.is_select - for idx, grad in enumerate(gradient_list): - grad_name = grad_names[idx] - if is_select and grad_name not in self.targets: - continue - get_single_metrics(self.ops, grad_name, grad, context.param_weight_grad) - - if self.mv_distribution: - # fetch mean - for param in m_list: - name = param.name - if is_select and name not in self.targets: - continue - get_single_metrics(self.ops, name, param, context.exp_avg_metric) - # fetch variance - for param in v_list: - name = param.name - if is_select and name not in self.targets: - continue - get_single_metrics(self.ops, name, param, context.exp_avg_sq_metric) - if self.param_distribution: - for param in param_list: - get_single_metrics(self.ops, param.name, param, context.param_metric) - self.generate_wgrad_metrics() + + grad_dict = {} + if self.wg_distribution: + grad_dict = self.optimizer_mon.fetch_grad(self, self.param2name) + + if self.mv_distribution or self.ur_distribution or self.mg_direction: + if self.is_mindtorch: + context.param_exp_avg, context.param_exp_avg_sq, context.param_adam_update, \ + context.param_adam_ratio = self.optimizer_mon.fetch_mv(self, self.param2name) + else: + context.param_exp_avg, context.param_exp_avg_sq = self.get_mv_for_ms(optimizer) + + self.generate_wgrad_metrics(grad_dict) + self.generate_mv_metrics(context) + self.generate_param_metrics(context, MonitorConst.PRE_PARAM) + metric_dict = {} for cc in self.cc_context.values(): cc.aggregate() @@ -488,63 +495,78 @@ class TrainerMon: context.metric_dict = metric_dict return - def optimizer_pre_hook_wrapper(func, grad_names): - def wrapper(opt, gradients): - return func(opt, grad_names, gradients) - 
return wrapper + def optimizer_post_step_hook(optimizer, args, kwargs): + context = self.optimizer_context[optimizer] + self.generate_param_metrics(context, MonitorConst.POST_PARAM) + if self.optimizer_hooked or not self.is_target_rank(): return - - m_list = [] - v_list = [] - param_list = [] - grad_names = [] - for param in optimizer.get_parameters(): - if MonitorConst.EXP_AVG_SQ in param.name: - v_list.append(param) - elif MonitorConst.EXP_AVG in param.name: - m_list.append(param) - elif param.name in ['global_step', 'learning_rate']: - pass - else: - param_list.append(param) - grad_names.append(param.name) - - handle = optimizer.register_forward_pre_hook( - optimizer_pre_hook_wrapper(optimizer_pre_hook_function, grad_names)) - self.handles['optimizer'].append(handle) + + self.pre_step_hooks.append(optimizer_pre_step_hook) + self.post_step_hooks.append(optimizer_post_step_hook) self.optimizer_hooked = True return - def generate_wgrad_metrics(self): + def generate_wgrad_metrics(self, grad_dict): if not self.wg_distribution: - return {}, {} + return if self.weight_hooked: - try: - get_metrics(self.ops, self.grad_context.acc, self.eps, self.grad_context.acc_metric) - except Exception as e: - logger.warning(f"An error occurred while generating wgrad pre metrics") - return {}, {} + get_metrics(self.ops, self.grad_context.acc, self.eps, self.grad_context.acc_metric) - grad_dict = {} - for param, name in self.param2name.items(): - if self.duplicate_param.get(name, False): - continue - grad = param.main_grad if self.params_have_main_grad else param.grad - if grad is None: - logger.warning(f"grad is None: {name}, maybe something wrong happened.") + get_metrics(self.ops, grad_dict, self.eps, self.grad_context.post) + + def generate_param_map(self, tag, param_tensor): + metrics = {} + if not self.is_mindtorch: + return param_tensor + for name in self.param2name.values(): + key = get_summary_writer_tag_name(name, tag, self.rank) + 
self.register_param_call_id("optimizer_pre_step_hook", key) + if name not in param_tensor or param_tensor[name] is None: continue - tag = self.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) - self._register_param_call_id("hook_optimizer", tag) - grad_dict[tag] = grad - try: - get_metrics(self.ops, grad_dict, self.eps, self.grad_context.post) - except Exception as e: - logger.warning(f"An error occurred while generating wgrad post metrics") + metrics[key] = param_tensor[name] + return metrics + + def generate_param_metrics(self, opt_context, stage=MonitorConst.PRE_PARAM): + if not self.param_distribution: + return + tag2param = { + self.name2tag.get(name, {}).get(stage): param + for name, param in self.name2param.items() + if param.numel() != 0 + } + get_metrics(self.ops, tag2param, self.eps, opt_context.param_metric) + + def get_mv_for_ms(self, opt): + if not self.mv_distribution: return {}, {} - return self.grad_context.post, self.grad_context.pre + common_opt = opt + if not is_valid_instance(opt): + common_opt = getattr(opt, 'optimizer') + if not is_valid_instance(common_opt): + logger.warning("Optimizer is not valid, please check usage") + return {}, {} + m_dict = {} + v_dict = {} + for name, param in get_parameters(common_opt): + if MonitorConst.EXP_AVG_SQ in name: + v_dict[name] = param + elif MonitorConst.EXP_AVG in name: + m_dict[name] = param + return m_dict, v_dict + + def generate_mv_metrics(self, opt_context): + if not self.mv_distribution: + return + opt_context.exp_avg_metric = {} + opt_context.exp_avg_sq_metric = {} + m_tag_tensor_map = self.generate_param_map(MonitorConst.EXP_AVG, opt_context.param_exp_avg) + v_tag_tensor_map = self.generate_param_map(MonitorConst.EXP_AVG_SQ, opt_context.param_exp_avg_sq) + get_metrics(self.ops, m_tag_tensor_map, self.eps, opt_context.exp_avg_metric) + get_metrics(self.ops, v_tag_tensor_map, self.eps, opt_context.exp_avg_sq_metric) + def write_xy_tb(self, step): if not self.xy_distribution: @@ -552,21 +574,25 @@ 
class TrainerMon: for _, fwd_context in self.module_fwd_hook_context_by_module.items(): if len(fwd_context.actv) == 0: continue - self.summary_writer.write_metrics(self.ops, fwd_context.actv, step, 'actv') + self.summary_writer.write_metrics(self.ops, fwd_context.actv, step, MonitorConst.ACTV) fwd_context.actv.clear() if self.grad_context.actv: - self.summary_writer.write_metrics(self.ops, self.grad_context.actv, step, 'actv_grad') + self.summary_writer.write_metrics(self.ops, self.grad_context.actv, step, MonitorConst.ACTVGRAD) def write_param_tb(self, opt_context): if not self.param_distribution: return - self.summary_writer.write_metrics(self.ops, opt_context.param_metric, opt_context.step, 'param') + param_metrics = {k: v for k, v in opt_context.param_metric.items() if MonitorConst.PRE_PARAM in k} + updated_param_metrics = {k: v for k, v in opt_context.param_metric.items() if MonitorConst.POST_PARAM in k} + self.summary_writer.write_metrics(self.ops, param_metrics, opt_context.step, MonitorConst.PRE_PARAM) + self.summary_writer.write_metrics(self.ops, updated_param_metrics, opt_context.step, MonitorConst.POST_PARAM) def write_mv_tb(self, opt_context): if not self.mv_distribution: return - self.summary_writer.write_metrics(self.ops, opt_context.exp_avg_metric, opt_context.step, 'exp_avg') - self.summary_writer.write_metrics(self.ops, opt_context.exp_avg_sq_metric, opt_context.step, 'exp_avg_sq') + self.summary_writer.write_metrics(self.ops, opt_context.exp_avg_metric, opt_context.step, MonitorConst.EXP_AVG) + self.summary_writer.write_metrics(self.ops, opt_context.exp_avg_sq_metric, opt_context.step, + MonitorConst.EXP_AVG_SQ) def write_grad_tb(self, step): if not self.wg_distribution: @@ -580,13 +606,38 @@ class TrainerMon: return False return True - def build_tbtag_tensor_map(self, module_name, tag, tensor): - metrics = {} - key = get_summary_writer_tag_name(module_name, tag, str(self.rank)) + def build_tbtag_tensor_map(self, module_name, suffix, tag, tensor): 
+ """ + :param module_name: str of module name + :param suffix: + :param tag: + :param tensor: torch.tensor or tuple/list of torch.tensor + :return: tensor_map + """ + tensor_map = {} if isinstance(tensor, Tensor): - self._register_param_call_id("_hook_module", key) - metrics[key] = tensor - return metrics + tensor = [tensor] + if isinstance(tensor, tuple) or isinstance(tensor, list): + if len(tensor) == 1: + key = get_summary_writer_tag_name(module_name + suffix, tag, self.rank) + self.register_param_call_id("_hook_module", key) + tensor_map[key] = tensor[0] + else: + for i, tensor_i in enumerate(tensor): + key = get_summary_writer_tag_name(module_name + f"_{i}" + suffix, tag, self.rank) + self.register_param_call_id("_hook_module", key) + tensor_map[key] = tensor_i + return tensor_map + + def register_param_call_id(self, hook_name: str, key: str): + """ + :param hook_name: + :param key: str, '0:relu_0/output_grad' + :return: + """ + logger.debug(f"{hook_name} {key}: {self.call_id}") + self.param_name_call_id[key] = self.call_id + self.call_id += 1 def _register_param_name(self): for vpp_stage, model_chunk in enumerate(self.model): @@ -595,8 +646,7 @@ class TrainerMon: def _register_chunk(self, model_chunk, prefix): index = 0 - for param in model_chunk.get_parameters(): - param_name = param.name + for param_name, param in get_parameters(model_chunk): if not param.requires_grad: continue if self._is_target_param(param_name, param, prefix): @@ -611,25 +661,37 @@ class TrainerMon: self.duplicate_param[name] = True if self.dp_group and param_is_data_parallel_duplicate(self.dp_group): self.duplicate_param[name] = True + keywords = [ + MonitorConst.PRE_GRAD, + MonitorConst.POST_GRAD, + MonitorConst.PRE_PARAM, + MonitorConst.POST_PARAM + ] self.name2tag[name] = { - MonitorConst.PRE_GRAD: get_summary_writer_tag_name(name, MonitorConst.PRE_GRAD, self.rank), - MonitorConst.POST_GRAD: get_summary_writer_tag_name(name, MonitorConst.POST_GRAD, self.rank) + k: 
get_summary_writer_tag_name(name, k, self.rank) + for k in keywords } index += 1 def _hook_module(self, target_names, module, vpp_stage=''): - if not isinstance(module, nn.Cell): + if not is_valid_instance(module): # nothing to hook return 0 - def fwd_hook_fun(module, module_input, module_output, name): + def fwd_hook_fun(module, args, kwargs, module_output, name): + + module_input = [tensor for tensor in args if isinstance(tensor, Tensor)] + if kwargs: + kwargs_tensors = [tensor for tensor in kwargs.values() if isinstance(tensor, Tensor)] + module_input.extend(kwargs_tensors) + if module not in self.module_fwd_hook_context_by_module: self.module_fwd_hook_context_by_module[module] = ModuleHookContext(name) context: ModuleHookContext = self.module_fwd_hook_context_by_module[module] if not context.struct: context.struct = { - MonitorConst.ACTV_IN: get_param_struct(module_input), - MonitorConst.ACTV_OUT: get_param_struct(module_output) + Const.INPUT: get_param_struct(module_input), + Const.OUTPUT: get_param_struct(module_output) } if self.print_struct: self.module_struct[context.module_name].update(context.struct) @@ -640,31 +702,16 @@ class TrainerMon: self.collect_times): step_accumulates_one(context, self.micro_batch_number) return - if not context.format_by_arg: - context.set_format_by_arg(MonitorConst.ACTV_IN, self.targets) - context.set_format_by_arg(MonitorConst.ACTV_OUT, self.targets) - if not context.format_by_arg: - return - if not context.verified: - if not context.ignore_in: - context.focused_in_col = validate_config_spec(context.format_by_arg[MonitorConst.ACTV_IN], - module_input, context.module_name, - MonitorConst.ACTV_IN) - context.focused_out_col = validate_config_spec(context.format_by_arg[MonitorConst.ACTV_OUT], - module_output, context.module_name, - MonitorConst.ACTV_OUT) - context.verified = True tbtag_tensor_map = {} - if not context.ignore_in: - cared_input = module_input if context.focused_in_col is None else 
module_input[context.focused_in_col] - tbtag_tensor_map.update( - self.build_tbtag_tensor_map(f'{context.module_name}_{context.micro_step}', MonitorConst.ACTV_IN, - cared_input)) - cared_output = module_output if context.focused_out_col is None else module_output[context.focused_out_col] tbtag_tensor_map.update( - self.build_tbtag_tensor_map(f'{context.module_name}_{context.micro_step}', MonitorConst.ACTV_OUT, - cared_output)) + self.build_tbtag_tensor_map( + f'{context.module_name}.{Const.INPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, module_input)) + tbtag_tensor_map.update( + self.build_tbtag_tensor_map( + f'{context.module_name}.{Const.OUTPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, module_output)) try: get_metrics(self.ops, tbtag_tensor_map, self.eps, context.actv) except Exception as e: @@ -689,31 +736,16 @@ class TrainerMon: step_accumulates_one(context, self.micro_batch_number) return - if not context.format_by_arg: - context.set_format_by_arg(MonitorConst.ACTVGRAD_IN, self.targets) - context.set_format_by_arg(MonitorConst.ACTVGRAD_OUT, self.targets) - if not context.format_by_arg: - return - if not context.verified: - if not context.ignore_in: - context.focused_in_col = validate_config_spec(context.format_by_arg[MonitorConst.ACTVGRAD_IN], - input_grad, context.module_name, - MonitorConst.ACTVGRAD_IN) - context.focused_out_col = validate_config_spec(context.format_by_arg[MonitorConst.ACTVGRAD_OUT], - output_grad, context.module_name, - MonitorConst.ACTVGRAD_OUT) - context.verified = True - tbtag_tensor_map = {} - if not context.ignore_in: - cared_input_grad = input_grad if context.focused_in_col is None else input_grad[context.focused_in_col] - tbtag_tensor_map.update( - self.build_tbtag_tensor_map( - f'{context.module_name}_{context.micro_step}', MonitorConst.ACTVGRAD_IN, cared_input_grad)) - cared_output_grad = output_grad if context.focused_out_col is None else 
output_grad[context.focused_out_col] tbtag_tensor_map.update( - self.build_tbtag_tensor_map(f'{context.module_name}_{context.micro_step}', MonitorConst.ACTVGRAD_OUT, - cared_output_grad)) + self.build_tbtag_tensor_map( + f'{context.module_name}.{Const.INPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTVGRAD, input_grad)) + + tbtag_tensor_map.update( + self.build_tbtag_tensor_map( + f'{context.module_name}.{Const.OUTPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTVGRAD, output_grad)) if context.micro_step == 0 and context.actvgrad: logger.warning(f"actvgrad context of {context.module_name} is not empty when first micro_step, " @@ -728,20 +760,21 @@ class TrainerMon: return def fwd_hook_fun_wrapper(fwd_hook_fun, name): - def wrapper(module, module_input, module_output): - return fwd_hook_fun(module, module_input, module_output, name) + def wrapper(module, args, kwargs, module_output): + return fwd_hook_fun(module, args, kwargs, module_output, name) return wrapper if self.backward_only and self.forward_only: logger.warning('not enable backward_only and forward_only simultaneously') hooked_count = 0 if self.xy_distribution or self.print_struct: - for module_name, submodule in module.cells_and_names(): + for module_name, submodule in get_submodules(module): name = self._is_target_module(module_name, target_names, vpp_stage) if not name: continue if not self.backward_only: - handle = submodule.register_forward_hook(fwd_hook_fun_wrapper(fwd_hook_fun, name=name)) + handle = submodule.register_forward_hook(fwd_hook_fun_wrapper(fwd_hook_fun, name=name), + with_kwargs=True) self.handles['xy'].append(handle) if not self.forward_only: handle = submodule.register_backward_hook(bwd_hook_fun) @@ -762,7 +795,7 @@ class TrainerMon: @_no_grad() def param_hook(grad, context_dict, param, key): param.micro_step += 1 - self._register_param_call_id("param_hook", key) + self.register_param_call_id("param_hook", key) if param.micro_step == 
self.micro_batch_number: param.micro_step = 0 context_dict[key] = grad @@ -801,17 +834,7 @@ class TrainerMon: return pattern return "" - def _register_param_call_id(self, hook_name: str, key: str): - """ - :param hook_name: - :param key: str, '0:relu_0/output_grad' - :return: - """ - logger.debug(f"{hook_name} {key}: {self.call_id}") - self.param_name_call_id[key] = self.call_id - self.call_id += 1 - - def _remove_all_hooks(self): + def _remove_all_hooks(self, optimizer): # 清空hook handle for handle in self.handles['xy']: handle.remove() @@ -829,9 +852,8 @@ class TrainerMon: self.weight_hooked = False if self.optimizer_hooked: - for handle in self.handles['optimizer']: - handle.remove() - self.handles['optimizer'].clear() + self.pre_step_hooks.clear() + self.post_step_hooks.clear() for _, context in self.optimizer_context.items(): context.reset() self.optimizer_hooked = False @@ -870,4 +892,4 @@ class TrainerMon: except Exception as e: logger.warning(f"Finish monitor, set config'dynamic_on=False fail because {e}, please check!!!") logger.info("Finish monitor") - self._remove_all_hooks() + self._remove_all_hooks(optimizer) diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/module_spec_verifier.py b/debug/accuracy_tools/msprobe/mindspore/monitor/module_spec_verifier.py deleted file mode 100644 index c06e8ea10f6a2178c3670e596ad64e333db44cab..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/mindspore/monitor/module_spec_verifier.py +++ /dev/null @@ -1,94 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import re -import abc -from mindspore import Tensor - -from msprobe.core.common.log import logger - - -# 用于存储所有validator实现类的注册表 -config_validator_registry = {} - - -def register_config_validator(cls): - """装饰器 用于注册ConfigValidator的实现类""" - config_validator_registry[cls.__name__] = cls - return cls - - -class ConfigValidator(metaclass=abc.ABCMeta): - @abc.abstractmethod - def check_pattern_match(self, config_spec: str): - pass - - @abc.abstractmethod - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - pass - - -@register_config_validator -class TensorValidator(ConfigValidator): - def check_pattern_match(self, config_spec: str): - pattern = re.compile(r"tensor") - return pattern.match(config_spec) - - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - if not isinstance(actual_data, Tensor): - raise ValueError( - f"Format of {module_name} {data_type} does not match the required format 'tensor' in config.") - - -@register_config_validator -class TupleValidator(ConfigValidator): - def check_pattern_match(self, config_spec: str): - pattern = re.compile(r"tuple\[(\d+)\]:?(\d+)?") - return pattern.match(config_spec) - - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - length, index = pattern_match.groups() - if index is None: - index = 0 - length, index = int(length), int(index) - - if not (0 <= index < length): - raise ValueError( - f"Format of {module_name} {data_type} in config.json does not match the required format 'tuple[x]:y'." 
- f"y must be greater than or equal to 0 and less than x.") - if not isinstance(actual_data, tuple): - raise ValueError( - f"Type of {module_name} {data_type} does not match spec of config.json, should be tuple, please check.") - if len(actual_data) != length: - raise ValueError( - f"Length of {module_name} {data_type} does not match spec of config.json, should be {length}, " - f"actual is {len(actual_data)} please check.") - return index - - -def validate_config_spec(config_spec: str, actual_data, module_name: str, data_type: str): - focused_col = None - for _, validator_cls in config_validator_registry.items(): - config_validator = validator_cls() - pattern_match = config_validator.check_pattern_match(config_spec) - if pattern_match: - try: - focused_col = config_validator.validate(actual_data, module_name, data_type, pattern_match) - except ValueError as e: - logger.warning(f"config spec validate failed: {str(e)}") - return focused_col - logger.warning(f"config spec in {module_name} {data_type} not supported, " - f"expected spec:'tuple\[(\d+)\]:(\d+)' or 'tensor', actual spec: {config_spec}.") - return focused_col \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/optimizer_collect.py b/debug/accuracy_tools/msprobe/mindspore/monitor/optimizer_collect.py new file mode 100644 index 0000000000000000000000000000000000000000..c12e892e5c964a5821534c653d458ef867d0ca80 --- /dev/null +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/optimizer_collect.py @@ -0,0 +1,322 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from abc import abstractmethod + +from mindspore import mint, ops + +from msprobe.mindspore.common.log import logger +from msprobe.core.common.const import MonitorConst + + +class OptimizerMon(object): + def __init__(self, optim) -> None: + self.fp16_to_fp32_param = {} + self.optim = optim + + def narrow_from_flatten(self, param, flatten_state): + return flatten_state + + def fetch_grad(self, monitor, params2name): + if not self.fp16_to_fp32_param: + self.map_fp16_to_fp32_param(self.optim) + + grad_dict = {} + first_param = True + for param, name in params2name.items(): + if monitor.duplicate_param.get(name, False): + continue + if self.fp16_to_fp32_param and param not in self.fp16_to_fp32_param: + continue + grad = param.main_grad if monitor.params_have_main_grad else param.grad + element_in_cur_partition = self.fp16_to_fp32_param.get(param, param).numel() + if param.numel() != element_in_cur_partition: + if first_param: + grad = grad.flatten()[-element_in_cur_partition:] + else: # supposed to be the last one + grad = grad.flatten()[:element_in_cur_partition] + first_param = False + if grad is None: + continue + tag = monitor.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) + monitor.register_param_call_id("hook_optimizer", tag) + grad_dict[tag] = grad + return grad_dict + + def map_fp16_to_fp32_param(self, optim): + pass + + def fetch_mv(self, monitor, params2name): + if not self.fp16_to_fp32_param: + self.map_fp16_to_fp32_param(self.optim) + exp_avg_dict = {} + exp_avg_sq_dict = {} + update_dict = {} + ratio_dict = {} + + if hasattr(self.optim, 
'state'): + state = self.optim.state + elif hasattr(self.optim, 'optimizer') and hasattr(self.optim.optimizer, 'state'): + state = self.optim.optimizer.state + else: + logger.warning('optimizer state can not accessed') + return exp_avg_dict, exp_avg_sq_dict, update_dict, ratio_dict + + for lp_param, name in params2name.items(): + if lp_param in self.fp16_to_fp32_param: + hp_param = self.fp16_to_fp32_param[lp_param] + else: + hp_param = lp_param + + if hp_param in state: + state_param = state.get(hp_param, None) + exp_avg = self.narrow_from_flatten(lp_param, state_param.get("exp_avg", None)) + exp_avg_sq = self.narrow_from_flatten(lp_param, state_param.get("exp_avg_sq", None)) + if monitor.mv_distribution: + exp_avg_dict[name] = exp_avg + exp_avg_sq_dict[name] = exp_avg_sq + if monitor.mg_direction: + exp_avg_dict[name] = exp_avg + if monitor.ur_distribution: + if len(self.optim.param_groups) > 1: + logger.info(f"the length of optimizer.param_groups is {len(self.optim.param_groups)}.") + if 'step' in state_param: + step = state_param['step'] # Optimizer from pytorch or FusedAdam from apex(used by megatron) + elif 'step' in self.optim.param_groups[0]: + step = self.optim.param_groups[0]['step'] # AdamW from mindspeed + else: + logger.warning(f"step of {name} is None, maybe something wrong happened.") + continue + exp_avg_hat = exp_avg / (1 - self.optim.defaults['betas'][0] ** step) + exp_avg_sq_hat = exp_avg_sq / (1 - self.optim.defaults['betas'][1] ** step) + update_dict[name] = exp_avg_hat / (mint.sqrt(exp_avg_sq_hat) + self.optim.defaults['eps']) + ratio_dict[name] = exp_avg_hat / mint.sqrt(exp_avg_sq_hat) + monitor.update_heatmap_visualizer[name].pre_cal(update_dict[name]) + monitor.ratio_heatmap_visualizer[name].pre_cal(ratio_dict[name]) + return exp_avg_dict, exp_avg_sq_dict, update_dict, ratio_dict + + +class MixPrecisionOptimizerMon(OptimizerMon): + """ + 混合精度优化器监控类。在混合精度训练中监控和管理优化器。 + 混合精度训练通过适当降低某些计算的精度来加速训练过程并减少内存消耗。 + """ + def 
map_fp16_to_fp32_param(self, optim): + for fp16_group, fp32_group in zip(optim.float16_groups, optim.fp32_from_float16_groups): + for fp16_param, fp32_param in zip(fp16_group, fp32_group): + self.fp16_to_fp32_param[fp16_param] = fp32_param + + +class MegatronDistributedOptimizerMon(OptimizerMon): + def map_fp16_to_fp32_param(self, optim): + if not (hasattr(optim, "model_float16_groups") and + hasattr(optim, "shard_fp32_from_float16_groups")): + raise Exception( + "megatron distributed optimizer should have model_float16_groups and shard_fp32_from_float16_groups, " + "if not, please check megatron-lm version") + for fp16_group, shard_fp32_group in zip(optim.model_float16_groups, + optim.shard_fp32_from_float16_groups): + for fp16_param, shard_fp32_param in zip(fp16_group, shard_fp32_group): + self.fp16_to_fp32_param[fp16_param] = shard_fp32_param + + +class MegatronChainedDistributedOptimizerMon(MegatronDistributedOptimizerMon): + def map_fp16_to_fp32_param(self, optim): + for opt in optim.chained_optimizers: + super().map_fp16_to_fp32_param(opt) + + if not hasattr(self.optim, 'state'): + optim.state = {} + for opt in self.optim.chained_optimizers: + self.optim.state.update(opt.optimizer.state) + + +class MegatronChainedMixPrecisionOptimizerMon(MixPrecisionOptimizerMon): + def map_fp16_to_fp32_param(self, optim): + for opt in optim.chained_optimizers: + super().map_fp16_to_fp32_param(opt) + + if not hasattr(self.optim, 'state'): + optim.state = {} + for opt in self.optim.chained_optimizers: + self.optim.state.update(opt.optimizer.state) + + +class DeepSpeedZeroOptimizerMon(OptimizerMon): + """ + Base monitor class for DeepSpeed ZeRO optimizer. + ZeRO stage 0 no partition + ZeRO stage 1 partitions optimizer states across data parallel processes. + ZeRO stage 2 additionally partitions gradients. + ZeRO stage 3 additionally partitions parameters. 
+ + This class provides monitoring capabilities for ZeRO optimizers by: + - Handling gradient collection for different ZeRO stages + - Managing optimizer state access for monitoring + """ + def __init__(self, optim): + super().__init__(optim) + self.stage = '' + self.bit16_groups = [] + self.fp32_flat_groups = [] + self.param2group = () + self.param2index = [] + self.group_offset = {} + + @abstractmethod + def get_grad_for_param(self, lp_param, group_idx, param_id): + raise NotImplementedError + + def param_not_in_partition(self, lp_param, group_idx): + param_slice_mapping = self.optim.state_dict()['param_slice_mappings'][group_idx] + hp_address = param_slice_mapping.get(self.optim.param_names.get(lp_param)) + return hp_address is None + + def get_position(self, lp_param, group_idx): + param_slice_mapping = self.optim.state_dict()['param_slice_mappings'][group_idx] + hp_address = param_slice_mapping.get(self.optim.param_names.get(lp_param)) + return hp_address.start, hp_address.numel + + def get_group_index(self): + param2group = {} + for group_idx, bit16_group in enumerate(self.bit16_groups): + for param in bit16_group: + param2group[param] = group_idx + return param2group + + def get_param_index(self, lp_param, group_idx): + if not self.param2index: + for group in self.bit16_groups: + param2index = {} + for index, param in enumerate(group): + param2index[param] = index + self.param2index.append(param2index) + + return self.param2index[group_idx][lp_param] + + def narrow_from_flatten(self, param, flatten_state): + if flatten_state is None: + return flatten_state + group_idx = self.param2group[param] + if self.param_not_in_partition(param, group_idx): + return None + start, numel = self.get_position(param, group_idx) + return flatten_state.narrow(0, start, numel) + + def map_fp16_to_fp32_param(self, optim): + for group_idx, group in enumerate(self.bit16_groups): + for param in group: + self.fp16_to_fp32_param[param] = self.fp32_flat_groups[group_idx] + + def 
fetch_grad(self, monitor, params2name): + grad_dict = {} + for lp_param, name in params2name.items(): + group_idx = self.param2group[lp_param] + param_id = self.get_param_index(lp_param, group_idx) + if self.param_not_in_partition(lp_param, group_idx): + continue + if self.stage == '1or2': + param_id = param_id - self.group_offset[group_idx] - 1 + grad = self.get_grad_for_param(lp_param, group_idx, param_id) + tag = monitor.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) + monitor.register_param_call_id("hook_optimizer", tag) + grad_dict[tag] = grad + + return grad_dict + + +class DeepSpeedZeroOptimizerStage0Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, optim): + super().__init__(optim) + self.stage = '0' + self.bit16_groups = optim.bf16_groups + self.fp32_flat_groups = optim.fp32_groups_flat_partition + self.param2group = self.get_group_index() + + def get_grad_for_param(self, lp_param, group_idx, param_id): + return self.optim.fp32_groups_gradient_dict[group_idx][param_id] + + +class DeepSpeedZeroOptimizerStage1or2Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, optim): + super().__init__(optim) + self.stage = '1or2' + self.bit16_groups = optim.bit16_groups + self.fp32_flat_groups = optim.single_partition_of_fp32_groups + self.param2group = self.get_group_index() + self.group_offset = {} + self.get_group_offset() + + def get_grad_for_param(self, lp_param, group_idx, param_id): + if getattr(self.optim, "cpu_offload", False): + grads = self.optim.single_partition_of_fp32_groups[group_idx].grad + start, numel = self.get_position(lp_param, group_idx) + grad = grads.narrow(0, start, numel) + else: + grad = self.optim.averaged_gradients[group_idx][param_id] + return grad + + def get_group_offset(self): + for group_idx, group in enumerate(self.bit16_groups): + self.group_offset[group_idx] = -1 + for lp_param in group: + if self.param_not_in_partition(lp_param, group_idx): + self.group_offset[group_idx] = self.get_param_index(lp_param, group_idx) + else: + 
break + + +class DeepSpeedZeroOptimizerStage3Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, optim): + super().__init__(optim) + self.stage = '3' + self.bit16_groups = optim.fp16_groups + self.fp32_flat_groups = optim.fp32_partitioned_groups_flat + self.param2group = self.get_group_index() + + def param_not_in_partition(self, param, group_index): + """Each param partioned across all zero ranks""" + return False + + def get_position(self, lp_param, group_idx): + param_id = self.optim.get_param_id(lp_param) + return self.optim.grad_position[param_id][1:] + + def get_grad_for_param(self, lp_param, group_idx, param_id): + return self.optim.averaged_gradients[group_idx][param_id] + + +class OptimizerMonFactory: + _optimizer_mon_map = { + "FP32Optimizer": OptimizerMon, + "Float16OptimizerWithFloat16Params": MixPrecisionOptimizerMon, + "DistributedOptimizer": MegatronDistributedOptimizerMon, + "ChainedDistributedOptimizer": MegatronChainedDistributedOptimizerMon, + "ChainedFloat16OptimizerWithFloat16Params": MegatronChainedMixPrecisionOptimizerMon, + "BF16_Optimizer": DeepSpeedZeroOptimizerStage0Mon, + "DeepSpeedZeroOptimizer": DeepSpeedZeroOptimizerStage1or2Mon, + "DeepSpeedZeroOptimizer_Stage3": DeepSpeedZeroOptimizerStage3Mon, + "Adam": OptimizerMon + } + + @staticmethod + def create_optimizer_mon(optimizer): + # auto replace opt_ty + optimizer_class = optimizer.__class__.__name__ + if optimizer_class == "ChainedOptimizer": + optimizer_class = "Chained" + optimizer.chained_optimizers[0].__class__.__name__ + logger.info(f'The optimizer type is {optimizer_class}') + + optimizer_mon_class = OptimizerMonFactory._optimizer_mon_map.get(optimizer_class, OptimizerMon) + return optimizer_mon_class(optimizer) diff --git a/debug/accuracy_tools/msprobe/mindspore/monitor/utils.py b/debug/accuracy_tools/msprobe/mindspore/monitor/utils.py index 0d1b430f71df53983a24556fe2b86abc30decbdb..eb375d1a634a3ad08233dba469c6909c632c0e24 100644 --- 
a/debug/accuracy_tools/msprobe/mindspore/monitor/utils.py +++ b/debug/accuracy_tools/msprobe/mindspore/monitor/utils.py @@ -35,7 +35,10 @@ def get_single_metrics(op_list, tag, tensor, output=None): if hasattr(statistic, "dtype") and statistic.dtype == mstype.bfloat16: statistic = float(statistic) statistic = Tensor(statistic) - output[tag][op] = statistic.astype(mstype.float32) + if isinstance(statistic, Tensor): + output[tag][op] = statistic.astype(mstype.float32) + else: + output[tag][op] = statistic def get_metrics(op_list, tag2tensor, eps, output=None): @@ -91,6 +94,9 @@ def validate_ops(ops): default_op = MonitorConst.OP_LIST[0] valid_ops.append(default_op) logger.info(f"There is no valid ops, default op {default_op} is used") + # 增加默认shape和dtype参数 + if "shape" not in valid_ops and "dtype" not in valid_ops: + valid_ops.extend(["shape", "dtype"]) return valid_ops diff --git a/debug/accuracy_tools/msprobe/mindspore/ms_config.py b/debug/accuracy_tools/msprobe/mindspore/ms_config.py index ff7fc28e76e25a6463b26fd49d2aea3d1900207e..8fe02a03f4c979b6fee696c77bdca0da484a407f 100644 --- a/debug/accuracy_tools/msprobe/mindspore/ms_config.py +++ b/debug/accuracy_tools/msprobe/mindspore/ms_config.py @@ -30,6 +30,7 @@ class TensorConfig(BaseConfig): self.file_format = json_config.get("file_format") self.td_config_path = json_config.get("td_config_path") self.check_config() + self._check_summary_mode() self._check_config() def _check_config(self): @@ -43,12 +44,14 @@ class StatisticsConfig(BaseConfig): self.file_format = None self.check_mode = None self.check_config() - self._check_config() + self._check_summary_mode() - def _check_config(self): - single_opt = ["statistics", "md5"] + self.tensor_list = json_config.get("tensor_list", []) + self._check_str_list_config(self.tensor_list, "tensor_list") + + def _check_summary_mode(self): muti_opt = ["md5", "max", "min", "mean", "l2norm"] - if isinstance(self.summary_mode, str) and self.summary_mode not in single_opt: + if 
isinstance(self.summary_mode, str) and self.summary_mode not in Const.SUMMARY_MODE: raise Exception("summary_mode is invalid") if isinstance(self.summary_mode, list) and not all(opt in muti_opt for opt in self.summary_mode): raise Exception("summary_mode is invalid") diff --git a/debug/accuracy_tools/msprobe/mindspore/service.py b/debug/accuracy_tools/msprobe/mindspore/service.py index a566ddc88451924db47c7891951d16dcbeef4cff..8368985bf715bb3ec706c04df70616f0a6542de3 100644 --- a/debug/accuracy_tools/msprobe/mindspore/service.py +++ b/debug/accuracy_tools/msprobe/mindspore/service.py @@ -32,16 +32,22 @@ else: from msprobe.core.common.exceptions import DistributedNotInitializedError, MsprobeException from msprobe.core.common.file_utils import create_directory -from msprobe.core.common.utils import Const, print_tools_ends_info, DumpPathAggregation +from msprobe.core.common.utils import Const, print_tools_ends_info, DumpPathAggregation, replace_last_occurrence from msprobe.core.data_dump.data_collector import build_data_collector from msprobe.core.data_dump.data_processor.base import (ModuleBackwardInputsOutputs, ModuleForwardInputsOutputs, ModuleBackwardInputs) from msprobe.core.data_dump.scope import BaseScope -from msprobe.mindspore.cell_processor import CellProcessor, get_cell_construct +from msprobe.core.data_dump.api_registry import ApiRegistry +from msprobe.mindspore.cell_processor import CellProcessor from msprobe.mindspore.common.log import logger -from msprobe.mindspore.common.utils import (get_rank_if_initialized, clean_input_kwargs, - is_mindtorch, register_backward_hook_functions) -from msprobe.mindspore.dump.hook_cell.api_register import get_api_register +from msprobe.mindspore.common.utils import ( + get_rank_if_initialized, + clean_input_kwargs, + is_mindtorch, + get_cells_and_names, + has_kwargs_in_forward_hook +) +from msprobe.mindspore.dump.hook_cell.api_register import get_api_register, ApiTemplate from 
msprobe.mindspore.dump.hook_cell.primitive_hooks import PrimitiveHookService from msprobe.mindspore.dump.jit_dump import JitDump from msprobe.mindspore.dump.hook_cell.hook_cell import HOOKCell @@ -75,7 +81,9 @@ class Service: # 提前注册,确保注册尽可能多的API hook self.api_register = get_api_register() self.register_api_hook() - self.init_for_debug_level() + self.currrent_step_first_debug_save = True + self.debug_variable_counter = None + self.ori_customer_func = {} @staticmethod def check_model_valid(models): @@ -100,22 +108,19 @@ class Service: def build_hook(self, target_type, name): def pre_hook(api_or_cell_name, cell, input_data): - if not self.should_execute_hook(target_type, cell, True): - return None + if target_type == BaseScope.Module_Type_Module or \ + not self.should_execute_hook(target_type, cell, True): + return with _no_grad(): self.inner_switch = True - if target_type == BaseScope.Module_Type_Module: - api_or_cell_name = self.cell_processor.set_and_get_reserved_name(cell, api_or_cell_name) - else: - cell.forward_data_collected = True - HOOKCell.add_cell_count(name) - module_input_output = ModuleForwardInputsOutputs(args=input_data, kwargs=cell.msprobe_input_kwargs, - output=None) + cell.forward_data_collected = True + HOOKCell.add_cell_count(name) + kwargs = cell.msprobe_input_kwargs if hasattr(cell, 'msprobe_input_kwargs') else {} + module_input_output = ModuleForwardInputsOutputs(args=input_data, kwargs=kwargs, output=None) self.data_collector.update_api_or_module_name(api_or_cell_name) self.data_collector.forward_input_data_collect(api_or_cell_name, cell, pid, module_input_output) self.inner_switch = False - return input_data def grad_hook(cell, ori_name, param_name): def hook_fn(grad): @@ -162,16 +167,20 @@ class Service: # 记录当前模块的参数梯度信息已占位 self.params_grad_info[grad_name] = True - def forward_hook(api_or_cell_name, cell, input_data, output): + def forward_hook(api_or_cell_name, cell, args, kwargs_or_output, output_or_kwargs): if not 
self.should_execute_hook(target_type, cell, True): clean_input_kwargs(cell) return None with _no_grad(): self.inner_switch = True - module_input_output = ModuleForwardInputsOutputs(args=input_data, kwargs=cell.msprobe_input_kwargs, - output=output) + if not has_kwargs_in_forward_hook() or target_type == BaseScope.Module_Type_API: + kwargs = cell.msprobe_input_kwargs if hasattr(cell, 'msprobe_input_kwargs') else {} + output = kwargs_or_output + else: + kwargs = kwargs_or_output + output = output_or_kwargs + module_input_output = ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=output) if target_type == BaseScope.Module_Type_Module: - api_or_cell_name = self.cell_processor.set_and_get_reserved_name(cell, api_or_cell_name) params_dict = {} if self.config.task != Const.STRUCTURE: params_dict = { @@ -194,10 +203,12 @@ class Service: self.data_collector.forward_output_data_collect(api_or_cell_name, cell, pid, module_input_output) clean_input_kwargs(cell) + if self.data_collector.if_return_forward_new_output(): forward_new_output = self.data_collector.get_forward_new_output() self.inner_switch = False return forward_new_output + self.inner_switch = False return output @@ -210,7 +221,6 @@ class Service: if target_type == BaseScope.Module_Type_Module: if not hasattr(cell, 'has_pre_hook_called') or not cell.has_pre_hook_called: need_exchange = False - api_or_cell_name = self.cell_processor.set_and_get_reserved_name(cell, api_or_cell_name) self.data_collector.update_api_or_module_name(api_or_cell_name) if self.data_collector: @@ -233,12 +243,11 @@ class Service: self.inner_switch = False pid = os.getpid() - if target_type == BaseScope.Module_Type_Module: - full_forward_name = name + Const.FORWARD - full_backward_name = name + Const.BACKWARD - else: + full_forward_name = name + if target_type == BaseScope.Module_Type_API: full_forward_name = name + str(HOOKCell.get_cell_count(name)) + Const.SEP + Const.FORWARD - full_backward_name = name + 
str(HOOKCell.get_cell_count(name)) + Const.SEP + Const.BACKWARD + full_backward_name = replace_last_occurrence(full_forward_name, Const.FORWARD, Const.BACKWARD) + pre_forward_hook = functools.partial(pre_hook, full_forward_name) forward_hook = functools.partial(forward_hook, full_forward_name) backward_hook = functools.partial(backward_hook, full_backward_name) @@ -247,8 +256,8 @@ class Service: def wrap_pre_forward_hook(cell, input_data): return pre_forward_hook(cell, input_data) - def wrap_forward_hook(cell, input_data, output_data): - return forward_hook(cell, input_data, output_data) + def wrap_forward_hook(cell, args, kwargs_or_output, output_or_kwargs=None): + return forward_hook(cell, args, kwargs_or_output, output_or_kwargs) def wrap_backward_hook(cell, grad_input, grad_output): return backward_hook(cell, grad_input, grad_output) @@ -265,11 +274,10 @@ class Service: self.primitive_counters[primitive_name] += 1 def step(self): - if self.config.level == Const.LEVEL_DEBUG: - return - if self.config.async_dump and self.config.task == Const.TENSOR: + if self.config.async_dump and self.config.task in [Const.STATISTICS, Const.TENSOR]: self.data_collector.data_processor.dump_async_data() self.data_collector.write_json() + self.currrent_step_first_debug_save = True self.loop += 1 self.reset_status() @@ -315,7 +323,8 @@ class Service: if self.config.rank and self.current_rank not in self.config.rank: return self.register_primitive_hook() - self.register_cell_hook() + if self.config.level in [Const.LEVEL_MIX, Const.LEVEL_L0]: + self.cell_processor.register_cell_hook(self.model, self.build_hook) self.first_start = False self.api_register.register_all_api() @@ -343,7 +352,7 @@ class Service: self.switch = False self.primitive_switch = False self.start_call = False - if self.config.async_dump and self.config.task == Const.TENSOR: + if self.config.async_dump and self.config.task in [Const.STATISTICS, Const.TENSOR]: self.data_collector.data_processor.dump_async_data() 
self.data_collector.write_json() JitDump.jit_dump_switch = False @@ -382,16 +391,20 @@ class Service: dump_dir = os.path.join(self.dump_iter_dir, f"rank{cur_rank}") create_directory(dump_dir) - if self.config.task in self.data_collector.tasks_need_tensor_data: + + dump_data_dir = None + if self.config.task in self.data_collector.tasks_need_tensor_data or ( + self.config.task == Const.STATISTICS and self.config.tensor_list): dump_data_dir = os.path.join(dump_dir, "dump_tensor_data") create_directory(dump_data_dir) - else: - dump_data_dir = None dump_path_aggregation = DumpPathAggregation() - dump_path_aggregation.dump_file_path = os.path.join(dump_dir, "dump.json") - dump_path_aggregation.stack_file_path = os.path.join(dump_dir, "stack.json") - dump_path_aggregation.construct_file_path = os.path.join(dump_dir, "construct.json") + if self.config.level != Const.LEVEL_DEBUG: + dump_path_aggregation.dump_file_path = os.path.join(dump_dir, "dump.json") + dump_path_aggregation.stack_file_path = os.path.join(dump_dir, "stack.json") + dump_path_aggregation.construct_file_path = os.path.join(dump_dir, "construct.json") + else: + dump_path_aggregation.debug_file_path = os.path.join(dump_dir, "debug.json") dump_path_aggregation.dump_tensor_data_dir = dump_data_dir self.data_collector.update_dump_paths(dump_path_aggregation) @@ -408,19 +421,6 @@ class Service: self.api_register.initialize_hook(functools.partial(self.build_hook, BaseScope.Module_Type_API)) self.api_register.register_all_api() - def get_cells_and_names(self): - cells_and_names_with_index = {} - - def get_cell_or_module(model): - return model.named_modules() if is_mindtorch() else model.cells_and_names() - - if isinstance(self.model, (list, tuple)): - for index, model in enumerate(self.model): - cells_and_names_with_index[str(index)] = get_cell_or_module(model) - else: - cells_and_names_with_index["-1"] = get_cell_or_module(self.model) - return cells_and_names_with_index - def register_primitive_hook(self): if 
self.config.level not in [Const.LEVEL_MIX, Const.LEVEL_L1]: return @@ -428,7 +428,7 @@ class Service: return primitive_set = set() - cells_and_names_with_index = self.get_cells_and_names() + cells_and_names_with_index = get_cells_and_names(self.model) for cells_and_names in cells_and_names_with_index.values(): for _, cell in cells_and_names: for attribute, value in vars(cell).items(): @@ -443,43 +443,6 @@ class Service: primitive_combined_name)}) primitive.__class__ = new_primitive - def register_cell_hook(self): - if self.config.level in [Const.LEVEL_MIX, Const.LEVEL_L0]: - logger.info(f"The cell {self.config.task} hook function is successfully mounted to the model.") - if not self.model: - raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, - f"The current level is {self.config.level}, the model cannot be None") - model_type = Const.MODULE if is_mindtorch() else Const.CELL - cells_and_names_with_index = self.get_cells_and_names() - - for index, cells_and_names in cells_and_names_with_index.items(): - model = self.model if index == "-1" else self.model[int(index)] - for name, cell in cells_and_names: - if cell == model: - continue - if not hasattr(cell.__class__, 'msprobe_construct'): - setattr(cell.__class__, 'msprobe_construct', True) - if is_mindtorch(): - setattr(cell.__class__, 'forward', get_cell_construct(cell.__class__.forward)) - else: - setattr(cell.__class__, 'construct', get_cell_construct(cell.__class__.construct)) - setattr(cell, 'msprobe_hook', True) - cell_index = (index + Const.SEP) if index != "-1" else "" - prefix = (model_type + Const.SEP + cell_index + name + - Const.SEP + cell.__class__.__name__ + Const.SEP) - _, forward_hook, backward_hook, _ = self.build_hook(BaseScope.Module_Type_Module, prefix) - cell.register_forward_hook(forward_hook) - cell.register_forward_pre_hook( - self.cell_processor.node_hook(prefix + Const.FORWARD, Const.START)) - cell.register_forward_hook( - self.cell_processor.node_hook(prefix + Const.FORWARD, 
Const.STOP)) - - register_backward_hook_functions["full"](cell, backward_hook) - register_backward_hook_functions["pre"]( - cell, self.cell_processor.node_hook(prefix + Const.BACKWARD, Const.START)) - register_backward_hook_functions["full"]( - cell, self.cell_processor.node_hook(prefix + Const.BACKWARD, Const.STOP)) - def reset_status(self): self.primitive_hook_service.primitive_counters.clear() self.data_collector.reset_status() @@ -493,33 +456,6 @@ class Service: if self.config.rank and self.current_rank not in self.config.rank: return - def init_for_debug_level(self): - if not (self.config.level == Const.LEVEL_DEBUG and self.config.task in [Const.TENSOR, Const.STATISTICS]): - return - try: - self.current_rank = get_rank_if_initialized() - except DistributedNotInitializedError: - self.current_rank = None - # dir: dump_path -- rank{} -- debug.json - self.dump_iter_dir = self.config.dump_path - cur_rank = self.current_rank if self.current_rank is not None else '' - dump_dir = os.path.join(self.dump_iter_dir, f"rank{cur_rank}") - create_directory(dump_dir) - if self.config.task in self.data_collector.tasks_need_tensor_data: - dump_data_dir = os.path.join(dump_dir, "dump_tensor_data") - create_directory(dump_data_dir) - else: - dump_data_dir = None - - dump_path_aggregation = DumpPathAggregation() - dump_path_aggregation.dump_tensor_data_dir = dump_data_dir - dump_path_aggregation.debug_file_path = os.path.join(dump_dir, "debug.json") - self.data_collector.update_dump_paths(dump_path_aggregation) - self.data_collector.initialize_json_file( - framework=Const.MT_FRAMEWORK if is_mindtorch() else Const.MS_FRAMEWORK - ) - self.debug_variable_counter = defaultdict(int) - def save(self, variable, name, save_backward): ''' Args: @@ -531,6 +467,21 @@ class Service: ''' if self.config.level != Const.LEVEL_DEBUG: return + + self.current_iter = self.loop + self.init_step + if self.config.step and self.current_iter not in self.config.step: + return + + if 
self.currrent_step_first_debug_save: + try: + self.current_rank = get_rank_if_initialized() + except DistributedNotInitializedError: + self.current_rank = None + + self.create_dirs() + self.debug_variable_counter = defaultdict(int) + self.currrent_step_first_debug_save = False + count = self.debug_variable_counter[name] self.debug_variable_counter[name] += 1 @@ -543,3 +494,13 @@ class Service: # backward save if save_backward: self.data_collector.debug_data_collect_backward(variable, grad_name_with_count) + + def register_custom_api(self, module, api_name, api_prefix): + self.ori_customer_func[str(module) + Const.SEP + api_name] = getattr(module, api_name) + ApiRegistry.register_custom_api(module, api_name, api_prefix, + functools.partial(self.build_hook, BaseScope.Module_Type_API), ApiTemplate) + + def restore_custom_api(self, module, api): + ori_func = self.ori_customer_func.get(str(module) + Const.SEP + api) + if ori_func: + setattr(module, api, ori_func) diff --git a/debug/accuracy_tools/msprobe/msprobe.py b/debug/accuracy_tools/msprobe/msprobe.py index 7a4618120db8986691d1175b1c3b7317e0b6f44d..5d290b3d3f1befa15ae38137545d846380116732 100644 --- a/debug/accuracy_tools/msprobe/msprobe.py +++ b/debug/accuracy_tools/msprobe/msprobe.py @@ -22,6 +22,8 @@ from msprobe.core.common.log import logger from msprobe.core.compare.utils import _compare_parser from msprobe.core.compare.compare_cli import compare_cli from msprobe.core.compare.merge_result.merge_result_cli import _merge_result_parser, merge_result_cli +from msprobe.core.config_check.config_check_cli import _config_checking_parser, \ + _run_config_checking_command def is_module_available(module_name): @@ -51,11 +53,13 @@ def main(): graph_service_cmd_parser = subparsers.add_parser('graph') op_generate_cmd_parser = subparsers.add_parser('op_generate') merge_result_parser = subparsers.add_parser('merge_result') - config_checking_parser = subparsers.add_parser('config_checking') + config_checking_parser = 
subparsers.add_parser('config_check') + _config_checking_parser(config_checking_parser) _compare_parser(compare_cmd_parser) _merge_result_parser(merge_result_parser) is_torch_available = is_module_available("torch") + if len(sys.argv) < 4: parser.print_help() sys.exit(0) @@ -71,8 +75,6 @@ def main(): from msprobe.visualization.graph_service import _pt_graph_service_parser, _pt_graph_service_command from msprobe.pytorch.api_accuracy_checker.generate_op_script.op_generator import _op_generator_parser, \ _run_operator_generate_commond - from msprobe.pytorch.config_checking.config_checking import _config_checking_parser, \ - _run_config_checking_command _run_ut_parser(run_ut_cmd_parser) _run_ut_parser(multi_run_ut_cmd_parser) @@ -82,7 +84,6 @@ def main(): _run_overflow_check_parser(run_overflow_check_cmd_parser) _pt_graph_service_parser(graph_service_cmd_parser) _op_generator_parser(op_generate_cmd_parser) - _config_checking_parser(config_checking_parser) elif framework_args.framework == Const.MS_FRAMEWORK: from msprobe.mindspore.api_accuracy_checker.cmd_parser import add_api_accuracy_checker_argument from msprobe.visualization.graph_service import _ms_graph_service_parser, _ms_graph_service_command @@ -121,7 +122,7 @@ def main(): compare_cli(args) elif sys.argv[3] == "merge_result": merge_result_cli(args) - elif sys.argv[3] == "config_checking": + elif sys.argv[3] == "config_check": _run_config_checking_command(args) else: if not is_module_available(Const.MS_FRAMEWORK): @@ -142,6 +143,8 @@ def main(): elif sys.argv[3] == "code_mapping": from msprobe.mindspore.code_mapping.main import code_mapping_main code_mapping_main(args) + elif sys.argv[3] == "config_check": + _run_config_checking_command(args) if __name__ == "__main__": diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/common/config.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/common/config.py index 
588a1eb349a6223f1c86df04fe3ae590a4e2a1ca..1e844ff81a8543c9865dbefc3c39c12202d2c6e2 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/common/config.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/common/config.py @@ -52,9 +52,7 @@ class Config: 'host': str, 'port': int, 'rank_list': list, - 'tls_path': str, - 'master_ip': str, - 'master_port': str + 'tls_path': str } if key not in validators: raise ValueError(f"{key} must be one of {validators.keys()}") @@ -74,10 +72,6 @@ class Config: RunUTConfig.check_nfs_path_config(value) if key == 'tls_path': RunUTConfig.check_tls_path_config(value) - if key == 'master_ip': - RunUTConfig.check_master_ip_config(value) - if key == 'master_port': - RunUTConfig.check_master_port_config(value) return value @@ -97,8 +91,6 @@ class CheckerConfig: self.port = msCheckerConfig.port self.rank_list = msCheckerConfig.rank_list self.tls_path = msCheckerConfig.tls_path - self.master_ip = msCheckerConfig.master_ip - self.master_port = msCheckerConfig.master_port if task_config: self.load_config(task_config) @@ -113,8 +105,6 @@ class CheckerConfig: self.port = task_config.port self.rank_list = task_config.rank_list self.tls_path = task_config.tls_path - self.master_ip = task_config.master_ip - self.master_port = task_config.master_port def get_online_config(self): return OnlineConfig( @@ -135,8 +125,8 @@ class CheckerConfig: save_error_data=config_params.get('save_error_data'), is_continue_run_ut=config_params.get('is_continue_run_ut'), real_data_path=config_params.get('real_data_path'), - white_list=self.white_list, - black_list=self.black_list, + white_list=self.white_list.copy() if self.white_list else [], + black_list=self.black_list.copy() if self.black_list else [], error_data_path=config_params.get('error_data_path'), online_config=self.get_online_config() ) diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/algorithm.py 
b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/algorithm.py index abe8f2b4b3cd1cf8195fc86ed5c6a07e1daddf15..ddee254c2b1085f9af96fe2774c53fb88c5821f4 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/algorithm.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/algorithm.py @@ -261,54 +261,3 @@ def compare_bool_tensor(bench_output, device_output): error_rate = float(error_nums / bench_output.size) result = CompareConst.PASS if error_rate == 0 else CompareConst.ERROR return error_rate, result, "" - - -def maximize_kahan_loss(cumsum, addend, negative=False): - """ - Calculate the precision loss in Kahan summation and select the maximum or minimum loss. - - Parameters: - cumsum (torch.Tensor): The current cumulative sum. - addend (torch.Tensor): The value to be added in the current step. - negative (bool): Whether to select the negative direction of loss. - Default is False (select positive direction which minimizes the sum). - - Returns: - loss_res (torch.Tensor): The selected maximum or minimum loss value. - mask (torch.Tensor): - A boolean mask indicating whether the loss value should be compensated. - """ - loss_all = (cumsum + addend) - cumsum - addend - if negative: - loss_res = torch.min(loss_all, dim=0)[0] - mask = loss_res <= 0 - else: - loss_res = torch.max(loss_all, dim=0)[0] - mask = loss_res >= 0 - return loss_res, mask - - -def kahan_range(tensors, negative=False): - """ - Perform Kahan summation on a list of tensors and track precision loss. - - Parameters: - tensors (list of torch.Tensor): The list of tensors to be summed. - negative (bool): Whether to select the negative direction of loss. - Default is False (select positive direction which minimizes the sum). - Returns: - sum_max: The summation results. 
- """ - if len(tensors) < 1: - raise ValueError("tensors should have at least 1 element") - cumsum_temp = torch.clone(tensors[0]).unsqueeze(dim=0) - sum_max = torch.clone(tensors[0]) - loss_max = torch.tensor(0) - - for tensor in tensors[1:]: - addend = tensor - loss_max - loss_max, mask = maximize_kahan_loss(cumsum_temp, addend, negative) - sum_max = sum_max + (addend - torch.where(mask, loss_max, 0)) - loss_max = torch.where(mask, 0, loss_max) - cumsum_temp = torch.cat((cumsum_temp, sum_max.unsqueeze(dim=0))) - return sum_max diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/compare.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/compare.py index cf5928e509e3138ea762cd9d7af6fc26a5d2c5c9..c12a54c18ad07ae302b41d12704dc82fec01b4c2 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/compare.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/compare/compare.py @@ -40,6 +40,7 @@ from msprobe.pytorch.api_accuracy_checker.compare.compare_utils import check_dty DETAIL_TEST_ROWS, BENCHMARK_COMPARE_SUPPORT_LIST from msprobe.pytorch.api_accuracy_checker.common.utils import extract_basic_api_segments from msprobe.pytorch.common.log import logger +from msprobe.core.common.decorator import recursion_depth_decorator ResultInfo = namedtuple('ResultInfo', ['full_api_name', 'fwd_success_status', 'bwd_success_status', @@ -178,6 +179,41 @@ class Comparator: if not os.path.exists(detail_save_path): write_csv(DETAIL_TEST_ROWS, detail_save_path) + @recursion_depth_decorator("compare_core") + def _compare_core(self, api_name, bench_output, device_output): + compare_column = CompareColumn() + if not isinstance(bench_output, type(device_output)): + status = CompareConst.ERROR + message = "bench and npu output type is different." 
+ elif isinstance(bench_output, dict): + b_keys, n_keys = set(bench_output.keys()), set(device_output.keys()) + if b_keys != n_keys: + status = CompareConst.ERROR + message = "bench and npu output dict keys are different." + else: + status, compare_column, message = self._compare_core(api_name, list(bench_output.values()), + list(device_output.values())) + elif isinstance(bench_output, torch.Tensor): + copy_bench_out = bench_output.detach().clone() + copy_device_output = device_output.detach().clone() + compare_column.bench_type = str(copy_bench_out.dtype) + compare_column.npu_type = str(copy_device_output.dtype) + compare_column.shape = tuple(device_output.shape) + status, compare_column, message = self._compare_torch_tensor(api_name, copy_bench_out, copy_device_output, + compare_column) + elif isinstance(bench_output, (bool, int, float, str)): + compare_column.bench_type = str(type(bench_output)) + compare_column.npu_type = str(type(device_output)) + status, compare_column, message = self._compare_builtin_type(bench_output, device_output, compare_column) + elif bench_output is None: + status = CompareConst.SKIP + message = "Bench output is None, skip this test." + else: + status = CompareConst.ERROR + message = "Unexpected output type in compare_core: {}".format(type(bench_output)) + + return status, compare_column, message + def write_summary_csv(self, test_result): test_rows = [] try: @@ -293,40 +329,6 @@ class Comparator: test_final_success = CompareConst.WARNING return test_final_success, detailed_result_total - def _compare_core(self, api_name, bench_output, device_output): - compare_column = CompareColumn() - if not isinstance(bench_output, type(device_output)): - status = CompareConst.ERROR - message = "bench and npu output type is different." 
- elif isinstance(bench_output, dict): - b_keys, n_keys = set(bench_output.keys()), set(device_output.keys()) - if b_keys != n_keys: - status = CompareConst.ERROR - message = "bench and npu output dict keys are different." - else: - status, compare_column, message = self._compare_core(api_name, list(bench_output.values()), - list(device_output.values())) - elif isinstance(bench_output, torch.Tensor): - copy_bench_out = bench_output.detach().clone() - copy_device_output = device_output.detach().clone() - compare_column.bench_type = str(copy_bench_out.dtype) - compare_column.npu_type = str(copy_device_output.dtype) - compare_column.shape = tuple(device_output.shape) - status, compare_column, message = self._compare_torch_tensor(api_name, copy_bench_out, copy_device_output, - compare_column) - elif isinstance(bench_output, (bool, int, float, str)): - compare_column.bench_type = str(type(bench_output)) - compare_column.npu_type = str(type(device_output)) - status, compare_column, message = self._compare_builtin_type(bench_output, device_output, compare_column) - elif bench_output is None: - status = CompareConst.SKIP - message = "Bench output is None, skip this test." 
- else: - status = CompareConst.ERROR - message = "Unexpected output type in compare_core: {}".format(type(bench_output)) - - return status, compare_column, message - def _compare_torch_tensor(self, api_name, bench_output, device_output, compare_column): cpu_shape = bench_output.shape npu_shape = device_output.shape diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/config.yaml b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/config.yaml index 30cea3b8e01f1c1a8a3a3d25620ba4bb2c9e709a..2ec9251009e61ef68dbfed987abe457d47b91e9a 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/config.yaml +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/config.yaml @@ -8,5 +8,3 @@ host: "" port: -1 rank_list: [0] tls_path: "./" -master_ip: '127.0.0.1' -master_port: '2688' diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/generate_op_script/operator_replication.template b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/generate_op_script/operator_replication.template index b6d5028c019e42c01b1f51f338d1d2b5f3c7933f..c60d84994745e94bef6d05a78d83fae81df7ed1e 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/generate_op_script/operator_replication.template +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/generate_op_script/operator_replication.template @@ -1,6 +1,6 @@ -import json import os -import math +import re +import stat from enum import Enum, auto import torch try: @@ -25,6 +25,31 @@ RAISE_PRECISION = {{ }} THOUSANDTH_THRESHOLDING = 0.001 BACKWARD = 'backward' +DIR = "dir" +FILE = "file" +READ_ABLE = "read" +WRITE_ABLE = "write" +READ_WRITE_ABLE = "read and write" +DIRECTORY_LENGTH = 4096 +FILE_NAME_LENGTH = 255 +SOFT_LINK_ERROR = "检测到软链接" +FILE_PERMISSION_ERROR = "文件权限错误" +INVALID_FILE_ERROR = "无效文件" +ILLEGAL_PATH_ERROR = "非法文件路径" +ILLEGAL_PARAM_ERROR = "非法打开方式" +FILE_TOO_LARGE_ERROR = "文件过大" +FILE_VALID_PATTERN = r"^[a-zA-Z0-9_.:/-]+$" 
+FILE_SIZE_DICT = {{ + ".pkl": 1073741824, # 1 * 1024 * 1024 * 1024 + ".npy": 10737418240, # 10 * 1024 * 1024 * 1024 + ".json": 1073741824, # 1 * 1024 * 1024 * 1024 + ".pt": 10737418240, # 10 * 1024 * 1024 * 1024 + ".csv": 1073741824, # 1 * 1024 * 1024 * 1024 + ".xlsx": 1073741824, # 1 * 1024 * 1024 * 1024 + ".yaml": 1073741824, # 1 * 1024 * 1024 * 1024 + ".ir": 1073741824 # 1 * 1024 * 1024 * 1024 +}} +COMMOM_FILE_SIZE = 1048576 # 1 * 1024 * 1024 class CompareStandard(Enum): BINARY_EQUALITY_STANDARD = auto() @@ -33,8 +58,184 @@ class CompareStandard(Enum): BENCHMARK_STANDARD = auto() THOUSANDTH_STANDARD = auto() +class FileChecker: + """ + The class for check file. + + Attributes: + file_path: The file or dictionary path to be verified. + path_type: file or dictionary + ability(str): FileCheckConst.WRITE_ABLE or FileCheckConst.READ_ABLE to set file has writability or readability + file_type(str): The correct file type for file + """ + + def __init__(self, file_path, path_type, ability=None, file_type=None, is_script=True): + self.file_path = file_path + self.path_type = self._check_path_type(path_type) + self.ability = ability + self.file_type = file_type + self.is_script = is_script + + @staticmethod + def _check_path_type(path_type): + if path_type not in [DIR, FILE]: + print(f'ERROR: The path_type must be {{DIR}} or {{FILE}}.') + raise Exception(ILLEGAL_PARAM_ERROR) + return path_type + + def common_check(self): + """ + 功能:用户校验基本文件权限:软连接、文件长度、是否存在、读写权限、文件属组、文件特殊字符 + 注意:文件后缀的合法性,非通用操作,可使用其他独立接口实现 + """ + FileChecker.check_path_exists(self.file_path) + FileChecker.check_link(self.file_path) + self.file_path = os.path.realpath(self.file_path) + FileChecker.check_path_length(self.file_path) + FileChecker.check_path_type(self.file_path, self.path_type) + self.check_path_ability() + if self.is_script: + FileChecker.check_path_owner_consistent(self.file_path) + FileChecker.check_path_pattern_valid(self.file_path) + FileChecker.check_common_file_size(self.file_path) + 
FileChecker.check_file_suffix(self.file_path, self.file_type) + if self.path_type == FILE: + FileChecker.check_dirpath_before_read(self.file_path) + return self.file_path + + def check_path_ability(self): + if self.ability == WRITE_ABLE: + FileChecker.check_path_writability(self.file_path) + if self.ability == READ_ABLE: + FileChecker.check_path_readability(self.file_path) + if self.ability == READ_WRITE_ABLE: + FileChecker.check_path_readability(self.file_path) + FileChecker.check_path_writability(self.file_path) + + @staticmethod + def check_path_exists(path): + if not os.path.exists(path): + print(f'ERROR: The file path %s does not exist.' % path) + raise Exception() + + @staticmethod + def check_link(path): + abs_path = os.path.abspath(path) + if os.path.islink(abs_path): + print('ERROR: The file path {{}} is a soft link.'.format(path)) + raise Exception(SOFT_LINK_ERROR) + + @staticmethod + def check_path_length(path, name_length=None): + file_max_name_length = name_length if name_length else FILE_NAME_LENGTH + if len(path) > DIRECTORY_LENGTH or \ + len(os.path.basename(path)) > file_max_name_length: + print(f'ERROR: The file path length exceeds limit.') + raise Exception(ILLEGAL_PATH_ERROR) + + @staticmethod + def check_path_type(file_path, file_type): + if file_type == FILE: + if not os.path.isfile(file_path): + print(f"ERROR: The {{file_path}} should be a file!") + raise Exception(INVALID_FILE_ERROR) + if file_type == DIR: + if not os.path.isdir(file_path): + print(f"ERROR: The {{file_path}} should be a dictionary!") + raise Exception(INVALID_FILE_ERROR) + + @staticmethod + def check_path_owner_consistent(path): + file_owner = os.stat(path).st_uid + if file_owner != os.getuid() and os.getuid() != 0: + print('ERROR: The file path %s may be insecure because is does not belong to you.' 
% path) + raise Exception(FILE_PERMISSION_ERROR) + + @staticmethod + def check_path_pattern_valid(path): + if not re.match(FILE_VALID_PATTERN, path): + print('ERROR: The file path %s contains special characters.' % (path)) + raise Exception(ILLEGAL_PATH_ERROR) + + @staticmethod + def check_common_file_size(file_path): + if os.path.isfile(file_path): + for suffix, max_size in FILE_SIZE_DICT.items(): + if file_path.endswith(suffix): + FileChecker.check_file_size(file_path, max_size) + return + FileChecker.check_file_size(file_path, COMMOM_FILE_SIZE) + + @staticmethod + def check_file_size(file_path, max_size): + try: + file_size = os.path.getsize(file_path) + except OSError as os_error: + print(f'ERROR: Failed to open "{{file_path}}". {{str(os_error)}}') + raise Exception(INVALID_FILE_ERROR) from os_error + if file_size >= max_size: + print(f'ERROR: The size ({{file_size}}) of {{file_path}} exceeds ({{max_size}}) bytes, tools not support.') + raise Exception(FILE_TOO_LARGE_ERROR) + + @staticmethod + def check_file_suffix(file_path, file_suffix): + if file_suffix: + if not file_path.endswith(file_suffix): + print(f"The {{file_path}} should be a {{file_suffix}} file!") + raise Exception(INVALID_FILE_ERROR) + + @staticmethod + def check_dirpath_before_read(path): + path = os.path.realpath(path) + dirpath = os.path.dirname(path) + if FileChecker.check_others_writable(dirpath): + print(f"WARNING: The directory is writable by others: {{dirpath}}.") + try: + FileChecker.check_path_owner_consistent(dirpath) + except Exception: + print(f"WARNING: The directory {{dirpath}} is not yours.") + + @staticmethod + def check_others_writable(directory): + dir_stat = os.stat(directory) + is_writable = ( + bool(dir_stat.st_mode & stat.S_IWGRP) or # 组可写 + bool(dir_stat.st_mode & stat.S_IWOTH) # 其他用户可写 + ) + return is_writable + + @staticmethod + def check_path_readability(path): + if not os.access(path, os.R_OK): + print('ERROR: The file path %s is not readable.' 
% path) + raise Exception(FILE_PERMISSION_ERROR) + + @staticmethod + def check_path_writability(path): + if not os.access(path, os.W_OK): + print('ERROR: The file path %s is not writable.' % path) + raise Exception(FILE_PERMISSION_ERROR) + + +def check_file_or_directory_path(path, isdir=False): + """ + Function Description: + check whether the path is valid + Parameter: + path: the path to check + isdir: the path is dir or file + Exception Description: + when invalid data throw exception + """ + if isdir: + path_checker = FileChecker(path, DIR, WRITE_ABLE) + else: + path_checker = FileChecker(path, FILE, READ_ABLE) + path_checker.common_check() + def load_pt(pt_path, to_cpu=False): pt_path = os.path.realpath(pt_path) + check_file_or_directory_path(pt_path) try: if to_cpu: pt = torch.load(pt_path, map_location=torch.device("cpu"), weights_only=True) @@ -202,6 +403,7 @@ def compare_tensor(out_device, out_bench, api_name): else: abs_err = torch.abs(out_device - out_bench) abs_bench = torch.abs(out_bench) + eps = 2 ** -23 if dtype_bench == torch.float32: eps = 2 ** -23 if dtype_bench == torch.float64: diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/data_generate.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/data_generate.py index 15e14b68c7da4f2c7fadd4e0285c79fec5fa78f1..9d89b2de32f70c6fa7abf38add49b58a13531d7a 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/data_generate.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/data_generate.py @@ -1,7 +1,9 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +#!/usr/bin/env python3 +# -*- coding: utf-8 -*- +# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. # All rights reserved. # -# Licensed under the Apache License, Version 2.0 (the "License"); +# Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
# You may obtain a copy of the License at # @@ -13,27 +15,20 @@ # See the License for the specific language governing permissions and # limitations under the License. -import math import os - -import numpy +import math import torch +import numpy -from msprobe.core.common.const import Const, FileCheckConst, CompareConst, DistributedCheckConst -from msprobe.core.common.file_utils import FileChecker, load_npy +from msprobe.pytorch.api_accuracy_checker.run_ut.run_ut_utils import hf_32_standard_api from msprobe.pytorch.api_accuracy_checker.common.utils import check_object_type, get_full_data_path, \ CompareException, get_module_and_atttribute_name, get_attribute -from msprobe.pytorch.api_accuracy_checker.run_ut.run_ut_utils import hf_32_standard_api +from msprobe.core.common.file_utils import FileChecker, load_npy from msprobe.pytorch.common.log import logger from msprobe.pytorch.common.utils import load_pt -from msprobe.pytorch.hook_module.api_register import get_api_register +from msprobe.core.common.const import Const, FileCheckConst, CompareConst -api_register = get_api_register(return_new=True) -api_register.initialize_hook(None) -distribute_api_key = Const.PT_FRAMEWORK + Const.SEP + Const.PT_API_TYPE_DIST -distribute_api_list = list(api_register.ori_api_attr.get(distribute_api_key, {}).keys()) - TORCH_TYPE = ["torch.device", "torch.dtype"] TENSOR_DATA_LIST = ["torch.Tensor", "torch.nn.parameter.Parameter"] FLOAT_TYPE = [ @@ -73,7 +68,7 @@ def gen_data(info, api_name, need_grad, convert_type, real_data_path=None): data = gen_random_tensor(info, convert_type) if api_name in hf_32_standard_api and data.dtype == torch.float32: data = fp32_to_hf32_to_fp32(data) - if info.get('requires_grad') and need_grad and api_name not in distribute_api_list: + if info.get('requires_grad') and need_grad: data.requires_grad_(True) temp_data = data * 1 data = temp_data.type_as(data) @@ -266,14 +261,11 @@ def gen_args(args_info, api_name, func_options): Function Description: Based on 
API basic information, generate input parameters: args, for API forward running Parameter: - args_info: API basic information. DICT + api_info: API basic information. List api_name: API name - func_options: the options for generating args. Dict - need_grad: set Tensor grad for backward - convert_type: convert ori_type to dist_type flag. - real_data_path: the root directory for storing real data. - depth: the depth of recursion. - kwargs_params: the input kwargs parameters. + need_grad: set Tensor grad for backward + convert_type: convert ori_type to dist_type flag. + real_data_path: the root directory for storing real data. """ check_object_type(args_info, list) args_result = [] @@ -282,7 +274,6 @@ def gen_args(args_info, api_name, func_options): convert_type = func_options.get('convert_type', None) real_data_path = func_options.get('real_data_path', None) depth = func_options.get('depth', 0) - kwargs_params = func_options.get('input_kwargs', {}) if depth > Const.MAX_DEPTH: logger.error("The depth of args is too large, please check the input args.") @@ -293,11 +284,7 @@ def gen_args(args_info, api_name, func_options): func_options['depth'] = depth + 1 data = gen_args(arg, api_name, func_options) elif isinstance(arg, dict): - if arg.get('type') == DistributedCheckConst.TORCH_PROCESS_GROUP: - data = None - kwargs_params[DistributedCheckConst.GROUP] = arg - else: - data = gen_data(arg, api_name, need_grad, convert_type, real_data_path) + data = gen_data(arg, api_name, need_grad, convert_type, real_data_path) elif arg is None: data = None else: @@ -324,8 +311,6 @@ def gen_kwargs(api_info, api_name, convert_type=None, real_data_path=None): kwargs_params[key] = gen_list_kwargs(value, api_name, convert_type, real_data_path) elif value is None: kwargs_params[key] = None - elif key == DistributedCheckConst.GROUP and value.get('type') == DistributedCheckConst.TORCH_PROCESS_GROUP: - kwargs_params[key] = value elif key == 'atten_mask' and api_name == 'npu_fusion_attention': 
sparse_mode = kwargs_params.get('sparse_mode', {}) if isinstance(sparse_mode, dict): @@ -430,19 +415,17 @@ def gen_api_params(api_info, api_name, need_grad=True, convert_type=None, real_d if convert_type and convert_type not in Const.CONVERT: error_info = f"convert_type params not support {convert_type}." raise CompareException(CompareException.INVALID_PARAM_ERROR, error_info) - + kwargs_params = gen_kwargs(api_info, api_name, convert_type, real_data_path) func_options = { 'need_grad': need_grad, 'convert_type': convert_type, 'real_data_path': real_data_path, - 'depth': 0, - 'input_kwargs': api_info.get("input_kwargs", {}) + 'depth': 0 } if api_info.get("input_args"): args_params = gen_args(api_info.get("input_args"), api_name, func_options) else: logger.warning(f'Warning: No args in {api_info} ') args_params = [] - kwargs_params = gen_kwargs(api_info, api_name, convert_type, real_data_path) output_dtype = get_output_dtype(api_info) return args_params, kwargs_params, output_dtype diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_bench_function.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_bench_function.py deleted file mode 100644 index 18ff05bc00c2c5271e965dbd91fd54be1d410876..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_bench_function.py +++ /dev/null @@ -1,204 +0,0 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import torch - -from msprobe.core.common.const import DistributedCheckConst -from msprobe.pytorch.api_accuracy_checker.common.utils import check_object_type -from msprobe.pytorch.api_accuracy_checker.compare.algorithm import kahan_range -from msprobe.pytorch.api_accuracy_checker.run_ut.run_ut_utils import get_distributed_args - - -def sort_all_input(inputs): - ranks = len(inputs) - if ranks <= 1: - return inputs - combined_tensor = torch.stack(inputs) - sorted_indices = torch.argsort(combined_tensor, descending=True, dim=0) - combined_tensor = torch.gather(combined_tensor, 0, sorted_indices) - sorted_inputs = [combined_tensor[i] for i in range(ranks)] - return sorted_inputs - - -def reduce_sum(tensors): - min_bound = torch.min( - kahan_range(tensors, negative=False), - kahan_range(tensors[::-1], negative=False), - ) - max_bound = torch.max( - kahan_range(tensors, negative=True), kahan_range(tensors[::-1], negative=True) - ) - tensors_sorted = sort_all_input(tensors) - min_sorted_bound = torch.min( - kahan_range(tensors_sorted, negative=False), - kahan_range(tensors_sorted[::-1], negative=False), - ) - max_sorted_bound = torch.max( - kahan_range(tensors_sorted, negative=True), - kahan_range(tensors_sorted[::-1], negative=True), - ) - return torch.min(min_bound, min_sorted_bound), torch.max( - max_bound, max_sorted_bound - ) - - -def reduce_product(tensors): - return torch.stack(tensors).prod(dim=0) - - -def reduce_min(tensors): - return torch.stack(tensors).min(dim=0).values - - -def reduce_max(tensors): - return torch.stack(tensors).max(dim=0).values - - 
-def reduce_band(tensors): - reduce_tensor = tensors[0].clone() - if len(tensors) > 1: - for t in tensors[1:]: - reduce_tensor &= t - return reduce_tensor - - -def reduce_bor(tensors): - reduce_tensor = tensors[0].clone() - if len(tensors) > 1: - for t in tensors[1:]: - reduce_tensor |= t - return reduce_tensor - - -def reduce_bxor(tensors): - reduce_tensor = tensors[0].clone() - if len(tensors) > 1: - for t in tensors[1:]: - reduce_tensor ^= t - return reduce_tensor - - -def mock_broadcast(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - if len(input_args) < 1 or len(input_kwargs) < 1: - raise ValueError("input_args and input_kwargs should have at least 1 element") - - src = get_distributed_args(api_name, input_args[0], input_kwargs[0], DistributedCheckConst.SRC) - - group = get_distributed_args(api_name, input_args[0], input_kwargs[0], DistributedCheckConst.GROUP) - group_ranks = group.get(DistributedCheckConst.GROUP_RANKS, []) - if not group_ranks: - raise ValueError("group_ranks should not be empty") - real_src = src - min(group_ranks) - if len(input_args) <= real_src: - raise ValueError("input_args should have at least {} element".format(real_src + 1)) - - return input_args[real_src][0] - - -def mock_reduce(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - if len(input_args) < 1 or len(input_kwargs) < 1: - raise ValueError("input_args and input_kwargs should have at least 1 element") - - reduce_op = get_distributed_args(api_name, input_args[0], input_kwargs[0], DistributedCheckConst.OP) - tensors = [] - for arg in input_args: - if len(arg) > 0: - tensors.append(arg[0]) - reduce_tensor = None - if not tensors: - return reduce_tensor - reduce_ops = { - DistributedCheckConst.REDOPTYPE_SUM: reduce_sum, - DistributedCheckConst.REDOPTYPE_PRODUCT: reduce_product, - DistributedCheckConst.REDOPTYPE_MIN: reduce_min, - 
DistributedCheckConst.REDOPTYPE_MAX: reduce_max, - DistributedCheckConst.REDOPTYPE_BAND: reduce_band, - DistributedCheckConst.REDOPTYPE_BOR: reduce_bor, - DistributedCheckConst.REDOPTYPE_BXOR: reduce_bxor, - } - if reduce_op not in reduce_ops: - raise ValueError(f"Unsupported reduce operation: {reduce_op}") - reduce_tensor = reduce_ops[reduce_op](tensors) - - return reduce_tensor - - -def mock_scatter(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - if len(input_args) < 1 or len(input_kwargs) < 1: - raise ValueError("input_args and input_kwargs should have at least 1 element") - - src = get_distributed_args(api_name, input_args[0], input_kwargs[0], DistributedCheckConst.SRC) - group = get_distributed_args(api_name, input_args[0], input_kwargs[0], DistributedCheckConst.GROUP) - group_ranks = group.get(DistributedCheckConst.GROUP_RANKS, []) - if not group_ranks: - raise ValueError("group_ranks should not be empty") - real_src = src - min(group_ranks) - if len(input_args) <= real_src: - raise ValueError("input_args should have at least {} element".format(real_src + 1)) - scatter_list = get_distributed_args(api_name, input_args[real_src], input_kwargs[real_src], - DistributedCheckConst.SCATTER_LIST) - return scatter_list - - -def mock_all_gather(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - gather_tensor = [] - for data in input_args: - if len(data) > 1: - gather_tensor.append(data[1]) - return gather_tensor - - -def mock_all_to_all(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - input_tensor_list = [] - for data in input_args: - if len(data) >= 2: - input_tensor_list.append(data[1]) - world_size = len(input_tensor_list) - output_tensor_list = [] - for rank in range(world_size): - output_chunk = [] - for data in input_tensor_list: - if len(data) <= rank: - raise 
ValueError("input_tensor_list should have at least {} element".format(rank + 1)) - output_chunk.append(data[rank]) - output_tensor_list.append(output_chunk) - return output_tensor_list - - -def mock_all_to_all_single(api_name, input_args, input_kwargs): - check_object_type(input_args, list) - check_object_type(input_kwargs, list) - input_tensor_list = [] - for data in input_args: - if len(data) >= 2: - input_tensor_list.append(data[1]) - if not input_tensor_list: - return [] - input_tensor = torch.stack(input_tensor_list) - output_tensor = input_tensor.t() - output_tensor_list = [tensor.clone() for tensor in output_tensor] - return output_tensor_list diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_compare_function.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_compare_function.py deleted file mode 100644 index f7cf95a1d0d9060b75a45a360e6a4d5d8b087637..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_compare_function.py +++ /dev/null @@ -1,116 +0,0 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import itertools -import torch -import tqdm - -from msprobe.core.common.const import CompareConst, DistributedCheckConst - - -def cumulative_check(rank, inputs, output, min_bound, max_bound): - # Check whether each element lies between the minimum and maximum bounds - res = CompareConst.PASS - out_of_bounds = torch.nonzero((output < min_bound) | (output > max_bound)) - if out_of_bounds.shape[0] == 0: - return res - # For out-of-range values, check every accumulation order - perms = list(itertools.permutations(list(range(len(inputs))))) - if len(out_of_bounds) > DistributedCheckConst.MAX_CUMSUM_CHECK_NUM: - res = CompareConst.WARNING - out_of_bounds = out_of_bounds[: DistributedCheckConst.MAX_CUMSUM_CHECK_NUM] - pbar = tqdm.tqdm( - out_of_bounds, - position=rank + 1, - desc=f"Suspicious cumulative result check for rank{rank}", - ) - for indice in pbar: - indice_tuple = tuple(indice) - input_values = torch.stack([input_[indice_tuple] for input_ in inputs])[perms] - for i in range(1, len(inputs)): - input_values[:, 0] += input_values[:, i] - if output[indice_tuple] not in input_values[:, 0]: - res = CompareConst.ERROR - break - pbar.close() - return res - - -def compare_broadcast(device_out, bench_out, **kwargs): - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - compare_result = torch.equal(device_out[0].cpu(), bench_out) - - return CompareConst.PASS if compare_result else CompareConst.ERROR - - -def compare_all_reduce(device_out, bench_out, **kwargs): - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - if isinstance(bench_out, tuple): - rank = kwargs.get("local_rank", 0) - input_args = kwargs.get("input_args", []) - tensors = [] - for arg in input_args: - if len(arg) > 0: - tensors.append(arg[0]) - if len(tensors) < 1: - raise ValueError("input_args should have at least 1 element") - result = cumulative_check(rank, tensors, device_out[0].cpu(), *bench_out) - else: - compare_result = torch.equal(device_out[0].cpu(), bench_out) - result = CompareConst.PASS if compare_result else
CompareConst.ERROR - return result - - -def compare_scatter(device_out, bench_out, **kwargs): - rank = kwargs.get("local_rank", 0) - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - if len(bench_out) <= rank: - raise ValueError("bench_out should have at least rank+1 outputs") - compare_result = torch.equal(device_out[0].cpu(), bench_out[rank]) - - return CompareConst.PASS if compare_result else CompareConst.ERROR - - -def compare_all_gather(device_out, bench_out, **kwargs): - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - device_out_cpu = [tensor.cpu() for tensor in device_out[0]] - compare_result = all(torch.equal(a, b) for a, b in zip(device_out_cpu, bench_out)) - - return CompareConst.PASS if compare_result else CompareConst.ERROR - - -def compare_all_to_all(device_out, bench_out, **kwargs): - rank = kwargs.get("local_rank", 0) - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - device_out_cpu = [tensor.cpu() for tensor in device_out[0]] - compare_result = all(torch.equal(a, b) for a, b in zip(device_out_cpu, bench_out[rank])) - - return CompareConst.PASS if compare_result else CompareConst.ERROR - - -def compare_all_to_all_single(device_out, bench_out, **kwargs): - rank = kwargs.get("local_rank", 0) - if len(device_out) < 1: - raise ValueError("device_out should not be empty") - compare_result = torch.equal(device_out[0].cpu(), bench_out[rank]) - - return CompareConst.PASS if compare_result else CompareConst.ERROR diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_function_registry.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_function_registry.py deleted file mode 100644 index 6758b4ff4f8b286477880f74cd34e3516060c3fb..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/distributed_function_registry.py +++ /dev/null @@ -1,68 +0,0 @@ 
-#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from typing import Callable - -from msprobe.pytorch.api_accuracy_checker.run_ut.distributed_bench_function import \ - mock_broadcast, mock_reduce, mock_scatter, mock_all_gather, mock_all_to_all, \ - mock_all_to_all_single -from msprobe.pytorch.api_accuracy_checker.run_ut.distributed_compare_function import \ - compare_broadcast, compare_all_reduce, compare_scatter, \ - compare_all_gather, compare_all_to_all, compare_all_to_all_single -from msprobe.core.common.const import DistributedCheckConst - - -class DistributedFunctionRegistry: - def __init__(self): - self.compare_functions = {} - self.bench_functions = {} - self.support_api_list = [DistributedCheckConst.BROADCAST, DistributedCheckConst.ALL_REDUCE, - DistributedCheckConst.SCATTER, DistributedCheckConst.ALL_GATHER, - DistributedCheckConst.ALL_TO_ALL, DistributedCheckConst.ALL_TO_ALL_SINGLE] - - def register_compare_function(self, api_name: str, function: Callable): - self.compare_functions[api_name] = function - - def register_bench_function(self, api_name: str, function: Callable): - self.bench_functions[api_name] = function - - def register_functions(self, functions_dict): - for api_name, (bench_function, compare_function) in functions_dict.items(): - self.register_bench_function(api_name, bench_function) - 
self.register_compare_function(api_name, compare_function) - - def get_compare_function(self, api_name: str) -> Callable: - if not self.compare_functions.get(api_name): - raise Exception("No compare function registered for api: {}".format(api_name)) - return self.compare_functions.get(api_name) - - def get_bench_function(self, api_name: str) -> Callable: - if not self.bench_functions.get(api_name): - raise Exception("No benchmark function registered for api: {}".format(api_name)) - return self.bench_functions.get(api_name) - - -functions_map = { - DistributedCheckConst.BROADCAST: (mock_broadcast, compare_broadcast), - DistributedCheckConst.ALL_REDUCE: (mock_reduce, compare_all_reduce), - DistributedCheckConst.SCATTER: (mock_scatter, compare_scatter), - DistributedCheckConst.ALL_GATHER: (mock_all_gather, compare_all_gather), - DistributedCheckConst.ALL_TO_ALL: (mock_all_to_all, compare_all_to_all), - DistributedCheckConst.ALL_TO_ALL_SINGLE: (mock_all_to_all_single, compare_all_to_all_single) -} -distributed_func_registry = DistributedFunctionRegistry() -distributed_func_registry.register_functions(functions_map) diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/multi_run_ut.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/multi_run_ut.py index 4ac42ac81e1a60158d8fb0beb1d2f951851c614c..763a4505b2ddfa466fb1b3b1cd40b1c3bd799805 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/multi_run_ut.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/multi_run_ut.py @@ -87,10 +87,6 @@ def signal_handler(signum, frame): raise KeyboardInterrupt() -signal.signal(signal.SIGINT, signal_handler) -signal.signal(signal.SIGTERM, signal_handler) - - ParallelUTConfig = namedtuple('ParallelUTConfig', ['api_files', 'out_path', 'num_splits', 'save_error_data_flag', 'jit_compile_flag', 'device_id', 'result_csv_path', 'total_items', 'config_path']) @@ -217,6 +213,8 @@ def prepare_config(args): def 
main(): + signal.signal(signal.SIGINT, signal_handler) + signal.signal(signal.SIGTERM, signal_handler) parser = argparse.ArgumentParser(description='Run UT in parallel') _run_ut_parser(parser) parser.add_argument('-n', '--num_splits', type=int, choices=range(1, 65), default=8, diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_distributed_check.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_distributed_check.py deleted file mode 100644 index 54f3790bbc048a9265419a52e18519b77ab25de8..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_distributed_check.py +++ /dev/null @@ -1,254 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import argparse -import os -import sys -import time -from collections import namedtuple -import copy - -import torch_npu -import torch.distributed as dist -import torch.multiprocessing as mp - -from msprobe.core.common.const import Const, FileCheckConst, DistributedCheckConst, CompareConst -from msprobe.core.common.file_utils import FileChecker, write_csv, create_directory -from msprobe.core.compare.utils import check_and_return_dir_contents -from msprobe.pytorch.api_accuracy_checker.common.config import CheckerConfig -from msprobe.pytorch.api_accuracy_checker.common.utils import extract_basic_api_segments -from msprobe.pytorch.api_accuracy_checker.run_ut.run_ut_utils import generate_device_params, get_group_info, \ - is_port_in_use -from msprobe.pytorch.api_accuracy_checker.run_ut.run_ut import get_api_info -from msprobe.pytorch.api_accuracy_checker.run_ut.distributed_function_registry import distributed_func_registry -from msprobe.pytorch.common.log import logger -from msprobe.pytorch.common.parse_json import parse_json_info_forward_backward -from msprobe.pytorch.hook_module.api_register import get_api_register -from msprobe.pytorch.pt_config import parse_json_config - - -api_register = get_api_register(return_new=True) -api_register.initialize_hook(None) -distribute_api_key = Const.PT_FRAMEWORK + Const.SEP + Const.PT_API_TYPE_DIST -distributed_func = api_register.ori_api_attr.get(distribute_api_key, {}) - -current_time = time.strftime("%Y%m%d%H%M%S") -RESULT_FILE_NAME = "accuracy_checking_result_" + current_time + ".csv" -RESULT_CSV_HEADER = [['API_NAME', 'RANK', 'COMPARE_RESULT', 'MESSAGE']] -DistributedCheckParams = namedtuple("DistributedCheckParams", ["api_full_name", "all_args", "all_kwargs", - "group_ranks", "result_file_path", "checker_config"]) -special_rank_api_list = [DistributedCheckConst.SCATTER, - DistributedCheckConst.ALL_TO_ALL, - DistributedCheckConst.ALL_TO_ALL_SINGLE] - - -def cleanup(): - dist.destroy_process_group() - - -def 
distributed_setup(rank, world_size, master_ip, master_port): - init_method = DistributedCheckConst.TCP + Const.COLON + Const.DOUBLE_SLASH + master_ip + Const.COLON + master_port - dist.init_process_group(backend=DistributedCheckConst.HCCL, init_method=init_method, - world_size=world_size, rank=rank) - - -def parse_distributed_api(forward_content): - distributed_api = {} - for api_full_name, api_info_dict in forward_content.items(): - split_name = api_full_name.split(Const.SEP)[0] - if split_name == Const.DISTRIBUTED: - distributed_api.update({api_full_name: api_info_dict}) - return distributed_api - - -def _run_distributed_parser(parser): - parser.add_argument("-api_info", "--api_info_dir", dest="api_info_dir", default="", type=str, - help=" The api param tool result dir: generate from api param tool. ", - required=True) - parser.add_argument("-o", "--out_path", dest="out_path", default="", type=str, - help=" The ut task result out path.", - required=False) - parser.add_argument("-config", "--config_path", dest="config_path", default="", type=str, - help=" The path of config.json", required=False) - - -def _run_distributed(parser=None): - if parser is None: - parser = argparse.ArgumentParser() - _run_distributed_parser(parser) - args = parser.parse_args(sys.argv[1:]) - run_distributed_command(args) - - -def run_distributed_command(args): - input_checker = FileChecker(args.api_info_dir, FileCheckConst.DIR, ability=FileCheckConst.READ_ABLE) - api_info_dir = input_checker.common_check() - ranks = sorted(check_and_return_dir_contents(api_info_dir, Const.RANK)) - file_paths = [os.path.join(api_info_dir, rank, 'dump.json') for rank in ranks] - forward_contents = [] - real_data_paths = [] - for file_path in file_paths: - forward_content, _, real_data_path = parse_json_info_forward_backward(file_path) - if real_data_path: - dump_path = os.path.dirname(file_path) - real_data_path = os.path.join(dump_path, Const.DUMP_TENSOR_DATA) - distributed_api = 
parse_distributed_api(forward_content) - forward_contents.append(distributed_api) - real_data_paths.append(real_data_path) - - out_path = args.out_path if args.out_path else Const.DEFAULT_PATH - create_directory(out_path) - out_path_checker = FileChecker(out_path, FileCheckConst.DIR, ability=FileCheckConst.WRITE_ABLE) - out_path = out_path_checker.common_check() - result_file_path = os.path.join(out_path, RESULT_FILE_NAME) - write_csv(RESULT_CSV_HEADER, result_file_path) - if args.config_path: - config_path_checker = FileChecker(args.config_path, FileCheckConst.FILE, - FileCheckConst.READ_ABLE, FileCheckConst.JSON_SUFFIX) - checked_config_path = config_path_checker.common_check() - _, task_config = parse_json_config(checked_config_path, Const.RUN_UT) - checker_config = CheckerConfig(task_config) - else: - checker_config = CheckerConfig() - run_distributed_check(forward_contents, real_data_paths, result_file_path, checker_config) - - -def run_distributed_check(forward_contents, real_data_paths, result_file_path, checker_config): - for rank, forward_content in enumerate(forward_contents): - logger.info("Start to check distributed api in rank {}.".format(rank)) - - for api_full_name, api_info_dict in forward_content.items(): - _, api_name = extract_basic_api_segments(api_full_name) - - if api_name not in distributed_func_registry.support_api_list: - message = "The api {} doesn't support distributed check.".format(api_full_name) - logger.warning(message) - result_rows = [] - df_row = list([api_full_name, rank, CompareConst.SKIP, message]) - result_rows.append(df_row) - write_csv(result_rows, result_file_path) - continue - - if api_info_dict.get('used'): - continue - - group_ranks, group_id = get_group_info(api_full_name, api_name, api_info_dict) - if not group_ranks or not group_id: - logger.warning("The api {} doesn't support distributed check.".format(api_full_name)) - continue - all_args, all_kwargs = get_distributed_args_kwargs(forward_contents, api_full_name, - 
real_data_paths, group_ranks) - try: - distributed_check_params = DistributedCheckParams(api_full_name, all_args, all_kwargs, group_ranks, - result_file_path, checker_config) - distributed_check(distributed_check_params) - except Exception as e: - logger.error("The api {} in rank {} distributed check failed.".format(api_full_name, rank)) - result_rows = [] - df_row = list([api_full_name, rank, CompareConst.ERROR, str(e)]) - result_rows.append(df_row) - write_csv(result_rows, result_file_path) - - -def distributed_check(distributed_check_params): - api_full_name = distributed_check_params.api_full_name - all_args = distributed_check_params.all_args - all_kwargs = distributed_check_params.all_kwargs - group_ranks = distributed_check_params.group_ranks - result_file_path = distributed_check_params.result_file_path - checker_config = distributed_check_params.checker_config - - _, api_name = extract_basic_api_segments(api_full_name) - nprocs = len(group_ranks) - distributed_config = {} - distributed_config[DistributedCheckConst.API_FULL_NAME] = api_full_name - distributed_config[DistributedCheckConst.API_NAME] = api_name - distributed_config[DistributedCheckConst.GROUP_RANKS] = group_ranks - distributed_config[DistributedCheckConst.ALL_ARGS] = all_args - distributed_config[DistributedCheckConst.ALL_KWARGS] = all_kwargs - distributed_config[DistributedCheckConst.RESULT_FILE_PATH] = result_file_path - benchmark_function = distributed_func_registry.get_bench_function(api_name) - distributed_config[DistributedCheckConst.BENCHMARK_RESULT] = benchmark_function(api_name, all_args, all_kwargs) - distributed_config[DistributedCheckConst.MASTER_IP] = checker_config.master_ip - distributed_config[DistributedCheckConst.MASTER_PORT] = checker_config.master_port - distributed_config[DistributedCheckConst.WORLD_SIZE] = nprocs - - if is_port_in_use(checker_config.master_port, checker_config.master_ip): - raise ValueError( - f"Warning: Port {checker_config.master_port} on host " - 
f"{checker_config.master_ip} is already in use." - ) - logger.info(f"Port {checker_config.master_port} on host {checker_config.master_ip} is available.") - - mp.spawn(run_hccl, - args=(distributed_config,), - nprocs=nprocs) - - -def run_hccl(rank, distributed_config): - local_rank = distributed_config[DistributedCheckConst.GROUP_RANKS][rank] - torch_npu.npu.set_device(local_rank) - world_size = distributed_config[DistributedCheckConst.WORLD_SIZE] - master_ip = distributed_config[DistributedCheckConst.MASTER_IP] - master_port = distributed_config[DistributedCheckConst.MASTER_PORT] - distributed_setup(rank, world_size, master_ip, master_port) - api_full_name = distributed_config[DistributedCheckConst.API_FULL_NAME] - api_name = distributed_config[DistributedCheckConst.API_NAME] - input_args = distributed_config[DistributedCheckConst.ALL_ARGS] - rank_args = input_args[rank] - rank_kwargs = distributed_config[DistributedCheckConst.ALL_KWARGS][rank] - result_file_path = distributed_config[DistributedCheckConst.RESULT_FILE_PATH] - benchmark_result = distributed_config[DistributedCheckConst.BENCHMARK_RESULT] - device_args, _ = generate_device_params(rank_args, rank_kwargs, False, api_name) - logger.info("Start to check distributed api {} in rank {}.".format(api_full_name, local_rank)) - distributed_func.get(api_name)(*device_args) - dist.barrier() - if api_name in special_rank_api_list: - local_rank = rank - kwargs = { - "local_rank": local_rank, - "input_args": input_args - } - compare_function = distributed_func_registry.get_compare_function(api_name) - status = compare_function(device_args, benchmark_result, **kwargs) - message = '' - result_rows = [] - df_row = list([api_full_name, local_rank, status, message]) - result_rows.append(df_row) - write_csv(result_rows, result_file_path) - cleanup() - - -def get_distributed_args_kwargs(forward_contents, api_full_name, real_data_paths, group_ranks): - all_args, all_kwargs = [], [] - _, api_name = 
extract_basic_api_segments(api_full_name) - for group_rank in group_ranks: - target_api_info = forward_contents[group_rank].get(api_full_name) - if not target_api_info: - logger.warning("The api {} doesn't exist in rank {}.".format(api_full_name, group_rank)) - continue - if target_api_info.get('used'): - continue - target_api_info['used'] = True - args, kwargs, _ = get_api_info(target_api_info, api_name, real_data_paths[group_rank]) - all_args.append(args) - all_kwargs.append(kwargs) - return all_args, all_kwargs - - -if __name__ == '__main__': - logger.info("Start to run distributed ut task.") - _run_distributed() - logger.info("End to run distributed ut task.") diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_overflow_check.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_overflow_check.py index 7364ddb0c79c70371fb6192f8f2cafa309386996..0f184d14b66d84607a6767ba9ef5210ff4fc5b69 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_overflow_check.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_overflow_check.py @@ -65,6 +65,7 @@ def check_tensor_overflow(x): return False +@recursion_depth_decorator("check_data_overflow") def check_data_overflow(x, device): if isinstance(x, (tuple, list)): if not x: diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut.py index e33b69b5cd77341de2c16d9ff0b7857a6a9dccb1..52486480dcaf93d743fef2bb4de8a9a30a7ec90e 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut.py @@ -65,7 +65,8 @@ DETAILS_FILE_NAME = "accuracy_checking_details_" + current_time + ".csv" not_backward_list = ['repeat_interleave'] unsupported_backward_list = ['masked_select'] -unsupported_api_list = ["to"] +unsupported_api_list = ["to", 
"empty", "empty_like", "empty_strided", "new_empty", "new_empty_strided", + "empty_with_format"] tqdm_params = { @@ -480,6 +481,7 @@ def _run_ut(parser=None): _run_ut_parser(parser) args = parser.parse_args(sys.argv[1:]) run_ut_command(args) + def checked_online_config(online_config): @@ -581,6 +583,7 @@ def run_ut_command(args): if len(parts_by_underscore) < 2: raise ValueError("File name part does not contain enough '_' separated segments.") time_info = parts_by_underscore[-1] + global UT_ERROR_DATA_DIR UT_ERROR_DATA_DIR = 'ut_error_data' + time_info error_data_path = initialize_save_error_data(error_data_path) diff --git a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut_utils.py b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut_utils.py index cbd75da166b4d7ddaa89e9672fd1442cba87028e..289773d0e603192ffe5bf83447d7452b5fad4b37 100644 --- a/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut_utils.py +++ b/debug/accuracy_tools/msprobe/pytorch/api_accuracy_checker/run_ut/run_ut_utils.py @@ -14,7 +14,6 @@ # limitations under the License. 
import os -import socket from collections import namedtuple import re @@ -28,7 +27,7 @@ else: current_device = "npu" from torch_npu.npu.amp import autocast -from msprobe.core.common.const import FileCheckConst, Const, CompareConst, DistributedCheckConst +from msprobe.core.common.const import FileCheckConst, Const, CompareConst from msprobe.core.common.file_utils import FileChecker from msprobe.core.common.log import logger from msprobe.core.common.utils import CompareException @@ -125,8 +124,6 @@ def exec_api(exec_params): api_register.initialize_hook(None) api_func_type = list(prefix_map.keys())[list(prefix_map.values()).index(api_type)] api_func = api_register.ori_api_attr.get(Const.PT_FRAMEWORK + Const.SEP + api_func_type, {}).get(api_name) - if api_func is None: - return out torch_api = ApiTemplate(api_name, api_func, api_type, None, need_hook=False, device=device) if is_autocast: @@ -235,7 +232,7 @@ def generate_cpu_params(input_args, input_kwargs, need_backward, api_name): origin_dtype = need_raise_dtypes.pop() raise_dtype = PRECISION_MAPPING.get(origin_dtype, torch.float32) autocast_dtype = origin_dtype - + elif len(need_raise_dtypes) >= 2: raise_dtype = torch.float32 need_raise_dtypes.discard(torch.float32) @@ -262,65 +259,3 @@ def is_unsupported_api(api_name, is_overflow_check=False): if flag: logger.info(f"{split_name} api is not supported for run ut. 
SKIP.") return flag - - -def get_args_index(api_name, args_name): - """ - Get the argument index from the API name and argument name, i.e. group_index or src_index. - :param api_name: API name, e.g. "broadcast" or "all_reduce" - :param args_name: argument name, e.g. "group" or "src" - :return: the argument index, or None if the API name or argument name does not exist - """ - api_info = DistributedCheckConst.API_ARGS_INDEX.get(api_name) - if api_info: - return api_info.get(args_name) - return None - - -def get_distributed_args(api_name, input_args, input_kwargs, args_name): - res = None - res = input_kwargs.get(args_name) - if res: - return res - res_index = get_args_index(api_name, args_name) - if not res_index or len(input_args) <= res_index: - return None - res = input_args[res_index] - return res - - -def get_group_info(api_full_name, api_name, api_info_dict): - input_args = api_info_dict.get('input_args', []) - input_kwargs = api_info_dict.get('input_kwargs', {}) - group = get_distributed_args(api_name, input_args, input_kwargs, DistributedCheckConst.GROUP) - - if not group: - logger.warning("The api {} doesn't have group info.".format(api_full_name)) - return None, None - group_ranks = group.get('group_ranks') - if not group_ranks: - logger.warning("The group of api {} doesn't have group_ranks info.".format(api_full_name)) - return None, None - group_id = group.get('group_id') - if not group_id: - logger.warning("The group of api {} doesn't have group_id info.".format(api_full_name)) - return None, None - return group_ranks, group_id - - -def is_port_in_use(port, host): - """ - Check whether the specified port is in use. - :param port: port number to check - :param host: host address - :return: True if the port is in use, otherwise False - """ - if not isinstance(port, str) or not port.isdigit(): - raise Exception(f"port: {port} is invalid. 
Port must be a numeric string.") - port = int(port) - with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s: - try: - s.bind((host, port)) - return False # port is not in use - except socket.error: - return True # port is already in use diff --git a/debug/accuracy_tools/msprobe/pytorch/common/utils.py b/debug/accuracy_tools/msprobe/pytorch/common/utils.py index c5cef997aef93fd8f5e30213a64301a0793e8a05..f906eaaba2a26ca26efa077946db1b199a2f71fe 100644 --- a/debug/accuracy_tools/msprobe/pytorch/common/utils.py +++ b/debug/accuracy_tools/msprobe/pytorch/common/utils.py @@ -24,6 +24,7 @@ from functools import wraps import numpy as np import torch import torch.distributed as dist + from msprobe.core.common.exceptions import DistributedNotInitializedError from msprobe.core.common.file_utils import (FileCheckConst, change_mode, check_file_or_directory_path, check_path_before_create, FileOpen) @@ -38,7 +39,9 @@ except ImportError: else: is_gpu = False + torch_without_guard_version = torch.__version__ >= '2.1' +torch_version_above_or_equal_2 = torch.__version__.split('+')[0] >= '2.0' if not is_gpu and not torch_without_guard_version: from torch_npu.utils.device_guard import torch_device_guard as torch_npu_device_guard @@ -313,14 +316,14 @@ def print_rank_0(message): logger.info(message) -def load_pt(pt_path, to_cpu=False, weights_only=True): +def load_pt(pt_path, to_cpu=False): pt_path = os.path.realpath(pt_path) check_file_or_directory_path(pt_path) try: if to_cpu: - pt = torch.load(pt_path, map_location=torch.device("cpu"), weights_only=weights_only) + pt = torch.load(pt_path, map_location=torch.device("cpu"), weights_only=True) else: - pt = torch.load(pt_path, weights_only=weights_only) + pt = torch.load(pt_path, weights_only=True) except Exception as e: raise RuntimeError(f"load pt file {pt_path} failed") from e return pt @@ -395,7 +398,7 @@ def save_api_data(api_data): io_buff = io.BytesIO() torch.save(api_data, io_buff) except Exception as e: - raise RuntimeError(f"save api_data to io_buff 
failed") from e + raise RuntimeError("save api_data to io_buff failed") from e return io_buff @@ -405,7 +408,7 @@ def load_api_data(api_data_bytes): buffer = io.BytesIO(api_data_bytes) buffer = torch.load(buffer, map_location="cpu") except Exception as e: - raise RuntimeError(f"load api_data from bytes failed") from e + raise RuntimeError("load api_data from bytes failed") from e return buffer @@ -476,15 +479,6 @@ def check_save_param(variable, name, save_backward): raise ValueError -def replace_last_occurrence(text, old, new): - if text is None: - return text - index = text.rfind(old) - if index != -1: - return text[:index] + text[index:].replace(old, new, 1) - return text - - def is_torch_nn_module(variable): return isinstance(variable, torch.nn.Module) and not isinstance(variable, torch.jit.ScriptModule) @@ -499,3 +493,17 @@ def is_float8_tensor(tensor): if str(tensor.dtype) in [Const.FLOAT8_E5M2_TYPE, Const.FLOAT8_E4M3FN_TYPE]: return True return is_hifloat8_tensor(tensor) + + +def register_forward_pre_hook(module, forward_pre_hook): + if torch_version_above_or_equal_2: + module.register_forward_pre_hook(forward_pre_hook, with_kwargs=True) + else: + module.register_forward_pre_hook(forward_pre_hook) + + +def register_forward_hook(module, forward_hook): + if torch_version_above_or_equal_2: + module.register_forward_hook(forward_hook, with_kwargs=True) + else: + module.register_forward_hook(forward_hook) diff --git a/debug/accuracy_tools/msprobe/pytorch/compare/distributed_compare.py b/debug/accuracy_tools/msprobe/pytorch/compare/distributed_compare.py index de62af421b5a37e39140a9836fb16853443740d7..a484ad5ceed06fd7e8ecd8c1ada7b9b7060260ab 100644 --- a/debug/accuracy_tools/msprobe/pytorch/compare/distributed_compare.py +++ b/debug/accuracy_tools/msprobe/pytorch/compare/distributed_compare.py @@ -1,4 +1,4 @@ -# Copyright (c) 2019-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. 
# # Licensed under the Apache License, Version 2.0 (the "License"); @@ -15,14 +15,10 @@ import os -from msprobe.core.common.exceptions import FileCheckException -from msprobe.core.common.file_utils import create_directory -from msprobe.core.common.utils import CompareException, check_compare_param, check_configuration_param, get_dump_mode, \ - set_dump_path -from msprobe.core.compare.acc_compare import ModeConfig -from msprobe.core.compare.utils import check_and_return_dir_contents, extract_json, set_stack_json_path +from msprobe.core.common.utils import CompareException +from msprobe.core.compare.utils import check_and_return_dir_contents, extract_json from msprobe.pytorch.common.log import logger -from msprobe.pytorch.compare.pt_compare import PTComparator, compare +from msprobe.pytorch.compare.pt_compare import compare def compare_distributed(npu_dump_dir, bench_dump_dir, output_path, **kwargs): diff --git a/debug/accuracy_tools/msprobe/pytorch/compare/pt_compare.py b/debug/accuracy_tools/msprobe/pytorch/compare/pt_compare.py index 308a82b3d6e9beb67a669ea05b83d7b8a6eddc90..16f0dedb9eea111fdfe090b68b9e7716df9f961d 100644 --- a/debug/accuracy_tools/msprobe/pytorch/compare/pt_compare.py +++ b/debug/accuracy_tools/msprobe/pytorch/compare/pt_compare.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -13,92 +13,20 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-import os.path +from msprobe.core.compare.acc_compare import Comparator, ModeConfig, MappingConfig, setup_comparison +from msprobe.pytorch.compare.utils import read_pt_data -import torch -from msprobe.core.common.const import FileCheckConst -from msprobe.core.common.exceptions import FileCheckException -from msprobe.core.common.file_utils import FileChecker, create_directory, load_yaml -from msprobe.core.common.utils import CompareException, check_compare_param, check_configuration_param, get_dump_mode, \ - set_dump_path -from msprobe.core.compare.acc_compare import Comparator, ModeConfig -from msprobe.core.compare.utils import set_stack_json_path -from msprobe.pytorch.common.log import logger -from msprobe.pytorch.common.utils import load_pt - - -class PTComparator(Comparator): - def __init__(self, mode_config, data_mapping=None): - super().__init__(mode_config) - - self.stack_mode = mode_config.stack_mode - self.auto_analyze = mode_config.auto_analyze - self.fuzzy_match = mode_config.fuzzy_match - self.dump_mode = mode_config.dump_mode - - self.frame_name = PTComparator.__name__ - self.data_mapping = data_mapping - if isinstance(self.data_mapping, str) or self.data_mapping is None: - self.data_mapping_dict = self.load_mapping_file(self.data_mapping) - elif isinstance(self.data_mapping, dict): - self.data_mapping_dict = self.data_mapping - else: - raise TypeError(f"The type of parameter `data_mapping` must be dict, str or None, but got " - f"{type(self.data_mapping)}") - - @staticmethod - def load_mapping_file(mapping_file): - if isinstance(mapping_file, str): - mapping_dict = load_yaml(mapping_file) - else: - mapping_dict = {} - return mapping_dict - - def read_npy_data(self, dir_path, file_name): - if not file_name: - return None - data_path = os.path.join(dir_path, file_name) - path_checker = FileChecker(data_path, FileCheckConst.FILE, FileCheckConst.READ_ABLE, - FileCheckConst.PT_SUFFIX, False) - data_path = path_checker.common_check() - try: - # detach 
because numpy can not process gradient information - data_value = load_pt(data_path, to_cpu=True).detach() - except RuntimeError as e: - # Catch exceptions raised by load_pt here - logger.error(f"Failed to load the .pt file at {data_path}.") - raise CompareException(CompareException.INVALID_FILE_ERROR) from e - except AttributeError as e: - # Catch exceptions raised by the detach method here - logger.error(f"Failed to detach the loaded tensor.") - raise CompareException(CompareException.DETACH_ERROR) from e - if data_value.dtype == torch.bfloat16: - data_value = data_value.to(torch.float32) - data_value = data_value.numpy() - return data_value +def read_real_data(npu_dir, npu_data_name, bench_dir, bench_data_name, _) -> tuple: + n_value = read_pt_data(npu_dir, npu_data_name) + b_value = read_pt_data(bench_dir, bench_data_name) + return n_value, b_value def compare(input_param, output_path, **kwargs): - try: - auto_analyze = kwargs.get('auto_analyze', True) - fuzzy_match = kwargs.get('fuzzy_match', False) - data_mapping = kwargs.get('data_mapping', None) - suffix = kwargs.get('suffix', '') - - set_dump_path(input_param) - dump_mode = get_dump_mode(input_param) - if "stack_json_path" in input_param: - stack_mode = kwargs.get('stack_mode', False) - else: - stack_mode = set_stack_json_path(input_param) # set stack_mode and set "stack_json_path" in input_param - check_configuration_param(stack_mode, auto_analyze, fuzzy_match, input_param.get('is_print_compare_log', True)) - create_directory(output_path) - check_compare_param(input_param, output_path, dump_mode, stack_mode) - except (CompareException, FileCheckException) as error: - logger.error('Compare failed.
Please check the arguments and do it again!') - raise CompareException(error.code) from error + config = setup_comparison(input_param, output_path, **kwargs) - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - pt_comparator = PTComparator(mode_config, data_mapping) - pt_comparator.compare_core(input_param, output_path, suffix=suffix) + mode_config = ModeConfig(config.stack_mode, config.auto_analyze, config.fuzzy_match, config.dump_mode) + mapping_config = MappingConfig(data_mapping=config.data_mapping) + pt_comparator = Comparator(read_real_data, mode_config, mapping_config) + pt_comparator.compare_core(input_param, output_path, suffix=config.suffix) diff --git a/debug/accuracy_tools/msprobe/pytorch/compare/utils.py b/debug/accuracy_tools/msprobe/pytorch/compare/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..16473ff386d89de5f3bbb269e69837c07a950ea5 --- /dev/null +++ b/debug/accuracy_tools/msprobe/pytorch/compare/utils.py @@ -0,0 +1,47 @@ +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os + +import torch + +from msprobe.core.common.utils import logger, CompareException +from msprobe.core.common.file_utils import FileChecker, FileCheckConst +from msprobe.pytorch.common.utils import load_pt + + +def read_pt_data(dir_path, file_name): + if not file_name: + return None + + data_path = os.path.join(dir_path, file_name) + path_checker = FileChecker(data_path, FileCheckConst.FILE, FileCheckConst.READ_ABLE, + FileCheckConst.PT_SUFFIX, False) + data_path = path_checker.common_check() + try: + # detach because numpy can not process gradient information + data_value = load_pt(data_path, to_cpu=True).detach() + except RuntimeError as e: + # Catch exceptions raised by load_pt here + logger.error(f"Failed to load the .pt file at {data_path}.") + raise CompareException(CompareException.INVALID_FILE_ERROR) from e + except AttributeError as e: + # Catch exceptions raised by the detach method here + logger.error(f"Failed to detach the loaded tensor.") + raise CompareException(CompareException.DETACH_ERROR) from e + if data_value.dtype == torch.bfloat16: + data_value = data_value.to(torch.float32) + data_value = data_value.numpy() + return data_value diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/compare_weight.py b/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/compare_weight.py deleted file mode 100644 index b4c49fc3a8e0ed7838de451f9e8dcfbcf4363388..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/compare_weight.py +++ /dev/null @@ -1,71 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from typing import Dict -from tqdm import tqdm - -from msprobe.core.common.file_utils import save_json, check_file_or_directory_path -from msprobe.pytorch.common.log import logger -from msprobe.pytorch.config_checking.ckpt_compare.megatron_loader import load_megatron_weights -from msprobe.pytorch.config_checking.ckpt_compare.metrics import METRIC_FUNC - - -def compare_checkpoints(ckpt_path1, ckpt_path2, output_path) -> Dict: - """Compare weights between two checkpoints using cosine similarity and L2 distance. - - Args: - ckpt_path1 (str): Path to first checkpoint directory - ckpt_path2 (str): Path to second checkpoint directory - output_path (str): Path to save comparison results JSON file - - Returns: - Dict: Dictionary containing comparison metrics for each parameter. The dictionary has the following structure: - { - "param_name": { - "cosine_similarity": float, # Cosine similarity between parameter tensors - "l2_distance": float, # L2 distance between parameter tensors - "shape": List[int] # Shape of the parameter tensors - }, - ... 
- } - """ - - # Load both checkpoints - check_file_or_directory_path(output_path) - weights1 = load_megatron_weights(ckpt_path1) - weights2 = load_megatron_weights(ckpt_path2) - - # Initialize results dictionary - results = {} - - # Compare weights with matching keys - common = set(weights1) & set(weights2) - logger.warning(f'Parameters not in ckpt2: {set(weights1) - set(weights2)}') - logger.warning(f'Parameters not in ckpt1: {set(weights2) - set(weights1)}') - for key in tqdm(common): - tensor1 = weights1[key].float() - tensor2 = weights2[key].float() - - results[key] = {} - for metric, func in METRIC_FUNC.items(): - try: - results[key][metric] = func(tensor1, tensor2) - except Exception as e: - logger.warning(f'Error when calculate {metric} for reason: {e}') - - # Write results to JSON file - save_json(output_path, results, indent=4) - logger.info(f"Comparison results written to {output_path}") - return results diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/megatron_loader.py b/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/megatron_loader.py deleted file mode 100644 index 4b756b25f4eafdff3eb92b09377c80f35ef3a7e8..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/megatron_loader.py +++ /dev/null @@ -1,273 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import re -from collections import defaultdict -from typing import Dict -import torch -from msprobe.pytorch.common.log import logger -from msprobe.core.common.utils import recursion_depth_decorator -from msprobe.core.common.const import Const -from msprobe.core.common.file_utils import FileOpen, load_yaml -from msprobe.pytorch.common.utils import load_pt - -try: - import megatron -except ModuleNotFoundError as e: - raise ModuleNotFoundError("No module named 'megatron', which is required to load a megatron ckpt") from e - - -COLUMN_PARALLEL_PARAMS = ['linear_qkv', 'linear_fc1', 'word_embeddings.weight'] -ARGS = 'args' -LAYER_IDX_PATTERN = re.compile('layers\.(\d+)\.') -EXPERT_IDX_PATTERN = re.compile('experts\.(\d+)\.') - - -@recursion_depth_decorator('') -def _get_parameter(weights, prefix=''): - for k, v in weights.items(): - name = Const.SEP.join([prefix, k]).strip(Const.SEP) - if isinstance(v, dict): - yield from _get_parameter(v, prefix=name) - elif isinstance(v, torch.Tensor): - yield name, v - - -def _map_to_mcore_local_names(param_name: str) -> str: - """Map parameter names to mcore + local transformer implementation names.""" - mcore_local_map = load_yaml(os.path.join(os.path.dirname(__file__), 'name_mapping.yaml')) - for other_name, mcore_local_name in mcore_local_map.items(): - param_name = param_name.replace(other_name, mcore_local_name) - - return param_name - - -def _parse_real_layer_idx(param_name, num_layers_per_stage, pp_size, pp_rank): - """Map local (virtual) pipeline stage layer index to global layer index. - - For virtual pipeline parallel, each pipeline stage is further divided into virtual stages. - The global layer index needs to account for both pipeline stage and virtual stage. 
- - Args: - param_name (str): Parameter name containing layer index - num_layers_per_stage (int): Number of layers per pipeline stage - pp_size (int): Pipeline parallel size - - Returns: - int: Global layer index accounting for both pipeline and virtual pipeline stages - """ - # Extract local layer index from parameter name - layer_match = re.search(LAYER_IDX_PATTERN, param_name) - param_name, vpp_stage = param_name.split(Const.SCOPE_SEPARATOR) - if not layer_match: - return param_name - - local_layer_idx = int(layer_match.group(1)) - vpp_stage = int(vpp_stage) - - # Calculate global layer index based on pipeline stage and virtual stage - real_layer_idx = local_layer_idx + (pp_size * vpp_stage + pp_rank) * num_layers_per_stage - - return param_name.replace(f'layers.{local_layer_idx}', f'layers.{real_layer_idx}') - - -def _parse_real_expert_idx(param_name, num_experts_per_rank, exp_rank): - """Map local expert index to global expert index. TODO: shared expert - - For expert parallel, experts are distributed across ranks. This function maps - the local expert index on a rank to its global index across all ranks. - - Args: - param_name (str): Parameter name containing local expert index - num_experts_per_rank (int): Number of experts on each rank - exp_rank (int): Expert parallel rank - - Returns: - str: Parameter name with local expert index replaced by global expert index - """ - # Extract local layer index from parameter name - expert_match = re.search(EXPERT_IDX_PATTERN, param_name) - if not expert_match: - return param_name - - local_expert_idx = int(expert_match.group(1)) - # Calculate global layer index based on pipeline stage and virtual stage - real_experts_idx = local_expert_idx + exp_rank * num_experts_per_rank - - return param_name.replace(f'experts.{local_expert_idx}', f'experts.{real_experts_idx}') - - -def _consolidate_tp_weights(weights: Dict) -> Dict: - """Consolidate weights from different tensor parallel ranks into combined tensors. 
- - Args: - weights: Dictionary of weights with rank information in keys - - Returns: - Dict: Consolidated weights without rank information - """ - consolidated = {} - for key, tensors in weights.items(): - if any([name in key for name in COLUMN_PARALLEL_PARAMS]): - # Column parallel - concatenate along input dimension (dim 0) - combined = torch.cat(tensors, dim=0) - elif "linear_proj.weight" in key or "linear_fc2.weight" in key: - # Row parallel - concatenate along output dimension (dim 1) - combined = torch.cat(tensors, dim=1) - else: - # For other params, verify identical and use first - if not all(torch.allclose(tensors[0], t) for t in tensors[1:]): - logger.warning(f"Inconsistent values for {key} across TP ranks") - combined = tensors[0] - - consolidated[key] = combined - return consolidated - - -def _parse_num_layers_per_stage(tp_partition): - match = [re.findall(LAYER_IDX_PATTERN, key) for key in tp_partition.keys()] - layer_idx = [int(i[0]) for i in match if i] - num_layers_per_pipeline_stage = max(layer_idx) + 1 - - return num_layers_per_pipeline_stage - - -def parse_parallel_size(checkpoint_dir: str): - """Parse tensor, pipeline and expert parallel sizes from checkpoint filenames. 
- - Args: - checkpoint_dir (str): Directory containing checkpoint files - - Returns: - Namespace - """ - # Find all rank directories - rank_dirs = [d for d in os.listdir(checkpoint_dir) if d.startswith('mp_rank_')] - - if not rank_dirs: - raise ValueError(f"No checkpoint rank directories found in {checkpoint_dir}") - - ckpt = load_pt(os.path.join(checkpoint_dir, rank_dirs[0], 'model_optim_rng.pt'), to_cpu=True, weights_only=False) - args = ckpt[ARGS] - return ( - args.tensor_model_parallel_size, - args.pipeline_model_parallel_size, - args.expert_model_parallel_size, - args.num_experts - ) - - -def parse_iteration(checkpoint_path: str) -> Dict: - iteration = None - latest_iteration = None - tracker_file = os.path.join(checkpoint_path, "latest_checkpointed_iteration.txt") - if os.path.exists(tracker_file): - with FileOpen(tracker_file, 'r') as f: - iteration = latest_iteration = int(f.read().strip()) - else: - match = re.findall('iter_([\d]{7})', checkpoint_path) - if match: - iteration = int(match[0]) - - # Checkpoint directory for this iteration - logger.info(f"Loaded checkpoint from iteration {iteration}") - if latest_iteration: - checkpoint_path = os.path.join(checkpoint_path, f'iter_{iteration:07d}') - if not os.path.exists(checkpoint_path): - raise ValueError(f"Checkpoint directory not found: {checkpoint_path}") - - return checkpoint_path - - -def get_weights_from_state_dict(state_dict): - weights = {} - if 'model' in state_dict: - model_weights = state_dict['model'] - vpp_stage = 0 - - for key, value in _get_parameter(model_weights): - key = _map_to_mcore_local_names(key) - weights[f"{key}{Const.SCOPE_SEPARATOR}{vpp_stage}"] = value - - elif 'model0' in state_dict: - #vpp enabled - vpp_size = 0 - while f'model{vpp_size}' in state_dict: - model_weights = state_dict[f'model{vpp_stage}'] - for key, value in _get_parameter(model_weights): - key = _map_to_mcore_local_names(key) - weights[f"{key}{Const.SCOPE_SEPARATOR}{vpp_stage}"] = value - vpp_size += 1 - return 
weights - - -def load_megatron_weights(checkpoint_path: str) -> Dict: - """Load Megatron parallel checkpoint weights into a single dictionary. - - Args: - checkpoint_path (str): Base checkpoint directory path - - Returns: - combined_weights: Dict with weights from all ranks, keys include rank info - """ - # Find latest iteration if not specified - checkpoint_path = parse_iteration(checkpoint_path) - - # Parse parallel sizes from checkpoint directory structure - tp_size, pp_size, exp_size, num_experts = parse_parallel_size(checkpoint_path) - combined_weights = {} - - # Load checkpoints from all ranks - for exp_rank in range(exp_size): - num_layers_per_pipeline_stage = 0 - for pp_rank in range(pp_size): - tp_partition = defaultdict(list) - for tp_rank in range(tp_size): - # Construct checkpoint path based on parallel ranks - if pp_size > 1: - rank_dir = f'mp_rank_{tp_rank:02d}_{pp_rank:03d}' - else: - rank_dir = f'mp_rank_{tp_rank:02d}' - - if exp_size > 1: - rank_dir = f'{rank_dir}_{exp_rank:03d}' - - ckpt_file = os.path.join(checkpoint_path, rank_dir, 'model_optim_rng.pt') - try: - state_dict = load_pt(ckpt_file, to_cpu=True, weights_only=False) - partition = get_weights_from_state_dict(state_dict) - for key, weight in partition.items(): - tp_partition[key].append(weight) - - except Exception as load_error: - logger.warning(f"Error loading {ckpt_file}: {load_error}") - - if not tp_partition: - raise ValueError('No state loaded.') - - if not num_layers_per_pipeline_stage: - num_layers_per_pipeline_stage = _parse_num_layers_per_stage(tp_partition) - - consolidated_weight = _consolidate_tp_weights(tp_partition) - for key, value in consolidated_weight.items(): - key = _parse_real_layer_idx(key, num_layers_per_pipeline_stage, pp_size, pp_rank) - if num_experts: - key = _parse_real_expert_idx(key, num_experts // exp_size, exp_rank) - combined_weights[key] = value - - logger.info(f"Found {len(combined_weights)} total parameters across all ranks") - - return 
combined_weights diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/metrics.py b/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/metrics.py deleted file mode 100644 index 65b5feb659f2fc515d5f2f57faf107d65937d16c..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/metrics.py +++ /dev/null @@ -1,95 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import torch -from torch.nn import functional as F - -from msprobe.pytorch.common.log import logger - -MAX_SLICE = 1000000 - - -def in_different_shape(a, b): - if a.shape != b.shape: - logger.warning(f"a, b are in different shape. a: {a.shape}, b: {b.shape}") - return True - return False - - -def l2_distance(a, b): - if a is None or b is None: - return None - if in_different_shape(a, b): - return None - return (a - b).square().sum().sqrt().item() - - -def cos_sim(a, b, eps=1e-8): - if a is None or b is None: - return None - if a.dtype not in [torch.float64, torch.float32, torch.float16, torch.bfloat16]: - return None - - if in_different_shape(a, b): - return None - if a.dim() > 0: - a = a.flatten().squeeze() - b = b.flatten().squeeze() - - num_element = a.numel() - if num_element > MAX_SLICE: - logger.info(f'num parameters: {num_element}. 
Calculate cos by chunks') - n_batch = num_element // MAX_SLICE + 1 - sim = 0 - total_norm_a = eps - total_norm_b = eps - for i in range(n_batch): - slice_a = a[i * MAX_SLICE: min((i + 1) * MAX_SLICE, num_element)] - slice_b = b[i * MAX_SLICE: min((i + 1) * MAX_SLICE, num_element)] - slice_sim = (slice_a * slice_b).sum().item() - total_norm_a += (slice_a ** 2).sum().item() - total_norm_b += (slice_a ** 2).sum().item() - sim += slice_sim - sim = sim / total_norm_a ** 0.5 / total_norm_b ** 0.5 - - else: - sim = F.cosine_similarity(a, b, dim=0, eps=eps).item() - - return sim - - -def numel(a, b): - n1 = a.numel() - n2 = b.numel() - if n1 != n2: - logger.warning('parameters have different number of element') - return (n1, n2) - return n1 - - -def shape(a, b): - s1 = a.shape - s2 = b.shape - if in_different_shape(a, b): - return [list(s1), list(s2)] - return list(s1) - - -METRIC_FUNC = { - 'l2': l2_distance, - 'cos': cos_sim, - 'numel': numel, - 'shape': shape - } \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/name_mapping.yaml b/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/name_mapping.yaml deleted file mode 100644 index 0caecc53a73b108939435867fe1b6e614bd91812..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/config_checking/ckpt_compare/name_mapping.yaml +++ /dev/null @@ -1,12 +0,0 @@ -self_attention.linear_qkv.layer_norm_: input_layernorm. -language_model.: '' -encoder: decoder -.input_norm.: .input_layernorm. -query_key_value: linear_qkv -.dense.: .linear_proj. 
-post_attention_norm: pre_mlp_layernorm -dense_h_to_4h: linear_fc1 -dense_4h_to_h: linear_fc2 -mlp.local_experts: mlp.experts.local_experts -final_norm: final_layernorm -word_embeddings_for_head: output_layer diff --git a/debug/accuracy_tools/msprobe/pytorch/debugger/debugger_config.py b/debug/accuracy_tools/msprobe/pytorch/debugger/debugger_config.py index f90c7698163dc799e3eea27e41086010b5436254..cdd13325ddb618a6ce05273ad5e849bda5970e33 100644 --- a/debug/accuracy_tools/msprobe/pytorch/debugger/debugger_config.py +++ b/debug/accuracy_tools/msprobe/pytorch/debugger/debugger_config.py @@ -59,6 +59,7 @@ class DebuggerConfig: if isinstance(task_config.online_run_ut_recompute, bool) else False self.check() + self._check_statistics_config(task_config) if self.level == Const.LEVEL_L2: self.is_backward_kernel_dump = False @@ -77,10 +78,13 @@ class DebuggerConfig: if not isinstance(self.async_dump, bool): raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, f"The parameters async_dump should be bool.") - if self.async_dump and self.task == Const.TENSOR and not self.list: - raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, - f"The parameters async_dump is true in tensor task, the parameters list cannot be " - f"empty.") + if self.async_dump and self.task == Const.TENSOR: + if self.level == Const.LEVEL_DEBUG: + self.list = [] # async_dump + debug level case ignore list + if not self.list and self.level != Const.LEVEL_DEBUG: + raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, + f"The parameters async_dump is true in tensor task, the parameters list cannot be " + f"empty.") if self.task == Const.STRUCTURE and self.level not in [Const.LEVEL_L0, Const.LEVEL_MIX]: logger.warning_on_rank_0( f"When the task is set to structure, the level should be one of {[Const.LEVEL_L0, Const.LEVEL_MIX]}. 
" @@ -129,8 +133,23 @@ class DebuggerConfig: if not self.list or len(self.list) != 1: raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, f"When level is set to L2, the list must be configured as a list with one api name.") + if self.task != Const.TENSOR: + raise MsprobeException(MsprobeException.INVALID_PARAM_ERROR, + f"When level is set to L2, the task must be set to tensor.") + api_name = self.list[0] if api_name.endswith(Const.BACKWARD): self.is_backward_kernel_dump = True api_forward_name = api_name[:-len(Const.BACKWARD)] + Const.FORWARD self.list.append(api_forward_name) + + def _check_statistics_config(self, task_config): + if self.task != Const.STATISTICS: + return + self.tensor_list = [] + if not hasattr(task_config, "tensor_list"): + return + if self.level == Const.LEVEL_DEBUG and task_config.tensor_list: + logger.warning_on_rank_0("When level is set to debug, the tensor_list will be invalid.") + return + self.tensor_list = task_config.tensor_list diff --git a/debug/accuracy_tools/msprobe/pytorch/debugger/precision_debugger.py b/debug/accuracy_tools/msprobe/pytorch/debugger/precision_debugger.py index 98051f4eda1e8ece6060112f4fd72d22953390b3..ed2a712d38f2007b387753b47b6ebab4a77ab3ca 100644 --- a/debug/accuracy_tools/msprobe/pytorch/debugger/precision_debugger.py +++ b/debug/accuracy_tools/msprobe/pytorch/debugger/precision_debugger.py @@ -12,7 +12,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
- +import functools from collections import namedtuple from torch.utils.data import dataloader @@ -76,6 +76,7 @@ class PrecisionDebugger: self.service = Service(self.config) self.module_dumper = ModuleDumper(self.service) self.enable_dataloader = self.config.enable_dataloader + self.ori_customer_func = {} if self.enable_dataloader: logger.warning_on_rank_0("The enable_dataloader feature will be deprecated in the future.") dataloader._BaseDataLoaderIter.__next__ = iter_tracer(dataloader._BaseDataLoaderIter.__next__) @@ -171,7 +172,7 @@ class PrecisionDebugger: except ValueError: return instance.service.save(variable, name, save_backward) - + @classmethod def set_init_step(cls, step): instance = cls._instance @@ -181,6 +182,31 @@ class PrecisionDebugger: instance.service.init_step = step instance.service.loop = 0 + @classmethod + def register_custom_api(cls, module, api, api_prefix=None): + if not api_prefix: + api_prefix = getattr(module, "__name__", "Custom") + if not isinstance(api_prefix, str): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, "api_prefix must be string") + if not hasattr(module, api): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, f"module {str(module)} does not have {api}") + instance = cls._instance + if not instance: + raise Exception(MsgConst.NOT_CREATED_INSTANCE) + instance.service.register_custom_api(module, api, api_prefix) + + @classmethod + def restore_custom_api(cls, module, api): + if not hasattr(module, api): + raise MsprobeException( + MsprobeException.INVALID_PARAM_ERROR, f"module {str(module)} does not have {api}") + instance = cls._instance + if not instance: + raise Exception(MsgConst.NOT_CREATED_INSTANCE) + instance.service.restore_custom_api(module, api) + def module_dump(module, dump_name): if not is_torch_nn_module(module): diff --git a/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_dump.py b/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_dump.py index 
cc78962f401a9e4f46d5794d7ca074f2e37f45e0..5bf26f7ac0d91cce630a3b9c8e648453ae4ab65c 100644 --- a/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_dump.py +++ b/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_dump.py @@ -13,75 +13,28 @@ # See the License for the specific language governing permissions and # limitations under the License. -import torch -from msprobe.core.common.const import Const -from msprobe.core.data_dump.scope import BaseScope from msprobe.pytorch.common.log import logger +from msprobe.pytorch.dump.module_dump.module_processer import ModuleProcesser from msprobe.pytorch.hook_module.api_register import get_api_register -torch_version_above_or_equal_2 = torch.__version__.split('+')[0] >= '2.0' - class ModuleDumper: def __init__(self, service): self.service = service - self.hook_handle_list = [] self.api_register = get_api_register() def start_module_dump(self, module, dump_name): + if hasattr(module, 'msprobe_hook') and not hasattr(module, 'msprobe_module_dump'): + logger.info_on_rank_0("The init dump is enabled, and the module dump function will not be available.") + return + + ModuleProcesser.enable_module_dump = True self.api_register.restore_all_api() - self.register_hook(module, dump_name) + if not hasattr(module, 'msprobe_module_dump'): + self.service.module_processor.register_module_hook(module, self.service.build_hook, + recursive=False, module_names=[dump_name]) + setattr(module, 'msprobe_module_dump', True) def stop_module_dump(self): + ModuleProcesser.enable_module_dump = False self.api_register.register_all_api() - for hook_handle in self.hook_handle_list: - if isinstance(hook_handle, torch.utils.hooks.RemovableHandle): - hook_handle.remove() - self.hook_handle_list.clear() - - def register_hook(self, module, dump_name): - prefix_name = ( - BaseScope.Module_Type_Module + Const.SEP + - dump_name + Const.SEP + - module.__class__.__name__ + Const.SEP - ) - module_processor = self.service.module_processor - _, 
forward_hook, backward_hook, forward_hook_torch_version_below_2 = self.service.build_hook( - BaseScope.Module_Type_Module, - prefix_name - ) - - if module_processor.has_register_backward_hook(module): - logger.warning( - f"The {dump_name} module has registered deprecated register_backward_hook," - f"which may cause abnormal data dump. The backward data dump for this module will be skipped." - ) - if torch_version_above_or_equal_2: - forward_hook_handle = module.register_forward_hook(forward_hook, with_kwargs=True) - else: - if not module_processor.has_register_backward_hook(module): - backward_hook_handle = module.register_full_backward_hook( - module_processor.node_hook(prefix_name + Const.BACKWARD, Const.STOP) - ) - self.hook_handle_list.append(backward_hook_handle) - forward_hook_handle = module.register_forward_hook(forward_hook_torch_version_below_2) - self.hook_handle_list.append(forward_hook_handle) - if not module_processor.has_register_backward_hook(module): - backward_hook_handle = module.register_full_backward_hook(backward_hook) - self.hook_handle_list.append(backward_hook_handle) - - forward_pre_hook_handle = module.register_forward_pre_hook( - module_processor.node_hook(prefix_name + Const.FORWARD, Const.START) - ) - forward_hook_handle = module.register_forward_hook( - module_processor.node_hook(prefix_name + Const.FORWARD, Const.STOP) - ) - self.hook_handle_list.extend([forward_pre_hook_handle, forward_hook_handle]) - if torch_version_above_or_equal_2 and not module_processor.has_register_backward_hook(module): - backward_pre_hook_handle = module.register_full_backward_pre_hook( - module_processor.node_hook(prefix_name + Const.BACKWARD, Const.START) - ) - backward_hook_handle = module.register_full_backward_hook( - module_processor.node_hook(prefix_name + Const.BACKWARD, Const.STOP) - ) - self.hook_handle_list.extend([backward_pre_hook_handle, backward_hook_handle]) diff --git 
a/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_processer.py b/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_processer.py index 6fb1d45064a42c353ed01cbcc47c34423574df04..01ea4ee664a8b43d008a6b04cbf4bbd22e125a2e 100644 --- a/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_processer.py +++ b/debug/accuracy_tools/msprobe/pytorch/dump/module_dump/module_processer.py @@ -13,12 +13,15 @@ # See the License for the specific language governing permissions and # limitations under the License. +from collections import OrderedDict + import torch +from torch.utils.hooks import BackwardHook, RemovableHandle from msprobe.core.common.const import Const from msprobe.core.data_dump.scope import BaseScope, ModuleRangeScope, MixRangeScope from msprobe.pytorch.common.log import logger -from msprobe.pytorch.common.utils import replace_last_occurrence, is_torch_nn_module +from msprobe.pytorch.common.utils import is_torch_nn_module, register_forward_pre_hook from msprobe.pytorch.dump.module_dump.hook_wrapper import wrap_setup_input_output_hook torch_version_above_or_equal_2 = torch.__version__.split('+')[0] >= '2.0' @@ -51,6 +54,9 @@ class ModuleProcesser: module_stack = [] api_parent_node = "" module_node = {} + module_bw_hook_kernels = {} + module_with_backward_hook = {} + enable_module_dump = False def __init__(self, scope): self.scope = scope if isinstance(scope, (ModuleRangeScope, MixRangeScope)) else None @@ -66,7 +72,7 @@ class ModuleProcesser: logger.info_on_rank_0(f"Patch megatron method failed, detail:{str(e)}") @staticmethod - def module_count_func(module_name): + def set_and_get_calls_number(module_name): if module_name not in ModuleProcesser.module_count: ModuleProcesser.module_count[module_name] = 0 else: @@ -80,13 +86,19 @@ class ModuleProcesser: module._is_full_backward_hook is False @staticmethod - def get_modules_and_names(models): + def get_modules_and_names(models, recursive, module_names): modules_and_names_with_index = {} if 
isinstance(models, (list, tuple)): + if not recursive and len(module_names) != len(models): + return modules_and_names_with_index for index, model in enumerate(models): - modules_and_names_with_index[str(index)] = model.named_modules() + modules_and_names_with_index[str(index)] = model.named_modules() if recursive else \ + [(module_names[index], model)] else: - modules_and_names_with_index["-1"] = models.named_modules() + if not recursive and len(module_names) != 1: + return modules_and_names_with_index + modules_and_names_with_index["-1"] = models.named_modules() if recursive else \ + [(module_names[0], models)] return modules_and_names_with_index @classmethod @@ -95,14 +107,18 @@ class ModuleProcesser: cls.module_stack = [] cls.api_parent_node = "" cls.module_node = {} + cls.module_bw_hook_kernels = {} + cls.enable_module_dump = False + + def register_module_hook(self, models, build_hook, recursive=True, module_names=None): + if module_names is None: + module_names = [] - def register_module_hook(self, models, build_hook): - logger.info_on_rank_0("The init dump is enabled, and the module dump function will not be available.") - modules_and_names_with_index = self.get_modules_and_names(models) + modules_and_names_with_index = self.get_modules_and_names(models, recursive, module_names) for index, modules_and_names in modules_and_names_with_index.items(): model = models if index == "-1" else models[int(index)] for name, module in modules_and_names: - if module == model: + if recursive and module == model: continue if not is_torch_nn_module(module): logger.warning( @@ -112,96 +128,113 @@ class ModuleProcesser: continue if module.__class__.__name__ == "FullyShardedDataParallel": continue + setattr(module, 'msprobe_hook', True) module_index = (index + Const.SEP) if index != "-1" else "" - prefix_name = (BaseScope.Module_Type_Module + Const.SEP + module_index + - name + Const.SEP + module.__class__.__name__ + Const.SEP) - pre_forward_hook, forward_hook, backward_hook, 
forward_hook_torch_version_below_2 = build_hook( - BaseScope.Module_Type_Module, - prefix_name - ) + prefix_name = f'{BaseScope.Module_Type_Module}{Const.SEP}{module_index}{name}{Const.SEP}' + \ + f'{module.__class__.__name__}{Const.SEP}' + + forward_pre_hook = self.build_module_hook(prefix_name, build_hook) if self.has_register_backward_hook(module): logger.warning( f"The {prefix_name[:-1]} has registered deprecated register_backward_hook," f"which may cause abnormal data dump. The backward data dump for this module will be skipped." ) + ModuleProcesser.module_with_backward_hook[prefix_name] = True + register_forward_pre_hook(module, forward_pre_hook) + + def build_module_hook(self, module_name, build_data_hook): + def forward_pre_hook(module, args, kwargs=None): + if kwargs is None: + kwargs = {} + + if hasattr(module, 'msprobe_module_dump') and not self.enable_module_dump: + return (args, kwargs) if torch_version_above_or_equal_2 else args + + index = ModuleProcesser.set_and_get_calls_number(module_name) + full_forward_name = f'{module_name}{Const.FORWARD}{Const.SEP}{index}' + full_backward_name = f'{module_name}{Const.BACKWARD}{Const.SEP}{index}' + + self.set_construct_info_in_pre_hook(full_forward_name) + + if not hasattr(module, 'msprobe_forward_hook'): + forward_hooks_dict = getattr(module, '_forward_hooks', OrderedDict()) + handle = RemovableHandle(forward_hooks_dict) + forward_hooks_dict[handle.id] = forward_hook + forward_hooks_dict.move_to_end(handle.id, last=False) if torch_version_above_or_equal_2: - module.register_forward_hook(forward_hook, with_kwargs=True) + forward_hooks_with_kwargs_dict = getattr(module, '_forward_hooks_with_kwargs', OrderedDict()) + forward_hooks_with_kwargs_dict[handle.id] = True + + setattr(module, 'msprobe_forward_hook', True) + + _, _, backward_data_hook = build_data_hook(BaseScope.Module_Type_Module, full_forward_name) + + def get_backward_pre_hook(full_backward_name): + def backward_pre_hook_fn(module, grad_output): + 
self.set_construct_info_in_pre_hook(full_backward_name) + return backward_pre_hook_fn + + def get_backward_hook(backward_data_hook, full_backward_name): + def backward_hook_fn(module, grad_input, grad_output): + new_output = backward_data_hook(module, grad_input, grad_output) + self.set_construct_info_in_hook(full_backward_name, is_forward=False) + return new_output + return backward_hook_fn + + if not ModuleProcesser.module_with_backward_hook.get(module_name): + backward_pre_hook = get_backward_pre_hook(full_backward_name) + backward_hook = get_backward_hook(backward_data_hook, full_backward_name) + if torch_version_above_or_equal_2: + bw_hook = BackwardHook(module, [backward_hook], [backward_pre_hook]) else: - if not self.has_register_backward_hook(module): - module.register_full_backward_hook(self.node_hook(prefix_name + Const.BACKWARD, Const.STOP)) - module.register_forward_hook(forward_hook_torch_version_below_2) - if not self.has_register_backward_hook(module): - module.register_full_backward_hook(backward_hook) - - module.register_forward_pre_hook(self.node_hook(prefix_name + Const.FORWARD, Const.START)) - module.register_forward_hook(self.node_hook(prefix_name + Const.FORWARD, Const.STOP)) - if torch_version_above_or_equal_2 and not self.has_register_backward_hook(module): - module.register_full_backward_pre_hook(self.node_hook(prefix_name + Const.BACKWARD, Const.START)) - module.register_full_backward_hook(self.node_hook(prefix_name + Const.BACKWARD, Const.STOP)) - - def node_hook(self, name_prefix, start_or_stop, **kwargs): - - def pre_hook(module, input, output=None): - try: - index = ModuleProcesser.module_count_func(name_prefix) - except IndexError as e: - index = None - pass - full_name = name_prefix + Const.SEP + str(index) - if not hasattr(module, "mindstudio_reserved_name") or not module.mindstudio_reserved_name: - module.mindstudio_reserved_name = [] - module.mindstudio_reserved_name.append(full_name) - if self.module_stack: - 
ModuleProcesser.module_node[full_name] = self.module_stack[-1] + bw_hook = BackwardHook(module, [backward_hook]) + ModuleProcesser.module_bw_hook_kernels[full_forward_name] = bw_hook + args = bw_hook.setup_input_hook(args) + return (args, kwargs) if torch_version_above_or_equal_2 else args + + def forward_hook(module, args, kwargs_or_output, output_or_kwargs=None): + if hasattr(module, 'msprobe_module_dump') and not self.enable_module_dump: + return output_or_kwargs if torch_version_above_or_equal_2 else kwargs_or_output + + index = ModuleProcesser.module_count.get(module_name) + full_name = f'{module_name}{Const.FORWARD}{Const.SEP}{index}' + + _, forward_data_hook, _ = build_data_hook(BaseScope.Module_Type_Module, full_name) + hook_result = forward_data_hook(module, args, kwargs_or_output, output_or_kwargs) + self.set_construct_info_in_hook(full_name) + + if hook_result is not None: + result = hook_result else: - ModuleProcesser.module_node[full_name] = None + result = output_or_kwargs if torch_version_above_or_equal_2 else kwargs_or_output - ModuleProcesser.module_stack.append(full_name) - if self.module_stack: - ModuleProcesser.api_parent_node = self.module_stack[-1] - if self.scope: - self.scope.begin_module(full_name) + bw_hook = ModuleProcesser.module_bw_hook_kernels.get(full_name) + if bw_hook: + result = bw_hook.setup_output_hook(result) + + return result + + return forward_pre_hook - def end_hook(module, input, output=None): + def set_construct_info_in_pre_hook(self, full_name): + if self.module_stack: + ModuleProcesser.module_node[full_name] = self.module_stack[-1] + else: + ModuleProcesser.module_node[full_name] = None + ModuleProcesser.module_stack.append(full_name) + ModuleProcesser.api_parent_node = full_name + if self.scope: + self.scope.begin_module(full_name) + + def set_construct_info_in_hook(self, full_name, is_forward=True): + if torch_version_above_or_equal_2 or is_forward: if self.module_stack: ModuleProcesser.module_stack.pop() - if 
self.module_stack: - ModuleProcesser.api_parent_node = self.module_stack[-1] - else: - ModuleProcesser.api_parent_node = None - if not hasattr(module, "mindstudio_reserved_name") or not module.mindstudio_reserved_name: - raise RuntimeError(f"module reserve name is None when pop") - current_name = module.mindstudio_reserved_name.pop() + ModuleProcesser.api_parent_node = ModuleProcesser.module_stack[-1] if self.module_stack else None if self.scope: - self.scope.end_module(current_name) - - def backward_hook(module, input, output=None): - try: - index = ModuleProcesser.module_count_func(name_prefix) - except IndexError as e: - index = None - pass - full_name = name_prefix + Const.SEP + str(index) - if not hasattr(module, "mindstudio_reserved_name") or not module.mindstudio_reserved_name: - module.mindstudio_reserved_name = [] - module.mindstudio_reserved_name.append(full_name) - forward_full_name = replace_last_occurrence(full_name, Const.BACKWARD, Const.FORWARD) - ModuleProcesser.module_node[full_name] = replace_last_occurrence( - ModuleProcesser.module_node.get(forward_full_name), Const.FORWARD, Const.BACKWARD) - ModuleProcesser.api_parent_node = None + self.scope.end_module(full_name) + else: if self.scope: self.scope.begin_module(full_name) - - if torch_version_above_or_equal_2: - if Const.START in start_or_stop: - return pre_hook - else: - return end_hook - else: - if Const.FORWARD in name_prefix and Const.START in start_or_stop: - return pre_hook - elif Const.BACKWARD in name_prefix: - return backward_hook - else: - return end_hook + ModuleProcesser.api_parent_node = full_name diff --git a/debug/accuracy_tools/msprobe/pytorch/hook_module/api_register.py b/debug/accuracy_tools/msprobe/pytorch/hook_module/api_register.py index f8da9453e8317f942f65a9366fb93da898103625..7ef5622641d42b507aed95631d248418b92049a2 100644 --- a/debug/accuracy_tools/msprobe/pytorch/hook_module/api_register.py +++ b/debug/accuracy_tools/msprobe/pytorch/hook_module/api_register.py @@ -15,18 
+15,21 @@ import functools import os +import inspect import torch import torch.distributed as dist from msprobe.core.common.const import Const from msprobe.core.data_dump.api_registry import ApiRegistry +from msprobe.pytorch.common.log import logger from msprobe.pytorch.common.utils import ( torch_without_guard_version, is_gpu, torch_device_guard, parameter_adapter ) from msprobe.pytorch.function_factory import npu_custom_functions from msprobe.pytorch.hook_module.hook_module import HOOKModule from msprobe.pytorch.hook_module.utils import dynamic_import_op +from msprobe.core.common.file_utils import load_yaml try: import mindspeed.ops @@ -38,6 +41,10 @@ else: torch_version_above_2 = torch.__version__.split('+')[0] > '2.0' +_inner_used_api = {} +_supported_api_list_path = (os.path.join(os.path.dirname(os.path.realpath(__file__)), Const.SUPPORT_API_FILE_NAME),) +_cuda_func_mapping = {"npu_fusion_attention": "gpu_fusion_attention"} + _api_types = { Const.PT_FRAMEWORK: { Const.PT_API_TYPE_FUNCTIONAL: (torch.nn.functional, (torch.nn.functional,)), @@ -67,11 +74,9 @@ if not is_gpu: ) if mindspeed_enable: _api_types.get(Const.PT_FRAMEWORK).update({Const.PT_API_TYPE_MINDSPEED: (mindspeed.ops, (mindspeed.ops,))}) - dynamic_import_op(mindspeed.ops) - -_inner_used_api = {} -_supported_api_list_path = (os.path.join(os.path.dirname(os.path.realpath(__file__)), Const.SUPPORT_API_FILE_NAME),) -_cuda_func_mapping = {"npu_fusion_attention": "gpu_fusion_attention"} + mindspeed_op_list = load_yaml(_supported_api_list_path[0]).get(Const.PT_API_TYPE_MINDSPEED) + mindspeed_op_file_list = [op.split(Const.SEP)[0] + Const.PY_SUFFIX for op in mindspeed_op_list] + dynamic_import_op(mindspeed.ops, mindspeed_op_file_list) @parameter_adapter @@ -81,7 +86,15 @@ def tensor_module_forward(module, *args, **kwargs): def dist_module_forward(module, *args, **kwargs): handle = module.api_func(*args, **kwargs) - if kwargs.get("async_op") or module.api_name in ["isend", "irecv"]: + try: + bound = 
inspect.signature(module.api_func).bind(*args, **kwargs) + bound.apply_defaults() + use_async_op_flag = bound.arguments.get("async_op", False) + except Exception as e: + use_async_op_flag = False + logger.warning(f"fail to get dist api's func signature because {e}, no wait") + + if use_async_op_flag or module.api_name in ["isend", "irecv"]: if handle and hasattr(handle, 'wait'): handle.wait() if module.api_name == "batch_isend_irecv": diff --git a/debug/accuracy_tools/msprobe/pytorch/hook_module/hook_module.py b/debug/accuracy_tools/msprobe/pytorch/hook_module/hook_module.py index dccf9c7a9221990eb5ec3829544368ede1297b2c..f8c1d2d6f557f2f90ff348db6f230364db73161c 100644 --- a/debug/accuracy_tools/msprobe/pytorch/hook_module/hook_module.py +++ b/debug/accuracy_tools/msprobe/pytorch/hook_module/hook_module.py @@ -21,9 +21,7 @@ import torch import torch.nn as nn import torch.utils.hooks as full_hooks -from msprobe.pytorch.common.utils import is_float8_tensor - -torch_version_above_or_equal_2 = torch.__version__.split('+')[0] >= '2.0' +from msprobe.pytorch.common.utils import is_float8_tensor, register_forward_pre_hook, register_forward_hook class HOOKModule(nn.Module): @@ -43,13 +41,9 @@ class HOOKModule(nn.Module): prefix = self.prefix_api_name if hasattr(self, "prefix_api_name") else "" if callable(hook_build_func): - forward_pre_hook, forward_hook, backward_hook, _ = hook_build_func(prefix) - if torch_version_above_or_equal_2: - self.register_forward_pre_hook(forward_pre_hook, with_kwargs=True) - self.register_forward_hook(forward_hook, with_kwargs=True) - else: - self.register_forward_pre_hook(forward_pre_hook) - self.register_forward_hook(forward_hook) + forward_pre_hook, forward_hook, backward_hook = hook_build_func(prefix) + register_forward_pre_hook(self, forward_pre_hook) + register_forward_hook(self, forward_hook) self.register_backward_hook(backward_hook) def __call__(self, *args, **kwargs): @@ -79,13 +73,7 @@ class HOOKModule(nn.Module): if
len(self._backward_hooks) > 0: full_backward_hooks, non_full_backward_hooks = self._get_backward_hooks() for hook in self._forward_pre_hooks.values(): - result_args, result_kwargs = hook(self, args, kwargs) - if result_args is not None: - if not isinstance(result_args, tuple): - result_args = (result_args,) - args = result_args - if result_kwargs is not None: - kwargs = result_kwargs + hook(self, args, kwargs) bw_hook = None if len(full_backward_hooks) > 0: bw_hook = full_hooks.BackwardHook(self, full_backward_hooks) diff --git a/debug/accuracy_tools/msprobe/pytorch/hook_module/support_wrap_ops.yaml b/debug/accuracy_tools/msprobe/pytorch/hook_module/support_wrap_ops.yaml index 14bf0929557665b2673da10d39b10a464bfe9b23..f2d5d22ade2c52057b969a93b73e0897e5d64ae3 100644 --- a/debug/accuracy_tools/msprobe/pytorch/hook_module/support_wrap_ops.yaml +++ b/debug/accuracy_tools/msprobe/pytorch/hook_module/support_wrap_ops.yaml @@ -149,6 +149,7 @@ tensor: - __bool__ - __div__ - __eq__ + - __floordiv__ - __ge__ - __gt__ - __iadd__ @@ -159,23 +160,33 @@ tensor: - __imod__ - __imul__ - __ior__ + - __ipow__ - __irshift__ - __isub__ - __ixor__ + - __le__ - __lshift__ + - __lt__ - __matmul__ - __mod__ - __mul__ + - __ne__ - __nonzero__ - __or__ + - __pow__ - __radd__ + - __rdiv__ + - __rmod__ - __rmul__ + - __ror__ + - __rpow__ - __rshift__ + - __rsub__ + - __rxor__ - __setitem__ - __sub__ - __truediv__ - __xor__ - - __pow__ - abs - abs_ - absolute @@ -198,12 +209,14 @@ tensor: - addmv_ - addr - addr_ + - adjoint - align_as - align_to - all - allclose - amax - amin + - aminmax - angle - any - arccos @@ -215,12 +228,15 @@ tensor: - arcsinh - arcsinh_ - arctan + - arctan2 + - arctan2_ - arctan_ - arctanh - arctanh_ - argmax - argmin - argsort + - argwhere - asin - asin_ - asinh @@ -235,39 +251,51 @@ tensor: - baddbmm_ - bernoulli - bernoulli_ + - bfloat16 - bincount - bitwise_and - bitwise_and_ + - bitwise_left_shift + - bitwise_left_shift_ - bitwise_not - bitwise_not_ - bitwise_or - 
bitwise_or_ + - bitwise_right_shift + - bitwise_right_shift_ - bitwise_xor - bitwise_xor_ - bmm + - bool - broadcast_to + - byte - cauchy_ - ceil - ceil_ + - cfloat + - char - cholesky + - cholesky_inverse + - cholesky_solve - chunk - clamp - - cholesky_solve - - cholesky_inverse - clamp_ - clamp_max - clamp_max_ - - clip - clamp_min - clamp_min_ + - clip - clip_ + - conj_physical - copysign - copysign_ + - corrcoef - cos - cos_ - cosh - cosh_ - count_nonzero + - cov - cummax - cummin - cumprod @@ -281,20 +309,23 @@ tensor: - diag_embed - diagflat - diagonal + - diagonal_scatter - diff - - dist - digamma - digamma_ + - dist - div - div_ - divide - divide_ - dot + - double + - dsplit - eig - eq - eq_ - - erf - equal + - erf - erf_ - erfc - erfc_ @@ -303,18 +334,21 @@ tensor: - exp - exp2 - exp2_ - - expm1 - exp_ + - expand + - expand_as + - expm1 - expm1_ - exponential_ - fill_ - - fix - fill_diagonal_ + - fix - fix_ + - flatten - flip - fliplr - - flatten - flipud + - float - float_power - float_power_ - floor @@ -327,6 +361,7 @@ tensor: - fmod_ - frac - frac_ + - frexp - gather - gcd - gcd_ @@ -337,31 +372,37 @@ tensor: - ger - greater - greater_ - - gt - - gt_ - greater_equal - greater_equal_ + - gt + - gt_ + - half - hardshrink - heaviside - heaviside_ - histc + - histogram + - hsplit - hypot - hypot_ + - i0 + - i0_ - igamma - igamma_ - igammac - igammac_ - index_add - index_add_ - - inverse - index_copy - index_copy_ - index_fill - index_fill_ - index_put - index_put_ - - inner - index_select + - inner + - int + - inverse - isclose - isfinite - isinf @@ -379,7 +420,6 @@ tensor: - le_ - lerp - lerp_ - - where - less - less_ - less_equal @@ -396,43 +436,47 @@ tensor: - log_ - log_normal_ - log_softmax - - logcumsumexp - - logdet - logaddexp - logaddexp2 + - logcumsumexp + - logdet - logical_and - logical_and_ - logical_not - - logit - logical_not_ - logical_or - logical_or_ - logical_xor - logical_xor_ + - logit - logit_ - logsumexp + - long - lstsq - lt - lt_ + 
- lu - lu_solve - map2_ - map_ - masked_fill - - matmul - masked_fill_ - masked_scatter - masked_scatter_ - masked_select + - matmul - matrix_exp + - matrix_power - max - maximum - mean - - matrix_power - median - min - minimum - mm - mode + - moveaxis + - movedim - msort - mul - mul_ @@ -442,6 +486,11 @@ tensor: - mv - mvlgamma - mvlgamma_ + - nan_to_num + - nan_to_num_ + - nanmean + - nanmedian + - nanquantile - nansum - narrow - narrow_copy @@ -451,20 +500,29 @@ tensor: - neg_ - negative - negative_ + - nextafter + - nextafter_ - nonzero - norm - normal_ - not_equal - not_equal_ + - numpy + - orgqr + - ormqr + - outer - permute - pinverse - polygamma + - polygamma_ - pow - pow_ - - polygamma_ - prelu - prod - put_ + - q_zero_point + - qr + - quantile - rad2deg - rad2deg_ - ravel @@ -473,15 +531,16 @@ tensor: - relu - relu_ - remainder - - repeat_interleave - - reshape - remainder_ - renorm - renorm_ - repeat + - repeat_interleave + - reshape - reshape_as - resize_ - resize_as_ + - resolve_neg - roll - rot90 - round @@ -495,6 +554,7 @@ tensor: - select - sgn - sgn_ + - short - sigmoid - sigmoid_ - sign @@ -506,11 +566,13 @@ tensor: - sinc_ - sinh - sinh_ + - slice_scatter - slogdet - smm - softmax - solve - sort + - split - split_with_sizes - sqrt - sqrt_ @@ -520,21 +582,29 @@ tensor: - squeeze_ - sspaddmm - std + - stft + - stride - sub - sub_ + - subtract - sum - sum_to_size - svd + - swapaxes + - swapdims + - swapdims_ - symeig - t - t_ - take + - take_along_dim - tan - tan_ - tanh - tanh_ - tensor_split - tile + - to - topk - transpose - transpose_ @@ -542,8 +612,8 @@ tensor: - tril - tril_ - triu - - true_divide - triu_ + - true_divide - true_divide_ - trunc - trunc_ @@ -551,37 +621,20 @@ tensor: - unbind - unflatten - unfold + - unique + - unique_consecutive - unsafe_chunk - - unsqueeze - unsafe_split - unsafe_split_with_sizes + - unsqueeze + - unsqueeze_ - var - vdot - - unsqueeze_ - view_as + - vsplit + - where - xlogy - xlogy_ - - split - - stft - - 
nan_to_num - - dsplit - - orgqr - - bitwise_left_shift_ - - arctan2 - - histogram - - q_zero_point - - adjoint - - ormqr - - bitwise_right_shift_ - - nanquantile - - lu - - quantile - - arctan2_ - - qr - - diagonal_scatter - - corrcoef - - vsplit - - aminmax torch: - linalg.norm @@ -641,13 +694,14 @@ torch: - addmv - addmv_ - addr - - amax - affine_grid_generator - align_tensors - all - alpha_dropout - - amin - alpha_dropout_ + - amax + - amin + - aminmax - angle - any - arange @@ -660,12 +714,14 @@ torch: - arcsinh - arcsinh_ - arctan + - arctan2 - arctan_ - arctanh - arctanh_ - argmax - argmin - argsort + - argwhere - asin - asin_ - asinh @@ -686,13 +742,13 @@ torch: - batch_norm_elemt - batch_norm_gather_stats - batch_norm_gather_stats_with_counts - - bernoulli - batch_norm_stats - batch_norm_update_stats + - bernoulli - bilinear + - binary_cross_entropy_with_logits - bincount - binomial - - binary_cross_entropy_with_logits - bitwise_and - bitwise_not - bitwise_or @@ -738,9 +794,9 @@ torch: - conv_transpose1d - conv_transpose2d - conv_transpose3d - - cos - convolution - copysign + - cos - cos_ - cosh - cosh_ @@ -754,14 +810,16 @@ torch: - cummin - cumprod - cumsum + - cumulative_trapezoid - deg2rad - deg2rad_ - det - diag - diag_embed - - diff - diagflat - diagonal + - diagonal_scatter + - diff - digamma - dist - div @@ -770,12 +828,15 @@ torch: - dropout - dropout_ - dsmm + - dsplit - dstack - eig - einsum - embedding - embedding_bag - embedding_renorm_ + - empty + - empty_like - eq - equal - erf @@ -790,12 +851,12 @@ torch: - expm1 - expm1_ - eye - - feature_dropout - feature_alpha_dropout - feature_alpha_dropout_ + - feature_dropout - feature_dropout_ - - fix - fill_ + - fix - fix_ - flatten - flip @@ -810,8 +871,9 @@ torch: - fmod - frac - frac_ - - full + - frexp - frobenius_norm + - full - full_like - gather - gcd @@ -823,8 +885,8 @@ torch: - greater_equal - grid_sampler - grid_sampler_2d - - group_norm - grid_sampler_3d + - group_norm - gru - gru_cell - 
gt @@ -834,23 +896,29 @@ torch: - heaviside - hinge_embedding_loss - histc + - histogram + - histogramdd - hsmm + - hsplit - hspmm - hstack - hypot + - i0 + - i0_ - igamma - igammac - index_add - index_copy - - inner - index_fill - index_put - index_put_ - index_select + - inner - instance_norm - inverse - isclose - isfinite + - isin - isinf - isnan - isneginf @@ -878,8 +946,8 @@ torch: - log1p_ - log2 - log2_ - - log_softmax - log_ + - log_softmax - logaddexp - logaddexp2 - logcumsumexp @@ -898,18 +966,18 @@ torch: - lt - lu_solve - lu_unpack - - masked_fill - margin_ranking_loss + - masked_fill - masked_scatter - masked_select - - matrix_exp - matmul + - matrix_exp - matrix_power - matrix_rank - max - max_pool1d - - max_pool2d - max_pool1d_with_indices + - max_pool2d - max_pool3d - maximum - mean @@ -928,18 +996,20 @@ torch: - mvlgamma - nan_to_num - nan_to_num_ + - nanmean - nanmedian + - nanquantile - nansum - narrow + - narrow_copy - native_batch_norm - native_group_norm - - narrow_copy - native_layer_norm - native_norm - ne - neg - - negative - neg_ + - negative - negative_ - nextafter - nonzero @@ -971,30 +1041,31 @@ torch: - ravel - real - reciprocal - - relu - reciprocal_ + - relu - relu_ - remainder - renorm - repeat_interleave - reshape - resize_as_ + - resolve_neg - roll - rot90 - round - round_ + - row_stack - rrelu - rrelu_ - rsqrt - - row_stack - rsqrt_ - rsub - saddmm - scalar_tensor - scatter - - select - scatter_add - searchsorted + - select - selu - selu_ - sgn @@ -1014,12 +1085,12 @@ torch: - solve - sort - sparse_coo_tensor - - square - split - split_with_sizes - spmm - sqrt - sqrt_ + - square - square_ - squeeze - sspaddmm @@ -1041,8 +1112,8 @@ torch: - tan_ - tanh - tanh_ - - tensordot - tensor_split + - tensordot - threshold - threshold_ - tile @@ -1058,19 +1129,21 @@ torch: - true_divide - trunc - trunc_ - - unique_consecutive - - xlogy - unbind + - unflatten + - unique_consecutive - unsafe_chunk - unsafe_split - - vander - - var - - vdot - 
unsafe_split_with_sizes - unsqueeze + - vander + - var - var_mean + - vdot + - vsplit - vstack - where + - xlogy - xlogy_ _VF: @@ -1164,6 +1237,27 @@ torch_npu: - npu_moe_finalize_routing - npu_moe_gating_top_k_softmax - npu_trans_quant_param + - npu_gelu + - npu_ffn + - npu_quant_matmul + - npu_format_cast_ + - npu_dynamic_quant + - npu_moe_compute_expert_tokens + - npu_weight_quant_batchmatmul + - npu_dynamic_quant_asymmetric + - npu_grouped_matmul + - npu_quant_scatter_ + - npu_group_quant + - npu_fused_infer_attention_score + - npu_quantize + - npu_fast_gelu + - npu_weight_quant_batchmatmul + - scatter_update + - scatter_update_ + - npu_moe_init_routing + - npu_scatter_nd_update_ + - npu_scatter_nd_update + - npu_prefetch - npu_dynamic_block_quant aten: @@ -1933,8 +2027,6 @@ mindspeed: - npu_ring_attention_update.npu_ring_attention_update - npu_matmul_add.npu_matmul_add_fp32 - npu_groupmatmul_add.npu_groupmatmul_add_fp32 - - npu_all_to_all_all_gather_bmm.npu_all_to_all_all_gather_bmm - - npu_bmm_reduce_scatter_all_to_all.npu_bmm_reduce_scatter_all_to_all - quant_gmm.npu_quant_gmm - quant_gmm.npu_quant_gmm_v2 - npu_apply_fused_ema_adamw.npu_apply_fused_ema_adamw \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/pytorch/hook_module/utils.py b/debug/accuracy_tools/msprobe/pytorch/hook_module/utils.py index 0992caf0a41e2bc203ca9793295e94261ca48b95..eac2bc93eb6e1d78f37a0ba2e01d4f718caff626 100644 --- a/debug/accuracy_tools/msprobe/pytorch/hook_module/utils.py +++ b/debug/accuracy_tools/msprobe/pytorch/hook_module/utils.py @@ -18,7 +18,7 @@ import importlib import inspect from msprobe.core.common.const import Const -from msprobe.core.common.file_utils import load_yaml +from msprobe.core.common.file_utils import load_yaml, check_link from msprobe.core.common.log import logger @@ -33,12 +33,13 @@ def get_ops(): return set(wrap_functional) | set(wrap_tensor) | set(wrap_torch) | set(wrap_npu_ops) -def dynamic_import_op(package): +def 
dynamic_import_op(package, white_list): package_name = package.__name__ ops = {} ops_dir, _ = os.path.split(package.__file__) + check_link(ops_dir) for file_name in os.listdir(ops_dir): - if file_name.endswith(Const.PY_SUFFIX) and file_name != Const.INIT_PY: + if file_name in white_list: sub_module_name = file_name[:-3] module_name = f"{package_name}.{sub_module_name}" try: diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_analyse.py b/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_analyse.py index 9a0b71e8a5791bc216c82737d1d4f4a482abceb9..f1bdaa35ef7dea1471c5d54fbefa513c410126b3 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_analyse.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_analyse.py @@ -21,7 +21,7 @@ import heapq from msprobe.pytorch.common.log import logger from msprobe.core.common.const import MonitorConst -from msprobe.core.common.file_utils import check_path_before_create, save_json, create_directory, remove_path, \ +from msprobe.core.common.file_utils import save_json, create_directory, remove_path, \ check_file_or_directory_path, load_json from msprobe.pytorch.monitor.anomaly_detect import GradAnomalyData @@ -46,12 +46,7 @@ class AnomalyDataWriter: def init_detected_json(self): """Initialize the output file written to disk.""" - check_path_before_create(self.dump_path) - if not os.path.exists(self.dump_path): - create_directory(self.dump_path) - - if not os.path.exists(self.dump_rank_dir): - create_directory(self.dump_rank_dir) + create_directory(self.dump_rank_dir) if os.path.exists(self.json_path): check_file_or_directory_path(self.json_path, isdir=False) diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_detect.py b/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_detect.py index 63f20b1928c80e1e29d7cb8224f267c246fcaa8b..a8fd42b598c8456bb98bb205b602719520d4a6dc 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_detect.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/anomaly_detect.py @@
-14,6 +14,7 @@ # limitations under the License. import itertools import os +import math import statistics as st import sys from abc import ABC @@ -33,7 +34,7 @@ from msprobe.pytorch.common.log import logger class ScanRule(ABC): name = "ScanRule" - def apply(self, history, cur): + def apply(self, cur, history=None): raise NotImplementedError("abstract method apply is not implemented") @@ -43,7 +44,7 @@ class AnomalyTurbulence(ScanRule): def __init__(self, threshold) -> None: self.threshold = threshold - def apply(self, history, cur): + def apply(self, cur, history=None): baseline = st.mean(history) if isinstance(history, list) else history up_bound = baseline + baseline * self.threshold @@ -53,6 +54,16 @@ class AnomalyTurbulence(ScanRule): return cur < up_bound +class AnomalyNan(ScanRule): + name = "AnomalyNan" + + def __init__(self, threshold=None) -> None: + self.threshold = threshold + + def apply(self, cur, history=None): + return math.isnan(cur) or (self.threshold is not None and abs(cur) > self.threshold) + + class AnomalyScanner: @staticmethod @@ -69,7 +80,7 @@ class AnomalyScanner: rule_args = spec.get("args") # Check that the required keys are present - if rule_cls_name is None or rule_args is None: + if rule_cls_name is None or (rule_cls_name == "AnomalyTurbulence" and rule_args is None): logger.warning(f"Spec is missing required keys: {spec}") continue @@ -81,7 +92,7 @@ class AnomalyScanner: continue try: - rule_instance = rule_cls(**rule_args) + rule_instance = rule_cls(**rule_args) if rule_args is not None else rule_cls() alert_rules.append(rule_instance) except Exception as e: logger.error(f"Error creating instance of rule '{rule_cls_name}': {e}") @@ -93,7 +104,7 @@ class AnomalyScanner: def scan(scan_rules: List[ScanRule], history, cur): anomaly = False for rule in scan_rules: - anomaly = rule.apply(history, cur) + anomaly = rule.apply(cur, history=history) if anomaly: return anomaly, rule.name return anomaly, None @@ -254,6 +265,40 @@ class BaseWriterWithAD: self.anomalies = []
self.ndigits = writer_input.ndigits + @staticmethod + def stack_tensors(tensor_list): + """ + torch.stack cannot combine CPU and XPU tensors in one call. Split the tensors into a CPU group and an XPU group, + stack the XPU group and move it to CPU, then restore the original input order. + + :param tensor_list: [tensor(-1.6165), tensor(-1.0985), tensor(-1.7777), tensor(-1.8408, device='npu:0')] + :return: result: list of CPU values in input order + """ + cpu_tensors = [] + xpu_tensors = [] + + for tensor in tensor_list: + if isinstance(tensor, torch.Tensor) and tensor.device.type != 'cpu': + # stack the on-device tensors first, then move them to cpu + xpu_tensors.append(tensor) + else: + cpu_tensors.append(tensor) + + xpu_stack = torch.stack(xpu_tensors).cpu() if xpu_tensors else torch.tensor([]) + + # restore the input order + result = [] + cpu_tensors_idx, xpu_tensors_idx = 0, 0 + for tensor in tensor_list: + if isinstance(tensor, torch.Tensor) and tensor.device.type != 'cpu': + result.append(xpu_stack[xpu_tensors_idx]) + xpu_tensors_idx += 1 + else: + result.append(cpu_tensors[cpu_tensors_idx]) + cpu_tensors_idx += 1 + + return result + def get_anomalies(self): """Return the list of detected anomalies. """ @@ -299,7 +344,7 @@ class BaseWriterWithAD: end = (i+1) * MonitorConst.SLICE_SIZE if begin == len(tensors): continue - metric_list = torch.stack(tensors[begin:end]).cpu() + metric_list = self.stack_tensors(tensors[begin:end]) for tag, metric in zip(tags[begin:end], metric_list): self.add_scalar(tag, metric, step) @@ -376,7 +421,13 @@ class CSVWriterWithAD(BaseWriterWithAD): super().add_scalar(tag, scalar_value, global_step) name = tag[0].split('/')[0] - self.context_dict[name].append(scalar_value.item()) + if isinstance(scalar_value, torch.Tensor): + value = scalar_value.item() + elif isinstance(scalar_value, torch.Size): + value = list(scalar_value) + else: + value = scalar_value + self.context_dict[name].append(value) def write_metrics(self, ops, metric_value, step, prefix=''): super().write_metrics(ops, metric_value, step, prefix='') diff --git
a/debug/accuracy_tools/msprobe/pytorch/monitor/csv2tb.py b/debug/accuracy_tools/msprobe/pytorch/monitor/csv2tb.py index 7fbcac84efb38814e01c2cc3cf5b3696a0c1afd2..467e056ef63ccade970c66ce6ffd9b5fcf9ff835 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/csv2tb.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/csv2tb.py @@ -28,7 +28,10 @@ from msprobe.core.common.decorator import recursion_depth_decorator from msprobe.pytorch.common.log import logger from msprobe.pytorch.monitor.utils import get_target_output_dir -all_data_type_list = ["actv", "actv_grad", "exp_avg", "exp_avg_sq", "grad_unreduced", "grad_reduced", "param"] +all_data_type_list = [ + "actv", "actv_grad", "exp_avg", "exp_avg_sq", + "grad_unreduced", "grad_reduced", "param_origin", "param_updated" +] CSV_FILE_SUFFIX = r"_\d+-\d+\.csv" MAX_PROCESS_NUM = 128 @@ -76,6 +79,7 @@ def write_step(output_dirpath, parse_step_result, rank, data_type): for op, value in ops.items(): tag = f"{vpp_name}/{op}" writer.add_scalar(tag, value, step) + writer.flush() @recursion_depth_decorator("update_dict", max_depth=50) diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/distributed/wrap_distributed.py b/debug/accuracy_tools/msprobe/pytorch/monitor/distributed/wrap_distributed.py index d819911b910ae23970acfb5e89430bbfcad03763..c209fdba97fa9a4a153516d340892fbefbf0284f 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/distributed/wrap_distributed.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/distributed/wrap_distributed.py @@ -108,6 +108,7 @@ class ApiRegistry: if args[0] in PENDING_ASYNC_CC_BY_HANDLE: store_func = PENDING_ASYNC_CC_BY_HANDLE.pop(args[0]) store_func() + return wrapped_wait dist.Work.wait = wrapped_wait(dist.Work) @@ -272,7 +273,7 @@ def create_hooks(context, monitor): RANK = dist.get_rank() if dist.is_initialized() and RANK not in monitor.module_rank_list and monitor.module_rank_list != []: return [pre_hooks, hooks] - + if monitor.cc_log_only: pre_hooks.append(cc_log_hook) 
return [pre_hooks, hooks] diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/features.py b/debug/accuracy_tools/msprobe/pytorch/monitor/features.py index a6cadb25c2c1e374a299b316ecfb72fa1ecd08bf..81c029d401f9194688d332ac711d6065f126ce6a 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/features.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/features.py @@ -33,10 +33,6 @@ def get_mean(x: torch.tensor): return torch.mean(x.to(torch.float64)) -@torch.no_grad() -def get_mean(x: torch.tensor): - return torch.mean(x) - @torch.no_grad() def get_norm(x: torch.tensor): return torch.norm(x.to(torch.float64), p=2) @@ -46,6 +42,7 @@ def get_norm(x: torch.tensor): def get_max(x: torch.tensor): return torch.max(x) + @torch.no_grad() def get_zeros(x: torch.tensor, eps: float): return torch.sum(torch.abs(x) < eps) / x.numel() diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/module_hook.py b/debug/accuracy_tools/msprobe/pytorch/monitor/module_hook.py index dc5038b024c90876ac650ebeb811850194632d21..3c22b95089b4d5ed01cd22ed00023e9329e4d6b1 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/module_hook.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/module_hook.py @@ -37,7 +37,6 @@ from msprobe.pytorch.monitor.distributed.wrap_distributed import api_register, c from msprobe.pytorch.monitor.features import get_sign_matches from msprobe.pytorch.monitor.module_metric import get_metrics, get_summary_writer_tag_name, \ TensorMetrics, squash_param_name -from msprobe.pytorch.monitor.module_spec_verifier import validate_config_spec from msprobe.pytorch.monitor.optimizer_collect import OptimizerMonFactory from msprobe.pytorch.monitor.utils import get_param_struct, validate_config, validate_ops, \ get_output_base_dir, get_target_output_dir, chmod_tensorboard_dir, validate_set_monitor @@ -72,36 +71,6 @@ class ModuleHookContext: self.actvgrad = [] self.module_name = module_name self.struct = {} - self.format_by_arg = {} - self.verified = False - 
self.focused_in_col = 0 - self.focused_out_col = 0 - - def set_format_by_arg(self, key_name: str, target_config: dict): - """ Configure format_by_arg according to the monitored target - 1) module_name is configured as a monitored target in targets - 2) module_name is not configured in targets, and all_xy enables full monitoring - 3) module_name is not configured in targets, and all_xy does not enable full monitoring - - :param key_name: str, one of [input, output, input_grad, output_grad] - :param target_config: target obj in config json. - :return: - """ - cared = target_config.get(self.module_name, self.struct) - if key_name in cared: - target_module_config = cared[key_name] - if isinstance(target_module_config, dict): - # current cared is self.struct, monitor all data for module_name - self.format_by_arg[key_name] = target_module_config.get('config') - elif isinstance(target_module_config, str): - # current cared is target_config[self.module_name] - self.format_by_arg[key_name] = target_module_config - else: - logger.warning_on_rank_0(f"target module config error, result maybe empty." - f"module_name: {self.module_name}, key_name: {key_name}") - self.format_by_arg[key_name] = None - else: - self.format_by_arg[key_name] = self.struct.get(key_name).get('config') def reset(self): self.actv.clear() @@ -185,8 +154,8 @@ class TrainerMon: self.params_have_main_grad = params_have_main_grad self.update_heatmap_visualizer = defaultdict(HeatmapVisualizer) self.ratio_heatmap_visualizer = defaultdict(HeatmapVisualizer) - self.origin_step_func = None self.origin_start_grad_sync = None + self.fsdp_post_backward_hook = None self.config_timestamp = 0 # the timestamp is checked later; the first monitoring run need not touch the config file just to update its timestamp, it can be enabled directly through the dynamic_on switch self.config = load_json(config_file_path) validate_config(self.config) @@ -221,8 +190,8 @@ class TrainerMon: self.dp_group = None self.tp_group = None self.enable_megatron = False + self.fsdp_wrapped_module = False self.micro_batch_number = 1 - self.optimizer_class = None self.optimizer_mon = None self.optimizer_trans = None @@ -234,7 +203,6 @@ class TrainerMon: self.grad_context = GradContext() self.handles = 
defaultdict(list) self.param2name = defaultdict(str) - self.name2index = defaultdict() self.name2indices = defaultdict() self.name2param = {} self.duplicate_param = {} @@ -247,6 +215,8 @@ class TrainerMon: self.optimizer_hooked = False self.param_registered = False self.struct_printed = False + self.pre_step_hooks = [] + self.post_step_hooks = [] # distinguish dynamic from static monitoring self.dynamic_enable = os.getenv("DYNAMIC_MONITOR", 'False').lower() == 'true' @@ -411,7 +381,7 @@ class TrainerMon: self.micro_batch_number = grad_acc_steps self.dp_group = dp_group self.tp_group = tp_group - self.optimizer_mon, self.optimizer_class = OptimizerMonFactory.create_optimizer_mon(optimizer) + self.optimizer_mon = OptimizerMonFactory.create_optimizer_mon(optimizer) self.hook_step_final(optimizer) if not isinstance(model, list): model = [model] @@ -440,25 +410,48 @@ class TrainerMon: return self.tensor_metrics.stat_insert(target_tensor, ops_list, module_name, tensor_name, rank) - def build_tbtag_tensor_map(self, module_name, tag, tensor): - key = get_summary_writer_tag_name(module_name, tag, self.rank) - self._register_param_call_id("_hook_module", key) - return {key: tensor} + def build_tbtag_tensor_map(self, module_name, suffix, tag, tensor): + """ + :param module_name: str of module name + :param suffix: + :param tag: + :param tensor: torch.tensor or tuple/list of torch.tensor + :return: tensor_map + """ + tensor_map = {} + if isinstance(tensor, torch.Tensor): + tensor = [tensor] + if isinstance(tensor, tuple) or isinstance(tensor, list): + if len(tensor) == 1: + key = get_summary_writer_tag_name(module_name + suffix, tag, self.rank) + self.register_param_call_id("_hook_module", key) + tensor_map[key] = tensor[0] + else: + for i, tensor_i in enumerate(tensor): + key = get_summary_writer_tag_name(module_name + f"_{i}" + suffix, tag, self.rank) + self.register_param_call_id("_hook_module", key) + tensor_map[key] = tensor_i + return tensor_map def generate_param_map(self, tag, param_tensor): metrics = {} 
for name in self.param2name.values(): key = get_summary_writer_tag_name(name, tag, self.rank) - self._register_param_call_id("optimizer_pre_step_hook", key) + self.register_param_call_id("optimizer_pre_step_hook", key) if name not in param_tensor or param_tensor[name] is None: continue metrics[key] = param_tensor[name] return metrics - def generate_param_metrics(self, opt_context): + def generate_param_metrics(self, opt_context, stage=MonitorConst.PRE_PARAM): if not self.param_distribution: return - get_metrics(self.ops, self.name2param, self.eps, opt_context.param_metric) + tag2param = { + self.name2tag.get(name, {}).get(stage): param + for name, param in self.name2param.items() + if param.numel() != 0 + } + get_metrics(self.ops, tag2param, self.eps, opt_context.param_metric) def generate_mv_metrics(self, opt_context): if not self.mv_distribution: @@ -470,28 +463,20 @@ class TrainerMon: get_metrics(self.ops, m_tag_tensor_map, self.eps, opt_context.exp_avg_metric) get_metrics(self.ops, v_tag_tensor_map, self.eps, opt_context.exp_avg_sq_metric) - def generate_wgrad_metrics(self): + def generate_wgrad_metrics(self, post_grad_dict): if not self.wg_distribution: return {}, {} if self.weight_hooked: get_metrics(self.ops, self.grad_context.acc, self.eps, self.grad_context.acc_metric) - grad_dict = {} - for param, name in self.param2name.items(): - if self.duplicate_param.get(name, False): - continue - grad = param.main_grad if self.params_have_main_grad else param.grad - if grad is None: - logger.warning(f"grad is None: {name}, maybe something wrong happened.") - continue - tag = self.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) - self._register_param_call_id("hook_optimizer", tag) - grad_dict[tag] = grad - - get_metrics(self.ops, grad_dict, self.eps, self.grad_context.post) - unreduced_grad = self.grad_context.acc_metric if self.weight_hooked else self.grad_context.pre - return self.grad_context.post, unreduced_grad + get_metrics(self.ops, post_grad_dict, 
self.eps, self.grad_context.post) + reduced_grad = self.grad_context.post + if self.enable_megatron or self.fsdp_wrapped_module: + unreduced_grad = self.grad_context.pre + else: + unreduced_grad = self.grad_context.acc_metric + return reduced_grad, unreduced_grad def generate_xy_metrics(self): actv = {} @@ -531,7 +516,10 @@ class TrainerMon: def write_param_tb(self, opt_context): if not self.param_distribution: return - self.summary_writer.write_metrics(self.ops, opt_context.param_metric, opt_context.step, MonitorConst.PARAM) + param_metrics = {k: v for k, v in opt_context.param_metric.items() if MonitorConst.PRE_PARAM in k} + updated_param_metrics = {k: v for k, v in opt_context.param_metric.items() if MonitorConst.POST_PARAM in k} + self.summary_writer.write_metrics(self.ops, param_metrics, opt_context.step, MonitorConst.PRE_PARAM) + self.summary_writer.write_metrics(self.ops, updated_param_metrics, opt_context.step, MonitorConst.POST_PARAM) def write_mv_tb(self, opt_context): if not self.mv_distribution: @@ -545,7 +533,7 @@ class TrainerMon: if not self.wg_distribution: return - if self.enable_megatron: + if self.enable_megatron or self.fsdp_wrapped_module: self.summary_writer.write_metrics(self.ops, self.grad_context.pre, step, 'grad_unreduced') else: self.summary_writer.write_metrics(self.ops, self.grad_context.acc_metric, step, 'grad_unreduced') @@ -570,23 +558,23 @@ class TrainerMon: # skip generate metrics if context.step < self.start_step or (context.step - self.start_step) % self.step_interval != 0: return + + grad_dict = {} + if self.wg_distribution: + grad_dict = self.optimizer_mon.fetch_grad(self, self.param2name) + mv_result = None - if MonitorConst.DEEPSPEED_ZERO_OPT_FILTER in self.optimizer_class: # use deepspeed with zero1/2/3 - if not self.name2indices: - self.name2indices = self.optimizer_mon.get_param_index(self.param2name, self.name2index, optimizer) - mv_result = self.optimizer_mon.fetch_mv(self, optimizer, self.param2name, self.name2indices) 
- self.param2name = mv_result.grad - elif self.mv_distribution or self.ur_distribution or self.mg_direction: - mv_result = self.optimizer_mon.fetch_mv(self, optimizer, self.param2name) + if self.mv_distribution or self.ur_distribution or self.mg_direction: + mv_result = self.optimizer_mon.fetch_mv(self, self.param2name) if mv_result: context.param_exp_avg = mv_result.exp_avg context.param_exp_avg_sq = mv_result.exp_avg_sq context.param_adam_update = mv_result.update context.param_adam_ratio = mv_result.ratio - self.generate_wgrad_metrics() + self.generate_wgrad_metrics(grad_dict) self.generate_mv_metrics(context) - self.generate_param_metrics(context) + self.generate_param_metrics(context, MonitorConst.PRE_PARAM) tbtag_tensor_map = {} if self.mg_direction: @@ -614,17 +602,15 @@ class TrainerMon: context.metric_dict = metric_dict return - def patch_step(func, optimizer): - def wrapper(*args, **kwargs): - optimizer_pre_step_hook(optimizer, args, kwargs) - out = func(*args, **kwargs) - return out - return wrapper + def optimizer_post_step_hook(optimizer, args, kwargs): + context = self.optimizer_context[optimizer] + self.generate_param_metrics(context, MonitorConst.POST_PARAM) if self.optimizer_hooked: return - optimizer.__class__.step = patch_step(optimizer.__class__.step, optimizer) + self.pre_step_hooks.append(optimizer_pre_step_hook) + self.post_step_hooks.append(optimizer_post_step_hook) self.optimizer_hooked = True return @@ -716,13 +702,16 @@ class TrainerMon: def patch_step(func, optimizer): def wrapper(*args, **kwargs): + for hook in self.pre_step_hooks: + hook(optimizer, args, kwargs) out = func(*args, **kwargs) + for hook in self.post_step_hooks: + hook(optimizer, args, kwargs) step_final_hook(optimizer, args, kwargs) return out return wrapper optimizer.__class__.step = patch_step(optimizer.__class__.step, optimizer) - self.origin_step_func = optimizer.__class__.step return def hook_modules(self): @@ -765,6 +754,16 @@ class TrainerMon: 
BackwardHook.setup_input_hook = wrap_hook_setup(BackwardHook.setup_input_hook) BackwardHook.setup_output_hook = wrap_hook_setup(BackwardHook.setup_output_hook) return + + def register_param_call_id(self, hook_name: str, key: str): + """ + :param hook_name: + :param key: str, '0:relu_0/output_grad' + :return: + """ + logger.debug(f"{hook_name} {key}: {self.call_id}") + self.param_name_call_id[key] = self.call_id + self.call_id += 1 def _remove_all_hooks(self, optimizer): # clear hook handles @@ -791,14 +790,18 @@ class TrainerMon: logger.info("remove _ParamAndGradBucketGroup start_grad_sync") except ImportError: pass - else: # not megatron + elif self.fsdp_post_backward_hook: # fsdp + torch.distributed.fsdp._runtime_utils._post_backward_hook = self.fsdp_post_backward_hook + logger.info("remove patch_post_backward_hook in fsdp.") + else: # not megatron and not fsdp for handle in self.handles['wgrads']: handle.remove() self.handles['wgrads'].clear() self.weight_hooked = False if self.optimizer_hooked: - optimizer.__class__.step = self.origin_step_func + self.pre_step_hooks.clear() + self.post_step_hooks.clear() for _, context in self.optimizer_context.items(): context.reset() @@ -813,7 +816,6 @@ class TrainerMon: # clear cached node mappings self.param2name.clear() - self.name2index.clear() self.name2indices.clear() self.name2param.clear() self.duplicate_param.clear() @@ -873,27 +875,33 @@ class TrainerMon: return False def _register_chunk(self, model_chunk, prefix): - index = 0 for (param_name, param) in model_chunk.named_parameters(): if not param.requires_grad: continue + if not self.fsdp_wrapped_module and param_name.startswith("_fsdp_wrapped_module"): + self.fsdp_wrapped_module = True if self._is_target_param(param_name, param, prefix): name = prefix + squash_param_name(param_name, self.squash_name) if name in self.param2name.values(): name = prefix + param_name self.param2name[param] = name self.name2param[name] = param - self.name2index[name] = index if self.tp_group and not 
param_is_not_tensor_parallel_duplicate(param, self.tp_group): self.duplicate_param[name] = True if self.dp_group and param_is_data_parallel_duplicate(self.dp_group): self.duplicate_param[name] = True + + keywords = [ + MonitorConst.PRE_GRAD, + MonitorConst.POST_GRAD, + MonitorConst.PRE_PARAM, + MonitorConst.POST_PARAM + ] self.name2tag[name] = { - MonitorConst.PRE_GRAD: get_summary_writer_tag_name(name, MonitorConst.PRE_GRAD, self.rank), - MonitorConst.POST_GRAD: get_summary_writer_tag_name(name, MonitorConst.POST_GRAD, self.rank) + k: get_summary_writer_tag_name(name, k, self.rank) + for k in keywords } - index += 1 def _register_param_name(self): for vpp_stage, model_chunk in enumerate(self.model): @@ -916,11 +924,17 @@ class TrainerMon: # nothing to hook return 0 - def fwd_hook_fun(module, module_input, module_output, name): + def fwd_hook_fun(module, args, kwargs, module_output, name): if not module.training or is_recomputation(): # 1 only monitor training stage. # 2 when open recompute, skip recomputed forward stage. 
return + + module_input = [tensor for tensor in args if torch.is_tensor(tensor)] + if kwargs: + kwargs_tensors = [tensor for tensor in kwargs.values() if torch.is_tensor(tensor)] + module_input.extend(kwargs_tensors) + if module not in self.module_fwd_hook_context_by_module: self.module_fwd_hook_context_by_module[module] = ModuleHookContext(name) context: ModuleHookContext = self.module_fwd_hook_context_by_module[module] @@ -932,31 +946,16 @@ class TrainerMon: if self.print_struct: self.module_struct[context.module_name].update(context.struct) return - if not context.format_by_arg: - context.set_format_by_arg(Const.INPUT, self.config['targets']) - context.set_format_by_arg(Const.OUTPUT, self.config['targets']) - if not context.format_by_arg: - return - if not context.verified: - context.focused_in_col = validate_config_spec(context.format_by_arg[Const.INPUT], - module_input, context.module_name, - Const.INPUT) - context.focused_out_col = validate_config_spec(context.format_by_arg[Const.OUTPUT], - module_output, context.module_name, - Const.OUTPUT) - context.verified = True - # expect output be tensor type + tbtag_tensor_map = {} - cared_input = module_input if context.focused_in_col is None else module_input[context.focused_in_col] tbtag_tensor_map.update( self.build_tbtag_tensor_map( - f'{context.module_name}.{Const.INPUT}{MonitorConst.NAME_SEP}{context.micro_step}', - MonitorConst.ACTV, cared_input)) - cared_output = module_output if context.focused_out_col is None else module_output[context.focused_out_col] + f'{context.module_name}.{Const.INPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, module_input)) tbtag_tensor_map.update( self.build_tbtag_tensor_map( - f'{context.module_name}.{Const.OUTPUT}{MonitorConst.NAME_SEP}{context.micro_step}', - MonitorConst.ACTV, cared_output)) + f'{context.module_name}.{Const.OUTPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, module_output)) get_metrics(self.ops, 
tbtag_tensor_map, self.eps, context.actv) context.micro_step += 1 @@ -974,31 +973,17 @@ class TrainerMon: if self.print_struct: self.module_struct[context.module_name].update(context.struct) return - if not context.format_by_arg: - context.set_format_by_arg(MonitorConst.INPUT_GRAD, self.config['targets']) - context.set_format_by_arg(MonitorConst.OUTPUT_GRAD, self.config['targets']) - if not context.format_by_arg: - return - if not context.verified: - context.focused_in_col = validate_config_spec( - context.format_by_arg[MonitorConst.INPUT_GRAD], - input_grad, context.module_name, MonitorConst.INPUT_GRAD) - context.focused_out_col = validate_config_spec( - context.format_by_arg[MonitorConst.OUTPUT_GRAD], - output_grad, context.module_name, MonitorConst.OUTPUT_GRAD) - context.verified = True tbtag_tensor_map = {} - cared_input_grad = input_grad if context.focused_in_col is None else input_grad[context.focused_in_col] tbtag_tensor_map.update( self.build_tbtag_tensor_map( - f'{context.module_name}.{Const.INPUT}{MonitorConst.NAME_SEP}{context.micro_step}', - MonitorConst.ACTV, cared_input_grad)) - cared_output_grad = output_grad if context.focused_out_col is None else output_grad[context.focused_out_col] + f'{context.module_name}.{Const.INPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, input_grad)) + tbtag_tensor_map.update( self.build_tbtag_tensor_map( - f'{context.module_name}.{Const.OUTPUT}{MonitorConst.NAME_SEP}{context.micro_step}', - MonitorConst.ACTV, cared_output_grad)) + f'{context.module_name}.{Const.OUTPUT}', f'{MonitorConst.NAME_SEP}{context.micro_step}', + MonitorConst.ACTV, output_grad)) if context.micro_step == 0 and context.actvgrad: logger.warning(f"actvgrad context of {context.module_name} is not empty when first micro_step, " @@ -1021,8 +1006,10 @@ class TrainerMon: name = self._is_target_module(module_name, target_names, vpp_stage) if not name: continue + if submodule.__class__.__name__ == "FullyShardedDataParallel": + 
continue if not self.backward_only: - handle = submodule.register_forward_hook(partial(fwd_hook_fun, name=name)) + handle = submodule.register_forward_hook(partial(fwd_hook_fun, name=name), with_kwargs=True) self.handles['xy'].append(handle) if not self.forward_only and not self.has_register_backward_hook(name, submodule): handle = submodule.register_full_backward_hook(bwd_hook_fun) @@ -1051,7 +1038,7 @@ class TrainerMon: if tag is None: continue grad_dict[tag] = grad - self._register_param_call_id("sync_grad_func", tag) + self.register_param_call_id("sync_grad_func", tag) get_metrics(self.ops, grad_dict, self.eps, self.grad_context.pre) out = sync_grad_func(bucket) return out @@ -1060,6 +1047,10 @@ class TrainerMon: if not self.wg_distribution: return + if self.fsdp_wrapped_module: + # patch fsdp _runtime_utils._post_backward_hook + self._patch_fsdp_post_backward_hook() + return try: from megatron.core.distributed.param_and_grad_buffer import Bucket @@ -1078,9 +1069,44 @@ class TrainerMon: logger.info("megatron version is > core_r0.8.0 <= core_r0.9.0") except ImportError: self.enable_megatron = False | self.enable_megatron + if self.enable_megatron: + return - if not self.enable_megatron: - self._hook_weights() + # default hook weights + self._hook_weights() + + def _patch_fsdp_post_backward_hook(self): + """ + The FSDP runtime manages the whole forward/backward computation and communication flow by overriding nn.Module's forward and defining the corresponding logic. + A hook registered on the AccumulateGrad object runs right after backward computes the grad, so it can collect the gradient after accumulation and before communication aggregation, ahead of the reduce_scatter operation. + In every forward stage, fsdp re-registers its hook on AccumulateGrad, so a hook registered inside the monitor tool does not take effect; + therefore _post_backward_hook is patched to collect the gradient after backward and before reduce_scatter. + """ + def patch_post_backward_hook(_post_backward_hook): + def wrapper(state, handle, *unused): + grad_dict = {} + offset = 0 + for param, name in self.param2name.items(): + limit = param.numel() + if not limit: + continue + grad = handle.flat_param.grad[offset:offset + limit] + offset += limit + tag = self.name2tag.get(name, {}).get(MonitorConst.PRE_GRAD) + if 
tag is None: + continue + grad_dict[tag] = grad + self.register_param_call_id("_post_backward_hook", tag) + get_metrics(self.ops, grad_dict, self.eps, self.grad_context.pre) + out = _post_backward_hook(state, handle, *unused) + return out + + return wrapper + + logger.info("Patch fsdp _post_backward_hook, collect pre_grad metrics.") + self.fsdp_post_backward_hook = torch.distributed.fsdp._runtime_utils._post_backward_hook + torch.distributed.fsdp._runtime_utils._post_backward_hook = \ + patch_post_backward_hook(torch.distributed.fsdp._runtime_utils._post_backward_hook) def _hook_weights(self): context = self.grad_context @@ -1088,7 +1114,7 @@ class TrainerMon: @torch.no_grad def param_hook(*args, context_dict, param, key, name): param.micro_step += 1 - self._register_param_call_id("param_hook", key) + self.register_param_call_id("param_hook", key) if param.micro_step == self.micro_batch_number: param.micro_step = 0 if self.params_have_main_grad: @@ -1111,13 +1137,3 @@ class TrainerMon: self.handles['wgrads'].append(handle) self.weight_hooked = True - - def _register_param_call_id(self, hook_name: str, key: str): - """ - :param hook_name: - :param key: str, '0:relu_0/output_grad' - :return: - """ - logger.debug(f"{hook_name} {key}: {self.call_id}") - self.param_name_call_id[key] = self.call_id - self.call_id += 1 diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/module_metric.py b/debug/accuracy_tools/msprobe/pytorch/monitor/module_metric.py index 5793000e5d63234d2d353e1a9cf44a438cdb3f5a..48d241c5f6129df05997f52c0957ee7976ff171e 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/module_metric.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/module_metric.py @@ -144,6 +144,20 @@ class IdentMetric(Metric): return tensor +@register_config_metric("shape") +class ShapeMetric(Metric): + @staticmethod + def get_metric_value(tensor, eps): + return tensor.shape + + +@register_config_metric("dtype") +class DtypeMetric(Metric): + @staticmethod + def 
get_metric_value(tensor, eps): + return tensor.dtype + + def get_metrics(ops, tag2tensor, eps, out_dict=None): """ :param ops: ["op1", "op2"] diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/module_spec_verifier.py b/debug/accuracy_tools/msprobe/pytorch/monitor/module_spec_verifier.py deleted file mode 100644 index 72c35c90bf9540a31cfa1176274a3d2c66bc8946..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/module_spec_verifier.py +++ /dev/null @@ -1,95 +0,0 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import re -import abc -import torch - -from msprobe.pytorch.common.log import logger - -# registry holding all validator implementation classes -config_validator_registry = {} - - -def register_config_validator(cls): - """Decorator that registers ConfigValidator implementation classes""" - config_validator_registry[cls.__name__] = cls - return cls - - -class ConfigValidator(metaclass=abc.ABCMeta): - @abc.abstractmethod - def check_pattern_match(self, config_spec: str): - pass - - @abc.abstractmethod - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - pass - - -@register_config_validator -class TensorValidator(ConfigValidator): - def check_pattern_match(self, config_spec: str): - pattern = re.compile(r"tensor") - return pattern.match(config_spec) - - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - if not torch.is_tensor(actual_data): - raise ValueError( - f"Format of {module_name} {data_type} does not match the required format 'tensor' in config.") - - -@register_config_validator -class TupleValidator(ConfigValidator): - def check_pattern_match(self, config_spec: str): - pattern = re.compile(r"tuple\[(\d+)\]:?(\d+)?") - return pattern.match(config_spec) - - def validate(self, actual_data, module_name: str, data_type: str, pattern_match): - length, index = pattern_match.groups() - if index is None: - index = 0 - length, index = int(length), int(index) - - if not (0 <= index < length): - raise ValueError( - f"Format of {module_name} {data_type} in config.json does not match the required format 'tuple[x]:y'." 
- f"y must be greater than or equal to 0 and less than x.") - if not isinstance(actual_data, tuple): - raise ValueError( - f"Type of {module_name} {data_type} does not match spec of config.json, should be tuple, please check.") - if len(actual_data) != length: - raise ValueError( - f"Length of {module_name} {data_type} does not match spec of config.json, should be {length}, " - f"actual is {len(actual_data)} please check.") - return index - - -def validate_config_spec(config_spec: str, actual_data, module_name: str, data_type: str): - focused_col = None - if not config_spec or not isinstance(config_spec, str): - return focused_col - for _, validator_cls in config_validator_registry.items(): - config_validator = validator_cls() - pattern_match = config_validator.check_pattern_match(config_spec) - if pattern_match: - try: - focused_col = config_validator.validate(actual_data, module_name, data_type, pattern_match) - except ValueError as e: - logger.warning(f"config spec validate failed: {str(e)}") - return focused_col - logger.warning(f"config spec in {module_name} {data_type} not supported, " - f"expected spec:'tuple\[(\d+)\]:(\d+)' or 'tensor', actual spec: {config_spec}.") - return focused_col diff --git a/debug/accuracy_tools/msprobe/pytorch/monitor/optimizer_collect.py b/debug/accuracy_tools/msprobe/pytorch/monitor/optimizer_collect.py index 04a7c0f4d18c9478ee0773722036a707fcffabdf..e074c78b8f8459fad00c2762fa87f8c093c182fe 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/optimizer_collect.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/optimizer_collect.py @@ -12,152 +12,120 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
- -from collections import defaultdict +from abc import abstractmethod import torch import torch.distributed as dist from msprobe.pytorch.common.log import logger -from msprobe.pytorch.monitor.utils import MVResult, MVGradResult +from msprobe.pytorch.monitor.utils import MVResult +from msprobe.core.common.const import MonitorConst class OptimizerMon(object): - def __init__(self) -> None: + def __init__(self, torch_opt) -> None: self.fp16_to_fp32_param = {} - self.is_stage3 = False + self.torch_opt = torch_opt - def fetch_mv(self, monitor, torch_opt, params2name): - pass + def narrow_from_flatten(self, param, flatten_state): + return flatten_state + + def fetch_grad(self, monitor, params2name): + if not self.fp16_to_fp32_param: + self.map_fp16_to_fp32_param(self.torch_opt) - def _fetch_mv_in_adam(self, monitor, torch_opt, params2name): - exp_avg_dict = defaultdict(float) - exp_avg_sq_dict = defaultdict(float) - update_dict = defaultdict() - ratio_dict = defaultdict() + grad_dict = {} + first_param = True for param, name in params2name.items(): - if param in self.fp16_to_fp32_param: - param = self.fp16_to_fp32_param[param] - - if param in torch_opt.state: - state_param = torch_opt.state.get(param, None) - exp_avg = state_param.get("exp_avg", None) - exp_avg_sq = state_param.get("exp_avg_sq", None) - if exp_avg is None or exp_avg_sq is None: - logger.warning(f"exp_avg or exp_avg_sq of {name} is None, maybe something wrong happened.") - continue + if monitor.duplicate_param.get(name, False): + continue + if self.fp16_to_fp32_param and param not in self.fp16_to_fp32_param: + continue + grad = param.main_grad if monitor.params_have_main_grad else param.grad + element_in_cur_partition = self.fp16_to_fp32_param.get(param, param).numel() + if param.numel() != element_in_cur_partition: + if first_param: + grad = grad.flatten()[-element_in_cur_partition:] + else: # supposed to be the last one + grad = grad.flatten()[:element_in_cur_partition] + first_param = False + + if grad 
is None: + if not monitor.fsdp_wrapped_module: + logger.warning(f"grad is None: {name}, maybe something wrong happened.") + continue + tag = monitor.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) + monitor.register_param_call_id("hook_optimizer", tag) + grad_dict[tag] = grad + return grad_dict + + def map_fp16_to_fp32_param(self, torch_opt): + pass + + def fetch_mv(self, monitor, params2name): + if not self.fp16_to_fp32_param: + self.map_fp16_to_fp32_param(self.torch_opt) + + exp_avg_dict = {} + exp_avg_sq_dict = {} + update_dict = {} + ratio_dict = {} + + if hasattr(self.torch_opt, 'state'): + state = self.torch_opt.state + elif hasattr(self.torch_opt, 'optimizer') and hasattr(self.torch_opt.optimizer, 'state'): + state = self.torch_opt.optimizer.state + else: + logger.warning('optimizer state can not accessed') + return MVResult(exp_avg=exp_avg_dict, exp_avg_sq=exp_avg_sq_dict, update=update_dict, ratio=ratio_dict) + + for lp_param, name in params2name.items(): + if lp_param in self.fp16_to_fp32_param: + hp_param = self.fp16_to_fp32_param[lp_param] + else: + hp_param = lp_param + + if hp_param in state: + state_param = state.get(hp_param, None) + exp_avg = self.narrow_from_flatten(lp_param, state_param.get("exp_avg", None)) + exp_avg_sq = self.narrow_from_flatten(lp_param, state_param.get("exp_avg_sq", None)) if monitor.mv_distribution: exp_avg_dict[name] = exp_avg exp_avg_sq_dict[name] = exp_avg_sq if monitor.mg_direction: exp_avg_dict[name] = exp_avg if monitor.ur_distribution: - if len(torch_opt.param_groups) > 1: - logger.info(f"the length of torch_opt.param_groups is {len(torch_opt.param_groups)}.") + if len(self.torch_opt.param_groups) > 1: + logger.info(f"the length of torch_opt.param_groups is {len(self.torch_opt.param_groups)}.") if 'step' in state_param: step = state_param['step'] # Optimizer from pytorch or FusedAdam from apex(used by megatron) - elif 'step' in torch_opt.param_groups[0]: - step = torch_opt.param_groups[0]['step'] # AdamW from 
mindspeed + elif 'step' in self.torch_opt.param_groups[0]: + step = self.torch_opt.param_groups[0]['step'] # AdamW from mindspeed else: logger.warning(f"step of {name} is None, maybe something wrong happened.") continue - exp_avg_hat = exp_avg / (1 - torch_opt.defaults['betas'][0] ** step) - exp_avg_sq_hat = exp_avg_sq / (1 - torch_opt.defaults['betas'][1] ** step) - update_dict[name] = exp_avg_hat / (torch.sqrt(exp_avg_sq_hat) + torch_opt.defaults['eps']) + exp_avg_hat = exp_avg / (1 - self.torch_opt.defaults['betas'][0] ** step) + exp_avg_sq_hat = exp_avg_sq / (1 - self.torch_opt.defaults['betas'][1] ** step) + update_dict[name] = exp_avg_hat / (torch.sqrt(exp_avg_sq_hat) + self.torch_opt.defaults['eps']) ratio_dict[name] = exp_avg_hat / torch.sqrt(exp_avg_sq_hat) monitor.update_heatmap_visualizer[name].pre_cal(update_dict[name]) monitor.ratio_heatmap_visualizer[name].pre_cal(ratio_dict[name]) return MVResult(exp_avg=exp_avg_dict, exp_avg_sq=exp_avg_sq_dict, update=update_dict, ratio=ratio_dict) - - def _fetch_mv_grad_in_adam(self, monitor, torch_opt, params2name, name2indices, fp32_partitioned_groups_flat): - exp_avg_dict = defaultdict(float) - exp_avg_sq_dict = defaultdict(float) - update_dict = defaultdict() - ratio_dict = defaultdict() - param2name = defaultdict() - fp32_partitioned_groups_flat_grad = defaultdict() - partition_id = dist.get_rank() if dist.is_initialized() else 0 - world_size = dist.get_world_size() if dist.is_initialized() else 1 - - def get_flatten_grad(self, optimizer, group_idx): - if fp32_partitioned_groups_flat[group_idx].grad is None: - if partition_id == world_size - 1 and not self.is_stage3: - fp32_partitioned_groups_flat_grad = optimizer.flatten_dense_tensors_aligned( - optimizer.averaged_gradients[group_idx], - int(optimizer.partition_size[group_idx]) - ).to(fp32_partitioned_groups_flat[group_idx].dtype) - else: - fp32_partitioned_groups_flat_grad = optimizer.flatten( - optimizer.averaged_gradients[group_idx] - 
).to(fp32_partitioned_groups_flat[group_idx].dtype) - return fp32_partitioned_groups_flat_grad - else: - return fp32_partitioned_groups_flat[group_idx].grad - - for group_idx in range(len(fp32_partitioned_groups_flat)): - fp32_partitioned_groups_flat_grad[group_idx] = get_flatten_grad(self, torch_opt, group_idx) - - for name in params2name.values(): - start_idx, end_idx, group_idx, group_with_rank = name2indices[name] - if group_with_rank != partition_id and isinstance(group_with_rank, int): - continue - fp32_param = fp32_partitioned_groups_flat[group_idx][start_idx: end_idx] - fp32_param.grad = fp32_partitioned_groups_flat_grad[group_idx][start_idx: end_idx] - param2name[fp32_param] = name - if not torch_opt.state: - continue - state_param = list(torch_opt.state.values())[group_idx] - exp_avg = state_param.get("exp_avg", None) - exp_avg_sq = state_param.get("exp_avg_sq", None) - if exp_avg is None or exp_avg_sq is None: - logger.warning(f"exp_avg or exp_avg_sq of {name} is None, maybe something wrong happened.") - continue - exp_avg = exp_avg[start_idx: end_idx] - exp_avg_sq = exp_avg_sq[start_idx: end_idx] - if monitor.mv_distribution: - exp_avg_dict[name] = exp_avg - exp_avg_sq_dict[name] = exp_avg_sq - if monitor.mg_direction: - exp_avg_dict[name] = exp_avg - if monitor.ur_distribution: - if 'step' in state_param: - step = state_param['step'] # Optimizer from pytorch or FusedAdam from apex(used by megatron) - elif 'step' in torch_opt.param_groups[group_idx]: - step = torch_opt.param_groups[group_idx]['step'] # AdamW from mindspeed - else: - logger.warning(f"step of {name} is None, maybe something wrong happened.") - continue - exp_avg_hat = exp_avg / (1 - torch_opt.defaults['betas'][0] ** step) - exp_avg_sq_hat = exp_avg_sq / (1 - torch_opt.defaults['betas'][1] ** step) - update_dict[name] = exp_avg_hat / (torch.sqrt(exp_avg_sq_hat) + torch_opt.defaults['eps']) - ratio_dict[name] = exp_avg_hat / torch.sqrt(exp_avg_sq_hat) - 
monitor.update_heatmap_visualizer[name].pre_cal(update_dict[name]) - monitor.ratio_heatmap_visualizer[name].pre_cal(ratio_dict[name]) - del fp32_partitioned_groups_flat_grad - return MVGradResult(exp_avg=exp_avg_dict, exp_avg_sq=exp_avg_sq_dict, update=update_dict, ratio=ratio_dict, - grad=param2name) - + class MixPrecisionOptimizerMon(OptimizerMon): """ Monitor class for mixed-precision optimizers. It monitors and manages the optimizer during mixed-precision training. Mixed-precision training speeds up training and reduces memory consumption by appropriately lowering the precision of certain computations. """ - - def map_fp16_tp_fp32_param(self, torch_opt): + def map_fp16_to_fp32_param(self, torch_opt): for fp16_group, fp32_group in zip(torch_opt.float16_groups, torch_opt.fp32_from_float16_groups): for fp16_param, fp32_param in zip(fp16_group, fp32_group): self.fp16_to_fp32_param[fp16_param] = fp32_param - def fetch_mv(self, monitor, torch_opt, params2name): - if not self.fp16_to_fp32_param and torch_opt is not None: - self.map_fp16_tp_fp32_param(torch_opt) - - return self._fetch_mv_in_adam(monitor, torch_opt, params2name) - class MegatronDistributedOptimizerMon(OptimizerMon): - def map_fp16_tp_fp32_param(self, torch_opt): + def map_fp16_to_fp32_param(self, torch_opt): if not (hasattr(torch_opt, "model_float16_groups") and hasattr(torch_opt, "shard_fp32_from_float16_groups")): raise Exception( @@ -168,184 +136,176 @@ class MegatronDistributedOptimizerMon(OptimizerMon): for fp16_param, shard_fp32_param in zip(fp16_group, shard_fp32_group): self.fp16_to_fp32_param[fp16_param] = shard_fp32_param - def fetch_mv(self, monitor, torch_opt, params2name): - if not self.fp16_to_fp32_param and torch_opt is not None: - self.map_fp16_tp_fp32_param(torch_opt) - - return self._fetch_mv_in_adam(monitor, torch_opt, params2name) - - -class MegatronFP32OptimizerMon(OptimizerMon): - def fetch_mv(self, monitor, torch_opt, params2name): - return self._fetch_mv_in_adam(monitor, torch_opt, params2name) - class MegatronChainedDistributedOptimizerMon(MegatronDistributedOptimizerMon): - def fetch_mv(self, monitor, torch_opt, params2name): - if not self.fp16_to_fp32_param and
torch_opt is not None: - for opt in torch_opt.chained_optimizers: - self.map_fp16_tp_fp32_param(opt) + def map_fp16_to_fp32_param(self, torch_opt): + for opt in torch_opt.chained_optimizers: + super().map_fp16_to_fp32_param(opt) - if not isinstance(torch_opt, torch.optim.Optimizer) and not hasattr(torch_opt, 'state'): + if not hasattr(self.torch_opt, 'state'): torch_opt.state = {} - for opt in torch_opt.chained_optimizers: - torch_opt.state.update(opt.optimizer.state) - return self._fetch_mv_in_adam(monitor, torch_opt, params2name) + for opt in self.torch_opt.chained_optimizers: + self.torch_opt.state.update(opt.optimizer.state) class MegatronChainedMixPrecisionOptimizerMon(MixPrecisionOptimizerMon): - def fetch_mv(self, monitor, torch_opt, params2name): - if not self.fp16_to_fp32_param and torch_opt is not None: - for opt in torch_opt.chained_optimizers: - self.map_fp16_tp_fp32_param(opt) + def map_fp16_to_fp32_param(self, torch_opt): + for opt in torch_opt.chained_optimizers: + super().map_fp16_to_fp32_param(opt) - if not isinstance(torch_opt, torch.optim.Optimizer) and not hasattr(torch_opt, 'state'): + if not hasattr(self.torch_opt, 'state'): torch_opt.state = {} - for opt in torch_opt.chained_optimizers: - torch_opt.state.update(opt.optimizer.state) - return self._fetch_mv_in_adam(monitor, torch_opt, params2name) + for opt in self.torch_opt.chained_optimizers: + self.torch_opt.state.update(opt.optimizer.state) -class DeepSpeedZeroOptimizerStage0Mon(OptimizerMon): - def get_group_index(self, torch_opt): - bit16_groups = torch_opt.bf16_groups - param2group = defaultdict() - for group_idx, bit16_group in enumerate(bit16_groups): +class DeepSpeedZeroOptimizerMon(OptimizerMon): + """ + Base monitor class for DeepSpeed ZeRO optimizer. + ZeRO stage 0 no partition + ZeRO stage 1 partitions optimizer states across data parallel processes. + ZeRO stage 2 additionally partitions gradients. + ZeRO stage 3 additionally partitions parameters. 
+ + This class provides monitoring capabilities for ZeRO optimizers by: + - Handling gradient collection for different ZeRO stages + - Managing optimizer state access for monitoring + """ + def __init__(self, torch_opt): + super().__init__(torch_opt) + self.stage = '' + self.bit16_groups = [] + self.fp32_flat_groups = [] + self.param2group = () + self.param2index = [] + self.group_offset = {} + + @abstractmethod + def get_grad_for_param(self, lp_param, group_idx, param_id): + raise NotImplementedError + + def param_not_in_partition(self, lp_param, group_idx): + param_slice_mapping = self.torch_opt.state_dict()['param_slice_mappings'][group_idx] + hp_address = param_slice_mapping.get(self.torch_opt.param_names.get(lp_param)) + return hp_address is None + + def get_position(self, lp_param, group_idx): + param_slice_mapping = self.torch_opt.state_dict()['param_slice_mappings'][group_idx] + hp_address = param_slice_mapping.get(self.torch_opt.param_names.get(lp_param)) + return hp_address.start, hp_address.numel + + def get_group_index(self): + param2group = {} + for group_idx, bit16_group in enumerate(self.bit16_groups): for param in bit16_group: param2group[param] = group_idx return param2group - - def fetch_mv(self, monitor, torch_opt, params2name, name2indices=None): - param2group = self.get_group_index(torch_opt) - exp_avg_dict = defaultdict(float) - exp_avg_sq_dict = defaultdict(float) - update_dict = defaultdict() - ratio_dict = defaultdict() - - param_slice_mappings = torch_opt.state_dict()['param_slice_mappings'] - for param, name in params2name.items(): - group_idx = param2group[param] - state = torch_opt.optimizer.state[torch_opt.fp32_groups_flat_partition[group_idx]] - if state.get('exp_avg', None) is None: - logger.warning(f"optimizer state is None. 
Something is wrong if this is not the first step") - break - param_slice_mapping = param_slice_mappings[group_idx] - hp_address = param_slice_mapping.get(torch_opt.param_names[param]) - if hp_address is None: + + def get_param_index(self, lp_param, group_idx): + if not self.param2index: + for group in self.bit16_groups: + param2index = {} + for index, param in enumerate(group): + param2index[param] = index + self.param2index.append(param2index) + + return self.param2index[group_idx][lp_param] + + def narrow_from_flatten(self, param, flatten_state): + if flatten_state is None: + return flatten_state + group_idx = self.param2group[param] + if self.param_not_in_partition(param, group_idx): + return None + start, numel = self.get_position(param, group_idx) + return flatten_state.narrow(0, start, numel) + + def map_fp16_to_fp32_param(self, torch_opt): + for group_idx, group in enumerate(self.bit16_groups): + for param in group: + self.fp16_to_fp32_param[param] = self.fp32_flat_groups[group_idx] + + def fetch_grad(self, monitor, params2name): + grad_dict = {} + for lp_param, name in params2name.items(): + group_idx = self.param2group[lp_param] + param_id = self.get_param_index(lp_param, group_idx) + if self.param_not_in_partition(lp_param, group_idx): continue - start = hp_address.start - numel = hp_address.numel - - if monitor.mv_distribution: - exp_avg_dict[name] = state['exp_avg'].narrow(0, start, numel) - exp_avg_sq_dict[name] = state['exp_avg_sq'].narrow(0, start, numel) - if monitor.mg_direction: - exp_avg_dict[name] = state['exp'].narrow(0, start, numel) - if monitor.ur_distribution: - if len(torch_opt.param_groups) > 1: - logger.info(f"the length of torch_opt.param_groups is {len(torch_opt.param_groups)}.") - if 'step' in state: - step = state['step'] # Optimizer from pytorch or FusedAdam from apex(used by megatron) - elif 'step' in torch_opt.param_groups[0]: - step = torch_opt.param_groups[0]['step'] # AdamW from mindspeed + if self.stage == '1or2': + param_id = 
param_id - self.group_offset[group_idx] - 1 + grad = self.get_grad_for_param(lp_param, group_idx, param_id) + tag = monitor.name2tag.get(name, {}).get(MonitorConst.POST_GRAD) + monitor.register_param_call_id("hook_optimizer", tag) + grad_dict[tag] = grad + + return grad_dict + + +class DeepSpeedZeroOptimizerStage0Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, torch_opt): + super().__init__(torch_opt) + self.stage = '0' + self.bit16_groups = torch_opt.bf16_groups + self.fp32_flat_groups = torch_opt.fp32_groups_flat_partition + self.param2group = self.get_group_index() + + def get_grad_for_param(self, lp_param, group_idx, param_id): + return self.torch_opt.fp32_groups_gradient_dict[group_idx][param_id] + + +class DeepSpeedZeroOptimizerStage1or2Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, torch_opt): + super().__init__(torch_opt) + self.stage = '1or2' + self.bit16_groups = torch_opt.bit16_groups + self.fp32_flat_groups = torch_opt.single_partition_of_fp32_groups + self.param2group = self.get_group_index() + self.group_offset = {} + self.get_group_offset() + + def get_grad_for_param(self, lp_param, group_idx, param_id): + if getattr(self.torch_opt, "cpu_offload", False): + grads = self.torch_opt.single_partition_of_fp32_groups[group_idx].grad + start, numel = self.get_position(lp_param, group_idx) + grad = grads.narrow(0, start, numel) + else: + grad = self.torch_opt.averaged_gradients[group_idx][param_id] + return grad + + def get_group_offset(self): + for group_idx, group in enumerate(self.bit16_groups): + self.group_offset[group_idx] = -1 + for lp_param in group: + if self.param_not_in_partition(lp_param, group_idx): + self.group_offset[group_idx] = self.get_param_index(lp_param, group_idx) else: - logger.warning(f"step of {name} is None, maybe something wrong happened.") - continue - exp_avg = state['exp_avg'].narrow(0, start, numel) - exp_avg_sq = state['exp_avg_sq'].narrow(0, start, numel) - exp_avg_hat = exp_avg / (1 - 
torch_opt.defaults['betas'][0] ** step) - exp_avg_sq_hat = exp_avg_sq / (1 - torch_opt.defaults['betas'][1] ** step) - update_dict[name] = exp_avg_hat / (torch.sqrt(exp_avg_sq_hat) + torch_opt.defaults['eps']) - ratio_dict[name] = exp_avg_hat / torch.sqrt(exp_avg_sq_hat) - monitor.update_heatmap_visualizer[name].pre_cal(update_dict[name]) - monitor.ratio_heatmap_visualizer[name].pre_cal(ratio_dict[name]) - return MVResult(exp_avg=exp_avg_dict, exp_avg_sq=exp_avg_sq_dict, update=update_dict, ratio=ratio_dict) - + break -class DeepSpeedZeroOptimizerStage3Mon(OptimizerMon): - def get_param_index(self, params2name, name2index, torch_opt): - fp16_groups = torch_opt.fp16_partitioned_groups - name2indices = defaultdict() - index_length = defaultdict() - idx = 0 - for group_idx, fp16_group in enumerate(fp16_groups): - index = 0 - for param in fp16_group: - param_length = len(param.flatten()) - index_length[idx] = (index, index + param_length, group_idx) - index += param_length - idx += 1 - for _, name in params2name.items(): - idx = name2index[name] - start_idx, end_idx, group_idx = index_length[idx] - name2indices[name] = (start_idx, end_idx, group_idx, None) - return name2indices - - def fetch_mv(self, monitor, torch_opt, params2name, name2indices=None): - self.is_stage3 = True - fp32_partitioned_groups_flat = torch_opt.fp32_partitioned_groups_flat - return self._fetch_mv_grad_in_adam(monitor, torch_opt, params2name, name2indices, fp32_partitioned_groups_flat) - - -class DeepSpeedZeroOptimizerStage1or2Mon(OptimizerMon): - @staticmethod - def get_group_index(fp32_length, world_size, index): - for i in range(len(fp32_length) - 1): - if fp32_length[i] <= index < fp32_length[i + 1]: - interval_start = fp32_length[i] - interval_length = fp32_length[i + 1] - fp32_length[i] - sub_interval_length = interval_length // world_size - sub_index = (index - interval_start) // sub_interval_length - sub_interval_start = interval_start + sub_index * sub_interval_length - return 
sub_interval_start, min(sub_index, world_size - 1) - return fp32_length[-1], 0 - - def get_param_index(self, params2name, name2index, torch_opt): - padding = torch_opt.groups_padding - world_size = dist.get_world_size() if dist.is_initialized() else 1 - fp32_length = [0] - for fp32_group_index, single_partition_of_fp32_group in enumerate(torch_opt.single_partition_of_fp32_groups): - fp32_length.append(len(single_partition_of_fp32_group) * world_size + fp32_length[fp32_group_index]) - - bf16_groups = [] - name2indices = defaultdict() - index_length = defaultdict() - index = 0 - idx = 0 - for group_idx, bf16_group in enumerate(torch_opt.bit16_groups): - bf16_groups.extend(bf16_group) - for param in bf16_group: - param_length = len(param.flatten()) - group_index, group_with_rank = self.get_group_index(fp32_length, world_size, index) - index_length[idx] = (index, index + param_length, group_idx, group_index, group_with_rank) - index += param_length - idx += 1 - group_length = len(bf16_groups) / len(torch_opt.bit16_groups) - for _, name in params2name.items(): - name_index = name2index[name] - start_idx, end_idx, group_idx, group_index, group_with_rank = index_length[name_index] - need_padding = True if group_with_rank == world_size - 1 else False - new_start_idx = start_idx - group_index - new_end_idx = end_idx - group_index - if need_padding and group_length - 1 <= name_index <= len(bf16_groups) - 1 and name_index % ( - group_length - 1) == 0: - new_end_idx -= padding[int(name_index // (group_length - 1) - 1)] - name2indices[name] = (new_start_idx, new_end_idx, group_idx, group_with_rank) - return name2indices - - def fetch_mv(self, monitor, torch_opt, params2name, name2indices=None): - fp32_partitioned_groups_flat = torch_opt.single_partition_of_fp32_groups - return self._fetch_mv_grad_in_adam(monitor, torch_opt, params2name, name2indices, fp32_partitioned_groups_flat) - - -class DummyOptimizerMon(OptimizerMon): - def fetch_mv(self, monitor, torch_opt, params2name): 
- return self._fetch_mv_in_adam(monitor, torch_opt, params2name) + +class DeepSpeedZeroOptimizerStage3Mon(DeepSpeedZeroOptimizerMon): + def __init__(self, torch_opt): + super().__init__(torch_opt) + self.stage = '3' + self.bit16_groups = torch_opt.fp16_groups + self.fp32_flat_groups = torch_opt.fp32_partitioned_groups_flat + self.param2group = self.get_group_index() + + def param_not_in_partition(self, param, group_index): + """Each param partioned across all zero ranks""" + return False + + def get_position(self, lp_param, group_idx): + param_id = self.torch_opt.get_param_id(lp_param) + return self.torch_opt.grad_position[param_id][1:] + + def get_grad_for_param(self, lp_param, group_idx, param_id): + return self.torch_opt.averaged_gradients[group_idx][param_id] class OptimizerMonFactory: _optimizer_mon_map = { - "FP32Optimizer": MegatronFP32OptimizerMon, + "FP32Optimizer": OptimizerMon, "Float16OptimizerWithFloat16Params": MixPrecisionOptimizerMon, "DistributedOptimizer": MegatronDistributedOptimizerMon, "ChainedDistributedOptimizer": MegatronChainedDistributedOptimizerMon, @@ -353,7 +313,7 @@ class OptimizerMonFactory: "BF16_Optimizer": DeepSpeedZeroOptimizerStage0Mon, "DeepSpeedZeroOptimizer": DeepSpeedZeroOptimizerStage1or2Mon, "DeepSpeedZeroOptimizer_Stage3": DeepSpeedZeroOptimizerStage3Mon, - "Adam": DummyOptimizerMon + "Adam": OptimizerMon } @staticmethod @@ -362,6 +322,7 @@ class OptimizerMonFactory: optimizer_class = optimizer.__class__.__name__ if optimizer_class == "ChainedOptimizer": optimizer_class = "Chained" + optimizer.chained_optimizers[0].__class__.__name__ + logger.info(f'The optimizer type is {optimizer_class}') - optimizer_mon_class = OptimizerMonFactory._optimizer_mon_map.get(optimizer_class, DummyOptimizerMon) - return optimizer_mon_class(), optimizer_class + optimizer_mon_class = OptimizerMonFactory._optimizer_mon_map.get(optimizer_class, OptimizerMon) + return optimizer_mon_class(optimizer) diff --git 
a/debug/accuracy_tools/msprobe/pytorch/monitor/utils.py b/debug/accuracy_tools/msprobe/pytorch/monitor/utils.py index 3ca2409b9c8ed22d2b5a68683acb26856d5fa448..3ec0735e57a7b252782f31af23d68f7ed29958fd 100644 --- a/debug/accuracy_tools/msprobe/pytorch/monitor/utils.py +++ b/debug/accuracy_tools/msprobe/pytorch/monitor/utils.py @@ -22,7 +22,7 @@ import re import torch -from msprobe.core.common.const import MonitorConst, Const +from msprobe.core.common.const import MonitorConst from msprobe.pytorch.common.log import logger from msprobe.core.common.utils import is_int from msprobe.core.common.file_utils import check_file_or_directory_path, recursive_chmod @@ -43,7 +43,6 @@ DIRECTORY_MAX_LENGTH = 4096 beijing_tz = timezone(timedelta(hours=8)) MVResult = namedtuple('MVResult', ("exp_avg", "exp_avg_sq", "update", "ratio")) -MVGradResult = namedtuple('MVGradResult', ("exp_avg", "exp_avg_sq", "update", "ratio", "grad")) class MsgConst: @@ -102,6 +101,9 @@ def validate_ops(ops): default_op = MonitorConst.OP_LIST[0] valid_ops.append(default_op) logger.info_on_rank_0(f"There is no valid ops, default op {default_op} is used") + # Add the default shape and dtype parameters + if "shape" not in valid_ops and "dtype" not in valid_ops: + valid_ops.extend(["shape", "dtype"]) return valid_ops diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_dump_graph.py b/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_dump_graph.py deleted file mode 100644 index 9a5f80205371599f7e43ac5dae8880eea56b233e..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_dump_graph.py +++ /dev/null @@ -1,337 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from typing import Dict, List, Set, Optional, Tuple, Callable -from enum import Enum -from dataclasses import dataclass -from collections import defaultdict, deque - -from msprobe.core.common.log import logger -from msprobe.pytorch.nan_analyse.api_info import APIInfo -from msprobe.pytorch.nan_analyse.pre_process_dump_data import process_on_all_ranks - - -class NodeType(Enum): - COMPUTE = "compute" - SEND = "send" - RECV = "recv" - COLLECTIVE = "collective" - - -class EdgeType(Enum): - SEQUENTIAL = "sequential" - COMMUNICATION = "communication" - - -@dataclass -class Node: - node_id: str # (rank_id:api_name) - rank: int - api_info: APIInfo - node_type: NodeType - - def __hash__(self): - return hash(self.node_id) - - def __eq__(self, other): - return isinstance(other, Node) and self.node_id == other.node_id - - def __str__(self): - return self.node_id - - -class Edge: - def __init__(self, src: Node, dst: Node, edge_type: EdgeType = EdgeType.SEQUENTIAL): - self.src = src - self.dst = dst - self.edge_type = edge_type - self.edge_id = self.__generate_edge_name() - - def __generate_edge_name(self): - return f'{self.src.node_id}_{self.dst.node_id}' - - -class DistributedComputeGraph: - def __init__(self): - self.nodes: Dict[str, Node] = {} - self.edges: Dict[str, Edge] = {} - self.adj_list: Dict[Node, List[Node]] = defaultdict(list) - self.rank_to_nodes: Dict[int, List[Node]] = {} - # Track in-degree counts - self.in_degrees: Dict[Node, int] = defaultdict(int) - - def add_node(self, node: Node): - self.nodes[node.node_id] = node - if not self.rank_to_nodes.get(node.rank): -
self.rank_to_nodes[node.rank] = [] - self.rank_to_nodes[node.rank].append(node) - - def add_edge(self, src: Node, dst: Node, edge_type: EdgeType = EdgeType.SEQUENTIAL): - edge = Edge(src, dst, edge_type) - # Deduplicate edges - if self.edges.get(edge.edge_id): - return - self.edges[edge.edge_id] = edge - self.adj_list[src].append(dst) - # Update the in-degree - self.in_degrees[dst] += 1 - - def get_node(self, node_id: str) -> Optional[Node]: - return self.nodes.get(node_id) - - def get_nodes_by_rank(self, rank_id: int) -> List[Node]: - return self.rank_to_nodes.get(rank_id, []) - - def get_start_nodes(self) -> List[Node]: - """Get all nodes with in-degree 0, or the first node on each rank""" - start_nodes = [node for node in self.nodes.values() if self.in_degrees[node] == 0] - if not start_nodes: - return self._get_first_nodes() - return start_nodes - - def _get_first_nodes(self): - first_nodes = [] - for rank in self.rank_to_nodes.keys(): - first_nodes.extend(self.__get_first_node_by_rank(rank)) - return first_nodes - - def __get_first_node_by_rank(self, rank): - nodes = self.rank_to_nodes.get(rank, []) - if not nodes: - return [] - return nodes[:1] - - -class GraphBuilder: - @staticmethod - def create_node(rank: int, api_info: APIInfo) -> Node: - node_id = f"{rank}:{api_info.api_name}" - - if api_info.is_communication_op: - if "send" in api_info.api_name.lower(): - node_type = NodeType.SEND - elif "recv" in api_info.api_name.lower(): - node_type = NodeType.RECV - else: - node_type = NodeType.COLLECTIVE - else: - node_type = NodeType.COMPUTE - - return Node(node_id, rank, api_info, node_type) - - @staticmethod - def build_graph(rank_ops_data: Dict[int, Dict]) -> DistributedComputeGraph: - graph = DistributedComputeGraph() - - # Step 1: Create all nodes - rank_nodes: Dict[int, List[Node]] = {} - for rank, ops in rank_ops_data.items(): - rank_nodes[rank] = [] - for _, api_info in ops.items(): - node = GraphBuilder.create_node(rank, api_info) - graph.add_node(node) - rank_nodes[rank].append(node) - - # Step 2: Connect sequential
operations within each rank - for _, nodes in rank_nodes.items(): - for i in range(len(nodes) - 1): - graph.add_edge(nodes[i], nodes[i + 1], EdgeType.SEQUENTIAL) - - # Step 3: Connect communication operations between ranks - GraphBuilder._connect_p2p_operations(graph, rank_nodes) - GraphBuilder._connect_collective_operations(graph, rank_nodes) - - return graph - - @staticmethod - def _connect_p2p_operations(graph: DistributedComputeGraph, rank_nodes: Dict[int, List[Node]]): - match_list = [] - - for nodes in rank_nodes.values(): - match_list.extend(node for node in nodes if node.node_type in (NodeType.SEND, NodeType.RECV)) - - for node in match_list: - if not node.api_info.pg: - continue - - for rank in node.api_info.pg: - if rank == node.api_info.cur_rank: - continue - - for candi_node in graph.get_nodes_by_rank(rank): - if GraphBuilder._match_comm_ops(node, candi_node): - graph.add_edge(node, candi_node, EdgeType.COMMUNICATION) - break - - @staticmethod - def _connect_collective_operations(graph: DistributedComputeGraph, rank_nodes: Dict[int, List[Node]]): - collective_groups: Dict[str, List[Node]] = defaultdict(list) - - # Group collective operations by their process group - for nodes in rank_nodes.values(): - for node in nodes: - if node.node_type == NodeType.COLLECTIVE: - group_key = GraphBuilder._get_process_group_key(node.api_info) - collective_groups[group_key].append(node) - - # Connect nodes in the same collective operation - for group in collective_groups.values(): - for i, node_i in enumerate(group): - for j, node_j in enumerate(group): - if i >= j: - continue - graph.add_edge(node_i, node_j, EdgeType.COMMUNICATION) - graph.add_edge(node_j, node_i, EdgeType.COMMUNICATION) # Bidirectional for collectives - - @staticmethod - def _match_comm_ops(no1: Node, no2: Node) -> bool: - return no1.api_info == no2.api_info - - @staticmethod - def _get_process_group_key(api_info: APIInfo) -> str: - return api_info.process_group_id - - -class SortStrategy(Enum): - 
CALL_INDEX = "call_index" - RANK = "rank" - API_NAME = "api_name" - - -class GraphTraversal: - - @staticmethod - def sort_levels(levels: List[List[Node]], strategy: SortStrategy = SortStrategy.CALL_INDEX) -> List[List[Node]]: - """ - Sort the nodes within each level - Args: - levels: result of the level-order traversal - strategy: sorting strategy - Returns: - sorted_levels: the sorted levels - """ - sort_key = GraphTraversal._get_sort_key(strategy) - return [sorted(level, key=sort_key) for level in levels] - - @staticmethod - def bfs_by_level(graph: DistributedComputeGraph) -> List[List[Node]]: - """ - Traverse the graph level by level with BFS and return the list of nodes on each level - Args: - graph: the distributed compute graph - Returns: - levels: a list of per-level node lists - """ - start_nodes = graph.get_start_nodes() - if not start_nodes: - return [[]] - - # Record visited nodes and the level each one is on - visited = {} # mapping of node -> level - current_level = 0 - levels = [[]] # the initial level holds the start nodes - queue = deque() # queue of (node, level) pairs - - for n in start_nodes: - visited[n] = 0 - levels[0].append(n) - queue.append((n, 0)) - - while queue: - node, level = queue.popleft() - - # On reaching a new level, create a new level list - if level > current_level: - current_level = level - levels.append([]) - - # Visit adjacent nodes - for neighbor in graph.adj_list[node]: - # If the neighbor has not been visited, or it was encountered at a deeper level - if neighbor not in visited or visited[neighbor] > level + 1: - visited[neighbor] = level + 1 - queue.append((neighbor, level + 1)) - # Add the node to the list for its level - if len(levels) <= level + 1: - levels.append([]) - if neighbor not in levels[level + 1]: - levels[level + 1].append(neighbor) - - return levels - - @staticmethod - def get_node_info(node: Node) -> str: - """ - Get detailed node information for debugging and printing - """ - return (f"Node(id={node.node_id}, rank={node.rank}, call_index={node.api_info.call_index}, " - f"type={node.node_type.value})") - - @staticmethod - def print_levels_info(levels: List[List[Node]]): - """ - Print the node information for each level - """ - logger.info("Level visit results:") - for i, level in enumerate(levels): - logger.info(f"level {i}:") - for node in level: - logger.info(f"node: {GraphTraversal.get_node_info(node)}") - - @staticmethod - def print_cycles_info(cycles:
Set[Tuple[Node, Node]]): - """ - Print information about the detected cycles - """ - logger.info("\nDetected cycles:") - for source, target in cycles: - logger.info(f"Cycle: {GraphTraversal.get_node_info(source)} -> {GraphTraversal.get_node_info(target)}") - - @staticmethod - def _get_sort_key(strategy: SortStrategy) -> Callable[[Node], any]: - """Get the sort key function based on the sorting strategy""" - if strategy == SortStrategy.CALL_INDEX: - return lambda node: (node.api_info.call_index, node.rank) - elif strategy == SortStrategy.RANK: - return lambda node: node.rank - elif strategy == SortStrategy.API_NAME: - return lambda node: node.api_info.api_name - else: - return lambda node: node.api_info.call_index # Default to call_index - - -def traverse_graph(graph: DistributedComputeGraph, sort_strategy: SortStrategy = SortStrategy.CALL_INDEX): - levels, cycles = GraphTraversal.bfs_by_level(graph), set() - sorted_levels = GraphTraversal.sort_levels(levels, sort_strategy) - - GraphTraversal.print_levels_info(sorted_levels) - GraphTraversal.print_cycles_info(cycles) - - return levels, cycles - - -def main(): - file_path = 'test_data/all_reduce_data' - # Load your data as before - data = process_on_all_ranks(file_path) - - # Build the graph - graph = GraphBuilder.build_graph(data) - - # Traverse the graph - _, _ = traverse_graph(graph) - - -if __name__ == '__main__': - main() diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_pp_partition.py b/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_pp_partition.py deleted file mode 100644 index 59e6952ce6a16260f035289f2af42d0746912436..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyze_pp_partition.py +++ /dev/null @@ -1,172 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from collections import defaultdict -from typing import Dict, List, Set, Optional - -from msprobe.core.common.log import logger -from msprobe.pytorch.nan_analyse.api_info import APIInfo -from msprobe.pytorch.nan_analyse.pre_process_dump_data import process_on_all_ranks -from msprobe.pytorch.nan_analyse.utils import singleton - - -MAX_RECURSIVE_DEPTH = 100 - - -def __is_send_op(op_name: str) -> bool: - if op_name.startswith('Distributed.') and 'send.' in op_name: - return True - return False - - -def __is_recv_op(op_name: str) -> bool: - if op_name.startswith('Distributed.') and 'recv.' 
in op_name: - return True - return False - - -def _is_first_send_op(op_name: str) -> bool: - if __is_send_op(op_name) and 'send.0' in op_name: - return True - return False - - -def _is_first_recv_op(op_name: str) -> bool: - if __is_recv_op(op_name) and 'recv.0' in op_name: - return True - return False - - -@singleton -class PPAnalyzer: - def __init__(self, rank_data: Dict[int, dict]): - # Initialize the rank_to_data dict, rank_id --> dump_data - self.rank_to_data = rank_data - self.rank_to_stage = {} # stores the pipeline stage of each rank - self.send_recv_pairs = defaultdict(list) # stores send-recv pairing information - - @staticmethod - def _find_start_ranks(rank_graph: Dict[int, Set[int]]) -> List[int]: - """Find ranks with no incoming edges (the starting ranks of the pipeline)""" - all_ranks = set(rank_graph.keys()) - target_ranks = set() - for ranks in rank_graph.values(): - target_ranks.update(ranks) - return list(all_ranks - target_ranks) - - @staticmethod - def _get_target_rank(op_info: APIInfo) -> Optional[int]: - """Extract the target rank from a send operation""" - kwargs = op_info.input_kwargs - if 'dst' in kwargs: - return int(kwargs['dst'].get('value')) - return None - - @staticmethod - def _get_source_rank(op_info: APIInfo) -> Optional[int]: - """Extract the source rank from a recv operation""" - kwargs = op_info.input_kwargs - if 'src' in kwargs: - return kwargs['src'].get('value') - return None - - def get_pp_stages(self) -> Dict[int, List[int]]: - """Get the ranks contained in each stage""" - stages = defaultdict(list) - for rank, stage in self.rank_to_stage.items(): - stages[stage].append(rank) - return dict(stages) - - def analyze(self): - self.analyze_send_recv() - self.determine_pp_stages() - - def analyze_send_recv(self): - """Analyze the send and recv operations of all ranks""" - rank_data = self.rank_to_data - for cur_rank, data in rank_data.items(): - self._analyze_cur_rank(cur_rank, data) - - def determine_pp_stages(self): - """Determine which pipeline stage each rank belongs to""" - # Build the dependency graph between ranks - rank_graph = defaultdict(set) - for rank, pairs in self.send_recv_pairs.items(): - for op_type, other_rank in pairs: - if op_type == 'send': -
rank_graph[rank].add(other_rank) - - # No send/recv operations: all ranks belong to the same stage - if not rank_graph: - all_ranks = set(self.rank_to_data.keys()) - for rank in all_ranks: - self.rank_to_stage[rank] = 0 - return - - # Determine stages using topological ordering - visited = set() - - def dfs(rank_id: int, stage: int): - if stage >= MAX_RECURSIVE_DEPTH: - raise ValueError("Recursive depth exceeds the limit") - - if rank_id in visited: - return - visited.add(rank_id) - self.rank_to_stage[rank_id] = stage - - # Visit every next rank - for next_rank in rank_graph[rank_id]: - dfs(next_rank, stage + 1) - - # Find the starting ranks (nodes with in-degree 0) as the first PP stage - start_ranks = self._find_start_ranks(rank_graph) - for start_rank in start_ranks: - dfs(start_rank, 0) - - def _analyze_cur_rank(self, cur_rank: int, data: Dict[str, APIInfo]): - if not data: - return - - for op_name, op_info in data.items(): - if _is_first_send_op(op_name): - target_rank = self._get_target_rank(op_info) - if target_rank is None or target_rank < cur_rank: # Only add send operations to ranks greater than cur_rank, so that all of them are forward - continue - self.send_recv_pairs[cur_rank].append(('send', target_rank)) - - # recv communication operators are not collected; analysis uses send data only, and recv operators are used for validation - elif _is_first_recv_op(op_name): - source_rank = self._get_source_rank(op_info) - if source_rank is None: - continue - - -def main(): - file_path = 'test_data/send_recv' - data = process_on_all_ranks(file_path) - - # Analyze PP stages - analyzer = PPAnalyzer(data) - analyzer.analyze() - - pp_stages = analyzer.get_pp_stages() - - logger.info("Pipeline Parallel Stages:") - for stage, ranks in sorted(pp_stages.items()): - logger.info(f"Stage {stage}: Ranks {sorted(ranks)}") - - -if __name__ == "__main__": - main() diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyzer.py b/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyzer.py deleted file mode 100644 index 9afe1dc16a674817c7d5fc066c4d2be81c1cc3d2..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/analyzer.py +++ /dev/null @@ -1,65 +0,0 @@ -# Copyright (c) 
2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from msprobe.pytorch.nan_analyse.analyze_dump_graph import GraphBuilder, GraphTraversal -from msprobe.pytorch.nan_analyse.pre_process_dump_data import process_on_all_ranks - - -class HeapDumpAnalyzer: - def __init__(self, dump_file_path): - """Initialize the analyzer - Args: - dump_file_path (str): path to the heap dump file - """ - self.dump_file_path = dump_file_path - self.processed_data = None - self.analysis_results = None - self.graph = None - self.visited_levels = None - - def pre_process(self): - """Preprocess the dump file - Returns: - the processed data structure - """ - self.processed_data = process_on_all_ranks(self.dump_file_path) - self.graph = GraphBuilder.build_graph(self.processed_data) - - def analyze_graph(self): - """Analyze the preprocessed data - Returns: - the analysis results - """ - if self.processed_data is None or self.graph is None: - raise ValueError("Data or graph is not processed yet") - self.visited_levels = GraphTraversal.bfs_by_level(self.graph) - - def post_process(self): - """Get the analysis results""" - self.analysis_results = GraphTraversal.sort_levels(self.visited_levels) - - def apply(self): - """Run the full analysis pipeline""" - self.pre_process() - - self.analyze_graph() - - self.post_process() - return self.analysis_results - - -if __name__ == "__main__": - analyzer = HeapDumpAnalyzer("test_data/send_recv") - results = analyzer.apply() - GraphTraversal.print_levels_info(results) diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/api_info.py 
b/debug/accuracy_tools/msprobe/pytorch/nan_analyse/api_info.py deleted file mode 100644 index 17fdc88a0abadc3a8e9e3cb12012eb0a6d3d9213..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/api_info.py +++ /dev/null @@ -1,164 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from dataclasses import dataclass - -from typing import Dict, List, Union, Any - -from msprobe.core.common.const import Const -from msprobe.core.overflow_check.filter import IgnoreFilter -from msprobe.pytorch.nan_analyse.utils import singleton, has_nan_inf, generate_hash - - -def is_comm_api_name_match(bench_api_name, cmp_api_name): - if 'send' in bench_api_name and 'recv' in cmp_api_name: - return True - if 'recv' in bench_api_name and 'send' in cmp_api_name: - return True - return bench_api_name == cmp_api_name - - -@dataclass -class APIInfo: - input_kwargs: Dict - output_data: List[Dict] - api_name: str - torch_api_name: str - input_args: List[Dict] - call_index: int - is_communication_op: bool - - cur_rank: int - process_group_id = str - - def __init__(self, api_name, input_args=None, input_kwargs=None, output_data=None, call_index=0, cur_rank=None): - self.input_kwargs = input_kwargs - self.output_data = output_data - self.api_name = api_name - self.input_args = input_args - self.call_index = call_index - self.cur_rank = cur_rank - self.torch_api_name = 
self.__extract_torch_api(self.api_name) - self.is_communication_op = self.__is_communication_operators() - self.pg, self.process_group_id = self.__generate_pg_id() - - def __eq__(self, other): - if not self.is_communication_op or not other.is_communication_op: - return False - - if not is_comm_api_name_match(self.torch_api_name, other.torch_api_name): - return False - - if self.torch_api_name != other.torch_api_name: - return False - if self.process_group_id != other.process_group_id: - return False - return True - - @staticmethod - def __extract_torch_api(api_name) -> str: - """ - Process tensor api name to extract first two fields in lowercase. - """ - # Empty string checking - if not api_name.strip(): - return "" - - parts = api_name.split(Const.SEP) - - # Handle different cases based on number of parts - if len(parts) == 0: - return "" - elif len(parts) == 1: - return parts[0].lower() - else: - return Const.SEP.join(parts[:2]).lower() - - def __is_communication_operators(self) -> bool: - # Define keywords for communication operators, covering operations such as all_reduce, send, broadcast, etc. - # Should be read from the wrap file; hard-coded here for now - communication_keywords = [ - 'send', # send operator - 'recv', # recv operator - 'broadcast', # broadcast operator - 'all_reduce', # all_reduce operator - 'reduce', # reduce operator - 'all_gather', # all_gather operator - 'gather', # gather operator - 'isend', # isend operator - 'irecv', # irecv operator - 'scatter', # scatter operator - 'reduce_scatter', # reduce_scatter operator - '_reduce_scatter_base', # _reduce_scatter_base operator - '_all_gather_base', # _all_gather_base operator - 'all_to_all_single', # all_to_all_single operator - 'all_to_all', # all_to_all operator - 'all_gather_into_tensor', # all_gather_into_tensor operator - 'reduce_scatter_tensor' # reduce_scatter_tensor operator - ] - - # Whether the name starts with Distributed and the operator name contains one of the communication operators above - return (any(keyword in self.api_name for keyword in communication_keywords) or - self.api_name.startswith('Distributed.')) - - def __generate_pg_id(self): - if not self.is_communication_op: - return None, None - - process_group: List[int] = [] - if 'send' in 
self.api_name: - dst = int(self.input_kwargs.get('dst', {}).get('value')) - process_group.extend([self.cur_rank, dst]) - elif 'recv' in self.api_name: - src = int(self.input_kwargs.get('src', {}).get('value')) - process_group.extend([src, self.cur_rank]) - else: - process_group.extend(self.input_kwargs.get('group_ranks', [])) - - # For now, use the call count directly and ignore pg matching - call_cnt = self.api_name.split('.')[-2] - fmt = f'{call_cnt}_{str(process_group)}' - - return process_group, generate_hash(fmt) - - -@singleton -class AnomalyDetector: - def __init__(self): - self._filter = IgnoreFilter() - - @staticmethod - def _has_anomaly(data: Union[Dict, Any]) -> bool: - return has_nan_inf(data) - - def has_input_anomaly(self, api_data) -> bool: - """Check whether the inputs contain anomalies (including args and kwargs)""" - # args - args_anomaly = any(self._has_anomaly(x) for x in api_data.input_args if isinstance(x, dict)) - # kwargs - kwargs_anomaly = any(self._has_anomaly(x) for x in api_data.input_kwargs.values() if isinstance(x, dict)) - return args_anomaly or kwargs_anomaly - - def has_output_anomaly(self, api_data) -> bool: - """Check whether the outputs contain anomalies""" - return any(self._has_anomaly(x) for x in api_data.output_data if isinstance(x, dict)) - - def has_overflow(self, data: APIInfo) -> bool: - # No nan/inf in inputs or outputs, so no overflow - if not (self.has_input_anomaly(data) or self.has_output_anomaly(data)): - return False - # Whether it is a real overflow that affects the computation result - if self._filter.apply_filter(data): - return False - return True diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/pre_process_dump_data.py b/debug/accuracy_tools/msprobe/pytorch/nan_analyse/pre_process_dump_data.py deleted file mode 100644 index 73815f8b922d3c6e7ff5d7a8505524ad61a4ee15..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/pre_process_dump_data.py +++ /dev/null @@ -1,100 +0,0 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. -# All rights reserved. 
-# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -import os -import re -from typing import Any, Dict -from collections import OrderedDict - -from msprobe.core.common.const import Const -from msprobe.core.common.log import logger -from msprobe.core.common.file_utils import load_json -from msprobe.pytorch.nan_analyse.api_info import APIInfo, AnomalyDetector - - -def _create_api_info(api_name: str, _data: Dict, call_index: int = 0, cur_rank: int = 0) -> APIInfo: - """Create an APIInfo instance from raw data""" - return APIInfo( - api_name=api_name, - input_args=_data.get(Const.INPUT_ARGS, []), - input_kwargs=_data.get(Const.INPUT_KWARGS, {}), - output_data=_data.get(Const.OUTPUT, []), - call_index=call_index, - cur_rank=cur_rank - ) - - -def extract_essential_operators(dump_data: Any, cur_rank: int, common_overflow_num=5): - """ - To reduce memory usage, keep only overflow operators, communication operators, etc., for the subsequent graph construction - """ - # Extract communication operators and operators with overflow problems such as nan from the data; store the results in an ordered dict - # Compare OrderedDict performance with list+dict to decide whether to rework this - extract_opts = OrderedDict() - detector = AnomalyDetector() # Singleton, no extra memory usage - cnt = 0 - index = 0 - for api_name, value in dump_data.get('data', {}).items(): - api_info = _create_api_info(api_name, value, call_index=index, cur_rank=cur_rank) - index += 1 - - is_overflow, is_comm_op = detector.has_overflow(api_info), api_info.is_communication_op - if cnt < common_overflow_num and is_overflow: - extract_opts[api_name] = api_info - cnt += 1 - continue - - return extract_opts - - -def process_on_all_ranks(base_path: str): - all_rank_ops_data = {} - - # Get all rank directories 
- for rank_dir in os.listdir(base_path): - rank_path = os.path.join(base_path, rank_dir) - if not os.path.isdir(rank_path) or not rank_dir.startswith('rank'): - logger.warning(f"{rank_dir} is not a valid rank directory.") - continue - - dump_file = os.path.join(rank_path, 'dump.json') - if not os.path.exists(dump_file): - logger.warning(f"{dump_file} does not exist for {rank_dir}") - continue - - rank_id = get_rank_id(rank_dir) - dump_data = load_json(dump_file) - op_list = extract_essential_operators(dump_data, rank_id) - - if op_list: - all_rank_ops_data[rank_id] = op_list - else: - logger.warning(f"No essential operators found for {rank_id}") - - return all_rank_ops_data - - -def get_rank_id(rank_dir: str) -> int: - match = re.search(r'rank(\d+)', rank_dir) - - if not match: - raise ValueError(f"Invalid rank directory: {rank_dir}") - return int(match.group(1)) - - -if __name__ == '__main__': - file_path = 'test_data/all_reduce_data' - - data = process_on_all_ranks(file_path) - logger.info(data) diff --git a/debug/accuracy_tools/msprobe/pytorch/pt_config.py b/debug/accuracy_tools/msprobe/pytorch/pt_config.py index d704d7c1d45f44f1821b3891bbcaf34ac00c619d..2ddfaf7b3012292da7231f6a0ebd42d6d6b7d92a 100644 --- a/debug/accuracy_tools/msprobe/pytorch/pt_config.py +++ b/debug/accuracy_tools/msprobe/pytorch/pt_config.py @@ -43,6 +43,7 @@ class TensorConfig(BaseConfig): self.tls_path = json_config.get("tls_path", "./") self.online_run_ut_recompute = json_config.get("online_run_ut_recompute", False) self.check_config() + self._check_summary_mode() self._check_file_format() if self.online_run_ut: self._check_online_run_ut() @@ -82,9 +83,8 @@ class StatisticsConfig(BaseConfig): self.check_config() self._check_summary_mode() - def _check_summary_mode(self): - if self.summary_mode and self.summary_mode not in ["statistics", "md5"]: - raise Exception("summary_mode is invalid") + self.tensor_list = json_config.get("tensor_list", []) + 
self._check_str_list_config(self.tensor_list, "tensor_list") class OverflowCheckConfig(BaseConfig): @@ -256,8 +256,6 @@ class RunUTConfig(BaseConfig): self.port = json_config.get("port", -1) self.rank_list = json_config.get("rank_list", Const.DEFAULT_LIST) self.tls_path = json_config.get("tls_path", "./") - self.master_ip = json_config.get("master_ip", "127.0.0.1") - self.master_port = json_config.get("master_port", "8888") self.check_run_ut_config() @classmethod @@ -284,19 +282,6 @@ class RunUTConfig(BaseConfig): def check_tls_path_config(cls, tls_path): if tls_path: FileChecker(tls_path, FileCheckConst.DIR, FileCheckConst.READ_ABLE).common_check() - - @classmethod - def check_master_ip_config(cls, master_ip): - if not re.match(Const.ipv4_pattern, master_ip): - raise Exception("master_ip: %s is invalid" % master_ip) - - @classmethod - def check_master_port_config(cls, master_port): - if not isinstance(master_port, str) or not master_port.isdigit(): - raise Exception(f"port: {master_port} is invalid. Port must be a numeric string.") - port_number = int(master_port) - if not (0 < port_number <= 65535): - raise Exception(f"port: {master_port} is invalid. 
Port range must be between 1 and 65535.") def check_run_ut_config(self): RunUTConfig.check_filter_list_config(Const.WHITE_LIST, self.white_list) @@ -304,8 +289,6 @@ class RunUTConfig(BaseConfig): RunUTConfig.check_error_data_path_config(self.error_data_path) RunUTConfig.check_nfs_path_config(self.nfs_path) RunUTConfig.check_tls_path_config(self.tls_path) - RunUTConfig.check_master_ip_config(self.master_ip) - RunUTConfig.check_master_port_config(self.master_port) class GradToolConfig(BaseConfig): diff --git a/debug/accuracy_tools/msprobe/pytorch/service.py b/debug/accuracy_tools/msprobe/pytorch/service.py index 79631026266c7ad3bfe8660a86d2ab067b8f0c1d..06cd9143ffce8a154f67620966b2e93eeaadd944 100644 --- a/debug/accuracy_tools/msprobe/pytorch/service.py +++ b/debug/accuracy_tools/msprobe/pytorch/service.py @@ -15,22 +15,24 @@ import functools import os -from collections import namedtuple, defaultdict +from collections import defaultdict import torch + from msprobe.core.common.const import Const from msprobe.core.common.exceptions import DistributedNotInitializedError from msprobe.core.common.file_utils import create_directory -from msprobe.core.common.utils import print_tools_ends_info, DumpPathAggregation +from msprobe.core.common.utils import print_tools_ends_info, DumpPathAggregation, replace_last_occurrence from msprobe.core.data_dump.data_collector import build_data_collector from msprobe.core.data_dump.data_processor.base import ModuleForwardInputsOutputs, ModuleBackwardInputsOutputs from msprobe.core.data_dump.scope import BaseScope +from msprobe.core.data_dump.api_registry import ApiRegistry from msprobe.pytorch.api_accuracy_checker.common.utils import ApiData from msprobe.pytorch.common.log import logger from msprobe.pytorch.common.utils import get_rank_if_initialized, is_recomputation from msprobe.pytorch.dump.kernel_dump.kernel_config import create_kernel_config_json from msprobe.pytorch.dump.module_dump.module_processer import ModuleProcesser -from 
msprobe.pytorch.hook_module.api_register import get_api_register +from msprobe.pytorch.hook_module.api_register import get_api_register, ApiTemplate from msprobe.pytorch.hook_module.hook_module import HOOKModule from msprobe.pytorch.hook_module.jit_script_wrapper import wrap_jit_script_func from msprobe.pytorch.hook_module.register_optimizer_hook import register_optimizer_hook @@ -39,8 +41,6 @@ torch_version_above_or_equal_2 = torch.__version__.split('+')[0] >= '2.0' if torch_version_above_or_equal_2: from msprobe.pytorch.api_accuracy_checker.tensor_transport_layer.dump_dispatch import run_ut_dispatch -HookFn = namedtuple('hookFn', ['pre_hook', 'forward_hook', 'backward_hook', 'forward_hook_torch_version_below_2']) - class Service: def __init__(self, config): @@ -63,25 +63,27 @@ class Service: # Register early to make sure as many API hooks as possible are registered self.api_register = get_api_register() self.register_api_hook() - self.init_for_debug_level() + self.currrent_step_first_debug_save = True + self.debug_variable_counter = None + self.ori_customer_func = {} def build_hook(self, module_type, name): - def pre_hook(api_or_module_name, module, args, kwargs): - if not self.should_execute_hook(module_type, module, True): - return args, kwargs + def pre_hook(api_or_module_name, module, args, kwargs=None): + kwargs = {} if kwargs is None else kwargs + + if module_type == BaseScope.Module_Type_Module or \ + not self.should_execute_hook(module_type, module, True): + return is_recompute = is_recomputation() self.inner_switch = True - if module_type == BaseScope.Module_Type_Module: - api_or_module_name = module.mindstudio_reserved_name[-1] - else: - module.forward_data_collected = True - HOOKModule.add_module_count(name) + module.forward_data_collected = True + HOOKModule.add_module_count(name) self.data_collector.update_api_or_module_name(api_or_module_name) if self.config.online_run_ut: self.inner_switch = False - return None, None + return if self.data_collector: module_input_output = 
ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=None) self.data_collector.forward_input_data_collect( @@ -93,7 +95,6 @@ class Service: ) self.inner_switch = False - return args, kwargs def grad_hook(module, ori_name, param_name): def hook_fn(grad): @@ -140,10 +141,12 @@ class Service: # Record that the parameter gradient info of the current module has been given a placeholder self.params_grad_info[grad_name] = True - def forward_hook(api_or_module_name, module, args, kwargs, output): + def forward_hook(api_or_module_name, module, args, kwargs_or_output, output_or_kwargs=None): if not self.should_execute_hook(module_type, module, True): return None is_recompute = is_recomputation() + kwargs = kwargs_or_output if torch_version_above_or_equal_2 else {} + output = output_or_kwargs if torch_version_above_or_equal_2 else kwargs_or_output self.inner_switch = True if self.config.online_run_ut: @@ -163,9 +166,8 @@ class Service: return None module_input_output = ModuleForwardInputsOutputs(args=args, kwargs=kwargs, output=output) + self.data_collector.update_api_or_module_name(api_or_module_name) if module_type == BaseScope.Module_Type_Module: - api_or_module_name = module.mindstudio_reserved_name[-1] - self.data_collector.update_api_or_module_name(api_or_module_name) params_dict = {} if self.config.task != Const.STRUCTURE: params_dict = { @@ -189,7 +191,6 @@ class Service: ) init_params_grad_info(module, params_dict) else: - self.data_collector.update_api_or_module_name(api_or_module_name) self.data_collector.forward_output_data_collect( api_or_module_name, module, @@ -205,17 +206,12 @@ class Service: self.inner_switch = False return output - def forward_hook_torch_version_below_2(api_or_module_name, module, args, output): - return forward_hook(api_or_module_name, module, args, {}, output) - def backward_hook(api_or_module_name, module, grad_input, grad_output): if not self.should_execute_hook(module_type, module, False): return is_recompute = is_recomputation() self.inner_switch = True - if module_type == 
BaseScope.Module_Type_Module: - api_or_module_name = module.mindstudio_reserved_name[-1] self.data_collector.update_api_or_module_name(api_or_module_name) if self.config.online_run_ut: @@ -235,19 +231,15 @@ class Service: self.inner_switch = False pid = os.getpid() - full_forward_name = None - full_backward_name = None + full_forward_name = name if module_type == BaseScope.Module_Type_API: full_forward_name = name + str(HOOKModule.get_module_count(name)) + Const.SEP + Const.FORWARD - full_backward_name = name + str(HOOKModule.get_module_count(name)) + Const.SEP + Const.BACKWARD + full_backward_name = replace_last_occurrence(full_forward_name, Const.FORWARD, Const.BACKWARD) pre_forward_hook_fn = functools.partial(pre_hook, full_forward_name) forward_hook_fn = functools.partial(forward_hook, full_forward_name) backward_hook_fn = functools.partial(backward_hook, full_backward_name) - forward_hook_torch_version_below_2_fn = functools.partial( - forward_hook_torch_version_below_2, - full_forward_name - ) - return HookFn(pre_forward_hook_fn, forward_hook_fn, backward_hook_fn, forward_hook_torch_version_below_2_fn) + + return pre_forward_hook_fn, forward_hook_fn, backward_hook_fn def start(self, model): self.current_iter = self.loop + self.init_step @@ -294,19 +286,18 @@ class Service: if self.config.online_run_ut and torch_version_above_or_equal_2: run_ut_dispatch(self.attl, False, self.config.online_run_ut_recompute) return - if self.config.async_dump and self.config.task == Const.TENSOR: + if self.config.async_dump and self.config.task in [Const.STATISTICS, Const.TENSOR]: self.data_collector.data_processor.dump_async_data() self.data_collector.write_json() def step(self): - if self.config.level == Const.LEVEL_DEBUG: - return if self.should_stop_service: return - if self.config.async_dump and self.config.task == Const.TENSOR: + if self.config.async_dump and self.config.task in [Const.STATISTICS, Const.TENSOR]: self.data_collector.data_processor.dump_async_data() 
self.data_collector.write_json() self.loop += 1 + self.currrent_step_first_debug_save = True self.reset_status() def need_stop_service(self): @@ -353,16 +344,20 @@ class Service: dump_dir = os.path.join(self.dump_iter_dir, f"rank{cur_rank}") create_directory(dump_dir) - if self.config.task in self.data_collector.tasks_need_tensor_data: + + dump_data_dir = None + if self.config.task in self.data_collector.tasks_need_tensor_data or ( + self.config.task == Const.STATISTICS and self.config.tensor_list): dump_data_dir = os.path.join(dump_dir, "dump_tensor_data") create_directory(dump_data_dir) - else: - dump_data_dir = None dump_path_aggregation = DumpPathAggregation() - dump_path_aggregation.dump_file_path = os.path.join(dump_dir, "dump.json") - dump_path_aggregation.stack_file_path = os.path.join(dump_dir, "stack.json") - dump_path_aggregation.construct_file_path = os.path.join(dump_dir, "construct.json") + if self.config.level != Const.LEVEL_DEBUG: + dump_path_aggregation.dump_file_path = os.path.join(dump_dir, "dump.json") + dump_path_aggregation.stack_file_path = os.path.join(dump_dir, "stack.json") + dump_path_aggregation.construct_file_path = os.path.join(dump_dir, "construct.json") + else: + dump_path_aggregation.debug_file_path = os.path.join(dump_dir, "debug.json") dump_path_aggregation.dump_tensor_data_dir = dump_data_dir dump_path_aggregation.free_benchmark_file_path = os.path.join(dump_dir, "free_benchmark.csv") self.data_collector.update_dump_paths(dump_path_aggregation) @@ -380,6 +375,7 @@ class Service: def register_module_hook(self): if self.config.level in [Const.LEVEL_L0, Const.LEVEL_MIX]: logger.info_on_rank_0(f"The module {self.config.task} hook function is successfully mounted to the model.") + ModuleProcesser.enable_module_dump = True self.module_processor.register_module_hook(self.model, self.build_hook) def attl_init(self): @@ -427,36 +423,24 @@ class Service: if self.config.rank and self.current_rank not in self.config.rank: return - def 
init_for_debug_level(self): - if not (self.config.level == Const.LEVEL_DEBUG and self.config.task in [Const.TENSOR, Const.STATISTICS]): + def save(self, variable, name, save_backward): + if self.config.level != Const.LEVEL_DEBUG: return - try: - self.current_rank = get_rank_if_initialized() - except DistributedNotInitializedError: - self.current_rank = None - # dir: dump_path -- rank{} -- debug.json - self.dump_iter_dir = self.config.dump_path - cur_rank = self.current_rank if self.current_rank is not None else '' - dump_dir = os.path.join(self.dump_iter_dir, f"rank{cur_rank}") - create_directory(dump_dir) - if self.config.task in self.data_collector.tasks_need_tensor_data: - dump_data_dir = os.path.join(dump_dir, "dump_tensor_data") - create_directory(dump_data_dir) - else: - dump_data_dir = None + self.current_iter = self.loop + self.init_step + if self.config.step and self.current_iter not in self.config.step: + return - dump_path_aggregation = DumpPathAggregation() - dump_path_aggregation.dump_tensor_data_dir = dump_data_dir - dump_path_aggregation.debug_file_path = os.path.join(dump_dir, "debug.json") - self.data_collector.update_dump_paths(dump_path_aggregation) - self.data_collector.initialize_json_file(framework=Const.PT_FRAMEWORK) + if self.currrent_step_first_debug_save: + try: + self.current_rank = get_rank_if_initialized() + except DistributedNotInitializedError: + self.current_rank = None - self.debug_variable_counter = defaultdict(int) + self.create_dirs() + self.debug_variable_counter = defaultdict(int) + self.currrent_step_first_debug_save = False - def save(self, variable, name, save_backward): - if self.config.level != Const.LEVEL_DEBUG: - return count = self.debug_variable_counter[name] self.debug_variable_counter[name] += 1 @@ -469,3 +453,13 @@ class Service: # backward save if save_backward: self.data_collector.debug_data_collect_backward(variable, grad_name_with_count) + + def register_custom_api(self, module, api_name, api_prefix): + 
self.ori_customer_func[str(module) + Const.SEP + api_name] = getattr(module, api_name) + ApiRegistry.register_custom_api(module, api_name, api_prefix, + functools.partial(self.build_hook, BaseScope.Module_Type_API), ApiTemplate) + + def restore_custom_api(self, module, api): + ori_func = self.ori_customer_func.get(str(module) + Const.SEP + api) + if ori_func: + setattr(module, api, ori_func) diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/builder/graph_builder.py b/debug/accuracy_tools/msprobe/pytorch/visualization/builder/graph_builder.py deleted file mode 100644 index f623a48ae3b9607103b4af63bd8838d3d13c8a0b..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/builder/graph_builder.py +++ /dev/null @@ -1,84 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from ..graph.graph import Graph -from ..graph.node_op import NodeOp -from ..utils import load_json_file, load_data_json_file, save_json_file, GraphConst -from .msprobe_adapter import get_input_output - - -class GraphBuilder: - @staticmethod - def build(construct_path, data_path, model_name='DefaultModel'): - """ - GraphBuilder's public graph-building method - Args: - construct_path: path to construct.json - data_path: path to dump.json - model_name: model name, supplied externally - Returns: Graph, the data structure representing the graph - """ - construct_dict = load_json_file(construct_path) - data_dict = load_data_json_file(data_path) - graph = Graph(model_name) - GraphBuilder._init_nodes(graph, construct_dict, data_dict) - return graph - - @staticmethod - def to_json(filename, graph_n, graph_b=None, tool_tip=None): - """ - Interface for exporting a graph to a .vis file - Args: - filename: output file path - graph_n: Graph - graph_b: bench Graph; if empty, only graph_n is output; otherwise both graphs are output as the comparison result - tool_tip: tips output in comparison mode - """ - result = {} - if graph_b: - result[GraphConst.JSON_NPU_KEY] = graph_n.to_dict() - result[GraphConst.JSON_BENCH_KEY] = graph_b.to_dict() - else: - result = graph_n.to_dict() - if tool_tip: - result[GraphConst.JSON_TIP_KEY] = tool_tip - save_json_file(filename, result) - - @staticmethod - def _init_nodes(graph, construct_dict, data_dict): - for subnode_id, upnode_id in construct_dict.items(): - if upnode_id: - upnode_op = NodeOp.get_node_op(upnode_id) - upnode = GraphBuilder._create_or_get_node(graph, data_dict, upnode_op, upnode_id) - else: - upnode = graph.root - node_op = NodeOp.get_node_op(subnode_id) - GraphBuilder._create_or_get_node(graph, data_dict, node_op, subnode_id, upnode) - - @staticmethod - def _create_or_get_node(graph, data_dict, op, name, upnode=None): - if name in graph.node_map: - node = graph.get_node(name) - else: - graph.add_node(op, name, upnode) - node = graph.get_node(name) - node_data = data_dict.get(name, {}) - # Add input and output data - input_data, output_data = get_input_output(node_data, node.id) - # Update data - node.set_input_output(input_data, output_data) - 
# Add the node - node.add_upnode(upnode) - return node \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/builder/msprobe_adapter.py b/debug/accuracy_tools/msprobe/pytorch/visualization/builder/msprobe_adapter.py deleted file mode 100644 index 7ea0dfabedf7c482975094abdd981baa1afeb44e..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/builder/msprobe_adapter.py +++ /dev/null @@ -1,185 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import re -from ...compare.acc_compare import read_op, merge_tensor, get_accuracy, _do_multi_process -from ....core.common.utils import task_dumppath_get -from ..utils import GraphConst - - -# Rules for parsing a node name into the corresponding NodeOp -op_patterns = [ - r'^(Module)', #NodeOp.module - r'^(Tensor|Torch|Functional|NPU|VF|Distributed|Aten)' #NodeOp.function_api -] - - -def get_compare_mode(dump_path_param): - """ - Get the comparison mode: summary, MD5, or real data - Args: - dump_path_param: parameters required by the acc_compare interface - Returns: 0 summary mode, 1 md5 mode, 2 true data mode - """ - summary_compare, md5_compare = task_dumppath_get(dump_path_param) - if summary_compare: - compare_mode = GraphConst.SUMMARY_COMPARE - elif md5_compare: - compare_mode = GraphConst.MD5_COMPARE - else: - compare_mode = GraphConst.REAL_DATA_COMPARE - return compare_mode - - -def run_real_data(dump_path_param, csv_path): - """ - Run multi-process generation of real data - Args: - dump_path_param: parameters required by the acc_compare interface - csv_path: path of the generated file - """ - return _do_multi_process(dump_path_param, csv_path) - - -def get_input_output(node_data, node_id): - """ - Split the raw dump data into output data and input data - Args: - node_data: dump data belonging to a single node - node_id: node name - """ - input_data = {} - output_data = {} - op_parsed_list = read_op(node_data, node_id) - for item in op_parsed_list: - full_op_name = item.get('full_op_name', '') - if not full_op_name: - continue - splits = full_op_name.split('.') - if len(splits) <= GraphConst.OUTPUT_INDEX: - continue - if 'output' in splits[GraphConst.OUTPUT_INDEX]: - output_data[full_op_name] = item - else: - input_data[full_op_name] = item - return input_data, output_data - - -def compare_data(data_dict_list1, data_dict_list2): - """ - Compare whether the results output by get_input_output are structurally consistent; return True if they are - """ - if len(data_dict_list1) != len(data_dict_list2): - return False - # Key fields used to compare whether two nodes are equal - tag_keys = ['type', 'dtype', 'shape'] - for key1, key2 in zip(data_dict_list1, data_dict_list2): - dict1 = data_dict_list1[key1] - dict2 = data_dict_list2[key2] - for tag_key in tag_keys: - 
tag_value1 = dict1.get(tag_key, None) - tag_value2 = dict2.get(tag_key, None) - if tag_value1 != tag_value2: - return False - return True - - -def format_node_data(data_dict): - """ - 批量进行节点数据的输出 - """ - del_list = ['requires_grad', 'data_name', 'full_op_name'] - for _, value in data_dict.items(): - if not isinstance(value, dict): - continue - for item in del_list: - if item in value: - del value[item] - _format_data(value) - return data_dict - - -def compare_node(node_ids, data_dicts, stack_json_data, is_summary_compare, is_md5_compare): - """ - 调用acc_compare.py中的get_accuracy获得精度对比指标 - 真实数据对比模式无法获得精度对比指标,需要调用多进程比对接口 - Returns: 包含参数信息和对比指标(真实数据对比模式除外)的list - """ - merge_n = _parse_node(node_ids[0], data_dicts[0], stack_json_data, is_summary_compare, is_md5_compare) - merge_b = _parse_node(node_ids[1], data_dicts[1], stack_json_data, is_summary_compare, is_md5_compare) - result = [] - get_accuracy(result, merge_n, merge_b, is_summary_compare, is_md5_compare) - return result - - -def _parse_node(node_id, data_dict, stack_json_data, is_summary_compare, is_md5_compare): - """ - 转换节点,使其能够作为acc_compare.py中的get_accuracy的入参 - """ - op_parsed_list = read_op(data_dict.get(node_id, {}), node_id) - if node_id in stack_json_data: - op_parsed_list.append( - {'full_op_name': node_id, 'full_info': stack_json_data[node_id]}) - else: - op_parsed_list.append({'full_op_name': node_id, 'full_info': None}) - result = merge_tensor(op_parsed_list, is_summary_compare, is_md5_compare) - if not result: - result['op_name'] = [] - return result - - -def _format_decimal_string(s): - """ - 使用正则表达式匹配包含数字、小数点和可选的百分号的字符串 - """ - pattern = re.compile(r'\d{1,20}\.\d{1,20}%?') - matches = pattern.findall(s) - for match in matches: - is_percent = match.endswith('%') - number_str = match.rstrip('%') - decimal_part = number_str.split('.')[1] - # 如果小数位数大于6,进行处理 - if len(decimal_part) > GraphConst.ROUND_TH: - number_float = float(number_str) - formatted_number = f"{number_float:.{GraphConst.ROUND_TH}f}" - 
# 如果原来是百分数,加回百分号 - if is_percent: - formatted_number += '%' - # 替换原字符串中的数值部分 - s = s.replace(match, formatted_number) - return s - - -def _format_data(data_dict): - """ - 格式化数据,小数保留6位,处理一些异常值 - """ - pattern = r'^[+-]?(\d+(.\d*)?|.\d+)([eE][+-]?\d+)$' - for key, value in data_dict.items(): - if isinstance(value, str): - # 将单引号删掉,None换成null避免前端解析错误 - value = value.replace("'", "").replace('None', 'null') - value = _format_decimal_string(value) - elif value is None or value == ' ': - value = 'null' - # 科学计数法1.123123123123e-11,格式化为1.123123e-11 - elif isinstance(value, float) and len(str(value)) < GraphConst.STR_MAX_LEN and re.match(pattern, str(value)): - value = "{:.6e}".format(value) - elif isinstance(value, float): - value = round(value, GraphConst.ROUND_TH) - # Inf会走入这里,确保转成Inf。另外给其他不符合预期的类型做兜底方案 - if not isinstance(value, (list, tuple, dict, str)): - value = str(value) - data_dict[key] = value diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/compare/graph_comparator.py b/debug/accuracy_tools/msprobe/pytorch/visualization/compare/graph_comparator.py deleted file mode 100644 index 3d5f2972468adab8a436167d2f50eab9ace05873..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/compare/graph_comparator.py +++ /dev/null @@ -1,104 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from ..builder.msprobe_adapter import compare_node, get_compare_mode, run_real_data -from ..utils import GraphConst, load_json_file, load_data_json_file, get_csv_df -from ..graph.graph import Graph -from .mode_adapter import ModeAdapter - - -class GraphComparator: - def __init__(self, graphs, data_paths, stack_path, output_path): - self.graph_n = graphs[0] - self.graph_b = graphs[1] - self._parse_param(data_paths, stack_path, output_path) - - def compare(self): - """ - 比较函数,初始化结束后单独调用。比较结果写入graph_n - """ - self._compare_nodes(self.graph_n.root) - self._postcompare() - - def add_compare_result_to_node(self, node, compare_result_list): - """ - 将比对结果添加到节点的输入输出数据中 - Args: - node: 节点 - compare_result_list: 包含参数信息和对比指标(真实数据对比模式除外)的list - """ - # 真实数据比对,先暂存节点,在多进程对比得到精度指标后,再将指标添加到节点中 - if self.ma.prepare_real_data(node): - return - compare_in_dict = {} - compare_out_dict = {} - # input和output对比数据分开 - for item in compare_result_list: - if 'output' in item[0]: - compare_out_dict[item[0]] = item - else: - compare_in_dict[item[0]] = item - precision_status, precision_index, other_dict = self.ma.parse_result(node, [compare_in_dict, compare_out_dict]) - node.data[GraphConst.JSON_STATUS_KEY] = precision_status - node.data[GraphConst.JSON_INDEX_KEY] = precision_index - node.data.update(other_dict) - if not precision_status: - self.ma.add_error_key(node.output_data) - node.get_suggestions() - - def _parse_param(self, data_paths, stack_path, output_path): - self.dump_path_param = { - 'npu_json_path': data_paths[0], - 'bench_json_path': data_paths[1], - 'stack_json_path': stack_path, - 'is_print_compare_log': True - } - self.output_path = output_path - compare_mode = get_compare_mode(self.dump_path_param) - self.ma = ModeAdapter(compare_mode) - self.data_n_dict = load_data_json_file(data_paths[0]) - self.data_b_dict = load_data_json_file(data_paths[1]) - self.stack_json_data = load_json_file(stack_path) - - def _postcompare(self): - if not self.ma.is_real_data_compare(): - return 
- df = get_csv_df(self.ma.is_md5_compare(), self.ma.is_summary_compare(), True, self.ma.csv_data) - df = run_real_data(self.dump_path_param, df) - compare_data_dict = {row[0]: row.tolist() for _, row in df.iterrows()} - for node in self.ma.compare_nodes: - precision_status, precision_index, _ = self.ma.parse_result(node, [compare_data_dict]) - node.data[GraphConst.JSON_STATUS_KEY] = precision_status - node.data[GraphConst.JSON_INDEX_KEY] = precision_index - if not precision_status: - self.ma.add_error_key(node.output_data) - node.get_suggestions() - - def _compare_nodes(self, node_n): - #递归遍历NPU树中的节点,如果在Bench中找到具有相同名称的节点,检查他们的祖先和参数信息,检查一致则及逆行精度数据对比 - #这里采用先序遍历,好处在于当这个节点被比较时,他的先序已经被匹配,这可以为后续的模糊匹配提供重要信息 - node_b, ancestors = Graph.match(self.graph_n, node_n, self.graph_b) - if node_b: - ancestors.append(node_b.id) - node_n.add_link(node_b, ancestors) - # 真实数据比对只会得到基本信息,并没有精度指标,需要调用多进程对比接口 - compare_result_list = compare_node([node_n.id, node_b.id], [self.data_n_dict, self.data_b_dict], - self.stack_json_data, self.ma.is_summary_compare(), - self.ma.is_md5_compare()) - if compare_result_list: - self.ma.add_csv_data(compare_result_list) - self.add_compare_result_to_node(node_n, compare_result_list) - for subnode in node_n.subnodes: - self._compare_nodes(subnode) diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/compare/mode_adapter.py b/debug/accuracy_tools/msprobe/pytorch/visualization/compare/mode_adapter.py deleted file mode 100644 index d58f2078b6f8996a31c2f830ef5adf79bc7948c3..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/compare/mode_adapter.py +++ /dev/null @@ -1,211 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import json -from ....core.common.const import CompareConst, Const -from ..utils import ToolTip, GraphConst, str2float - - -class ModeAdapter: - def __init__(self, compare_mode): - self.compare_mode = compare_mode - self.csv_data = [] - self.compare_nodes = [] - - @staticmethod - def _add_md5_compare_data(node_data, compare_data_dict): - precision_status = True - for key, value in node_data.items(): - if not isinstance(value, dict): - continue - compare_data = compare_data_dict.get(key) - if compare_data: - key_list = [GraphConst.JSON_MD5_KEY] - headers = CompareConst.MD5_COMPARE_RESULT_HEADER - id_list = [headers.index(x) for x in key_list] - ModeAdapter._match_data(value, compare_data, key_list, id_list) - # md5比对是否通过 - if value.get(GraphConst.JSON_MD5_KEY) != CompareConst.PASS: - precision_status = False - node_data[key] = value - return precision_status - - @staticmethod - def _add_real_compare_data(node_data, compare_data_dict): - min_thousandth = float(1) - numbers = [] - for key, value in node_data.items(): - if not isinstance(value, dict): - continue - compare_data = compare_data_dict.get(key) - if compare_data: - key_list = [CompareConst.COSINE, CompareConst.MAX_ABS_ERR, CompareConst.MAX_RELATIVE_ERR, - CompareConst.ONE_THOUSANDTH_ERR_RATIO, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO] - headers = CompareConst.COMPARE_RESULT_HEADER - id_list = [headers.index(x) for x in key_list] - ModeAdapter._match_data(value, compare_data, key_list, id_list) - # 获取一个节点所有的输入或输出最小的双千指标 - thousandth = value.get(CompareConst.ONE_THOUSANDTH_ERR_RATIO) - # 
可能是None,可能是非数字内容str - try: - thousandth = float(thousandth) - except (ValueError, TypeError): - thousandth = None - if thousandth is not None: - numbers.append(thousandth) - node_data[key] = value - # 双千指标都是None的异常情况 - if not numbers: - min_thousandth = None - else: - min_thousandth = min(numbers + [min_thousandth]) - return min_thousandth - - @staticmethod - def _add_summary_compare_data( node_data, compare_data_dict): - precision_status = True - max_relative_err = 0 - for key, value in node_data.items(): - if not isinstance(value, dict): - continue - compare_data = compare_data_dict.get(key) - if compare_data: - # 对应比对结果csv的列 - key_list = [CompareConst.MAX_DIFF, CompareConst.MIN_DIFF, CompareConst.MEAN_DIFF, - CompareConst.NORM_DIFF, CompareConst.MAX_RELATIVE_ERR, CompareConst.MIN_RELATIVE_ERR, - CompareConst.MEAN_RELATIVE_ERR, CompareConst.NORM_RELATIVE_ERR] - headers = CompareConst.SUMMARY_COMPARE_RESULT_HEADER - id_list = [headers.index(x) for x in key_list] - ModeAdapter._match_data(value, compare_data, key_list, id_list) - # 相对误差大于0.5疑似有精度问题,小值域1e-3不比较相对误差 - for index, item in enumerate(key_list[4:]): - value_diff = value.get(key_list[index]) - if isinstance(value_diff, float) and value_diff != 0 and abs(value_diff) < GraphConst.SMALL_VALUE: - value[item] = ToolTip.SMALL_VALUE_TIP.format(key_list[index]) - continue - relative_err = str2float(value.get(item)) - max_relative_err = max(max_relative_err, relative_err) - node_data[key] = value - if max_relative_err > GraphConst.MAX_RELATIVE_ERR_TH: - precision_status = False - max_relative_err = 1 if max_relative_err > 1 else max_relative_err - precision_index = 1 - max_relative_err - return precision_status, precision_index - - @staticmethod - def _match_data(data_dict, compare_data, key_list, id_list): - """ - 绑定精度指标到node的input_data和output_data - """ - if len(key_list) != len(id_list): - return - for id, key in zip(id_list, key_list): - data = compare_data[id] - if data is not None and 'nan' not in str(data) 
and str(data) != ' ': - data_dict[key] = data - else: - data_dict[key] = 'null' - - def parse_result(self, node, compare_data_dict): - """ - 根据结果返回数据,分别是precision_status,precision_index,和附加数据 - """ - other_dict = {} - if self.is_md5_compare(): - precision_status_in = ModeAdapter._add_md5_compare_data(node.input_data, compare_data_dict[0]) - precision_status_out = ModeAdapter._add_md5_compare_data(node.output_data, compare_data_dict[1]) - # 所有输入输出md5对比通过,这个节点才算通过 - precision_status = precision_status_in and precision_status_out - precision_index = 1 if precision_status else 0 - other_result = CompareConst.PASS if precision_status else CompareConst.DIFF - other_dict[GraphConst.JSON_MD5_KEY] = other_result - elif self.is_summary_compare(): - precision_status_in, precision_index_in = ModeAdapter._add_summary_compare_data(node.input_data, compare_data_dict[0]) - precision_status_out, precision_index_out = ModeAdapter._add_summary_compare_data(node.output_data, compare_data_dict[1]) - precision_status = precision_status_in and precision_status_out - precision_index = min(precision_index_in, precision_index_out) - else: - min_thousandth_in = ModeAdapter._add_real_compare_data(node.input_data, compare_data_dict[0]) - min_thousandth_out = ModeAdapter._add_real_compare_data(node.output_data, compare_data_dict[0]) - if min_thousandth_in and min_thousandth_out: - change_percentage = abs(min_thousandth_in - min_thousandth_out) - else: - change_percentage = 0 - precision_status = True - if change_percentage > GraphConst.REAL_DATA_TH: - precision_status = False - precision_index = 0 if change_percentage > 1 else 1 - change_percentage - return precision_status, precision_index, other_dict - - def prepare_real_data(self, node): - """ - 为真实数据比较模式准备节点信息 - """ - if self.is_real_data_compare(): - self.compare_nodes.append(node) - return True - return False - - def is_summary_compare(self): - return self.compare_mode == GraphConst.SUMMARY_COMPARE - - def is_md5_compare(self): - return 
self.compare_mode == GraphConst.MD5_COMPARE - - def is_real_data_compare(self): - return self.compare_mode == GraphConst.REAL_DATA_COMPARE - - def add_csv_data(self, compare_result_list): - if not self.is_real_data_compare(): - return - self.csv_data.extend(compare_result_list) - - def add_error_key(self, node_data): - """ - 根据不同的模式进行提供不同错误信息 - """ - for key, value in node_data.items(): - if not isinstance(value, dict): - continue - if self.is_summary_compare(): - message = [CompareConst.MAX_RELATIVE_ERR, CompareConst.MIN_RELATIVE_ERR, - CompareConst.MEAN_RELATIVE_ERR, CompareConst.NORM_RELATIVE_ERR] - elif self.is_real_data_compare(): - message = [CompareConst.ONE_THOUSANDTH_ERR_RATIO, CompareConst.FIVE_THOUSANDTHS_ERR_RATIO] - else: - # 输出件优化 - message = [] - value[GraphConst.ERROR_KEY] = message - node_data[key] = value - - def get_tool_tip(self): - """ - 用于前端展示字段的具体含义 - """ - if self.is_summary_compare(): - tips = { - CompareConst.MAX_DIFF: ToolTip.MAX_DIFF, - CompareConst.MIN_DIFF: ToolTip.MIN_DIFF, - CompareConst.MEAN_DIFF: ToolTip.MEAN_DIFF, - CompareConst.NORM_DIFF: ToolTip.NORM_DIFF} - elif self.is_md5_compare(): - tips = {Const.MD5: ToolTip.MD5} - else: - tips = { - CompareConst.ONE_THOUSANDTH_ERR_RATIO: ToolTip.ONE_THOUSANDTH_ERR_RATIO, - CompareConst.COSINE: ToolTip.COSINE, - CompareConst.MAX_ABS_ERR: ToolTip.MAX_ABS_ERR, - CompareConst.MAX_RELATIVE_ERR: ToolTip.MAX_RELATIVE_ERR} - return tips diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/base_node.py b/debug/accuracy_tools/msprobe/pytorch/visualization/graph/base_node.py deleted file mode 100644 index f04f367f591244a6d1ed48529d1fb4aae7cb2453..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/base_node.py +++ /dev/null @@ -1,107 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. 
-# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from .node_op import NodeOp -from ..utils import Suggestions, GraphConst -from ..builder.msprobe_adapter import format_node_data, compare_data - - -class BaseNode: - def __init__(self, node_op, node_id, up_node=None): - self.op = node_op - self.id = node_id - self.data = {} - self.output_data = {} - self.input_data = {} - self.upnode = None - self.add_upnode(up_node) - self.subnodes = [] - self.matched_node_link = [] - self.suggestions = {} - - def __str__(self): - info = f'id:\t{self.id}' - return info - - def __eq__(self, other): - """ - 用来判断两个节点是否可以被匹配上,认为结构上是否一致 - """ - if not compare_data(self.input_data, other.input_data): - return False - if not compare_data(self.output_data, other.output_data): - return False - return True - - def get_suggestions(self): - """ - 精度疑似有问题时,提供一些建议 - """ - if self.op == NodeOp.module: - self.suggestions[GraphConst.SUGGEST_KEY] = Suggestions.Module - self.suggestions[Suggestions.PTDBG] = Suggestions.PTDBG_URL - elif self.op == NodeOp.function_api: - self.suggestions[GraphConst.SUGGEST_KEY] = Suggestions.API - self.suggestions[Suggestions.API_ACCURACY_CHECKER] = Suggestions.API_ACCURACY_CHECKER_URL - - def set_input_output(self, input_data, output_data): - self.input_data = input_data - self.output_data = output_data - - def add_upnode(self, node): - """ - 绑定upnode,用于对两个节点进行上下级关联 - """ - if not node or node.id == self.id or self.upnode: - return - self.upnode = node - 
node.subnodes.append(self) - - def add_link(self, node, ancestors): - """ - 在节点匹配成功后进行匹配数据的录入 - Args: - node: 和self相互匹配的节点 - ancestors: 对面节点的祖先信息 - """ - self.matched_node_link = ancestors - node.matched_node_link = ancestors - - def to_dict(self): - """ - 输出数据 - """ - result = {} - result['id'] = self.id - result['node_type'] = self.op.value - result['data'] = self.data - result['output_data'] = format_node_data(self.output_data) - result['input_data'] = format_node_data(self.input_data) - result['upnode'] = self.upnode.id if self.upnode else 'None' - result['subnodes'] = [node.id for node in self.subnodes] - result['matched_node_link'] = self.matched_node_link - result['suggestions'] = self.suggestions - return result - - def get_ancestors(self): - """ - 获取节点所有祖先的列表 - """ - ancestors = [] - current_node = self.upnode - while current_node: - ancestors.append(current_node.id) - current_node = current_node.upnode - return list(reversed(ancestors)) diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/graph.py b/debug/accuracy_tools/msprobe/pytorch/visualization/graph/graph.py deleted file mode 100644 index 6bae10ad3fc8a041d3ef2e8fb707d40a22b42f19..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/graph.py +++ /dev/null @@ -1,86 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from .base_node import BaseNode -from .node_op import NodeOp -from ..utils import GraphConst - - -class Graph: - def __init__(self, model_name): - self.node_map = {} - self.add_node(NodeOp.module, model_name) - self.root = self.get_node(model_name) - - def __str__(self): - infos = [f'{str(self.node_map.get(node_id))}' for node_id in self.node_map] - info = "\n".join(infos) - return info - - @staticmethod - def match(graph_n, node_n, graph_b): - """ - 给定节点n,在另一个graph中匹配它对应的节点。前置条件是它的父节点匹配已经完成 - 目前采用完全匹配的方式,后续可能在这里加入一定的模糊匹配逻辑 - 返回匹配结果,匹配到的节点,以及祖先树。没匹配到则返回None, [] - """ - if not node_n or node_n.id not in graph_b.node_map: - return None, [] - node_b = graph_b.node_map.get(node_n.id) - if node_n != node_b: - return None, [] - ancestors_n = node_n.get_ancestors() - ancestors_b = node_b.get_ancestors() - if ancestors_n != ancestors_b: - return None, [] - return node_b, ancestors_n - - @staticmethod - def dfs(node, result): - info = node.to_dict() - result[node.id] = info - for subnode in node.subnodes: - Graph.dfs(subnode, result) - - def add_node(self, node_op, node_id, up_node=None): - """ - 在graph中进行节点的添加 - Args: - node_op: 需要添加的节点类型 - node_id: 需要添加的节点id - up_node:对应节点的父节点 - """ - if node_id in self.node_map: - return - node = BaseNode(node_op, node_id, up_node) - self.node_map[node_id] = node - - def get_node(self, node_id): - """ - 返回节点,不存在返回None - """ - return self.node_map.get(node_id, None) - - def to_dict(self): - """ - 用于数据输出 - """ - result = {} - result[GraphConst.JSON_ROOT_KEY] = self.root.id if self.root else 'None' - result[GraphConst.JSON_NODE_KEY] = {} - for node_id in self.node_map: - info = self.node_map.get(node_id).to_dict() - result[GraphConst.JSON_NODE_KEY][node_id] = info - return result diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/node_op.py b/debug/accuracy_tools/msprobe/pytorch/visualization/graph/node_op.py deleted file mode 100644 index 
1629caabd1989beac72646ea36efb4a82b328f3a..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/node_op.py +++ /dev/null @@ -1,37 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from enum import Enum -import re -from ..builder.msprobe_adapter import op_patterns - - -class NodeOp(Enum): - module = 0 - function_api = 1 - - @staticmethod - def get_node_op(node_name: str): - """ - 基于代表节点的字符串,解析节点种类 - """ - for op in NodeOp: - index = op.value - if index < 0 or index >= len(op_patterns): - raise Exception("NodeOp and op_patterns in MsprobeAdapter do not match") - pattern = op_patterns[index] - if re.match(pattern, node_name): - return op - raise Exception(f"Cannot parse node_name {node_name} into NodeOp") diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/test.py b/debug/accuracy_tools/msprobe/pytorch/visualization/test.py deleted file mode 100644 index 165d54ce17ed295308c7fa52b4dc5251271453a8..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/test.py +++ /dev/null @@ -1,85 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import os -import time -import shutil -import filecmp -from .compare.graph_comparator import GraphComparator -from .utils import GraphConst -from .builder.graph_builder import GraphBuilder -from ...pytorch.common.log import logger -from ...core.common.file_check import create_directory - - -def compare_graph(dump_path_n, dump_path_b, out_path): - # 对两个数据进行构图 - construct_path_n = os.path.join(dump_path_n, GraphConst.CONSTRUCT_FILE) - construct_path_b = os.path.join(dump_path_b, GraphConst.CONSTRUCT_FILE) - data_path_n = os.path.join(dump_path_n, GraphConst.DUMP_FILE) - data_path_b = os.path.join(dump_path_b, GraphConst.DUMP_FILE) - graph_n = GraphBuilder.build(construct_path_n, data_path_n, 'TestNet') - graph_b = GraphBuilder.build(construct_path_b, data_path_b, 'TestNet') - # 基于graph、stack和data进行比较 - stack_path = os.path.join(dump_path_n, GraphConst.STACK_FILE) - graph_comparator = GraphComparator([graph_n, graph_b], [data_path_n, data_path_b], stack_path, out_path) - graph_comparator.compare() - output_path = os.path.join(out_path, 'compare.vis') - GraphBuilder.to_json(output_path, graph_n, graph_b, graph_comparator.ma.get_tool_tip()) - - -def build_graph(dump_path, out_path): - construct_path = os.path.join(dump_path, GraphConst.CONSTRUCT_FILE) - data_path = os.path.join(dump_path, GraphConst.DUMP_FILE) - output_path = os.path.join(out_path, 'build.vis') - graph = GraphBuilder.build(construct_path, data_path, 'TestNet') - GraphBuilder.to_json(output_path, graph) - - -def run_st(data_path): - start_time = time.time() - run_bench(data_path, 'output2') - 
end_time = time.time() - logger.info(f'run_st time cost: {end_time - start_time}') - # 比较output2的结果和output1 的bench结果差距 - for data_dir in os.listdir(data_path): - data_dir = os.path.join(data_path, data_dir) - if not os.path.isdir(data_dir): - continue - output1 = os.path.join(data_dir, 'output1') - output2 = os.path.join(data_dir, 'output2') - files = ['build.vis', 'compare.vis'] - for vis_file in files: - file1 = os.path.join(output1, vis_file) - file2 = os.path.join(output2, vis_file) - result = filecmp.cmp(file1, file2) - if result: - logger.info('pass ' + file1) - else: - logger.info('not pass ' + file1) - - -def run_bench(data_path, output_dir): - for data_dir in os.listdir(data_path): - data_dir = os.path.join(data_path, data_dir) - if not os.path.isdir(data_dir): - continue - run_data_path = os.path.join(data_dir, 'data') - output_path = os.path.join(data_dir, output_dir) - if os.path.exists(output_path): - shutil.rmtree(output_path) - create_directory(output_path) - build_graph(run_data_path, output_path) - compare_graph(run_data_path, run_data_path, output_path) diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/utils.py b/debug/accuracy_tools/msprobe/pytorch/visualization/utils.py deleted file mode 100644 index fb046f9758686fe810a05b1a23d76880b86bb994..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/pytorch/visualization/utils.py +++ /dev/null @@ -1,118 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. - -import json -from ...core.common.file_check import FileOpen -from ..compare.acc_compare import result_to_csv - - -def load_json_file(file_path): - """ - 加载json文件 - """ - try: - with FileOpen(file_path, 'r') as f: - file_dict = json.load(f) - if not isinstance(file_dict, dict): - return {} - return file_dict - except json.JSONDecodeError: - return {} - - -def load_data_json_file(file_path): - """ - 加载dump.json中的data字段 - """ - return load_json_file(file_path).get(GraphConst.DATA_KEY, {}) - - -def save_json_file(file_path, data): - """ - 保存json文件 - """ - with FileOpen(file_path, 'w') as f: - f.write(json.dumps(data, indent=4)) - - -def get_csv_df(md5_compare, summary_compare, stack, csv_data): - """ - 调用acc接口写入csv - """ - return result_to_csv(md5_compare, summary_compare, stack, csv_data, None) - - -def str2float(percentage_str): - """ - 百分比字符串转换转换为浮点型 - Args: - percentage_str: '0.00%', '23.4%' - Returns: float 0.00, 0.234 - """ - try: - percentage_str = percentage_str.strip('%') - return float(percentage_str) / 100 - except ValueError: - return 0 - - -class ToolTip: - MAX_DIFF = 'NPU与标杆API统计信息比对,最大值的差值' - MIN_DIFF = 'NPU与标杆API统计信息比对,最小值的差值' - MEAN_DIFF = 'NPU与标杆API统计信息比对,平均值的差值' - NORM_DIFF = 'NPU与标杆API统计信息比对,2范数(平方根)的差值' - MD5 = '数据MD5信息,用于比较两个数据信息是否完全一致' - ONE_THOUSANDTH_ERR_RATIO = 'Tensor中的元素逐个与对应的标杆数据对比,相对误差大于千分之一的比例占总元素个数的比例小于千分之一' - COSINE = '通过计算两个向量的余弦值来判断其相似度,数值越接近于1说明计算出的两个张量越相似,实际可接受阈值为大于0.99。在计算中可能会存在nan,主要由于可能会出现其中一个向量为0' - MAX_ABS_ERR = '当最大绝对误差越接近0表示其计算的误差越小,实际可接受阈值为小于0.001' - MAX_RELATIVE_ERR = '当最大相对误差越接近0表示其计算的误差越小。当dump数据中存在0或Nan时,比对结果中最大相对误差则出现inf或Nan的情况,属于正常现象' - SMALL_VALUE_TIP = '{} 小于1e-3,不计算相对误差' - - -class Suggestions: - Module = '此模块精度比对结果疑似异常,请使用ptdbg工具对模块中的api进行dump比对' - API = '此api精度比对结果疑似异常,请使用api accuracy checker工具对api进行精度检测' - PTDBG = 'ptdbg工具' - PTDBG_URL = 
'https://gitee.com/ascend/att/tree/master/debug/accuracy_tools/ptdbg_ascend' - API_ACCURACY_CHECKER = 'api accuracy checker工具' - API_ACCURACY_CHECKER_URL = 'https://gitee.com/ascend/att/tree/master/debug/accuracy_tools/api_accuracy_checker' - - -class GraphConst: - CONSTRUCT_FILE = 'construct.json' - DUMP_FILE = 'dump.json' - STACK_FILE = 'stack.json' - GRAPH_FILE = 'graph.vis' - ERROR_KEY = 'error_key' - SUMMARY_COMPARE = 0 - MD5_COMPARE = 1 - REAL_DATA_COMPARE = 2 - JSON_NPU_KEY = 'NPU' - JSON_BENCH_KEY = 'Bench' - JSON_TIP_KEY = 'Tooltip' - JSON_MD5_KEY = 'md5 Compare Result' - JSON_ROOT_KEY = 'root' - JSON_NODE_KEY = 'node' - DATA_KEY = 'data' - REAL_DATA_TH = 0.1 - MAX_RELATIVE_ERR_TH = 0.5 - ROUND_TH = 6 - JSON_STATUS_KEY = 'precision_status' - JSON_INDEX_KEY = 'precision_index' - SUGGEST_KEY = 'text' - TAG_NA = 'na' - OUTPUT_INDEX = -2 - STR_MAX_LEN = 50 - SMALL_VALUE = 1e-3 diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/__init__.py b/debug/accuracy_tools/msprobe/test/common_set_up/__init__.py similarity index 100% rename from debug/accuracy_tools/msprobe/pytorch/visualization/__init__.py rename to debug/accuracy_tools/msprobe/test/common_set_up/__init__.py diff --git a/profiler/msprof_analyze/precheck/env_check/environment_variable_check.py b/debug/accuracy_tools/msprobe/test/common_set_up/mindtorch.py similarity index 64% rename from profiler/msprof_analyze/precheck/env_check/environment_variable_check.py rename to debug/accuracy_tools/msprobe/test/common_set_up/mindtorch.py index 58d2becb23266ff085b80d2acd9c17a229e8420d..665d17c21e743fb5ffe6a0d9e014fe0a2da4af99 100644 --- a/profiler/msprof_analyze/precheck/env_check/environment_variable_check.py +++ b/debug/accuracy_tools/msprobe/test/common_set_up/mindtorch.py @@ -1,4 +1,4 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# Copyright (c) 2025-2025, Huawei Technologies Co., Ltd. # All rights reserved. 
# # Licensed under the Apache License, Version 2.0 (the "License"); @@ -12,14 +12,18 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -from msprof_analyze.precheck.env_check.environment_check import SoftwareCheck +from mindspore import Tensor +import torch -class EnvironmentVariableCheck(SoftwareCheck): - CHECK_TYPE = "env_variable" - def __init__(self, **kwargs): - super().__init__(**kwargs) +def create_msa_tensor(data, dtype=None): + return Tensor(data, dtype) - def check(self): - pass + +tensor_tensor = torch.tensor +setattr(torch, 'tensor', create_msa_tensor) + + +def reset_torch_tensor(): + setattr(torch, 'tensor', tensor_tensor) diff --git a/debug/accuracy_tools/msprobe/test/common_set_up/test_set_up.py b/debug/accuracy_tools/msprobe/test/common_set_up/test_set_up.py index 6442908bb0e6dd573c101c4314388aabef4ed5c4..2a711d7b119622520ed125d4bd2437a240d64e7e 100644 --- a/debug/accuracy_tools/msprobe/test/common_set_up/test_set_up.py +++ b/debug/accuracy_tools/msprobe/test/common_set_up/test_set_up.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+import importlib from unittest import TestCase from unittest.mock import MagicMock @@ -24,7 +25,22 @@ except ImportError: distributed = MagicMock() setattr(mint, 'distributed', distributed) +# ensure not to import torch_npu +from msprobe.mindspore import service +from msprobe.mindspore.monitor import common_func + +from .mindtorch import reset_torch_tensor +from msprobe.mindspore.common import utils +from msprobe.mindspore.common.utils import is_mindtorch + +utils.mindtorch_check_result = None +importlib.reload(service) +importlib.reload(common_func) +reset_torch_tensor() + class SetUp(TestCase): def test_case(self): self.assertTrue(hasattr(mint, 'distributed')) + self.assertTrue(is_mindtorch()) + utils.mindtorch_check_result = None diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare.py index 94244be326e9954c700339abec2db16a2ab31b07..ee15d9b06e530f32c5759492a9de40a2ab9cbf46 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare.py @@ -6,15 +6,49 @@ import threading import unittest from unittest.mock import patch +import numpy as np import pandas as pd import torch +from msprobe.core.common.file_utils import load_json from msprobe.core.common.const import CompareConst, Const from msprobe.core.common.utils import CompareException -from msprobe.core.compare.acc_compare import Comparator, ModeConfig -from msprobe.core.compare.highlight import find_error_rows, find_compare_result_error_rows, ApiBatch -from msprobe.core.compare.utils import get_accuracy -from msprobe.pytorch.compare.pt_compare import PTComparator +from msprobe.core.compare.acc_compare import ModeConfig, MappingConfig, MappingDict, Comparator, ParseData, ProcessDf, \ + Match, CreateTable, CalcStatsDiff + +npu_op_item_data_fuzzy = { + 'op_name': 'Functional.conv2d.0.forward.input.0', + 'dtype': 
'torch.float32', + 'shape': [1, 1, 28, 28], + 'summary': [3.029174327850342, -2.926689624786377, -0.06619918346405029], + 'stack_info': [], + 'data_name': 'Functional.conv2d.0.forward.input.0.pt', + 'compare_key': 'Functional.conv2d.0.forward.input.0', + 'compare_shape': [1, 1, 28, 28], +} +npu_op_item_fuzzy = pd.Series(npu_op_item_data_fuzzy) +npu_op_item_data_fuzzy_2 = { + 'op_name': 'Functional.conv2d.0.forward.input.1', + 'dtype': 'torch.float32', + 'shape': [1, 1, 28, 28], + 'summary': [3.029174327850342, -2.926689624786377, -0.06619918346405029], + 'stack_info': [], + 'data_name': 'Functional.conv2d.0.forward.input.1.pt', + 'compare_key': 'Functional.conv2d.0.forward.input.1', + 'compare_shape': [1, 1, 28, 28], +} +npu_op_item_fuzzy_2 = pd.Series(npu_op_item_data_fuzzy_2) +bench_op_item_data_fuzzy = { + 'op_name': 'Functional.conv2d.1.forward.input.0', + 'dtype': 'torch.float32', + 'shape': [1, 1, 28, 28], + 'summary': [3.029174327850342, -2.926689624786377, -0.06619918346405029], + 'stack_info': [], + 'data_name': 'Functional.conv2d.1.forward.input.0.pt', + 'compare_key': 'Functional.conv2d.1.forward.input.0', + 'compare_shape': [1, 1, 28, 28], +} +bench_op_item_fuzzy = pd.Series(bench_op_item_data_fuzzy) npu_dict = {'op_name': ['Functional.conv2d.0.forward.input.0', 'Functional.conv2d.0.forward.input.1', 'Functional.conv2d.0.forward.input.2', 'Functional.conv2d.0.forward.output'], @@ -159,7 +193,8 @@ aten_result = [ -10.640625, -0.008758544921875, 5.397906303405762, -5.796811580657959, 2.5283952709287405e-10, 'Warning', 'Need double check api accuracy.', 'None'], ['Aten__native_batch_norm_legit_functional.default_0_forward.output.1', 'Nan', 'torch.float32', 'Nan', [256], 'Nan', - ' ', ' ', ' ', ' ', ' ', ' ', 0.30550330877304077, -0.24485322833061218, -0.010361209511756897, 'Nan', 'Nan', 'Nan', + ' ', ' ', ' ', ' ', ' ', ' ', 0.30550330877304077, -0.24485322833061218, -0.010361209511756897, 'Nan', 'Nan', + 'Nan', 'Yes', '', 'None'], 
['Aten__native_batch_norm_legit_functional.default_0_forward.output.2', 'Nan', 'torch.float32', 'Nan', [256], 'Nan', ' ', ' ', ' ', ' ', ' ', ' ', 623.9192504882812, 432.96826171875, 520.2276611328125, 'Nan', 'Nan', 'Nan', @@ -173,40 +208,6 @@ aten_result = [ highlight_dict = {'red_rows': [], 'yellow_rows': []} -num_0, num_1, num_2, num_3 = 0, 1, 2, 3 -summary_line_input = ['Functional_batch_norm_0_forward.input.0', 'Functional_batch_norm_0_forward.input.0', - 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.01, 0, 0, 0, 1, 1, 1, 1, 1.01, 1, 1, 1, - 'Yes', ''] -summary_line_1 = ['Functional_batch_norm_0_forward.output.0', 'Functional_batch_norm_0_forward.output.0', - 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 10, 0, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, - 'Warning', ''] -summary_line_2 = ['Functional_batch_norm_0_forward.output.1', 'Functional_batch_norm_0_forward.output.1', - 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.02, 0, 0, 0, 0.12, 0, 1, 1, 0.1, 1, 1, 1, - 'Warning', ''] -summary_line_3 = ['Functional_batch_norm_0_forward.output.2', 'Functional_batch_norm_0_forward.output.2', - 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0, 0, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, - 'Warning', ''] -line_input = ['Functional.batch.norm.0.forward.input.0', 'Functional.batch.norm.0.forward.input.0', 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 1, 0.5, 1, 1, 0.95, 1, - 1, 1, 1, 1, 1.01, 1, 1, 1, - 'Yes', ''] -line_1 = ['Functional.batch.norm.0.forward.output.0', 'Functional.batch.norm.0.forward.output.0', 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.8, 0.5, 1, 1, 0.59, 1, - 'nan', 0, 1, 1, 19, 1, 1, 1, - 'Yes', ''] -line_2 = ['Functional.batch.norm.0.forward.output.1', 'Functional.batch.norm.0.forward.output.1', 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.9, 0.5, 1, 1, 
0.8, 1, - 0, 0.12, 0, 1, 1, 0.1, 1, 1, - 'Yes', ''] -line_3 = ['Functional.batch.norm.0.forward.output.2', 'Functional.batch.norm.0.forward.output.2', 'torch.float16', - 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.8, 0.5, 1.1e+10, 1, 0.85, 1, - 9, 0.12, 0, 1, 1, 0.1, 1, 1, - 'Yes', ''] - op_data = { 'input_args': [{'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [16, 1, 3, 3], 'Max': 0.33033010363578796, 'Min': -0.331031858921051, 'Mean': -0.030964046716690063, @@ -267,6 +268,33 @@ def generate_dump_json(base_dir): json.dump(data, json_file) +def generate_dump_json_md5(base_dir): + data_path = os.path.join(base_dir, 'dump_md5.json') + data = { + 'task': 'statistics', + 'level': 'L1', + 'dump_data_dir': '', + 'data': { + 'Functional.linear.0.forward': { + 'input_args': [ + {'type': 'torch.Tensor', + 'dtype': 'torch.float32', + 'shape': [2, 2], + 'Max': 2, + 'Min': 0, + 'Mean': 1, + 'Norm': 1, + 'requires_grad': False, + 'md5': 123456 + } + ] + } + } + } + with open(data_path, 'w') as json_file: + json.dump(data, json_file) + + def generate_stack_json(base_dir): data_path = os.path.join(base_dir, 'stack.json') data = {'Functional.linear.0.forward': ['File']} @@ -300,145 +328,6 @@ class TestUtilsMethods(unittest.TestCase): if os.path.exists(base_dir3): shutil.rmtree(base_dir3) - def test_get_accuracy_graph_mode(self): - result = [] - get_accuracy(result, npu_dict_aten, bench_dict_functional, dump_mode=Const.SUMMARY) - self.assertEqual(result, aten_result) - - def test_find_error_rows(self): - api_batch = ApiBatch("Functional_batch_norm_0_forward", 0) - api_batch.input_len = 1 - api_batch.output_end_index = 4 - api_batch.params_end_index = 4 - summary_result = [summary_line_input, summary_line_1, summary_line_2, summary_line_3] - highlight_dict_test = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} - find_error_rows(summary_result, api_batch, highlight_dict_test, dump_mode=Const.SUMMARY) - 
self.assertEqual(highlight_dict_test, - {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []}) - - def test_find_compare_result_error_rows(self): - result = [line_input, line_1, line_2, line_3] - result_df = pd.DataFrame(result) - highlight_dict_test = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} - find_compare_result_error_rows(result_df, highlight_dict_test, dump_mode=Const.ALL) - self.assertEqual(highlight_dict_test, { - "red_rows": {1, 3}, - "yellow_rows": {2}, - "red_lines": [ - (1, ["maximum or minimum is nan, -inf, or inf"]), - (3, ["maximum absolute error exceeds 1e+10"]) - ], - "yellow_lines": [ - (2, ["The output's one thousandth err ratio decreases by more than 0.1 compared to the input/parameters's"]), - (3, [ - "maximum absolute error of both input/parameters and output exceed 1, " - "with the output larger by an order of magnitude", - "The output's cosine decreases by more than 0.1 compared to the input/parameters's"]) - ] - }) - - def test_calculate_summary_data(self): - npu_summary_data = [1, 1, 1, 1] - bench_summary_data = [2, 2, 2, 2] - result_item = ['', '', '', '', '', '', '', '', '', '', '', '', '', ''] - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - comparator = Comparator(mode_config) - comparator.calculate_summary_data(npu_summary_data, bench_summary_data, result_item) - self.assertEqual(result_item, - ['', '', '', '', '', '', -1, -1, -1, -1, '50.0%', '50.0%', '50.0%', '50.0%', '', '']) - - bench_summary_data = [0, 0, 0, 0] - result_item = ['', '', '', '', '', '', '', '', '', '', '', '', '', ''] - - comparator.calculate_summary_data(npu_summary_data, bench_summary_data, result_item) - self.assertEqual(result_item, ['', '', '', '', '', '', 1, 1, 1, 1, 'N/A', 'N/A', 'N/A', 'N/A', 'Warning', - 'Need double check api accuracy.']) - - def 
test_make_result_table_stack_mode_True(self): - result_md5 = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', 'File']] - result_summary = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', 'File']] - result_all = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', 'File', '-1']] - columns_md5_stack_mode_true = CompareConst.MD5_COMPARE_RESULT_HEADER + ['NPU_Stack_Info'] - result_table_md5_true = pd.DataFrame(result_md5, columns=columns_md5_stack_mode_true, dtype=object) - columns_summary_stack_mode_true = CompareConst.SUMMARY_COMPARE_RESULT_HEADER + ['NPU_Stack_Info'] - result_table_summary_true = pd.DataFrame(result_summary, columns=columns_summary_stack_mode_true, dtype=object) - columns_all_stack_mode_true = CompareConst.COMPARE_RESULT_HEADER + ['NPU_Stack_Info'] + ['Data_name'] - result_table_all_true = pd.DataFrame(result_all, columns=columns_all_stack_mode_true, dtype=object) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - - dump_mode = Const.MD5 - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_md5) - self.assertTrue(result_df.equals(result_table_md5_true)) - - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_summary) - self.assertTrue(result_df.equals(result_table_summary_true)) - - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_all) - 
self.assertTrue(result_df.equals(result_table_all_true)) - - def test_make_result_table_stack_mode_False(self): - result_md5_test = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '']] - result_md5 = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '']] - result_summary_test = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', '']] - result_summary = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '']] - result_all_test = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', '', '-1']] - result_all = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], '', '', '', '', '', '', - 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', '-1']] - columns_md5_stack_mode_true = CompareConst.MD5_COMPARE_RESULT_HEADER - result_table_md5_true = pd.DataFrame(result_md5, columns=columns_md5_stack_mode_true, dtype='object') - columns_summary_stack_mode_true = CompareConst.SUMMARY_COMPARE_RESULT_HEADER - result_table_summary_true = pd.DataFrame(result_summary, columns=columns_summary_stack_mode_true, - dtype='object') - columns_all_stack_mode_true = CompareConst.COMPARE_RESULT_HEADER + ['Data_name'] - result_table_all_true = pd.DataFrame(result_all, columns=columns_all_stack_mode_true, dtype='object') - - stack_mode = False - auto_analyze = True - fuzzy_match = False - - dump_mode = Const.MD5 - mode_config = 
ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_md5_test) - self.assertTrue(result_df.equals(result_table_md5_true)) - - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_summary_test) - self.assertTrue(result_df.equals(result_table_summary_true)) - - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result_df = Comparator(mode_config).make_result_table(result_all_test) - self.assertTrue(result_df.equals(result_table_all_true)) - def test_gen_merge_list(self): op_data = { 'input_args': [ @@ -469,32 +358,404 @@ class TestUtilsMethods(unittest.TestCase): dump_mode = Const.SUMMARY mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - result = Comparator(mode_config).gen_merge_list(json_data, op_name, stack_json_data) + result = ParseData(mode_config).gen_merge_list(json_data, op_name, stack_json_data) self.assertEqual(result, merge_list) - def test_check_op_fuzzy_false(self): + def test_check_op_item_fuzzy(self): stack_mode = False auto_analyze = True dump_mode = Const.SUMMARY - fuzzy_match = False + fuzzy_match = True mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + mapping_config = MappingConfig() - pt_comparator = PTComparator(mode_config) - result = pt_comparator.check_op(npu_dict, bench_dict) + match = Match(mode_config, mapping_config, cross_frame=False) + result = match.check_op_item(npu_op_item_fuzzy, bench_op_item_fuzzy) self.assertEqual(result, True) - def test_check_op_fuzzy_true(self): - stack_mode = False + def test_compare_statistics(self): + generate_dump_json(base_dir) + generate_stack_json(base_dir) + file_list = [os.path.join(base_dir, 'dump.json'), os.path.join(base_dir, 'dump.json'), + os.path.join(base_dir, 'stack.json')] + + stack_mode = True auto_analyze = 
True + fuzzy_match = False dump_mode = Const.SUMMARY - - fuzzy_match = True mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + mapping_config = MappingConfig() - pt_comparator = PTComparator(mode_config) - result = pt_comparator.check_op(npu_dict2, bench_dict) - self.assertEqual(result, True) + from msprobe.pytorch.compare.pt_compare import read_real_data + comparator = Comparator(read_real_data, mode_config, mapping_config) + result = comparator.compare_statistics(file_list) + o_data = [ + ['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', + 'torch.float32', 'torch.float32', '[2, 2]', '[2, 2]', 0, 0, 0, 0, '0.0%', 'N/A', '0.0%', '0.0%', + 2, 0, 1, 1, 2, 0, 1, 1, '', '', ['File'] + ] + ] + columns = CompareConst.SUMMARY_COMPARE_RESULT_HEADER + ['NPU_Stack_Info'] + o_result = pd.DataFrame(o_data, columns=columns, dtype=object) + self.assertTrue(np.array_equal(result.to_numpy(), o_result.to_numpy())) + + +class TestParseData(unittest.TestCase): + + def setUp(self): + os.makedirs(base_dir, mode=0o750, exist_ok=True) + generate_dump_json(base_dir) + generate_dump_json_md5(base_dir) + generate_stack_json(base_dir) + + self.lock = threading.Lock() + + def tearDown(self): + if os.path.exists(base_dir): + shutil.rmtree(base_dir) + + def test_parse(self): + file_list = [os.path.join(base_dir, 'dump.json'), os.path.join(base_dir, 'dump.json'), + os.path.join(base_dir, 'stack.json')] + + stack_mode = True + mode_config = ModeConfig(stack_mode=stack_mode) + parse_data = ParseData(mode_config) + npu_df, bench_df = parse_data.parse(file_list) + + target_df = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 1, 1], ['File']]], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info'] + ) + self.assertTrue(npu_df.equals(target_df)) + self.assertTrue(bench_df.equals(target_df)) + + def test_gen_data_df_summary(self): + npu_json_path = os.path.join(base_dir, 'dump.json') + 
stack_json_path = os.path.join(base_dir, 'stack.json') + npu_json_data = load_json(npu_json_path) + stack_json_data = load_json(stack_json_path) + + stack_mode = True + mode_config = ModeConfig(stack_mode=stack_mode) + parse_data = ParseData(mode_config) + npu_df = parse_data.gen_data_df(npu_json_data, stack_json_data) + + target_df = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 1, 1], ['File']]], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info'] + ) + self.assertTrue(npu_df.equals(target_df)) + + def test_gen_data_df_all(self): + npu_json_path = os.path.join(base_dir, 'dump.json') + stack_json_path = os.path.join(base_dir, 'stack.json') + npu_json_data = load_json(npu_json_path) + stack_json_data = load_json(stack_json_path) + + stack_mode = True + mode_config = ModeConfig(stack_mode=stack_mode, dump_mode=Const.ALL) + parse_data = ParseData(mode_config) + npu_df = parse_data.gen_data_df(npu_json_data, stack_json_data) + + target_df = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 1, 1], ['File'], 'Functional.linear.0.forward.input.0.pt']], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'data_name'] + ) + self.assertTrue(npu_df.equals(target_df)) + + def test_gen_data_df_md5(self): + npu_json_path = os.path.join(base_dir, 'dump_md5.json') + stack_json_path = os.path.join(base_dir, 'stack.json') + npu_json_data = load_json(npu_json_path) + stack_json_data = load_json(stack_json_path) + + stack_mode = True + mode_config = ModeConfig(stack_mode=stack_mode, dump_mode=Const.MD5) + parse_data = ParseData(mode_config) + npu_df = parse_data.gen_data_df(npu_json_data, stack_json_data) + + target_df = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 1, 1], ['File'], 123456]], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'md5'] + ) + self.assertTrue(npu_df.equals(target_df)) + + def 
test_gen_merge_list(self): + npu_json_path = os.path.join(base_dir, 'dump.json') + stack_json_path = os.path.join(base_dir, 'stack.json') + npu_json_data = load_json(npu_json_path) + stack_json_data = load_json(stack_json_path) + + stack_mode = True + mode_config = ModeConfig(stack_mode=stack_mode) + parse_data = ParseData(mode_config) + merge_list = parse_data.gen_merge_list(npu_json_data, 'Functional.linear.0.forward', stack_json_data) + + target_dict = { + 'input_struct': [('torch.float32', [2, 2])], + 'op_name': ['Functional.linear.0.forward.input.0'], + 'output_struct': [], + 'params_grad_struct': [], + 'params_struct': [], + 'stack_info': [['File']], + 'summary': [[2, 0, 1, 1]] + } + self.assertEqual(merge_list, target_dict) + + +class TestProcessDf(unittest.TestCase): + + def test_get_api_name_success(self): + api_list = ['Functional', 'linear', '0', 'forward', 'input', '0'] + + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + api_name = process_df.get_api_name(api_list) + + target_api_name = 'Functional.linear' + self.assertEqual(api_name, target_api_name) + + @patch('msprobe.core.compare.acc_compare.logger') + def test_get_api_name_index_error(self, mock_logger): + api_list = ['Functional'] + with self.assertRaises(CompareException) as context: + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + api_name = process_df.get_api_name(api_list) + self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) + mock_logger.error.assert_called_once_with('Failed to retrieve API name, please check if the dump data is reasonable') + + def test_process_compare_key_and_shape(self): + npu_df_o = bench_df_o = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 
1, 1], ['File']]], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info'] + ) + + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + npu_df, bench_df = process_df.process_compare_key_and_shape(npu_df_o, bench_df_o) + + target_df = pd.DataFrame( + [['Functional.linear.0.forward.input.0', 'torch.float32', [2, 2], [2, 0, 1, 1], ['File'], 'Functional.linear.0.forward.input.0', [2, 2]]], + columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'compare_key', 'compare_shape'] + ) + self.assertTrue(npu_df.equals(target_df)) + self.assertTrue(bench_df.equals(target_df)) + + def test_process_internal_api_mapping(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + + # mint to torch + npu_op_name = 'Mint.mean.0.input.0' + target_name = 'Torch.mean.0.input.0' + name = process_df.process_internal_api_mapping(npu_op_name) + self.assertEqual(name, target_name) + + # mintfunctional to functional + npu_op_name = 'MintFunctional.mean.0.input.0' + target_name = 'Functional.mean.0.input.0' + name = process_df.process_internal_api_mapping(npu_op_name) + self.assertEqual(name, target_name) + + # inner mapping exists + npu_op_name = 'Functional.abs.0.input.0' + mapping_dict.ms_to_pt_mapping = {'Functional.abs': 'Torch.abs'} + target_name = 'Torch.abs.0.input.0' + name = process_df.process_internal_api_mapping(npu_op_name) + self.assertEqual(name, target_name) + + # inner mapping not found + npu_op_name = 'Functional.abs.0.input.0' + mapping_dict.ms_to_pt_mapping = {} + target_name = 'Functional.abs.0.input.0' + name = process_df.process_internal_api_mapping(npu_op_name) + self.assertEqual(name, target_name) + + def test_modify_compare_data_with_user_mapping(self): + mode_config = ModeConfig() + mapping_config = 
MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + mapping_dict.api_mapping_dict = [{ + 'ms_api': 'Functional.conv2d', + 'pt_api': 'Torch.conv2d', + 'ms_args': [0], + 'pt_args': [0] + }] + + npu_df = pd.DataFrame([ + ['Functional.conv2d.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.0.forward.input.0'], + ['Functional.amax.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0'] + ], columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'compare_key']) + bench_df = pd.DataFrame([ + ['Torch.conv2d.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Torch.conv2d.0.forward.input.0'], + ['Torch.amax.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Torch.amax.0.forward.input.0'] + ], columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'compare_key']) + + process_df.modify_compare_data_with_user_mapping(npu_df, bench_df) + + def test_get_api_indices_dict(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + + op_name_df = pd.DataFrame([ + ['Functional.conv2d.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.0.forward.input.0'], + ['Functional.amax.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0'] + ], columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'compare_key']) + + api_indices_dict = process_df.get_api_indices_dict(op_name_df) + expected = { + 'Functional.conv2d': [0], + 'Functional.amax': [1] + } + self.assertEqual(api_indices_dict, expected) + + def test_process_cell_mapping(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, 
mapping_config, mapping_dict) + + # not name + npu_op_name = None + name = process_df.process_cell_mapping(npu_op_name) + self.assertEqual(name, CompareConst.N_A) + + # not params_grad + npu_op_name = 'MintFunctional.embedding.0.input.0' + name = process_df.process_cell_mapping(npu_op_name) + self.assertEqual(name, CompareConst.N_A) + + # default replace + npu_op_name = 'Cell.network_with_loss.module.GPTModel.forward.1.input.0' + name = process_df.process_cell_mapping(npu_op_name) + self.assertEqual(name, 'Module.network_with_loss.module.GPTModel.forward.1.input.0') + + # mapping_dict + npu_op_name = 'Cell.fc1.Dense.forward.0.input.0' + mapping_dict.cell_mapping_dict = {'fc1.Dense': 'module.name'} + name = process_df.process_cell_mapping(npu_op_name) + self.assertEqual(name, 'Module.module.name.forward.0.input.0') + + def test_process_data_mapping(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + mapping_dict = MappingDict(mapping_config) + process_df = ProcessDf(mode_config, mapping_config, mapping_dict) + + npu_op_name = 'Functional.flash_attention_score.4.forward.input.0' + mapping_dict.data_mapping_dict = {'Functional.flash_attention_score.4.forward.input.0': 'NPU.npu_fusion_attention.4.forward.input.0'} + name = process_df.process_data_mapping(npu_op_name) + self.assertEqual(name, 'NPU.npu_fusion_attention.4.forward.input.0') + + +class TestMatch(unittest.TestCase): + + def test_put_unmatched_in_table(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=False) + + match_result = pd.DataFrame(columns=CompareConst.MATCH_RESULT_COLUMNS) + npu_op_item = pd.Series(['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'op', [1, 2]], + index=['op_name_x', 'dtype_x', 'shape_x', 'summary_x', 'stack_info_x', 'data_name_x', + 'compare_key', 'compare_shape'] + ) + match_result = match.put_unmatched_in_table(match_result, npu_op_item) + target_match_result = 
pd.DataFrame([['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'op', [1, 2], + 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']], + columns=CompareConst.MATCH_RESULT_COLUMNS) + self.assertTrue(match_result.equals(target_match_result)) + + def test_put_matched_in_table(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=False) + + match_result = pd.DataFrame(columns=CompareConst.MATCH_RESULT_COLUMNS) + npu_op_item = pd.Series(['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'op', [1, 2]], + index=['op_name_x', 'dtype_x', 'shape_x', 'summary_x', 'stack_info_x', 'data_name_x', + 'compare_key', 'compare_shape'] + ) + bench_op_item = pd.Series(['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'op', [1, 2]], + index=['op_name_y', 'dtype_y', 'shape_y', 'summary_y', 'stack_info_y', 'data_name_y', + 'compare_key', 'compare_shape'] + ) + match_result = match.put_matched_in_table(match_result, npu_op_item, bench_op_item) + target_match_result = pd.DataFrame([['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'op', [1, 2], + 'op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name']], + columns=CompareConst.MATCH_RESULT_COLUMNS) + self.assertTrue(match_result.equals(target_match_result)) + + def test_rename_api(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=False) + + op_name_1 = 'Functional.linear.0.forward.input.0' + result_1 = match.rename_api(op_name_1) + self.assertTrue(result_1, 'Functional.linear.input.0') + + op_name_2 = 'Functional.linear.0.backward.input.0' + result_2 = match.rename_api(op_name_2) + self.assertTrue(result_2, 'Functional.linear.input.0') + + op_name_3 = 'Functional.linear.0.x.input.0' + result_3 = match.rename_api(op_name_3) + self.assertTrue(result_3, 'Functional.linear.0.x.input.0') + + def test_check_op_item(self): + mode_config = 
ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=False) + + npu_op_item = pd.Series(['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'Functional.linear.0.forward.input.0', [1, 2]], + index=['op_name_x', 'dtype_x', 'shape_x', 'summary_x', 'stack_info_x', 'data_name_x', + 'compare_key', 'compare_shape'] + ) + bench_op_item = pd.Series(['op', 'float32', [1, 2], 'summary', 'stack_info', 'data_name', 'Functional.linear.1.forward.input.0', [1, 2]], + index=['op_name_y', 'dtype_y', 'shape_y', 'summary_y', 'stack_info_y', 'data_name_y', + 'compare_key', 'compare_shape'] + ) + result = match.check_op_item(npu_op_item, bench_op_item) + self.assertTrue(result) + + def test_process_fuzzy_match(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=False) + + npu_df = pd.DataFrame([ + ['Functional.conv2d.3.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.3.forward.input.0.pt', 'Functional.conv2d.3.forward.input.0', [1, 2]], + ['Functional.amax.1.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0.pt', 'Functional.amax.1.forward.input.0', [1, 2]] + ], columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'data_name', 'compare_key', 'compare_shape']) + bench_df = pd.DataFrame([ + ['Functional.conv2d.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.0.forward.input.0.pt', 'Functional.conv2d.0.forward.input.0', [1, 2]], + ['Functional.amax.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0.pt', 'Functional.amax.0.forward.input.0', [1, 2]] + ], columns=['op_name', 'dtype', 'shape', 'summary', 'stack_info', 'data_name', 'compare_key', 'compare_shape']) + + match_result = match.process_fuzzy_match(npu_df, bench_df) + expected = pd.DataFrame( + [ + 
['Functional.conv2d.3.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.3.forward.input.0.pt', 'Functional.conv2d.3.forward.input.0', [1, 2], 'Functional.conv2d.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.conv2d.0.forward.input.0.pt'], + ['Functional.amax.1.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0.pt', 'Functional.amax.1.forward.input.0', [1, 2], 'Functional.amax.0.forward.input.0', 'float32', [1, 2], 'summary', 'stack_info', 'Functional.amax.0.forward.input.0.pt'] + ] + , columns=CompareConst.MATCH_RESULT_COLUMNS) + + self.assertTrue(match_result.equals(expected)) def test_match_op_both_last_element(self): stack_mode = False @@ -502,9 +763,10 @@ class TestUtilsMethods(unittest.TestCase): fuzzy_match = False dump_mode = Const.SUMMARY mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + mapping_config = MappingConfig() - pt_comparator = PTComparator(mode_config) - a, b = pt_comparator.match_op([npu_dict], [bench_dict]) + match = Match(mode_config, mapping_config, cross_frame=False) + a, b = match.match_op([npu_op_item_fuzzy], [bench_op_item_fuzzy]) self.assertEqual(a, 0) self.assertEqual(b, 0) @@ -514,9 +776,10 @@ class TestUtilsMethods(unittest.TestCase): fuzzy_match = False dump_mode = Const.SUMMARY mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + mapping_config = MappingConfig() - pt_comparator = PTComparator(mode_config) - a, b = pt_comparator.match_op([npu_dict], [bench_dict, 1]) + match = Match(mode_config, mapping_config, cross_frame=False) + a, b = match.match_op([npu_op_item_fuzzy], [bench_op_item_fuzzy, 1]) self.assertEqual(a, 0) self.assertEqual(b, 0) @@ -526,217 +789,102 @@ class TestUtilsMethods(unittest.TestCase): fuzzy_match = False dump_mode = Const.SUMMARY mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + mapping_config = MappingConfig() - pt_comparator = 
PTComparator(mode_config) - a, b = pt_comparator.match_op([npu_dict, npu_dict2], [bench_dict]) + match = Match(mode_config, mapping_config, cross_frame=False) + a, b = match.match_op([npu_op_item_fuzzy, npu_op_item_data_fuzzy_2], [bench_op_item_fuzzy]) self.assertEqual(a, 0) self.assertEqual(b, 0) - def test_compare_process(self): - generate_dump_json(base_dir) - generate_stack_json(base_dir) - file_lists = [os.path.join(base_dir, 'dump.json'), os.path.join(base_dir, 'dump.json'), - os.path.join(base_dir, 'stack.json')] + def test_gen_dtype_condition(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=True) - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + # data mapping + mapping_config.data_mapping = True + match_result = pd.DataFrame([1, 2, 3]) + result = match.gen_dtype_condition(match_result) + expected = pd.Series([True, True, True]) + self.assertTrue(result.equals(expected)) - result = PTComparator(mode_config).compare_process(file_lists) - o_data = [ - ['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], 0, 0, 0, 0, '0.0%', 'N/A', '0.0%', '0.0%', - 2, 0, 1, 1, 2, 0, 1, 1, '', '', ['File'] - ] - ] - columns = CompareConst.SUMMARY_COMPARE_RESULT_HEADER + ['NPU_Stack_Info'] - o_result = pd.DataFrame(o_data, columns=columns, dtype=object) - self.assertTrue(result.equals(o_result)) + # normal + mapping_config.data_mapping = None + match_result = pd.DataFrame([['Float16', 'Float32'], ['torch.float32', 'torch.bfloat16']], columns=['dtype_x', 'dtype_y']) + result = match.gen_dtype_condition(match_result) + expected = pd.Series([True, True]) + self.assertTrue(result.equals(expected)) - def test_merge_data(self): - op_data = { - 'input_args': [ - { - 'type': 'torch.Tensor', 'dtype': 
'torch.float32', 'shape': [2, 2], - 'Max': 1, 'Min': 1, 'Mean': 1, 'Norm': 1, 'requires_grad': False, - 'data_name': 'Functional.linear.0.forward.input.0.pt', - 'full_op_name': 'Functional.linear.0.forward.input.0' - } - ] - } - json_data = {'data': {'Functional.linear.0.forward': op_data}} - stack_json_data = {'Functional.linear.0.forward': ['File']} - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - result = Comparator(mode_config).merge_data(json_data, stack_json_data) - ops_all = { - 'Functional.linear.0.forward.input.0': { - 'data_name': None, 'stack_info': [['File']], - 'struct': ('torch.float32', [2, 2]), 'summary': [1, 1, 1, 1] - } - } - self.assertEqual(result, ops_all) - - def test_compare_core_basic(self): - generate_dump_json(base_dir2) - generate_stack_json(base_dir2) - input_params = { - "npu_json_path": os.path.join(base_dir2, "dump.json"), - "bench_json_path": os.path.join(base_dir2, "dump.json"), - "stack_json_path": os.path.join(base_dir2, "stack.json"), - } - output_path = base_dir2 - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - PTComparator(mode_config).compare_core(input_params, output_path) - - output_files = os.listdir(output_path) - self.assertTrue(any(f.endswith(".xlsx") for f in output_files)) - - def test_compare_ops(self): - generate_dump_json(base_dir3) - generate_stack_json(base_dir3) - generate_pt(pt_dir) - dump_path = os.path.join(base_dir3, 'dump.json') - stack_path = os.path.join(base_dir3, 'stack.json') - input_param = {'npu_json_path': dump_path, 'bench_json_path': dump_path, 'stack_json_path': stack_path, - 'is_print_compare_log': True, 'npu_dump_data_dir': pt_dir, 'bench_dump_data_dir': pt_dir} - dump_path_dict = {'Functional.linear.0.forward.input.0': 
['Functional.linear.0.forward.input.0.pt', - 'Functional.linear.0.forward.input.0.pt']} - result_df = pd.DataFrame({ - 'NPU Name': ['Functional.linear.0.forward.input.0'], - 'Bench Name': ['Functional.linear.0.forward.input.0'] - }) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - pt_comparator = PTComparator(mode_config) - updated_df = pt_comparator.compare_ops(idx=0, dump_path_dict=dump_path_dict, result_df=result_df, - lock=self.lock, input_param=input_param) - - self.assertEqual(updated_df.loc[0, CompareConst.COSINE], 1.0) - self.assertEqual(updated_df.loc[0, CompareConst.MAX_ABS_ERR], 0) - - def test_do_multi_process(self): - data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], - '', '', '', '', '', '', 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', ['-1', '-1']]] - o_data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', - 'torch.float32', 'torch.float32', [2, 2], [2, 2], - 'unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', - 1, 1, 1, 1, 1, 1, 1, 1, 'None', 'No bench data matched.', ['-1', '-1']]] - columns = CompareConst.COMPARE_RESULT_HEADER + ['Data_name'] - result_df = pd.DataFrame(data, columns=columns) - o_result = pd.DataFrame(o_data, columns=columns) - generate_dump_json(base_dir) - input_param = {'bench_json_path': os.path.join(base_dir, 'dump.json')} + def test_process_cross_frame_dtype(self): + mode_config = ModeConfig() + mapping_config = MappingConfig() + match = Match(mode_config, mapping_config, cross_frame=True) - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + dtype_o = pd.Series(['Int8', 'Float16', 'torch.bool', 'Complex64', 'unknown']) + dtype = 
match.process_cross_frame_dtype(dtype_o) + self.assertTrue(dtype.equals(pd.Series(['int', 'float', 'bool', 'complex', 'unknown']))) - comparator = Comparator(mode_config) - result = comparator.do_multi_process(input_param, result_df) - self.assertTrue(result.equals(o_result)) - def test_compare_by_op_1(self): - npu_op_name = 'Functional.linear.0.forward.input.0' - bench_op_name = 'N/A' - op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [-1, -1]} - input_param = {} +class TestCreateTable(unittest.TestCase): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + def test_process_data_name(self): + mode_config = ModeConfig() + create_table = CreateTable(mode_config) - pt_comparator = PTComparator(mode_config) - result = pt_comparator.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) + data = { + 'data_name_x': ['A', 'B', 'C'], + 'data_name_y': ['X', 'Y', 'Z'] + } + result_o = pd.DataFrame(data) + result = create_table.process_data_name(result_o) + target_data = { + 'data_name_x': [['A', 'X'], ['B', 'Y'], ['C', 'Z']], + 'data_name_y': ['X', 'Y', 'Z'] + } + target_result = pd.DataFrame(target_data) + self.assertTrue(result.equals(target_result)) - self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', - 'unsupported', 'No bench data matched.']) + def test_set_summary(self): + mode_config = ModeConfig() + create_table = CreateTable(mode_config) - def test_compare_by_op_2(self): - npu_op_name = 'Functional.linear.0.forward.input.0' - bench_op_name = 'Functional.linear.0.forward.input.0' + # all nan + result = create_table.set_summary(['nan', 'NaN', 'nAn']) + expected = [CompareConst.NAN, CompareConst.NAN, CompareConst.NAN] + self.assertEqual(result, expected) - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = 
ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) + # mixed values + result = create_table.set_summary([1, 'nan', 2.0, 'NaN']) + expected = [1, CompareConst.NAN, 2.0, CompareConst.NAN] + self.assertEqual(result, expected) - pt_comparator = PTComparator(mode_config) + # NA case + result = create_table.set_summary(CompareConst.N_A) + expected = [CompareConst.N_A, CompareConst.N_A, CompareConst.N_A, CompareConst.N_A] + self.assertEqual(result, expected) - pt_name = '-1' - op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [pt_name, pt_name]} - input_param = {'npu_dump_data_dir': base_dir, 'bench_dump_data_dir': base_dir} - result = pt_comparator.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) - self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', - 'unsupported', 'No bench data matched.']) + # empty input + result = create_table.set_summary([]) + expected = [] + self.assertEqual(result, expected) - pt_name = 'Functional.linear.0.forward.input.0.pt' - op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [pt_name, pt_name]} - input_param = {'npu_dump_data_dir': base_dir, 'bench_dump_data_dir': base_dir} - result = pt_comparator.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) - self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', - 'unsupported', 'Dump file: Functional.linear.0.forward.input.0.pt not found.']) - generate_pt(base_dir) - result = pt_comparator.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) - self.assertEqual(result, [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, '']) +class TestCalcStatsDiff(unittest.TestCase): + def test_type_check(self): + mode_config = ModeConfig() + calc_stats_diff = CalcStatsDiff(mode_config) -class TestComparator(unittest.TestCase): - def setUp(self): - mode_config = ModeConfig(dump_mode=Const.MD5) - self.comparator = 
Comparator(mode_config=mode_config) - self.npu_ops_all = { - 'op1': {'struct': ['float32', [1, 96, 2], '83dcefb7']}, - } - self.bench_ops_all = { - 'op1': {'struct': ['float32', [1, 96, 2], '83dcefb7']}, - } + series = pd.Series([float('nan'), 5, 'nan', 10, 'abc', None]) + result = calc_stats_diff.type_check(series) + expected = pd.Series([True, True, True, True, False, False]) + self.assertTrue(result.equals(expected)) - def test_normal(self): - expected_result = ['op1', 'op1', 'float32', 'float32', [1, 96, 2], [1, 96, 2], '83dcefb7', '83dcefb7', - CompareConst.PASS, CompareConst.NONE] - result = self.comparator.get_result_md5_compare('op1', 'op1', - self.npu_ops_all, self.bench_ops_all) - self.assertEqual(result, expected_result) + def test_get_number(self): + mode_config = ModeConfig() + calc_stats_diff = CalcStatsDiff(mode_config) - @patch('msprobe.core.compare.acc_compare.logger') - def test_length_exception(self, mock_logger): - self.npu_ops_all['op1']['struct'] = ['npu_val1', 'npu_val2'] - with self.assertRaises(CompareException) as context: - self.comparator.get_result_md5_compare('op1', 'op1', - self.npu_ops_all, self.bench_ops_all) - self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) - mock_logger.error.assert_called_once_with("The length of npu_struct and bench_struct must be >= 3, " - "but got npu_struct=2 and bench_struct=3. 
Please check!") - - def test_with_extra_args(self): - expected_result = ['op1', 'op1', 'float32', 'float32', [1, 96, 2], [1, 96, 2], '83dcefb7', '83dcefb7', - CompareConst.PASS, 'extra_data'] - result = self.comparator.get_result_md5_compare('op1', 'op1', - self.npu_ops_all, self.bench_ops_all, True, ['extra_data']) - self.assertEqual(result, expected_result) + series = pd.Series([1, '2', 3.5, 'text', None]) + result = calc_stats_diff.get_number(series) + expected = pd.Series([1, 2, 3.5, float('nan'), float('nan')]) + self.assertTrue(result.equals(expected)) diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_check.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_check.py index a1e5f8eee1bce9b170e6f4f7fdfeda65d47252c9..1a0a33f799724ffefe73bf8f024e0146b2925464 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_check.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_check.py @@ -1,7 +1,6 @@ # coding=utf-8 import unittest -from msprobe.core.compare.check import check_struct_match, check_type_shape_match, check_graph_mode, fuzzy_check_op, \ - fuzzy_check_name, check_dump_json_str, check_json_key_value, valid_key_value, check_stack_json_str +from msprobe.core.compare.check import check_dump_json_str, check_json_key_value, valid_key_value, check_stack_json_str from msprobe.core.common.utils import CompareException @@ -65,87 +64,6 @@ op_name = 'Functional.conv2d.0.backward.input.0' class TestUtilsMethods(unittest.TestCase): - - def test_check_struct_match_success(self): - result = check_struct_match(npu_dict, bench_dict) - self.assertTrue(result) - - def test_check_struct_match_fail(self): - npu_dict2 = {'input_struct': [('torch.float32', [1, 1, 28, 28]), ('torch.float32', [16, 1, 5, 5]), - ('torch.float32', [16])], - 'output_struct': [('torch.float32', [1, 16, 28, 28])] - } - - bench_dict2 = {'input_struct': [('torch.float32', [2, 1, 28, 28]), ('torch.float32', [16, 
1, 5, 5]), - ('torch.float32', [16])], - 'output_struct': [('torch.float32', [1, 16, 28, 28])] - } - result = check_struct_match(npu_dict2, bench_dict2) - self.assertFalse(result) - - def test_check_struct_index_error(self): - npu_dict3 = {'input_struct': [('a'), ('torch.float32'), - ('torch.float32')], - 'output_struct': [('torch.float32')] - } - - bench_dict3 = {'input_struct': [('torch.float32'), ('torch.float32'), - ('torch.float32')], - 'output_struct': [('torch.float32')] - } - with self.assertRaises(CompareException) as context: - result = check_struct_match(npu_dict3, bench_dict3) - self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) - - def test_check_type_shape_match_success(self): - result = check_type_shape_match(npu_struct, bench_struct) - self.assertTrue(result) - - def test_check_type_shape_match_index_error(self): - npu_struct2 = [('a'), ('torch.float32'), ('torch.float32')] - bench_struct2 = [('torch.float32'), ('torch.float32'), ('torch.float32')] - with self.assertRaises(CompareException) as context: - result = check_type_shape_match(npu_struct2, bench_struct2) - self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) - - def test_check_graph_mode(self): - op1 = "Aten" - op2 = "torch" - self.assertTrue(check_graph_mode(op1, op2)) - self.assertTrue(check_graph_mode(op2, op1)) - self.assertFalse(check_graph_mode(op1, op1)) - self.assertFalse(check_graph_mode(op2, op2)) - - def test_fuzzy_check_op_1(self): - npu_name_list = [] - bench_name_list = [] - result = fuzzy_check_op(npu_name_list, bench_name_list) - self.assertFalse(result) - - def test_fuzzy_check_op_2(self): - npu_name_list = [] - bench_name_list = ['Functional.conv2d.0.forward.input.0'] - result = fuzzy_check_op(npu_name_list, bench_name_list) - self.assertFalse(result) - - def test_fuzzy_check_op_3(self): - npu_name_list = ['Functional.conv2d.0.forward.input.0'] - bench_name_list = ['Functional.conv2d.1.forward.input.0'] - 
result = fuzzy_check_op(npu_name_list, bench_name_list) - self.assertTrue(result) - - def test_fuzzy_check_name_1(self): - npu_name = 'Functional.conv2d.0.backward.input.0' - bench_name = 'Functional.conv2d.1.backward.input.0' - result = fuzzy_check_name(npu_name, bench_name) - self.assertTrue(result) - - def test_fuzzy_check_name_2(self): - npu_name = 'Functional.conv2d.0.backward.input.0' - bench_name = 'Functional.conv2d.1.backward.input.1' - result = fuzzy_check_name(npu_name, bench_name) - self.assertFalse(result) - def test_check_dump_json_str(self): with self.assertRaises(CompareException) as context: check_dump_json_str(op_data, op_name) diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_npy_compare.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_npy_compare.py index da315b657c8c1fc691136a1dbc56574d69c92076..a30d693f7b32a806dee8667e42794259e7785545 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_npy_compare.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_npy_compare.py @@ -454,7 +454,7 @@ class TestUtilsMethods(unittest.TestCase): result, err_msg = error_value_process(n_value) - self.assertEqual(result, 0) + self.assertEqual(result, CompareConst.UNSUPPORTED) self.assertEqual(err_msg, "") def test_error_value_process_shape_unmatch(self): diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_utils.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_utils.py index bf23f4de1dac73a44a2497e1a927ba30e5440715..11d88eee940f08a3a3276679fcb5bc490e2da02e 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_utils.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_acc_compare_utils.py @@ -12,9 +12,8 @@ import numpy as np from msprobe.core.common.const import CompareConst, Const from msprobe.core.common.utils import CompareException from msprobe.core.compare.utils import 
ApiItemInfo, _compare_parser, check_and_return_dir_contents, extract_json, \ - count_struct, get_accuracy, append_stack_info, get_rela_diff_summary_mode, get_un_match_accuracy, merge_tensor, \ - op_item_parse, read_op, rename_api, resolve_api_special_parameters, result_item_init, stack_column_process, \ - table_value_is_valid, get_name_and_state, reorder_op_name_list, reorder_op_x_list, gen_op_item + count_struct, get_accuracy, get_rela_diff_summary_mode, merge_tensor, op_item_parse, read_op, result_item_init, \ + stack_column_process, table_value_is_valid, get_name_and_state, reorder_op_name_list, reorder_op_x_list, gen_op_item # test_read_op_1 op_data = { @@ -350,18 +349,6 @@ class TestUtilsMethods(unittest.TestCase): result = check_and_return_dir_contents(base_dir2, 'rank') self.assertEqual(set(result), set(['rank0', 'rank1'])) - def test_rename_api_1(self): - test_name_1 = "Distributed.broadcast.0.forward.input.0" - expect_name_1 = "Distributed.broadcast.input.0" - actual_name_1 = rename_api(test_name_1, "forward") - self.assertEqual(actual_name_1, expect_name_1) - - def test_rename_api_2(self): - test_name_2 = "Torch.sum.0.backward.output.0" - expect_name_2 = "Torch.sum.output.0" - actual_name_2 = rename_api(test_name_2, "backward") - self.assertEqual(actual_name_2, expect_name_2) - def test_read_op(self): result = read_op(op_data, op_name) self.assertEqual(result, op_result) @@ -379,11 +366,6 @@ class TestUtilsMethods(unittest.TestCase): op_item_parse(parse_item, parse_op_name, depth=11) self.assertEqual(context.exception.code, CompareException.RECURSION_LIMIT_ERROR) - def test_resolve_api_special_parameters(self): - item_list = [] - resolve_api_special_parameters(data_dict, full_op_name, item_list) - self.assertEqual(item_list, o_result_api_special) - def test_get_rela_diff_summary_mode_float_or_int(self): result_item = [0] * 14 err_msg = '' @@ -449,57 +431,6 @@ class TestUtilsMethods(unittest.TestCase): get_accuracy(result, npu_dict, bench_dict, 
dump_mode=Const.SUMMARY) self.assertEqual(result, o_result) - def test_append_stack_info_stack_exist_index_0(self): - result_item = ['item1'] - npu_stack_info = ['stack_info1'] - index = 0 - - append_stack_info(result_item, npu_stack_info, index) - - self.assertEqual(result_item, ['item1', 'stack_info1']) - - def test_append_stack_info_stack_exist_index_not_0(self): - result_item = ['item1'] - npu_stack_info = ['stack_info1'] - index = 1 - - append_stack_info(result_item, npu_stack_info, index) - - self.assertEqual(result_item, ['item1', CompareConst.NONE]) - - def test_append_stack_info_stack_empty_index_0(self): - result_item = ['item1'] - npu_stack_info = [] - index = 0 - - append_stack_info(result_item, npu_stack_info, index) - - self.assertEqual(result_item, ['item1', CompareConst.NONE]) - - def test_append_stack_info_stack_empty_index_not_0(self): - result_item = ['item1'] - npu_stack_info = [] - index = 1 - - append_stack_info(result_item, npu_stack_info, index) - - self.assertEqual(result_item, ['item1', CompareConst.NONE]) - - def test_get_un_match_accuracy_md5(self): - result = [] - get_un_match_accuracy(result, npu_dict, dump_mode=Const.MD5) - self.assertEqual(result, o_result_unmatch_1) - - def test_get_un_match_accuracy_summary(self): - result = [] - get_un_match_accuracy(result, npu_dict, dump_mode=Const.SUMMARY) - self.assertEqual(result, o_result_unmatch_2) - - def test_get_un_match_accuracy_all(self): - result = [] - get_un_match_accuracy(result, npu_dict, dump_mode=Const.ALL) - self.assertEqual(result, o_result_unmatch_3) - def test_merge_tensor_summary(self): op_dict = merge_tensor(tensor_list, dump_mode=Const.SUMMARY) self.assertEqual(op_dict, result_op_dict) diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_highlight.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_highlight.py index 3261bce5d6d0a15d8e46c7d9fc22df0cf64c9e4d..5ffc0013fad8cfa289a79f5aaf39219b31b77c07 100644 --- 
a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_highlight.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_highlight.py @@ -12,12 +12,44 @@ import openpyxl from openpyxl import load_workbook from openpyxl.styles import PatternFill - from msprobe.core.common.const import CompareConst, Const from msprobe.core.compare.highlight import ApiBatch, CheckMaxRelativeDiff, CheckOrderMagnitude, \ - CheckOneThousandErrorRatio, CheckCosineSimilarity, add_highlight_row_info, compare_result_df_convert, \ - df_malicious_value_check, find_error_rows, highlight_rows_xlsx, update_highlight_err_msg, value_check - + CheckOneThousandErrorRatio, CheckCosineSimilarity, add_highlight_row_info, HighLight +from msprobe.core.compare.config import ModeConfig + + +summary_line_input = ['Functional_batch_norm_0_forward.input.0', 'Functional_batch_norm_0_forward.input.0', + 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.01, 0, 0, 0, 1, 1, 1, 1, 1.01, 1, 1, 1, + 'Yes', ''] +summary_line_1 = ['Functional_batch_norm_0_forward.output.0', 'Functional_batch_norm_0_forward.output.0', + 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 10, 0, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, + 'Warning', ''] +summary_line_2 = ['Functional_batch_norm_0_forward.output.1', 'Functional_batch_norm_0_forward.output.1', + 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.02, 0, 0, 0, 0.12, 0, 1, 1, 0.1, 1, 1, 1, + 'Warning', ''] +summary_line_3 = ['Functional_batch_norm_0_forward.output.2', 'Functional_batch_norm_0_forward.output.2', + 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0, 0, 0, 0, 2, 0, 1, 1, 1, 1, 1, 1, + 'Warning', ''] +line_input = ['Functional.batch.norm.0.forward.input.0', 'Functional.batch.norm.0.forward.input.0', 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 1, 0.5, 1, 1, 0.95, 1, + 1, 1, 1, 1, 1.01, 1, 1, 1, + 'Yes', ''] +line_1 = 
['Functional.batch.norm.0.forward.output.0', 'Functional.batch.norm.0.forward.output.0', 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.8, 0.5, 1, 1, 0.59, 1, + 'nan', 0, 1, 1, 19, 1, 1, 1, + 'Yes', ''] +line_2 = ['Functional.batch.norm.0.forward.output.1', 'Functional.batch.norm.0.forward.output.1', 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.9, 0.5, 1, 1, 0.8, 1, + 0, 0.12, 0, 1, 1, 0.1, 1, 1, + 'Yes', ''] +line_3 = ['Functional.batch.norm.0.forward.output.2', 'Functional.batch.norm.0.forward.output.2', 'torch.float16', + 'torch.float32', [256, 256, 14, 14], [256, 256, 14, 14], 0.8, 0.5, 1.1e+10, 1, 0.85, 1, + 9, 0.12, 0, 1, 1, 0.1, 1, 1, + 'Yes', ''] base_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_highlight') @@ -161,7 +193,7 @@ class TestUtilsMethods(unittest.TestCase): num = 1 info = (api_in, api_out, num) CheckMaxRelativeDiff().apply(info, color_columns, dump_mode=Const.SUMMARY) - red_lines, yellow_lines = [], [(1, ["The output's maximum relative error exceeds 0.1, while the input/parameters's is below 0.01"])] + red_lines, yellow_lines = [], [(1, ["The output's maximum relative error exceeds 0.1, while the input/parameter's is below 0.01"])] target_color_columns = ColorColumns(red=red_lines, yellow=yellow_lines) self.assertEqual(color_columns, target_color_columns) @@ -178,45 +210,6 @@ class TestUtilsMethods(unittest.TestCase): result = CheckMaxRelativeDiff().apply(info, color_columns, dump_mode=Const.SUMMARY) self.assertEqual(result, None) - def test_find_error_rows_normal(self): - compare_result = np.array([ - ["Functional.linear.0.forward.input.0", "Functional.linear.0.forward.input.0", - "torch.float32", "torch.float32", [2, 2], [2, 2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", - 1, 1, 1, 1, 1, 1, 1, 1, "", ""], - ["Functional.linear.0.forward.input.1", "Functional.linear.0.forward.input.1", - "torch.float32", "torch.float32", [2, 2], [2, 2], 0.0, 0.0, 
0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", - 1, 1, 1, 1, 1, 1, 1, 1, "", ""], - ["Functional.linear.0.forward.input.2", "Functional.linear.0.forward.input.2", - "torch.float32", "torch.float32", [2], [2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", - 1, 1, 1, 1, 1, 1, 1, 1, "", ""], - ["Functional.linear.0.forward.output.0", "Functional.linear.0.forward.output.0", - "torch.float32", "torch.float32", [2, 2], [2, 2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", - 1, 1, 1, 1, 1, 1, 1, 1, "", ""], - ], dtype=object) - api_batch = ApiBatch("Functional.linear.0.forward", 0) - api_batch.input_len = 3 - api_batch.output_end_index = 4 - api_batch.params_end_index = 4 - highlight_dict = {"red_lines": [], "red_rows": set(), "yellow_lines": [], "yellow_rows": set()} - dump_mode = Const.ALL - - find_error_rows(compare_result, api_batch, highlight_dict, dump_mode) - - self.assertEqual(highlight_dict, {"red_lines": [], "red_rows": set(), "yellow_lines": [], "yellow_rows": set()}) - - def test_find_error_rows_md5(self): - compare_result = [] - api_batch = ApiBatch("", 0) - api_batch.input_len = 0 - api_batch.output_end_index = 1 - api_batch.params_end_index = 1 - highlight_dict = {} - dump_mode = Const.MD5 - - result = find_error_rows(compare_result, api_batch, highlight_dict, dump_mode) - - self.assertEqual(result, None) - def test_ApiBatch_increment_input(self): api_name = "functional.conv2d" start = 2 @@ -297,6 +290,48 @@ class TestUtilsMethods(unittest.TestCase): self.assertEqual(api_batch.output_end_index, 5) self.assertEqual(api_batch.params_grad_end_index, 5) + + def test_find_error_rows_normal(self): + compare_result = np.array([ + ["Functional.linear.0.forward.input.0", "Functional.linear.0.forward.input.0", + "torch.float32", "torch.float32", [2, 2], [2, 2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", + 1, 1, 1, 1, 1, 1, 1, 1, "", ""], + ["Functional.linear.0.forward.input.1", "Functional.linear.0.forward.input.1", + "torch.float32", 
"torch.float32", [2, 2], [2, 2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", + 1, 1, 1, 1, 1, 1, 1, 1, "", ""], + ["Functional.linear.0.forward.input.2", "Functional.linear.0.forward.input.2", + "torch.float32", "torch.float32", [2], [2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", + 1, 1, 1, 1, 1, 1, 1, 1, "", ""], + ["Functional.linear.0.forward.output.0", "Functional.linear.0.forward.output.0", + "torch.float32", "torch.float32", [2, 2], [2, 2], 0.0, 0.0, 0.0, 0.0, "0.0%", "0.0%", "0.0%", "0.0%", + 1, 1, 1, 1, 1, 1, 1, 1, "", ""], + ], dtype=object) + api_batch = ApiBatch("Functional.linear.0.forward", 0) + api_batch.input_len = 3 + api_batch.output_end_index = 4 + api_batch.params_end_index = 4 + highlight_dict = {"red_lines": [], "red_rows": set(), "yellow_lines": [], "yellow_rows": set()} + + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.find_error_rows(compare_result, api_batch, highlight_dict) + + self.assertEqual(highlight_dict, {"red_lines": [], "red_rows": set(), "yellow_lines": [], "yellow_rows": set()}) + + def test_find_error_rows_md5(self): + compare_result = [] + api_batch = ApiBatch("", 0) + api_batch.input_len = 0 + api_batch.output_end_index = 1 + api_batch.params_end_index = 1 + highlight_dict = {} + + mode_config = ModeConfig(dump_mode=Const.MD5) + highlight = HighLight(mode_config) + result = highlight.find_error_rows(compare_result, api_batch, highlight_dict) + + self.assertEqual(result, None) + @patch("msprobe.core.compare.highlight.logger") def test_value_check(self, mock_logger): value = "=functional.conv2d" @@ -304,7 +339,9 @@ class TestUtilsMethods(unittest.TestCase): i = 1 result_df_columns = CompareConst.COMPARE_RESULT_HEADER - value_check(value, api_name, i, result_df_columns) + mode_config = ModeConfig() + highlight = HighLight(mode_config) + highlight.value_check(value, api_name, i, result_df_columns) mock_logger.error.assert_called_once_with( "Malicious value 
[=functional.conv2d] at api_name [=functional.conv2d], column [Bench Name], " @@ -319,11 +356,15 @@ class TestUtilsMethods(unittest.TestCase): ] result_df = pd.DataFrame(data, columns=columns) - df_malicious_value_check(result_df, columns) + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.df_malicious_value_check(result_df, columns) def test_compare_result_df_convert(self): value = float("nan") - result = compare_result_df_convert(value) + mode_config = ModeConfig() + highlight = HighLight(mode_config) + result = highlight.compare_result_df_convert(value) self.assertEqual(result, "nan\t") def test_highlight_rows_xlsx_red(self): @@ -335,7 +376,11 @@ class TestUtilsMethods(unittest.TestCase): result_df = pd.DataFrame(data, columns=columns) highlight_dict = {'red_rows': [0]} file_path = os.path.join(base_dir, 'result.xlsx') - highlight_rows_xlsx(result_df, highlight_dict, file_path) + + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.highlight_rows_xlsx(result_df, highlight_dict, file_path) + generate_result_xlsx(base_dir) self.assertTrue(compare_excel_files_with_highlight(file_path, os.path.join(base_dir, 'target_result.xlsx'))) @@ -348,7 +393,11 @@ class TestUtilsMethods(unittest.TestCase): result_df = pd.DataFrame(data, columns=columns) highlight_dict = {'yellow_rows': [0]} file_path = os.path.join(base_dir, 'result.xlsx') - highlight_rows_xlsx(result_df, highlight_dict, file_path) + + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.highlight_rows_xlsx(result_df, highlight_dict, file_path) + generate_result_xlsx(base_dir) self.assertTrue(compare_excel_files_with_highlight(file_path, os.path.join(base_dir, 'target_result_yellow.xlsx'))) @@ -366,7 +415,9 @@ class TestUtilsMethods(unittest.TestCase): temp_output_file = 'temp_output.txt' sys.stdout = open(temp_output_file, 'w') - highlight_rows_xlsx(result_df, highlight_dict, 
file_path) + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.highlight_rows_xlsx(result_df, highlight_dict, file_path) with open(temp_output_file, 'r') as f: output = f.read() @@ -391,7 +442,9 @@ class TestUtilsMethods(unittest.TestCase): temp_output_file = 'temp_output.txt' sys.stdout = open(temp_output_file, 'w') - highlight_rows_xlsx(result_df, highlight_dict, file_path) + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.highlight_rows_xlsx(result_df, highlight_dict, file_path) with open(temp_output_file, 'r') as f: output = f.read() @@ -429,7 +482,10 @@ class TestUtilsMethods(unittest.TestCase): 'red_lines': [(0, ['a', 'b'])], 'yellow_lines': [(0, ['c']), (1, ['d'])] } - update_highlight_err_msg(result_df, highlight_dict) + + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.update_highlight_err_msg(result_df, highlight_dict) t_data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', 'torch.float32', 'torch.float32', [2, 2], [2, 2], @@ -449,7 +505,9 @@ class TestUtilsMethods(unittest.TestCase): result_df = pd.DataFrame(data, columns=columns) highlight_dict = {} - result = update_highlight_err_msg(result_df, highlight_dict) + mode_config = ModeConfig(dump_mode=Const.MD5) + highlight = HighLight(mode_config) + result = highlight.update_highlight_err_msg(result_df, highlight_dict) self.assertEqual(result, None) @@ -466,5 +524,43 @@ class TestUtilsMethods(unittest.TestCase): 'red_lines': [(0, ['a', 'b'])], 'yellow_lines': [(0, ['c']), (1, ['d'])] } - result = update_highlight_err_msg(result_df, highlight_dict) + mode_config = ModeConfig() + highlight = HighLight(mode_config) + result = highlight.update_highlight_err_msg(result_df, highlight_dict) self.assertEqual(result, None) + + def test_find_error_rows(self): + api_batch = ApiBatch("Functional_batch_norm_0_forward", 0) + api_batch.input_len = 1 + 
api_batch.output_end_index = 4 + api_batch.params_end_index = 4 + summary_result = [summary_line_input, summary_line_1, summary_line_2, summary_line_3] + highlight_dict_test = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} + mode_config = ModeConfig() + highlight = HighLight(mode_config) + highlight.find_error_rows(summary_result, api_batch, highlight_dict_test) + self.assertEqual(highlight_dict_test, + {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []}) + + def test_find_compare_result_error_rows(self): + result = [line_input, line_1, line_2, line_3] + result_df = pd.DataFrame(result) + highlight_dict_test = {"red_rows": set(), "yellow_rows": set(), "red_lines": [], "yellow_lines": []} + mode_config = ModeConfig(dump_mode=Const.ALL) + highlight = HighLight(mode_config) + highlight.find_compare_result_error_rows(result_df, highlight_dict_test) + self.assertEqual(highlight_dict_test, { + "red_rows": {1, 3}, + "yellow_rows": {2}, + "red_lines": [ + (1, ["maximum or minimum is nan, -inf, or inf"]), + (3, ["maximum absolute error exceeds 1e+10"]) + ], + "yellow_lines": [ + (2, ["The output's one thousandth err ratio decreases by more than 0.1 compared to the input/parameter's"]), + (3, [ + "maximum absolute error of both input/parameters and output exceed 1, " + "with the output larger by an order of magnitude", + "The output's cosine decreases by more than 0.1 compared to the input/parameter's"]) + ] + }) diff --git a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_multiprocessing_compute.py b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_multiprocessing_compute.py index 49f084ce07c8e90afb2aa1c3340bb4c3965c8fa7..0180c08e87f6cb78c392223830214fccffb8c149 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_multiprocessing_compute.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/compare/test_cmp_multiprocessing_compute.py @@ -7,12 +7,12 @@ import unittest import 
pandas as pd -from msprobe.core.common.const import CompareConst, Const +from msprobe.core.common.const import Const, CompareConst from msprobe.core.common.utils import CompareException -from msprobe.core.compare.acc_compare import Comparator, ModeConfig -from msprobe.core.compare.multiprocessing_compute import ComparisonResult, _handle_multi_process, _save_cmp_result, \ - check_accuracy, read_dump_data -from test_acc_compare import generate_dump_json +from msprobe.core.compare.acc_compare import ModeConfig +from msprobe.core.compare.multiprocessing_compute import check_accuracy, CompareRealData, ComparisonResult +from msprobe.pytorch.compare.pt_compare import read_real_data +from test_acc_compare import generate_dump_json, generate_pt, generate_stack_json data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', 'torch.float32', 'torch.float32', [2, 2], [2, 2], @@ -28,10 +28,49 @@ columns = CompareConst.COMPARE_RESULT_HEADER + ['Data_name'] result_df = pd.DataFrame(data, columns=columns) o_result = pd.DataFrame(o_data, columns=columns) base_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_cmp_multiprocessing_compute') +base_dir3 = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_acc_compare_data3') +pt_dir = os.path.join(base_dir3, f'dump_data_dir') class TestUtilsMethods(unittest.TestCase): + def test_check_accuracy(self): + max_abs_err = '' + + cos_1 = CompareConst.SHAPE_UNMATCH + result_1 = check_accuracy(cos_1, max_abs_err) + self.assertEqual(result_1, CompareConst.ACCURACY_CHECK_UNMATCH) + + cos_2 = CompareConst.NONE + result_2 = check_accuracy(cos_2, max_abs_err) + self.assertEqual(result_2, CompareConst.NONE) + + cos_3 = 'N/A' + result_3 = check_accuracy(cos_3, max_abs_err) + self.assertEqual(result_3, CompareConst.ACCURACY_CHECK_NO) + + cos_4 = '' + result_4 = check_accuracy(cos_4, max_abs_err) + self.assertEqual(result_4, CompareConst.NONE) + + cos_5 = 0.95 + max_abs_err = 0.002 + result_5 = 
check_accuracy(cos_5, max_abs_err) + self.assertEqual(result_5, CompareConst.ACCURACY_CHECK_NO) + + cos_6 = 0.85 + max_abs_err = 2 + result_6 = check_accuracy(cos_6, max_abs_err) + self.assertEqual(result_6, CompareConst.ACCURACY_CHECK_NO) + + cos_7 = 0.95 + max_abs_err = 0.001 + result_7 = check_accuracy(cos_7, max_abs_err) + self.assertEqual(result_7, CompareConst.ACCURACY_CHECK_YES) + + +class TestCompareRealData(unittest.TestCase): + def setUp(self): self.result_df = pd.DataFrame(columns=[ CompareConst.COSINE, CompareConst.EUC_DIST, CompareConst.MAX_ABS_ERR, CompareConst.MAX_RELATIVE_ERR, @@ -39,35 +78,39 @@ class TestUtilsMethods(unittest.TestCase): CompareConst.ACCURACY, CompareConst.ERROR_MESSAGE ]) os.makedirs(base_dir, mode=0o750, exist_ok=True) + os.makedirs(base_dir3, mode=0o750, exist_ok=True) + os.makedirs(pt_dir, mode=0o750, exist_ok=True) self.lock = threading.Lock() def tearDown(self): if os.path.exists(base_dir): shutil.rmtree(base_dir) - - def test_handle_multi_process(self): - stack_mode = False - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - func = Comparator(mode_config).compare_ops - generate_dump_json(base_dir) - input_param = {'bench_json_path': os.path.join(base_dir, 'dump.json')} - lock = multiprocessing.Manager().RLock() - result = _handle_multi_process(func, input_param, result_df, lock) - self.assertTrue(result.equals(o_result)) + if os.path.exists(pt_dir): + shutil.rmtree(pt_dir) + if os.path.exists(base_dir3): + shutil.rmtree(base_dir3) def test_read_dump_data(self): - result = read_dump_data(result_df) + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + + # normal + result = compare_real_data.read_dump_data(result_df) self.assertEqual(result, {'Functional.linear.0.forward.input.0': ['-1', '-1']}) + # index 
error with self.assertRaises(CompareException) as context: - result = read_dump_data(pd.DataFrame()) + result = compare_real_data.read_dump_data(pd.DataFrame()) self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) def test_save_cmp_result_success(self): + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + comparison_result = ComparisonResult( cos_result=[0.99, 0.98], max_err_result=[0.01, 0.02], @@ -78,13 +121,18 @@ class TestUtilsMethods(unittest.TestCase): err_msgs=['', 'Error in comparison'] ) offset = 0 - updated_df = _save_cmp_result(offset, comparison_result, self.result_df, self.lock) + updated_df = compare_real_data._save_cmp_result(offset, comparison_result, self.result_df, self.lock) self.assertEqual(updated_df.loc[0, CompareConst.COSINE], 0.99) self.assertEqual(updated_df.loc[1, CompareConst.COSINE], 0.98) self.assertEqual(updated_df.loc[1, CompareConst.ERROR_MESSAGE], 'Error in comparison') def test_save_cmp_result_index_error(self): + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + comparison_result = ComparisonResult( cos_result=[0.99], max_err_result=[], @@ -95,39 +143,108 @@ class TestUtilsMethods(unittest.TestCase): err_msgs=[''] ) with self.assertRaises(CompareException) as context: - _save_cmp_result(0, comparison_result, self.result_df, self.lock) + compare_real_data._save_cmp_result(0, comparison_result, self.result_df, self.lock) self.assertEqual(context.exception.code, CompareException.INDEX_OUT_OF_BOUNDS_ERROR) - def test_check_accuracy(self): - max_abs_err = '' - - cos_1 = CompareConst.SHAPE_UNMATCH - result_1 = check_accuracy(cos_1, max_abs_err) - self.assertEqual(result_1, CompareConst.ACCURACY_CHECK_UNMATCH) - - cos_2 = CompareConst.NONE - result_2 = 
check_accuracy(cos_2, max_abs_err) - self.assertEqual(result_2, CompareConst.NONE) + def test_compare_by_op_bench_normal(self): + npu_op_name = 'Functional.linear.0.forward.input.0' + bench_op_name = 'Functional.linear.0.forward.input.0' + + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + + pt_name = '-1' + op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [pt_name, pt_name]} + input_param = {'npu_dump_data_dir': base_dir, 'bench_dump_data_dir': base_dir} + result = compare_real_data.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) + self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', + 'unsupported', 'No bench data matched.']) + + pt_name = 'Functional.linear.0.forward.input.0.pt' + op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [pt_name, pt_name]} + input_param = {'npu_dump_data_dir': base_dir, 'bench_dump_data_dir': base_dir} + result = compare_real_data.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) + self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', + 'unsupported', 'Dump file: Functional.linear.0.forward.input.0.pt not found.']) + + generate_pt(base_dir) + result = compare_real_data.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) + self.assertEqual(result, [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, '']) + + def test_compare_by_op_bench_na(self): + npu_op_name = 'Functional.linear.0.forward.input.0' + bench_op_name = 'N/A' + op_name_mapping_dict = {'Functional.linear.0.forward.input.0': [-1, -1]} + input_param = {} + + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + + result = 
compare_real_data.compare_by_op(npu_op_name, bench_op_name, op_name_mapping_dict, input_param) + self.assertEqual(result, ['unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', + 'unsupported', 'No bench data matched.']) + + def test_compare_ops(self): + generate_dump_json(base_dir3) + generate_stack_json(base_dir3) + generate_pt(pt_dir) + dump_path = os.path.join(base_dir3, 'dump.json') + stack_path = os.path.join(base_dir3, 'stack.json') + input_param = {'npu_json_path': dump_path, 'bench_json_path': dump_path, 'stack_json_path': stack_path, + 'is_print_compare_log': True, 'npu_dump_data_dir': pt_dir, 'bench_dump_data_dir': pt_dir} + dump_path_dict = {'Functional.linear.0.forward.input.0': ['Functional.linear.0.forward.input.0.pt', + 'Functional.linear.0.forward.input.0.pt']} + result_df = pd.DataFrame({ + 'NPU Name': ['Functional.linear.0.forward.input.0'], + 'Bench Name': ['Functional.linear.0.forward.input.0'] + }) + + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) + + updated_df = compare_real_data.compare_ops(idx=0, dump_path_dict=dump_path_dict, result_df=result_df, + lock=self.lock, input_param=input_param) + + self.assertEqual(updated_df.loc[0, CompareConst.COSINE], 1.0) + self.assertEqual(updated_df.loc[0, CompareConst.MAX_ABS_ERR], 0) + + def test_do_multi_process(self): + data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', + 'torch.float32', 'torch.float32', [2, 2], [2, 2], + '', '', '', '', '', '', 1, 1, 1, 1, 1, 1, 1, 1, 'Yes', '', ['-1', '-1']]] + o_data = [['Functional.linear.0.forward.input.0', 'Functional.linear.0.forward.input.0', + 'torch.float32', 'torch.float32', [2, 2], [2, 2], + 'unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', 'unsupported', + 1, 1, 1, 1, 1, 1, 1, 1, 'None', 'No bench data matched.', ['-1', '-1']]] + columns = 
CompareConst.COMPARE_RESULT_HEADER + ['Data_name'] + result_df = pd.DataFrame(data, columns=columns) + o_result = pd.DataFrame(o_data, columns=columns) + generate_dump_json(base_dir) + input_param = {'bench_json_path': os.path.join(base_dir, 'dump.json')} - cos_3 = 'N/A' - result_3 = check_accuracy(cos_3, max_abs_err) - self.assertEqual(result_3, CompareConst.ACCURACY_CHECK_NO) + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) - cos_4 = '' - result_4 = check_accuracy(cos_4, max_abs_err) - self.assertEqual(result_4, CompareConst.NONE) - - cos_5 = 0.95 - max_abs_err = 0.002 - result_5 = check_accuracy(cos_5, max_abs_err) - self.assertEqual(result_5, CompareConst.ACCURACY_CHECK_NO) + result = compare_real_data.do_multi_process(input_param, result_df) + self.assertTrue(result.equals(o_result)) - cos_6 = 0.85 - max_abs_err = 2 - result_6 = check_accuracy(cos_6, max_abs_err) - self.assertEqual(result_6, CompareConst.ACCURACY_CHECK_NO) + def test_handle_multi_process(self): + file_reader = read_real_data + mode_config = ModeConfig(dump_mode=Const.ALL) + cross_frame = False + compare_real_data = CompareRealData(file_reader, mode_config, cross_frame) - cos_7 = 0.95 - max_abs_err = 0.001 - result_7 = check_accuracy(cos_7, max_abs_err) - self.assertEqual(result_7, CompareConst.ACCURACY_CHECK_YES) + func = compare_real_data.compare_ops + generate_dump_json(base_dir) + input_param = {'bench_json_path': os.path.join(base_dir, 'dump.json')} + lock = multiprocessing.Manager().RLock() + result = compare_real_data._handle_multi_process(func, input_param, result_df, lock) + self.assertTrue(result.equals(o_result)) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/bench.sh b/debug/accuracy_tools/msprobe/test/core_ut/config_check/bench.sh similarity index 100% rename from 
debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/bench.sh
rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/bench.sh
diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/cmp.sh b/debug/accuracy_tools/msprobe/test/core_ut/config_check/cmp.sh
similarity index 100%
rename from debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/cmp.sh
rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/cmp.sh
diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_config_checking.py b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_config_check.py
similarity index 48%
rename from debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_config_checking.py
rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/test_config_check.py
index 27b6b6e4364ff440a74d9619d1439e349a696efe..9234cf0e0076a9bb268d96120d3de813c53a7c29 100644
--- a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_config_checking.py
+++ b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_config_check.py
@@ -6,18 +6,23 @@ import torch
 import json
 import numpy as np
 import torch.nn as nn
-from msprobe.pytorch.config_checking.config_checker import ConfigChecker
-from msprobe.pytorch.config_checking.checkers.pip_checker import PipPackageChecker
-from msprobe.pytorch.config_checking.checkers.random_checker import RandomChecker
-from msprobe.pytorch.config_checking.checkers.dataset_checker import DatasetChecker
-from msprobe.pytorch.config_checking.checkers.weights_checker import WeightsChecker
-from msprobe.pytorch.config_checking.checkers.random_checker import apply_patches
+import mindspore as ms
+import mindspore.nn as ms_nn
+from mindspore import Tensor
+from msprobe.core.config_check.config_checker import ConfigChecker
+from msprobe.core.config_check.checkers.pip_checker import PipPackageChecker
+from msprobe.core.config_check.checkers.random_checker import RandomChecker
+from msprobe.core.config_check.checkers.dataset_checker import DatasetChecker +from msprobe.core.config_check.checkers.weights_checker import WeightsChecker from msprobe.core.common.file_utils import read_xlsx +from msprobe.core.common.framework_adapter import FmkAdp + testdir = os.path.dirname(__file__) config_checking_dir = os.path.dirname(testdir) temp_dir = os.path.join(config_checking_dir, "temp") os.makedirs(temp_dir, exist_ok=True) +ms.set_context(device_target="CPU") def seed_all(seed=1234, mode=False): @@ -26,9 +31,10 @@ def seed_all(seed=1234, mode=False): np.random.seed(seed) torch.manual_seed(seed) torch.use_deterministic_algorithms(mode) + ms.set_seed(seed) -class MockModule(nn.Module): +class MockPyTorchModule(nn.Module): def __init__(self): super().__init__() self.linear = nn.Linear(10, 5) @@ -40,49 +46,82 @@ class MockModule(nn.Module): return x2 +class MockMindSporeModule(ms_nn.Cell): + def __init__(self): + super().__init__() + self.linear = ms_nn.Dense(10, 5) + self.relu = ms_nn.ReLU() + + def construct(self, x): + x1 = self.linear(x) + x2 = self.relu(x1) + return x2 + + def get_test_dataset(): inputs = [torch.rand(10, 10) for _ in range(10)] labels = [torch.randint(0, 5, (10,)) for _ in range(10)] - return zip(inputs, labels) + ms_inputs = [Tensor(input.numpy()) for input in inputs] + ms_labels = [Tensor(label.numpy()) for label in labels] + return zip(inputs, labels), zip(ms_inputs, ms_labels) -def get_test_model(): - test_module = MockModule() - nn.init.constant_(test_module.linear.weight, 1.0) - nn.init.constant_(test_module.linear.bias, 1.0) - return test_module +def get_test_model(use_pytorch=True): + if use_pytorch: + test_module = MockPyTorchModule() + nn.init.constant_(test_module.linear.weight, 1.0) + nn.init.constant_(test_module.linear.bias, 1.0) + return test_module + else: + test_module = MockMindSporeModule() + for param in test_module.get_parameters(): + param.set_data(ms.Tensor(np.ones(param.data.shape), dtype=param.data.dtype)) 
+ return test_module -@unittest.mock.patch("msprobe.pytorch.config_checking.checkers.pip_checker.collect_pip_data") -@unittest.mock.patch("msprobe.pytorch.config_checking.checkers.env_args_checker.collect_env_data") +@unittest.mock.patch("msprobe.core.config_check.checkers.pip_checker.collect_pip_data") +@unittest.mock.patch("msprobe.core.config_check.checkers.env_args_checker.collect_env_data") def train_test(seed, output_zip_path, shell_path, mock_env, mock_pip): - mock_env.return_value = {"HCCL_DETERMINISTIC": False} if seed == 1234: mock_pip.return_value = "transformers=0.0.1" + mock_env.return_value = {"NCCL_DETERMINISTIC": True} else: mock_pip.return_value = "transformers=0.0.2" + mock_env.return_value = {"HCCL_DETERMINISTIC": False, "ASCEND_LAUNCH_BLOCKING": 1} seed_all(seed) - loss_fun = nn.CrossEntropyLoss() - test_module = get_test_model() - optimizer = torch.optim.SGD(test_module.parameters(), lr=1e-2) + use_pytorch = seed == 1234 + test_dataset, ms_test_dataset = get_test_dataset() + test_module = get_test_model(use_pytorch) - ConfigChecker(test_module, shell_path, output_zip_path) + if use_pytorch: + loss_fun = nn.CrossEntropyLoss() + optimizer = torch.optim.SGD(test_module.parameters(), lr=1e-2) + ConfigChecker(test_module, shell_path, output_zip_path) - try: - for input_data, label in get_test_dataset(): + for input_data, label in test_dataset: output = test_module(input_data, y=input_data) loss = loss_fun(output, label) optimizer.zero_grad() loss.backward() optimizer.step() - except Exception: - pass + + else: + loss_fun = ms_nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction='mean') + optimizer = ms_nn.SGD(test_module.trainable_params(), learning_rate=1e-2) + train_network = ms_nn.TrainOneStepCell(ms_nn.WithLossCell(test_module, loss_fun), optimizer) + ConfigChecker(test_module, shell_path, output_zip_path, fmk="mindspore") + + for input_data, label in ms_test_dataset: + loss = train_network(input_data, label) + class 
TestConfigChecker(unittest.TestCase): def tearDown(self): + FmkAdp.set_fmk("pytorch") shutil.rmtree(temp_dir) + def test_all(self): train_test(1234, os.path.join(temp_dir, "config_check_pack1.zip"), [os.path.join(testdir, "cmp.sh")]) @@ -90,7 +129,8 @@ class TestConfigChecker(unittest.TestCase): ConfigChecker.pre_forward_fun_list = [] ConfigChecker.step = 0 RandomChecker.write_once = False - apply_patches() + ConfigChecker.apply_patches("pytorch") + ConfigChecker.apply_patches("mindspore") train_test(1233, os.path.join(temp_dir, "config_check_pack2.zip"), [os.path.join(testdir, "bench.sh")]) @@ -100,32 +140,34 @@ class TestConfigChecker(unittest.TestCase): compare_output_dir = os.path.join(temp_dir, "compare_output") - total_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename)) self.assertEqual(total_check_result.columns.tolist(), ConfigChecker.result_header) target_total_check_result = [ - ['env', True], - ['pip', False], - ['dataset', False], - ['weights', False], - ['hyperparameters', True], + ['env', False], + ['pip', False], + ['dataset', False], + ['weights', False], + ['hyperparameters', False], ['random', False] - ] + ] self.assertEqual(total_check_result.values.tolist(), target_total_check_result) - pip_data_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), sheet_name=PipPackageChecker.target_name_in_zip) + pip_data_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), + sheet_name=PipPackageChecker.target_name_in_zip) self.assertEqual(pip_data_check_result.columns.tolist(), PipPackageChecker.result_header) self.assertEqual(pip_data_check_result.iloc[0].tolist(), ['transformers', '0.0.1', '0.0.2', 'error']) - random_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), sheet_name=RandomChecker.target_name_in_zip) + random_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), 
+ sheet_name=RandomChecker.target_name_in_zip) self.assertEqual(random_check_result.columns.tolist(), RandomChecker.result_header) - self.assertEqual(len(random_check_result), 3) + self.assertEqual(len(random_check_result), 5) - dataset_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), sheet_name=DatasetChecker.target_name_in_zip) + dataset_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), + sheet_name=DatasetChecker.target_name_in_zip) self.assertEqual(dataset_check_result.columns.tolist(), DatasetChecker.result_header) self.assertEqual(len(dataset_check_result), 20) - weight_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), sheet_name=WeightsChecker.target_name_in_zip) + weight_check_result = read_xlsx(os.path.join(compare_output_dir, ConfigChecker.result_filename), + sheet_name=WeightsChecker.target_name_in_zip) self.assertEqual(weight_check_result.columns.tolist(), WeightsChecker.result_header) self.assertEqual(len(weight_check_result), 20) - diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_dataset_checker.py b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_dataset_checker.py similarity index 83% rename from debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_dataset_checker.py rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/test_dataset_checker.py index 0898a4f8cf22847b87b66061e38a50c80bac9127..27db8e04d0890d579fae4ec02a7260102f08979b 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_dataset_checker.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_dataset_checker.py @@ -3,18 +3,12 @@ import torch import pandas as pd from unittest.mock import patch, MagicMock -from msprobe.pytorch.config_checking.checkers.dataset_checker import compare_dataset, \ - compare_dataset_dicts, parse_args_and_kargs, process_obj, process_tensor 
+from msprobe.core.config_check.checkers.dataset_checker import compare_dataset, \ + compare_dataset_dicts, parse_args_and_kargs, process_obj class TestTensorProcessing(unittest.TestCase): - def test_process_tensor(self): - tensor = torch.tensor([1.0, 2.0, 3.0]) - result = process_tensor(tensor) - self.assertEqual(isinstance(result, dict), True) - self.assertEqual(set(result.keys()), {'max', 'min', 'mean', 'norm'}) - def test_process_obj_tensor(self): tensor = torch.tensor([1.0, 2.0, 3.0]) result = process_obj(tensor) @@ -66,10 +60,10 @@ class TestTensorProcessing(unittest.TestCase): self.assertEqual(len(results), 1) self.assertEqual(results[0]['tag'], 'a.b') - @patch('os.listdir', return_value=['step1']) + @patch('os.listdir', side_effect=[["step1"], ["rank1"]]) @patch('os.path.isdir', return_value=True) @patch('os.path.isfile', return_value=True) - @patch('msprobe.pytorch.config_checking.checkers.dataset_checker.load_json') + @patch('msprobe.core.config_check.checkers.dataset_checker.load_json') def test_compare_dataset(self, mock_load_json, mock_isfile, mock_isdir, mock_listdir): mock_load_json.return_value = {'a': {'max': 1.0, 'min': 0.0, 'mean': 0.5, 'norm': 0.7}} bench_dir = 'bench' diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_random_checker.py b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_random_checker.py similarity index 94% rename from debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_random_checker.py rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/test_random_checker.py index 4b04351862796ccd433d7136e318f47dca856349..9a6bdb89f83d0c154c86d6522ffc76bb042cff4d 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_random_checker.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_random_checker.py @@ -2,14 +2,14 @@ import unittest import pandas as pd from unittest.mock import patch, MagicMock -from 
msprobe.pytorch.config_checking.checkers.random_checker import compare_json_files, compare_random, get_file_and_line
+from msprobe.core.config_check.checkers.random_checker import compare_json_files, compare_random, get_file_and_line
 class TestCompareRandom(unittest.TestCase):
     @patch('os.listdir', return_value=['rank1.json', 'rank2.json'])
     @patch('os.path.join', return_value='test_path')
-    @patch("msprobe.pytorch.config_checking.checkers.random_checker.load_json")
+    @patch("msprobe.core.config_check.checkers.random_checker.load_json")
     def test_compare_random_with_files(self, mock_load_json, mock_path, mock_listdir):
         mock_load_json.return_value = {"op1": {"position1": 1}}
         bench_dir = 'test_bench'
diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_weight_checker.py b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_weight_checker.py
similarity index 78%
rename from debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_weight_checker.py
rename to debug/accuracy_tools/msprobe/test/core_ut/config_check/test_weight_checker.py
index 7b76268d23799d31b7fc90fd091f83eb6cbba577..4920c034455920075c617f7db90a26fbe826b10c 100644
--- a/debug/accuracy_tools/msprobe/test/pytorch_ut/config_checking/test_weight_checker.py
+++ b/debug/accuracy_tools/msprobe/test/core_ut/config_check/test_weight_checker.py
@@ -4,11 +4,11 @@ import pandas as pd
 import os
 import torch
-from msprobe.pytorch.config_checking.checkers.weights_checker import collect_weights_data, compare_weight, compare_weight_file
+from msprobe.core.config_check.checkers.weights_checker import collect_weights_data, compare_weight, compare_weight_file
 class TestWeightComparison(unittest.TestCase):
-    @patch('msprobe.pytorch.config_checking.utils.utils.get_tensor_features')
+    @patch('msprobe.core.config_check.utils.utils.get_tensor_features')
     @patch('torch.nn.Module.named_parameters')
     def test_collect_weights_data(self, mock_named_parameters, mock_get_tensor_features):
         mock_model = unittest.mock.create_autospec(torch.nn.Module)
@@ -17,7 +17,7 @@ class TestWeightComparison(unittest.TestCase):
         result = collect_weights_data(mock_model)
         self.assertEqual(isinstance(result, dict), True)
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.load_json')
+    @patch('msprobe.core.config_check.checkers.weights_checker.load_json')
     def test_compare_weight_file(self, mock_load_json):
         mock_load_json.side_effect = [
             {'weight1': {'max': 1, 'min': 0, 'mean': 0.5, 'norm': 1}},
@@ -26,8 +26,8 @@ class TestWeightComparison(unittest.TestCase):
         result = compare_weight_file('bench.json', 'cmp.json')
         self.assertEqual(isinstance(result, list), True)
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.os_walk_for_files')
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.load_json')
+    @patch('msprobe.core.config_check.checkers.weights_checker.os_walk_for_files')
+    @patch('msprobe.core.config_check.checkers.weights_checker.load_json')
     @patch('os.path.exists')
     def test_compare_weight(self, mock_exists, mock_load_json, mock_os_walk_for_files):
         mock_os_walk_for_files.return_value = [
@@ -38,7 +38,7 @@ class TestWeightComparison(unittest.TestCase):
         result = compare_weight('bench', 'cmp')
         self.assertEqual(isinstance(result, pd.DataFrame), True)
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.load_json')
+    @patch('msprobe.core.config_check.checkers.weights_checker.load_json')
     def test_compare_weight_file_different_weights(self, mock_load_json):
         mock_load_json.side_effect = [
             {'weight1': {'max': 1, 'min': 0, 'mean': 0.5, 'norm': 1}},
@@ -50,8 +50,8 @@ class TestWeightComparison(unittest.TestCase):
             if res["weight_name"] == "weight1":
                 self.assertEqual(res["equal"], False)
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.os_walk_for_files')
-    @patch('msprobe.pytorch.config_checking.checkers.weights_checker.load_json')
+
@patch('msprobe.core.config_check.checkers.weights_checker.os_walk_for_files') + @patch('msprobe.core.config_check.checkers.weights_checker.load_json') @patch('os.path.exists') def test_compare_weight_cmp_file_missing(self, mock_exists, mock_load_json, mock_os_walk_for_files): mock_os_walk_for_files.return_value = [ @@ -63,8 +63,8 @@ class TestWeightComparison(unittest.TestCase): self.assertEqual(isinstance(result, pd.DataFrame), True) self.assertEqual(len(result[result["equal"] == "only bench have"]), 1) - @patch('msprobe.pytorch.config_checking.checkers.weights_checker.os_walk_for_files') - @patch('msprobe.pytorch.config_checking.checkers.weights_checker.load_json') + @patch('msprobe.core.config_check.checkers.weights_checker.os_walk_for_files') + @patch('msprobe.core.config_check.checkers.weights_checker.load_json') @patch('os.path.exists') def test_compare_weight_multiple_files(self, mock_exists, mock_load_json, mock_os_walk_for_files): mock_os_walk_for_files.return_value = [ diff --git a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_base.py b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_base.py index 594336e20b252dfdf9fe724a23645c809241c577..f9b6bd4d8a2266e0f449239a7df87d5caf9d1b10 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_base.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_base.py @@ -70,23 +70,22 @@ class TestBaseDataProcessor(unittest.TestCase): @patch('inspect.stack') def test_analyze_api_call_stack(self, mock_stack): mock_stack.return_value = [ - (None, 'file0.py', 0, 'function0', ['code line 0'], None), - (None, 'file1.py', 10, 'function1', ['code line 1'], None), - (None, 'file2.py', 20, 'function2', ['code line 2'], None), (None, 'file3.py', 30, 'function3', ['code line 3'], None), - (None, 'file4.py', 40, 'function4', ['code line 4'], None), - (None, 'file5.py', 50, 'function5', ['code line 5'], None), - (None, 'file6.py', 
60, 'function6', ['code line 6'], None), - (None, 'file7.py', 70, 'function7', ['code line 7'], None), + (None, 'file1.py', 40, 'function1', ['code line 1'], None), + (None, 'file2.py', 50, 'function2', ['code line 2'], None), + (None, 'file3.py', 60, 'function3', ['code line 3'], None), + (None, 'file1.py', 70, 'function1', ['code line 1'], None), + (None, 'file1.py', 80, 'function1', ['code line 1'], None), + (None, 'file2.py', 90, 'function2', ['code line 2'], None), + (None, 'file3.py', 100, 'function3', ['code line 3'], None) ] result = BaseDataProcessor.analyze_api_call_stack('test_stack') - expected_output = { - 'test_stack': [ - 'File file5.py, line 50, in function5, \n code line 5', - 'File file6.py, line 60, in function6, \n code line 6', - 'File file7.py, line 70, in function7, \n code line 7', - ] - } + expected_output = ( + 'File file1.py, line 80, in function1, \n code line 1', + 'File file2.py, line 90, in function2, \n code line 2', + 'File file3.py, line 100, in function3, \n code line 3', + ) + self.assertEqual(result, expected_output) def test_analyze_builtin(self): @@ -127,8 +126,8 @@ class TestBaseDataProcessor(unittest.TestCase): expected = {"type": 'int8', "value": 1} self.assertEqual(result, expected) - result = BaseDataProcessor._analyze_numpy(np.complex128(1+2j)) - expected = {"type": 'complex128', "value": (1+2j)} + result = BaseDataProcessor._analyze_numpy(np.complex128(1 + 2j)) + expected = {"type": 'complex128', "value": (1 + 2j)} self.assertEqual(result, expected) def test_get_special_types(self): @@ -144,7 +143,7 @@ class TestBaseDataProcessor(unittest.TestCase): 'Max': 6, 'Min': 1, 'Mean': 3.5, - 'Norm':9.539392014169456 + 'Norm': 9.539392014169456 } self.assertEqual(result, expected_result) @@ -165,6 +164,7 @@ class TestBaseDataProcessor(unittest.TestCase): transform = lambda x, _: x * 2 Test = namedtuple("Test", ['a']) myNamedTuple = Test(1) + @dataclass class MyDataClass: last_hidden_state: int = None @@ -176,7 +176,7 @@ class 
TestBaseDataProcessor(unittest.TestCase): hidden_states=(2, 3), attentions=(4, 5) ) - expected_dataclass_res = {'last_hidden_state': 2, 'hidden_states': [4, 6], 'attentions': [8,10]} + expected_dataclass_res = {'last_hidden_state': 2, 'hidden_states': [4, 6], 'attentions': [8, 10]} self.assertEqual(BaseDataProcessor.recursive_apply_transform(2, transform), 4) self.assertEqual(BaseDataProcessor.recursive_apply_transform(myData, transform), expected_dataclass_res) self.assertEqual(BaseDataProcessor.recursive_apply_transform(myNamedTuple, transform), {'a': 2}) @@ -311,9 +311,9 @@ class TestBaseDataProcessor(unittest.TestCase): self.assertEqual(dst_data_structure, excepted_result) def test_analyze_element_to_all_none(self): - element = {"key1": [12, 3, {"key2": 10, "key3":["12"]}]} + element = {"key1": [12, 3, {"key2": 10, "key3": ["12"]}]} result = self.processor.analyze_element_to_all_none(element) - excepted_result = {"key1": [None, None, {"key2": None, "key3":[None]}]} + excepted_result = {"key1": [None, None, {"key2": None, "key3": [None]}]} self.assertEqual(result, excepted_result) @patch.object(MindsporeDataProcessor, "is_hookable_element", return_value=True) @@ -358,4 +358,4 @@ class TestBaseDataProcessor(unittest.TestCase): nested_data_structure, ["grad_name_1", "layer1", "layer2"], "grad_data_info" ) self.assertIsNone(self.processor.save_name) - self.assertEqual(result, grad) \ No newline at end of file + self.assertEqual(result, grad) diff --git a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_mindspore_processor.py b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_mindspore_processor.py index 46cc3b44747a548bee655927c7b5cef69b84d586..25141a9e774d6a6ca05be9b668de96a7d57cb373 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_mindspore_processor.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_mindspore_processor.py @@ -22,6 +22,7 @@ import 
mindspore as ms from mindspore import Tensor, ops, mint import numpy as np +from msprobe.core.common.const import Const from msprobe.core.data_dump.data_processor.base import BaseDataProcessor from msprobe.core.data_dump.data_processor.mindspore_processor import ( MindsporeDataProcessor, @@ -73,6 +74,20 @@ class TestMindsporeDataProcessor(unittest.TestCase): self.assertEqual(result.mean, 2.0) self.assertEqual(result.norm, ms.ops.norm(tensor).item()) + def test_get_stat_info_float_async(self): + self.config.async_dump = True + tensor = ms.tensor([1.0, 2.0, 3.0]) + result = self.processor.get_stat_info(tensor) + result_max = result.max + result_min = result.min + result_mean = result.mean + result_norm = result.norm + + self.assertEqual(result_max.item(), 3.0) + self.assertEqual(result_min.item(), 1.0) + self.assertEqual(result_mean.item(), 2.0) + self.assertEqual(result_norm.item(), ms.ops.norm(tensor).item()) + def test_get_stat_info_int(self): self.config.async_dump = False tensor = ms.Tensor([1, 2, 3], dtype=ms.int32) @@ -82,6 +97,17 @@ class TestMindsporeDataProcessor(unittest.TestCase): self.assertEqual(result.mean, 2) self.assertEqual(result.norm, ms.ops.norm(tensor).item()) + def test_get_stat_info_int_async(self): + self.config.async_dump = True + tensor = ms.tensor([1, 2, 3]) + result = self.processor.get_stat_info(tensor) + + result_max = result.max + result_min = result.min + + self.assertEqual(result_max.item(), 3.0) + self.assertEqual(result_min.item(), 1.0) + def test_get_stat_info_bool(self): self.config.async_dump = False tensor = ms.Tensor([True, False, True]) @@ -91,12 +117,71 @@ class TestMindsporeDataProcessor(unittest.TestCase): self.assertIsNone(result.mean) self.assertIsNone(result.norm) + def test_get_stat_info_bool_async(self): + self.config.async_dump = True + tensor = ms.Tensor([True, False, True]) + result = self.processor.get_stat_info(tensor) + + result_max = result.max + result_min = result.min + + self.assertEqual(result_max.item(), 
True) + self.assertEqual(result_min.item(), False) + + @patch.object(MindsporeDataProcessor, 'get_md5_for_tensor') + def test__analyze_tensor(self, get_md5_for_tensor): + get_md5_for_tensor.return_value = "test_md5" + tensor = ms.Tensor(np.array([1, 2, 3], dtype=np.int32)) + self.config.summary_mode = 'md5' + self.config.async_dump = False + suffix = "test_tensor" + expected_result = { + 'type': 'mindspore.Tensor', + 'dtype': 'Int32', + 'shape': (3,), + 'md5': 'test_md5', + } + result = self.processor._analyze_tensor(tensor, suffix) + # Remove unnecessary fields + result.pop('tensor_stat_index', None) + + self.assertEqual(result, expected_result) + + +class TestTensorDataProcessor(unittest.TestCase): + + def setUp(self): + self.config = MagicMock() + self.data_writer = MagicMock() + self.processor = TensorDataProcessor(self.config, self.data_writer) + self.data_writer.dump_tensor_data_dir = "./dump_data" + self.processor.current_api_or_module_name = "test_api" + self.processor.api_data_category = "input" + + @patch('msprobe.core.data_dump.data_processor.mindspore_processor.save_tensor_as_npy') + def test_analyze_tensor(self, mock_save): + self.config.framework = "mindspore" + self.config.async_dump = False + tensor = ms.Tensor([1.0, 2.0, 3.0]) + suffix = 'suffix' + result = self.processor._analyze_tensor(tensor, suffix) + mock_save.assert_called_once() + expected = { + 'type': 'mindspore.Tensor', + 'dtype': str(tensor.dtype), + 'shape': tensor.shape, + 'data_name': 'test_api.input.suffix.npy' + } + result.pop('tensor_stat_index', None) + self.assertEqual(expected, result) + class TestOverflowCheckDataProcessor(unittest.TestCase): def setUp(self): class Config: def __init__(self): self.overflow_nums = 1 + self.data_processor = OverflowCheckDataProcessor(Config(), None) def test___init__(self): @@ -107,6 +192,7 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): def test_analyze_forward(self): def func(_): self.data_processor.has_overflow = True + with 
patch.object(BaseDataProcessor, "analyze_forward", return_value={"min", 0}): with patch.object(OverflowCheckDataProcessor, "maybe_save_overflow_data"): api_info = self.data_processor.analyze_forward("name", "module", "module_input_output") @@ -120,6 +206,7 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): def test_analyze_backward(self): def func(_): self.data_processor.has_overflow = True + with patch.object(BaseDataProcessor, "analyze_backward", return_value={"min", 0}): with patch.object(OverflowCheckDataProcessor, "maybe_save_overflow_data"): api_info = self.data_processor.analyze_backward("name", "module", "module_input_output") @@ -151,6 +238,87 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): self.data_processor.overflow_nums = 3 self.assertFalse(self.data_processor.is_terminated) + # from unittest.mock import MagicMock + + def test__analyze_maybe_overflow_tensor(self): + # Mock the DataWriter and its related methods + self.data_processor.data_writer = MagicMock() + + tensor_json = {Const.TENSOR_STAT_INDEX: 1} # Fix: add a valid tensor_stat_index + + # Mocked return values + self.data_processor.data_writer.get_buffer_values_max.return_value = 10 + self.data_processor.data_writer.get_buffer_values_min.return_value = -10 + + self.data_processor.has_overflow = False + # Call the function and check that no overflow is flagged + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertFalse(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # max is -np.inf, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_max.return_value = -np.inf + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # max is np.inf, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_max.return_value = np.inf + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # max is 
np.nan, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_max.return_value = np.nan + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # max is 0, which should not trigger overflow + self.data_processor.data_writer.get_buffer_values_max.return_value = 0 + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertFalse(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # min is -np.inf, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_min.return_value = -np.inf + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # min is np.inf, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_min.return_value = np.inf + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + self.data_processor.has_overflow = False + # min is np.nan, which should trigger overflow + self.data_processor.data_writer.get_buffer_values_min.return_value = np.nan + self.data_processor._analyze_maybe_overflow_tensor(tensor_json) + self.assertTrue(self.data_processor.has_overflow) + + @patch("msprobe.core.data_dump.data_processor.mindspore_processor.logger.warning") + @patch.object(OverflowCheckDataProcessor, "get_save_file_path") + @patch.object(MindsporeDataProcessor, "_analyze_tensor") + def test__analyze_tensor(self, mock_super, mock_get_file_path, mock_warning): + mock_get_file_path.return_value = ("dump_data_name", "file_path") + single_arg = {"Max": None} + mock_super.return_value = single_arg + + with patch("msprobe.core.data_dump.data_processor.mindspore_processor.path_len_exceeds_limit", + return_value=False): + ret = self.data_processor._analyze_tensor("tensor", "suffix") + self.assertEqual(self.data_processor.cached_tensors_and_file_paths, {"file_path": "tensor"}) + 
mock_warning.assert_called_with("tensor_stat_index does not exist in tensor_json.") + mock_super.assert_called_with("tensor", "suffix") + self.assertEqual(ret.get("Max"), None) + self.assertEqual(ret.get("data_name"), "dump_data_name") + + with patch("msprobe.core.data_dump.data_processor.mindspore_processor.path_len_exceeds_limit", + return_value=True): + self.data_processor._analyze_tensor("tensor", "suffix") + mock_warning.assert_called_with("tensor_stat_index does not exist in tensor_json.") + class TestKernelDumpDataProcessor(unittest.TestCase): def setUp(self): @@ -175,7 +343,8 @@ class TestKernelDumpDataProcessor(unittest.TestCase): def test_analyze_pre_forward_without_adump(self, mock_logger_warning): self.processor.enable_kernel_dump = True self.processor.analyze_forward_input("test_api_name", None, None) - mock_logger_warning.assert_called_with("The current msprobe package does not compile adump, and kernel dump cannot be used.") + mock_logger_warning.assert_called_with( + "The current msprobe package does not compile adump, and kernel dump cannot be used.") self.assertFalse(self.processor.enable_kernel_dump) @patch('msprobe.core.data_dump.data_processor.mindspore_processor.KernelDumpDataProcessor.stop_kernel_dump') @@ -201,7 +370,8 @@ class TestKernelDumpDataProcessor(unittest.TestCase): self.processor.enable_kernel_dump = True self.processor.analyze_backward_input("test_api_name", None, None) self.assertFalse(self.processor.enable_kernel_dump) - mock_logger_warning.assert_called_with("The current msprobe package does not compile adump, and kernel dump cannot be used.") + mock_logger_warning.assert_called_with( + "The current msprobe package does not compile adump, and kernel dump cannot be used.") @patch('msprobe.core.data_dump.data_processor.mindspore_processor.KernelDumpDataProcessor.stop_kernel_dump') @patch.object(logger, 'info') diff --git a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_pytorch_processor.py 
b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_pytorch_processor.py index a784886f949e09f49954274608f1c703d103531e..9fb305197007d85b01fe3412d5b0e9117aa63e9c 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_pytorch_processor.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/data_processor/test_pytorch_processor.py @@ -1,19 +1,3 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -# Copyright (C) 2024-2025. Huawei Technologies Co., Ltd. All rights reserved. -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-""" import hashlib import os import sys @@ -96,14 +80,42 @@ class TestPytorchDataProcessor(unittest.TestCase): self.assertEqual(result.mean, 2.0) self.assertEqual(result.norm, torch.norm(tensor).item()) + def test_get_stat_info_float_async(self): + tensor = torch.tensor([1.0, 2.0, 3.0]) + result = self.processor.get_stat_info_async(tensor) + + result_max = result.max + result_min = result.min + result_mean = result.mean + result_norm = result.norm + + self.assertEqual(result_max.item(), 3.0) + self.assertEqual(result_min.item(), 1.0) + self.assertEqual(result_mean.item(), 2.0) + self.assertEqual(result_norm.item(), torch.norm(tensor).item()) + def test_get_stat_info_int(self): tensor = torch.tensor([1, 2, 3], dtype=torch.int32) result = self.processor.get_stat_info(tensor) + self.assertEqual(result.max, 3) self.assertEqual(result.min, 1) self.assertEqual(result.mean, 2) self.assertEqual(result.norm, torch.norm(tensor.float()).item()) + def test_get_stat_info_int_async(self): + tensor = torch.tensor([1, 2, 3]) + result = self.processor.get_stat_info_async(tensor) + + result_max = result.max + result_min = result.min + result_mean = result.mean + result_norm = result.norm + + self.assertEqual(result_max.item(), 3.0) + self.assertEqual(result_min.item(), 1.0) + self.assertEqual(result_mean.item(), 2.0) + self.assertEqual(result_norm.item(), torch.norm(tensor.float()).item()) def test_get_stat_info_empty(self): tensor = torch.tensor([]) @@ -121,6 +133,16 @@ class TestPytorchDataProcessor(unittest.TestCase): self.assertIsNone(result.mean) self.assertIsNone(result.norm) + def test_get_stat_info_bool_async(self): + tensor = torch.tensor([True, False, True]) + result = self.processor.get_stat_info_async(tensor) + + result_max = result.max + result_min = result.min + + self.assertEqual(result_max.item(), True) + self.assertEqual(result_min.item(), False) + def test_get_stat_info_with_scalar_tensor(self): scalar_tensor = torch.tensor(42.0) result = 
PytorchDataProcessor.get_stat_info(scalar_tensor) @@ -242,6 +264,7 @@ class TestPytorchDataProcessor(unittest.TestCase): class TestReduceOp: def __str__(self): raise Exception("failed to convert str type") + arg = TestReduceOp() self.processor._analyze_reduce_op(arg) mock_logger_warning.assert_called_with( @@ -298,9 +321,9 @@ class TestPytorchDataProcessor(unittest.TestCase): expected = {"type": 'int8', "value": 1} self.assertEqual(result, expected) - numpy_element = np.complex128(1+2j) + numpy_element = np.complex128(1 + 2j) result = self.processor.analyze_single_element(numpy_element, []) - expected = {"type": 'complex128', "value": (1+2j)} + expected = {"type": 'complex128', "value": (1 + 2j)} self.assertEqual(result, expected) def test_analyze_single_element_tensor(self): @@ -320,6 +343,32 @@ class TestPytorchDataProcessor(unittest.TestCase): expected_result = self.processor._analyze_builtin(Ellipsis) self.assertEqual(result, expected_result) + @patch.object(PytorchDataProcessor, 'get_md5_for_tensor') + def test_analyze_tensor(self, get_md5_for_tensor): + get_md5_for_tensor.return_value = 'mocked_md5' + tensor = torch.tensor([1.0, 2.0, 3.0]) + self.config.summary_mode = 'md5' + self.config.async_dump = False + result = self.processor._analyze_tensor(tensor, 'suffix') + expected = { + 'type': 'torch.Tensor', + 'dtype': str(tensor.dtype), + 'shape': tensor.shape, + 'requires_grad': tensor.requires_grad, + 'md5': 'mocked_md5' + } + result.pop('tensor_stat_index', None) + self.assertDictEqual(expected, result) + + def test_analyze_tensor_with_empty_tensor(self): + tensor = torch.tensor([]) + result = self.processor._analyze_tensor(tensor, 'suffix') + + self.assertEqual(result['type'], "torch.Tensor") + self.assertEqual(result['dtype'], 'torch.float32') + self.assertEqual(result['shape'], torch.Size([0])) + self.assertEqual(result['requires_grad'], False) + def test_cast_to_float_if_fp8(self): tensor = MagicMock() tensor.dtype = "torch.float8_e5m2" @@ -341,6 +390,24 
@@ class TestTensorDataProcessor(unittest.TestCase): self.processor.current_api_or_module_name = "test_api" self.processor.api_data_category = "input" + @patch('torch.save') + def test_analyze_tensor(self, mock_save): + self.config.framework = "pytorch" + self.config.async_dump = False + tensor = torch.tensor([1.0, 2.0, 3.0]) + suffix = 'suffix' + result = self.processor._analyze_tensor(tensor, suffix) + mock_save.assert_called_once() + expected = { + 'type': 'torch.Tensor', + 'dtype': 'torch.float32', + 'shape': tensor.shape, + 'requires_grad': False, + 'data_name': 'test_api.input.suffix.pt' + } + result.pop('tensor_stat_index', None) + self.assertEqual(expected, result) + class TestOverflowCheckDataProcessor(unittest.TestCase): @@ -356,6 +423,9 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): sys.modules['torch_npu'] = Mock() sys.modules['torch_npu.npu'] = Mock() sys.modules['torch_npu.npu.utils'] = Mock() + self.tensor_json = { + 'tensor_stat_index': 123 # tensor_stat_index is present by default + } def test_is_terminated(self): self.processor.overflow_nums = -1 @@ -370,7 +440,7 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): def test_analyze_forward_input(self): with patch.object(BaseDataProcessor, "analyze_forward_input", return_value={"name": 1}): - api_info = self.processor.analyze_forward_input("name", "module","module_input_output") + api_info = self.processor.analyze_forward_input("name", "module", "module_input_output") self.assertEqual(self.processor.cached_api_info, {"name": 1}) self.assertIsNone(api_info) @@ -432,6 +502,57 @@ class TestOverflowCheckDataProcessor(unittest.TestCase): self.processor._is_support_inf_nan() self.assertTrue(self.processor.support_inf_nan) + def test_max_tensor_or_min_tensor_is_none(self): + # Make get_buffer_values_max and get_buffer_values_min return None + self.processor.data_writer.get_buffer_values_max.return_value = None + self.processor.data_writer.get_buffer_values_min.return_value = None + + # In this case the method should return immediately without changing anything + 
self.processor._analyze_maybe_overflow_tensor(self.tensor_json) + + # Make sure has_overflow has not been set + self.assertFalse(self.processor.has_overflow) + + def test_tensor_is_inf_or_nan(self): + # Simulate max_tensor being Inf + self.processor.data_writer.get_buffer_values_max.return_value = torch.tensor(float('inf')) + self.processor.data_writer.get_buffer_values_min.return_value = torch.tensor(1.0) + + # has_overflow should be set to True + self.processor._analyze_maybe_overflow_tensor(self.tensor_json) + self.assertTrue(self.processor.has_overflow) + + # Simulate min_tensor being NaN + self.processor.data_writer.get_buffer_values_max.return_value = torch.tensor(1.0) + self.processor.data_writer.get_buffer_values_min.return_value = torch.tensor(float('nan')) + + # has_overflow should be set to True + self.processor._analyze_maybe_overflow_tensor(self.tensor_json) + self.assertTrue(self.processor.has_overflow) + + def test_normal_tensor(self): + # Simulate normal max_tensor and min_tensor values + self.processor.data_writer.get_buffer_values_max.return_value = torch.tensor(1.0) + self.processor.data_writer.get_buffer_values_min.return_value = torch.tensor(-1.0) + + # has_overflow should not change in the normal case + self.processor._analyze_maybe_overflow_tensor(self.tensor_json) + self.assertFalse(self.processor.has_overflow) + + @patch('msprobe.core.common.file_utils.path_len_exceeds_limit', return_value=False) + @patch.object(BaseDataProcessor, 'get_save_file_path', + return_value=['test_api_name', 'test_api_name.0.forward.input.pt']) + def test_analyze_tensor(self, mock_path_len_exceeds_limit, _): + tensor = torch.tensor([1.0, 2.0, 3.0]) + suffix = 'suffix' + expected = {'Max': 3.0, 'Min': 1.0, 'data_name': 'test_api_name'} + with patch.object(PytorchDataProcessor, '_analyze_tensor', + return_value={'Max': 3.0, 'Min': 1.0}) as mock_super_analyze_tensor: + result = self.processor._analyze_tensor(tensor, suffix) + mock_super_analyze_tensor.assert_called_once_with(tensor, suffix) + mock_path_len_exceeds_limit.assert_called_once() + 
self.assertEqual(expected, result) + class TestFreeBenchmarkDataProcessor(unittest.TestCase): diff --git a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_data_collector.py b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_data_collector.py index b9d2e7abef7244fc12dc71e3113c26af52529ce9..74d84495dd0d4e5ae8cbcf5152e11179f99f4f3d 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_data_collector.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_data_collector.py @@ -106,7 +106,7 @@ class TestDataCollector(unittest.TestCase): @patch.object(BaseDataProcessor, "analyze_debug_forward", return_value="data_info") def test_debug_data_collect_forward(self, _, mock_update_debug): self.data_collector.debug_data_collect_forward("variable", "name_with_count") - mock_update_debug.assert_called_with({"name_with_count": "data_info"}) + mock_update_debug.assert_called_with({"name_with_count.debug": "data_info"}) @patch.object(DataWriter, "update_debug") @patch.object(BaseDataProcessor, "analyze_debug_backward") @@ -114,6 +114,6 @@ class TestDataCollector(unittest.TestCase): def test_debug_data_collect_backward(self, _, mock_analyze_debug_backward, mock_update_debug): self.data_collector.data_writer.cache_debug = {"data": None} self.data_collector.debug_data_collect_backward("variable", "name_with_count") - mock_update_debug.assert_called_with({"name_with_count": "all_none_data_info"}) - mock_analyze_debug_backward.assert_called_with("variable", "name_with_count", self.data_collector.data_writer.cache_debug['data']) + mock_update_debug.assert_called_with({"name_with_count.debug": "all_none_data_info"}) + mock_analyze_debug_backward.assert_called_with("variable", "name_with_count.debug", self.data_collector.data_writer.cache_debug['data']) self.data_collector.data_writer.cache_debug = None diff --git a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_json_writer.py 
b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_json_writer.py index 9b20ffb2197882e16c1550cf013d1ba132096063..3294c6596542f06e681e28d77483e291d5e37fd2 100644 --- a/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_json_writer.py +++ b/debug/accuracy_tools/msprobe/test/core_ut/data_dump/test_json_writer.py @@ -3,6 +3,7 @@ import os import unittest from unittest.mock import patch +from msprobe.core.common.const import Const from msprobe.core.common.utils import DumpPathAggregation from msprobe.core.common.file_utils import FileOpen, remove_path, load_json from msprobe.core.data_dump.json_writer import DataWriter @@ -13,6 +14,49 @@ class TestDataWriter(unittest.TestCase): self.data_writer = DataWriter() self.data_content = {"task": "tensor", "level": "L1", "data": {"Tensor.add": 1}} self.cur_path = os.path.dirname(os.path.realpath(__file__)) + self.stat_vector = [1.0, 2.0, 3.0, 4.0] # Example stat_vector for tests + self.data_writer.stat_stack_list = [self.stat_vector] # Mock the stat_stack_list + + def test_replace_stat_placeholders(self): + stat_result = [[1.0, 2.0, 3.0, 4.0]] # Mocking stat_result with a dummy value + data = {"type": "Tensor", "dtype": "float32", "shape": [1, 2, 3], Const.TENSOR_STAT_INDEX: 0} + + # Call _replace_stat_placeholders directly + self.data_writer._replace_stat_placeholders(data, stat_result) + + # Check that the function processed the placeholders correctly + self.assertEqual(data["Max"], 1.0) + self.assertEqual(data["Min"], 2.0) + self.assertEqual(data["Mean"], 3.0) + self.assertEqual(data["Norm"], 4.0) + + def test_append_stat_to_buffer(self): + index = self.data_writer.append_stat_to_buffer(self.stat_vector) + self.assertEqual(index, 1) # setUp already stored one entry, so this append returns index 1 + self.assertEqual(self.data_writer.stat_stack_list[0], + self.stat_vector) # Check if the stat is appended correctly + + def test_get_buffer_values_max(self): + max_value = self.data_writer.get_buffer_values_max(0) + 
self.assertEqual(max_value, 1.0) # Slot 0 of stat_vector holds Max (1.0) + + # Test when index is out of range + max_value_invalid = self.data_writer.get_buffer_values_max(1) + self.assertIsNone(max_value_invalid) # Should return None for invalid index + + def test_get_buffer_values_min(self): + min_value = self.data_writer.get_buffer_values_min(0) + self.assertEqual(min_value, 2.0) # Slot 1 of stat_vector holds Min (2.0) + + # Test when index is out of range + min_value_invalid = self.data_writer.get_buffer_values_min(1) + self.assertIsNone(min_value_invalid) # Should return None for invalid index + + def test_flush_stat_stack(self): + # Ensure that flush_stat_stack works and clears the stat_stack_list + result = self.data_writer.flush_stat_stack() + self.assertEqual(result, [[1.0, 2.0, 3.0, 4.0]]) # Returns the flushed stats + self.assertEqual(self.data_writer.stat_stack_list, []) # Ensure the list is cleared after flush def test_write_data_to_csv(self): cur_path = os.path.dirname(os.path.realpath(__file__)) @@ -42,9 +86,9 @@ class TestDataWriter(unittest.TestCase): remove_path(file_path) def test_reset_cache(self): - self.data_writer.cache_data={"data": 1} - self.data_writer.cache_stack={"stack": 2} - self.data_writer.cache_construct={"construct": 3} + self.data_writer.cache_data = {"data": 1} + self.data_writer.cache_stack = {"stack": 2} + self.data_writer.cache_construct = {"construct": 3} self.data_writer.reset_cache() self.assertEqual(self.data_writer.cache_data, {}) self.assertEqual(self.data_writer.cache_stack, {}) @@ -117,8 +161,9 @@ class TestDataWriter(unittest.TestCase): self.assertEqual(self.data_writer.cache_data, expected) def test_update_stack(self): - self.data_writer.update_stack(self.data_content) - self.assertEqual(self.data_writer.cache_stack, self.data_content) + self.data_writer.cache_stack = {"stack1": ["test1"]} + self.data_writer.update_stack("test2", "stack1") + self.assertEqual(self.data_writer.cache_stack, {"stack1": ["test1", "test2"]}) 
def test_update_construct(self): self.data_writer.update_construct(self.data_content) @@ -136,13 +181,13 @@ class TestDataWriter(unittest.TestCase): os.remove(file_path) def test_write_stack_info_json(self): - self.data_writer.cache_stack = self.data_content + self.data_writer.cache_stack = {("api1", "api2"): ["stack1"]} file_path = os.path.join(self.cur_path, "stack.json") self.data_writer.write_stack_info_json(file_path) load_result = load_json(file_path) try: - self.assertEqual(load_result, self.data_content) + self.assertEqual(load_result, {"0": [["stack1"], ["api1", "api2"]]}) finally: os.remove(file_path) diff --git a/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.cpp b/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.cpp index e744233b3199c15f5ce77b4690bbaa523b0bad45..08dddbed6b7b0c83691826998a2291bb99a40990 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.cpp @@ -2,7 +2,6 @@ #include #include #include -#include std::string TEST_ExecShellCommand(const std::string& cmd) { @@ -18,11 +17,10 @@ std::string TEST_ExecShellCommand(const std::string& cmd) return result; } -std::string trim(const std::string& str) +std::string Trim(const std::string& str) { std::string::size_type first = str.find_first_not_of(" \t\n\r\f\v"); std::string::size_type last = str.find_last_not_of(" \t\n\r\f\v"); - if (first == std::string::npos || last == std::string::npos) { return ""; } diff --git a/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.hpp b/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.hpp index ed842b87db77e75e618acd7a25949145a1578c37..08326522a9b06b671e62b5ecacbcc722f485f439 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.hpp +++ b/debug/accuracy_tools/msprobe/test/cpp/include/test_utils.hpp @@ -5,4 +5,4 @@ #define CONFIG_EXAMPLE __RESOURCES_PATH__"/config.json" std::string TEST_ExecShellCommand(const std::string& cmd); 
-std::string trim(const std::string& str); +std::string Trim(const std::string& str); diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_config.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_config.cpp index e8b9b73fb66c3fcae40819545c84b7fafb5d2c4d..36033c06a7336ec673eacf24c0c964b6a7719b59 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_config.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_config.cpp @@ -2,14 +2,14 @@ #include "gtest/gtest.h" #include "nlohmann/json.hpp" #include "test_utils.hpp" -#include "base/ErrorInfos.hpp" -#include "base/DebuggerConfig.hpp" +#include "base/ErrorInfosManager.h" +#include "base/DebuggerConfig.h" using namespace MindStudioDebugger; namespace MsProbeTest { -static const std::string cfgContent = R"({ +static const std::string CFG_CONTENT = R"({ "task": "statistics", "dump_path": "./dump_path", "rank": [], @@ -104,7 +104,7 @@ void TestConfigMindSpore::SetUp() DebuggerConfig::GetInstance().Reset(); CleanErrorInfoCache(); ErrorInfosManager::SetLogPath(logpath); - cfgJson = nlohmann::json::parse(cfgContent); + cfgJson = nlohmann::json::parse(CFG_CONTENT); } void TestConfigMindSpore::TearDown() @@ -173,7 +173,7 @@ TEST_F(TestConfigMindSpore, TestCommonCfg) ASSERT_EQ(DumpCfgFile(), 0); EXPECT_EQ(cfg.LoadConfig(framework, cfgPath), 0); EXPECT_EQ(cfg.GetTaskList(), std::vector({DebuggerTaskType::TASK_DUMP_STATISTICS})); - EXPECT_EQ(cfg.GetOutputPath(), trim(TEST_ExecShellCommand("realpath ./output1"))); + EXPECT_EQ(cfg.GetOutputPath(), Trim(TEST_ExecShellCommand("realpath ./output1"))); EXPECT_EQ(cfg.GetRankRange(), std::vector({0, 1, 8})); EXPECT_EQ(cfg.GetStepRange(), std::vector({2, 4, 6, 7, 8})); EXPECT_EQ(cfg.GetDebugLevel(), DebuggerLevel::L2); diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_cpython_utils.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_cpython_utils.cpp index 0d9188878c0864d66d76cc3a823b0a0a5cf644d5..8bb5af7123f41fb42091c2cb21d394bce2b1af8d 100644 --- 
a/debug/accuracy_tools/msprobe/test/cpp/test_cpython_utils.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_cpython_utils.cpp @@ -2,7 +2,7 @@ #include #include "test_utils.hpp" -#include "utils/CPythonUtils.hpp" +#include "utils/CPythonUtils.h" using namespace MindStudioDebugger; using namespace MindStudioDebugger::CPythonUtils; @@ -56,79 +56,79 @@ TEST_F(CPythonUtilsTest, CPythonAgent) { TEST_F(CPythonUtilsTest, PythonObjectFromTo) { // Test the From and To functions of PythonObject - int32_t input_int = -42; - PythonObject obj_int = PythonObject::From(input_int); - EXPECT_TRUE(obj_int.IsNumber()); + int32_t inputInt = -42; + PythonObject objInt = PythonObject::From(inputInt); + EXPECT_TRUE(objInt.IsNumber()); - int32_t output_int; - EXPECT_EQ(obj_int.To(output_int), 0); - EXPECT_EQ(output_int, input_int); + int32_t outputInt; + EXPECT_EQ(objInt.To(outputInt), 0); + EXPECT_EQ(outputInt, inputInt); - uint32_t input_uint = 56; - PythonObject obj_uint = PythonObject::From(input_uint); - EXPECT_TRUE(obj_uint.IsNumber()); + uint32_t inputUint = 56; + PythonObject objUint = PythonObject::From(inputUint); + EXPECT_TRUE(objUint.IsNumber()); - uint32_t output_uint; - EXPECT_EQ(obj_uint.To(output_uint), 0); - EXPECT_EQ(output_uint, input_uint); + uint32_t outputUint; + EXPECT_EQ(objUint.To(outputUint), 0); + EXPECT_EQ(outputUint, inputUint); - double input_double = 3.14; - PythonObject obj_double = PythonObject::From(input_double); - EXPECT_TRUE(obj_double.IsNumber()); + double inputDouble = 3.14; + PythonObject objDouble = PythonObject::From(inputDouble); + EXPECT_TRUE(objDouble.IsNumber()); - double output_double; - EXPECT_EQ(obj_double.To(output_double), 0); - EXPECT_DOUBLE_EQ(output_double, input_double); + double outputDouble; + EXPECT_EQ(objDouble.To(outputDouble), 0); + EXPECT_DOUBLE_EQ(outputDouble, inputDouble); - std::string input_str = "hello"; - PythonObject obj_str = PythonObject::From(input_str); - EXPECT_TRUE(obj_str.IsString()); + std::string inputStr = "hello"; + 
PythonObject objStr = PythonObject::From(inputStr); + EXPECT_TRUE(objStr.IsString()); - std::string output_str; - EXPECT_EQ(obj_str.To(output_str), 0); - EXPECT_EQ(output_str, input_str); + std::string outputStr; + EXPECT_EQ(objStr.To(outputStr), 0); + EXPECT_EQ(outputStr, inputStr); - const char* input_char = "world"; - PythonObject obj_str1 = PythonObject::From(input_char); - EXPECT_TRUE(obj_str1.IsString()); + const char* inputChar = "world"; + PythonObject objStr1 = PythonObject::From(inputChar); + EXPECT_TRUE(objStr1.IsString()); - EXPECT_EQ(obj_str1.To(output_str), 0); - EXPECT_EQ(output_str, std::string(input_char)); + EXPECT_EQ(objStr1.To(outputStr), 0); + EXPECT_EQ(outputStr, std::string(inputChar)); - bool input_bool = true; - PythonObject obj_bool = PythonObject::From(input_bool); - EXPECT_TRUE(obj_bool.IsBool()); + bool inputBool = true; + PythonObject objBool = PythonObject::From(inputBool); + EXPECT_TRUE(objBool.IsBool()); - bool output_bool; - EXPECT_EQ(obj_bool.To(output_bool), 0); - EXPECT_EQ(output_bool, input_bool); + bool outputBool; + EXPECT_EQ(objBool.To(outputBool), 0); + EXPECT_EQ(outputBool, inputBool); - std::vector input_vector_int = {1, 2, 3, 100}; - PythonObject list_int_obj = PythonObject::From(input_vector_int); - EXPECT_TRUE(list_int_obj.IsList()); + std::vector inputVectorInt = {1, 2, 3, 100}; + PythonObject listIntObj = PythonObject::From(inputVectorInt); + EXPECT_TRUE(listIntObj.IsList()); - std::vector output_vector_int; - EXPECT_EQ(list_int_obj.To(output_vector_int), 0); + std::vector outputVectorInt; + EXPECT_EQ(listIntObj.To(outputVectorInt), 0); - size_t size = input_vector_int.size(); - EXPECT_EQ(size, output_vector_int.size()); + size_t size = inputVectorInt.size(); + EXPECT_EQ(size, outputVectorInt.size()); for (size_t i = 0; i < size; ++i) { - EXPECT_EQ(input_vector_int[i], output_vector_int[i]); + EXPECT_EQ(inputVectorInt[i], outputVectorInt[i]); } - std::vector input_vector_str = {"a", "bb", "ccc", "dddd"}; - 
PythonObject list_str_obj = PythonObject::From(input_vector_str); - EXPECT_TRUE(list_str_obj.IsList()); + std::vector inputVectorStr = {"a", "bb", "ccc", "dddd"}; + PythonObject listStrObj = PythonObject::From(inputVectorStr); + EXPECT_TRUE(listStrObj.IsList()); - std::vector output_vector_str; - EXPECT_EQ(list_str_obj.To(output_vector_str), 0); + std::vector outputVectorStr; + EXPECT_EQ(listStrObj.To(outputVectorStr), 0); - size = input_vector_str.size(); - EXPECT_EQ(size, output_vector_str.size()); + size = inputVectorStr.size(); + EXPECT_EQ(size, outputVectorStr.size()); for (size_t i = 0; i < size; ++i) { - EXPECT_EQ(input_vector_str[i], output_vector_str[i]); + EXPECT_EQ(inputVectorStr[i], outputVectorStr[i]); } } @@ -199,18 +199,18 @@ TEST_F(CPythonUtilsTest, PythonNumberObject) { PythonNumberObject o5(PythonObject::From(4.44)); PythonNumberObject o6(PythonObject::From("1111")); - int int_v; - EXPECT_EQ(o1.To(int_v), 0); - EXPECT_EQ(int_v, 123); - double double_v; - EXPECT_EQ(o2.To(double_v), 0); - EXPECT_TRUE(std::fabs(double_v - 3.14) < 1e-5); - EXPECT_EQ(o3.To(int_v), 0); - EXPECT_EQ(int_v, 321); - EXPECT_EQ(o4.To(double_v), 0); - EXPECT_TRUE(std::fabs(double_v - 2.33) < 1e-5); - EXPECT_EQ(o5.To(double_v), 0); - EXPECT_TRUE(std::fabs(double_v - 4.44) < 1e-5); + int intV; + EXPECT_EQ(o1.To(intV), 0); + EXPECT_EQ(intV, 123); + double doubleV; + EXPECT_EQ(o2.To(doubleV), 0); + EXPECT_TRUE(std::fabs(doubleV - 3.14) < 1e-5); + EXPECT_EQ(o3.To(intV), 0); + EXPECT_EQ(intV, 321); + EXPECT_EQ(o4.To(doubleV), 0); + EXPECT_TRUE(std::fabs(doubleV - 2.33) < 1e-5); + EXPECT_EQ(o5.To(doubleV), 0); + EXPECT_TRUE(std::fabs(doubleV - 4.44) < 1e-5); EXPECT_TRUE(o6.IsNone()); } diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_data_utils.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_data_utils.cpp index 11442f12bfea9179ecd4e2e357bcf70b4212ab84..dd6325c2183332bfa0cc2acc592301f7cf58bda8 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_data_utils.cpp +++ 
b/debug/accuracy_tools/msprobe/test/cpp/test_data_utils.cpp
@@ -2,7 +2,7 @@
 #include
 #include
 #include
-#include "utils/DataUtils.hpp"
+#include "utils/DataUtils.h"
 using namespace MindStudioDebugger;
 using namespace MindStudioDebugger::DataUtils;
@@ -10,15 +10,15 @@ using namespace MindStudioDebugger::DataUtils;
 namespace MsProbeTest {
 TEST(DataUtilsTest, TestUnpackUint64Value) {
- uint64_t data_le = 0x0102030405060708;
- uint64_t result = UnpackUint64Value_Le(&data_le);
+ uint64_t dataLe = 0x0102030405060708;
+ uint64_t result = UnpackUint64ValueLe(&dataLe);
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 EXPECT_EQ(result, 0x0102030405060708);
 #else
 EXPECT_EQ(result, 0x0807060504030201);
 #endif
- uint64_t data_be = 0x0102030405060708;
- result = UnpackUint64Value_Be(&data_be);
+ uint64_t dataBe = 0x0102030405060708;
+ result = UnpackUint64ValueBe(&dataBe);
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 EXPECT_EQ(result, 0x0807060504030201);
 #else
@@ -74,7 +74,7 @@ TEST(DataUtilsTest, TestGetFormatString) {
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_FRACTAL_Z), "FRACTAL_Z");
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_C1HWNC0), "C1HWNC0");
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_HWCN), "HWCN");
- EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_C1HWNCoC0), "C1HWNCoC0");
+ EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_C1HWNCOC0), "C1HWNCoC0");
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_DHWNC), "DHWNC");
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_NCL), "NCL");
 EXPECT_EQ(GetFormatString(TensorFormat::FORMAT_MAX), "UNKNOWN");
diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_environ.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_environ.cpp
index 94c830227ae58637642a189f36ade78de9a2a75c..be30c5c219ce1b34b92a1453eeb2050479bc7b97 100644
--- a/debug/accuracy_tools/msprobe/test/cpp/test_environ.cpp
+++ b/debug/accuracy_tools/msprobe/test/cpp/test_environ.cpp
@@ -2,8 +2,8 @@
 #include
 #include "include/test_utils.hpp"
-#include "base/DebuggerConfig.hpp"
-#include
"base/Environment.hpp" +#include "base/DebuggerConfig.h" +#include "base/Environment.h" using namespace MindStudioDebugger; using namespace MindStudioDebugger::Environment; diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_file_operation.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_file_operation.cpp index 2886126e9f568fba6b8ce3eabd752653d4493108..99dbe8124d17fe7cdcfd1f812d2132f416cea1ea 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_file_operation.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_file_operation.cpp @@ -4,8 +4,8 @@ #include #include "test_utils.hpp" -#include "utils/DataUtils.hpp" -#include "utils/FileOperation.hpp" +#include "utils/DataUtils.h" +#include "utils/FileOperation.h" using namespace MindStudioDebugger; using namespace MindStudioDebugger::FileOperation; diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_file_utils.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_file_utils.cpp index 03449f761be0c8548021218581f4cbff12d4e07d..022ae396ba3d7a8343792a438162a74ff526fc75 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_file_utils.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_file_utils.cpp @@ -8,7 +8,7 @@ #include #include "test_utils.hpp" -#include "utils/FileUtils.hpp" +#include "utils/FileUtils.h" using namespace MindStudioDebugger; using namespace MindStudioDebugger::FileUtils; @@ -52,7 +52,7 @@ TEST_F(FileUtilsTest, TestIsPathExist) TEST_F(FileUtilsTest, TestGetAbsPath) { - std::string pwd = trim(TEST_ExecShellCommand("pwd")); + std::string pwd = Trim(TEST_ExecShellCommand("pwd")); EXPECT_EQ(pwd, GetAbsPath(".")); EXPECT_EQ(pwd + "/testpath", GetAbsPath("./testpath")); EXPECT_EQ(pwd + "/testpath", GetAbsPath("./testpath/")); @@ -210,8 +210,8 @@ TEST_F(FileUtilsTest, TestIsPathLengthLegal) TEST_F(FileUtilsTest, TestIsPathDepthValid) { EXPECT_TRUE(IsPathDepthValid("")); - EXPECT_TRUE(IsPathDepthValid(std::string(PATH_DEPTH_MAX, pathSeparator))); - 
EXPECT_FALSE(IsPathDepthValid(std::string(PATH_DEPTH_MAX + 1, pathSeparator))); + EXPECT_TRUE(IsPathDepthValid(std::string(PATH_DEPTH_MAX, PATH_SEPARATOR))); + EXPECT_FALSE(IsPathDepthValid(std::string(PATH_DEPTH_MAX + 1, PATH_SEPARATOR))); } TEST_F(FileUtilsTest, TestIsFileOwner) diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_log.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_log.cpp index 254b54359a50166e1d893c5b936eb220ee0b2a73..ddf6950fd5b6ff0c7f191e5aa0f8e79897db6c7c 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_log.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_log.cpp @@ -2,7 +2,7 @@ #include "gtest/gtest.h" #include "test_utils.hpp" -#include "base/ErrorInfos.hpp" +#include "base/ErrorInfosManager.h" using namespace MindStudioDebugger; diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_math_utils.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_math_utils.cpp index 3b23e9c879c431ef7457990ba774aa0dc1321b45..8e57d2cd53b5c52ca62084e8ecbd49e3e8138682 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_math_utils.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_math_utils.cpp @@ -3,7 +3,7 @@ #include #include #include -#include "utils/MathUtils.hpp" +#include "utils/MathUtils.h" using namespace MindStudioDebugger; using namespace MindStudioDebugger::MathUtils; diff --git a/debug/accuracy_tools/msprobe/test/cpp/test_precision_debugger.cpp b/debug/accuracy_tools/msprobe/test/cpp/test_precision_debugger.cpp index 69df0c18fcc27cd0ac359262649fcc588f2e9b9f..2832f2345d7d72efe2d1bb305c044f56d9b83e8d 100644 --- a/debug/accuracy_tools/msprobe/test/cpp/test_precision_debugger.cpp +++ b/debug/accuracy_tools/msprobe/test/cpp/test_precision_debugger.cpp @@ -2,9 +2,9 @@ #include #include "include/test_utils.hpp" -#include "third_party/ACL/AclApi.hpp" -#include "base/ErrorInfos.hpp" -#include "core/PrecisionDebugger.hpp" +#include "third_party/ACL/AclApi.h" +#include "base/ErrorInfosManager.h" +#include "core/PrecisionDebugger.h" using 
namespace MindStudioDebugger; @@ -17,15 +17,15 @@ public: std::string Name() const override {return "PrecisionDbgTaskStub";} bool Condition(const DebuggerConfig& cfg) const override {return true;} - void Initialize(const DebuggerConfig& cfg) {initialize_called = true;} - void OnStart() {start_called = true;} - void OnStop() {stop_called = true;} - void OnStep() {step_called = true;} + void Initialize(const DebuggerConfig& cfg) {initializeCalled = true;} + void OnStart() {startCalled = true;} + void OnStop() {stopCalled = true;} + void OnStep() {stepCalled = true;} - bool initialize_called{false}; - bool start_called{false}; - bool stop_called{false}; - bool step_called{false}; + bool initializeCalled{false}; + bool startCalled{false}; + bool stopCalled{false}; + bool stepCalled{false}; }; class PrecisionDbgTaskUselessStub : public PrecisionDbgTaskStub { @@ -35,11 +35,11 @@ public: TEST(PrecisionDebuggerTest, TestRegisterBeforeInit) { PrecisionDebugger& debugger = PrecisionDebugger::GetInstance(); - PrecisionDbgTaskStub stub_task; + PrecisionDbgTaskStub stubTask; DebuggerConfig::GetInstance().Reset(); - debugger.RegisterDebuggerTask(&stub_task); - stub_task.Register(); + debugger.RegisterDebuggerTask(&stubTask); + stubTask.Register(); EXPECT_FALSE(debugger.IsEnable()); EXPECT_EQ(debugger.GetCurStep(), 0); @@ -49,12 +49,12 @@ TEST(PrecisionDebuggerTest, TestRegisterBeforeInit) { debugger.Step(); EXPECT_EQ(debugger.GetCurStep(), 0); - EXPECT_FALSE(stub_task.initialize_called); - EXPECT_FALSE(stub_task.start_called); - EXPECT_FALSE(stub_task.stop_called); - EXPECT_FALSE(stub_task.step_called); + EXPECT_FALSE(stubTask.initializeCalled); + EXPECT_FALSE(stubTask.startCalled); + EXPECT_FALSE(stubTask.stopCalled); + EXPECT_FALSE(stubTask.stepCalled); - debugger.UnRegisterDebuggerTask(&stub_task); + debugger.UnRegisterDebuggerTask(&stubTask); debugger.UnRegisterDebuggerTask(nullptr); } @@ -81,39 +81,39 @@ TEST(PrecisionDebuggerTest, TestInit) { TEST(PrecisionDebuggerTest, 
TestSubTaskDispatch) { PrecisionDebugger& debugger = PrecisionDebugger::GetInstance(); - PrecisionDbgTaskStub stub_task1; - PrecisionDbgTaskStub stub_task2; - PrecisionDbgTaskUselessStub stub_task3; + PrecisionDbgTaskStub stubTask1; + PrecisionDbgTaskStub stubTask2; + PrecisionDbgTaskUselessStub stubTask3; MOCKER(MindStudioDebugger::AscendCLApi::LoadAclApi) .stubs() .then(returnValue(0)); - MOCKER(MindStudioDebugger::AscendCLApi::ACLAPI_aclrtSynchronizeDevice) + MOCKER(MindStudioDebugger::AscendCLApi::AclApiAclrtSynchronizeDevice) .stubs() .then(returnValue(0)) .expects(atLeast(1)); - stub_task1.Register(); + stubTask1.Register(); EXPECT_EQ(debugger.Initialize("MindSpore", CONFIG_EXAMPLE), 0); - stub_task2.Register(); - stub_task3.Register(); + stubTask2.Register(); + stubTask3.Register(); - EXPECT_TRUE(stub_task1.initialize_called); - EXPECT_TRUE(stub_task2.initialize_called); - EXPECT_FALSE(stub_task3.initialize_called); - EXPECT_FALSE(stub_task1.start_called); - EXPECT_FALSE(stub_task2.stop_called); - EXPECT_FALSE(stub_task3.step_called); + EXPECT_TRUE(stubTask1.initializeCalled); + EXPECT_TRUE(stubTask2.initializeCalled); + EXPECT_FALSE(stubTask3.initializeCalled); + EXPECT_FALSE(stubTask1.startCalled); + EXPECT_FALSE(stubTask2.stopCalled); + EXPECT_FALSE(stubTask3.stepCalled); debugger.Start(); - EXPECT_TRUE(stub_task1.start_called); - EXPECT_FALSE(stub_task3.start_called); + EXPECT_TRUE(stubTask1.startCalled); + EXPECT_FALSE(stubTask3.startCalled); debugger.Stop(); - EXPECT_TRUE(stub_task1.stop_called); - EXPECT_TRUE(stub_task2.stop_called); + EXPECT_TRUE(stubTask1.stopCalled); + EXPECT_TRUE(stubTask2.stopCalled); debugger.Step(); - EXPECT_TRUE(stub_task1.step_called); + EXPECT_TRUE(stubTask1.stepCalled); GlobalMockObject::verify(); GlobalMockObject::reset(); diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare.py index 
6f7377894002e60add41dc7b2d3c1d3d68391e0b..b0e49fd545393bac6fbc3d7af554c91e24119e69 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare.py @@ -1,20 +1,10 @@ # coding=utf-8 -import json -import os + import random -import shutil -import tempfile import unittest from unittest.mock import patch -import numpy as np -import pandas as pd -import torch -import yaml - -from msprobe.core.common.utils import CompareException -from msprobe.core.compare.acc_compare import ModeConfig -from msprobe.mindspore.compare.ms_compare import MappingConfig, MSComparator, check_cross_framework +from msprobe.mindspore.compare.ms_compare import check_cross_framework from msprobe.core.common.const import Const npu_dict = {'op_name': ['Functional.conv2d.0.forward.input.0', 'Functional.conv2d.0.forward.input.1', @@ -190,169 +180,9 @@ def gen_data(is_ms=True): } -def gen_api_mapping_test_data(need_user_mapping=False): - result_npu = json_data_template.copy() - result_bench = json_data_template.copy() - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - ms_comparator = MSComparator(mode_config, mapping_config) - - api_mapping = ms_comparator.load_internal_api() - ms_api_list = np.random.choice(list(api_mapping.keys()), size=5, replace=False).astype(str).tolist() - ms_api_data = {} - pt_api_data = {} - user_mapping = [] - for api in ms_api_list: - call_num = random.randint(1, 10) - direction = random.choice(['forward', 'backward']) - data_name_ms = api + '.' + str(call_num) + '.' + direction - data_name_pt = api_mapping.get(api) + '.' + str(call_num) + '.' 
+ direction - input_num = random.randint(1, 5) - output_num = random.randint(1, 5) - ms_data = {'input_args': [gen_data(True) for _ in range(input_num)], - 'output': [gen_data(True) for _ in range(output_num)]} - pt_data = {'input_args': [gen_data(False) for _ in range(input_num)], - 'output': [gen_data(False) for _ in range(output_num)]} - ms_api_data[data_name_ms] = ms_data - pt_api_data[data_name_pt] = pt_data - if need_user_mapping: - compare_num_input = random.randint(1, input_num) - compare_num_output = random.randint(1, output_num) - user_mapping_item = {'ms_api': api, - 'pt_api': api_mapping.get(api), - 'ms_args': sorted(np.random.choice(list(range(input_num)), size=compare_num_input, - replace=False).astype(int).tolist()), - 'pt_args': sorted(np.random.choice(list(range(input_num)), size=compare_num_input, - replace=False).astype(int).tolist()), - 'ms_output': sorted(np.random.choice(list(range(output_num)), size=compare_num_output, - replace=False).astype(int).tolist()), - 'pt_output': sorted(np.random.choice(list(range(output_num)), size=compare_num_output, - replace=False).astype(int).tolist())} - user_mapping.append(user_mapping_item) - ms_api_key_list = list(ms_api_data.keys()) - random.shuffle(ms_api_key_list) - result_npu['data'] = {k: ms_api_data.get(k) for k in ms_api_key_list} - pt_api_key_list = list(pt_api_data.keys()) - random.shuffle(pt_api_key_list) - result_bench['data'] = {k: pt_api_data.get(k) for k in pt_api_key_list} - return result_npu, result_bench, user_mapping - - class TestUtilsMethods(unittest.TestCase): - def test_check_op_ms(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - - ms_comparator = MSComparator(mode_config, mapping_config) - result = ms_comparator.check_op(npu_dict, bench_dict) - self.assertTrue(result) - - def test_data_mapping(self): - stack_json_data = {} - 
- stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(data_mapping=data_mapping) - ms_comparator = MSComparator(mode_config, mapping_config) - - npu_ops_all = ms_comparator.merge_data(npu_json_data, stack_json_data) - npu_ops_all_correct = { - 'Functional.flash_attention_score.4.forward.input.0': { - 'struct': ('BFloat16', [4096, 1, 2048]), - 'summary': [4.1875, -4.4375, -4.550282028503716e-05, 2316.379150390625], - 'data_name': None, - 'stack_info': [None] - }, - 'Functional.flash_attention_score.4.forward.output.0': { - 'struct': ('BFloat16', [4096, 1, 2048]), - 'summary': [4.1875, -4.4375, -4.550282028503716e-05, 2316.379150390625], - 'data_name': None, - 'stack_info': [None] - } - } - self.assertDictEqual(npu_ops_all, npu_ops_all_correct) - - bench_ops_all = ms_comparator.merge_data(bench_json_data, stack_json_data) - bench_ops_all_correct = { - 'NPU.npu_fusion_attention.4.forward.input.0': { - 'struct': ('torch.bfloat16', [4096, 1, 2048]), - 'summary': [4.1875, -4.4375, -4.553794860839844e-05, 2320.0], - 'data_name': None, - 'stack_info': [None] - }, - 'NPU.npu_fusion_attention.4.forward.output.0': { - 'struct': ('torch.bfloat16', [4096, 1, 2048]), - 'summary': [4.1875, -4.4375, -4.553794860839844e-05, 2320.0], - 'data_name': None, - 'stack_info': [None] - } - } - self.assertDictEqual(bench_ops_all, bench_ops_all_correct) - - result = ms_comparator.get_accuracy(npu_ops_all, bench_ops_all) - result_correct = [['Functional.flash_attention_score.4.forward.input.0', - 'NPU.npu_fusion_attention.4.forward.input.0', - 'BFloat16', 'torch.bfloat16', [4096, 1, 2048], [4096, 1, 2048], 0.0, 0.0, - 3.512832336127758e-08, -3.620849609375, '0.0%', '0.0%', '0.07714076816099476%', - '0.1560711038523707%', 4.1875, -4.4375, -4.550282028503716e-05, 2316.379150390625, - 4.1875, -4.4375, -4.553794860839844e-05, 2320.0, '', '', None], 
- ['Functional.flash_attention_score.4.forward.output.0', - 'NPU.npu_fusion_attention.4.forward.output.0', - 'BFloat16', 'torch.bfloat16', [4096, 1, 2048], [4096, 1, 2048], 0.0, 0.0, - 3.512832336127758e-08, -3.620849609375, '0.0%', '0.0%', '0.07714076816099476%', - '0.1560711038523707%', 4.1875, -4.4375, -4.550282028503716e-05, 2316.379150390625, - 4.1875, -4.4375, -4.553794860839844e-05, 2320.0, '', '', None] - ] - self.assertListEqual(result, result_correct) - - def test_dm_tensor_task(self): - self.compare_process_custom(dump_mode=Const.ALL) - - def compare_process_custom(self, dump_mode): - data_path = tempfile.mkdtemp(prefix='dump_data', dir='/tmp') - try: - npu_dump_path = os.path.join(data_path, 'npu_dump.json') - bench_dump_path = os.path.join(data_path, 'bench_dump.json') - npu_stack_path = os.path.join(data_path, 'npu_stack.json') - - with open(npu_dump_path, 'w') as n_d_f: - json.dump(npu_json_data, n_d_f) - with open(bench_dump_path, 'w') as b_d_f: - json.dump(bench_json_data, b_d_f) - with open(npu_stack_path, 'w') as n_s_f: - json.dump({}, n_s_f) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - - ms_comparator = MSComparator(mode_config, mapping_config) - result_df = ms_comparator.compare_process_custom((npu_dump_path, bench_dump_path, npu_stack_path)) - self.assertListEqual(result_df.values.tolist(), []) - finally: - shutil.rmtree(data_path) - - @patch('msprobe.mindspore.compare.ms_compare.detect_framework_by_dump_json') + @patch('msprobe.mindspore.compare.utils.detect_framework_by_dump_json') def test_check_cross_framework_valid_pytorch(self, mock_detect_framework): mock_detect_framework.return_value = Const.PT_FRAMEWORK @@ -360,203 +190,10 @@ class TestUtilsMethods(unittest.TestCase): self.assertTrue(result) - @patch('msprobe.mindspore.compare.ms_compare.detect_framework_by_dump_json') + 
@patch('msprobe.mindspore.compare.utils.detect_framework_by_dump_json') def test_check_cross_framework_invalid_framework(self, mock_detect_framework): mock_detect_framework.return_value = Const.MS_FRAMEWORK result = check_cross_framework("dummy_path") self.assertFalse(result) - - def test_comapre_process(self): - data_path = tempfile.mkdtemp(prefix='dump_data', dir='/tmp') - try: - npu_dump_path = os.path.join(data_path, 'npu_dump.json') - bench_dump_path = os.path.join(data_path, 'bench_dump.json') - npu_stack_path = os.path.join(data_path, 'npu_stack.json') - - npu_data, bench_data, _ = gen_api_mapping_test_data() - with open(npu_dump_path, 'w', encoding='utf8') as n_d_f: - json.dump(npu_data, n_d_f) - with open(bench_dump_path, 'w', encoding='utf8') as b_d_f: - json.dump(bench_data, b_d_f) - with open(npu_stack_path, 'w', encoding='utf8') as n_s_f: - json.dump({}, n_s_f) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(api_mapping=True) - - ms_comparator = MSComparator(mode_config, mapping_config) - result_df = ms_comparator.compare_process((npu_dump_path, bench_dump_path, npu_stack_path)) - self.assertTrue((result_df['Bench Name'] != 'N/A').all()) - finally: - shutil.rmtree(data_path) - - def test_compare_process_with_customize_api_mapping(self): - data_path = tempfile.mkdtemp(prefix='dump_data', dir='/tmp') - try: - npu_dump_path = os.path.join(data_path, 'npu_dump.json') - bench_dump_path = os.path.join(data_path, 'bench_dump.json') - npu_stack_path = os.path.join(data_path, 'npu_stack.json') - user_mapping_path = os.path.join(data_path, 'user_mapping.yaml') - - npu_data, bench_data, user_mapping = gen_api_mapping_test_data(True) - with open(npu_dump_path, 'w', encoding='utf8') as n_d_f: - json.dump(npu_data, n_d_f) - with open(bench_dump_path, 'w', encoding='utf8') as b_d_f: - json.dump(bench_data, b_d_f) 
- with open(npu_stack_path, 'w', encoding='utf8') as n_s_f: - json.dump({}, n_s_f) - with open(user_mapping_path, 'w', encoding='utf8') as u_m_f: - yaml.safe_dump(user_mapping, u_m_f) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(api_mapping=user_mapping_path) - - ms_comparator = MSComparator(mode_config, mapping_config) - result_df = ms_comparator.compare_process((npu_dump_path, bench_dump_path, npu_stack_path)) - - user_mapping_dict = {} - for i in user_mapping: - user_mapping_dict[i.get('ms_api')] = {'input': i.get('ms_args'), 'output': i.get('ms_output')} - match_set = set() - for key in npu_data.get('data').keys(): - matched_dict = user_mapping_dict.get(key.rsplit('.', 2)[0]) - match_set.update({key + '.input.' + str(i) for i in matched_dict.get('input')}) - match_set.update({key + '.output.' + str(i) for i in matched_dict.get('output')}) - - self.assertTrue((result_df.loc[result_df['NPU Name'].isin(match_set), 'Bench Name'] != 'N/A').all()) - self.assertTrue((result_df.loc[~result_df['NPU Name'].isin(match_set), 'Bench Name'] == 'N/A').all()) - finally: - shutil.rmtree(data_path) - - def test_load_internal_api(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.SUMMARY - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - - ms_comparator = MSComparator(mode_config, mapping_config) - api_dict = ms_comparator.load_internal_api() - self.assertEqual(api_dict['Functional.abs'], 'Torch.abs') - - def test_process_cell_mapping(self): - self.base_test_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))) - self.input_dir = os.path.join(self.base_test_dir, 'resources') - cell_mapping_path = os.path.join(self.input_dir, 'common', 'cell_mapping.yaml') - - stack_mode = True - auto_analyze 
= True - fuzzy_match = False - dump_mode = Const.SUMMARY - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(cell_mapping=cell_mapping_path) - - ms_comparator = MSComparator(mode_config, mapping_config) - npu_op_name = ms_comparator.process_cell_mapping(npu_cell_dict.get('op_name')[0]) - self.assertEqual(npu_op_name, 'Module.fc1.Linear.forward.0.input.0') - - def test_read_npy_data(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - - ms_comparator = MSComparator(mode_config, mapping_config) - - self.temp_file = tempfile.NamedTemporaryFile(suffix='.pt') - tensor = torch.Tensor([1, 2, 3]) - filename = self.temp_file.name.split('/')[-1] - torch.save(tensor, self.temp_file.name) - result = ms_comparator.read_npy_data('/tmp', filename, load_pt_file=True) - self.assertTrue(np.array_equal(result, np.array([1, 2, 3]))) - self.temp_file.close() - - self.temp_file = tempfile.NamedTemporaryFile(suffix='.npy') - tensor = np.array([1, 2, 3]) - filename = self.temp_file.name.split('/')[-1] - np.save(self.temp_file.name, tensor) - result = ms_comparator.read_npy_data('/tmp', filename, load_pt_file=False) - self.assertTrue(np.array_equal(result, np.array([1, 2, 3]))) - self.temp_file.close() - - def test_process_internal_api_mapping(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig(api_mapping=1) - - ms_comparator = MSComparator(mode_config, mapping_config) - - npu_op_name = "Mint.addcmul.0.forward.input.0" - result = ms_comparator.process_internal_api_mapping(npu_op_name) - self.assertEqual(result, "Torch.addcmul.0.forward.input.0") - - npu_op_name = "MintFunctional.addcmul.0.forward.input.0" - result = 
ms_comparator.process_internal_api_mapping(npu_op_name) - self.assertEqual(result, "Functional.addcmul.0.forward.input.0") - - npu_op_name = "Functional.abs" - result = ms_comparator.process_internal_api_mapping(npu_op_name) - self.assertEqual(result, "Torch.abs") - - def test_get_api_name(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - - ms_comparator = MSComparator(mode_config, mapping_config) - - api_list = ["Functional", "absolute", "0", "forward", "input", "0"] - result = ms_comparator.get_api_name(api_list) - self.assertEqual(result, "Functional.absolute") - - api_list = ["Mint"] - with self.assertRaises(CompareException): - ms_comparator.get_api_name(api_list) - - def test_process_data_name(self): - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - mapping_config = MappingConfig() - ms_comparator = MSComparator(mode_config, mapping_config) - - data = pd.DataFrame({ - 'data_name_x': ['A', 'B', 'C'], - 'data_name_y': ['X', 'Y', 'Z'] - }) - - result = ms_comparator.process_data_name(data.copy()) - - expected = pd.DataFrame({ - 'data_name_x': [['A', 'X'], ['B', 'Y'], ['C', 'Z']], - 'data_name_y': ['X', 'Y', 'Z'] - }) - - pd.testing.assert_frame_equal(result, expected) diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare_utils.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..d7fb5e38fb82b309caf3ab2a1b621655d7babc86 --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/compare/test_ms_compare_utils.py @@ -0,0 +1,24 @@ +import unittest +from unittest.mock import patch + +import numpy as np + +from msprobe.core.common.file_utils import 
FileCheckConst +from msprobe.mindspore.compare.utils import read_npy_data + + +class TestReadNpyData(unittest.TestCase): + + @patch('msprobe.mindspore.compare.utils.load_npy') + @patch('msprobe.mindspore.compare.utils.FileChecker') + @patch('os.path.join', return_value='/fake/path/to/file.npy') + def test_read_real_data_ms(self, mock_os, mock_file_checker, mock_load_npy): + mock_file_checker.return_value.common_check.return_value = '/fake/path/to/file.npy' + + mock_load_npy.return_value = np.array([1.0, 2.0, 3.0]) + + result = read_npy_data('/fake/dir', 'file_name.npy') + + mock_file_checker.assert_called_once_with('/fake/path/to/file.npy', FileCheckConst.FILE, FileCheckConst.READ_ABLE, FileCheckConst.NUMPY_SUFFIX, False) + mock_load_npy.assert_called_once_with('/fake/path/to/file.npy') + self.assertTrue(np.array_equal(result, np.array([1.0, 2.0, 3.0]))) diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_debugger_config.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_debugger_config.py index b345729ae55c9225a594a052229b04276becf3ea..8a7195eac824485e75d8c1ba0752715c7c6a5600 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_debugger_config.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_debugger_config.py @@ -17,10 +17,11 @@ import unittest from unittest.mock import patch from msprobe.core.common.const import Const -from msprobe.core.common_config import CommonConfig, BaseConfig from msprobe.core.common.log import logger +from msprobe.core.common_config import CommonConfig from msprobe.mindspore.common.const import FreeBenchmarkConst from msprobe.mindspore.debugger.debugger_config import DebuggerConfig +from msprobe.mindspore.ms_config import StatisticsConfig class TestDebuggerConfig(unittest.TestCase): @@ -34,12 +35,13 @@ class TestDebuggerConfig(unittest.TestCase): "level": "L2" } common_config = CommonConfig(json_config) - task_config = BaseConfig(json_config) + 
task_config = StatisticsConfig(json_config) debugger_config = DebuggerConfig(common_config, task_config) self.assertEqual(debugger_config.task, Const.STATISTICS) self.assertEqual(debugger_config.file_format, "npy") self.assertEqual(debugger_config.check_mode, "all") self.assertEqual(debugger_config.overflow_nums, 1) + self.assertEqual(debugger_config.tensor_list, []) common_config.level = "L1" common_config.task = Const.FREE_BENCHMARK diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_precision_debugger.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_precision_debugger.py index 790a02b4048cdade6679c0bdc94a03ee21863340..a1a3b2dc17513ca98434243f06175a02c18286d1 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_precision_debugger.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/debugger/test_ms_precision_debugger.py @@ -16,13 +16,14 @@ import unittest from unittest.mock import patch, MagicMock -from msprobe.core.common_config import CommonConfig, BaseConfig from msprobe.core.common.const import Const, MsgConst +from msprobe.core.common_config import CommonConfig from msprobe.mindspore.cell_processor import CellProcessor from msprobe.mindspore.common.const import Const as MsConst from msprobe.mindspore.debugger.debugger_config import DebuggerConfig from msprobe.mindspore.debugger.precision_debugger import PrecisionDebugger from msprobe.mindspore.dump.hook_cell.hook_cell import HOOKCell +from msprobe.mindspore.ms_config import StatisticsConfig from msprobe.mindspore.runtime import Runtime @@ -48,7 +49,7 @@ class TestPrecisionDebugger(unittest.TestCase): } common_config = CommonConfig(json_config) - task_config = BaseConfig(json_config) + task_config = StatisticsConfig(json_config) handler = Handler() mock_get_mode = MagicMock() @@ -94,18 +95,13 @@ class TestPrecisionDebugger(unittest.TestCase): self.assertTrue(Handler.called) def test_stop_step(self): - class MockConfig: - def 
__init__(self): - self.execution_mode = None - self.level = None - self.level_ori = Const.LEVEL_L1 - class MockPrecisionDebugger: def __init__(self): self.task = Const.TENSOR self.service = None - self.config = MockConfig() - + self.config = MagicMock() + self.config.level_ori = MagicMock() + self.config.level_ori.return_value = Const.LEVEL_L1 PrecisionDebugger._instance = None with self.assertRaises(Exception) as context: PrecisionDebugger.stop() diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/handler/test_ms_base_handler.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/handler/test_ms_base_handler.py index d7f5b0745cff481d0bc2e5771df36beb492d4015..91230456911fcace6877869a270264b6b95b6793 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/handler/test_ms_base_handler.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/handler/test_ms_base_handler.py @@ -24,6 +24,7 @@ from msprobe.mindspore.common.log import logger from msprobe.mindspore.free_benchmark.common.handler_params import HandlerParams from msprobe.mindspore.free_benchmark.common.utils import Tools from msprobe.mindspore.free_benchmark.handler.base_handler import BaseHandler +from msprobe.mindspore.dump.hook_cell.api_register import get_api_register class Handler(BaseHandler): @@ -45,6 +46,7 @@ class TestBaseHandler(unittest.TestCase): @classmethod def setUpClass(cls): cls.base_handler = Handler("api_name_with_id") + get_api_register(True).restore_all_api() def test___init__(self): base_handler = Handler("api_name_with_id") @@ -93,7 +95,7 @@ class TestBaseHandler(unittest.TestCase): first_tensor = Tensor([1.0, 1.2], dtype=ms.bfloat16) second_tensor = Tensor([1.5, 2.0], dtype=ms.bfloat16) - target = ops.max(ops.div(second_tensor.to(ms.float32), first_tensor.to(ms.float32)))[0].item() + target = ops.max(ops.div(ops.cast(second_tensor, ms.float32), ops.cast(first_tensor, ms.float32)))[0].item() ret = 
self.base_handler.get_endless_norm(first_tensor, second_tensor, abs_tol) self.assertEqual(ret, target) diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/test_ms_api_pynative_self_check.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/test_ms_api_pynative_self_check.py index 4872527e4c29e4200a9f60459137425cfbf5d73d..e07417aba8c745833a3f551a9e0489d848a58bb1 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/test_ms_api_pynative_self_check.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/free_benchmark/test_ms_api_pynative_self_check.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -99,11 +99,13 @@ class TestApiPyNativeSelfCheck(TestCase): _, forward_hook, backward_hook, _ = self.checker.build_hook("Functional.add.") cell = Cell() + cell.msprobe_input_kwargs = {} with patch("msprobe.mindspore.free_benchmark.api_pynative_self_check.need_wrapper_func", return_value=False): self.assertIsNone(forward_hook(cell, "input", "output")) cell = Cell() + cell.msprobe_input_kwargs = {} self.checker.api_list = ["mindspore.ops.add"] self.checker.ori_func["mindspore.ops.add"] = "add" with patch("msprobe.mindspore.free_benchmark.api_pynative_self_check.need_wrapper_func", return_value=True), \ diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/grad_probe/test_grad_analyzer.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/grad_probe/test_grad_analyzer.py index fefdaffec0798e09a5fc5787d11a0bf89167ecef..af8f6b0477f766507db55dbe345f2d802415dc14 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/grad_probe/test_grad_analyzer.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/grad_probe/test_grad_analyzer.py @@ -34,7 +34,7 @@ class TestGradAnalyzer(TestCase): GradConst.OUTPUT_PATH: self.output_path, GradConst.LEVEL: 
GradConst.LEVEL2, GradConst.BOUNDS: [-0.1, 0.0, 0.1], - GradConst.TIME_STAMP: self.time_stamp + GradConst.TIME_STAMP: self.time_stamp, }[x] })) # Clear dump directory before each test diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_common_func.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_common_func.py new file mode 100644 index 0000000000000000000000000000000000000000..d0753c5c2300b58814504f1e2a0e1bfe7cb56e12 --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_common_func.py @@ -0,0 +1,120 @@ +import pytest +from unittest.mock import patch, MagicMock +from mindspore import nn, context +from mindspore.common.initializer import Normal +import mindspore as ms + +from msprobe.mindspore.monitor.common_func import ( + is_valid_instance, + get_submodules, + get_parameters, + get_rank, + comm_is_initialized, + optimizer_pre_hook, + optimizer_post_hook +) + +TORCH_AVAILABLE = False +try: + import torch + import torch.nn as torch_nn + TORCH_AVAILABLE = True +except ImportError: + TORCH_AVAILABLE = False + + +class TestModelUtils: + @classmethod + def setup_class(cls): + """Setup once for all tests in this class""" + cls.ms_model = MSModel() + if TORCH_AVAILABLE: + cls.torch_model = TorchModel() + + @classmethod + def teardown_class(cls): + """Cleanup after all tests in this class""" + pass + + + def test_is_valid_instance_if_model_is_cell_or_module_then_return_true(self): + with patch('msprobe.mindspore.monitor.common_func.is_mindtorch') as mock_is_mindtorch: + if TORCH_AVAILABLE: + mock_is_mindtorch.return_value = True + assert is_valid_instance(self.torch_model) + mock_is_mindtorch.return_value = False + assert is_valid_instance(self.ms_model) + + def test_is_valid_instance_if_input_is_string_then_return_false(self): + assert not is_valid_instance("not a model") + + def test_is_valid_instance_if_input_is_number_then_return_false(self): + assert not is_valid_instance(123) + + def 
test_get_submodules_if_model_is_valid_then_return_non_empty_dict(self): + with patch('msprobe.mindspore.monitor.common_func.is_mindtorch') as mock_is_mindtorch: + mock_is_mindtorch.return_value = True + if TORCH_AVAILABLE: + submodules = dict(get_submodules(self.torch_model)) + assert len(submodules) > 0 + assert any(name == 'conv1' for name in submodules) + + mock_is_mindtorch.return_value = False + submodules = dict(get_submodules(self.ms_model)) + assert len(submodules) > 0 + assert any(name.endswith('conv1') for name in submodules) + + + def test_get_submodules_if_model_is_invalid_then_return_empty_dict(self): + assert get_submodules("invalid") == {} + + def test_get_parameters_if_model_is_valid_then_return_non_empty_dict(self): + with patch('msprobe.mindspore.monitor.common_func.is_mindtorch') as mock_is_mindtorch: + mock_is_mindtorch.return_value = True + if TORCH_AVAILABLE: + params = dict(get_parameters(self.torch_model)) + assert any(name == 'conv1.weight' for name in params) + mock_is_mindtorch.return_value = False + params = dict(get_parameters(self.ms_model)) + assert any('conv1.weight' in name for name in params) + + + def test_get_parameters_if_model_is_invalid_then_return_empty_dict(self): + assert get_parameters(123) == {} + + def test_get_rank_if_comm_initialized_then_return_integer(self): + rank = get_rank() + assert isinstance(rank, int) + assert rank >= 0 + + def test_comm_is_initialized_when_called_then_return_boolean(self): + assert isinstance(comm_is_initialized(), bool) + + +# Test models +class MSModel(nn.Cell): + def __init__(self): + super().__init__() + self.conv1 = nn.Conv2d(3, 64, 3, has_bias=True, weight_init=Normal(0.02)) + self.bn1 = nn.BatchNorm2d(64) + self.relu = nn.ReLU() + + def construct(self, x): + x = self.conv1(x) + x = self.bn1(x) + x = self.relu(x) + return x + +if TORCH_AVAILABLE: + class TorchModel(torch_nn.Module): + def __init__(self): + super().__init__() + self.conv1 = torch_nn.Conv2d(3, 64, 3) + self.bn1 = 
torch_nn.BatchNorm2d(64) + self.relu = torch_nn.ReLU() + + def forward(self, x): + x = self.conv1(x) + x = self.bn1(x) + x = self.relu(x) + return x \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_opt_collect.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_opt_collect.py new file mode 100644 index 0000000000000000000000000000000000000000..8471c289dce975754a957e2efad2878af1fd588e --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/ms_monitor/test_opt_collect.py @@ -0,0 +1,223 @@ +import pytest +import numpy as np +from mindspore import Tensor, nn, ops +from unittest.mock import MagicMock, patch + +from msprobe.core.common.const import MonitorConst +# Import the classes to test +from msprobe.core.common.log import logger +from msprobe.mindspore.monitor.optimizer_collect import ( + OptimizerMon, + MixPrecisionOptimizerMon, + MegatronDistributedOptimizerMon, + MegatronChainedDistributedOptimizerMon, + MegatronChainedMixPrecisionOptimizerMon, + DeepSpeedZeroOptimizerMon, + DeepSpeedZeroOptimizerStage0Mon, + DeepSpeedZeroOptimizerStage1or2Mon, + DeepSpeedZeroOptimizerStage3Mon, + OptimizerMonFactory +) + +class TestOptimizerMon: + @classmethod + def setup_class(cls): + """Setup once for all tests in this class""" + cls.mock_monitor = MagicMock() + cls.mock_monitor.name2tag = {"test_param": {MonitorConst.POST_GRAD: "test_tag"}} + cls.mock_monitor.duplicate_param = {} + cls.mock_monitor.params_have_main_grad = False + cls.mock_monitor.fsdp_wrapped_module = False + cls.mock_monitor.mv_distribution = True + cls.mock_monitor.mg_direction = True + cls.mock_monitor.ur_distribution = True + cls.mock_monitor.update_heatmap_visualizer = {"test_param": MagicMock()} + cls.mock_monitor.ratio_heatmap_visualizer = {"test_param": MagicMock()} + + def test_fetch_grad_if_param_has_valid_grad_then_return_correct_grad_values(self): + # Setup + param = MagicMock() + expected_grad = Tensor([1.0, 2.0, 
3.0]) + param.grad = expected_grad + params2name = {param: "test_param"} + optimizer = MagicMock() + mon = OptimizerMon(optimizer) + + # Execute + result = mon.fetch_grad(self.mock_monitor, params2name) + + # Verify + assert len(result) == 1 + assert (result["test_tag"] == expected_grad).all() + self.mock_monitor.register_param_call_id.assert_called_once_with("hook_optimizer", "test_tag") + + def test_fetch_grad_if_param_has_main_grad_then_return_main_grad_values(self): + # Setup + param = MagicMock() + expected_grad = Tensor(np.array([1.5, 2.5])) + param.main_grad = expected_grad + param.grad = None + params2name = {param: "test_param"} + optimizer = MagicMock() + self.mock_monitor.params_have_main_grad = True + mon = OptimizerMon(optimizer) + + # Execute + result = mon.fetch_grad(self.mock_monitor, params2name) + + # Verify + assert len(result) == 1 + assert (result["test_tag"] == expected_grad).all() + + def test_fetch_mv_if_state_complete_then_return_correct_momentum_values(self): + # Setup + param = MagicMock() + params2name = {param: "test_param"} + optimizer = MagicMock() + optimizer.state = { + param: { + "exp_avg": Tensor([0.1]), + "exp_avg_sq": Tensor([0.2]), + "step": 10 + } + } + optimizer.defaults = {'betas': (0.9, 0.999), 'eps': 1e-8} + optimizer.param_groups = [{}] + + mon = OptimizerMon(optimizer) + mon.fp16_to_fp32_param = {} + + # Execute + exp_avg, exp_avg_sq, update, ratio = mon.fetch_mv(self.mock_monitor, params2name) + + # Verify + beta1, beta2 = optimizer.defaults['betas'] + step = optimizer.state[param]['step'] + + expected_exp_avg_hat = 0.1 / (1 - beta1**step) + expected_exp_avg_sq_hat = 0.2 / (1 - beta2**step) + expected_update = expected_exp_avg_hat / (np.sqrt(expected_exp_avg_sq_hat) + optimizer.defaults['eps']) + expected_ratio = expected_exp_avg_hat / np.sqrt(expected_exp_avg_sq_hat) + + assert exp_avg["test_param"] == Tensor([0.1]) + assert exp_avg_sq["test_param"] == Tensor([0.2]) + assert update["test_param"] == 
Tensor([expected_update]) + assert ratio["test_param"] == Tensor([expected_ratio]) + + def test_narrow_from_flatten_if_state_not_partitioned_then_return_original_state(self): + # Setup + param = MagicMock() + flatten_state = Tensor([1.0, 2.0, 3.0]) + mon = OptimizerMon(MagicMock()) + + # Execute + result = mon.narrow_from_flatten(param, flatten_state) + + # Verify + assert (result == flatten_state).all() + +class TestMixPrecisionOptimizerMon: + @classmethod + def setup_class(cls): + cls.mock_monitor = MagicMock() + cls.mock_monitor.mv_distribution = True + cls.mock_monitor.mg_direction = True + cls.mock_monitor.ur_distribution = True + cls.mock_monitor.update_heatmap_visualizer = {'param1': MagicMock(), 'param2': MagicMock()} + cls.mock_monitor.ratio_heatmap_visualizer = {'param1': MagicMock(), 'param2': MagicMock()} + + def test_map_fp16_to_fp32_param_if_multiple_groups_then_create_correct_mappings(self): + # Setup + optimizer = MagicMock() + fp16_params = [MagicMock(), MagicMock(), MagicMock()] + fp32_params = [MagicMock(), MagicMock(), MagicMock()] + optimizer.float16_groups = [fp16_params[:2], [fp16_params[2]]] + optimizer.fp32_from_float16_groups = [fp32_params[:2], [fp32_params[2]]] + + mon = MixPrecisionOptimizerMon(optimizer) + + # Execute + mon.map_fp16_to_fp32_param(optimizer) + + # Verify + assert len(mon.fp16_to_fp32_param) == 3 + for fp16, fp32 in zip(fp16_params, fp32_params): + assert mon.fp16_to_fp32_param[fp16] == fp32 + +class TestDeepSpeedZeroOptimizerStage1or2Mon: + @classmethod + def setup_class(cls): + """Setup once for all tests in this class""" + cls.mock_monitor = MagicMock() + cls.mock_monitor.name2tag = {"test_param": {MonitorConst.POST_GRAD: "test_tag"}} + cls.mock_monitor.duplicate_param = {} + cls.mock_monitor.params_have_main_grad = False + cls.mock_monitor.mg_direction = True + cls.mock_monitor.ur_distribution = True + + def test_fetch_grad_if_param_in_partition_then_return_correct_grad_slice(self): + # Setup + optimizer = 
MagicMock() + param = MagicMock() + params2name = {param: "test_param"} + expected_grad = Tensor(np.array([1.0, 2.0, 3.0])) + param.main_grad = expected_grad + param.grad = None + optimizer.bit16_groups = [[param]] + optimizer.cpu_offload = False + mon = DeepSpeedZeroOptimizerStage1or2Mon(optimizer) + mon.param2group = {param: 0} + mon.get_param_index = MagicMock(return_value=1) + mon.param_not_in_partition = MagicMock(return_value=False) + mon.get_position = MagicMock(return_value=(3, 3)) # start at index 3, length 3 + + # MagicMock the averaged_gradients structure + optimizer.averaged_gradients = { + 0: [ + None, # index 0 + Tensor(np.array([1.0, 2.0, 3.0])) # index 1 + ] + } + + # Execute + result = mon.fetch_grad(self.mock_monitor, params2name) + + # Verify + assert len(result) == 1 + assert (result["test_tag"] == expected_grad).all() + +class TestOptimizerMonFactory: + @classmethod + def setup_class(cls): + cls.mock_monitor = MagicMock() + cls.mock_monitor.mv_distribution = True + cls.mock_monitor.mg_direction = True + cls.mock_monitor.ur_distribution = True + cls.mock_monitor.update_heatmap_visualizer = {'param1': MagicMock(), 'param2': MagicMock()} + cls.mock_monitor.ratio_heatmap_visualizer = {'param1': MagicMock(), 'param2': MagicMock()} + + def test_create_optimizer_mon_if_chained_optimizer_then_return_correct_monitor_type(self): + # Setup + base_optimizer = MagicMock() + base_optimizer.__class__.__name__ = "DistributedOptimizer" + optimizer = MagicMock() + optimizer.__class__.__name__ = "ChainedOptimizer" + optimizer.chained_optimizers = [base_optimizer] + + # Execute + result = OptimizerMonFactory.create_optimizer_mon(optimizer) + + # Verify + assert isinstance(result, MegatronChainedDistributedOptimizerMon) + + def test_create_optimizer_mon_if_deepspeed_stage3_then_return_stage3_monitor(self): + # Setup + optimizer = MagicMock() + optimizer.__class__.__name__ = "DeepSpeedZeroOptimizer_Stage3" + + # Execute + result = 
OptimizerMonFactory.create_optimizer_mon(optimizer) + + # Verify + assert isinstance(result, DeepSpeedZeroOptimizerStage3Mon) + assert result.stage == '3' diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/save/test_debugger_save_mindspore.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/save/test_debugger_save_mindspore.py new file mode 100644 index 0000000000000000000000000000000000000000..fcefbb8c339ad6de1d14eaae7f75e6947efc5196 --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/save/test_debugger_save_mindspore.py @@ -0,0 +1,364 @@ +import unittest +import os +import json +import mindspore +import numpy as np +import shutil +from unittest.mock import patch + +from msprobe.mindspore import PrecisionDebugger +from msprobe.core.data_dump.data_processor.mindspore_processor import MindsporeDataProcessor +from msprobe.mindspore.dump.hook_cell.api_register import get_api_register + +current_file = __file__ +parent_dir = os.path.abspath(os.path.dirname(current_file)) +test_dir = os.path.join(parent_dir, "test_dir") + +def deep_compare(obj1, obj2, float_tolerance=1e-5): + """ + Recursively compare two objects to check if they are the same. + Supports nested dictionaries and lists. 
+ """ + if type(obj1) != type(obj2): + return False + if isinstance(obj1, dict): + if obj1.keys() != obj2.keys(): + return False + return all(deep_compare(obj1[key], obj2[key]) for key in obj1) + if isinstance(obj1, (tuple, list)): + if len(obj1) != len(obj2): + return False + return all(deep_compare(item1, item2) for item1, item2 in zip(obj1, obj2)) + if isinstance(obj1, (int, float)): + return abs(obj1 - obj2) < float_tolerance + return obj1 == obj2 + +class TestDebuggerSave(unittest.TestCase): + @staticmethod + def write_config_json(step, async_dump, mode, dump_path, config_file_path): + task = "tensor" if mode == "tensor" else "statistics" + statistics_summary_mode = "statistics" if mode == "statistics" else "md5" + config = { + "task": task, + "dump_path": dump_path, + "rank": [], + "step": step, + "level": "debug", + "enable_dataloader": False, + "async_dump": async_dump, + "statistics": { + "summary_mode": statistics_summary_mode, + } + } + with open(config_file_path, "w", encoding="utf-8") as f: + json.dump(config, f, indent=4, ensure_ascii=False) + + @staticmethod + def read_debug_json_into_dict(debug_json_path): + with open(debug_json_path, "r", encoding="utf-8") as f: + debug_json = json.load(f) + return debug_json + + @staticmethod + def check_real_npy(npy_path, target_ms_tensor, check_values=True, rtol=1e-5, atol=1e-8): + """ + Enhanced version with optional value comparison. 
+ + Args: + npy_path (str): Path to the .npy file + target_ms_tensor: Target mindspore tensor to compare + check_values (bool): If True, also compare array values + rtol, atol: Relative and absolute tolerances for value comparison + + Returns: + bool: True if all checks pass + """ + # Convert mindspore tensor to numpy if needed + if hasattr(target_ms_tensor, 'numpy'): + target_ms_tensor = target_ms_tensor.numpy() + # Load the npy file + try: + npy_data = np.load(npy_path) + except FileNotFoundError: + print(f"Error: The file {npy_path} does not exist.") + return False + except Exception as e: + print(f"Error loading npy file: {e}") + return False + # Check shapes + if npy_data.shape != target_ms_tensor.shape: + print(f"Shape mismatch: npy data shape is {npy_data.shape}, target tensor shape is {target_ms_tensor.shape}") + return False + # Check dtypes + if npy_data.dtype != target_ms_tensor.dtype: + print(f"Dtype mismatch: npy data dtype is {npy_data.dtype}, target tensor dtype is {target_ms_tensor.dtype}") + return False + # Optionally check values + if check_values: + if not np.allclose(npy_data, target_ms_tensor, rtol=rtol, atol=atol): + print("Value mismatch: npy data and target tensor values do not match within the specified tolerances.") + return False + + return True + + def setUp(self): + if not os.path.exists(test_dir): + os.makedirs(test_dir) + PrecisionDebugger._instance = None + self.original_mindspore_special_type = MindsporeDataProcessor.mindspore_special_type + MindsporeDataProcessor.mindspore_special_type = tuple([mindspore.Tensor]) + + def tearDown(self): + if os.path.exists(test_dir): + shutil.rmtree(test_dir) + PrecisionDebugger._instance = None + MindsporeDataProcessor.mindspore_special_type = self.original_mindspore_special_type + get_api_register(True).restore_all_api() + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_save_real_tensor(self, _): + data = {"a": mindspore.Tensor([1., 2.])} 
+ step = [] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + + # check npy file + npy_path = os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "data_dict.0.debug.a.npy") + assert self.check_real_npy(npy_path, data["a"]) + + # check debug json + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "data_name": "data_dict.0.debug.a.npy" + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_save_md5(self, _): + data = {"a": mindspore.Tensor([1., 2.])} + step = [] + async_dump = False + mode = "md5" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check debug json + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "md5": "2e3fa576" + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert 
deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_save_multiple_steps(self, _): + data = {"a": mindspore.Tensor([1., 2.])} + step = [0, 1, 2] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + for _ in step: + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check npy file + for i in step: + npy_path = os.path.join(dump_path, f"step{i}", "rank", "dump_tensor_data", "data_dict.0.debug.a.npy") + assert self.check_real_npy(npy_path, data["a"]) + # check debug json + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "data_name": "data_dict.0.debug.a.npy" + } + } + for i in step: + debug_json_path = os.path.join(dump_path, f"step{i}", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_async_save_tensor(self, _): + data = {"a": mindspore.Tensor([1., 2.])} + step = [] + async_dump = True + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check npy file + npy_path = os.path.join(dump_path, "step0", "rank", 
"dump_tensor_data", "data_dict.0.debug.a.npy") + assert self.check_real_npy(npy_path, data["a"]) + # check debug json + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "data_name": "data_dict.0.debug.a.npy", + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302 + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_async_save_md5(self, _): + # async_dump case, md5 configuration not working,only save statistics + data = {"a": mindspore.Tensor([1., 2.])} + step = [] + async_dump = True + mode = "md5" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check debug json + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302 + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_save_multiple_times(self, _): + data = {"a": mindspore.Tensor([1., 2.])} + step = [] + call_times = 3 + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + 
self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + for _ in range(call_times): + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check npy file + for i in range(call_times): + npy_path = os.path.join(dump_path, "step0", "rank", "dump_tensor_data", f"data_dict.{i}.debug.a.npy") + assert self.check_real_npy(npy_path, data["a"]) + # check debug json + for i in range(call_times): + target_debug_info = { + "a": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "data_name": f"data_dict.{i}.debug.a.npy" + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"][f"data_dict.{i}.debug"], target_debug_info) + + @patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions") + def test_save_complicated_data_structure(self, _): + x = mindspore.Tensor([1., 2.]) + complicated_structure = [{"a_key": x}] + step = [] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(complicated_structure, "complicated_structure") + PrecisionDebugger.step() + complicated_structure_info_list = [ + x, + os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "complicated_structure.0.debug.0.a_key.npy"), + "complicated_structure.0.debug", + [ + { + "a_key": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "data_name": "complicated_structure.0.debug.0.a_key.npy" + } + } + ], 
+ ] + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + target_tensor, target_tensor_path, target_tensor_key, target_tensor_info = complicated_structure_info_list + assert self.check_real_npy(target_tensor_path, target_tensor) + assert deep_compare(debug_json_dict["data"][target_tensor_key], target_tensor_info) \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_cell_processor.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_cell_processor.py index 40f5c0164115e18cdd49c046ce29967e7a3f63eb..64ed1fa578ca22c900adbdca3c789e8e2014cf5c 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_cell_processor.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_cell_processor.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -16,132 +16,337 @@ import unittest from unittest.mock import MagicMock, patch +import mindspore as ms +from mindspore import Tensor +from mindspore.ops.operations import _inner_ops + from msprobe.core.common.const import Const +from msprobe.core.common.exceptions import MsprobeException from msprobe.core.data_dump.scope import ModuleRangeScope -from msprobe.mindspore.cell_processor import CellProcessor - - -class MockCell: - def __init__(self): - self.mindstudio_reserved_name = None +from msprobe.mindspore.cell_processor import CellProcessor, get_cell_construct +from msprobe.mindspore.common.log import logger class TestCellProcessor(unittest.TestCase): + @classmethod + def setUpClass(cls): + CellProcessor.reset_cell_stats() + cls.scope = MagicMock(spec=ModuleRangeScope) + cls.processor = CellProcessor(cls.scope) - def setUp(self): - # 重置静态变量 + @classmethod + def tearDownClass(cls): CellProcessor.reset_cell_stats() - 
self.scope = MagicMock(spec=ModuleRangeScope) - self.processor = CellProcessor(self.scope) - def test_init_with_module_range_scope(self): - self.assertIsInstance(self.processor.scope, ModuleRangeScope) + def test_class_attribute(self): + self.assertTrue(hasattr(CellProcessor, 'cell_count')) + self.assertTrue(hasattr(CellProcessor, 'cell_stack')) + self.assertTrue(hasattr(CellProcessor, 'api_parent_node')) + self.assertTrue(hasattr(CellProcessor, 'module_node')) + self.assertTrue(hasattr(CellProcessor, 'cell_bw_hook_kernels')) + self.assertTrue(hasattr(CellProcessor, 'cell_backward_pre_hook')) + self.assertTrue(hasattr(CellProcessor, 'cell_backward_hook')) - def test_init_with_none_scope(self): + def test__init(self): + self.assertIsInstance(self.processor.scope, ModuleRangeScope) processor = CellProcessor(None) self.assertIsNone(processor.scope) - def test_set_cell_count_new_cell(self): - count = self.processor.set_cell_count("cell1") + def test_get_cell_construct(self): + def construct(self, *args, **kwargs): + return len(args) + + _constrct = get_cell_construct(construct) + ret = _constrct(self, 'argument') + self.assertFalse(hasattr(self, 'msprobe_input_kwargs')) + self.assertEqual(ret, 1) + + setattr(self, 'msprobe_hook', True) + _constrct = get_cell_construct(construct) + ret = _constrct(self, 'argument') + self.assertEqual(self.msprobe_input_kwargs, {}) + self.assertEqual(ret, 1) + + del self.msprobe_hook + del self.msprobe_input_kwargs + + def test_set_and_get_calls_number(self): + CellProcessor.cell_count = {} + count = self.processor.set_and_get_calls_number("cell") self.assertEqual(count, 0) - self.assertEqual(CellProcessor.cell_count["cell1"], 0) + self.assertEqual(CellProcessor.cell_count["cell"], 0) - def test_set_cell_count_existing_cell(self): - self.processor.set_cell_count("cell1") - count = self.processor.set_cell_count("cell1") + count = self.processor.set_and_get_calls_number("cell") self.assertEqual(count, 1) - 
self.assertEqual(CellProcessor.cell_count["cell1"], 1) + self.assertEqual(CellProcessor.cell_count["cell"], 1) + + CellProcessor.cell_count = {} def test_reset_cell_stats(self): - self.processor.set_cell_count("cell1") + CellProcessor.cell_count['cell'] = 0 + CellProcessor.cell_stack.append('cell') + CellProcessor.api_parent_node = 'cell' + CellProcessor.module_node['cell'] = 'null' + CellProcessor.cell_bw_hook_kernels['cell'] = 'bw' + CellProcessor.cell_backward_pre_hook.append('backward_pre_hook') + CellProcessor.cell_backward_hook.append('backward_hook') + CellProcessor.reset_cell_stats() self.assertEqual(CellProcessor.cell_count, {}) self.assertEqual(CellProcessor.cell_stack, []) - self.assertEqual(CellProcessor.api_parent_node, "") + self.assertIsNone(CellProcessor.api_parent_node) self.assertEqual(CellProcessor.module_node, {}) + self.assertEqual(CellProcessor.cell_bw_hook_kernels, {}) + self.assertEqual(CellProcessor.cell_backward_pre_hook, []) + self.assertEqual(CellProcessor.cell_backward_hook, []) - @patch('msprobe.core.common.const.Const') - def test_node_hook_begin(self, mock_const): - mock_const.SEP = "." 
# 确保 SEPARATOR 设置为字符串 - mock_const.START = "start" - cell = MockCell() - self.processor.node_hook("prefix", "start")(cell, "input") - - expected_name = "prefix" + mock_const.SEP + "0" - self.assertEqual(cell.mindstudio_reserved_name, expected_name) - self.assertIn(expected_name, CellProcessor.cell_stack) - self.assertEqual(CellProcessor.api_parent_node, expected_name) - self.scope.begin_module.assert_called_once_with(expected_name) - - @patch('msprobe.core.common.const.Const') - def test_node_hook_end(self, mock_const): - mock_const.START = "start" - cell = MockCell() - self.processor.node_hook("prefix", "start")(cell, "input") - self.processor.node_hook("prefix", "stop")(cell, "input", "output") - - self.assertEqual(len(CellProcessor.cell_stack), 0) - self.assertIsNone(CellProcessor.api_parent_node) - self.scope.end_module.assert_called_once_with(cell.mindstudio_reserved_name) + def test_register_cell_hook(self): + with self.assertRaises(MsprobeException) as context: + self.processor.register_cell_hook([], None) + self.assertEqual(str(context.exception), '[msprobe] 无效参数:The model cannot be None, when level is "L0" or "mix"') - @patch('msprobe.core.common.const.Const') - def test_multiple_node_hook_calls(self, mock_const): - mock_const.SEP = "." 
# 确保 SEPARATOR 设置为字符串 - mock_const.START = "start" - cell = MockCell() + with patch('msprobe.mindspore.cell_processor.is_mindtorch') as mock_is_mindtorch, \ + patch('msprobe.mindspore.cell_processor.get_cells_and_names') as mock_get_cells_and_names, \ + patch('msprobe.mindspore.cell_processor.CellProcessor.build_cell_hook') as mock_build_cell_hook, \ + patch('msprobe.mindspore.cell_processor.get_cell_construct') as mock_get_cell_construct, \ + patch.object(logger, 'info') as mock_logger_info: + mock_cell = MagicMock() + mock_sub_cell = MagicMock() + mock_get_cells_and_names.return_value = {'-1': [('cell', mock_cell), ('sub_cell', mock_sub_cell)]} + mock_build_cell_hook.return_value = 'forward_pre_hook' + mock_get_cell_construct.return_value = '_construct' - # First call - self.processor.node_hook("prefix", "start")(cell, "input") - expected_name1 = "prefix" + mock_const.SEP + "0" + mock_is_mindtorch.return_value = False + setattr(MagicMock, '_run_construct', '_run_construct') + self.processor.register_cell_hook(mock_cell, None) + self.assertTrue(mock_sub_cell.__class__.msprobe_construct) + mock_get_cell_construct.assert_called_with('_run_construct') + self.assertEqual(mock_sub_cell.__class__._run_construct, '_construct') + self.assertTrue(mock_sub_cell.msprobe_hook) + mock_build_cell_hook.assert_called_with('Cell.sub_cell.MagicMock.', None) + mock_cell.assert_not_called() + mock_sub_cell.register_forward_pre_hook.assert_called_with('forward_pre_hook') + mock_sub_cell.register_forward_hook.assert_not_called() + mock_logger_info.assert_called_with('The cell hook function is successfully mounted to the model.') - # Second call - self.processor.node_hook("prefix", "start")(cell, "input") - expected_name2 = "prefix" + mock_const.SEP + "1" + del MagicMock._run_construct + del mock_sub_cell.__class__._run_construct + del mock_sub_cell.__class__.msprobe_construct - self.assertEqual(cell.mindstudio_reserved_name, expected_name2) - 
self.assertEqual(CellProcessor.api_parent_node, expected_name2) + mock_get_cell_construct.reset_mock() + mock_another_sub_cell = MagicMock() + setattr(mock_another_sub_cell.__class__, 'msprobe_construct', True) + mock_get_cells_and_names.return_value = {'-1': [('cell', mock_cell), + ('another_sub_cell', mock_another_sub_cell)]} + self.processor.register_cell_hook(mock_cell, None) + mock_get_cell_construct.assert_not_called() + mock_another_sub_cell.register_forward_pre_hook.assert_called_with('forward_pre_hook') + mock_another_sub_cell.register_forward_hook.assert_not_called() - # End first call - self.processor.node_hook("prefix", "stop")(cell, "input", "output") - self.assertEqual(len(CellProcessor.cell_stack), 1) # Still one item in stack - self.assertEqual(CellProcessor.api_parent_node, expected_name1) + del mock_another_sub_cell.__class__.msprobe_construct - # End second call - self.processor.node_hook("prefix", "stop")(cell, "input", "output") - self.assertEqual(len(CellProcessor.cell_stack), 0) # Stack should be empty now - self.assertIsNone(CellProcessor.api_parent_node) + mock_build_cell_hook.reset_mock() + mock_get_cell_construct.reset_mock() + mock_another_sub_cell.reset_mock() + setattr(MagicMock, '_call_impl', '_call_impl') + mock_is_mindtorch.return_value = True + self.processor.register_cell_hook(mock_cell, None) + self.assertTrue(mock_another_sub_cell.__class__.msprobe_construct) + mock_get_cell_construct.assert_called_with('_call_impl') + mock_build_cell_hook.assert_called_with('Module.another_sub_cell.MagicMock.', None) + mock_cell.assert_not_called() + mock_another_sub_cell.register_forward_pre_hook.assert_called_with('forward_pre_hook') + mock_another_sub_cell.register_forward_hook.assert_not_called() + + del MagicMock._call_impl + del mock_another_sub_cell.__class__._call_impl + del mock_another_sub_cell.__class__.msprobe_construct + + def test_build_cell_hook(self): + CellProcessor.reset_cell_stats() + + cell_name = 'Cell.cell.Cell.' 
+ mock_build_data_hook = MagicMock() + mock_backward_data_hook = MagicMock() + target_grad_output = (Tensor([0.5]),) + mock_backward_data_hook.return_value = target_grad_output + mock_build_data_hook.return_value = (None, None, mock_backward_data_hook, None) + mock_cell = MagicMock() - def test_set_and_get_reserved_name(self): - cell = MockCell() - cell.mindstudio_reserved_name = "mindstudio_reserved_name" + with patch.object(_inner_ops, 'CellBackwardHook') as mock_CellBackwardHook: + forward_pre_hook = self.processor.build_cell_hook(cell_name, mock_build_data_hook) + forward_hook = forward_pre_hook.__closure__[2].cell_contents + + mock_bw = mock_CellBackwardHook.return_value + mock_bw.return_value = (Tensor([0.0]),) + args = (Tensor([1.0]),) + target_args = (Tensor([0.0]),) + full_forward_name = f'{cell_name}{Const.FORWARD}.0' + full_backward_name = f'{cell_name}{Const.BACKWARD}.0' + # call testing function - forward_pre_hook + ret = forward_pre_hook(mock_cell, args) + self.assertIsNone(CellProcessor.module_node[full_forward_name]) + self.assertEqual(CellProcessor.cell_stack, [full_forward_name]) + self.assertEqual(CellProcessor.api_parent_node, full_forward_name) + self.scope.begin_module.assert_called_with(full_forward_name) + mock_build_data_hook.assert_called_with('Module', full_forward_name) + self.assertEqual(len(CellProcessor.cell_backward_hook), 1) + mock_CellBackwardHook.assert_called_with(full_backward_name, mock_cell, + CellProcessor.cell_backward_hook[-1]) + mock_bw.register_backward_hook.assert_called_once() + mock_bw.assert_called_with(*args) + self.assertTrue((ret[0] == target_args[0]).all()) + + backward_hook = CellProcessor.cell_backward_hook[-1][full_backward_name] + grad_input = (Tensor([1.0]),) + grad_output = (Tensor([2.0]),) + # call testing function - backward_hook + ret = backward_hook(mock_cell, grad_input, grad_output) + mock_backward_data_hook.assert_called_with(mock_cell, grad_input, grad_output) + 
self.assertFalse(mock_cell.has_pre_hook_called) + self.assertEqual(CellProcessor.cell_stack, []) + self.assertIsNone(CellProcessor.api_parent_node) + self.scope.end_module.assert_called_with(full_backward_name) + self.assertTrue((ret[0] == target_grad_output[0]).all()) + + mock_build_data_hook.reset_mock() + args = (Tensor([1], dtype=ms.int32),) + full_forward_name = f'{cell_name}{Const.FORWARD}.1' + # call testing function - forward_pre_hook + ret = forward_pre_hook(mock_cell, args) + self.assertIsNone(CellProcessor.module_node[full_forward_name]) + self.assertEqual(CellProcessor.cell_stack, [full_forward_name]) + self.assertEqual(CellProcessor.api_parent_node, full_forward_name) + self.scope.begin_module.assert_called_with(full_forward_name) + self.assertEqual(len(CellProcessor.cell_backward_hook), 1) + mock_build_data_hook.assert_not_called() + + full_forward_name = f'{cell_name}{Const.FORWARD}.0' + CellProcessor.cell_count = {cell_name: 0} + CellProcessor.cell_stack = [full_forward_name] + CellProcessor.api_parent_node = full_forward_name + CellProcessor.module_node = {full_forward_name: None} + self.scope.reset_mock() + mock_CellBackwardHook.reset_mock() + mock_bw.reset_mock() + target_output = Tensor([0.5]) + args = (Tensor([1.0]),) + output = Tensor([2.0]) + mock_bw.return_value = target_output + mock_backward_data_hook.reset_mock() + mock_forward_data_hook_hook = MagicMock() + mock_forward_data_hook_hook.return_value = output + mock_build_data_hook.return_value = (None, mock_forward_data_hook_hook, mock_backward_data_hook, None) + # call testing function - forward_hook + ret = forward_hook(mock_cell, args, output) + self.assertEqual(CellProcessor.cell_count.get(cell_name), 0) + self.assertEqual(CellProcessor.cell_stack, []) + self.assertIsNone(CellProcessor.api_parent_node) + self.scope.end_module.assert_called_with(full_forward_name) + self.assertEqual(mock_bw.call_count, 2) + self.assertEqual(mock_bw.call_args_list[0][0][0], output) + 
self.assertEqual(mock_bw.call_args_list[1][0][0], target_output) + self.assertEqual(mock_CellBackwardHook.call_count, 1) + self.assertEqual(len(CellProcessor.cell_backward_pre_hook), 1) + self.assertTrue((ret == target_output).all()) + + backward_pre_hook = CellProcessor.cell_backward_pre_hook[-1][full_backward_name] + mock_backward_data_hook.reset_mock() + grad_output = (Tensor([2.0]),) + # call testing function - backward_pre_hook + ret = backward_pre_hook(mock_cell, grad_output) + self.assertTrue(mock_cell.has_pre_hook_called) + self.scope.begin_module.assert_called_with(full_backward_name) + self.assertEqual(CellProcessor.cell_stack, [full_backward_name]) + self.assertEqual(CellProcessor.api_parent_node, full_backward_name) + self.assertEqual(CellProcessor.module_node, {full_forward_name: None, full_backward_name: None}) + self.scope.begin_module.assert_called_with(full_backward_name) + mock_backward_data_hook.assert_not_called() + self.assertIsNone(ret) + + CellProcessor.cell_count = {cell_name: 0} + CellProcessor.cell_stack = [full_forward_name] + CellProcessor.api_parent_node = full_forward_name + CellProcessor.module_node = {full_forward_name: None} + mock_bw.reset_mock() + args = (Tensor([1.0]),) + output = (Tensor([2.0]),) + mock_forward_data_hook_hook.return_value = output + target_output = (Tensor([0.5]),) + # call testing function - forward_hook + ret = forward_hook(mock_cell, args, output) + self.assertEqual(mock_bw.call_count, 2) + self.assertEqual(mock_bw.call_args_list[0][0][0], *output) + self.assertEqual(mock_bw.call_args_list[1][0][0], mock_bw.return_value) + self.assertTrue((ret[0] == target_output[0]).all()) + + CellProcessor.cell_count = {cell_name: 0} + CellProcessor.cell_stack = [full_forward_name] + CellProcessor.api_parent_node = full_forward_name + CellProcessor.module_node = {full_forward_name: None} + CellProcessor.cell_bw_hook_kernels.clear() + CellProcessor.cell_backward_pre_hook.clear() + mock_bw.reset_mock() + mock_bw.return_value 
= (Tensor([0.5]),) + output = (Tensor([1.0]), Tensor([2.0])) + mock_forward_data_hook_hook.return_value = output + with self.assertRaises(TypeError) as context: + # call testing function - forward_hook + forward_hook(mock_cell, args, output) + self.assertEqual(str(context.exception), + 'The backward pre hook return value size is 1 not equal to output size 2') + mock_bw.assert_called_with(*output) + + self.scope.reset_mock() + backward_pre_hook = CellProcessor.cell_backward_pre_hook[-1][full_backward_name] + # call testing function - backward_pre_hook + ret = backward_pre_hook(mock_cell, grad_output) + self.assertFalse(mock_cell.has_pre_hook_called) + self.scope.begin_module.assert_called_with(full_backward_name) + mock_backward_data_hook.assert_called_with(mock_cell, (), grad_output) + self.assertEqual(CellProcessor.cell_stack, []) + self.assertIsNone(CellProcessor.api_parent_node) + self.assertEqual(CellProcessor.module_node, {full_forward_name: None, full_backward_name: None}) + self.scope.end_module.assert_called_with(full_backward_name) + self.assertIsNone(ret) + + CellProcessor.reset_cell_stats() + + def test_set_construct_info_in_pre_hook(self): CellProcessor.reset_cell_stats() + self.processor.set_construct_info_in_pre_hook('full_name') + self.assertEqual(CellProcessor.module_node['full_name'], None) + self.assertEqual(CellProcessor.cell_stack, ['full_name']) + self.assertEqual(CellProcessor.api_parent_node, 'full_name') + self.scope.begin_module.assert_called_with('full_name') + + self.scope.begin_module.reset_mock() + self.processor.set_construct_info_in_pre_hook('sub_cell_name') + self.assertEqual(CellProcessor.module_node, {'full_name': None, 'sub_cell_name': 'full_name'}) + self.assertEqual(CellProcessor.cell_stack, ['full_name', 'sub_cell_name']) + self.assertEqual(CellProcessor.api_parent_node, 'sub_cell_name') + self.scope.begin_module.assert_called_with('sub_cell_name') - cell_name = "Cell.net.Net.forward" - ret = 
self.processor.set_and_get_reserved_name(cell, cell_name) - self.assertEqual(ret, cell_name + Const.SEP + "0") - self.assertEqual(cell.mindstudio_reserved_name, ret) - self.assertEqual(CellProcessor.cell_count[cell_name], 0) - self.assertFalse(hasattr(cell, "has_pre_hook_called")) - - cell.has_pre_hook_called = False - ret = self.processor.set_and_get_reserved_name(cell, cell_name) - self.assertEqual(ret, cell_name + Const.SEP + "1") - self.assertEqual(cell.mindstudio_reserved_name, ret) - self.assertEqual(CellProcessor.cell_count[cell_name], 1) - self.assertFalse(cell.has_pre_hook_called) - - cell.has_pre_hook_called = True - cell.mindstudio_reserved_name = "mindstudio_reserved_name" CellProcessor.reset_cell_stats() - ret = self.processor.set_and_get_reserved_name(cell, cell_name) - self.assertEqual(ret, "mindstudio_reserved_name") - self.assertEqual(cell.mindstudio_reserved_name, ret) - self.assertEqual(CellProcessor.cell_count, {}) - self.assertFalse(cell.has_pre_hook_called) - ret = self.processor.set_and_get_reserved_name(cell, cell_name, is_called_by_pre_hook=True) - self.assertEqual(ret, cell_name + Const.SEP + "0") - self.assertEqual(cell.mindstudio_reserved_name, ret) - self.assertEqual(CellProcessor.cell_count[cell_name], 0) - self.assertTrue(cell.has_pre_hook_called) + def test_set_construct_info_in_hook(self): + CellProcessor.reset_cell_stats() + self.processor.set_construct_info_in_hook('full_name') + self.assertIsNone(CellProcessor.api_parent_node) + self.scope.end_module.assert_called_with('full_name') + + self.scope.end_module.reset_mock() + CellProcessor.cell_stack = ['full_name'] + self.processor.set_construct_info_in_hook('full_name') + self.assertEqual(CellProcessor.cell_stack, []) + self.assertIsNone(CellProcessor.api_parent_node) + self.scope.end_module.assert_called_with('full_name') + + self.scope.end_module.reset_mock() + CellProcessor.cell_stack = ['Cell.0', 'Cell.1'] + self.processor.set_construct_info_in_hook('full_name') + 
self.assertEqual(CellProcessor.cell_stack, ['Cell.0']) + self.assertEqual(CellProcessor.api_parent_node, 'Cell.0') + self.scope.end_module.assert_called_with('full_name') + CellProcessor.reset_cell_stats() diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_dump_tool_factory.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_dump_tool_factory.py index b925d9abaa36b0925d763cfeb89b95d89c6a7d09..dede6f01eae96b0c6842c9b8eeed3bc2991020c0 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_dump_tool_factory.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_dump_tool_factory.py @@ -60,12 +60,13 @@ class TestDumpToolFactory(TestCase): with self.assertRaises(ValueError): DumpToolFactory.create(config) mock_logger_error.assert_called_with("Data dump is not supported in None mode when dump level is kernel.") + mock_logger_error.reset_mock() config.execution_mode = Const.GRAPH_GE_MODE config.level = Const.CELL - with self.assertRaises(Exception) as context: + with self.assertRaises(ValueError): DumpToolFactory.create(config) - self.assertEqual(str(context.exception), "The model is empty and cell dump is not enabled.") + mock_logger_error.assert_called_with("Data dump is not supported in graph_ge mode when dump level is cell.") config.execution_mode = Const.GRAPH_KBYK_MODE config.level = Const.KERNEL diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_dump.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_dump.py index 329274b19d862c8c0e50af0fdbd051909e6a60d6..ac353fd8832363ae42872272c8bdeda6e0620d69 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_dump.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_dump.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. 
# # Licensed under the Apache License, Version 2.0 (the "License"); @@ -14,6 +14,7 @@ # limitations under the License. import os +import sys from unittest import TestCase from unittest.mock import patch @@ -44,10 +45,26 @@ class TestKernelGraphDump(TestCase): self.assertEqual(dumper.dump_json["common_dump_settings"]["file_format"], "bin") self.assertEqual(dumper.dump_json["common_dump_settings"]["input_output"], 2) + _msprobe_c_existed = True + try: + from msprobe.lib import _msprobe_c + except ImportError: + _msprobe_c_existed = False + with patch("msprobe.mindspore.dump.kernel_graph_dump.create_directory"), \ patch("msprobe.mindspore.dump.kernel_graph_dump.logger.info"), \ patch("msprobe.mindspore.dump.kernel_graph_dump.save_json") as mock_save_json: + if _msprobe_c_existed: + dumper.handle() + mock_save_json.assert_not_called() + + _msprobe_c_path = _msprobe_c.__file__ + _msprobe_c_test_path = _msprobe_c_path.replace('_msprobe_c.so', '_msprobe_c_test.so') + os.rename(_msprobe_c_path, _msprobe_c_test_path) + sys.modules.pop('msprobe.lib') + sys.modules.pop('msprobe.lib._msprobe_c') + os.environ["GRAPH_OP_RUN"] = "1" with self.assertRaises(Exception) as context: dumper.handle() @@ -63,3 +80,5 @@ class TestKernelGraphDump(TestCase): del os.environ["MINDSPORE_DUMP_CONFIG"] if "MS_ACL_DUMP_CFG_PATH" in os.environ: del os.environ["MS_ACL_DUMP_CFG_PATH"] + if _msprobe_c_existed: + os.rename(_msprobe_c_test_path, _msprobe_c_path) diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_overflow_check.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_overflow_check.py index b484bc9b7cdceec3b8906600b16b2d4fdc6b1b5e..67118ceaf780bd227ada242f2c1ca4b5a925127e 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_overflow_check.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_kernel_graph_overflow_check.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. 
+# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -14,6 +14,7 @@ # limitations under the License. import os +import sys from unittest import TestCase from unittest.mock import patch @@ -41,11 +42,27 @@ class TestKernelGraphOverflowCheck(TestCase): checker = KernelGraphOverflowCheck(config) self.assertEqual(checker.dump_json["common_dump_settings"]["op_debug_mode"], 2) + _msprobe_c_existed = True + try: + from msprobe.lib import _msprobe_c + except ImportError: + _msprobe_c_existed = False + os.environ["MS_ACL_DUMP_CFG_PATH"] = "path" with patch("msprobe.mindspore.overflow_check.kernel_graph_overflow_check.create_directory"), \ patch("msprobe.mindspore.overflow_check.kernel_graph_overflow_check.logger.info"), \ patch("msprobe.mindspore.overflow_check.kernel_graph_overflow_check.save_json") as mock_save_json: + if _msprobe_c_existed: + checker.handle() + mock_save_json.assert_not_called() + + _msprobe_c_path = _msprobe_c.__file__ + _msprobe_c_test_path = _msprobe_c_path.replace('_msprobe_c.so', '_msprobe_c_test.so') + os.rename(_msprobe_c_path, _msprobe_c_test_path) + sys.modules.pop('msprobe.lib') + sys.modules.pop('msprobe.lib._msprobe_c') + os.environ["GRAPH_OP_RUN"] = "1" with self.assertRaises(Exception) as context: checker.handle() @@ -60,3 +77,5 @@ class TestKernelGraphOverflowCheck(TestCase): if "MINDSPORE_DUMP_CONFIG" in os.environ: del os.environ["MINDSPORE_DUMP_CONFIG"] + if _msprobe_c_existed: + os.rename(_msprobe_c_test_path, _msprobe_c_path) diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_debug_save.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_debug_save.py index 7af7dd89727d98a595103c6de9ee0106b0865665..064dd8192ae66c34616c532c7d1acb7a49845add 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_debug_save.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_debug_save.py @@ -38,4 +38,41 @@ 
class TestMindsporeDebuggerSave(TestCase): task_config = BaseConfig(statistics_task_json) with patch("msprobe.mindspore.debugger.precision_debugger.parse_json_config", return_value=(common_config, task_config)), \ patch("msprobe.mindspore.debugger.precision_debugger.set_register_backward_hook_functions"): - self.debugger = PrecisionDebugger() \ No newline at end of file + self.debugger = PrecisionDebugger() + + def test_forward_and_backward(self): + def forward_func(x, y): + PrecisionDebugger.save(x, "x_tensor") + return x * y + x = mindspore.Tensor([1.]) + y = mindspore.Tensor([2.]) + result_json = { + "task": "statistics", + "level": "debug", + "framework": "mindspore", + "dump_data_dir": None, + "data": { + "x_tensor.0.debug": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": (1,) + }, + "x_tensor_grad.0.debug": { + "type": "mindspore.Tensor", + "dtype": "Float32", + "shape": (1,) + } + } + } + + + grad_fn = mindspore.value_and_grad(forward_func, (0, 1)) + grad_fn(x, y) + + result = self.debugger.service.data_collector.data_writer.cache_debug + # Remove 'tensor_stat_index' from all entries in the data dictionary + for key in result["data"]: + if 'tensor_stat_index' in result["data"][key]: + del result["data"][key]['tensor_stat_index'] + + self.assertEqual(result, result_json) \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_service.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_service.py index cbb465c2b272d32605864c3cdd96198f594bb543..74f9dbca2cd92d448ee8dbd5667ec03a76342ad0 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_service.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_ms_service.py @@ -19,15 +19,13 @@ from collections import defaultdict from unittest.mock import MagicMock, patch from mindspore import nn, ops +import torch from msprobe.core.common.exceptions import MsprobeException from msprobe.core.common.utils import Const -from 
msprobe.core.data_dump.api_registry import ApiRegistry from msprobe.core.data_dump.scope import BaseScope from msprobe.mindspore.cell_processor import CellProcessor from msprobe.mindspore.common.log import logger -from msprobe.mindspore.common.utils import register_backward_hook_functions -from msprobe.mindspore.dump.hook_cell.api_register import get_api_register from msprobe.mindspore.dump.hook_cell.hook_cell import HOOKCell from msprobe.mindspore.dump.jit_dump import JitDump from msprobe.mindspore.service import Service @@ -41,36 +39,88 @@ class TestService(unittest.TestCase): self.config_mock.step = [] self.config_mock.rank = [] self.config_mock.task = Const.TENSOR - self.config_mock.framework = Const.MS_FRAMEWORK self.config_mock.list = [] self.config_mock.scope = [] - self.service = Service(self.config_mock) - self.service.model = MagicMock(spec=nn.Cell) - self.service.data_collector = MagicMock() - self.service.primitive_hook_service = MagicMock() - - def tearDown(self) -> None: - get_api_register().restore_all_api() + with patch('msprobe.mindspore.service.build_data_collector'), \ + patch('msprobe.mindspore.service.CellProcessor'), \ + patch('msprobe.mindspore.service.PrimitiveHookService'), \ + patch('msprobe.mindspore.service.get_api_register'): + self.service = Service(self.config_mock) def test_init(self): - self.assertEqual(self.service.config.level, "L0") - self.assertFalse(self.service.switch) - self.assertFalse(self.service.should_stop_service) - self.assertFalse(self.service.start_call) - self.assertTrue(self.service.first_start) - - def test_check_model_valid_with_valid_cell(self): - model = nn.Cell() - model_list = [model] - self.assertEqual(self.service.check_model_valid(model), model) - self.assertEqual(self.service.check_model_valid(model_list), model_list) - - def test_check_model_valid_with_invalid_type(self): - model = nn.Cell() - with self.assertRaises(MsprobeException): - self.service.check_model_valid("not a cell") - with 
self.assertRaises(MsprobeException): - self.service.check_model_valid(["not a cell", model]) + with patch('msprobe.mindspore.service.build_data_collector') as mock_build_data_collector, \ + patch('msprobe.mindspore.service.CellProcessor') as mock_CellProcessor, \ + patch('msprobe.mindspore.service.PrimitiveHookService') as mock_PrimitiveHookService, \ + patch('msprobe.mindspore.service.get_api_register') as mock_get_api_register, \ + patch.object(Service, 'register_api_hook') as mock_register_api_hook: + self.service = Service(self.config_mock) + self.assertIsNone(self.service.model) + self.assertEqual(self.service.config.level_ori, Const.LEVEL_L0) + self.assertEqual(self.service.config.dump_path, '/tmp/dump') + self.assertEqual(self.service.config.step, []) + self.assertEqual(self.service.config.rank, []) + self.assertEqual(self.service.config.task, Const.TENSOR) + self.assertEqual(self.service.config.list, []) + self.assertEqual(self.service.config.scope, []) + self.assertEqual(self.service.config.level, Const.LEVEL_L0) + mock_build_data_collector.assert_called_with(self.service.config) + mock_CellProcessor.assert_called_with(mock_build_data_collector.return_value.scope) + mock_PrimitiveHookService.assert_called_with(self.service) + self.assertFalse(self.service.switch) + self.assertFalse(self.service.inner_switch) + self.assertFalse(self.service.primitive_switch) + self.assertEqual(self.service.current_iter, 0) + self.assertEqual(self.service.loop, 0) + self.assertEqual(self.service.init_step, 0) + self.assertTrue(self.service.first_start) + self.assertIsNone(self.service.current_rank) + self.assertIsNone(self.service.dump_iter_dir) + self.assertFalse(self.service.start_call) + self.assertFalse(self.service.should_stop_service) + self.assertEqual(self.service.params_grad_info, {}) + self.assertEqual(self.service.hook_handle_dict, {}) + mock_get_api_register.assert_called_with() + mock_register_api_hook.assert_called_with() + + def test_check_model_valid(self): + 
with patch('msprobe.mindspore.service.is_mindtorch') as mock_is_mindtorch: + mock_is_mindtorch.return_value = False + model = None + self.assertIsNone(self.service.check_model_valid(model)) + model = 'model' + with self.assertRaises(MsprobeException) as context: + self.service.check_model_valid(model) + self.assertEqual(context.exception.code, MsprobeException.INVALID_PARAM_ERROR) + self.assertIn("The 'model' parameter must be a mindspore.nn.Cell or list[mindspore.nn.Cell] type, " + "currently there is a type.", str(context.exception)) + model = nn.Cell() + self.assertEqual(self.service.check_model_valid(model), model) + models = [model] + self.assertEqual(self.service.check_model_valid(models), models) + models = [model, 'model'] + with self.assertRaises(MsprobeException) as context: + self.service.check_model_valid(models) + self.assertEqual(context.exception.code, MsprobeException.INVALID_PARAM_ERROR) + self.assertIn("The 'model' parameter must be a mindspore.nn.Cell or list[mindspore.nn.Cell] type, " + "currently there is a type.", str(context.exception)) + + mock_is_mindtorch.return_value = True + model = 'model' + with self.assertRaises(MsprobeException) as context: + self.service.check_model_valid(model) + self.assertEqual(context.exception.code, MsprobeException.INVALID_PARAM_ERROR) + self.assertIn("The 'model' parameter must be a torch.nn.Module or list[torch.nn.Module] type, " + "currently there is a type.", str(context.exception)) + model = torch.nn.Module() + self.assertEqual(self.service.check_model_valid(model), model) + models = [model] + self.assertEqual(self.service.check_model_valid(models), models) + models = [model, 'model'] + with self.assertRaises(MsprobeException) as context: + self.service.check_model_valid(models) + self.assertEqual(context.exception.code, MsprobeException.INVALID_PARAM_ERROR) + self.assertIn("The 'model' parameter must be a torch.nn.Module or list[torch.nn.Module] type, " + "currently there is a type.", 
str(context.exception)) def test_update_primitive_counters(self): self.service.primitive_counters = {} @@ -85,35 +135,59 @@ class TestService(unittest.TestCase): self.service.current_rank = 0 self.service.data_collector.tasks_need_tensor_data = [Const.TENSOR] self.service.data_collector.update_dump_paths = MagicMock() - self.service.create_dirs() expected_calls = [ ("/tmp/dump"), ("/tmp/dump/step1/rank0"), "/tmp/dump/step1/rank0/dump_tensor_data" ] - mock_create_directory.assert_has_calls( - [unittest.mock.call(path) for path in expected_calls], any_order=True) - - args, _ = self.service.data_collector.update_dump_paths.call_args - self.assertEqual(args[0].dump_file_path, "/tmp/dump/step1/rank0/dump.json") - self.assertEqual(args[0].stack_file_path, "/tmp/dump/step1/rank0/stack.json") - self.assertEqual(args[0].construct_file_path, "/tmp/dump/step1/rank0/construct.json") - self.assertEqual(args[0].dump_tensor_data_dir, "/tmp/dump/step1/rank0/dump_tensor_data") - self.service.data_collector.initialize_json_file.assert_called_once_with( - framework=Const.MS_FRAMEWORK - ) - + with patch('msprobe.mindspore.service.is_mindtorch') as mock_is_mindtorch: + mock_is_mindtorch.return_value = False + self.service.create_dirs() + mock_create_directory.assert_has_calls( + [unittest.mock.call(path) for path in expected_calls], any_order=True) + + args, _ = self.service.data_collector.update_dump_paths.call_args + self.assertEqual(args[0].dump_file_path, "/tmp/dump/step1/rank0/dump.json") + self.assertEqual(args[0].stack_file_path, "/tmp/dump/step1/rank0/stack.json") + self.assertEqual(args[0].construct_file_path, "/tmp/dump/step1/rank0/construct.json") + self.assertEqual(args[0].dump_tensor_data_dir, "/tmp/dump/step1/rank0/dump_tensor_data") + self.service.data_collector.initialize_json_file.assert_called_once_with( + framework=Const.MS_FRAMEWORK + ) + + mock_create_directory.reset_mock() + self.service.data_collector.update_dump_paths.reset_mock() + 
self.service.data_collector.initialize_json_file.reset_mock() + + mock_is_mindtorch.return_value = True + self.service.create_dirs() + mock_create_directory.assert_has_calls( + [unittest.mock.call(path) for path in expected_calls], any_order=True) + + args, _ = self.service.data_collector.update_dump_paths.call_args + self.assertEqual(args[0].dump_file_path, "/tmp/dump/step1/rank0/dump.json") + self.assertEqual(args[0].stack_file_path, "/tmp/dump/step1/rank0/stack.json") + self.assertEqual(args[0].construct_file_path, "/tmp/dump/step1/rank0/construct.json") + self.assertEqual(args[0].dump_tensor_data_dir, "/tmp/dump/step1/rank0/dump_tensor_data") + self.service.data_collector.initialize_json_file.assert_called_once_with( + framework=Const.MT_FRAMEWORK + ) + + @patch.object(Service, 'check_model_valid') @patch.object(Service, 'need_end_service', return_value=False) - def test_start_stop_cycle(self, mock_need_end_service): + def test_start_stop_cycle(self, mock_need_end_service, mock_check_model_valid): self.service.model = nn.Cell() - with patch.object(self.service, 'register_cell_hook') as mock_register_hook: - self.should_stop_service = False - self.service.start(self.service.model) - self.assertTrue(self.service.switch) - self.service.stop() - self.assertFalse(self.service.switch) - mock_register_hook.assert_called_once() - mock_need_end_service.assert_called_once() + mock_check_model_valid.return_value = self.service.model + self.should_stop_service = False + self.service.start(self.service.model) + mock_check_model_valid.assert_called_with(self.service.model) + self.assertTrue(self.service.switch) + self.service.stop() + self.assertFalse(self.service.switch) + self.service.cell_processor.register_cell_hook.assert_called_once() + mock_need_end_service.assert_called_once() + + self.service.cell_processor.register_cell_hook.reset_mock() def test_should_execute_hook_return_false(self): cell = MagicMock() @@ -174,17 +248,16 @@ class TestService(unittest.TestCase): 
@patch.object(Service, 'need_end_service', return_value=False) @patch.object(logger, 'info') - @patch.object(Service, 'register_cell_hook') @patch.object(Service, 'register_primitive_hook') @patch.object(Service, 'create_dirs') @patch('msprobe.mindspore.service.get_rank_if_initialized', return_value=0) def test_start_first_time(self, mock_get_rank, mock_create_dirs, mock_register_primitive_hook, - mock_register_cell_hook, mock_logger, mock_need_end_service): + mock_logger, mock_need_end_service): self.service.first_start = True self.service.should_stop_service = False self.service.start(self.service.model) mock_get_rank.assert_called_once() - mock_register_cell_hook.assert_called_once() + self.service.cell_processor.register_cell_hook.assert_called_once() mock_register_primitive_hook.assert_called_once() mock_need_end_service.assert_called_once() mock_create_dirs.assert_called_once() @@ -193,33 +266,35 @@ class TestService(unittest.TestCase): self.assertTrue(self.service.primitive_switch) mock_logger.assert_called_with(f"Dump data will be saved in {self.service.dump_iter_dir}.") + self.service.cell_processor.register_cell_hook.reset_mock() + @patch.object(Service, 'register_primitive_hook') - @patch.object(Service, 'register_cell_hook') @patch.object(Service, 'need_end_service', return_value=False) @patch.object(JitDump, 'set_config') @patch.object(JitDump, 'set_data_collector') - @patch.object(ApiRegistry, 'register_all_api') - def test_start_with_jit_dump_enabled(self, mock_api_set_hook_func, mock_set_data_collector, - mock_set_config, mock_need_end_service, mock_register_cell_hook, - mock_register_primitive_hook): + def test_start_with_jit_dump_enabled(self, mock_set_data_collector, mock_set_config, + mock_need_end_service, mock_register_primitive_hook): self.service.config.level = Const.LEVEL_MIX self.service.first_start = True self.service.should_stop_service = False self.service.start(self.service.model) mock_set_config.assert_called_with(self.service.config) 
mock_set_data_collector.assert_called_with(self.service.data_collector) - mock_api_set_hook_func.assert_called_once() + self.service.api_register.register_all_api.assert_called_once() mock_need_end_service.assert_called_once() - mock_register_cell_hook.assert_called_once() + self.service.cell_processor.register_cell_hook.assert_called_once() mock_register_primitive_hook.assert_called_once() self.assertTrue(JitDump.jit_dump_switch) + self.service.api_register.register_all_api.reset_mock() + self.service.cell_processor.register_cell_hook.reset_mock() + def test_step_updates(self): CellProcessor.cell_count = {"test_api": 1} HOOKCell.cell_count = {"test_api": 1} JitDump.jit_count = {"test_api": 1} self.service.primitive_hook_service.primitive_counters = {"test_api": 1} - self.service.loop = 0 + self.service.loop = 0 self.service.step() self.assertEqual(self.service.loop, 1) self.service.data_collector.reset_status.assert_called_once() @@ -236,14 +311,13 @@ class TestService(unittest.TestCase): self.service.data_collector.backward_data_collect = MagicMock() mock_cell = MagicMock() - mock_cell.mindstudio_reserved_name = "TestCell" mock_input = (MagicMock(),) mock_output = MagicMock() - _, forward_hook, backward_hook, _ = self.service.build_hook(BaseScope.Module_Type_Module, "TestHook") + _, forward_hook, backward_hook, _ = self.service.build_hook(BaseScope.Module_Type_Module, "TestHook.forward.0") forward_hook(mock_cell, mock_input, mock_output) - self.service.data_collector.update_api_or_module_name.assert_called_with('TestCell') + self.service.data_collector.update_api_or_module_name.assert_called_with('TestHook.forward.0') self.service.data_collector.forward_data_collect.assert_called() self.service.data_collector.reset_mock() @@ -252,52 +326,33 @@ class TestService(unittest.TestCase): mock_grad_output = MagicMock() backward_hook(mock_cell, mock_grad_input, mock_grad_output) - 
self.service.data_collector.update_api_or_module_name.assert_called_with('TestHookbackward.0') + self.service.data_collector.update_api_or_module_name.assert_called_with('TestHook.backward.0') self.service.data_collector.backward_data_collect.assert_called() def test_register_primitive_hook(self): self.service.config.level = Const.LEVEL_MIX primitive_attr = ops.Add() primitive_name = "primitive_api" + mock_model = MagicMock() cell_mock = MagicMock() cell_mock.primitive_api = primitive_attr primitive_combined_name = primitive_name + Const.SEP + primitive_attr.__class__.__name__ - self.service.model.cells_and_names.return_value = [("cell_name", cell_mock)] - self.service.register_primitive_hook() + self.service.model = mock_model + with patch('msprobe.mindspore.service.get_cells_and_names') as mock_get_cells_and_names: + mock_get_cells_and_names.return_value = {'-1': [("cell_name", cell_mock)]} + self.service.register_primitive_hook() self.assertTrue(hasattr(primitive_attr.__class__, '__call__')) self.assertEqual(self.service.primitive_hook_service.wrap_primitive.call_args[0][1], primitive_combined_name) - @patch.object(ApiRegistry, 'initialize_hook') - @patch.object(ApiRegistry, 'register_all_api') @patch("msprobe.mindspore.service.logger.info") - def test_register_hook_new_with_level_mix(self, mock_logger, mock_api_set_hook_func, mock_initialize_hook): + def test_register_hook_new_with_level_mix(self, mock_logger): self.service.config.level = Const.LEVEL_MIX self.service.register_api_hook() - self.service.register_cell_hook() - mock_logger.assert_called_with(f"The cell {self.service.config.task} hook function " - "is successfully mounted to the model.") - mock_api_set_hook_func.assert_called() - mock_initialize_hook.assert_called() - - @patch.object(CellProcessor, 'node_hook') - def test_register_hook_new_with_level_l0(self, mock_node_hook): - global register_backward_hook_functions - self.service.config.level = Const.LEVEL_L0 - cell_mock = MagicMock() - 
setattr(MagicMock, 'construct', None) - self.service.model.cells_and_names.return_value = [("cell_name", cell_mock)] - register_backward_hook_functions["pre"] = cell_mock.register_backward_pre_hook - register_backward_hook_functions["full"] = cell_mock.register_backward_hook - self.service.register_cell_hook() - cell_mock.register_forward_hook.assert_called() - cell_mock.register_backward_hook.assert_called() - mock_node_hook.assert_called() - register_backward_hook_functions = {} - del MagicMock.construct - - def test_register_hook_new_without_model_raises_exception(self): - self.service.config.level = Const.LEVEL_L0 - self.service.model = None - with self.assertRaises(MsprobeException): - self.service.register_cell_hook() + mock_logger.assert_called_with(f'The api {self.service.config.task} hook function ' + 'is successfully mounted to the model.') + self.service.api_register.initialize_hook.assert_called_once() + self.service.api_register.register_all_api.assert_called_once() + + self.service.api_register.initialize_hook.reset_mock() + self.service.api_register.register_all_api.reset_mock() diff --git a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_primitive_dump.py b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_primitive_dump.py index 79deeee08e13273f08f32be26a375d1d26f5d2f1..734be70a5865fcaeaf2c66a0aed8b07c84322cda 100644 --- a/debug/accuracy_tools/msprobe/test/mindspore_ut/test_primitive_dump.py +++ b/debug/accuracy_tools/msprobe/test/mindspore_ut/test_primitive_dump.py @@ -1,8 +1,7 @@ -#!/usr/bin/env python3 -# -*- coding: utf-8 -*- -""" -# Copyright (C) 2024-2024. Huawei Technologies Co., Ltd. All rights reserved. -# Licensed under the Apache License, Version 2.0 (the "License"); +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
# You may obtain a copy of the License at # @@ -13,96 +12,22 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -""" + +from collections import defaultdict +import tempfile import unittest -import mindspore as ms -import numpy as np -import os from unittest.mock import Mock, patch -from mindspore import nn +import numpy as np +import mindspore as ms +from mindspore import Tensor, ops -import tempfile from msprobe.core.common.utils import Const from msprobe.mindspore.service import Service -from msprobe.core.common.exceptions import MsprobeException from msprobe.core.common_config import CommonConfig, BaseConfig from msprobe.mindspore.debugger.debugger_config import DebuggerConfig from msprobe.mindspore.dump.hook_cell.hook_cell import HOOKCell -from collections import defaultdict from msprobe.mindspore.dump.hook_cell.primitive_hooks import PrimitiveHookService -from mindspore.common.tensor import Tensor - - -class DummyModel(nn.Cell): - def __init__(self): - super(DummyModel, self).__init__() - self.dense = nn.Dense(2, 2) - - def construct(self, x): - return self.dense(x) - - -class TestService(unittest.TestCase): - @patch("msprobe.mindspore.debugger.debugger_config.create_directory") - def setUp(self, _): - json_config = { - "task": "statistics", - "dump_path": "/absolute_path", - "rank": [], - "step": [0, 2], - "level": "L1" - } - - common_config = CommonConfig(json_config) - task_config = BaseConfig(json_config) - config = DebuggerConfig(common_config, task_config) - self.service = Service(config) - self.service.model = Mock() - self.service.data_collector = Mock() - self.service.switch = True # Make sure the switch is on for testing - self.service.primitive_switch = True # Make sure the switch is on for testing - - def test_check_model_valid_none(self): - model = None - self.assertIsNone(self.service.check_model_valid(model)) - - def 
test_check_model_valid_valid_model(self): - model = DummyModel() - self.assertEqual(self.service.check_model_valid(model), model) - - def test_check_model_valid_invalid_model(self): - model = "invalid_model" - with self.assertRaises(MsprobeException) as context: - self.service.check_model_valid(model) - - def test_update_primitive_counters(self): - primitive_name = "test_primitive" - self.service.primitive_hook_service.update_primitive_counters(primitive_name) - self.assertEqual(self.service.primitive_hook_service.primitive_counters[primitive_name], 0) - self.service.primitive_hook_service.update_primitive_counters(primitive_name) - self.assertEqual(self.service.primitive_hook_service.primitive_counters[primitive_name], 1) - - def test_step_updates_iteration(self): - initial_iter = self.service.loop - self.service.step() - self.assertEqual(self.service.loop, initial_iter + 1) - - @patch.object(HOOKCell, 'cell_count', new_callable=lambda: defaultdict(int)) - def test_step_resets_counters(self, _): - # Assume some primitive_counters already exist before step is called - self.service.primitive_hook_service.primitive_counters["test_primitive"] = 5 - self.service.step() - self.assertEqual(self.service.primitive_hook_service.primitive_counters, {}) - self.assertEqual(HOOKCell.cell_count, defaultdict(int)) - - def test_start_calls_update_iter(self): - # Check that update_iter is called when start is called - with patch.object(self.service.data_collector, 'update_iter') as mock_update_iter: - initial_iter = self.service.loop - init_step = self.service.init_step - self.service.start() - mock_update_iter.assert_called_once_with(initial_iter + init_step) class TestPrimitiveHookService(unittest.TestCase): @@ -121,19 +46,14 @@ class TestPrimitiveHookService(unittest.TestCase): common_config = CommonConfig(json_config) task_config = BaseConfig(json_config) config = DebuggerConfig(common_config, task_config) - self.service = Service(config) - self.service.model = Mock() - self.service.data_collector = Mock() - self.service.switch
= True # Make sure the switch is on for testing - - # Mock a service_instance and a data_collector - self.mock_service_instance = Service(config) - self.mock_service_instance.switch = True - self.mock_service_instance.data_collector = Mock() - self.mock_service_instance.data_collector.dump_file_path = json_config["dump_path"] - # Initialize PrimitiveHookService - self.primitive_hook_service = PrimitiveHookService(self.mock_service_instance) + with patch('msprobe.mindspore.service.build_data_collector'), \ + patch('msprobe.mindspore.service.CellProcessor'), \ + patch('msprobe.mindspore.service.PrimitiveHookService'), \ + patch('msprobe.mindspore.service.get_api_register'): + self.mock_service_instance = Service(config) + self.mock_service_instance.switch = True + self.primitive_hook_service = PrimitiveHookService(self.mock_service_instance) def tearDown(self): # Remove the temporary directory when the test ends @@ -148,7 +68,6 @@ class TestPrimitiveHookService(unittest.TestCase): # Call wrap_primitive to get the wrapper function, then invoke backward_hook explicitly through its closure hook_primitive_inputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[0].cell_contents - wrapped_primitive_call = self.primitive_hook_service.wrap_primitive(None, "example") create_backward_hook = hook_primitive_inputs.__closure__[0].cell_contents @@ -163,7 +82,6 @@ class TestPrimitiveHookService(unittest.TestCase): backward_hook(grad_2) self.assertEqual(len(captured_grads), 6) # Two gradients captured - print(f"1After first backward_hook call, len(captured_grads): {len(captured_grads)}") # Call count reached the threshold, verify data collection self.assertTrue(self.mock_service_instance.data_collector.backward_output_data_collect.called) @@ -177,7 +95,6 @@ class TestPrimitiveHookService(unittest.TestCase): # Call wrap_primitive to get the wrapper function, then invoke backward_hook explicitly through its closure hook_primitive_inputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[0].cell_contents - wrapped_primitive_call = self.primitive_hook_service.wrap_primitive(None, "example") create_backward_hook = hook_primitive_inputs.__closure__[0].cell_contents
@@ -214,14 +131,7 @@ class TestPrimitiveHookService(unittest.TestCase): # Call wrap_primitive to get the wrapper function, then invoke backward_hook explicitly through its closure hook_primitive_inputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[0].cell_contents - wrapped_primitive_call = self.primitive_hook_service.wrap_primitive(None, "example") - if wrapped_primitive_call.__closure__: - for i, closure in enumerate(wrapped_primitive_call.__closure__): - print(f"Closure[{i}]:", closure.cell_contents) - - if hook_primitive_inputs.__closure__: - for i, closure in enumerate(hook_primitive_inputs.__closure__): - print(f"2Closure[{i}]:", closure.cell_contents) + create_backward_hook = hook_primitive_inputs.__closure__[0].cell_contents backward_hook = create_backward_hook(captured_grads, num_tensors, updated_primitive_name, hook_type) @@ -235,7 +145,6 @@ class TestPrimitiveHookService(unittest.TestCase): backward_hook(grad_2) self.assertEqual(len(captured_grads), 6) # Two gradients captured - print(f"After first backward_hook call, len(captured_grads): {len(captured_grads)}") # Call count reached the threshold, verify data collection self.assertTrue(self.mock_service_instance.data_collector.backward_input_data_collect.called) @@ -282,18 +191,15 @@ class TestPrimitiveHookService(unittest.TestCase): updated_primitive_name = "test_primitive_input" # Call hook_primitive_inputs - hooked_inputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[0].cell_contents(args, - captured_grads_input, - updated_primitive_name) - - # Verify that hooks were correctly added to hooked_inputs - for arg, hooked_arg in zip(args, hooked_inputs): - if isinstance(arg, Tensor): - print(f"Captured hooked_arg after hook: {hooked_arg}") - self.assertTrue(hasattr(hooked_arg, 'grad_fn')) - - # Print debug information - print(f"Captured gradients after hook: {captured_grads_input}") + hook_primitive_inputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[0].cell_contents + with patch.object(ops, 'HookBackward') as mock_HookBackward: + target_value = Tensor([1.0]) + mock_hbw =
mock_HookBackward.return_value + mock_hbw.return_value = target_value + hooked_inputs = hook_primitive_inputs(args, captured_grads_input, updated_primitive_name) + self.assertEqual(mock_HookBackward.call_count, len(args)) + for hooked_input in hooked_inputs: + self.assertTrue((hooked_input == target_value).all()) def test_hook_primitive_outputs(self): # Mock the forward outputs @@ -302,17 +208,16 @@ class TestPrimitiveHookService(unittest.TestCase): updated_primitive_name = "test_primitive_output" # Call hook_primitive_outputs - hook_primitive_outputs = self.primitive_hook_service.wrap_primitive(None, "example").__closure__[ - 1].cell_contents - hooked_outputs = hook_primitive_outputs(out, captured_grads_output, updated_primitive_name) - - # Verify that hooks were correctly added to hooked_outputs - for tensor, hooked_tensor in zip(out, hooked_outputs): - if isinstance(tensor, Tensor): - self.assertTrue(hasattr(hooked_tensor, 'grad_fn')) - - # Print debug information - print(f"Captured gradients after output hook: {captured_grads_output}") + hook_primitive_outputs = self.primitive_hook_service.wrap_primitive(None, + "example").__closure__[1].cell_contents + with patch.object(ops, 'HookBackward') as mock_HookBackward: + target_value = Tensor([1.0]) + mock_hbw = mock_HookBackward.return_value + mock_hbw.return_value = target_value + hooked_outputs = hook_primitive_outputs(out, captured_grads_output, updated_primitive_name) + self.assertEqual(mock_HookBackward.call_count, len(out)) + for hooked_output in hooked_outputs: + self.assertTrue((hooked_output == target_value).all()) def test_wrapped_primitive_call_args(self): # Mock the forward inputs @@ -325,19 +230,18 @@ class TestPrimitiveHookService(unittest.TestCase): # Call wrapped_primitive_call and check that hooked_inputs match the original args try: - hooked_inputs = wrapped_primitive_call.__closure__[0].cell_contents(args, captured_grads_input, - updated_primitive_name) - for arg, hooked_arg in zip(args, hooked_inputs): - if isinstance(arg, Tensor): - self.assertTrue(hasattr(hooked_arg, 'grad_fn')) -
self.assertTrue(np.array_equal(arg.asnumpy(), hooked_arg.asnumpy())) - print(f"Arg type: {type(arg)}, Hooked input type: {type(hooked_arg)}") - else: - self.assertEqual(arg, hooked_arg) + with patch.object(ops, 'HookBackward') as mock_HookBackward: + target_value = Tensor([1.0]) + mock_hbw = mock_HookBackward.return_value + mock_hbw.return_value = target_value + hooked_inputs = wrapped_primitive_call.__closure__[0].cell_contents(args, captured_grads_input, + updated_primitive_name) + self.assertEqual(mock_HookBackward.call_count, len(args)) + for hooked_input in hooked_inputs: + self.assertTrue((hooked_input == target_value).all()) except Exception as e: self.fail(f"wrapped_primitive_call raised an exception: {e}") - def test_update_primitive_counters_multiple(self): # Test updating the primitive counters, covering several different names primitive_names = ["MatMul", "Conv2D", "ReLU", "Softmax"] @@ -416,13 +320,11 @@ class TestPrimitiveHookService(unittest.TestCase): for captured_grads in captured_grads_sets: updated_primitive_name = "MatMul.Backward" - num_tensors = len(captured_grads) hook = self.primitive_hook_service.wrap_primitive(Mock(), "MatMul") backward_hook = hook(Mock(), captured_grads, updated_primitive_name, Const.INPUT) self.assertIsNotNone(backward_hook) - @patch('msprobe.mindspore.dump.hook_cell.primitive_hooks.ops.HookBackward') def test_wrap_primitive_forward_and_backward_hooks(self, mock_hook_backward): # Mock the behavior of forward and backward hooks within the same primitive @@ -447,9 +349,6 @@ class TestPrimitiveHookService(unittest.TestCase): self.primitive_hook_service.update_primitive_counters(name) self.assertEqual(self.primitive_hook_service.primitive_counters[name], i) - - - def test_update_primitive_counters(self): primitive_name = "MatMul" self.primitive_hook_service.update_primitive_counters(primitive_name) @@ -496,7 +395,7 @@ class TestPrimitiveHookService(unittest.TestCase): wrapped_func = self.primitive_hook_service.wrap_primitive(mock_origin_func, "MatMul") # Simulate backpropagation by calling the wrapped primitive - with
patch.object(self.mock_service_instance.data_collector, 'backward_data_collect') as mock_backward_collect: + with patch.object(self.mock_service_instance.data_collector, 'backward_data_collect'): result = wrapped_func(Mock(), input_tensor) # Verify that the result is a Tensor instance @@ -544,7 +443,6 @@ class TestPrimitiveHookService(unittest.TestCase): # Test create_backward_hook captured_grads = [] updated_primitive_name = "MatMul.Backward" - num_tensors = 2 # Create the backward hook backward_hook = self.primitive_hook_service.wrap_primitive(Mock(), "MatMul") diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/common/test_config.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/common/test_config.py index 30fa11d94de0dd4fec483502a51d0474e8b7646a..df03485dc6c77371750fd0b67ca2c37ff7e2ed7b 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/common/test_config.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/common/test_config.py @@ -16,8 +16,6 @@ class TestUtConfig(): self.port = 8080 self.rank_list = [0, 1, 2] self.tls_path = '/path/to/tls' - self.master_ip = '127.0.0.1' - self.master_port = 8888 class TestConfig(unittest.TestCase): diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/compare/test_algorithm.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/compare/test_algorithm.py index f1cc0d31363c326c3412824f4a5a176b70da1a90..377a29f2237e2b3172e6fc35a712ff36cc69972d 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/compare/test_algorithm.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/compare/test_algorithm.py @@ -208,45 +208,3 @@ class TestAlgorithmMethods(unittest.TestCase): ulp_err = alg.calc_ulp_err(self.bench_data, self.device_data, eb, exponent_num, data_type) expected_ulp_err = (self.device_data.astype(data_type) - self.bench_data).astype(data_type) * np.exp2(-eb + exponent_num)
self.assertTrue(np.allclose(ulp_err, expected_ulp_err)) - - -class TestKahanLossRange(unittest.TestCase): - - def setUp(self): - self.cumsum = torch.tensor( - [[1000, 30], [1, 20], [10, 10]], dtype=torch.bfloat16) - self.addend = torch.tensor([[3, 0.2]], dtype=torch.bfloat16) - self.tensors = [ - torch.tensor([1000], dtype=torch.bfloat16), - torch.tensor([1004], dtype=torch.bfloat16), - torch.tensor([103], dtype=torch.bfloat16), - torch.tensor([4], dtype=torch.bfloat16)] - - def test_kahan_loss_positive(self): - # Test maximizing the positive loss that needs compensation: loss_res is the maximum of the historical losses, and mask hides the entries less than 0 - loss_res, mask = alg.maximize_kahan_loss(self.cumsum, self.addend, negative=False) - expected_loss = torch.tensor([1, 0.0498], dtype=torch.bfloat16) - expected_mask = expected_loss >= 0 - self.assertTrue(torch.allclose(loss_res, expected_loss)) - self.assertTrue(torch.allclose(mask, expected_mask)) - - def test_kahan_loss_negative(self): - # Test maximizing the negative loss that needs compensation: loss_res is the minimum of the historical losses, and mask hides the entries greater than 0 - loss_res, mask = alg.maximize_kahan_loss(self.cumsum, self.addend, negative=True) - expected_loss = torch.tensor([0, -0.0127], dtype=torch.bfloat16) - expected_mask = expected_loss <= 0 - self.assertTrue(torch.allclose(loss_res, expected_loss)) - self.assertTrue(torch.allclose(mask, expected_mask)) - - def test_kahan_range_empty_list(self): - # Test the case where the input is an empty list - with self.assertRaises(ValueError): - alg.kahan_range([]) - - def test_kahan_range_min_max(self): - max_ = alg.kahan_range(self.tensors, negative=True) - min_ = alg.kahan_range(self.tensors, negative=False) - expected_min = torch.tensor(2096, dtype=torch.bfloat16) - expected_max = torch.tensor(2112, dtype=torch.bfloat16) - self.assertTrue(torch.allclose(min_, expected_min)) - self.assertTrue(torch.allclose(max_, expected_max)) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_data_generate.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_data_generate.py index
0a88476d600958b26eaf6ca20a9a70d35b4221cc..952a6dffbc85eea9dd2db87fa081bdf4bb3cae2a 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_data_generate.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_data_generate.py @@ -322,7 +322,7 @@ class TestDataGenerateMethods(unittest.TestCase): low_info = [1, float('-inf')] high_info = [2, float('-inf')] tensor = gen_common_tensor(low_info, high_info, shape, data_dtype, None) - self.assertTrue(torch.allclose(tensor.max(), torch.tensor(2.0), atol = 0.3)) + self.assertTrue(torch.allclose(tensor.max(), torch.tensor(2.0), atol = 0.5)) self.assertTrue(tensor.min() == float('-inf')) low_info = [1, float('nan')] diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_distributed_bench_function.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_distributed_bench_function.py deleted file mode 100644 index 0b21a9559e90acec80c9cb4726d8ce039ddb6a71..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_distributed_bench_function.py +++ /dev/null @@ -1,29 +0,0 @@ -import torch -import unittest - -from msprobe.pytorch.api_accuracy_checker.run_ut.distributed_bench_function import sort_all_input - -class TestSortAllInput(unittest.TestCase): - def setUp(self): - self.inputs = [ - torch.tensor([3.0, 2.0, 1.0]), - torch.tensor([6.0, 5.0, 4.0]), - torch.tensor([9.0, 8.0, 7.0]) - ] - - def test_normal_case(self): - # Test the normal case - sorted_inputs = sort_all_input(self.inputs) - expected_sorted_inputs = [ - torch.tensor([9.0, 8.0, 7.0]), - torch.tensor([6.0, 5.0, 4.0]), - torch.tensor([3.0, 2.0, 1.0]) - ] - for result, expected in zip(sorted_inputs, expected_sorted_inputs): - self.assertTrue(torch.equal(result, expected)) - - def test_single_tensor(self): - # Test the case with a single tensor - single_input = [torch.tensor([2.0])] - sorted_inputs =
sort_all_input(single_input) - self.assertTrue(torch.equal(sorted_inputs[0], single_input[0])) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_run_ut_utils.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_run_ut_utils.py index 751d3f6affd10c82f9aeee941bed8cf5453daad8..8cead7b0093ce68dca8a12b0ea6dbcde78a70c0b 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_run_ut_utils.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/api_accuracy_checker/run_ut/test_run_ut_utils.py @@ -22,7 +22,6 @@ from msprobe.core.common.file_utils import create_directory, write_csv class TestRunUtUtils(unittest.TestCase): - def setUp(self): save_path = "temp_save_path" create_directory(save_path) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_match.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_match.py deleted file mode 100644 index ac28e994e9c8e77f8ae675fec3322eaf64a64321..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_match.py +++ /dev/null @@ -1,20 +0,0 @@ -# coding=utf-8 -import unittest -from msprobe.pytorch.compare import match - - -class TestMatch(unittest.TestCase): - def test_graph_mapping(self): - op1 = "Aten_convolution_1_forward_0.input.0" - op2 = "Torch_conv2d_0_forward_0.input.0" - op3 = "Torch_batch_norm_0_forward_0.input.0" - op4 = "Aten_convolution.default_1_forward_0.input.0" - op5 = "Aten_foo_1_forward_0.input.0" - self.assertTrue(match.graph_mapping.match(op1, op2)) - self.assertTrue(match.graph_mapping.match(op2, op1)) - self.assertTrue(match.graph_mapping.match(op4, op2)) - self.assertTrue(match.graph_mapping.match(op2, op4)) - self.assertFalse(match.graph_mapping.match(op1, op3)) - self.assertFalse(match.graph_mapping.match(op3, op1)) - self.assertFalse(match.graph_mapping.match(op5, op2)) - self.assertFalse(match.graph_mapping.match(op2, op5)) diff 
--git a/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare.py index b079e646c4a8f4098bb233e3e6259ef3ebea9c94..e4c8b722b182b8c0a4e82ba1b0eeb1a6ed847ee2 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare.py @@ -3,16 +3,12 @@ import os import shutil import unittest -import numpy as np import torch -from msprobe.core.common.const import Const from msprobe.core.common.utils import CompareException -from msprobe.core.compare.acc_compare import ModeConfig -from msprobe.pytorch.compare.pt_compare import PTComparator, compare +from msprobe.pytorch.compare.pt_compare import compare from msprobe.test.core_ut.compare.test_acc_compare import generate_dump_json, generate_stack_json - base_dir1 = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_pt_compare1') base_dir2 = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_pt_compare2') @@ -40,36 +36,6 @@ class TestUtilsMethods(unittest.TestCase): if os.path.exists(base_dir2): shutil.rmtree(base_dir2) - def test_read_npy_data_bf16(self): - generate_bf16_pt(base_dir1) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - pt_comparator = PTComparator(mode_config) - result = pt_comparator.read_npy_data(base_dir1, 'bf16.pt') - - target_result = torch.tensor([1, 2, 3, 4], dtype=torch.float32).numpy() - self.assertTrue(np.array_equal(result, target_result)) - - def test_read_npy_data_dict(self): - generate_dict_pt(base_dir1) - - stack_mode = True - auto_analyze = True - fuzzy_match = False - dump_mode = Const.ALL - mode_config = ModeConfig(stack_mode, auto_analyze, fuzzy_match, dump_mode) - - pt_comparator = PTComparator(mode_config) - - with self.assertRaises(CompareException) as context: - result 
= pt_comparator.read_npy_data(base_dir1, 'dict.pt') - self.assertEqual(context.exception.code, CompareException.DETACH_ERROR) - def test_compare(self): generate_dump_json(base_dir2) generate_stack_json(base_dir2) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare_utils.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..d967c5e314cc872da7972a37ac72fa0ffdb04a23 --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/compare/test_pt_compare_utils.py @@ -0,0 +1,34 @@ +import os +import shutil +import threading +import unittest + +import numpy as np + +from msprobe.pytorch.compare.utils import read_pt_data +from msprobe.test.core_ut.compare.test_acc_compare import generate_pt + + +base_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), f'test_pt_compare_utils_data') +pt_dir = os.path.join(base_dir, f'dump_data_dir') + + +class TestReadPtData(unittest.TestCase): + + def setUp(self): + os.makedirs(base_dir, mode=0o750, exist_ok=True) + os.makedirs(pt_dir, mode=0o750, exist_ok=True) + + self.lock = threading.Lock() + + def tearDown(self): + if os.path.exists(pt_dir): + shutil.rmtree(pt_dir) + if os.path.exists(base_dir): + shutil.rmtree(base_dir) + + def test_read_pt_data(self): + generate_pt(pt_dir) + result = read_pt_data(pt_dir, 'Functional.linear.0.forward.input.0.pt') + expected = np.array([1.0, 2.0, 3.0, 4.0]) + self.assertTrue(np.array_equal(result, expected)) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger/test_pt_debugger_config.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger/test_pt_debugger_config.py index 4fc27c267ebe65ea46ecf0f17bc47ff702eb241d..09821abc636462e21d30675243ad0afc1a4fc86f 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger/test_pt_debugger_config.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger/test_pt_debugger_config.py @@ -1,6 +1,7 @@ 
import unittest -from unittest.mock import MagicMock +from unittest.mock import MagicMock, patch +import torch from msprobe.core.common.const import Const from msprobe.core.common.exceptions import MsprobeException from msprobe.pytorch.debugger.debugger_config import DebuggerConfig @@ -46,28 +47,95 @@ class TestDebuggerConfig(unittest.TestCase): self.assertEqual(debugger.nfs_path, "./nfs_path") self.assertEqual(debugger.port, 8080) - def test_valid_task_and_level(self): - config = DebuggerConfig(self.common_config, self.task_config, "tensor", None, "L1") - config.check_kwargs() + def test_check_kwargs_with_invalid_task(self): + self.common_config.task = "invalid_task" + with self.assertRaises(MsprobeException) as context: + DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertIn(f"The task is not in the {Const.TASK_LIST}", str(context.exception)) - def test_invalid_task(self): + def test_check_kwargs_with_invalid_level(self): + self.common_config.level = "invalid_level" with self.assertRaises(MsprobeException) as context: - config = DebuggerConfig(self.common_config, self.task_config, "invalid_task", None, "L1") - config.check_kwargs() - self.assertIn("not in the", str(context.exception)) + DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertIn(f"The level is not in the {Const.LEVEL_LIST}.", str(context.exception)) - def test_invalid_level(self): + def test_check_kwargs_with_invalid_dump_path(self): + self.common_config.dump_path = None with self.assertRaises(MsprobeException) as context: - config = DebuggerConfig(self.common_config, self.task_config, "tensor", None, "invalid_level") - config.check_kwargs() - self.assertIn("not in the", str(context.exception)) + DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertIn(f"The dump_path not found.", str(context.exception)) - def test_missing_dump_path(self): + def test_check_kwargs_with_invalid_async_dump(self): + 
self.common_config.async_dump = 1 with self.assertRaises(MsprobeException) as context: - self.common_config.dump_path = None - config = DebuggerConfig(self.common_config, self.task_config, "tensor", None, "L1") - config.check_kwargs() - self.assertIn("dump_path not found", str(context.exception)) + DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertIn(f"The parameters async_dump should be bool.", str(context.exception)) + + def test_check_kwargs_with_async_dump_and_debug(self): + self.common_config.async_dump = True + self.common_config.task = Const.TENSOR + self.common_config.level = Const.LEVEL_DEBUG + self.task_config.list = ["linear"] + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertEqual(config.list, []) + + def test_check_kwargs_with_async_dump_and_not_debug(self): + self.common_config.async_dump = True + self.common_config.task = Const.TENSOR + self.common_config.level = Const.LEVEL_MIX + self.task_config.list = [] + with self.assertRaises(MsprobeException) as context: + DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertIn(f"the parameters list cannot be empty.", str(context.exception)) + + def test_check_kwargs_with_structure_task(self): + self.common_config.task = Const.STRUCTURE + self.common_config.level = Const.LEVEL_L1 + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.assertEqual(config.level, Const.LEVEL_MIX) + + @patch('msprobe.pytorch.debugger.debugger_config.logger') + def test_check_model_with_l1(self, mock_logger): + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + instance = MagicMock() + instance.model = MagicMock() + config.check_model(instance, None) + mock_logger.info_on_rank_0.assert_called_once_with( + "The current level is not L0 or mix level, so the model parameters will not be used." 
+ ) + + def test_check_model_with_model_is_none(self): + self.common_config.level = Const.LEVEL_L0 + instance = MagicMock() + instance.model = None + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + with self.assertRaises(MsprobeException) as context: + config.check_model(instance, None) + self.assertIn("missing the parameter 'model'", str(context.exception)) + + def test_check_model_with_single_model(self): + self.common_config.level = Const.LEVEL_MIX + model1 = torch.nn.ReLU() + model2 = torch.nn.Linear(2, 2) + + instance = MagicMock() + instance.model = model1 + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + config.check_model(instance, model2) + + self.assertEqual(instance.model, model2) + + def test_check_model_with_incorrect_model(self): + self.common_config.level = Const.LEVEL_L0 + model1 = torch.nn.ReLU() + model2 = [torch.nn.Linear(2, 2), torch.nn.ReLU(), "test_model"] + + instance = MagicMock() + instance.model = model1 + config = DebuggerConfig(self.common_config, self.task_config, None, None, None) + with self.assertRaises(MsprobeException) as context: + config.check_model(instance, model2) + self.assertIn("must be a torch.nn.Module or list[torch.nn.Module]", str(context.exception)) def test_check_and_adjust_config_with_l2_scope_not_empty(self): self.common_config.dump_path = "./dump_path" @@ -100,3 +168,50 @@ class TestDebuggerConfig(unittest.TestCase): debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) debugger._check_and_adjust_config_with_l2() self.assertIn("Functional.conv2d.0.forward", self.task_config.list) + + def test_check_and_adjust_config_with_l2_task_not_tensor(self): + self.common_config.dump_path = "./dump_path" + self.common_config.task = Const.STATISTICS + + self.task_config.scope = [] + self.task_config.list = ["Functional.conv2d.0.forward"] + debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) + with 
self.assertRaises(MsprobeException) as context: + debugger._check_and_adjust_config_with_l2() + self.assertIn("the task must be set to tensor", str(context.exception)) + + def test_check_statistics_config_task_not_statistics(self): + self.common_config.dump_path = "./dump_path" + self.common_config.task = Const.TENSOR + + debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) + debugger._check_statistics_config(self.task_config) + self.assertFalse(hasattr(debugger, "tensor_list")) + + def test_check_statistics_config_not_tensor_list(self): + self.common_config.dump_path = "./dump_path" + self.common_config.task = Const.STATISTICS + delattr(self.task_config, "tensor_list") + + debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) + debugger._check_statistics_config(self.task_config) + self.assertEqual(debugger.tensor_list, []) + + def test_check_statistics_config_debug_level(self): + self.common_config.dump_path = "./dump_path" + self.common_config.task = Const.STATISTICS + self.common_config.level = Const.DEBUG + + debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) + self.task_config.tensor_list = ["Functional.conv2d"] + debugger._check_statistics_config(self.task_config) + self.assertEqual(debugger.tensor_list, []) + + def test_check_statistics_config_success(self): + self.common_config.dump_path = "./dump_path" + self.common_config.task = Const.STATISTICS + + self.task_config.tensor_list = ["Functional.conv2d"] + debugger = DebuggerConfig(self.common_config, self.task_config, None, None, None) + debugger._check_statistics_config(self.task_config) + self.assertEqual(debugger.tensor_list, self.task_config.tensor_list) diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger_save/test_debugger_save_pytorch.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger_save/test_debugger_save_pytorch.py new file mode 100644 index 
0000000000000000000000000000000000000000..3a3d1dd2362146f56d4f5bcc53e4792689df3e90 --- /dev/null +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/debugger_save/test_debugger_save_pytorch.py @@ -0,0 +1,449 @@ +import unittest +import os +import json +import torch +import numpy as np +import shutil + +from msprobe.pytorch import PrecisionDebugger + +current_file = __file__ +parent_dir = os.path.abspath(os.path.dirname(current_file)) +test_dir = os.path.join(parent_dir, "test_dir") + +def deep_compare(obj1, obj2, float_tolerance=1e-5): + """ + Recursively compare two objects to check if they are the same. + Supports nested dictionaries and lists. + """ + if type(obj1) != type(obj2): + return False + if isinstance(obj1, dict): + if obj1.keys() != obj2.keys(): + return False + return all(deep_compare(obj1[key], obj2[key]) for key in obj1) + if isinstance(obj1, (tuple, list)): + if len(obj1) != len(obj2): + return False + return all(deep_compare(item1, item2) for item1, item2 in zip(obj1, obj2)) + if isinstance(obj1, (int, float)): + return abs(obj1 - obj2) < float_tolerance + return obj1 == obj2 + +class TestDebuggerSave(unittest.TestCase): + @staticmethod + def write_config_json(step, async_dump, mode, dump_path, config_file_path): + task = "tensor" if mode == "tensor" else "statistics" + statistics_summary_mode = "statistics" if mode == "statistics" else "md5" + config = { + "task": task, + "dump_path": dump_path, + "rank": [], + "step": step, + "level": "debug", + "enable_dataloader": False, + "async_dump": async_dump, + "statistics": { + "summary_mode": statistics_summary_mode, + } + } + with open(config_file_path, "w", encoding="utf-8") as f: + json.dump(config, f, indent=4, ensure_ascii=False) + + @staticmethod + def read_debug_json_into_dict(debug_json_path): + with open(debug_json_path, "r", encoding="utf-8") as f: + debug_json = json.load(f) + return debug_json + + + @staticmethod + def check_real_pt(pt_path, target_pt_tensor, check_values=True, rtol=1e-5, 
atol=1e-8): + """ + Enhanced version with optional value comparison. + + Args: + pt_path (str): Path to the .pt file + target_pt_tensor: Target torch tensor to compare + check_values (bool): If True, also compare array values + rtol, atol: Relative and absolute tolerances for value comparison + + Returns: + bool: True if all checks pass + """ + # Load the pt file + try: + pt_data = torch.load(pt_path) + except FileNotFoundError: + print(f"Error: The file {pt_path} does not exist.") + return False + except Exception as e: + print(f"Error loading pt file: {e}") + return False + # Check shapes + if pt_data.shape != target_pt_tensor.shape: + print(f"Shape mismatch: pt data shape is {pt_data.shape}, target tensor shape is {target_pt_tensor.shape}") + return False + # Check dtypes + if pt_data.dtype != target_pt_tensor.dtype: + print(f"Dtype mismatch: pt data dtype is {pt_data.dtype}, target tensor dtype is {target_pt_tensor.dtype}") + return False + # Optionally check values + if check_values: + if not torch.allclose(pt_data, target_pt_tensor, rtol=rtol, atol=atol): + print("Value mismatch: pt data and target tensor values do not match within the specified tolerances.") + return False + return True + + def setUp(self): + if not os.path.exists(test_dir): + os.makedirs(test_dir) + PrecisionDebugger._instance = None + + def tearDown(self): + if os.path.exists(test_dir): + shutil.rmtree(test_dir) + PrecisionDebugger._instance = None + + def test_save_real_tensor(self): + data = {"a": torch.Tensor([1., 2.])} + step = [] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check pt file + pt_path = os.path.join(dump_path, "step0", "rank", "dump_tensor_data", 
"data_dict.0.debug.a.pt") + assert self.check_real_pt(pt_path, data["a"]) + # check debug json + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + "data_name": "data_dict.0.debug.a.pt" + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + def test_save_md5(self): + data = {"a": torch.Tensor([1., 2.])} + step = [] + async_dump = False + mode = "md5" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check debug json + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + "md5": "2e3fa576" + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + def test_save_multiple_steps(self): + data = {"a": torch.Tensor([1., 2.])} + step = [0, 1, 2] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + for _ in step: + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check 
pt file + for i in step: + pt_path = os.path.join(dump_path, f"step{i}", "rank", "dump_tensor_data", "data_dict.0.debug.a.pt") + assert self.check_real_pt(pt_path, data["a"]) + # check debug json + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + "data_name": "data_dict.0.debug.a.pt" + } + } + for i in step: + debug_json_path = os.path.join(dump_path, f"step{i}", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + def test_async_save_tensor(self): + data = {"a": torch.Tensor([1., 2.])} + step = [] + async_dump = True + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + + # check pt file + pt_path = os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "data_dict.0.debug.a.pt") + assert self.check_real_pt(pt_path, data["a"]) + + # check debug json + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "data_name": "data_dict.0.debug.a.pt", + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + def test_async_save_md5(self): + # In the async_dump case the md5 configuration does not take effect; only statistics are saved + data = {"a": torch.Tensor([1., 2.])} + step = [] + 
async_dump = True + mode = "md5" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + # check debug json + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + } + } + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert deep_compare(debug_json_dict["data"]["data_dict.0.debug"], target_debug_info) + + def test_save_multiple_times(self): + data = {"a": torch.Tensor([1., 2.])} + step = [] + call_times = 3 + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + for _ in range(call_times): + PrecisionDebugger.save(data, "data_dict", save_backward=False) + PrecisionDebugger.step() + + # check pt file + for i in range(call_times): + pt_path = os.path.join(dump_path, "step0", "rank", "dump_tensor_data", f"data_dict.{i}.debug.a.pt") + assert self.check_real_pt(pt_path, data["a"]) + + # check debug json + for i in range(call_times): + target_debug_info = { + "a": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": False, + "data_name": f"data_dict.{i}.debug.a.pt" + } + } + + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + assert 
deep_compare(debug_json_dict["data"][f"data_dict.{i}.debug"], target_debug_info) + + def test_save_backward(self): + x = torch.Tensor([1., 2.]) + target_x_grad = torch.Tensor([1., 1.]) + def _forward_simple_func(x): + PrecisionDebugger.save(x, "x_tensor") + return x.sum() + step = [] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + x.requires_grad = True + loss = _forward_simple_func(x) + loss.backward() + PrecisionDebugger.step() + x_info_list = [ + x, + os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "x_tensor.0.debug.pt"), + "x_tensor.0.debug", + { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": True, + "data_name": "x_tensor.0.debug.pt" + }, + ] + x_grad_info_list = [ + target_x_grad, + os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "x_tensor_grad.0.debug.pt"), + "x_tensor_grad.0.debug", + { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 1.0, + "Min": 1.0, + "Mean": 1.0, + "Norm": 1.4142135381698608, + "requires_grad": False, + "data_name": "x_tensor_grad.0.debug.pt" + }, + ] + check_list = [x_info_list, x_grad_info_list] + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + for check_info in check_list: + target_tensor, target_tensor_path, target_tensor_key, target_tensor_info = check_info + assert self.check_real_pt(target_tensor_path, target_tensor) + assert deep_compare(debug_json_dict["data"][target_tensor_key], target_tensor_info) + + def test_save_compilcated_data_structure_backward(self): + x = torch.Tensor([1., 2.]) + target_x_grad = torch.Tensor([1., 1.]) + 
def _forward_complicated_func(x): + complicated_structure = [{"a_key": x}] + PrecisionDebugger.save(complicated_structure, "complicated_structure") + return complicated_structure[0]["a_key"].sum() + step = [] + async_dump = False + mode = "tensor" + dump_path = os.path.join(test_dir, "debug_save") + config_file_path = os.path.join(test_dir, "config.json") + self.write_config_json(step, async_dump, mode, dump_path, config_file_path) + debugger = PrecisionDebugger(config_file_path) + x.requires_grad = True + loss = _forward_complicated_func(x) + loss.backward() + PrecisionDebugger.step() + complicated_structure_info_list = [ + x, + os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "complicated_structure.0.debug.0.a_key.pt"), + "complicated_structure.0.debug", + [ + { + "a_key": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 2.0, + "Min": 1.0, + "Mean": 1.5, + "Norm": 2.2360680103302, + "requires_grad": True, + "data_name": "complicated_structure.0.debug.0.a_key.pt" + } + } + ], + ] + complicated_structure_grad_info_list = [ + target_x_grad, + os.path.join(dump_path, "step0", "rank", "dump_tensor_data", "complicated_structure_grad.0.debug.0.a_key.pt"), + "complicated_structure_grad.0.debug", + [ + { + "a_key": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 2 + ], + "Max": 1.0, + "Min": 1.0, + "Mean": 1.0, + "Norm": 1.4142135381698608, + "requires_grad": False, + "data_name": "complicated_structure_grad.0.debug.0.a_key.pt" + } + } + ], + ] + check_list = [complicated_structure_info_list, complicated_structure_grad_info_list] + debug_json_path = os.path.join(dump_path, "step0", "rank", "debug.json") + debug_json_dict = self.read_debug_json_into_dict(debug_json_path) + for check_info in check_list: + target_tensor, target_tensor_path, target_tensor_key, target_tensor_info = check_info + assert self.check_real_pt(target_tensor_path, target_tensor) + assert 
deep_compare(debug_json_dict["data"][target_tensor_key], target_tensor_info) \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_dump.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_dump.py index 5aaf0820a78339ff4f1cc5d28aff8762bae31a39..4ba3556c277f3326520547a6124170f32a9cc8e8 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_dump.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_dump.py @@ -16,47 +16,68 @@ import unittest from unittest.mock import patch, MagicMock -import torch -import torch.nn as nn +from torch import nn -from msprobe.core.data_dump.api_registry import ApiRegistry -from msprobe.pytorch import PrecisionDebugger -from msprobe.pytorch.hook_module.api_register import get_api_register -from msprobe.pytorch.service import torch_version_above_or_equal_2 +from msprobe.pytorch.common.log import logger +from msprobe.pytorch.dump.module_dump.module_dump import ModuleDumper +from msprobe.pytorch.dump.module_dump.module_processer import ModuleProcesser class TestModuleDumper(unittest.TestCase): - @classmethod - def setUpClass(cls): - PrecisionDebugger._instance = None - get_api_register().restore_all_api() + def setUp(self): + self.service = MagicMock() + with patch('msprobe.pytorch.dump.module_dump.module_dump.get_api_register'): + self.module_dumper = ModuleDumper(self.service) - @classmethod - def tearDownClass(cls): - PrecisionDebugger._instance = None - get_api_register().restore_all_api() + def test__init__(self): + self.service = MagicMock() + with patch('msprobe.pytorch.dump.module_dump.module_dump.get_api_register') as mock_get_api_register: + self.module_dumper = ModuleDumper(self.service) + self.assertEqual(self.module_dumper.service, self.service) + mock_get_api_register.assert_called_once() - def setUp(self): - self.module = nn.Linear(8, 4) - debugger = PrecisionDebugger(dump_path="./") - self.module_dumper = debugger.module_dumper 
+ def test_start_module_dump(self): + module = nn.Module() + with patch.object(logger, 'info_on_rank_0') as mock_info: + module.msprobe_hook = True + ModuleProcesser.enable_module_dump = False + self.module_dumper.api_register.restore_all_api.reset_mock() + self.module_dumper.start_module_dump(module, 'dump_name') + mock_info.assert_called_with('The init dump is enabled, and the module dump function will not be available.') + self.assertFalse(ModuleProcesser.enable_module_dump) + self.module_dumper.api_register.restore_all_api.assert_not_called() + self.assertFalse(hasattr(module, 'msprobe_module_dump')) + + del module.msprobe_hook + mock_info.reset_mock() + self.module_dumper.start_module_dump(module, 'dump_name') + mock_info.assert_not_called() + self.assertTrue(ModuleProcesser.enable_module_dump) + self.module_dumper.api_register.restore_all_api.assert_called_once() + self.module_dumper.service.module_processor.register_module_hook.assert_called_with( + module, + self.module_dumper.service.build_hook, + recursive=False, + module_names=['dump_name'] + ) + self.assertTrue(module.msprobe_module_dump) + ModuleProcesser.enable_module_dump = False + + self.module_dumper.api_register.restore_all_api.reset_mock() + self.module_dumper.service.module_processor.register_module_hook.reset_mock() + self.module_dumper.start_module_dump(module, 'dump_name') + mock_info.assert_not_called() + self.assertTrue(ModuleProcesser.enable_module_dump) + self.module_dumper.api_register.restore_all_api.assert_called_once() + self.module_dumper.service.module_processor.register_module_hook.assert_not_called() + + ModuleProcesser.enable_module_dump = False def test_stop_module_dump(self): - self.module_dumper.hook_handle_list.extend([1, 2, 3]) - with patch.object(ApiRegistry, 'register_all_api') as mock_api_register: - mock_handle1 = MagicMock(spec=torch.utils.hooks.RemovableHandle) - mock_handle2 = MagicMock(spec=torch.utils.hooks.RemovableHandle) - 
self.module_dumper.hook_handle_list.extend([mock_handle1, mock_handle2]) - - self.module_dumper.stop_module_dump() - mock_handle1.remove.assert_called_once() - mock_handle2.remove.assert_called_once() - self.assertEqual(self.module_dumper.hook_handle_list, []) - mock_api_register.assert_called_once() - - def test_register_hook(self): - self.module_dumper.register_hook(self.module, "TestModule") - if torch_version_above_or_equal_2: - self.assertEqual(len(self.module_dumper.hook_handle_list), 6) - else: - self.assertEqual(len(self.module_dumper.hook_handle_list), 5) + ModuleProcesser.enable_module_dump = True + self.module_dumper.api_register.register_all_api.reset_mock() + self.module_dumper.stop_module_dump() + self.assertFalse(ModuleProcesser.enable_module_dump) + self.module_dumper.api_register.register_all_api.assert_called_once() + + self.module_dumper.api_register.register_all_api.reset_mock() diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_processer.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_processer.py index 20cfdfa6ba399d274ca67effb6f93a7c3762edce..832f63f8fd99b53d8d1909bee45e7a5634c6ca92 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_processer.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/dump/test_module_processer.py @@ -1,10 +1,24 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ import unittest from unittest.mock import MagicMock import torch from msprobe.core.data_dump.scope import ModuleRangeScope -from msprobe.pytorch.common.utils import Const from msprobe.pytorch.dump.module_dump.module_processer import ModuleProcesser @@ -25,58 +39,12 @@ class TestModuleProcesser(unittest.TestCase): processor = ModuleProcesser(scope) self.assertIsNone(processor.scope) - def test_module_count_func(self): + def test_set_and_get_calls_number(self): + ModuleProcesser.reset_module_stats() test = ModuleProcesser(None) self.assertEqual(test.module_count, {}) module_name = "nope" - test.module_count_func(module_name) + test.set_and_get_calls_number(module_name) self.assertEqual(test.module_count["nope"], 0) - def test_node_hook_forward_start(self): - name_prefix = "forward_layer" - hook = self.processor.node_hook(name_prefix, start_or_stop=Const.START) - module = MagicMock() - input = (self.mock_tensor,) - module.mindstudio_reserved_name = None - hook(module, input) - expected_name = f"forward_layer{Const.SEP}0" - self.assertEqual(module.mindstudio_reserved_name, [expected_name]) - self.assertIn(expected_name, ModuleProcesser.module_stack) - self.assertEqual(ModuleProcesser.api_parent_node, expected_name) - - def test_node_hook_forward_stop(self): - name_prefix = "forward_layer" - hook = self.processor.node_hook(name_prefix, start_or_stop=Const.STOP) - ModuleProcesser.module_stack.append(f"forward_layer{Const.SEP}0") - - module = MagicMock() - input = (self.mock_tensor,) - reserved_name = f"forward_layer{Const.SEP}0" - module.mindstudio_reserved_name = [reserved_name] - hook(module, input) - self.assertNotIn([f"forward_layer{Const.SEP}0"], ModuleProcesser.module_stack) - self.assertEqual(ModuleProcesser.api_parent_node, reserved_name) - - def test_node_hook_backward(self): - name_prefix = "backward_layer" - hook = self.processor.node_hook(name_prefix, start_or_stop=Const.START) - - module = MagicMock() - input = (self.mock_tensor,) - 
module.mindstudio_reserved_name = None - ModuleProcesser.module_node[f"forward_layer{Const.SEP}0"] = None - hook(module, input) - expected_name = f"backward_layer{Const.SEP}0" - self.assertEqual(module.mindstudio_reserved_name, [expected_name]) - self.assertIn(expected_name, ModuleProcesser.module_node) - - def test_has_register_backward_hook(self): - module = MagicMock() - module._backward_hooks = {0: lambda: None} - module._is_full_backward_hook = False - result = self.processor.has_register_backward_hook(module) - self.assertTrue(result) - - module._is_full_backward_hook = True - result = self.processor.has_register_backward_hook(module) - self.assertFalse(result) + ModuleProcesser.reset_module_stats() diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_hook_module.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_hook_module.py index 1524a82ae1fc81eee245fa73bde4b4938cb89638..d907b81af97aeafdd5e35de2bec0fecd97399835 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_hook_module.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_hook_module.py @@ -1,12 +1,29 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ import unittest from unittest.mock import MagicMock, patch import threading + from msprobe.pytorch.hook_module.hook_module import HOOKModule + class TestHOOKModuleInit(unittest.TestCase): def setUp(self): - self.mock_build_hook = MagicMock(return_value=(MagicMock(), MagicMock(), MagicMock(), None)) + self.mock_build_hook = MagicMock(return_value=(MagicMock(), MagicMock(), MagicMock())) def test_thread_handling(self): module = HOOKModule(self.mock_build_hook) @@ -16,7 +33,7 @@ class TestHOOKModuleInit(unittest.TestCase): class TestHOOKModuleCall(unittest.TestCase): def setUp(self): - self.mock_build_hook = MagicMock(return_value=(MagicMock(), MagicMock(), MagicMock(), None)) + self.mock_build_hook = MagicMock(return_value=(MagicMock(), MagicMock(), MagicMock())) self.module = HOOKModule(self.mock_build_hook) @patch.object(HOOKModule, '_call_func') diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_wrap_aten.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_wrap_aten.py index af669cb5c73de85e51f36f62f9e7dc61bb599ca1..e565c1cc08d496bd96cc1e873f50e4c02e5c69a8 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_wrap_aten.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/hook_module/test_wrap_aten.py @@ -1,15 +1,34 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ import unittest from unittest.mock import MagicMock, patch import torch from msprobe.pytorch.function_factory import npu_custom_grad_functions -from msprobe.pytorch.hook_module.wrap_aten import AtenOPTemplate, white_aten_ops, \ +from msprobe.pytorch.hook_module.wrap_aten import ( + AtenOPTemplate, + white_aten_ops, AtenOPPacketTemplate +) def mock_build_hook(prefix): - return (MagicMock(), MagicMock(), MagicMock(), MagicMock()) + return (MagicMock(), MagicMock(), MagicMock()) + class TestAtenOPTemplate(unittest.TestCase): @@ -79,8 +98,8 @@ class TestAtenOPPacketTemplate(unittest.TestCase): del self.mock_op_packet.nonexistent_attr with self.assertRaises(AttributeError) as context: _ = self.template.nonexistent_attr - self.assertIn("or OpOverloadPacket does not have attribute 'nonexistent_attr'.", \ - str(context.exception)) + self.assertIn("or OpOverloadPacket does not have attribute 'nonexistent_attr'.", + str(context.exception)) @patch('msprobe.pytorch.hook_module.wrap_aten.AtenOPTemplate', autospec=True) def test_getattr_op_overload(self, MockAtenOPTemplate): diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_analyse.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_analyse.py index 904be210a3771f1757e4410b5e0fa0f2ad6152f2..ad4a97acaa9940e807e4023b9745bd210a827501 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_analyse.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_analyse.py @@ -42,7 +42,6 @@ class TestAnomalyDataWriter(unittest.TestCase): writer.init_detected_json() # Check that the directories were created - mock_create_directory.assert_any_call('/tmp/dump') mock_create_directory.assert_any_call('/tmp/dump/rank0') # Check that the JSON file was initialized diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_detect.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_detect.py index
fa0960e2cc1842a138b47fad3f86c1ed0d089db8..b50f77850485ef622b684ce757c7c1c64b2f7dba 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_detect.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_anomaly_detect.py @@ -2,7 +2,7 @@ import unittest from unittest import TestCase from unittest.mock import patch -from msprobe.pytorch.monitor.anomaly_detect import AnomalyTurbulence, AnomalyScanner, \ +from msprobe.pytorch.monitor.anomaly_detect import AnomalyTurbulence, AnomalyNan, AnomalyScanner, \ AnomalyDataFactory, GradAnomalyData, BaseWriterWithAD, ScanRule, WriterInput @@ -24,15 +24,43 @@ class TestAnomalyTurbulence(TestCase): def test_apply_with_positive_baseline(self): history = [10, 12, 14] cur = 16 - result = self.rule.apply(history, cur) + result = self.rule.apply(cur, history=history) self.assertTrue(result) def test_apply_with_non_positive_baseline(self): history = [0, 0, 0] cur = -1 - result = self.rule.apply(history, cur) + result = self.rule.apply(cur, history=history) self.assertTrue(result) + def test_apply_with_valid_value(self): + history = [0, 0, 0] + cur = 0 + result = self.rule.apply(cur, history=history) + self.assertFalse(result) + + +class TestAnomalyNan(TestCase): + + def setUp(self) -> None: + self.threshold = 1e10 + self.rule = AnomalyNan(self.threshold) + + def test_apply_with_nan(self): + cur = float("nan") + result = self.rule.apply(cur) + self.assertTrue(result) + + def test_apply_with_big_value(self): + cur = float("1e30") + result = self.rule.apply(cur) + self.assertTrue(result) + + def test_apply_with_valid_value(self): + cur = 0.5 + result = self.rule.apply(cur) + self.assertFalse(result) + class TestAnomalyScanner(TestCase): diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_csv2tb.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_csv2tb.py index 4178e2ef8fbfb2c2bafa90b32fa92d622b95e3cd..09e860e7ac5048bd059f888eabfd8ad1d7f45d37 100644 --- 
a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_csv2tb.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_csv2tb.py @@ -17,7 +17,6 @@ import os import shutil import random import unittest -import pytest import torch import numpy as np import torch.nn as nn @@ -30,13 +29,9 @@ from msprobe.pytorch.hook_module.api_register import get_api_register get_api_register().restore_all_api() - base_dir = os.path.dirname(os.path.realpath(__file__)) config_json_path = os.path.join(base_dir, "config", "all_config.json") monitor_output = os.path.join(base_dir, "./monitor_output_csv2tb") -os.environ[MonitorConst.MONITOR_OUTPUT_DIR] = monitor_output -timestamp_dirpath = None -csv2tb_dirpath = None def seed_all(seed=1234, mode=False): @@ -46,8 +41,8 @@ def seed_all(seed=1234, mode=False): torch.manual_seed(seed) torch.use_deterministic_algorithms(mode) -seed_all() +seed_all() inputs = [torch.rand(10, 10) for _ in range(10)] labels = [torch.randint(0, 5, (10,)) for _ in range(10)] @@ -65,31 +60,6 @@ class MockModule(nn.Module): return x2 -def data_collect(): - loss_fun = nn.CrossEntropyLoss() - test_module = MockModule() - nn.init.constant_(test_module.linear.weight, 1.0) - nn.init.constant_(test_module.linear.bias, 1.0) - optimizer = torch.optim.Adam(test_module.parameters()) - - monitor = TrainerMon(config_json_path, params_have_main_grad=False) - monitor.set_monitor(test_module, grad_acc_steps=1, optimizer=optimizer) - - for input_data, label in zip(inputs, labels): - output = test_module(input_data) - loss = loss_fun(output, label) - optimizer.zero_grad() - loss.backward() - optimizer.step() - - global timestamp_dirpath, csv2tb_dirpath - timestamp_dirpath = os.path.join(monitor_output, os.listdir(monitor_output)[0]) - csv2tensorboard_by_step(monitor_output) - for dirname in os.listdir(monitor_output): - if "csv2tensorboard" in dirname: - csv2tb_dirpath = os.path.join(monitor_output, dirname, "rank0") - - def extract_scalars_from_tensorboard(log_dir): # 
Initialize EventAccumulator event_acc = EventAccumulator(log_dir) @@ -144,97 +114,102 @@ def compare_scalar_dicts(dict1, dict2): return True -@pytest.fixture(scope="session") -def setup_all(): - data_collect() - yield - shutil.rmtree(monitor_output) - -@pytest.mark.usefixtures("setup_all") class TestGradMonitor(unittest.TestCase): + timestamp_dirpath = None + csv2tb_dirpath = None + + @classmethod + def setUpClass(cls): + + os.environ[MonitorConst.MONITOR_OUTPUT_DIR] = monitor_output + if os.path.exists(monitor_output): + shutil.rmtree(monitor_output) + + loss_fun = nn.CrossEntropyLoss() + test_module = MockModule() + nn.init.constant_(test_module.linear.weight, 1.0) + nn.init.constant_(test_module.linear.bias, 1.0) + optimizer = torch.optim.Adam(test_module.parameters()) + + monitor = TrainerMon(config_json_path, params_have_main_grad=False) + monitor.set_monitor(test_module, grad_acc_steps=1, optimizer=optimizer) + + for input_data, label in zip(inputs, labels): + output = test_module(input_data) + loss = loss_fun(output, label) + optimizer.zero_grad() + loss.backward() + optimizer.step() + + cls.timestamp_dirpath = os.path.join(monitor_output, os.listdir(monitor_output)[0]) + csv2tensorboard_by_step(monitor_output) + for dirname in os.listdir(monitor_output): + if "csv2tensorboard" in dirname: + cls.csv2tb_dirpath = os.path.join(monitor_output, dirname, "rank0") + os.environ.pop(MonitorConst.MONITOR_OUTPUT_DIR) def setUp(self): self.maxDiff = None - + def test_actv(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"actv_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "actv_0-2.csv")) result = { 'vp0:.input:micro0': { - 0: {'nans': 0.0,'norm': 5.550016}, - 1: {'nans': 0.0,'norm': 5.975112}, - 2: {'nans': 0.0,'norm': 5.789881} - }, + 0: {'nans': 0.0, 'norm': 5.550016}, + 1: {'nans': 0.0, 'norm': 5.975112}, + 2: {'nans': 0.0, 'norm': 5.789881} + }, 'vp0:.output:micro0': { - 0: {'nans': 0.0,'norm': 41.842655}, - 1: {'nans': 0.0,'norm': 
44.40981}, - 2: {'nans': 0.0,'norm': 43.578354} - }, + 0: {'nans': 0.0, 'norm': 41.842655}, + 1: {'nans': 0.0, 'norm': 44.40981}, + 2: {'nans': 0.0, 'norm': 43.578354} + }, 'vp0:linear.input:micro0': { - 0: {'nans': 0.0,'norm': 5.550016}, - 1: {'nans': 0.0,'norm': 5.975112}, - 2: {'nans': 0.0,'norm': 5.789881} - }, + 0: {'nans': 0.0, 'norm': 5.550016}, + 1: {'nans': 0.0, 'norm': 5.975112}, + 2: {'nans': 0.0, 'norm': 5.789881} + }, 'vp0:linear.output:micro0': { - 0: {'nans': 0.0,'norm': 41.842655}, - 1: {'nans': 0.0,'norm': 44.40981}, - 2: {'nans': 0.0,'norm': 43.578354} - }, + 0: {'nans': 0.0, 'norm': 41.842655}, + 1: {'nans': 0.0, 'norm': 44.40981}, + 2: {'nans': 0.0, 'norm': 43.578354} + }, 'vp0:relu.input:micro0': { - 0: {'nans': 0.0,'norm': 41.842655}, - 1: {'nans': 0.0,'norm': 44.40981}, - 2: {'nans': 0.0,'norm': 43.578354} - }, + 0: {'nans': 0.0, 'norm': 41.842655}, + 1: {'nans': 0.0, 'norm': 44.40981}, + 2: {'nans': 0.0, 'norm': 43.578354} + }, 'vp0:relu.output:micro0': { - 0: {'nans': 0.0,'norm': 41.842655}, - 1: {'nans': 0.0,'norm': 44.40981}, - 2: {'nans': 0.0,'norm': 43.578354} - } + 0: {'nans': 0.0, 'norm': 41.842655}, + 1: {'nans': 0.0, 'norm': 44.40981}, + 2: {'nans': 0.0, 'norm': 43.578354} } - self.assertEqual(dict_equal(data, result), True) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "actv")) + } + self.assertDictEqual(data, result) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "actv")) print(tb_data) tb_result = { 'vp0:.input:micro0/nans': [(0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:.input:micro0/norm': [(0, 5.550015926361084), - (1, 5.975111961364746), - (2, 5.789881229400635), - (3, 6.052319049835205), - (4, 5.573315143585205), - (5, 5.864360809326172), - (6, 5.292460918426514), - (7, 
5.477899074554443), - (8, 5.884613990783691), - (9, 5.456457138061523)], + (1, 5.975111961364746), + (2, 5.789881229400635), + (3, 6.052319049835205), + (4, 5.573315143585205), + (5, 5.864360809326172), + (6, 5.292460918426514), + (7, 5.477899074554443), + (8, 5.884613990783691), + (9, 5.456457138061523)], 'vp0:.output:micro0/nans': [(0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], - 'vp0:.output:micro0/norm': [(0, 41.842655181884766), - (1, 44.40980911254883), - (2, 43.57835388183594), - (3, 45.83631134033203), - (4, 42.0673828125), - (5, 43.46839141845703), - (6, 39.77947235107422), - (7, 40.200843811035156), - (8, 44.453147888183594), - (9, 40.841522216796875)], - 'vp0:linear.input:micro0/nans': [(0, 0.0), (1, 0.0), (2, 0.0), (3, 0.0), @@ -244,117 +219,136 @@ class TestGradMonitor(unittest.TestCase): (7, 0.0), (8, 0.0), (9, 0.0)], + 'vp0:.output:micro0/norm': [(0, 41.842655181884766), + (1, 44.40980911254883), + (2, 43.57835388183594), + (3, 45.83631134033203), + (4, 42.0673828125), + (5, 43.46839141845703), + (6, 39.77947235107422), + (7, 40.200843811035156), + (8, 44.453147888183594), + (9, 40.841522216796875)], + 'vp0:linear.input:micro0/nans': [(0, 0.0), + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:linear.input:micro0/norm': [(0, 5.550015926361084), - (1, 5.975111961364746), - (2, 5.789881229400635), - (3, 6.052319049835205), - (4, 5.573315143585205), - (5, 5.864360809326172), - (6, 5.292460918426514), - (7, 5.477899074554443), - (8, 5.884613990783691), - (9, 5.456457138061523)], + (1, 5.975111961364746), + (2, 5.789881229400635), + (3, 6.052319049835205), + (4, 5.573315143585205), + (5, 5.864360809326172), + (6, 5.292460918426514), + (7, 5.477899074554443), + (8, 5.884613990783691), + (9, 5.456457138061523)], 'vp0:linear.output:micro0/nans': [(0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 
0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:linear.output:micro0/norm': [(0, 41.842655181884766), - (1, 44.40980911254883), - (2, 43.57835388183594), - (3, 45.83631134033203), - (4, 42.0673828125), - (5, 43.46839141845703), - (6, 39.77947235107422), - (7, 40.200843811035156), - (8, 44.453147888183594), - (9, 40.841522216796875)], + (1, 44.40980911254883), + (2, 43.57835388183594), + (3, 45.83631134033203), + (4, 42.0673828125), + (5, 43.46839141845703), + (6, 39.77947235107422), + (7, 40.200843811035156), + (8, 44.453147888183594), + (9, 40.841522216796875)], 'vp0:relu.input:micro0/nans': [(0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:relu.input:micro0/norm': [(0, 41.842655181884766), - (1, 44.40980911254883), - (2, 43.57835388183594), - (3, 45.83631134033203), - (4, 42.0673828125), - (5, 43.46839141845703), - (6, 39.77947235107422), - (7, 40.200843811035156), - (8, 44.453147888183594), - (9, 40.841522216796875)], + (1, 44.40980911254883), + (2, 43.57835388183594), + (3, 45.83631134033203), + (4, 42.0673828125), + (5, 43.46839141845703), + (6, 39.77947235107422), + (7, 40.200843811035156), + (8, 44.453147888183594), + (9, 40.841522216796875)], 'vp0:relu.output:micro0/nans': [(0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:relu.output:micro0/norm': [(0, 41.842655181884766), - (1, 44.40980911254883), - (2, 43.57835388183594), - (3, 45.83631134033203), - (4, 42.0673828125), - (5, 43.46839141845703), - (6, 39.77947235107422), - (7, 40.200843811035156), - (8, 44.453147888183594), - (9, 
40.841522216796875)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) - + (1, 44.40980911254883), + (2, 43.57835388183594), + (3, 45.83631134033203), + (4, 42.0673828125), + (5, 43.46839141845703), + (6, 39.77947235107422), + (7, 40.200843811035156), + (8, 44.453147888183594), + (9, 40.841522216796875)]} + self.assertDictEqual(tb_data, tb_result) def test_actv_grad(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"actv_grad_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "actv_grad_0-2.csv")) nan = np.nan result = { 'vp0:.input:micro0': { - 0: {'norm': nan, 'nans': nan}, - 1: {'norm': nan, 'nans': nan}, + 0: {'norm': nan, 'nans': nan}, + 1: {'norm': nan, 'nans': nan}, 2: {'norm': nan, 'nans': nan} - }, + }, 'vp0:.output:micro0': { - 0: {'norm': 0.282843, 'nans': 0.0}, - 1: {'norm': 0.282617, 'nans': 0.0}, + 0: {'norm': 0.282843, 'nans': 0.0}, + 1: {'norm': 0.282617, 'nans': 0.0}, 2: {'norm': 0.282655, 'nans': 0.0} - }, + }, 'vp0:relu.input:micro0': { - 0: {'norm': 0.282843, 'nans': 0.0}, - 1: {'norm': 0.282617, 'nans': 0.0}, + 0: {'norm': 0.282843, 'nans': 0.0}, + 1: {'norm': 0.282617, 'nans': 0.0}, 2: {'norm': 0.282655, 'nans': 0.0} - }, + }, 'vp0:relu.output:micro0': { - 0: {'norm': 0.282843, 'nans': 0.0}, - 1: {'norm': 0.282617, 'nans': 0.0}, + 0: {'norm': 0.282843, 'nans': 0.0}, + 1: {'norm': 0.282617, 'nans': 0.0}, 2: {'norm': 0.282655, 'nans': 0.0} - }, + }, 'vp0:linear.input:micro0': { - 0: {'norm': nan, 'nans': nan}, - 1: {'norm': nan, 'nans': nan}, + 0: {'norm': nan, 'nans': nan}, + 1: {'norm': nan, 'nans': nan}, 2: {'norm': nan, 'nans': nan} - }, + }, 'vp0:linear.output:micro0': { - 0: {'norm': 0.282843, 'nans': 0.0}, - 1: {'norm': 0.282617, 'nans': 0.0}, + 0: {'norm': 0.282843, 'nans': 0.0}, + 1: {'norm': 0.282617, 'nans': 0.0}, 2: {'norm': 0.282655, 'nans': 0.0} - } } - self.assertEqual(dict_equal(data, result), True) - - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, 
"actv_grad")) + } + print(data) + + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "actv_grad")) tb_result = { 'vp0:.input:micro0/nans': [(0, nan), (1, nan), @@ -475,88 +469,90 @@ class TestGradMonitor(unittest.TestCase): (6, 0.28316599130630493), (7, 0.28274500370025635), (8, 0.2833530008792877), - (9, 0.2825529873371124)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) + (9, 0.2825529873371124)] + } + print(tb_data) - def test_param(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"param_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "param_origin_0-2.csv")) result = { 'vp0:linear.bias': { 0: {'nans': 0.0, 'norm': 2.236068}, 1: {'nans': 0.0, 'norm': 2.236198}, 2: {'nans': 0.0, 'norm': 2.235769} - }, + }, 'vp0:linear.weight': { 0: {'nans': 0.0, 'norm': 7.071068}, 1: {'nans': 0.0, 'norm': 7.068808}, 2: {'nans': 0.0, 'norm': 7.06771} - } } - self.assertEqual(dict_equal(data, result), True) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "param")) + } + self.assertDictEqual(data, result) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "param_origin")) tb_result = { 'vp0:linear.weight/norm': [ - (0, 7.071067810058594), - (1, 7.068808078765869), - (2, 7.067709922790527), - (3, 7.0673418045043945), - (4, 7.066926956176758), - (5, 7.066311836242676), - (6, 7.065629959106445), - (7, 7.065262794494629), - (8, 7.065001964569092), - (9, 7.064840793609619)], + (0, 7.071067810058594), + (1, 7.068808078765869), + (2, 7.067709922790527), + (3, 7.0673418045043945), + (4, 7.066926956176758), + (5, 7.066311836242676), + (6, 7.065629959106445), + (7, 7.065262794494629), + (8, 7.065001964569092), + (9, 7.064840793609619)], 'vp0:linear.weight/nans': [ - (0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)], + (0, 0.0), + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + 
(6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0)], 'vp0:linear.bias/norm': [ - (0, 2.2360680103302), - (1, 2.2361979484558105), - (2, 2.235769033432007), - (3, 2.235903024673462), - (4, 2.2360129356384277), - (5, 2.2359039783477783), - (6, 2.2357990741729736), - (7, 2.2357349395751953), - (8, 2.2356700897216797), - (9, 2.235619068145752)], + (0, 2.2360680103302), + (1, 2.2361979484558105), + (2, 2.235769033432007), + (3, 2.235903024673462), + (4, 2.2360129356384277), + (5, 2.2359039783477783), + (6, 2.2357990741729736), + (7, 2.2357349395751953), + (8, 2.2356700897216797), + (9, 2.235619068145752) + ], 'vp0:linear.bias/nans': [ - (0, 0.0), - (1, 0.0), - (2, 0.0), - (3, 0.0), - (4, 0.0), - (5, 0.0), - (6, 0.0), - (7, 0.0), - (8, 0.0), - (9, 0.0)] - } - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) + (0, 0.0), + (1, 0.0), + (2, 0.0), + (3, 0.0), + (4, 0.0), + (5, 0.0), + (6, 0.0), + (7, 0.0), + (8, 0.0), + (9, 0.0) + ] + } + self.assertDictEqual(tb_data, tb_result) def test_exp_avg(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"exp_avg_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "exp_avg_0-2.csv")) result = { 'vp0:linear.bias': { 1: {'nans': 0.0, 'norm': 0.024495}, 2: {'nans': 0.0, 'norm': 0.052203} - }, + }, 'vp0:linear.weight': { 1: {'nans': 0.0, 'norm': 0.052394}, 2: {'nans': 0.0, 'norm': 0.099221} - } } - self.assertEqual(dict_equal(data, result), True) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "exp_avg")) + } + self.assertDictEqual(data, result) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "exp_avg")) tb_result = { 'vp0:linear.bias/nans': [(1, 0.0), (2, 0.0), @@ -594,22 +590,22 @@ class TestGradMonitor(unittest.TestCase): (7, 0.11372199654579163), (8, 0.12264800071716309), (9, 0.09017200022935867)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) + self.assertDictEqual(tb_data, tb_result) def test_exp_avg_sq(self): - data 
= parse_step_fn(os.path.join(timestamp_dirpath,"exp_avg_sq_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "exp_avg_sq_0-2.csv")) result = { 'vp0:linear.bias': { 1: {'nans': 0.0, 'norm': 4.2e-05}, 2: {'nans': 0.0, 'norm': 9.6e-05} - }, + }, 'vp0:linear.weight': { 1: {'nans': 0.0, 'norm': 6.7e-05}, 2: {'nans': 0.0, 'norm': 0.000126} - } } - self.assertEqual(dict_equal(data, result), True) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "exp_avg_sq")) + } + self.assertDictEqual(data, result) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "exp_avg_sq")) tb_result = { 'vp0:linear.bias/nans': [(1, 0.0), (2, 0.0), @@ -647,24 +643,24 @@ class TestGradMonitor(unittest.TestCase): (7, 0.00026000000070780516), (8, 0.00028700000257231295), (9, 0.0003060000017285347)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) - + self.assertDictEqual(tb_data, tb_result) + def test_grad_reduced(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"grad_reduced_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "grad_reduced_0-2.csv")) result = { 'vp0:linear.bias': { 0: {'nans': 0.0, 'norm': 0.244949}, 1: {'nans': 0.0, 'norm': 0.314345}, 2: {'nans': 0.0, 'norm': 0.281475} - }, + }, 'vp0:linear.weight': { 0: {'nans': 0.0, 'norm': 0.523935}, 1: {'nans': 0.0, 'norm': 0.595672}, 2: {'nans': 0.0, 'norm': 0.497603} - } } - self.assertEqual(dict_equal(data, result), True) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "grad_reduced")) + } + self.assertDictEqual(data, result) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "grad_reduced")) tb_result = { 'vp0:linear.bias/nans': [(0, 0.0), (1, 0.0), @@ -706,25 +702,25 @@ class TestGradMonitor(unittest.TestCase): (7, 0.4831080138683319), (8, 0.3234719932079315), (9, 0.32385098934173584)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) - + 
self.assertDictEqual(tb_data, tb_result) + def test_grad_unreduced(self): - data = parse_step_fn(os.path.join(timestamp_dirpath,"grad_unreduced_0-2.csv")) + data = parse_step_fn(os.path.join(self.timestamp_dirpath, "grad_unreduced_0-2.csv")) result = { 'vp0:linear.bias': { 0: {'nans': 0.0, 'norm': 0.244949}, 1: {'nans': 0.0, 'norm': 0.314345}, 2: {'nans': 0.0, 'norm': 0.281475} - }, + }, 'vp0:linear.weight': { 0: {'nans': 0.0, 'norm': 0.523935}, 1: {'nans': 0.0, 'norm': 0.595672}, 2: {'nans': 0.0, 'norm': 0.497603} - } } - self.assertEqual(dict_equal(data, result), True) + } + self.assertDictEqual(data, result) - tb_data = extract_scalars_from_tensorboard(os.path.join(csv2tb_dirpath, "grad_unreduced")) + tb_data = extract_scalars_from_tensorboard(os.path.join(self.csv2tb_dirpath, "grad_unreduced")) tb_result = { 'vp0:linear.bias/nans': [(0, 0.0), (1, 0.0), @@ -766,4 +762,8 @@ class TestGradMonitor(unittest.TestCase): (7, 0.4831080138683319), (8, 0.3234719932079315), (9, 0.32385098934173584)]} - self.assertEqual(compare_scalar_dicts(tb_data, tb_result), True) + self.assertDictEqual(tb_data, tb_result) + + +if __name__ == '__main__': + unittest.main() diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_module_hook.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_module_hook.py index 66d016f9487a4e7f7fc747dfb021b1f887c51f4a..9c990d158e18a4b28e40bfe28cdcee5f5ef1a938 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_module_hook.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_module_hook.py @@ -90,13 +90,13 @@ class TestModuleHook(unittest.TestCase): self.assertTrue(os.path.exists(actv_grad_0_csv)) # validate columns and lines actv_0 = pd.read_csv(actv_0_csv) - expect_columns = ['vpp_stage', 'name', 'step', 'micro_step', 'norm', 'nans'] + expect_columns = ['vpp_stage', 'name', 'step', 'micro_step', 'norm', 'nans', "shape", "dtype"] self.assertListEqual(list(actv_0.columns), expect_columns) - 
self.assertEqual(actv_0.shape, tuple([6, 6])) + self.assertEqual(actv_0.shape, tuple([6, 8])) actv_grad_0 = pd.read_csv(actv_grad_0_csv) - expect_columns = ['vpp_stage', 'name', 'step', 'micro_step', 'norm', 'nans'] + expect_columns = ['vpp_stage', 'name', 'step', 'micro_step', 'norm', 'nans', "shape", "dtype"] self.assertListEqual(list(actv_grad_0.columns), expect_columns) - self.assertEqual(actv_0.shape, tuple([6, 6])) + self.assertEqual(actv_0.shape, tuple([6, 8])) def test_wg_distribution(self): self.get_dist_mock(False) @@ -113,13 +113,13 @@ class TestModuleHook(unittest.TestCase): self.assertTrue(os.path.exists(grad_reduced_0_csv)) self.assertTrue(os.path.exists(grad_unreduced_0_csv)) # validate columns and lines - expect_columns = ["vpp_stage", "name", "step", "norm"] + expect_columns = ["vpp_stage", "name", "step", "norm", "shape", "dtype"] grad_reduced_0 = pd.read_csv(grad_reduced_0_csv) self.assertListEqual(list(grad_reduced_0.columns), expect_columns) - self.assertEqual(grad_reduced_0.shape, tuple([2, 4])) + self.assertEqual(grad_reduced_0.shape, tuple([2, 6])) grad_unreduced_0 = pd.read_csv(grad_unreduced_0_csv) self.assertListEqual(list(grad_unreduced_0.columns), expect_columns) - self.assertEqual(grad_unreduced_0.shape, tuple([2, 4])) + self.assertEqual(grad_unreduced_0.shape, tuple([2, 6])) def test_mv_distribution(self): self.get_dist_mock(False) @@ -136,13 +136,13 @@ class TestModuleHook(unittest.TestCase): self.assertTrue(os.path.exists(exp_avg_1_csv)) self.assertTrue(os.path.exists(exp_avg_sq_1_csv)) # validate columns and lines - expect_columns = ["vpp_stage", "name", "step", "norm"] + expect_columns = ["vpp_stage", "name", "step", "norm", "shape", "dtype"] exp_avg_1 = pd.read_csv(exp_avg_1_csv) self.assertListEqual(list(exp_avg_1.columns), expect_columns) - self.assertEqual(exp_avg_1.shape, tuple([2, 4])) + self.assertEqual(exp_avg_1.shape, tuple([2, 6])) exp_avg_sq_1 = pd.read_csv(exp_avg_sq_1_csv) 
self.assertListEqual(list(exp_avg_sq_1.columns), expect_columns) - self.assertEqual(exp_avg_sq_1.shape, tuple([2, 4])) + self.assertEqual(exp_avg_sq_1.shape, tuple([2, 6])) def test_ur_distribution(self): self.get_dist_mock(False) @@ -261,61 +261,6 @@ class TestParamIsDataParallelDuplicate(unittest.TestCase): self.assertFalse(result) -class TestModuleHookContext(unittest.TestCase): - def setUp(self): - self.module_name = "test_module" - self.context = ModuleHookContext(self.module_name) - self.context.struct = { - Const.INPUT: { - "config": "tuple[1]", - "0": "size=(2, 784), dtype=torch.float32", - }, - Const.OUTPUT: { - "config": "tensor", - "tensor": "size=(2, 10), dtype=torch.float32" - }, - MonitorConst.INPUT_GRAD: { - "config": "tuple[1]", - "0": "size=(2, 784), dtype=torch.float32" - }, - MonitorConst.OUTPUT_GRAD: { - "config": "tuple[1]", - "0": "size=(2, 10), dtype=torch.float32" - } - } - self.target_config = { - self.module_name: { - Const.INPUT: "tuple[1]:0", - Const.OUTPUT: "tensor", - MonitorConst.INPUT_GRAD: "tuple[1]:0" - } - } - - def test_set_format_by_arg_module_name_in_target_config(self): - self.context.set_format_by_arg(Const.INPUT, self.target_config) - self.assertEqual(self.context.format_by_arg[Const.INPUT], "tuple[1]:0") - self.context.set_format_by_arg(Const.OUTPUT, self.target_config) - self.assertEqual(self.context.format_by_arg[Const.OUTPUT], "tensor") - self.context.set_format_by_arg(MonitorConst.INPUT_GRAD, self.target_config) - self.assertEqual(self.context.format_by_arg[MonitorConst.INPUT_GRAD], "tuple[1]:0") - self.context.set_format_by_arg(MonitorConst.OUTPUT_GRAD, self.target_config) - self.assertEqual(self.context.format_by_arg[MonitorConst.OUTPUT_GRAD], "tuple[1]") - - def test_set_format_by_arg_module_name_not_in_target_config(self): - target_config = {} - self.context.set_format_by_arg(Const.INPUT, target_config) - self.assertEqual(self.context.format_by_arg[Const.INPUT], "tuple[1]") - 
self.context.set_format_by_arg(Const.OUTPUT, target_config) - self.assertEqual(self.context.format_by_arg[Const.OUTPUT], "tensor") - - @patch('msprobe.pytorch.monitor.module_hook.logger') - def test_set_format_by_arg_target_module_config_error(self, mock_logger): - target_config = {self.module_name: {Const.INPUT: 123}} - self.context.set_format_by_arg(Const.INPUT, target_config) - self.assertIsNone(self.context.format_by_arg.get(Const.INPUT)) - mock_logger.warning_on_rank_0.assert_called_once() - - class TestContext(unittest.TestCase): def test_communication_context(self): cc_ctx = CommunicationContext() diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_monitor_utils.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_monitor_utils.py index 0462ac3f39531119b40d3cc5051fad77f687b9b5..87822ab0503bd21e0546d8c846d69f56204eb048 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_monitor_utils.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_monitor_utils.py @@ -44,12 +44,12 @@ class TestValidationFunctions(unittest.TestCase): def test_validate_ops(self): ops = ['op1', 'op2', 'norm', 'max'] valid_ops = validate_ops(ops) - self.assertEqual(valid_ops, ['norm', 'max']) + self.assertEqual(valid_ops, ['norm', 'max', "shape", "dtype"]) def test_no_valid_ops(self): ops = ['op1', 'op2'] valid_ops = validate_ops(ops) - target_ops = [MonitorConst.OP_LIST[0]] + target_ops = [MonitorConst.OP_LIST[0], "shape", "dtype"] self.assertEqual(valid_ops, target_ops) def test_validate_ranks(self): @@ -104,7 +104,7 @@ class TestValidationFunctions(unittest.TestCase): 'alert': {'rules': [{'rule_name': 'AnomalyTurbulence', 'args': {'threshold': 10.0}}], 'dump': True} } validate_config(config) - target_ops = [MonitorConst.OP_LIST[0]] + target_ops = [MonitorConst.OP_LIST[0], "shape", "dtype"] self.assertEqual(config["ops"], target_ops) del config["targets"] validate_config(config) diff --git 
a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_optimizer_collect.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_optimizer_collect.py index e17b6249a34273f63f0508ef997f3b1cb8f8de66..242f70e50e4cdc2b2b50dc99be627bdec47ad263 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_optimizer_collect.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/monitor/test_optimizer_collect.py @@ -3,20 +3,19 @@ from collections import defaultdict from unittest.mock import Mock, patch, MagicMock import torch -from torch._utils import _flatten_dense_tensors +from msprobe.core.common.const import MonitorConst from msprobe.pytorch.monitor.optimizer_collect import OptimizerMon, \ - OptimizerMonFactory, DummyOptimizerMon, \ - MixPrecisionOptimizerMon, MegatronDistributedOptimizerMon, MegatronFP32OptimizerMon, \ + OptimizerMonFactory, MixPrecisionOptimizerMon, MegatronDistributedOptimizerMon, \ MegatronChainedDistributedOptimizerMon, MegatronChainedMixPrecisionOptimizerMon, \ - DeepSpeedZeroOptimizerStage0Mon, DeepSpeedZeroOptimizerStage1or2Mon, DeepSpeedZeroOptimizerStage3Mon - -from msprobe.pytorch.monitor.utils import MVResult, MVGradResult + DeepSpeedZeroOptimizerMon, DeepSpeedZeroOptimizerStage0Mon, \ + DeepSpeedZeroOptimizerStage1or2Mon, DeepSpeedZeroOptimizerStage3Mon +from msprobe.pytorch.monitor.utils import MVResult def setup_param_groups(num_groups=2, params_per_group=5): bit16_groups = [] param_names = {} - name2index = {} + grad_position = {} param_slice_mappings = [] count = 0 for group_idx in range(num_groups): @@ -27,16 +26,17 @@ def setup_param_groups(num_groups=2, params_per_group=5): name = f'param{group_idx}_{i}' p = torch.nn.Parameter(torch.randn(2,3, dtype=torch.bfloat16)) p.ds_tensor = torch.nn.Parameter(torch.randn(1,3, dtype=torch.bfloat16)) + p.ds_id = count param_slice_mapping[name] = MagicMock(start=offset, numel=p.numel()) - name2index[name] = count group.append(p) param_names[p] = name + grad_position[count] = 
[group_idx, offset, p.numel()] offset += p.numel() count += 1 bit16_groups.append(group) param_slice_mappings.append(param_slice_mapping) - return bit16_groups, param_names, name2index, param_slice_mappings + return bit16_groups, param_names, param_slice_mappings, grad_position def setup_mock_monitor(): mock_monitor = MagicMock() @@ -56,11 +56,11 @@ class TestOptimizerMon(unittest.TestCase): self.monitor.ratio_heatmap_visualizer = {'param1': Mock(), 'param2': Mock()} def test_fetch_mv(self): - optimizer_mon = OptimizerMon() - res = optimizer_mon.fetch_mv(None, None, None) - self.assertEqual(res, None) + optimizer_mon = OptimizerMon(None) + res = optimizer_mon.fetch_mv(None, {}) + self.assertEqual(res.exp_avg, {}) - def test_fetch_mv_in_adam(self): + def test_fetch_mv(self): self.torch_opt = Mock() self.torch_opt.state = { 'param1': {'exp_avg': torch.tensor(0.1), 'exp_avg_sq': torch.tensor(0.2), 'step': torch.tensor(10)}, @@ -70,48 +70,10 @@ class TestOptimizerMon(unittest.TestCase): self.torch_opt.defaults = {'betas': (0.9, 0.999), 'eps': 1e-8} self.params2name = {'param1': 'param1', 'param2': 'param2'} - self.optimizer_mon = OptimizerMon() - result = self.optimizer_mon._fetch_mv_in_adam(self.monitor, self.torch_opt, self.params2name) + self.optimizer_mon = OptimizerMon(None) + result = self.optimizer_mon.fetch_mv(self.monitor, self.params2name) self.assertIsInstance(result, MVResult) - @patch('msprobe.pytorch.monitor.optimizer_collect.dist') - def test_fetch_mv_grad_in_adam(self, mock_dist): - self.optimizer_mon = OptimizerMon() - self.monitor = MagicMock() - self.torch_opt = MagicMock() - self.params2name = defaultdict(str) - self.name2indices = defaultdict(tuple) - self.fp32_partitioned_groups_flat = defaultdict(torch.Tensor) - - # Mocking the dist.get_rank() and dist.get_world_size() - mock_dist.get_rank.return_value = 0 - mock_dist.get_world_size.return_value = 1 - - # Mocking the wrapped_optimizer - self.torch_opt.state = defaultdict(dict) - 
self.torch_opt.averaged_gradients = defaultdict(torch.Tensor) - self.torch_opt.partition_size = defaultdict(int) - self.torch_opt.flatten_dense_tensors_aligned = MagicMock() - self.torch_opt.flatten = MagicMock() - - # Mocking the torch_opt.param_groups - self.torch_opt.param_groups = [{'step': 1, 'betas': (0.9, 0.999)}, - {'step': 2, 'betas': (0.9, 0.999)}, - {'step': 3, 'betas': (0.9, 0.999)}] - - # Mocking the monitor.mv_distribution, monitor.mg_direction, monitor.ur_distribution - self.monitor.mv_distribution = True - self.monitor.mg_direction = True - self.monitor.ur_distribution = True - - # Mocking the monitor.update_heatmap_visualizer and monitor.ratio_heatmap_visualizer - self.monitor.update_heatmap_visualizer = defaultdict(MagicMock) - self.monitor.ratio_heatmap_visualizer = defaultdict(MagicMock) - - result = self.optimizer_mon._fetch_mv_grad_in_adam(self.monitor, self.torch_opt, self.params2name, - self.name2indices, self.fp32_partitioned_groups_flat) - self.assertIsInstance(result, MVGradResult) - class TestMixPrecisionOptimizerMon(unittest.TestCase): def test_fetch_mv_with_fp16_to_fp32_param_and_mix_prec_opt(self): @@ -122,16 +84,16 @@ class TestMixPrecisionOptimizerMon(unittest.TestCase): self.mix_prec_opt = MagicMock() self.mix_prec_opt.float16_groups = [MagicMock()] self.mix_prec_opt.fp32_from_float16_groups = [MagicMock()] - self.optimizer = MixPrecisionOptimizerMon() + self.optimizer = MixPrecisionOptimizerMon(self.torch_opt) self.optimizer.fp16_to_fp32_param = {} - # Mock _fetch_mv_in_adam method and set a fixed return value + # Mock fetch_mv method and set a fixed return value mv_result = MVResult(exp_avg={}, exp_avg_sq={}, update={}, ratio={}) - self.mock_fetch_mv_in_adam = MagicMock(return_value=mv_result) - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam + self.mock_fetch_mv = MagicMock(return_value=mv_result) + self.optimizer.fetch_mv = self.mock_fetch_mv - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, 
self.params2name) - self.mock_fetch_mv_in_adam.assert_called_once_with(self.monitor, self.torch_opt, self.params2name) + res = self.optimizer.fetch_mv(self.monitor, self.params2name) + self.mock_fetch_mv.assert_called_once_with(self.monitor, self.params2name) self.assertIsInstance(res, MVResult) @@ -143,17 +105,17 @@ class TestChainedMixPrecisionOptimizerMon(unittest.TestCase): self.params2name = MagicMock() self.torch_opt.float16_groups = [MagicMock()] self.torch_opt.fp32_from_float16_groups = [MagicMock()] - self.optimizer = MegatronChainedMixPrecisionOptimizerMon() + self.optimizer = MegatronChainedMixPrecisionOptimizerMon(self.torch_opt) self.optimizer.optimizer = [MagicMock(), MagicMock()] self.optimizer.fp16_to_fp32_param = {} - # Mock _fetch_mv_in_adam method and set a fixed return value + # Mock fetch_mv method and set a fixed return value mv_result = MVResult(exp_avg={}, exp_avg_sq={}, update={}, ratio={}) - self.mock_fetch_mv_in_adam = MagicMock(return_value=mv_result) - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam + self.mock_fetch_mv = MagicMock(return_value=mv_result) + self.optimizer.fetch_mv = self.mock_fetch_mv - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) - self.mock_fetch_mv_in_adam.assert_called_once_with(self.monitor, self.torch_opt, self.params2name) + res = self.optimizer.fetch_mv(self.monitor, self.params2name) + self.mock_fetch_mv.assert_called_once_with(self.monitor, self.params2name) self.assertIsInstance(res, MVResult) @@ -162,26 +124,27 @@ class TestMegatronChainedDistributedOptimizerMon(unittest.TestCase): self.monitor = MagicMock() self.torch_opt = MagicMock() self.params2name = MagicMock() + self.torch_opt.chained_optimizers = [MagicMock(), MagicMock()] mv_result = MVResult(exp_avg={}, exp_avg_sq={}, update={}, ratio={}) - self.mock_fetch_mv_in_adam = MagicMock(return_value=mv_result) - self.optimizer = MegatronChainedDistributedOptimizerMon() + self.mock_fetch_mv = 
MagicMock(return_value=mv_result) + self.optimizer = MegatronChainedDistributedOptimizerMon(self.torch_opt) def test_fetch_mv_with_valid_optimizer(self): - self.torch_opt.model_float16_groups = [MagicMock()] - self.torch_opt.shard_fp32_from_float16_groups = [MagicMock()] - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam + for opt in self.torch_opt.chained_optimizers: + opt.model_float16_groups = [MagicMock()] + opt.shard_fp32_from_float16_groups = [MagicMock()] + self.optimizer.fetch_mv = self.mock_fetch_mv - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) + res = self.optimizer.fetch_mv(self.monitor, self.params2name) self.assertIsInstance(res, MVResult) def test_fetch_mv_with_invalid_optimizer(self): - self.torch_opt = Mock() - self.torch_opt.model_float16_groups = None - self.torch_opt.shard_fp32_from_float16_groups = None - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam + for opt in self.torch_opt.chained_optimizers: + del opt.model_float16_groups + del opt.shard_fp32_from_float16_groups with self.assertRaises(Exception): - self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) + self.optimizer.fetch_mv(self.monitor, self.params2name) class TestMegatronDistributedOptimizerMon(unittest.TestCase): @@ -190,25 +153,23 @@ class TestMegatronDistributedOptimizerMon(unittest.TestCase): self.torch_opt = MagicMock() self.params2name = MagicMock() mv_result = MVResult(exp_avg={}, exp_avg_sq={}, update={}, ratio={}) - self.mock_fetch_mv_in_adam = MagicMock(return_value=mv_result) - self.optimizer = MegatronDistributedOptimizerMon() + self.mock_fetch_mv = MagicMock(return_value=mv_result) + self.optimizer = MegatronDistributedOptimizerMon(self.torch_opt) def test_fetch_mv_with_valid_optimizer(self): self.torch_opt.model_float16_groups = [MagicMock()] self.torch_opt.shard_fp32_from_float16_groups = [MagicMock()] - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam + self.optimizer.fetch_mv 
= self.mock_fetch_mv - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) + res = self.optimizer.fetch_mv(self.monitor, self.params2name) self.assertIsInstance(res, MVResult) def test_fetch_mv_with_invalid_optimizer(self): - self.torch_opt = Mock() self.torch_opt.model_float16_groups = None self.torch_opt.shard_fp32_from_float16_groups = None - self.optimizer._fetch_mv_in_adam = self.mock_fetch_mv_in_adam with self.assertRaises(Exception): - self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) + self.optimizer.fetch_mv(self.monitor, self.params2name) class TestCommonFetchMv(unittest.TestCase): @@ -217,157 +178,180 @@ class TestCommonFetchMv(unittest.TestCase): self.torch_opt = MagicMock() self.params2name = MagicMock() - def test_megatron_fp32_optimizer_mon(self): - self.optimizer = MegatronFP32OptimizerMon() - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) + def test_optimizer_mon(self): + self.optimizer = OptimizerMon(None) + res = self.optimizer.fetch_mv(self.monitor, self.params2name) self.assertIsInstance(res, MVResult) - def test_deepspeed_zero_optimizer_stage0_mon(self): - self.optimizer = DeepSpeedZeroOptimizerStage0Mon() - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) - self.assertIsInstance(res, MVResult) - def test_dummy_optimizer_mon(self): - self.optimizer = DummyOptimizerMon() - res = self.optimizer.fetch_mv(self.monitor, self.torch_opt, self.params2name) - self.assertIsInstance(res, MVResult) +class TestDeepSpeedZeroOptimizer(unittest.TestCase): + def setUp(self): + bit16_groups, param_names, param_slice_mappings, _ = setup_param_groups() + mock_opt = MagicMock() + mock_opt.state_dict.return_value = { + 'param_slice_mappings': param_slice_mappings + } + mock_opt.param_names = param_names + mock_opt.bit16_groups = bit16_groups + self.torch_opt = mock_opt + self.mock_monitor = setup_mock_monitor() + self.optimizer_mon = 
DeepSpeedZeroOptimizerMon(mock_opt) + self.optimizer_mon.bit16_groups = mock_opt.bit16_groups + self.optimizer_mon.param2group = self.optimizer_mon.get_group_index() -class TestDeepSpeedZeroOptimizerStage3Mon(unittest.TestCase): + def test_param_not_in_partition(self): + param_in_partition = list(self.torch_opt.param_names.keys())[0] + param_not_in_partition = torch.randn(2,3) + + self.assertFalse( + self.optimizer_mon.param_not_in_partition(param_in_partition, 0) + ) + self.assertTrue( + self.optimizer_mon.param_not_in_partition(param_not_in_partition, 0) + ) + + def test_get_position(self): + param_in_partition = list(self.torch_opt.param_names.keys())[0] + start, numel = self.optimizer_mon.get_position(param_in_partition, 0) + self.assertEqual(start, 0) + self.assertEqual(numel, 6) + + def test_get_group_index(self): + param = list(self.torch_opt.param_names.keys())[6] + self.assertEqual(self.optimizer_mon.param2group[param], 1) + +class TestDeepSpeedZeroOptimizerStage0Mon(unittest.TestCase): def setUp(self): - bit16_groups, param_names, name2index, _ = setup_param_groups() + bit16_groups, param_names, param_slice_mappings, _ = setup_param_groups() mock_opt = MagicMock() + mock_opt.state_dict.return_value = { + 'param_slice_mappings': param_slice_mappings + } mock_opt.param_names = param_names - mock_opt.fp16_groups = bit16_groups - mock_opt.fp32_partitioned_groups_flat = [torch.stack(group,dim=0).flatten().float() - for group in bit16_groups] - mock_opt.fp16_partitioned_groups = [[p.ds_tensor for p in group] for group in bit16_groups] - mock_opt.flatten = _flatten_dense_tensors - mock_opt.averaged_gradients = {group_idx: [torch.randn_like(p.ds_tensor) for p in group] - for group_idx, group in enumerate(bit16_groups)} + mock_opt.bf16_groups = bit16_groups + mock_opt.fp32_groups_flat_partition = [torch.stack(group,dim=0).flatten().float() \ + for group in bit16_groups]# mock name 2 index in subgroup mock_opt.state = { flat_group: { 'exp_avg': 
torch.ones_like(flat_group), 'exp_avg_sq': torch.ones_like(flat_group) - } for flat_group in mock_opt.fp32_partitioned_groups_flat + } for flat_group in mock_opt.fp32_groups_flat_partition } + mock_opt.cpu_offload = False self.torch_opt = mock_opt - self.optimizer_mon = DeepSpeedZeroOptimizerStage3Mon() self.mock_monitor = setup_mock_monitor() - self.name2index = name2index - self.params2name = param_names - - def test_get_param_index(self): - name2indices = self.optimizer_mon.get_param_index(self.params2name, self.name2index, self.torch_opt) - expected_name2indices = { - 'param0_0': (0, 3, 0, None), - 'param0_1': (3, 6, 0, None), - 'param0_2': (6, 9, 0, None), - 'param0_3': (9, 12, 0, None), - 'param0_4': (12, 15, 0, None), - 'param1_0': (0, 3, 1, None), - 'param1_1': (3, 6, 1, None), - 'param1_2': (6, 9, 1, None), - 'param1_3': (9, 12, 1, None), - 'param1_4': (12, 15, 1, None) - } - self.assertDictEqual(name2indices, expected_name2indices) + self.optimizer_mon = DeepSpeedZeroOptimizerStage0Mon(mock_opt) + + def test_get_grad_for_param(self): + param = list(self.torch_opt.param_names.keys())[0] + group_idx = 0 + param_id = 2 + grad_expected = torch.randn_like(param) + self.torch_opt.fp32_groups_gradient_dict = [[0, 0, grad_expected, 0]] + grad = self.optimizer_mon.get_grad_for_param(param, group_idx, param_id) + + self.assertTrue(torch.equal(grad_expected, grad)) + + def test_fetch_grad(self): + self.torch_opt.fp32_groups_gradient_dict = [[torch.randn_like(param) for param in group] for group in self.optimizer_mon.bit16_groups] + self.mock_monitor.name2tag = {name:{MonitorConst.POST_GRAD: name} for name in self.torch_opt.param_names.values()} + result = self.optimizer_mon.fetch_grad(self.mock_monitor, self.torch_opt.param_names) + for _, name in self.torch_opt.param_names.items(): + group_index, param_id = [int(i) for i in name.replace('param','').split('_')] + self.assertTrue(torch.equal(result[name], 
self.torch_opt.fp32_groups_gradient_dict[group_index][param_id])) def test_fetch_mv(self): - name2indices = self.optimizer_mon.get_param_index(self.params2name, self.name2index, self.torch_opt) - result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt, self.params2name, name2indices) - + result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt.param_names) for param, name in self.torch_opt.param_names.items(): - self.assertTrue(torch.equal(result.exp_avg[name], torch.ones_like(param.ds_tensor).flatten())) - self.assertTrue(torch.equal(result.exp_avg_sq[name], torch.ones_like(param.ds_tensor).flatten())) + self.assertTrue(torch.equal(result.exp_avg[name], torch.ones_like(param).flatten())) + self.assertTrue(torch.equal(result.exp_avg_sq[name], torch.ones_like(param).flatten())) class TestDeepSpeedZeroOptimizerStage1or2Mon(unittest.TestCase): def setUp(self): - """Mock zero1/2 partitions - """ - bit16_groups, param_names, name2index, _ = setup_param_groups() + bit16_groups, param_names, param_slice_mappings, _ = setup_param_groups() + mock_opt = MagicMock() - mock_opt.groups_padding = [0, 0] - mock_opt.single_partition_of_fp32_groups = [torch.stack(group,dim=0).flatten() for group in bit16_groups] - mock_opt.partition_size = [p.numel() for p in mock_opt.single_partition_of_fp32_groups] + mock_opt.state_dict.return_value = { + 'param_slice_mappings': param_slice_mappings + } mock_opt.param_names = param_names mock_opt.bit16_groups = bit16_groups - mock_opt.averaged_gradients = {group_idx: [torch.randn_like(param) for param in group] - for group_idx, group in enumerate(mock_opt.single_partition_of_fp32_groups)} - - mock_opt.flatten = _flatten_dense_tensors - def flatten_dense_tensors_aligned(tensor_list, alignment): - return _flatten_dense_tensors(tensor_list) - mock_opt.flatten_dense_tensors_aligned = flatten_dense_tensors_aligned - + mock_opt.single_partition_of_fp32_groups = [torch.stack(group,dim=0).flatten().float() \ + for group in 
bit16_groups] + mock_opt.averaged_gradients = {group_idx: [torch.randn_like(param) for param in group] for group_idx, group in enumerate(bit16_groups)}# mock name 2 index in subgroup mock_opt.state = { flat_group: { 'exp_avg': torch.ones_like(flat_group), 'exp_avg_sq': torch.ones_like(flat_group) } for flat_group in mock_opt.single_partition_of_fp32_groups } + mock_opt.cpu_offload = False self.torch_opt = mock_opt - self.optimizer_mon = DeepSpeedZeroOptimizerStage1or2Mon() self.mock_monitor = setup_mock_monitor() - self.name2index = name2index - self.params2name = param_names + self.optimizer_mon = DeepSpeedZeroOptimizerStage1or2Mon(mock_opt) - def test_get_group_index(self): - self.fp32_length = [10, 20, 30, 40] - self.world_size = 4 - self.indexes = [5, 7, 12, 25, 35, 45] - self.expected_results = [(40, 0), (40, 0), (12, 1), (24, 2), (34, 2), (40, 0)] + def test_get_grad_for_param(self): + param = list(self.torch_opt.param_names.keys())[0] + group_idx = 0 + param_id = 2 + grad_expected = torch.randn_like(param) + self.torch_opt.averaged_gradients = [[0, 0, grad_expected, 0]] + grad = self.optimizer_mon.get_grad_for_param(param, group_idx, param_id) - results = [self.optimizer_mon.get_group_index(self.fp32_length, self.world_size, index) for index in self.indexes] - self.assertEqual(results, self.expected_results) + self.assertTrue(torch.equal(grad_expected, grad)) - def test_get_param_index(self): - name2indices = self.optimizer_mon.get_param_index(self.params2name, self.name2index, self.torch_opt) - for name, indices in name2indices.items(): - self.assertIn(name, self.params2name.values()) - self.assertIsInstance(indices, tuple) - self.assertEqual(len(indices), 4) + def test_fetch_grad(self): + self.mock_monitor.name2tag = {name:{MonitorConst.POST_GRAD: name} for name in self.torch_opt.param_names.values()} + result = self.optimizer_mon.fetch_grad(self.mock_monitor, self.torch_opt.param_names) + for param, name in self.torch_opt.param_names.items(): + 
group_index, param_id = [int(i) for i in name.replace('param','').split('_')] + self.assertTrue(torch.equal(result[name], self.torch_opt.averaged_gradients[group_index][param_id])) def test_fetch_mv(self): - # mock _fetch_mv_grad_in_adam - name2indices = self.optimizer_mon.get_param_index(self.params2name, self.name2index, self.torch_opt) - result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt, self.params2name, name2indices) - + result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt.param_names) for param, name in self.torch_opt.param_names.items(): self.assertTrue(torch.equal(result.exp_avg[name], torch.ones_like(param).flatten())) self.assertTrue(torch.equal(result.exp_avg_sq[name], torch.ones_like(param).flatten())) -class TestDeepSpeedZeroOptimizerStage0Mon(unittest.TestCase): +class TestDeepSpeedZeroOptimizerStage3Mon(unittest.TestCase): def setUp(self): - bit16_groups, param_names, name2index, param_slice_mapping = setup_param_groups() - mock_opt = MagicMock() + bit16_groups, param_names, _, grad_position = setup_param_groups() - mock_opt.bf16_groups = bit16_groups - mock_opt.fp32_groups_flat_partition = [torch.stack(group,dim=0).flatten() for group in bit16_groups] - mock_opt.optimizer.state = { + mock_opt = MagicMock() + mock_opt.param_names = param_names + mock_opt.fp16_groups = bit16_groups + mock_opt.fp32_partitioned_groups_flat = [torch.stack(group,dim=0).flatten().float() + for group in bit16_groups] + mock_opt.averaged_gradients = {group_idx: [torch.randn_like(param) for param in group] + for group_idx, group in enumerate(bit16_groups)} + mock_opt.grad_position = grad_position + mock_opt.get_param_id = lambda x: int(param_names[x].split('_')[1]) + mock_opt.state = { flat_group: { 'exp_avg': torch.ones_like(flat_group), 'exp_avg_sq': torch.ones_like(flat_group) - } for flat_group in mock_opt.fp32_groups_flat_partition + } for flat_group in mock_opt.fp32_partitioned_groups_flat } - mock_opt.state_dict.return_value = 
{'param_slice_mappings':param_slice_mapping} - mock_opt.param_names = param_names - self.torch_opt = mock_opt - self.optimizer_mon = DeepSpeedZeroOptimizerStage0Mon() + self.optimizer_mon = DeepSpeedZeroOptimizerStage3Mon(mock_opt) self.mock_monitor = setup_mock_monitor() - self.name2index = name2index - self.params2name = param_names - def test_fetch_mv(self): - result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt, self.params2name) + def test_fetch_grad(self): + self.mock_monitor.name2tag = {name:{MonitorConst.POST_GRAD: name} for name in self.torch_opt.param_names.values()} + result = self.optimizer_mon.fetch_grad(self.mock_monitor, self.torch_opt.param_names) + for param, name in self.torch_opt.param_names.items(): + group_index, param_id = [int(i) for i in name.replace('param','').split('_')] + self.assertTrue(torch.equal(result[name], self.torch_opt.averaged_gradients[group_index][param_id])) + def test_fetch_mv(self): + result = self.optimizer_mon.fetch_mv(self.mock_monitor, self.torch_opt.param_names) for param, name in self.torch_opt.param_names.items(): self.assertTrue(torch.equal(result.exp_avg[name], torch.ones_like(param).flatten())) self.assertTrue(torch.equal(result.exp_avg_sq[name], torch.ones_like(param).flatten())) @@ -381,48 +365,48 @@ class TestOptimizerMonFactory(unittest.TestCase): mix_optimizer_class = MagicMock() mix_optimizer_class.__name__ = "Float16OptimizerWithFloat16Params" mix_optimizer.__class__ = mix_optimizer_class - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(mix_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(mix_optimizer), MixPrecisionOptimizerMon) dis_optimizer = MagicMock() dis_optimizer_class = MagicMock() dis_optimizer_class.__name__ = "DistributedOptimizer" dis_optimizer.__class__ = dis_optimizer_class - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(dis_optimizer)[0], + 
self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(dis_optimizer), MegatronDistributedOptimizerMon) fp32_optimizer = MagicMock() fp32_optimizer_class = MagicMock() fp32_optimizer_class.__name__ = "FP32Optimizer" fp32_optimizer.__class__ = fp32_optimizer_class - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(fp32_optimizer)[0], - MegatronFP32OptimizerMon) + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(fp32_optimizer), + OptimizerMon) chained_optimizer = MagicMock() chained_optimizer_class = MagicMock() chained_optimizer_class.__name__ = "ChainedOptimizer" chained_optimizer.__class__ = chained_optimizer_class chained_optimizer.chained_optimizers = [mix_optimizer, mix_optimizer] - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(chained_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(chained_optimizer), MegatronChainedMixPrecisionOptimizerMon) chained_optimizer.chained_optimizers = [dis_optimizer, dis_optimizer] - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(chained_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(chained_optimizer), MegatronChainedDistributedOptimizerMon) deepspeed_optimizer = MagicMock() deepspeed_optimizer_class = MagicMock() deepspeed_optimizer_class.__name__ = "BF16_Optimizer" deepspeed_optimizer.__class__ = deepspeed_optimizer_class - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer), DeepSpeedZeroOptimizerStage0Mon) deepspeed_optimizer_class.__name__ = "DeepSpeedZeroOptimizer" - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer), DeepSpeedZeroOptimizerStage1or2Mon) deepspeed_optimizer_class.__name__ = "DeepSpeedZeroOptimizer_Stage3" - 
self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer)[0], + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(deepspeed_optimizer), DeepSpeedZeroOptimizerStage3Mon) - # 测试未知的优化器类型,应该返回DummyOptimizerMon + # 测试未知的优化器类型,应该返回OptimizerMon unknown_optimizer = MagicMock() unknown_optimizer_class = MagicMock() unknown_optimizer_class.__name__ = "unknown" unknown_optimizer.__class__ = unknown_optimizer_class - self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(unknown_optimizer)[0], DummyOptimizerMon) + self.assertIsInstance(OptimizerMonFactory.create_optimizer_mon(unknown_optimizer), OptimizerMon) if __name__ == '__main__': diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_config.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_config.py index f931c2b5c2bb3ecde578d12866fa7a56125d700e..f12cffd8da88ab3a42c471eb5a1c6197ef59d634 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_config.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_config.py @@ -181,7 +181,7 @@ class TestStatisticsConfig(unittest.TestCase): self.config.summary_mode = "invalid_mode" with self.assertRaises(Exception) as context: self.config._check_summary_mode() - self.assertIn(str(context.exception), "summary_mode is invalid") + self.assertIn(str(context.exception), "[msprobe] 无效参数:") def test_check_summary_mode_none(self): self.config.summary_mode = None diff --git a/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_debug_save.py b/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_debug_save.py index d68f28066fab1dbd453a91f34bfbc762949f3da0..cf7aec0ed1bc4147cd5ee1a56ecbc686bba33a54 100644 --- a/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_debug_save.py +++ b/debug/accuracy_tools/msprobe/test/pytorch_ut/test_pt_debug_save.py @@ -36,5 +36,47 @@ class TestPytorchDebuggerSave(TestCase): } common_config = CommonConfig(statistics_task_json) task_config = BaseConfig(statistics_task_json) - with 
patch("msprobe.pytorch.debugger.precision_debugger.parse_json_config", return_value=(common_config, task_config)): + with patch("msprobe.pytorch.debugger.precision_debugger.parse_json_config", + return_value=(common_config, task_config)): self.debugger = PrecisionDebugger() + + def test_forward_and_backward(self): + def forward_func(x, y): + PrecisionDebugger.save(x, "x_tensor") + return x * y + + x = torch.tensor([1.]) + y = torch.tensor([2.]) + x.requires_grad = True + y.requires_grad = True + result_json = { + "task": "statistics", + "level": "debug", + "framework": "pytorch", + "dump_data_dir": None, + "data": { + "x_tensor.0.debug": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": torch.Size([1]), + "requires_grad": True + }, + "x_tensor_grad.0.debug": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": torch.Size([1]), + "requires_grad": False + } + } + } + + loss = forward_func(x, y) + loss.backward() + + result = self.debugger.service.data_collector.data_writer.cache_debug + # Remove 'tensor_stat_index' from all entries in the data dictionary + for key in result["data"]: + if 'tensor_stat_index' in result["data"][key]: + del result["data"][key]['tensor_stat_index'] + + self.assertEqual(result, result_json) \ No newline at end of file diff --git a/debug/accuracy_tools/msprobe/test/run_ut.py b/debug/accuracy_tools/msprobe/test/run_ut.py index 06671c3d0d0e4440736712cb0718873280482781..c5ebc6e3f052b8ef7d16694c31c22d16f8ec930a 100644 --- a/debug/accuracy_tools/msprobe/test/run_ut.py +++ b/debug/accuracy_tools/msprobe/test/run_ut.py @@ -2,7 +2,6 @@ import os import shutil import subprocess import sys -import tempfile from msprobe.core.common.log import logger @@ -21,23 +20,6 @@ def run_ut(): shutil.rmtree(report_dir) os.makedirs(report_dir) - tmpdir = tempfile.mkdtemp() - sitecustomize_path = os.path.join(tmpdir, "sitecustomize.py") - - with open(sitecustomize_path, "w") as f: - f.write(""" -import mindspore - -class Distributed: 
- P2POp = None - -if not hasattr(mindspore.mint, 'distributed'): - setattr(mindspore.mint, 'distributed', Distributed()) - """) - - env = os.environ.copy() - env["PYTHONPATH"] = f"{tmpdir}:{env.get('PYTHONPATH', '')}" - pytest_cmd = [ "python3", "-m", "pytest", ut_path, diff --git a/debug/accuracy_tools/msprobe/test/visualization_ut/compare/test_multi_mapping.py b/debug/accuracy_tools/msprobe/test/visualization_ut/compare/test_multi_mapping.py deleted file mode 100644 index 7fe14317b2af7334693270d060c58af2dada4cbc..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/test/visualization_ut/compare/test_multi_mapping.py +++ /dev/null @@ -1,114 +0,0 @@ -import unittest -from msprobe.visualization.compare.multi_mapping import MultiMapping -from msprobe.visualization.graph.graph import Graph -from msprobe.visualization.graph.base_node import BaseNode -from msprobe.visualization.graph.node_op import NodeOp -from msprobe.visualization.utils import GraphConst - - -class TestMultiMapping(unittest.TestCase): - - def setUp(self): - pass - - def test_validate_yaml(self): - multi_mapping = MultiMapping.validate_yaml({}) - self.assertEqual(multi_mapping, {}) - - multi_mapping = MultiMapping.validate_yaml([]) - self.assertEqual(multi_mapping, {}) - - multi_mapping = MultiMapping.validate_yaml({'a': 'b'}) - self.assertEqual(multi_mapping, {('a',): ('b',)}) - - multi_mapping = MultiMapping.validate_yaml({'a': 'b c d'}) - self.assertEqual(multi_mapping, {('a',): ('b c d',)}) - - multi_mapping = MultiMapping.validate_yaml({'a': 'b, c, d'}) - self.assertEqual(multi_mapping, {('a',): ('b', 'd')}) - - def test_validate_ids_in_graph(self): - graph = Graph("model_name") - graph.node_map = {'node1': BaseNode(NodeOp.module, 'node1'), - 'node2': BaseNode(NodeOp.module, 'node2'), - 'node3': BaseNode(NodeOp.module, 'node3')} - result = MultiMapping.validate_ids_in_graph(['node1', 'node3'], graph) - self.assertTrue(result) - - result = 
MultiMapping.validate_ids_in_graph(['node1', 'node5'], graph) - self.assertFalse(result) - - def test_get_merged_nodes_data(self): - node_ids = ['Module.layer1.Linear.forward.0', 'Module.layer3.Linear.forward.0'] - dump_data = {'Module.layer1.Linear.forward.0': {'input_args': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 10], 'Max': 3.029174327850342, - 'Min': -3.405808448791504, 'Mean': -0.08760099112987518, 'Norm': 31.511741638183594, - 'requires_grad': False}], 'input_kwargs': {}, 'output': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 20], 'Max': 2.280996561050415, - 'Min': -2.6040544509887695, 'Mean': -0.05008987337350845, 'Norm': 26.9143123626709, - 'requires_grad': True}], 'parameters': { - 'weight': {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [20, 10], 'Max': 0.31333038210868835, - 'Min': -0.3147874176502228, 'Mean': -0.007642852142453194, 'Norm': 2.594407558441162, - 'requires_grad': True}, - 'bias': {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [20], 'Max': 0.3160688579082489, - 'Min': -0.31076428294181824, 'Mean': -0.05035770684480667, 'Norm': 0.8817608952522278, - 'requires_grad': True}}, 'is_recompute': False}, - 'Module.layer3.Linear.forward.0': {'input_args': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 30], 'Max': 1.8936877250671387, - 'Min': -1.60052490234375, 'Mean': -0.05550510436296463, 'Norm': 21.1639404296875, - 'requires_grad': True}], 'input_kwargs': {}, 'output': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 1], 'Max': 0.8175169229507446, - 'Min': -0.3781408369541168, 'Mean': 0.16728776693344116, 'Norm': 2.627354145050049, - 'requires_grad': True}], 'parameters': { - 'weight': {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [1, 30], - 'Max': 0.17745383083820343, 'Min': -0.11874081194400787, 'Mean': 0.013812449760735035, - 'Norm': 0.48705562949180603, 'requires_grad': True}, - 'bias': {'type': 
'torch.Tensor', 'dtype': 'torch.float32', 'shape': [1], 'Max': 0.1430283486843109, - 'Min': 0.1430283486843109, 'Mean': 0.1430283486843109, 'Norm': 0.1430283486843109, - 'requires_grad': True}}, 'is_recompute': False}} - multi_node_data = {'input_args': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 10], 'Max': 3.029174327850342, - 'Min': -3.405808448791504, 'Mean': -0.08760099112987518, 'Norm': 31.511741638183594, - 'requires_grad': False}], 'input_kwargs': {}, 'output': [ - {'type': 'torch.Tensor', 'dtype': 'torch.float32', 'shape': [100, 1], 'Max': 0.8175169229507446, - 'Min': -0.3781408369541168, 'Mean': 0.16728776693344116, 'Norm': 2.627354145050049, - 'requires_grad': True}]} - result = MultiMapping.get_merged_nodes_data(node_ids, dump_data, 'multi_node0') - self.assertEqual(result, {'multi_node0': multi_node_data}) - result = MultiMapping.get_merged_nodes_data([], dump_data, 'multi_node0') - self.assertEqual(result, {}) - - def test_merge_nodes(self): - graph = Graph('graph') - graph.add_node(NodeOp.module, 'Module.layer1.Linear.forward.0', graph.root) - graph.add_node(NodeOp.module, 'Module.layer2.Linear.forward.0', graph.root) - graph.add_node(NodeOp.module, 'Module.layer3.Linear.forward.0', graph.root) - result = MultiMapping.merge_nodes(['Module.layer1.Linear.forward.0', 'Module.layer3.Linear.forward.0'], - graph) - self.assertTrue(isinstance(result.multi_node, BaseNode)) - self.assertEqual(result.multi_node.subnodes, [graph.get_node('Module.layer1.Linear.forward.0'), - graph.get_node('Module.layer2.Linear.forward.0'), - graph.get_node('Module.layer3.Linear.forward.0')]) - self.assertEqual(result.multi_node.upnode, graph.get_node('graph')) - self.assertEqual(result.multi_node.id, GraphConst.MERGE_NODES + '.forward.0') - - result = MultiMapping.merge_nodes(['Module.layer1.Linear.forward.0'], graph) - self.assertEqual(result.multi_node, graph.get_node('Module.layer1.Linear.forward.0')) - - result = 
MultiMapping.merge_nodes(['Module.layer5.Linear.forward.0', 'Module.layer6.Linear.forward.0'], - graph) - self.assertIsNone(result.multi_node) - - result = MultiMapping.merge_nodes(['Module.layer3.Linear.forward.0', 'Module.layer1.Linear.forward.0'], - graph) - self.assertIsNone(result.multi_node) - - def test_split_mapping_str(self): - result = MultiMapping._split_mapping_str('a, b,c, d') - self.assertEqual(result, ('a', 'd')) - - result = MultiMapping._split_mapping_str('a') - self.assertEqual(result, ('a',)) - - result = MultiMapping._split_mapping_str('a b* c ') - self.assertEqual(result, ('a b* c',)) diff --git a/debug/accuracy_tools/msprobe/test/visualization_ut/graph/test_graph.py b/debug/accuracy_tools/msprobe/test/visualization_ut/graph/test_graph.py index 24f39cbb808234cfce6af02046755d3df3a1a5e4..f1c4ee95567e9021cc30e2e2f55702751e27bb94 100644 --- a/debug/accuracy_tools/msprobe/test/visualization_ut/graph/test_graph.py +++ b/debug/accuracy_tools/msprobe/test/visualization_ut/graph/test_graph.py @@ -54,7 +54,7 @@ class TestGraph(unittest.TestCase): matched_node, ancestors = Graph.match(graph_a, graph_a.get_node("node_id_a_1"), graph_b) self.assertIsNotNone(matched_node) self.assertEqual(ancestors, ['node_id_a']) - + def test_split_nodes_by_micro_step(self): nodes = [BaseNode(NodeOp.module, 'a.forward.0'), BaseNode(NodeOp.module, 'a.backward.0'), BaseNode(NodeOp.api_collection, 'apis.0'), BaseNode(NodeOp.module, 'a.forward.1'), diff --git a/debug/accuracy_tools/msprobe/test/visualization_ut/test_graph_service.py b/debug/accuracy_tools/msprobe/test/visualization_ut/test_graph_service.py index 0fe7047fb8aa3c1e7b0c9291b4892c9e75224a0d..f9ca5592aaa153bc0446443548c3e18329784a18 100644 --- a/debug/accuracy_tools/msprobe/test/visualization_ut/test_graph_service.py +++ b/debug/accuracy_tools/msprobe/test/visualization_ut/test_graph_service.py @@ -7,7 +7,7 @@ import argparse from dataclasses import dataclass from unittest.mock import patch -from 
msprobe.visualization.graph_service import _compare_graph, _build_graph, _compare_graph_ranks, \ +from msprobe.visualization.graph_service import _compare_graph_result, _build_graph_result, _compare_graph_ranks, \ _compare_graph_steps, _build_graph_ranks, _build_graph_steps, _graph_service_command, _graph_service_parser from msprobe.core.common.utils import CompareException @@ -21,7 +21,6 @@ class Args: overflow_check: bool = False fuzzy_match: bool = False complete_stack: bool = False - multi_mapping: str = None class TestGraphService(unittest.TestCase): @@ -46,30 +45,31 @@ class TestGraphService(unittest.TestCase): last_call_args = mock_log_info.call_args[0][0] self.assertIn(log_info, last_call_args) matches = re.findall(self.pattern, last_call_args) - self.assertTrue(os.path.exists(os.path.join(self.output, matches[0]))) + if matches: + self.assertTrue(os.path.exists(os.path.join(self.output, matches[0]))) @patch('msprobe.core.common.log.logger.info') - def test_compare_graph(self, mock_log_info): + def test_compare_graph_result(self, mock_log_info): args = Args(output_path=self.output, framework='pytorch') - result = _compare_graph(self.input_param, args) + result = _compare_graph_result(self.input_param, args) self.assertEqual(mock_log_info.call_count, 2) self.assertIsNotNone(result) args = Args(output_path=self.output, framework='mindspore') - result = _compare_graph(self.input_param, args) + result = _compare_graph_result(self.input_param, args) self.assertIsNotNone(result) args = Args(output_path=self.output, framework='pytorch', layer_mapping=self.layer_mapping) - result = _compare_graph(self.input_param, args) + result = _compare_graph_result(self.input_param, args) self.assertIsNotNone(result) args = Args(output_path=self.output, framework='pytorch', overflow_check=True) - result = _compare_graph(self.input_param, args) + result = _compare_graph_result(self.input_param, args) self.assertIsNotNone(result) @patch('msprobe.core.common.log.logger.info') - 
def test_build_graph(self, mock_log_info): - result = _build_graph(os.path.join(self.input, 'step0', 'rank0'), Args(overflow_check=True)) + def test_build_graph_result(self, mock_log_info): + result = _build_graph_result(os.path.join(self.input, 'step0', 'rank0'), Args(overflow_check=True)) self.assertEqual(mock_log_info.call_count, 1) self.assertIsNotNone(result) @@ -82,7 +82,7 @@ class TestGraphService(unittest.TestCase): } args = Args(output_path=self.output, framework='pytorch') _compare_graph_ranks(input_param, args) - self.assert_log_info(mock_log_info) + self.assert_log_info(mock_log_info, 'Successfully exported compare graph results.') input_param1 = { 'npu_path': os.path.join(self.input, 'step0'), @@ -102,7 +102,7 @@ class TestGraphService(unittest.TestCase): } args = Args(output_path=self.output, framework='pytorch') _compare_graph_steps(input_param, args) - self.assert_log_info(mock_log_info) + self.assert_log_info(mock_log_info, 'Successfully exported compare graph results.') input_param1 = { 'npu_path': self.input, @@ -116,12 +116,12 @@ class TestGraphService(unittest.TestCase): @patch('msprobe.core.common.log.logger.info') def test_build_graph_ranks(self, mock_log_info): _build_graph_ranks(os.path.join(self.input, 'step0'), Args(output_path=self.output)) - self.assert_log_info(mock_log_info, "Model graph built successfully, the result file is saved in") + self.assert_log_info(mock_log_info, "Successfully exported build graph results.") @patch('msprobe.core.common.log.logger.info') def test_build_graph_steps(self, mock_log_info): _build_graph_steps(self.input, Args(output_path=self.output)) - self.assert_log_info(mock_log_info, "Model graph built successfully, the result file is saved in") + self.assert_log_info(mock_log_info, "Successfully exported build graph results.") @patch('msprobe.core.common.log.logger.info') def test_graph_service_command(self, mock_log_info): @@ -130,7 +130,7 @@ class TestGraphService(unittest.TestCase): args = 
Args(input_path=self.output_json[0], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info) + self.assert_log_info(mock_log_info, 'Exporting compare graph result successfully, the result file is saved in') input_param1 = { 'npu_path': os.path.join(self.input, 'step0', 'rank0'), @@ -140,7 +140,7 @@ class TestGraphService(unittest.TestCase): json.dump(input_param1, f, indent=4) args = Args(input_path=self.output_json[1], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info, "Model graph built successfully, the result file is saved in") + self.assert_log_info(mock_log_info, "Model graph exported successfully, the result file is saved in") input_param2 = { 'npu_path': os.path.join(self.input, 'step0'), @@ -151,7 +151,7 @@ class TestGraphService(unittest.TestCase): json.dump(input_param2, f, indent=4) args = Args(input_path=self.output_json[2], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info) + self.assert_log_info(mock_log_info, 'Successfully exported compare graph results.') input_param3 = { 'npu_path': self.input, @@ -162,7 +162,7 @@ class TestGraphService(unittest.TestCase): json.dump(input_param3, f, indent=4) args = Args(input_path=self.output_json[3], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info) + self.assert_log_info(mock_log_info, 'Successfully exported compare graph results.') input_param4 = { 'npu_path': os.path.join(self.input, 'step0'), @@ -172,7 +172,7 @@ class TestGraphService(unittest.TestCase): json.dump(input_param4, f, indent=4) args = Args(input_path=self.output_json[4], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info, "Model graph built successfully, the result file is saved in") + self.assert_log_info(mock_log_info, "Successfully exported 
build graph results.") input_param5 = { 'npu_path': self.input, @@ -182,7 +182,7 @@ class TestGraphService(unittest.TestCase): json.dump(input_param5, f, indent=4) args = Args(input_path=self.output_json[5], output_path=self.output, framework='pytorch') _graph_service_command(args) - self.assert_log_info(mock_log_info, "Model graph built successfully, the result file is saved in") + self.assert_log_info(mock_log_info, "Successfully exported build graph results.") input_param6 = { 'npu_path': self.input, diff --git a/debug/accuracy_tools/msprobe/test/visualization_ut/test_visualization_utils.py b/debug/accuracy_tools/msprobe/test/visualization_ut/test_visualization_utils.py index e5b0afaadf9def910c248b945ad15084300a65c0..41ea145208dc658a83bb5c791d6b05a0abb30616 100644 --- a/debug/accuracy_tools/msprobe/test/visualization_ut/test_visualization_utils.py +++ b/debug/accuracy_tools/msprobe/test/visualization_ut/test_visualization_utils.py @@ -1,7 +1,7 @@ import os import unittest from msprobe.visualization.utils import (load_json_file, load_data_json_file, str2float, check_directory_content, - GraphConst) + GraphConst, SerializableArgs) class TestMappingConfig(unittest.TestCase): @@ -37,6 +37,21 @@ class TestMappingConfig(unittest.TestCase): input_type = check_directory_content(os.path.join(self.input, "step0", "rank0")) self.assertEqual(input_type, GraphConst.FILES) + def test_serializable_args(self): + class TmpArgs: + def __init__(self, a, b, c): + self.a = a + self.b = b + self.c = c + input_args1 = TmpArgs('a', 123, [1, 2, 3]) + serializable_args1 = SerializableArgs(input_args1) + self.assertEqual(serializable_args1.__dict__, input_args1.__dict__) + input_args2 = TmpArgs('a', 123, lambda x: print(x)) + serializable_args2 = SerializableArgs(input_args2) + self.assertNotEqual(serializable_args2.__dict__, input_args2.__dict__) + + + if __name__ == '__main__': unittest.main() diff --git a/debug/accuracy_tools/msprobe/visualization/builder/graph_builder.py 
b/debug/accuracy_tools/msprobe/visualization/builder/graph_builder.py index bec99d675f4b1238fde3905037ec5f7fb5a0c8fe..78b4b83cb17c99a80dfbc6eeb9ceafba1543fedf 100644 --- a/debug/accuracy_tools/msprobe/visualization/builder/graph_builder.py +++ b/debug/accuracy_tools/msprobe/visualization/builder/graph_builder.py @@ -14,9 +14,11 @@ # limitations under the License. import re +from dataclasses import dataclass from msprobe.core.common.const import Const from msprobe.core.common.file_utils import load_json, save_json +from msprobe.core.common.utils import load_stack_json from msprobe.visualization.builder.msprobe_adapter import get_input_output from msprobe.visualization.builder.msprobe_adapter import op_patterns from msprobe.visualization.graph.graph import Graph @@ -44,7 +46,7 @@ class GraphBuilder: """ construct_dict = load_json(construct_path) dump_dict = load_json(data_path) - stack_dict = load_json(stack_path) + stack_dict = load_stack_json(stack_path) if not complete_stack: GraphBuilder._simplify_stack(stack_dict) data_dict = dump_dict.get(GraphConst.DATA_KEY, {}) @@ -61,10 +63,10 @@ class GraphBuilder: """ result = {} if config.graph_b: - result[GraphConst.JSON_NPU_KEY] = config.graph_n.to_dict() - result[GraphConst.JSON_BENCH_KEY] = config.graph_b.to_dict() + result[GraphConst.JSON_NPU_KEY] = config.graph_n.to_dict(config.compare_mode) + result[GraphConst.JSON_BENCH_KEY] = config.graph_b.to_dict(config.compare_mode) else: - result = config.graph_n.to_dict() + result = config.graph_n.to_dict(config.compare_mode) if config.tool_tip: result[GraphConst.JSON_TIP_KEY] = config.tool_tip if config.node_colors: @@ -277,7 +279,7 @@ class GraphBuilder: class GraphExportConfig: def __init__(self, graph_n, graph_b=None, tool_tip=None, node_colors=None, micro_steps=None, task='', - overflow_check=False): + overflow_check=False, compare_mode=None): self.graph_n = graph_n self.graph_b = graph_b self.tool_tip = tool_tip @@ -285,3 +287,21 @@ class GraphExportConfig: 
self.micro_steps = micro_steps self.task = task self.overflow_check = overflow_check + self.compare_mode = compare_mode + + +@dataclass +class GraphInfo: + graph: Graph + construct_path: str + data_path: str + stack_path: str + + +@dataclass +class BuildGraphTaskInfo: + graph_info_n: GraphInfo + graph_info_b: GraphInfo + npu_rank: str + bench_rank: str + time_str: str diff --git a/debug/accuracy_tools/msprobe/visualization/builder/msprobe_adapter.py b/debug/accuracy_tools/msprobe/visualization/builder/msprobe_adapter.py index 2f219ce099c83254051ecb3d566b1bc1529e3f99..2b7f7886535068824e782c8cfab1b6aa283198e5 100644 --- a/debug/accuracy_tools/msprobe/visualization/builder/msprobe_adapter.py +++ b/debug/accuracy_tools/msprobe/visualization/builder/msprobe_adapter.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -12,13 +12,16 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+ import re -from msprobe.core.compare.acc_compare import read_op, merge_tensor, get_accuracy + +from msprobe.core.compare.acc_compare import ModeConfig +from msprobe.core.compare.multiprocessing_compute import CompareRealData +from msprobe.core.compare.utils import read_op, merge_tensor, get_accuracy, make_result_table from msprobe.core.common.utils import set_dump_path, get_dump_mode from msprobe.visualization.utils import GraphConst from msprobe.core.common.const import Const -from msprobe.core.compare.acc_compare import ModeConfig -from msprobe.core.common.file_utils import load_json + # Rules for parsing a node name into the corresponding NodeOp op_patterns = [ @@ -54,38 +57,11 @@ def run_real_data(dump_path_param, csv_path, framework, is_cross_frame=False): mode_config = ModeConfig(stack_mode=False, auto_analyze=True, fuzzy_match=False, dump_mode=Const.ALL) if framework == Const.PT_FRAMEWORK: - from msprobe.pytorch.compare.pt_compare import PTComparator - return PTComparator(mode_config).do_multi_process(dump_path_param, csv_path) - else: - from msprobe.mindspore.compare.ms_compare import MSComparator, MappingConfig - ms_comparator = MSComparator(mode_config, MappingConfig()) - ms_comparator.cross_frame = is_cross_frame - return ms_comparator.do_multi_process(dump_path_param, csv_path) - - -def run_real_data_single(op_names, op_name_mapping_dict, input_param, framework, is_cross_frame=False): - """ - Generate real data in a single process - Args: - op_names: [npu_op_name, bench_op_name], the NPU_Name and Bench_Name in the Excel file, e.g. Functional.conv2d.0.forward.input.3.0 - op_name_mapping_dict: mapping between op_name and the npy or pt file - input_param: parameters such as npu_json_path/bench_json_path/stack_json_path - framework: framework type, pytorch or mindspore - is_cross_frame: whether to run a cross-framework comparison; only mindspore versus pytorch is supported, with pytorch as the benchmark - """ - if not isinstance(op_names, list) or len(op_names) != 2: - return [] - mode_config = ModeConfig(stack_mode=False, auto_analyze=True, fuzzy_match=False, dump_mode=Const.ALL) - set_dump_path(input_param) - - if framework == Const.PT_FRAMEWORK: - from 
msprobe.pytorch.compare.pt_compare import PTComparator - return PTComparator(mode_config).compare_by_op(op_names[0], op_names[1], op_name_mapping_dict, input_param) + from msprobe.pytorch.compare.pt_compare import read_real_data + return CompareRealData(read_real_data, mode_config, is_cross_frame).do_multi_process(dump_path_param, csv_path) else: - from msprobe.mindspore.compare.ms_compare import MSComparator, MappingConfig - ms_comparator = MSComparator(mode_config, MappingConfig()) - ms_comparator.cross_frame = is_cross_frame - return ms_comparator.compare_by_op(op_names[0], op_names[1], op_name_mapping_dict, input_param) + from msprobe.mindspore.compare.ms_compare import read_real_data + return CompareRealData(read_real_data, mode_config, is_cross_frame).do_multi_process(dump_path_param, csv_path) def get_input_output(node_data, node_id): @@ -145,11 +121,13 @@ def compare_data_fuzzy(data_dict_list1, data_dict_list2): return True -def format_node_data(data_dict, node_id=None): +def format_node_data(data_dict, node_id=None, compare_mode=None): """ Remove fields from the node data that do not need to be displayed """ del_list = ['requires_grad', 'full_op_name'] + if GraphConst.MD5_COMPARE != compare_mode: + del_list.append(Const.MD5) if node_id and GraphConst.BATCH_P2P in node_id: del_list.extend(['op', 'peer', 'tag', 'group_id']) for _, value in data_dict.items(): @@ -197,7 +175,7 @@ def _format_decimal_string(s): """ Use a regular expression to match strings containing digits, a decimal point, and an optional percent sign """ - pattern = re.compile(r'\d{1,20}\.\d{1,20}%?') + pattern = re.compile(r'^\d{1,20}\.\d{1,20}%?$') matches = pattern.findall(s) for match in matches: is_percent = match.endswith('%') @@ -252,3 +230,12 @@ def _format_data(data_dict): if all_null: data_dict.clear() data_dict[GraphConst.VALUE] = GraphConst.NULL + + +def get_csv_df(stack_mode, csv_data, compare_mode): + """ + Call the acc interface to write the csv + """ + + dump_mode = GraphConst.GRAPHCOMPARE_MODE_TO_DUMP_MODE_TO_MAPPING.get(compare_mode) + return make_result_table(csv_data, dump_mode, stack_mode) diff --git 
a/debug/accuracy_tools/msprobe/visualization/compare/graph_comparator.py b/debug/accuracy_tools/msprobe/visualization/compare/graph_comparator.py index 91547deaccc7c36b036503dc945cdfe433a752ad..95982658d2f431463476912e9c229b281f817861 100644 --- a/debug/accuracy_tools/msprobe/visualization/compare/graph_comparator.py +++ b/debug/accuracy_tools/msprobe/visualization/compare/graph_comparator.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -14,15 +14,11 @@ # limitations under the License. import re -from msprobe.visualization.builder.msprobe_adapter import compare_node, get_compare_mode, run_real_data, \ - run_real_data_single -from msprobe.visualization.utils import GraphConst, load_json_file, load_data_json_file, get_csv_df +from msprobe.visualization.builder.msprobe_adapter import compare_node, get_compare_mode, run_real_data, get_csv_df +from msprobe.visualization.utils import GraphConst, load_json_file, load_data_json_file from msprobe.visualization.graph.graph import Graph, NodeOp from msprobe.visualization.compare.mode_adapter import ModeAdapter -from msprobe.core.common.const import Const, CompareConst -from msprobe.core.common.log import logger -from msprobe.core.common.file_utils import load_yaml -from msprobe.visualization.compare.multi_mapping import MultiMapping +from msprobe.core.common.const import Const from msprobe.core.common.decorator import recursion_depth_decorator @@ -50,74 +46,6 @@ class GraphComparator: self._compare_nodes(self.graph_n.root) self._postcompare() - def multi_compare(self, multi_yaml_path): - """ - Many-to-many node comparison; a mapping between n nodes and m nodes must be established - Args: - multi_yaml_path: path to the mapping file - """ - multi_mapping = MultiMapping.validate_yaml(load_yaml(multi_yaml_path)) - if not multi_mapping: - logger.warning( - f'The multi mapping file {multi_yaml_path} content is incorrect, and the 
mapping is not effective.') - return - if self.ma.compare_mode == GraphConst.REAL_DATA_COMPARE: - # Get the indices of the real-data metrics in the real-data table header - id_list = [CompareConst.COMPARE_RESULT_HEADER.index(x) for x in CompareConst.ALL_COMPARE_INDEX] - for node_n_ids, node_b_ids in multi_mapping.items(): - if not MultiMapping.validate_ids_in_graph(node_n_ids, self.graph_n): - continue - if not MultiMapping.validate_ids_in_graph(node_b_ids, self.graph_b, GraphConst.JSON_BENCH_KEY): - continue - merged_items_n = MultiMapping.merge_nodes(node_n_ids, self.graph_n) - merged_items_b = MultiMapping.merge_nodes(node_b_ids, self.graph_b) - node_n = merged_items_n.multi_node - node_n_data = self.data_n_dict - node_b = merged_items_b.multi_node - node_b_data = self.data_b_dict - - if node_n.op == NodeOp.multi_collection: - node_n_data = MultiMapping.get_merged_nodes_data(node_n_ids, self.data_n_dict, node_n.id) - if node_b.op == NodeOp.multi_collection: - node_b_data = MultiMapping.get_merged_nodes_data(node_b_ids, self.data_b_dict, node_b.id) - - node = self._compare_node_with_mapping(node_n, {node_n.id: node_b.id}) - if not node: - continue - compare_result_list = compare_node([node_n.id, node_b.id], - [node_n_data, node_b_data], - self.stack_json_data, self.ma.compare_mode) - if not compare_result_list: - continue - # In real-data mode, compare_result_list has no accuracy metrics; the real-data comparison interface must be called to obtain them - if self.ma.compare_mode == GraphConst.REAL_DATA_COMPARE: - for compare_result in compare_result_list: - # Prepare the parameters required by the real-data comparison interface - full_param_name_n = compare_result[0] - full_param_name_b = compare_result[1] - - data_name_n = MultiMapping.get_dump_data_name(merged_items_n, full_param_name_n) - data_name_b = MultiMapping.get_dump_data_name(merged_items_b, full_param_name_b) - op_name_mapping_dict = {full_param_name_n: [data_name_n, data_name_b]} - - real_compare_result = run_real_data_single([full_param_name_n, full_param_name_b], - op_name_mapping_dict, self.dump_path_param, - self.framework, self.is_cross_framework) - if 
len(real_compare_result) < len(id_list): - continue - for i, index in enumerate(id_list): - # Using the indices, insert the real-data metrics at the corresponding header positions - compare_result[index] = real_compare_result[i] - compare_dict = {} - for item in compare_result_list: - if not isinstance(item, (list, tuple)) or not item: - continue - compare_dict[MultiMapping.replace_param_name(item[0], node_n.id)] = item - precision_index, _ = self.ma.parse_result(node_n, [compare_dict]) - node_n.data[GraphConst.JSON_INDEX_KEY] = precision_index - else: - self.add_compare_result_to_node(node_n, compare_result_list) - def add_compare_result_to_node(self, node, compare_result_list): """ Add the comparison results to the node's input and output data @@ -143,54 +71,56 @@ class GraphComparator: node.data[GraphConst.JSON_INDEX_KEY] = precision_index node.data.update(other_dict) - @recursion_depth_decorator('GraphComparator._compare_nodes', max_depth=MAX_DEPTH) - def _compare_nodes(self, node_n): + def _compare_nodes(self, node_root): """ - Recursively traverse the nodes of the NPU tree; if a node with the same name is found in Bench, check their ancestors and parameter information, and if they are consistent, compare the accuracy data + Traverse the nodes of the NPU tree; if a node with the same name is found in Bench, check their ancestors and parameter information, and if they are consistent, compare the accuracy data Pre-order traversal is used here: by the time a node is compared, the nodes preceding it have already been matched, which provides important information for subsequent fuzzy matching """ - if self.layer_mapping: - node_b = self._compare_node_with_mapping(node_n, self.mapping_dict) - else: - node_b, ancestors = Graph.match(self.graph_n, node_n, self.graph_b) - if node_b: - ancestors.append(node_b.id) - node_n.add_link(node_b, ancestors) - if node_b: - # Real-data comparison yields only basic information without accuracy metrics; the multi-process comparison interface must be called - self._get_and_add_result(node_n, node_b) - for subnode in node_n.subnodes: - self._compare_nodes(subnode) - - @recursion_depth_decorator('GraphComparator._compare_nodes_fuzzy', max_depth=MAX_DEPTH) - def _compare_nodes_fuzzy(self, node_n): - if node_n.op != NodeOp.function_api: - # Modules go through fuzzy matching - node_b, ancestors_n, ancestors_b = Graph.fuzzy_match(node_n, self.graph_b.node_map.get(node_n.id)) + def compare_single_node(node_n): + if self.layer_mapping: + node_b, ancestors_n, ancestors_b = Graph.mapping_match(node_n, self.graph_b, self.mapping_dict) + if node_b: + 
ancestors_n.append(node_n.id) + ancestors_b.append(node_b.id) + node_n.matched_node_link = ancestors_b + node_b.matched_node_link = ancestors_n + else: + node_b, ancestors = Graph.match(self.graph_n, node_n, self.graph_b) + if node_b: + ancestors.append(node_b.id) + node_n.add_link(node_b, ancestors) if node_b: - self._process_matched_nodes(node_n, node_b, ancestors_n, ancestors_b) - # For all APIs in the two matched modules, ignore the dump call count and match by identical name plus call order within the module - recount_result_n = self._recount_api_node(node_n) - recount_result_b = self._recount_api_node(node_b) - for recount_node_id, node_id_n in recount_result_n.items(): - api_node_n = self.graph_n.node_map.get(node_id_n) - if not api_node_n: - continue - api_node_b, ancestors_n, ancestors_b = Graph.fuzzy_match( - api_node_n, self.graph_b.node_map.get(recount_result_b.get(recount_node_id))) - if api_node_b: - self._process_matched_nodes(api_node_n, api_node_b, ancestors_n, ancestors_b) - for sub_node in node_n.subnodes: - self._compare_nodes_fuzzy(sub_node) - - def _compare_node_with_mapping(self, node_n, mapping_dict): - node_b, ancestors_n, ancestors_b = Graph.mapping_match(node_n, self.graph_b, mapping_dict) - if node_b: - ancestors_n.append(node_n.id) - ancestors_b.append(node_b.id) - node_n.matched_node_link = ancestors_b - node_b.matched_node_link = ancestors_n - return node_b + # Real-data comparison yields only basic information without accuracy metrics; the multi-process comparison interface must be called + self._get_and_add_result(node_n, node_b) + node_list.extend(node_n.subnodes) + + node_list = [node_root] + while node_list: + compare_single_node(node_list.pop(0)) + + def _compare_nodes_fuzzy(self, node_root): + def compare_single_nodes_fuzzy(node_n): + if node_n.op != NodeOp.function_api: + # Modules go through fuzzy matching + node_b, ancestors_n, ancestors_b = Graph.fuzzy_match(node_n, self.graph_b.node_map.get(node_n.id)) + if node_b: + self._process_matched_nodes(node_n, node_b, ancestors_n, ancestors_b) + # For all APIs in the two matched modules, ignore the dump call count and match by identical name plus call order within the module + recount_result_n = self._recount_api_node(node_n) + recount_result_b = 
self._recount_api_node(node_b) + for recount_node_id, node_id_n in recount_result_n.items(): + api_node_n = self.graph_n.node_map.get(node_id_n) + if not api_node_n: + continue + api_node_b, ancestors_n, ancestors_b = Graph.fuzzy_match( + api_node_n, self.graph_b.node_map.get(recount_result_b.get(recount_node_id))) + if api_node_b: + self._process_matched_nodes(api_node_n, api_node_b, ancestors_n, ancestors_b) + node_list.extend(node_n.subnodes) + + node_list = [node_root] + while node_list: + compare_single_nodes_fuzzy(node_list.pop(0)) def _parse_param(self, dump_path_param, output_path): self.dump_path_param = dump_path_param diff --git a/debug/accuracy_tools/msprobe/visualization/compare/mode_adapter.py b/debug/accuracy_tools/msprobe/visualization/compare/mode_adapter.py index 7b961c4e8cdcb0b2d636d2782d3a9cce851a982f..dd6f4fb1e63106001e8f22a3cb68e0ea47cbb345 100644 --- a/debug/accuracy_tools/msprobe/visualization/compare/mode_adapter.py +++ b/debug/accuracy_tools/msprobe/visualization/compare/mode_adapter.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+import math import json from msprobe.core.common.const import CompareConst, Const from msprobe.visualization.utils import ToolTip, GraphConst, str2float @@ -24,6 +25,12 @@ class ModeAdapter: self.csv_data = [] self.compare_nodes = [] + @staticmethod + def _is_invalid(value): + if not isinstance(value, float): + return False + return math.isnan(value) or math.isinf(value) + @staticmethod def _add_md5_compare_data(node_data, compare_data_dict): precision_index = GraphConst.MAX_INDEX_KEY @@ -48,6 +55,8 @@ class ModeAdapter: for key, value in node_data.items(): if not isinstance(value, dict): continue + if value.get(Const.MAX) is None: + continue compare_data = compare_data_dict.get(key) if compare_data: headers = CompareConst.COMPARE_RESULT_HEADER @@ -66,9 +75,13 @@ class ModeAdapter: if thousandth is not None: numbers.append(thousandth) node_data[key] = value + if ModeAdapter._is_invalid(value.get(Const.MAX)) or ModeAdapter._is_invalid(value.get(Const.MIN)): + numbers.append(CompareConst.N_A) # Abnormal case where both thousandth metrics are None if not numbers: min_thousandth = None + elif CompareConst.N_A in numbers: + min_thousandth = CompareConst.N_A else: min_thousandth = min(numbers + [min_thousandth]) return min_thousandth @@ -80,6 +93,8 @@ class ModeAdapter: for key, data_info in node_data.items(): if not isinstance(data_info, dict): continue + if data_info.get(Const.MAX) is None: + continue compare_data = compare_data_dict.get(key) if compare_data: # Columns corresponding to the comparison result csv @@ -91,6 +106,8 @@ class ModeAdapter: relative_err = str2float(data_info.get(item)) max_relative_err = max(max_relative_err, relative_err) node_data[key] = data_info + if ModeAdapter._is_invalid(data_info.get(Const.MAX)) or ModeAdapter._is_invalid(data_info.get(Const.MIN)): + max_relative_err = GraphConst.MAX_INDEX_KEY max_relative_err = 1 if max_relative_err > 1 else max_relative_err return max_relative_err @@ -132,7 +149,11 @@ class ModeAdapter: ModeAdapter._check_list_len(compare_data_dict_list, 1) min_thousandth_in = 
ModeAdapter._add_real_compare_data(node.input_data, compare_data_dict_list[0]) min_thousandth_out = ModeAdapter._add_real_compare_data(node.output_data, compare_data_dict_list[0]) - if min_thousandth_in is not None and min_thousandth_out is not None: + if CompareConst.N_A == min_thousandth_out: + change_percentage = GraphConst.MAX_INDEX_KEY + elif CompareConst.N_A == min_thousandth_in: + change_percentage = GraphConst.MIN_INDEX_KEY + elif min_thousandth_in is not None and min_thousandth_out is not None: change_percentage = min_thousandth_in - min_thousandth_out else: change_percentage = GraphConst.MIN_INDEX_KEY diff --git a/debug/accuracy_tools/msprobe/visualization/compare/multi_mapping.py b/debug/accuracy_tools/msprobe/visualization/compare/multi_mapping.py deleted file mode 100644 index bcc7c0f31351a52e40acfd6824c6b2f8f49ffd52..0000000000000000000000000000000000000000 --- a/debug/accuracy_tools/msprobe/visualization/compare/multi_mapping.py +++ /dev/null @@ -1,173 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from dataclasses import dataclass - -from msprobe.core.common.const import Const -from msprobe.core.common.log import logger -from msprobe.visualization.utils import GraphConst -from msprobe.visualization.graph.graph import NodeOp, BaseNode -from msprobe.core.compare.utils import get_name_and_state - - -@dataclass -class MergedItems: - multi_node: BaseNode = None - start_node: BaseNode = None - end_node: BaseNode = None - - -class MultiMapping: - - @staticmethod - def validate_yaml(yaml_file): - multi_mapping = {} - if not yaml_file: - logger.warning(f'The multi mapping file cannot be empty.') - return multi_mapping - if not isinstance(yaml_file, dict): - logger.warning(f'The multi mapping file format must be a dict.') - return multi_mapping - for key, value in yaml_file.items(): - multi_mapping[MultiMapping._split_mapping_str(key)] = MultiMapping._split_mapping_str(value) - return multi_mapping - - @staticmethod - def validate_ids_in_graph(node_ids, graph, graph_type=GraphConst.JSON_NPU_KEY): - in_graph = True - for node_id in node_ids: - if node_id not in graph.node_map: - logger.warning(f'{node_id} does not exist in the {graph_type} graph, and the mapping is not effective.') - in_graph = False - return in_graph - - @staticmethod - def get_merged_nodes_data(node_ids: (list, tuple), dump_data: dict, multi_node_id: str): - if len(node_ids) < 2: - return {} - multi_node_data = {} - for k, v in dump_data.get(node_ids[0], {}).items(): - if k in [Const.INPUT, Const.INPUT_ARGS, Const.INPUT_KWARGS]: - multi_node_data[k] = v - for k, v in dump_data.get(node_ids[-1], {}).items(): - if k == Const.OUTPUT: - multi_node_data[k] = v - return {multi_node_id: multi_node_data} - - @staticmethod - def replace_param_name(param_name: str, multi_node_id): - try: - api, _ = get_name_and_state(param_name) - except Exception: - return param_name - return param_name.replace(api, multi_node_id + Const.SEP) - - @staticmethod - def merge_nodes(node_ids, graph): - """ - 
Given a list of node names, merge the nodes in the list into a single node, using the input data of the first node as the merged node's input and the output data of the last node as the merged node's output - Args: - node_ids: list of node names - graph: the graph - - Returns: the merged node, the first node, and the last node - - """ - if not node_ids or not isinstance(node_ids, (list, tuple)): - return MergedItems() - if len(node_ids) == 1: - return MergedItems(graph.get_node(node_ids[0])) - # From the first and last node ids configured in the mapping file, get the list of all node ids between them - node0 = graph.get_node(node_ids[0]) - node1 = graph.get_node(node_ids[-1]) - if not node0 or not node1: - return MergedItems() - current_node_list = node0.upnode.subnodes - - start_index = end_index = 0 - for i, node in enumerate(current_node_list): - if node.id == node_ids[0]: - start_index = i - elif node.id == node_ids[-1]: - end_index = i - - if start_index > end_index: - logger.warning(f'{node_ids[0]} and {node_ids[-1]} are in the wrong order, {node_ids[0]} should come first, ' - f'and the mapping is not effective.') - return MergedItems() - - current_node_list = current_node_list[start_index:end_index + 1] - - # Create a new node as the collection of the mapped nodes; its input is the first node's input and its output is the last node's output - multi_node_name = GraphConst.MERGE_NODES + Const.SEP + Const.FORWARD \ - if Const.SEP + Const.FORWARD + Const.SEP in node0.id \ - else GraphConst.MERGE_NODES + Const.SEP + Const.BACKWARD - multi_node_id = graph.add_node(NodeOp.multi_collection, multi_node_name, id_accumulation=True) - multi_node = graph.get_node(multi_node_id) - multi_node.subnodes = current_node_list - multi_node.upnode = node0.upnode - # Re-establish the parent-child relationships - for node in current_node_list: - node.upnode = multi_node - - multi_node.upnode.subnodes[start_index:end_index + 1] = [multi_node] - - # Add input and output data to the node; parameters are not added because their shapes differ between many-to-many nodes and cannot be compared - input_data = {} - output_data = {} - for key, value in node0.input_data.items(): - if any(s in key for s in [Const.INPUT, Const.INPUT_ARGS, Const.INPUT_KWARGS]): - input_data[MultiMapping.replace_param_name(key, multi_node_id)] = value - for key, value in node1.output_data.items(): - output_data[MultiMapping.replace_param_name(key, multi_node_id)] = 
value - multi_node.input_data = input_data - multi_node.output_data = output_data - - return MergedItems(multi_node, node0, node1) - - @staticmethod - def get_dump_data_name(merged_items, full_param_name): - """ - Given a node parameter name, get the real data name of this parameter from the merged node information - Args: - merged_items: merged node information - full_param_name: parameter name, e.g. Module.layer.Linear.forward.0.input.0 - - Returns: real data name, e.g. Module.layer.Linear.forward.0.input.0.pt - - """ - try: - _, state = get_name_and_state(full_param_name) - except Exception: - return "-1" - node = merged_items.multi_node - # For a merged node, the on-disk data_name of its real data must be obtained from the merged node's first and last nodes - if node.op == NodeOp.multi_collection: - data = merged_items.end_node.output_data \ - if Const.OUTPUT == state \ - else merged_items.start_node.input_data - else: - data = node.output_data \ - if Const.OUTPUT == state \ - else node.input_data - - return data.get(full_param_name, {}).get("data_name", "-1") - - @staticmethod - def _split_mapping_str(x: str): - if Const.COMMA in x: - split_list = x.split(Const.COMMA) - return split_list[0].strip(), split_list[-1].strip() - return (x.strip(),) diff --git a/debug/accuracy_tools/msprobe/visualization/graph/base_node.py b/debug/accuracy_tools/msprobe/visualization/graph/base_node.py index fd1541b87bf5e7ba54a95089646683c41f546ca6..dee86180586670d6f9c0c4672375479e805f818b 100644 --- a/debug/accuracy_tools/msprobe/visualization/graph/base_node.py +++ b/debug/accuracy_tools/msprobe/visualization/graph/base_node.py @@ -87,15 +87,15 @@ class BaseNode: self.matched_node_link = ancestors node.matched_node_link = ancestors - def to_dict(self): + def to_dict(self, compare_mode=None): """ Output data """ result = { 'id': self.id, 'node_type': self.op.value, - 'output_data': format_node_data(self.output_data, self.id), - 'input_data': format_node_data(self.input_data, self.id), + 'output_data': format_node_data(self.output_data, self.id, compare_mode), + 'input_data': format_node_data(self.input_data, self.id, compare_mode), 'upnode': self.upnode.id if self.upnode 
else 'None', 'subnodes': [node.id for node in self.subnodes], 'matched_node_link': self.matched_node_link, diff --git a/debug/accuracy_tools/msprobe/visualization/graph/graph.py b/debug/accuracy_tools/msprobe/visualization/graph/graph.py index 90574174144ecc6b53033871dceda2bc53c87ba5..5bcad6446ca29ca09a986c315690bbfe2c26d36f 100644 --- a/debug/accuracy_tools/msprobe/visualization/graph/graph.py +++ b/debug/accuracy_tools/msprobe/visualization/graph/graph.py @@ -146,7 +146,7 @@ class Graph: """ return self.node_map.get(node_id, None) - def to_dict(self): + def to_dict(self, compare_mode=None): """ 用于数据输出 """ @@ -155,7 +155,7 @@ class Graph: result[GraphConst.JSON_DATA_KEY] = self.data_path result[GraphConst.JSON_NODE_KEY] = {} for node_id in self.node_map: - info = self.node_map.get(node_id).to_dict() + info = self.node_map.get(node_id).to_dict(compare_mode) result[GraphConst.JSON_NODE_KEY][node_id] = info return result diff --git a/debug/accuracy_tools/msprobe/visualization/graph/node_op.py b/debug/accuracy_tools/msprobe/visualization/graph/node_op.py index 12072fff032ee1e26c5e8274cd1676679d531331..85d7e65bc528298d596398e10e4f9b9d2a35882f 100644 --- a/debug/accuracy_tools/msprobe/visualization/graph/node_op.py +++ b/debug/accuracy_tools/msprobe/visualization/graph/node_op.py @@ -22,7 +22,6 @@ from msprobe.core.common.log import logger class NodeOp(Enum): module = 0 function_api = 1 - multi_collection = 8 api_collection = 9 @staticmethod diff --git a/debug/accuracy_tools/msprobe/visualization/graph_service.py b/debug/accuracy_tools/msprobe/visualization/graph_service.py index b4ab43141c8b0b061edd07cdd35db85b1778e725..b14ccab0386be92c0cdce7ebc89854a9ce17aa92 100644 --- a/debug/accuracy_tools/msprobe/visualization/graph_service.py +++ b/debug/accuracy_tools/msprobe/visualization/graph_service.py @@ -15,13 +15,15 @@ import os import time +from copy import deepcopy +from multiprocessing import cpu_count, Pool from msprobe.core.common.file_utils import (check_file_type, 
create_directory, FileChecker, check_file_or_directory_path, load_json) from msprobe.core.common.const import FileCheckConst, Const -from msprobe.core.common.utils import CompareException +from msprobe.core.common.utils import CompareException, get_dump_mode from msprobe.visualization.compare.graph_comparator import GraphComparator -from msprobe.visualization.utils import GraphConst, check_directory_content -from msprobe.visualization.builder.graph_builder import GraphBuilder, GraphExportConfig +from msprobe.visualization.utils import GraphConst, check_directory_content, SerializableArgs +from msprobe.visualization.builder.graph_builder import GraphBuilder, GraphExportConfig, GraphInfo, BuildGraphTaskInfo from msprobe.core.common.log import logger from msprobe.visualization.graph.node_colors import NodeColors from msprobe.core.compare.layer_mapping import generate_api_mapping_by_layer_mapping @@ -32,75 +34,74 @@ from msprobe.visualization.graph.distributed_analyzer import DistributedAnalyzer current_time = time.strftime("%Y%m%d%H%M%S") -def _compare_graph(input_param, args): - logger.info('Start building model graphs...') - # 对两个数据进行构图 - dump_path_n = input_param.get('npu_path') - dump_path_b = input_param.get('bench_path') - construct_path_n = FileChecker(os.path.join(dump_path_n, GraphConst.CONSTRUCT_FILE), - FileCheckConst.FILE, FileCheckConst.READ_ABLE).common_check() - construct_path_b = FileChecker(os.path.join(dump_path_b, GraphConst.CONSTRUCT_FILE), - FileCheckConst.FILE, FileCheckConst.READ_ABLE).common_check() - data_path_n = FileChecker(os.path.join(dump_path_n, GraphConst.DUMP_FILE), FileCheckConst.FILE, - FileCheckConst.READ_ABLE).common_check() - data_path_b = FileChecker(os.path.join(dump_path_b, GraphConst.DUMP_FILE), FileCheckConst.FILE, - FileCheckConst.READ_ABLE).common_check() - stack_path_n = FileChecker(os.path.join(dump_path_n, GraphConst.STACK_FILE), FileCheckConst.FILE, - FileCheckConst.READ_ABLE).common_check() - stack_path_b = 
FileChecker(os.path.join(dump_path_b, GraphConst.STACK_FILE), FileCheckConst.FILE, - FileCheckConst.READ_ABLE).common_check() - graph_n = GraphBuilder.build(construct_path_n, data_path_n, stack_path_n, complete_stack=args.complete_stack) - graph_b = GraphBuilder.build(construct_path_b, data_path_b, stack_path_b, complete_stack=args.complete_stack) - logger.info('Model graphs built successfully, start Comparing graphs...') - # 基于graph、stack和data进行比较 +def _compare_graph(graph_n: GraphInfo, graph_b: GraphInfo, input_param, args): dump_path_param = { - 'npu_json_path': data_path_n, - 'bench_json_path': data_path_b, - 'stack_json_path': stack_path_n, + 'npu_json_path': graph_n.data_path, + 'bench_json_path': graph_b.data_path, + 'stack_json_path': graph_n.stack_path, 'is_print_compare_log': input_param.get("is_print_compare_log", True) } mapping_dict = {} if args.layer_mapping: try: - mapping_dict = generate_api_mapping_by_layer_mapping(data_path_n, data_path_b, args.layer_mapping) + mapping_dict = generate_api_mapping_by_layer_mapping(graph_n.data_path, graph_b.data_path, + args.layer_mapping) except Exception: logger.warning('The layer mapping file parsing failed, please check file format, mapping is not effective.') - - is_cross_framework = detect_framework_by_dump_json(data_path_n) != detect_framework_by_dump_json(data_path_b) + is_cross_framework = detect_framework_by_dump_json(graph_n.data_path) != \ + detect_framework_by_dump_json(graph_b.data_path) if is_cross_framework and not args.layer_mapping: logger.error('The cross_frame graph comparison failed. 
' 'Please specify -lm or --layer_mapping when performing cross_frame graph comparison.') raise CompareException(CompareException.CROSS_FRAME_ERROR) - graph_comparator = GraphComparator([graph_n, graph_b], dump_path_param, args, is_cross_framework, + graph_comparator = GraphComparator([graph_n.graph, graph_b.graph], dump_path_param, args, is_cross_framework, mapping_dict=mapping_dict) graph_comparator.compare() - micro_steps = graph_n.paging_by_micro_step(graph_b) + return graph_comparator + + +def _compare_graph_result(input_param, args): + logger.info('Start building model graphs...') + # 对两个数据进行构图 + graph_n = _build_graph_info(input_param.get('npu_path'), args) + graph_b = _build_graph_info(input_param.get('bench_path'), args) + logger.info('Model graphs built successfully, start Comparing graphs...') + # 基于graph、stack和data进行比较 + graph_comparator = _compare_graph(graph_n, graph_b, input_param, args) + # 增加micro step标记 + micro_steps = graph_n.graph.paging_by_micro_step(graph_b.graph) # 开启溢出检测 if args.overflow_check: - graph_n.overflow_check() - graph_b.overflow_check() + graph_n.graph.overflow_check() + graph_b.graph.overflow_check() - if args.multi_mapping: - graph_comparator.multi_compare(args.multi_mapping) + return CompareGraphResult(graph_n.graph, graph_b.graph, graph_comparator, micro_steps) - return CompareGraphResult(graph_n, graph_b, graph_comparator, micro_steps) - -def _export_compare_graph_result(args, graphs, graph_comparator, micro_steps, - output_file_name=f'compare_{current_time}.vis'): - create_directory(args.output_path) +def _export_compare_graph_result(args, result): + graphs = [result.graph_n, result.graph_b] + graph_comparator = result.graph_comparator + micro_steps = result.micro_steps + output_file_name = result.output_file_name + if not output_file_name: + output_file_name = f'compare_{current_time}.vis' + logger.info(f'Start exporting compare graph result, file name: {output_file_name}...') output_path = os.path.join(args.output_path, 
output_file_name) task = GraphConst.GRAPHCOMPARE_MODE_TO_DUMP_MODE_TO_MAPPING.get(graph_comparator.ma.compare_mode) export_config = GraphExportConfig(graphs[0], graphs[1], graph_comparator.ma.get_tool_tip(), NodeColors.get_node_colors(graph_comparator.ma.compare_mode), micro_steps, task, - args.overflow_check) - GraphBuilder.to_json(output_path, export_config) - logger.info(f'Model graphs compared successfully, the result file is saved in {output_path}') + args.overflow_check, graph_comparator.ma.compare_mode) + try: + GraphBuilder.to_json(output_path, export_config) + logger.info(f'Exporting compare graph result successfully, the result file is saved in {output_path}') + return '' + except RuntimeError as e: + logger.error(f'Failed to export compare graph result, file: {output_file_name}, error: {e}') + return output_file_name -def _build_graph(dump_path, args): - logger.info('Start building model graph...') +def _build_graph_info(dump_path, args): construct_path = FileChecker(os.path.join(dump_path, GraphConst.CONSTRUCT_FILE), FileCheckConst.FILE, FileCheckConst.READ_ABLE).common_check() data_path = FileChecker(os.path.join(dump_path, GraphConst.DUMP_FILE), FileCheckConst.FILE, @@ -108,6 +109,13 @@ def _build_graph(dump_path, args): stack_path = FileChecker(os.path.join(dump_path, GraphConst.STACK_FILE), FileCheckConst.FILE, FileCheckConst.READ_ABLE).common_check() graph = GraphBuilder.build(construct_path, data_path, stack_path, complete_stack=args.complete_stack) + return GraphInfo(graph, construct_path, data_path, stack_path) + + +def _build_graph_result(dump_path, args): + logger.info('Start building model graphs...') + graph = _build_graph_info(dump_path, args).graph + # 增加micro step标记 micro_steps = graph.paging_by_micro_step() # 开启溢出检测 if args.overflow_check: @@ -115,15 +123,128 @@ def _build_graph(dump_path, args): return BuildGraphResult(graph, micro_steps) -def _export_build_graph_result(out_path, graph, micro_steps, overflow_check, - 
output_file_name=f'build_{current_time}.vis'): - create_directory(out_path) +def _run_build_graph_compare(input_param, args, nr, br): + logger.info(f'Start building graph for {nr}...') + graph_n = _build_graph_info(input_param.get('npu_path'), args) + graph_b = _build_graph_info(input_param.get('bench_path'), args) + logger.info(f'Building graph for {nr} finished.') + return BuildGraphTaskInfo(graph_n, graph_b, nr, br, current_time) + + +def _run_build_graph_single(dump_ranks_path, rank, step, args): + logger.info(f'Start building graph for {rank}...') + dump_path = os.path.join(dump_ranks_path, rank) + output_file_name = f'build_{step}_{rank}_{current_time}.vis' if step else f'build_{rank}_{current_time}.vis' + result = _build_graph_result(dump_path, args) + result.output_file_name = output_file_name + if rank != Const.RANK: + try: + result.rank = int(rank.replace(Const.RANK, "")) + except Exception as e: + logger.error('The folder name format is incorrect, expected rank+number.') + raise CompareException(CompareException.INVALID_PATH_ERROR) from e + logger.info(f'Building graph for step: {step}, rank: {rank} finished.') + return result + + +def _run_graph_compare(graph_task_info, input_param, args, output_file_name): + logger.info(f'Start comparing data for {graph_task_info.npu_rank}...') + graph_n = graph_task_info.graph_info_n + graph_b = graph_task_info.graph_info_b + nr = graph_task_info.npu_rank + graph_comparator = _compare_graph(graph_n, graph_b, input_param, args) + micro_steps = graph_n.graph.paging_by_micro_step(graph_b.graph) + # 开启溢出检测 + if args.overflow_check: + graph_n.graph.overflow_check() + graph_b.graph.overflow_check() + graph_result = CompareGraphResult(graph_n.graph, graph_b.graph, graph_comparator, micro_steps) + graph_result.output_file_name = output_file_name + if nr != Const.RANK: + try: + graph_result.rank = int(nr.replace(Const.RANK, "")) + except Exception as e: + logger.error('The folder name format is incorrect, expected 
rank+number.') + raise CompareException(CompareException.INVALID_PATH_ERROR) from e + logger.info(f'Comparing data for {graph_task_info.npu_rank} finished.') + return graph_result + + +def _export_build_graph_result(args, result): + out_path = args.output_path + graph = result.graph + micro_steps = result.micro_steps + overflow_check = args.overflow_check + output_file_name = result.output_file_name + if not output_file_name: + output_file_name = f'build_{current_time}.vis' + logger.info(f'Start exporting graph for {output_file_name}...') output_path = os.path.join(out_path, output_file_name) - GraphBuilder.to_json(output_path, GraphExportConfig(graph, micro_steps=micro_steps, overflow_check=overflow_check)) - logger.info(f'Model graph built successfully, the result file is saved in {output_path}') + try: + GraphBuilder.to_json(output_path, GraphExportConfig(graph, micro_steps=micro_steps, + overflow_check=overflow_check)) + logger.info(f'Model graph exported successfully, the result file is saved in {output_path}') + return None + except RuntimeError as e: + logger.error(f'Failed to export model graph, file: {output_file_name}, error: {e}') + return output_file_name + + +def is_real_data_compare(input_param, npu_ranks, bench_ranks): + dump_rank_n = input_param.get('npu_path') + dump_rank_b = input_param.get('bench_path') + has_real_data = False + for nr, br in zip(npu_ranks, bench_ranks): + dump_path_param = { + 'npu_json_path': FileChecker(os.path.join(dump_rank_n, nr, GraphConst.DUMP_FILE), FileCheckConst.FILE, + FileCheckConst.READ_ABLE).common_check(), + 'bench_json_path': FileChecker(os.path.join(dump_rank_b, br, GraphConst.DUMP_FILE), FileCheckConst.FILE, + FileCheckConst.READ_ABLE).common_check() + } + has_real_data |= get_dump_mode(dump_path_param) == Const.ALL + return has_real_data + + +def _mp_compare(input_param, serializable_args, output_file_name, nr, br): + graph_task_info = _run_build_graph_compare(input_param, serializable_args, nr, br) + return 
_run_graph_compare(graph_task_info, input_param, serializable_args, output_file_name) def _compare_graph_ranks(input_param, args, step=None): + with Pool(processes=max(int((cpu_count() + 1) // 4), 1)) as pool: + def err_call(err): + logger.error(f'Error occurred while comparing graph ranks: {err}') + try: + pool.close() + except OSError as e: + logger.error(f'Error occurred while terminating the pool: {e}') + + serializable_args = SerializableArgs(args) + # 暂存所有rank的graph,用于匹配rank间的分布式节点 + compare_graph_results = _get_compare_graph_results(input_param, serializable_args, step, pool, err_call) + + # 匹配rank间的分布式节点 + if len(compare_graph_results) > 1: + DistributedAnalyzer({obj.rank: obj.graph_n for obj in compare_graph_results}, + args.overflow_check).distributed_match() + DistributedAnalyzer({obj.rank: obj.graph_b for obj in compare_graph_results}, + args.overflow_check).distributed_match() + + export_res_task_list = [] + create_directory(args.output_path) + for result in compare_graph_results: + export_res_task_list.append(pool.apply_async(_export_compare_graph_result, + args=(serializable_args, result), + error_callback=err_call)) + export_res_list = [res.get() for res in export_res_task_list] + if any(export_res_list): + failed_names = list(filter(lambda x: x, export_res_list)) + logger.error(f'Unable to export compare graph results: {", ".join(failed_names)}.') + else: + logger.info('Successfully exported compare graph results.') + + +def _get_compare_graph_results(input_param, serializable_args, step, pool, err_call): dump_rank_n = input_param.get('npu_path') dump_rank_b = input_param.get('bench_path') npu_ranks = sorted(check_and_return_dir_contents(dump_rank_n, Const.RANK)) @@ -132,32 +253,33 @@ def _compare_graph_ranks(input_param, args, step=None): logger.error('The number of ranks in the two runs are different. 
Unable to match the ranks.') raise CompareException(CompareException.INVALID_PATH_ERROR) compare_graph_results = [] - for nr, br in zip(npu_ranks, bench_ranks): - logger.info(f'Start processing data for {nr}...') - input_param['npu_path'] = os.path.join(dump_rank_n, nr) - input_param['bench_path'] = os.path.join(dump_rank_b, br) - output_file_name = f'compare_{step}_{nr}_{current_time}.vis' if step else f'compare_{nr}_{current_time}.vis' - result = _compare_graph(input_param, args) - result.output_file_name = output_file_name - if nr != Const.RANK: - try: - result.rank = int(nr.replace(Const.RANK, "")) - except Exception as e: - logger.error('The folder name format is incorrect, expected rank+number.') - raise CompareException(CompareException.INVALID_PATH_ERROR) from e - # 暂存所有rank的graph,用于匹配rank间的分布式节点 - compare_graph_results.append(result) - - # 匹配rank间的分布式节点 - if len(compare_graph_results) > 1: - DistributedAnalyzer({obj.rank: obj.graph_n for obj in compare_graph_results}, - args.overflow_check).distributed_match() - DistributedAnalyzer({obj.rank: obj.graph_b for obj in compare_graph_results}, - args.overflow_check).distributed_match() - - for result in compare_graph_results: - _export_compare_graph_result(args, [result.graph_n, result.graph_b], result.graph_comparator, - result.micro_steps, output_file_name=result.output_file_name) + if is_real_data_compare(input_param, npu_ranks, bench_ranks): + mp_task_dict = {} + for nr, br in zip(npu_ranks, bench_ranks): + input_param['npu_path'] = os.path.join(dump_rank_n, nr) + input_param['bench_path'] = os.path.join(dump_rank_b, br) + output_file_name = f'compare_{step}_{nr}_{current_time}.vis' if step else f'compare_{nr}_{current_time}.vis' + input_param_copy = deepcopy(input_param) + mp_task_dict[output_file_name] = pool.apply_async(_run_build_graph_compare, + args=(input_param_copy, serializable_args, nr, br), + error_callback=err_call) + + mp_res_dict = {k: v.get() for k, v in mp_task_dict.items()} + for 
output_file_name, mp_res in mp_res_dict.items(): + compare_graph_results.append(_run_graph_compare(mp_res, input_param, serializable_args, output_file_name)) + else: + compare_graph_tasks = [] + for nr, br in zip(npu_ranks, bench_ranks): + input_param['npu_path'] = os.path.join(dump_rank_n, nr) + input_param['bench_path'] = os.path.join(dump_rank_b, br) + output_file_name = f'compare_{step}_{nr}_{current_time}.vis' if step else f'compare_{nr}_{current_time}.vis' + input_param_copy = deepcopy(input_param) + compare_graph_tasks.append(pool.apply_async(_mp_compare, + args=(input_param_copy, serializable_args, output_file_name, nr, + br), + error_callback=err_call)) + compare_graph_results = [task.get() for task in compare_graph_tasks] + return compare_graph_results def _compare_graph_steps(input_param, args): @@ -181,28 +303,39 @@ def _compare_graph_steps(input_param, args): def _build_graph_ranks(dump_ranks_path, args, step=None): ranks = sorted(check_and_return_dir_contents(dump_ranks_path, Const.RANK)) - build_graph_results = [] - for rank in ranks: - logger.info(f'Start processing data for {rank}...') - dump_path = os.path.join(dump_ranks_path, rank) - output_file_name = f'build_{step}_{rank}_{current_time}.vis' if step else f'build_{rank}_{current_time}.vis' - result = _build_graph(dump_path, args) - result.output_file_name = output_file_name - if rank != Const.RANK: + serializable_args = SerializableArgs(args) + with Pool(processes=max(int((cpu_count() + 1) // 4), 1)) as pool: + def err_call(err): + logger.error(f'Error occurred while comparing graph ranks: {err}') try: - result.rank = int(rank.replace(Const.RANK, "")) - except Exception as e: - logger.error('The folder name format is incorrect, expected rank+number.') - raise CompareException(CompareException.INVALID_PATH_ERROR) from e - build_graph_results.append(result) - - if len(build_graph_results) > 1: - DistributedAnalyzer({obj.rank: obj.graph for obj in build_graph_results}, - 
args.overflow_check).distributed_match() + pool.close() + except OSError as e: + logger.error(f'Error occurred while terminating the pool: {e}') + + build_graph_tasks = [] + for rank in ranks: + build_graph_tasks.append(pool.apply_async(_run_build_graph_single, + args=(dump_ranks_path, rank, step, serializable_args), + error_callback=err_call)) + build_graph_results = [task.get() for task in build_graph_tasks] + + if len(build_graph_results) > 1: + DistributedAnalyzer({obj.rank: obj.graph for obj in build_graph_results}, + args.overflow_check).distributed_match() + + create_directory(args.output_path) + export_build_graph_tasks = [] + for result in build_graph_results: + export_build_graph_tasks.append(pool.apply_async(_export_build_graph_result, + args=(serializable_args, result), + error_callback=err_call)) + export_build_graph_result = [task.get() for task in export_build_graph_tasks] + if any(export_build_graph_result): + failed_names = list(filter(lambda x: x, export_build_graph_result)) + logger.error(f'Unable to export build graph results: {failed_names}.') + else: + logger.info(f'Successfully exported build graph results.') - for result in build_graph_results: - _export_build_graph_result(args.output_path, result.graph, result.micro_steps, args.overflow_check, - result.output_file_name) def _build_graph_steps(dump_steps_path, args): @@ -226,8 +359,6 @@ def _graph_service_parser(parser): help=" Whether to perform a fuzzy match on the api name.", required=False) parser.add_argument("-cs", "--complete_stack", dest="complete_stack", action="store_true", help=" Whether to use complete stack information.", required=False) - parser.add_argument("-mm", "--multi_mapping", dest="multi_mapping", type=str, - help=" The multi mapping file path.", required=False) def _graph_service_command(args): @@ -244,8 +375,11 @@ def _graph_service_command(args): elif content == GraphConst.STEPS: _build_graph_steps(npu_path, args) else: - result = _build_graph(npu_path, args) - 
_export_build_graph_result(args.output_path, result.graph, result.micro_steps, args.overflow_check) + result = _build_graph_result(npu_path, args) + create_directory(args.output_path) + file_name = _export_build_graph_result(args, result) + if file_name: + logger.error('Failed to export model build graph.') elif check_file_type(npu_path) == FileCheckConst.DIR and check_file_type(bench_path) == FileCheckConst.DIR: content_n = check_directory_content(npu_path) content_b = check_directory_content(bench_path) @@ -256,9 +390,11 @@ def _graph_service_command(args): elif content_n == GraphConst.STEPS: _compare_graph_steps(input_param, args) else: - result = _compare_graph(input_param, args) - _export_compare_graph_result(args, [result.graph_n, result.graph_b], - result.graph_comparator, result.micro_steps) + result = _compare_graph_result(input_param, args) + create_directory(args.output_path) + file_name = _export_compare_graph_result(args, result) + if file_name: + logger.error('Failed to export model compare graph.') else: logger.error("The npu_path or bench_path should be a folder.") raise CompareException(CompareException.INVALID_COMPARE_MODE) diff --git a/debug/accuracy_tools/msprobe/visualization/utils.py b/debug/accuracy_tools/msprobe/visualization/utils.py index 35914c216aec1c52a1ed9c093049258aa3f09ebf..242d641e31ae54c99a347f29928bca38523fa975 100644 --- a/debug/accuracy_tools/msprobe/visualization/utils.py +++ b/debug/accuracy_tools/msprobe/visualization/utils.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. # All rights reserved. 
# # Licensed under the Apache License, Version 2.0 (the "License"); @@ -16,9 +16,10 @@ import os import re import json +import pickle from msprobe.core.common.file_utils import FileOpen from msprobe.core.common.const import CompareConst, Const -from msprobe.core.compare.acc_compare import Comparator, ModeConfig +from msprobe.core.common.log import logger def load_json_file(file_path): @@ -42,15 +43,6 @@ def load_data_json_file(file_path): return load_json_file(file_path).get(GraphConst.DATA_KEY, {}) -def get_csv_df(stack_mode, csv_data, compare_mode): - """ - 调用acc接口写入csv - """ - dump_mode = GraphConst.GRAPHCOMPARE_MODE_TO_DUMP_MODE_TO_MAPPING.get(compare_mode) - mode_config = ModeConfig(stack_mode=stack_mode, dump_mode=dump_mode) - return Comparator(mode_config).make_result_table(csv_data) - - def str2float(percentage_str): """ 百分比字符串转换转换为浮点型 @@ -159,7 +151,6 @@ class GraphConst: REAL_DATA_INDEX_LIST = CompareConst.ALL_COMPARE_INDEX SUMMARY_INDEX_LIST = CompareConst.SUMMARY_COMPARE_INDEX APIS_BETWEEN_MODULES = 'Apis_Between_Modules' - MERGE_NODES = 'Merged_Nodes' NULL = 'null' NONE = 'None' VALUE = 'value' @@ -193,3 +184,24 @@ class GraphConst: OP = 'op' PEER = 'peer' GROUP_ID = 'group_id' + + +def is_serializable(obj): + """ + Check if an object is serializable + """ + try: + pickle.dumps(obj) + return True + except (pickle.PicklingError, AttributeError, TypeError): + return False + except Exception as e: + logger.error('Unexpected error occurred while pickling obj.') + raise RuntimeError('Unexpected error occurred while pickling obj.') from e + + +class SerializableArgs: + def __init__(self, args): + for k, v in vars(args).items(): + if is_serializable(v): + setattr(self, k, v) diff --git a/msmonitor/README.md b/msmonitor/README.md new file mode 100644 index 0000000000000000000000000000000000000000..0a11f4fcdbfd30f8a5ba2c1ce82c55c855e690bd --- /dev/null +++ b/msmonitor/README.md @@ -0,0 +1,291 @@ +# msMonitor: MindStudio一站式在线监控工具 + +## 安装方式 + +### 1. 
Clone the code
+
+```bash
+git clone https://gitee.com/ascend/mstt.git
+```
+
+### 2. Install dependencies
+Building dynolog requires the following dependencies; make sure they are installed:
+
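PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios: specify a class of APIs to dump the API-level input and output data of that class.
Configuration example: "list": ["relu"].
In PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios, when level is mix, data is dumped for every API whose name contains a string configured in list; in addition, every module whose name contains such a string is expanded and dumped (all data from the start to the end of that module's execution).
MindSpore static-graph scenario: configure kernel_name, which can be a list of operator names, an operator type (not supported when jit_level=O2), or a regular expression over operator names (a string in the form "name-regex(xxx)" is treated as a regular expression by the backend).
Configuration example: "list": ["name-regex(Default/.+)"]
This matches every operator whose name starts with "Default/".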
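tensor_list: custom list of operators for which real data is collected; list[str] type, not configured by default. It can be configured as follows:
PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios: specify a class of APIs or modules to dump both the statistics and the complete tensor data of their inputs and outputs.
Configuration example: "tensor_list": ["relu"].
PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios currently support this only when level is L0, L1, or mix.
Not supported in the MindSpore static-graph scenario.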
data_mode: dump data filter, str type.
PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios: supports "all", "forward", "backward", "input", and "output"; every value except "all" can be freely combined. The default is ["all"], which saves all dumped data.
Configuration example: "data_mode": ["backward"] (save only backward data) or "data_mode": ["forward", "input"] (save only forward input data).
MindSpore static-graph scenario: L0-level dump supports only "all", "forward", and "backward"; L2-level dump supports only "all", "input", and "output". Each value must be configured on its own; combinations are not supported.
Configuration example: "data_mode": ["all"].
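summary_mode: controls the output mode of the dump file; str type. Supported in PyTorch, MSAdapter, MindSpore dynamic-graph, and MindSpore static-graph jit_level=O2 scenarios.
PyTorch, MSAdapter, and MindSpore dynamic-graph scenarios: the options are
md5: dump outputs a dump.json file that contains CRC-32 values and API statistics, used to verify data integrity;
statistics: dump outputs a dump.json file that contains only API statistics; this is the default.
Configuration example: "summary_mode": "md5".
MindSpore static-graph jit_level=O2 scenario: in addition to the options above, a list of statistic items can be configured. The available items are max, min, mean, and l2norm, in any combination. The results of mean and l2norm are in float format.
Configuration example: "summary_mode": ["max", "min"].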
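The `data_mode` combination rule for dynamic-graph scenarios and the `name-regex(xxx)` form of `list` entries can be sketched as a small check. This is only an illustration under stated assumptions, not msprobe's implementation: the helper names `check_data_mode` and `kernel_matches` are hypothetical, and both the prefix-anchored regex match and the exact-name fallback are assumptions about how the backend behaves.

```python
import re

# Values accepted by data_mode in dynamic-graph scenarios, as listed in the table.
DYNAMIC_DATA_MODES = {"all", "forward", "backward", "input", "output"}

# Hypothetical parser for the "name-regex(xxx)" entry format.
NAME_REGEX = re.compile(r"^name-regex\((.+)\)$")


def check_data_mode(data_mode):
    """Return True if data_mode is a valid dynamic-graph combination."""
    if not data_mode or not set(data_mode) <= DYNAMIC_DATA_MODES:
        return False
    # "all" is the only value that may not be combined with the others.
    return "all" not in data_mode or len(data_mode) == 1


def kernel_matches(entry, kernel_name):
    """Match one static-graph `list` entry against an operator name.

    Assumption: a regex entry is matched from the start of the name,
    and any other entry must equal the operator name exactly.
    """
    m = NAME_REGEX.match(entry)
    if m:
        return re.match(m.group(1), kernel_name) is not None
    return entry == kernel_name
```

With these helpers, `check_data_mode(["forward", "input"])` is accepted while `check_data_mode(["all", "forward"])` is rejected, and the entry `name-regex(Default/.+)` matches any operator name that begins with `Default/`.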
+| Language | Toolchain |
+|----------|-----------|
+| C++ | gcc 8.5.0+ |
+| Rust | Rust >= 1.81 |
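+
+- Install Rust
+
+```bash
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+
+source $HOME/.cargo/env
+```
+
+- Install ninja
+
+```bash
+# debian
+sudo apt-get install -y cmake ninja-build
+
+# centos
+sudo yum install -y cmake ninja
+```
+
+- Install openssl (RPC TLS authentication) and generate the certificates and keys
+Install
+```bash
+# debian
+sudo apt-get install -y openssl
+
+# centos
+sudo yum install -y openssl
+```
+RPC communication between the dyno CLI and the dynolog daemon is encrypted with TLS certificates and keys. When starting the dyno and dynolog binaries, specify the directory that holds the certificates and keys; the directory must have the following structure and file names:
+
+```bash
+rpc_certs
+├── ca.crt
+├── server.crt
+├── server.key
+├── client.crt
+└── client.key
+```
+
+### 3. Build
+
+- Build dynolog
+
+By default the build produces the dyno and dynolog binaries; the -t option additionally packages them as a deb or rpm package.
+
+```bash
+# Build the dyno and dynolog binaries
+bash scripts/build.sh
+
+# Build a deb package. amd64 and aarch64 are currently supported, amd64 by default; to build for aarch64, change Architecture to aarch64 in third_party/dynolog/scripts/debian/control
+bash scripts/build.sh -t deb
+
+# Build an rpm package. Only amd64 is currently supported
+bash scripts/build.sh -t rpm
+```
+
+- Build the msmonitor_plugin wheel package
+
+The msmonitor_plugin wheel package provides common capabilities such as IPCMonitor and MsptiMonitor. It must be installed before using the nputrace and npu-monitor features; for build and installation instructions, see msmonitor/plugin/README.md.
+
+## Usage
+
+- **Note**: The **Profiler trace dump** feature and the **NPU Monitor** feature **cannot** be enabled at the same time.
+
+### Profiler trace dump
+The Profiler trace dump feature is built on dynolog. Similar to dynamic profiling, it dynamically triggers Ascend Torch Profiler to collect profiling data. With the dyno CLI, users can dynamically trigger a trace dump of the training processes on a specified node.
+
+- View the commands supported by dyno and their help
+
+```bash
+dyno --help
+```
+
+Options supported by the dyno command
+
+| Command | Type | Description |
+|-----------|--------|-------------------------------------|
+| hostname | String | Name that uniquely identifies a device on the network; default: localhost |
+| port | i32 | Distinguishes different network services or applications on the same device; default: 1778 |
+| certs-dir | String | Path to the TLS certificates used for RPC communication between dyno and dynolog; required |
+
+- View the commands supported by nputrace and their help
+
+```bash
+dyno nputrace --help
+```
+
+- nputrace usage
+
+```bash
+dyno nputrace [SUBCOMMANDS] --log-file
+```
+
+Options supported by the nputrace subcommand
+
+| Subcommand | Type | Description | PyTorch support | MindSpore support |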
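+|-------|-------|-------|:---------:|:-----------:|
+| job-id | u64 | Job ID of the collection task; default 0; native dynolog parameter | N | N |
+| pids | String | PIDs of the collection task, comma-separated; default 0; native dynolog parameter | N | N |
+| process-limit | u64 | Maximum number of processes to collect; default 3; native dynolog parameter | N | N |
+| profile-start-time | u64 | Unix timestamp used to synchronize collection, in milliseconds; default 0; native dynolog parameter | N | N |
+| duration-ms | u64 | Collection duration, in milliseconds; default 500; native dynolog parameter | N | N |
+| iterations | i64 | Total number of iterations to collect; default -1; native dynolog parameter | Y | Y |
+| log-file | String | Path where collected data is written; required | Y | Y |
+| start-step | u64 | Iteration at which collection starts; default 0 | Y | Y |
+| record-shapes | action | Whether to collect operator InputShapes and InputTypes; collected when the flag is set, off by default | Y | N |
+| profile-memory | action | Whether to collect operator memory information; collected when the flag is set, off by default | Y | Y |
+| with-stack | action | Whether to collect the Python call stack; collected when the flag is set, off by default; currently supported by both MindSpore and PyTorch | Y | Y |
+| with-flops | action | Whether to collect operator flops; collected when the flag is set, off by default | Y | N |
+| with-modules | action | Whether to collect the Python call stack at the modules level; collected when the flag is set, off by default | Y | N |
+| analyse | action | Whether to parse automatically after collection; parsed when the flag is set, not parsed by default | Y | Y |
+| l2-cache | action | Whether to collect L2 Cache data; collected when the flag is set, off by default | Y | Y |
+| op-attr | action | Whether to collect operator attribute information; collected when the flag is set, off by default | Y | N |
+| msprof-tx | action | Whether to enable MSTX; enabled when the flag is set, disabled by default | Y | Y |
+| data-simplification | String | Whether to simplify the data after parsing; allowed values [`true`, `false`]; default `true` | Y | Y |
+| activities | String | Controls the scope of CPU and NPU event collection; allowed values [`CPU,NPU`, `NPU,CPU`, `CPU`, `NPU`]; default `CPU,NPU` | Y | Y |
+| profiler-level | String | Controls the profiler collection level; allowed values [`Level_none`, `Level0`, `Level1`, `Level2`]; default `Level0` | Y | Y |
+| aic-metrics | String | AI Core performance metrics to collect; allowed values [`AiCoreNone`, `PipeUtilization`, `ArithmeticUtilization`, `Memory`, `MemoryL0`, `ResourceConflictRatio`, `MemoryUB`, `L2Cache`, `MemoryAccess`]; default `AiCoreNone` | Y | Y |
+| export-type | String | Type of data exported by profiler parsing; allowed values [`Text`, `Db`]; default `Text` | Y | Y |
+| gc-detect-threshold | Option | GC detection threshold, in ms; only GC events that exceed the threshold are collected. Optional; GC detection is disabled when it is not set | Y | N |
+
+
+- How to use nputrace
+
+Step 0: Following section `3. Build`, build dynolog, then build and install the dynolog_npu_plugin wheel package.
+
+Step 1: Start the dynolog daemon process
+```bash
+# Choose either method 1 or method 2
+# Method 1: start the service with systemd
+# Edit /etc/dynolog.gflags to enable ipc_monitor
+echo "--enable_ipc_monitor" | sudo tee -a /etc/dynolog.gflags
+sudo systemctl start dynolog
+
+# Method 2: run from the command line
+dynolog --enable-ipc-monitor --certs-dir /home/rpc_certs
+
+# The dynolog daemon log is written to /var/log/dynolog.log
+```
+
+Step 2: Set the environment variable that enables dynolog trace dump
+```bash
+export KINETO_USE_DAEMON=1
+```
+
+Step 3: Start the training job
+```bash
+# The training job must use a PyTorch optimizer or one that inherits from a native optimizer
+bash train.sh
+```
+
+Step 4: Use the dyno CLI to trigger a trace dump dynamically
+```bash
+# Example 1: start at step 10 and collect 2 steps; collect framework, CANN, and device data; parse automatically after collection without data simplification; write the output to /tmp/profile_data
+dyno --certs-dir /home/rpc_certs nputrace --start-step 10 --iterations 2 --activities CPU,NPU --analyse --data-simplification false --log-file /tmp/profile_data
+
+# Example 2: start at step 10 and collect 2 steps; collect only CANN and device data; parse automatically after collection with data simplification enabled; write the output to /tmp/profile_data
+dyno --certs-dir /home/rpc_certs nputrace --start-step 10 --iterations 2 --activities NPU --analyse --data-simplification true --log-file /tmp/profile_data
+
+# Example 3: start at step 10 and collect 2 steps; collect only CANN and device data; collect without parsing; write the output to /tmp/profile_data
+dyno --certs-dir /home/rpc_certs nputrace --start-step 10 --iterations 2 --activities NPU --log-file /tmp/profile_data
+
+# Example 4: in a multi-node setup, send the parameters to a specific machine x.x.x.x; start at step 10 and collect 2 steps; collect only CANN and device data; collect without parsing; write the output to /tmp/profile_data
+dyno --certs-dir /home/rpc_certs --hostname x.x.x.x nputrace --start-step 10 --iterations 2 --activities NPU --log-file /tmp/profile_data
+```
+
+### NPU Monitor
+NPU Monitor is built on MSPTI/MSTX and provides lightweight online monitoring that can be used for the initial localization of performance problems.
+
+```bash
+dyno npu-monitor --help
+```
+
+- npu-monitor usage
+
+```bash
+dyno npu-monitor [SUBCOMMANDS]
+```
+
+Options supported by the npu-monitor subcommand
+
+| Subcommand | Type | Description | PyTorch support | MindSpore support |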
+|-------|-------|----------------------------------------------------------------------------------------------------------------------------------|:---------:|:-----------:| +| npu-monitor-start | action | 开启性能监控,设置参数后生效,默认不生效 | Y | Y | +| npu-monitor-stop | action | 停止性能监控,设置参数后生效,默认不生效 | Y | Y | +| report-interval-s | int | 性能监控数据上报周期,单位s,需要在启动时设置。默认值60 | Y | Y | +| mspti-activity-kind | String | 性能监控数据上报数据类型,可以设置单个或多个,多个类型以逗号分隔,每次设置时刷新全局上报类型。可选值范围[`Marker`, `Kernel`, `API`, `Hccl`, `Memory`, `MemSet`, `MemCpy`],默认值`Marker` | Y | Y | + +- npu-monitor使用方法 + +Step 1:拉起dynolog daemon进程 +```bash +# 方法1和方法2 二选一 +# 方法1:使用systemd拉起service +# 修改配置文件/etc/dynolog.gflags, 使能ipc_monitor +echo "--enable_ipc_monitor" | sudo tee -a /etc/dynolog.gflags +sudo systemctl start dynolog + +# 方法2:命令行执行 +dynolog --enable-ipc-monitor --certs-dir /home/rpc_certs + +# 使用Prometheus上报数据需要指定参数:--use_prometheus +# dynolog daemon的日志路径为:/var/log/dynolog.log +``` + +Step 2:使能dynolog环境变量 +```bash +export KINETO_USE_DAEMON=1 +``` + +Step 3:配置msMonitor日志路径(可选,默认路径为当前目录下的msmonitor_log) +```bash +export MSMONITOR_LOG_PATH= +# 示例: +export MSMONITOR_LOG_PATH=/tmp/msmonitor_log +``` + +Step 4:拉起训练任务 +```bash +# 训练任务拉起前需要设置LD_PRELOAD +# 示例:export LD_PRELOAD=/usr/local/Ascend/ascend-toolkit/latest/lib64/libmspti.so +export LD_PRELOAD=/ascend-toolkit/latest/lib64/libmspti.so + +# 训练任务中需要使用pytorch的优化器/继承原生优化器 +bash train.sh +``` + +Step 5:使用dyno CLI使能npu-monitor +```bash +# 示例1:开启性能监控,使用默认配置 +dyno --certs-dir /home/rpc_certs npu-monitor --npu-monitor-start + +# 示例2:停止性能监控 +dyno --certs-dir /home/rpc_certs npu-monitor --npu-monitor-stop + +# 示例3:性能监控过程中修改配置 +# 上报周期30s, 上报数据类型Marker和Kernel +dyno --certs-dir /home/rpc_certs npu-monitor --report-interval-s 30 --mspti-activity-kind Marker,Kernel + +# 示例4:性能监控开启时修改配置 +# 上报周期30s, 上报数据类型Marker和Kernel +dyno --certs-dir /home/rpc_certs npu-monitor --npu-monitor-start --report-interval-s 30 --mspti-activity-kind Marker,Kernel + +# 示例5:多机场景下性能监控开启时修改配置 +# 
多机场景下向特定机器x.x.x.x发送参数信息,参数表示上报周期30s, 上报数据类型Marker和Kernel +dyno --certs-dir /home/rpc_certs --hostname x.x.x.x npu-monitor --npu-monitor-start --report-interval-s 30 --mspti-activity-kind Marker,Kernel +``` + +Step6: 观测Prometheus上报数据 +``` +# Prometheus默认端口为8080 +curl 127.0.0.1:8080/metrics +``` + +## 附录 + +[Mindspore框架下msMonitor的使用方法](./docs/mindspore_adapter.md) + +[安全声明](./docs/security_statement.md) \ No newline at end of file diff --git a/msmonitor/docs/mindspore_adapter.md b/msmonitor/docs/mindspore_adapter.md new file mode 100644 index 0000000000000000000000000000000000000000..cb048e81dc766cc2c6156dafdb73ee30702fb853 --- /dev/null +++ b/msmonitor/docs/mindspore_adapter.md @@ -0,0 +1,60 @@ +## MindSpore框架下msMonitor的使用方法 + +### 1. 动态profiling自定义for循环方式 + +Step 1:拉起dynolog daemon进程 + +Step 2:使能dynolog环境变量 + +Step 3:配置msMonitor日志路径 + +- 前3步以及第5步操作可以参考[msMonitor使用教程](/msmonitor/README.md) + +Step 4: 拉起训练任务 +在训练任务中实例化DynamicProfilerMonitor对象,且在每一次训练后,调用step()方法。 + +- 示例代码如下: +```python +import numpy as np +import mindspore +import mindspore.dataset as ds +from mindspore import nn +from mindspore.profiler import DynamicProfilerMonitor + +class Net(nn.Cell): + def __init__(self): + super(Net, self).__init__() + self.fc = nn.Dense(2, 2) + + def construct(self, x): + return self.fc(x) + + +def generator_net(): + for _ in range(2): + yield np.ones([2, 2]).astype(np.float32), np.ones([2]).astype(np.int32) + + +def train(test_net): + optimizer = nn.Momentum(test_net.trainable_params(), 1, 0.9) + loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True) + data = ds.GeneratorDataset(generator_net(), ["data", "label"]) + model = mindspore.train.Model(test_net, loss, optimizer) + model.train(1, data) + +if __name__ == '__main__': + dp = DynamicProfilerMonitor() + step_num = 100 + # 定义模型 + net = Net() + for i in range(step_num): + # 模型训练 + train(net) + # 调用step方法实现npu trace dump或npu monitor功能 + dp.step() +``` + +Step 5:使用dyno CLI使能trace dump或npu-monitor + +### 2. 
动态profiling call back方式 +该使能方式与动态profiling自定义for循环方式一致,唯一区别是将step()方法适配在step_begin、step_end回调函数中。 diff --git a/msmonitor/docs/security_statement.md b/msmonitor/docs/security_statement.md new file mode 100644 index 0000000000000000000000000000000000000000..fafd0d5ba4fa944fcecb0d8383a9058ae4e77997 --- /dev/null +++ b/msmonitor/docs/security_statement.md @@ -0,0 +1,6 @@ +## 安全声明 +### 通信矩阵 + +| 序号 | 代码仓 | 功能 | 源设备 | 源IP | 源端口 | 目的设备 | 目的IP | 目的端口
(侦听) | 协议 | 端口说明 | 端口配置 | 侦听端口是否可更改 | 认证方式 | 加密方式 | 所属平面 | 版本 | 特殊场景 | 备注 | +|:----|:------------|:-----------|:------------------|:---------------------|:------|:-------------------|:---------------------|:--------------|:-----------|:-------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------|:-----|:-----|:-------|:-----------------------|:-----|:---| +| 1 | msMonitor | dyno和dynolog RPC通信 | dyno客户端 | 运行dyno客户端进程的服务器的ip | | dynolog服务端所在服务器 | dynolog服务端所在服务器的ip | 1778 | TCP | RPC通信 | 不涉及 | 可修改 | 证书密钥 | TLS | 业务面 | 所有版本 | 无 | | diff --git a/msmonitor/dynolog_npu/cli/Cargo.toml b/msmonitor/dynolog_npu/cli/Cargo.toml new file mode 100644 index 0000000000000000000000000000000000000000..7d87551ba4f2e9dc3b6710ab44964365e41910e3 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/Cargo.toml @@ -0,0 +1,24 @@ +[package] +name = "dyno" +version = "0.1.0" +edition = "2021" + +[dependencies] +anyhow = "1.0.57" +clap = { version = "3.1.0", features = ["derive"]} +serde_json = "1.0" +rustls = "0.21.0" +rustls-pemfile = "1.0" +webpki = "0.22" + +[net] +git-fetch-with-cli = true + +[build] +rustflags = [ + "-C", "relocation_model=pie", + "-C", "link-args=-Wl,-z,now", + "-C", "link-args=-Wl,-z,relro", + "-C", "strip=symbols", + "-C", "overflow_checks" +] \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/dcgm.rs b/msmonitor/dynolog_npu/cli/src/commands/dcgm.rs new file mode 100644 index 0000000000000000000000000000000000000000..a5261fc8acefe0199340b9d7ca77903a533ee3d7 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/dcgm.rs @@ -0,0 +1,49 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +use std::net::TcpStream; +use rustls::{ClientConnection, StreamOwned}; + +use anyhow::Result; + +#[path = "utils.rs"] +mod utils; + +// This module contains the handling logic for dcgm + +/// Pause dcgm module profiling +pub fn run_dcgm_pause( + mut client: StreamOwned<ClientConnection, TcpStream>, + duration_s: i32, +) -> Result<()> { + let request_json = format!( + r#" +{{ + "fn": "dcgmProfPause", + "duration_s": {} +}}"#, + duration_s + ); + + utils::send_msg(&mut client, &request_json).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + Ok(()) +} + +/// Resume dcgm module profiling +pub fn run_dcgm_resume(mut client: StreamOwned<ClientConnection, TcpStream>) -> Result<()> { + utils::send_msg(&mut client, r#"{"fn":"dcgmProfResume"}"#) + .expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + Ok(()) +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/gputrace.rs b/msmonitor/dynolog_npu/cli/src/commands/gputrace.rs new file mode 100644 index 0000000000000000000000000000000000000000..c27b7534e06a8ed8569a44dadaaf2654da093589 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/gputrace.rs @@ -0,0 +1,217 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +use std::net::TcpStream; +use rustls::{ClientConnection, StreamOwned}; + +use anyhow::Result; +use serde_json::Value; + +#[path = "utils.rs"] +mod utils; + +// This module contains the handling logic for dyno gputrace + +#[derive(Debug)] +pub enum GpuTraceTriggerConfig { + DurationBased { + profile_start_time: u64, + duration_ms: u64, + }, + IterationBased { + profile_start_iteration_roundup: u64, + iterations: i64, + }, +} + +impl GpuTraceTriggerConfig { + fn config(&self) -> String { + match *self { + GpuTraceTriggerConfig::DurationBased { + profile_start_time, + duration_ms, + } => format!( + "PROFILE_START_TIME={}\nACTIVITIES_DURATION_MSECS={}", + profile_start_time, duration_ms + ), + GpuTraceTriggerConfig::IterationBased { + profile_start_iteration_roundup, + iterations, + } => format!( + r#"PROFILE_START_ITERATION=0 +PROFILE_START_ITERATION_ROUNDUP={} +ACTIVITIES_ITERATIONS={}"#, + profile_start_iteration_roundup, iterations + ), + } + } +} + +#[derive(Debug)] +pub struct GpuTraceOptions { + pub record_shapes: bool, + pub profile_memory: bool, + pub with_stacks: bool, + pub with_flops: bool, + pub with_modules: bool, +} + +impl GpuTraceOptions { + fn config(&self) -> String { + format!( + r#" +PROFILE_REPORT_INPUT_SHAPES={} +PROFILE_PROFILE_MEMORY={} +PROFILE_WITH_STACK={} +PROFILE_WITH_FLOPS={} +PROFILE_WITH_MODULES={}"#, + self.record_shapes, + self.profile_memory, + self.with_stacks, + self.with_flops, + self.with_modules + ) + } +} + +#[derive(Debug)] +pub struct GpuTraceConfig { + pub log_file: String, + pub trigger_config: GpuTraceTriggerConfig, + pub trace_options: GpuTraceOptions, +} + +impl GpuTraceConfig { + fn config(&self) -> String { + format!( + "ACTIVITIES_LOG_FILE={}\n{}{}", + self.log_file, + self.trigger_config.config(), + self.trace_options.config() + ) + } +} + +/// Gputrace command triggers GPU profiling on pytorch apps +pub fn run_gputrace( + mut client: StreamOwned<ClientConnection, TcpStream>, + job_id: u64, + pids: &str, + process_limit: u32, 
+ config: 
GpuTraceConfig, +) -> Result<()> { + let kineto_config = config.config(); + println!("Kineto config = \n{}", kineto_config); + let kineto_config = kineto_config.replace('\n', "\\n"); + + let request_json = format!( + r#" +{{ + "fn": "setKinetOnDemandRequest", + "config": "{}", + "job_id": {}, + "pids": [{}], + "process_limit": {} +}}"#, + kineto_config, job_id, pids, process_limit + ); + + utils::send_msg(&mut client, &request_json).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + let resp_v: Value = serde_json::from_str(&resp_str)?; + let processes = resp_v["processesMatched"].as_array().unwrap(); + + if processes.is_empty() { + println!("No processes were matched, please check --job-id or --pids flags"); + } else { + println!("Matched {} processes", processes.len()); + println!("Trace output files will be written to:"); + + for pid in processes { + let pid = pid.as_i64().unwrap(); + println!( + " {}", + config.log_file.replace(".json", &format!("_{}.json", pid)) + ); + } + } + + Ok(()) +} + +#[cfg(test)] +mod tests { + use crate::*; + + #[test] + fn test_gputrace_trigger_config() { + let trigger_config = GpuTraceTriggerConfig::DurationBased { + profile_start_time: 1000, + duration_ms: 42, + }; + assert_eq!( + trigger_config.config(), + r#"PROFILE_START_TIME=1000 +ACTIVITIES_DURATION_MSECS=42"# + ); + + let trigger_config = GpuTraceTriggerConfig::IterationBased { + profile_start_iteration_roundup: 1000, + iterations: 42, + }; + assert_eq!( + trigger_config.config(), + r#"PROFILE_START_ITERATION=0 +PROFILE_START_ITERATION_ROUNDUP=1000 +ACTIVITIES_ITERATIONS=42"# + ); + } + + #[test] + fn test_gputrace_config() { + let mut test_trace_options = GpuTraceOptions { + record_shapes: true, + profile_memory: false, + with_stacks: true, + with_flops: false, + with_modules: true, + }; + assert_eq!( + test_trace_options.config(), + r#" 
+PROFILE_REPORT_INPUT_SHAPES=true +PROFILE_PROFILE_MEMORY=false +PROFILE_WITH_STACK=true +PROFILE_WITH_FLOPS=false +PROFILE_WITH_MODULES=true"# + ); + + test_trace_options.profile_memory = true; + + let test_trace_config = GpuTraceConfig { + log_file: String::from("/tmp/test_trace.json"), + trigger_config: GpuTraceTriggerConfig::DurationBased { + profile_start_time: 1000, + duration_ms: 42, + }, + trace_options: test_trace_options, + }; + assert_eq!( + test_trace_config.config(), + r#"ACTIVITIES_LOG_FILE=/tmp/test_trace.json +PROFILE_START_TIME=1000 +ACTIVITIES_DURATION_MSECS=42 +PROFILE_REPORT_INPUT_SHAPES=true +PROFILE_PROFILE_MEMORY=true +PROFILE_WITH_STACK=true +PROFILE_WITH_FLOPS=false +PROFILE_WITH_MODULES=true"# + ); + } +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/mod.rs b/msmonitor/dynolog_npu/cli/src/commands/mod.rs new file mode 100644 index 0000000000000000000000000000000000000000..18950d3c1a01d972db58a614a46f08176b02c725 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/mod.rs @@ -0,0 +1,18 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +// Export all command submodules to be used in main.rs +// Note: This "intermediate" commands module is purely for organizational purposes. +// This allows for a clear distinction between the command dispatching code and the command +// handling code. Additionally, explicitly "exporting" all the command modules here allows +// us to avoid having to explicitly list all the command modules in main.rs. + +pub mod dcgm; +pub mod gputrace; +pub mod nputrace; +pub mod npumonitor; +pub mod status; +pub mod version; +// ... 
add new command modules here \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/npumonitor.rs b/msmonitor/dynolog_npu/cli/src/commands/npumonitor.rs new file mode 100644 index 0000000000000000000000000000000000000000..f8f73c5b959af37973552286426d6a20edea650f --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/npumonitor.rs @@ -0,0 +1,60 @@ +use rustls::{ClientConnection, StreamOwned}; +use std::net::TcpStream; + +use anyhow::Result; + +#[path = "utils.rs"] +mod utils; + +#[derive(Debug)] +pub struct NpuMonitorConfig { + pub npu_monitor_start: bool, + pub npu_monitor_stop: bool, + pub report_interval_s: u32, + pub mspti_activity_kind: String, +} + +impl NpuMonitorConfig { + fn config(&self) -> String { + format!( + r#" +NPU_MONITOR_START={} +NPU_MONITOR_STOP={} +REPORT_INTERVAL_S={} +MSPTI_ACTIVITY_KIND={}"#, + self.npu_monitor_start, + self.npu_monitor_stop, + self.report_interval_s, + self.mspti_activity_kind + ) + } +} + +pub fn run_npumonitor( + mut client: StreamOwned<ClientConnection, TcpStream>, + config: NpuMonitorConfig, +) -> Result<()> { + let config_str = config.config(); + println!("Npu monitor config = \n{}", config_str); + let config_str = config_str.replace('\n', "\\n"); + + let request_json = format!( + r#" +{{ + "fn": "setKinetOnDemandRequest", + "config": "{}", + "job_id": 0, + "pids": [0], + "process_limit": 3 +}}"#, + config_str + ); + + utils::send_msg(&mut client, &request_json).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + Ok(()) +} diff --git a/msmonitor/dynolog_npu/cli/src/commands/nputrace.rs b/msmonitor/dynolog_npu/cli/src/commands/nputrace.rs new file mode 100644 index 0000000000000000000000000000000000000000..0095aab6ef60a353c6b294de3942a29f6f9af141 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/nputrace.rs @@ -0,0 +1,248 @@ +use std::net::TcpStream; +use 
rustls::{ClientConnection, 
StreamOwned}; + +use anyhow::Result; +use serde_json::Value; + +#[path = "utils.rs"] +mod utils; + +#[derive(Debug)] +pub enum NpuTraceTriggerConfig { + DurationBased { + profile_start_time: u64, + duration_ms: u64, + }, + IterationBased { + start_step: u64, + iterations: i64, + }, +} + +impl NpuTraceTriggerConfig { + fn config(&self) -> String { + match *self { + NpuTraceTriggerConfig::DurationBased { + profile_start_time, + duration_ms, + } => format!( + "PROFILE_START_TIME={}\nACTIVITIES_DURATION_MSECS={}", + profile_start_time, duration_ms + ), + NpuTraceTriggerConfig::IterationBased { + start_step, + iterations, + } => format!( + r#"PROFILE_START_ITERATION=0 +PROFILE_START_STEP={} +ACTIVITIES_ITERATIONS={}"#, + start_step, iterations + ), + } + } +} + +// torch npu profiler config +#[derive(Debug)] +pub struct NpuTraceOptions { + pub record_shapes: bool, + pub profile_memory: bool, + pub with_stack: bool, + pub with_flops: bool, + pub with_modules: bool, + pub activities: String, + pub analyse: bool, + pub profiler_level: String, + pub aic_metrics: String, + pub l2_cache: bool, + pub op_attr: bool, + pub msprof_tx: bool, + pub gc_detect_threshold: Option, + pub data_simplification: String, + pub export_type: String, +} + +impl NpuTraceOptions { + fn config(&self) -> String { + format!( + r#" +PROFILE_RECORD_SHAPES={} +PROFILE_PROFILE_MEMORY={} +PROFILE_WITH_STACK={} +PROFILE_WITH_FLOPS={} +PROFILE_WITH_MODULES={} +PROFILE_ACTIVITIES={} +PROFILE_ANALYSE={} +PROFILE_PROFILER_LEVEL={} +PROFILE_AIC_METRICS={} +PROFILE_L2_CACHE={} +PROFILE_OP_ATTR={} +PROFILE_MSPROF_TX={} +PROFILE_GC_DETECT_THRESHOLD={} +PROFILE_DATA_SIMPLIFICATION={} +PROFILE_EXPORT_TYPE={}"#, + self.record_shapes, + self.profile_memory, + self.with_stack, + self.with_flops, + self.with_modules, + self.activities, + self.analyse, + self.profiler_level, + self.aic_metrics, + self.l2_cache, + self.op_attr, + self.msprof_tx, + self.gc_detect_threshold.map_or("None".to_string(), |v| v.to_string()), + 
self.data_simplification, + self.export_type + ) + } +} + +#[derive(Debug)] +pub struct NpuTraceConfig { + pub log_file: String, + pub trigger_config: NpuTraceTriggerConfig, + pub trace_options: NpuTraceOptions, +} + +impl NpuTraceConfig { + fn config(&self) -> String { + format!( + "ACTIVITIES_LOG_FILE={}\n{}{}", + self.log_file, + self.trigger_config.config(), + self.trace_options.config() + ) + } +} + +pub fn run_nputrace( + mut client: StreamOwned<ClientConnection, TcpStream>, + job_id: u64, + pids: &str, + process_limit: u32, + config: NpuTraceConfig, +) -> Result<()> { + let config_str = config.config(); + println!("NpuTrace config = \n{}", config_str); + let config_str = config_str.replace('\n', "\\n"); + + let request_json = format!( + r#" +{{ + "fn": "setKinetOnDemandRequest", + "config": "{}", + "job_id": {}, + "pids": [{}], + "process_limit": {} +}}"#, + config_str, job_id, pids, process_limit + ); + + utils::send_msg(&mut client, &request_json).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + let resp_v: Value = serde_json::from_str(&resp_str)?; + let processes = resp_v["processesMatched"].as_array().unwrap(); + + if processes.is_empty() { + println!("No processes were matched, please check --job-id or --pids flags"); + } else { + println!("Matched {} processes", processes.len()); + println!("Trace output files will be written to:"); + + for pid in processes { + let pid = pid.as_i64().unwrap(); + println!( + " {}", + config.log_file.replace(".json", &format!("_{}.json", pid)) + ); + } + } + + Ok(()) +} + + +#[cfg(test)] +mod test { + use crate::*; + + #[test] + fn test_nputrace_trigger_config() { + let trigger_config = NpuTraceTriggerConfig::DurationBased { + profile_start_time: 1000, + duration_ms: 1000, + }; + assert_eq!( + trigger_config.config(), + r#"PROFILE_START_TIME=1000 +ACTIVITIES_DURATION_MSECS=1000"# + ); + + let trigger_config = 
NpuTraceTriggerConfig::IterationBased { + start_step: 1000, + iterations: 1000, + }; + assert_eq!( + trigger_config.config(), + r#"PROFILE_START_ITERATION=0 +PROFILE_START_STEP=1000 +ACTIVITIES_ITERATIONS=1000"# + ); + } + + #[test] + fn test_nputrace_config() { + let config = NpuTraceConfig { + log_file: "test.json".to_string(), + trigger_config: NpuTraceTriggerConfig::DurationBased { + profile_start_time: 1000, + duration_ms: 1000, + }, + trace_options: NpuTraceOptions { + record_shapes: true, + profile_memory: false, + with_stack: true, + with_flops: true, + with_modules: true, + activities: "CPU,NPU".to_string(), + analyse: false, + profiler_level: "Level0".to_string(), + aic_metrics: "AiCoreNone".to_string(), + l2_cache: true, + op_attr: true, + msprof_tx: true, + gc_detect_threshold: Some(0.1), + data_simplification: "true".to_string(), + export_type: "Text".to_string(), + }, + }; + assert_eq!( + config.config(), + r#"ACTIVITIES_LOG_FILE=test.json +PROFILE_START_TIME=1000 +ACTIVITIES_DURATION_MSECS=1000 +PROFILE_RECORD_SHAPES=true +PROFILE_PROFILE_MEMORY=false +PROFILE_WITH_STACK=true +PROFILE_WITH_FLOPS=true +PROFILE_WITH_MODULES=true +PROFILE_ACTIVITIES=CPU,NPU +PROFILE_ANALYSE=false +PROFILE_PROFILER_LEVEL=Level0 +PROFILE_AIC_METRICS=AiCoreNone +PROFILE_L2_CACHE=true +PROFILE_OP_ATTR=true +PROFILE_MSPROF_TX=true +PROFILE_GC_DETECT_THRESHOLD=0.1 +PROFILE_DATA_SIMPLIFICATION=true +PROFILE_EXPORT_TYPE=Text"# + ); + } +} diff --git a/msmonitor/dynolog_npu/cli/src/commands/status.rs b/msmonitor/dynolog_npu/cli/src/commands/status.rs new file mode 100644 index 0000000000000000000000000000000000000000..46a56b6c64582c1b710d7cf0d8beba0c87728525 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/status.rs @@ -0,0 +1,25 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +use rustls::{ClientConnection, StreamOwned}; +use std::net::TcpStream; + +use anyhow::Result; + +#[path = "utils.rs"] +mod utils; + +// This module contains the handling logic for dyno status + +/// Get system info +pub fn run_status(mut client: StreamOwned<ClientConnection, TcpStream>) -> Result<()> { + utils::send_msg(&mut client, r#"{"fn":"getStatus"}"#).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + Ok(()) +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/utils.rs b/msmonitor/dynolog_npu/cli/src/commands/utils.rs new file mode 100644 index 0000000000000000000000000000000000000000..ab78ec1a8ab35f75076715766a02ffb39a7682d9 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/utils.rs @@ -0,0 +1,33 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +use std::io::{Read, Write}; + +use anyhow::Result; + +pub fn send_msg<T: Write>(client: &mut T, msg: &str) -> Result<()> { + let msg_len: [u8; 4] = i32::try_from(msg.len()).unwrap().to_ne_bytes(); + + client.write_all(&msg_len)?; + client.write_all(msg.as_bytes()).map_err(|err| err.into()) +} + +pub fn get_resp<T: Read>(client: &mut T) -> Result<String> { + // Response is prefixed with length + let mut resp_len: [u8; 4] = [0; 4]; + client.read_exact(&mut resp_len)?; + + let resp_len = i32::from_ne_bytes(resp_len); + let resp_len = usize::try_from(resp_len).unwrap(); + + println!("response length = {}", resp_len); + + let mut resp_str = Vec::<u8>::new(); + resp_str.resize(resp_len, 0); + + client.read_exact(resp_str.as_mut_slice())?; + + String::from_utf8(resp_str).map_err(|err| err.into()) +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/commands/version.rs b/msmonitor/dynolog_npu/cli/src/commands/version.rs new file mode 100644 index 0000000000000000000000000000000000000000..5a29a85aaad3a7affe508e4c400de1b4e16beee0 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/commands/version.rs @@ -0,0 +1,24 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +use rustls::{ClientConnection, StreamOwned}; +use std::net::TcpStream; +use anyhow::Result; + +#[path = "utils.rs"] +mod utils; + +// This module contains the handling logic for querying dyno version + +/// Get version info +pub fn run_version(mut client: StreamOwned<ClientConnection, TcpStream>) -> Result<()> { + utils::send_msg(&mut client, r#"{"fn":"getVersion"}"#).expect("Error sending message to service"); + + let resp_str = utils::get_resp(&mut client).expect("Unable to decode output bytes"); + + println!("response = {}", resp_str); + + Ok(()) +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/cli/src/main.rs b/msmonitor/dynolog_npu/cli/src/main.rs new file mode 100644 index 0000000000000000000000000000000000000000..637e7c44f2d56a339a86382cb59e1475e55b43c6 --- /dev/null +++ b/msmonitor/dynolog_npu/cli/src/main.rs @@ -0,0 +1,431 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. +use std::fs::File; +use std::io::BufReader; +use rustls::{Certificate, RootCertStore, PrivateKey, ClientConnection, StreamOwned}; +use std::sync::Arc; +use std::net::TcpStream; +use std::net::ToSocketAddrs; +use std::path::PathBuf; +use std::io; + +use anyhow::Result; +use clap::Parser; +use std::collections::HashSet; + +// Make all the command modules accessible to this file. +mod commands; +use commands::gputrace::GpuTraceConfig; +use commands::gputrace::GpuTraceOptions; +use commands::gputrace::GpuTraceTriggerConfig; +use commands::nputrace::NpuTraceConfig; +use commands::nputrace::NpuTraceOptions; +use commands::nputrace::NpuTraceTriggerConfig; +use commands::npumonitor::NpuMonitorConfig; +use commands::*; + +/// Instructions on adding a new Dyno CLI command: +/// +/// 1. Add a new variant to the `Command` enum. +/// Please include a description of the command and, if applicable, its flags/subcommands. +/// +/// 2. 
Create a new file for the command's implementation in the commands/ directory (e.g. +/// commands/status.rs). This new file is where the command should be implemented. +/// Make the new command's module accessible from this file by adding +/// a new line with `pub mod ;` to commands/mod.rs. +/// +/// +/// 3. Add a branch to the match statement in main() to handle the new enum variant (from step 1). +/// From here, invoke the handling logic defined in the new file (from step 2). In an effort to keep +/// the command dispatching logic clear and concise, please keep the code in the match branch to a minimum. + +const DYNO_PORT: u16 = 1778; + +#[derive(Debug, Parser)] +struct Opts { + #[clap(long, default_value = "localhost")] + hostname: String, + #[clap(long, default_value_t = DYNO_PORT)] + port: u16, + #[clap(long, required = true)] + certs_dir: String, + #[clap(subcommand)] + cmd: Command, +} + +const ALLOWED_VALUES: &[&str] = &["Marker", "Kernel", "API", "Hccl", "Memory", "MemSet", "MemCpy"]; + +fn parse_mspti_activity_kinds(src: &str) -> Result<String, String> { + let allowed_values: HashSet<&str> = ALLOWED_VALUES.iter().cloned().collect(); + + let kinds: Vec<&str> = src.split(',').map(|s| s.trim()).collect(); + + for kind in &kinds { + if !allowed_values.contains(kind) { + return Err(format!("Invalid MSPTI activity kind: {}. Possible values: {:?}.", kind, ALLOWED_VALUES)); + } + } + + Ok(src.to_string()) +} + +#[derive(Debug, Parser)] +enum Command { + /// Check the status of a dynolog process + Status, + /// Check the version of a dynolog process + Version, + /// Capture gputrace + Gputrace { + /// Job id of the application to trace. + #[clap(long, default_value_t = 0)] + job_id: u64, + /// List of pids to capture trace for (comma separated). + #[clap(long, default_value = "0")] + pids: String, + /// Duration of trace to collect in ms. + #[clap(long, default_value_t = 500)] + duration_ms: u64, + /// Training iterations to collect, this takes precedence over duration. 
+ #[clap(long, default_value_t = -1)] + iterations: i64, + /// Log file for trace. + #[clap(long)] + log_file: String, + /// Unix timestamp used for synchronized collection (milliseconds since epoch). + #[clap(long, default_value_t = 0)] + profile_start_time: u64, + /// Start iteration roundup, starts an iteration based trace at a multiple + /// of this value. + #[clap(long, default_value_t = 1)] + profile_start_iteration_roundup: u64, + /// Max number of processes to profile. + #[clap(long, default_value_t = 3)] + process_limit: u32, + /// Record PyTorch operator input shapes and types. + #[clap(long, action)] + record_shapes: bool, + /// Profile PyTorch memory. + #[clap(long, action)] + profile_memory: bool, + /// Capture Python stacks in traces. + #[clap(long, action)] + with_stacks: bool, + /// Annotate operators with analytical flops. + #[clap(long, action)] + with_flops: bool, + /// Capture PyTorch operator modules in traces. + #[clap(long, action)] + with_modules: bool, + }, + /// Capture nputrace. Subcommand functions aligned with Ascend Torch Profiler. + Nputrace { + /// Job id of the application to trace. + #[clap(long, default_value_t = 0)] + job_id: u64, + /// List of pids to capture trace for (comma separated). + #[clap(long, default_value = "0")] + pids: String, + /// Duration of trace to collect in ms. + #[clap(long, default_value_t = 500)] + duration_ms: u64, + /// Training iterations to collect, this takes precedence over duration. + #[clap(long, default_value_t = -1)] + iterations: i64, + /// Log file for trace. + #[clap(long)] + log_file: String, + /// Unix timestamp used for synchronized collection (milliseconds since epoch). + #[clap(long, default_value_t = 0)] + profile_start_time: u64, + /// Number of steps to start profile. + #[clap(long, default_value_t = 0)] + start_step: u64, + /// Max number of processes to profile. + #[clap(long, default_value_t = 3)] + process_limit: u32, + /// Whether to record PyTorch operator input shapes and types. 
+ #[clap(long, action)] + record_shapes: bool, + /// Whether to profile PyTorch memory. + #[clap(long, action)] + profile_memory: bool, + /// Whether to profile the Python call stack in trace. + #[clap(long, action)] + with_stack: bool, + /// Annotate operators with analytical flops. + #[clap(long, action)] + with_flops: bool, + /// Whether to profile PyTorch operator modules in traces. + #[clap(long, action)] + with_modules: bool, + /// The scope of the profile's events. + #[clap(long, value_parser = ["CPU,NPU", "NPU,CPU", "CPU", "NPU"], default_value = "CPU,NPU")] + activities: String, + /// Profiler level. + #[clap(long, value_parser = ["Level0", "Level1", "Level2", "Level_none"], default_value = "Level0")] + profiler_level: String, + /// AIC metrics. + #[clap(long, value_parser = ["AiCoreNone", "PipeUtilization", "ArithmeticUtilization", "Memory", "MemoryL0", "ResourceConflictRatio", "MemoryUB", "L2Cache", "MemoryAccess"], default_value = "AiCoreNone")] + aic_metrics: String, + /// Whether to analyse the data after collection. + #[clap(long, action)] + analyse: bool, + /// Whether to collect L2 cache. + #[clap(long, action)] + l2_cache: bool, + /// Whether to collect op attributes. + #[clap(long, action)] + op_attr: bool, + /// Whether to enable MSTX. + #[clap(long, action)] + msprof_tx: bool, + /// GC detect threshold. + #[clap(long)] + gc_detect_threshold: Option, + /// Whether to streamline data after analyse is complete. + #[clap(long, value_parser = ["true", "false"], default_value = "true")] + data_simplification: String, + /// Types of data exported by the profiler. + #[clap(long, value_parser = ["Text", "Db"], default_value = "Text")] + export_type: String, + }, + /// Ascend MSPTI Monitor + NpuMonitor { + /// Start NPU monitor. + #[clap(long, action)] + npu_monitor_start: bool, + /// Stop NPU monitor. + #[clap(long, action)] + npu_monitor_stop: bool, + /// NPU monitor report interval in seconds. 
+ #[clap(long, default_value_t = 60)] + report_interval_s: u32, + /// MSPTI collect activity kind + #[clap(long, value_parser = parse_mspti_activity_kinds, default_value = "Marker")] + mspti_activity_kind: String, + }, + /// Pause dcgm profiling. This enables running tools like Nsight compute and avoids conflicts. + DcgmPause { + /// Duration to pause dcgm profiling in seconds + #[clap(long, default_value_t = 300)] + duration_s: i32, + }, + /// Resume dcgm profiling + DcgmResume, +} + +struct ClientConfigPath { + cert_path: PathBuf, + key_path: PathBuf, + ca_cert_path: PathBuf, +} + +fn create_dyno_client( + host: &str, + port: u16, + config: &ClientConfigPath +) -> Result<StreamOwned<ClientConnection, TcpStream>> { + let addr = (host, port) + .to_socket_addrs()? + .next() + .ok_or_else(|| io::Error::new( + io::ErrorKind::NotFound, + "Could not resolve the host address" + ))?; + + let stream = TcpStream::connect(addr)?; + + println!("Loading CA cert from: {}", config.ca_cert_path.display()); + let mut root_store = RootCertStore::empty(); + let ca_file = File::open(&config.ca_cert_path)?; + let mut ca_reader = BufReader::new(ca_file); + let ca_certs = rustls_pemfile::certs(&mut ca_reader)?; + for ca_cert in ca_certs { + root_store.add(&Certificate(ca_cert))?; + } + + println!("Loading client cert from: {}", config.cert_path.display()); + let cert_file = File::open(&config.cert_path)?; + let mut cert_reader = BufReader::new(cert_file); + let certs = rustls_pemfile::certs(&mut cert_reader)? 
+ .into_iter() + .map(Certificate) + .collect(); + + println!("Loading client key from: {}", config.key_path.display()); + let key_file = File::open(&config.key_path)?; + let mut key_reader = BufReader::new(key_file); + let keys = rustls_pemfile::pkcs8_private_keys(&mut key_reader)?; + if keys.is_empty() { + return Err(io::Error::new( + io::ErrorKind::InvalidData, + "No private key found in the key file" + ).into()); + } + let key = PrivateKey(keys[0].clone()); + + let config = rustls::ClientConfig::builder() + .with_safe_defaults() + .with_root_certificates(root_store) + .with_client_auth_cert(certs, key)?; + + let server_name = rustls::ServerName::try_from(host) + .map_err(|e| io::Error::new( + io::ErrorKind::InvalidInput, + format!("Invalid hostname: {}", e) + ))?; + + let conn = rustls::ClientConnection::new( + Arc::new(config), + server_name + )?; + + // Return the TLS stream + Ok(StreamOwned::new(conn, stream)) +} + +fn main() -> Result<()> { + let Opts { + hostname, + port, + certs_dir, + cmd, + } = Opts::parse(); + + let certs_dir = PathBuf::from(&certs_dir); + + let config = ClientConfigPath { + cert_path: certs_dir.join("client.crt"), + key_path: certs_dir.join("client.key"), + ca_cert_path: certs_dir.join("ca.crt"), + }; + + let client = create_dyno_client(&hostname, port, &config) + .expect("Couldn't connect to the server..."); + + match cmd { + Command::Status => status::run_status(client), + Command::Version => version::run_version(client), + Command::Gputrace { + job_id, + pids, + log_file, + duration_ms, + iterations, + profile_start_time, + profile_start_iteration_roundup, + process_limit, + record_shapes, + profile_memory, + with_stacks, + with_flops, + with_modules, + } => { + let trigger_config = if iterations > 0 { + GpuTraceTriggerConfig::IterationBased { + profile_start_iteration_roundup, + iterations, + } + } else { + GpuTraceTriggerConfig::DurationBased { + profile_start_time, + duration_ms, + } + }; + let trace_options = GpuTraceOptions { +
record_shapes, + profile_memory, + with_stacks, + with_flops, + with_modules, + }; + let trace_config = GpuTraceConfig { + log_file, + trigger_config, + trace_options, + }; + gputrace::run_gputrace(client, job_id, &pids, process_limit, trace_config) + } + Command::Nputrace { + job_id, + pids, + log_file, + duration_ms, + iterations, + profile_start_time, + start_step, + process_limit, + record_shapes, + profile_memory, + with_stack, + with_flops, + with_modules, + activities, + analyse, + profiler_level, + aic_metrics, + l2_cache, + op_attr, + msprof_tx, + gc_detect_threshold, + data_simplification, + export_type, + } => { + let trigger_config = if iterations > 0 { + NpuTraceTriggerConfig::IterationBased { + start_step, + iterations, + } + } else { + NpuTraceTriggerConfig::DurationBased { + profile_start_time, + duration_ms, + } + }; + + let trace_options = NpuTraceOptions { + record_shapes, + profile_memory, + with_stack, + with_flops, + with_modules, + activities, + analyse, + profiler_level, + aic_metrics, + l2_cache, + op_attr, + msprof_tx, + gc_detect_threshold, + data_simplification, + export_type, + }; + let trace_config = NpuTraceConfig { + log_file, + trigger_config, + trace_options, + }; + nputrace::run_nputrace(client, job_id, &pids, process_limit, trace_config) + } + Command::NpuMonitor { + npu_monitor_start, + npu_monitor_stop, + report_interval_s, + mspti_activity_kind, + } => { + let npu_mon_config = NpuMonitorConfig { + npu_monitor_start, + npu_monitor_stop, + report_interval_s, + mspti_activity_kind + }; + npumonitor::run_npumonitor(client, npu_mon_config) + } + Command::DcgmPause { duration_s } => dcgm::run_dcgm_pause(client, duration_s), + Command::DcgmResume => dcgm::run_dcgm_resume(client), + // ... 
add new commands here + } +} \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/CMakeLists.txt b/msmonitor/dynolog_npu/dynolog/src/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..dfa337ec532df3eeca520e2754b9deb1fa7dea88 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/CMakeLists.txt @@ -0,0 +1,71 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. + +set(CMAKE_SKIP_RPATH TRUE) + +cmake_minimum_required(VERSION 3.16) +add_definitions(-DDYNOLOG_VERSION=${DYNOLOG_VERSION} -DDYNOLOG_GIT_REV=${DYNOLOG_GIT_REV}) + +message("Use Prometheus = ${USE_PROMETHEUS}") +message("Use ODS Graph API = ${USE_ODS_GRAPH_API}") + +# our build script will first create a src/ dir where all source code will exist +file (GLOB dynolog_src "*.h" "*.cpp") + +# Remove main from library, only needed for exec. +list(REMOVE_ITEM dynolog_src "${CMAKE_CURRENT_SOURCE_DIR}/Main.cpp") +add_library(dynolog_lib ${dynolog_src}) + +if(USE_ODS_GRAPH_API) + target_compile_options(dynolog_lib PUBLIC "-DUSE_GRAPH_ENDPOINT") +endif() + +if(USE_PROMETHEUS) + find_package(prometheus-cpp CONFIG REQUIRED) + add_definitions(-DUSE_PROMETHEUS) + target_link_libraries(dynolog_lib PRIVATE prometheus-cpp::pull) +endif() + +target_compile_options(dynolog_lib PRIVATE + -fPIC + -fstack-protector-all + -ftrapv +) + +target_link_options(dynolog_lib PRIVATE + -Wl,-z,relro,-z,now,-z,noexecstack + -s +) + +target_link_libraries(dynolog_lib PUBLIC Monitor) +target_link_libraries(dynolog_lib PUBLIC BuiltinMetrics) + +add_subdirectory(rpc) + +add_subdirectory(ipcfabric) +target_link_libraries(dynolog_lib PUBLIC dynolog_ipcfabric_lib) + +# depends on ipcfabric +add_subdirectory(tracing) +target_link_libraries(dynolog_lib PUBLIC dynolog_ipcmonitor_lib) + +add_subdirectory(gpumon) +target_link_libraries(dynolog_lib PUBLIC dynolog_dcgm_lib "-ldl") + +add_subdirectory(rdmamon) +target_link_libraries(dynolog_lib PUBLIC dynolog_rdmamon_lib) + 
+add_subdirectory(metric_frame) + +add_executable(dynolog Main.cpp) +target_link_libraries(dynolog PRIVATE dynolog_lib dynolog_rpc_lib) + +target_compile_options(dynolog PRIVATE + -fPIC + -fstack-protector-all + -ftrapv +) + +target_link_options(dynolog PRIVATE + -Wl,-z,relro,-z,now,-z,noexecstack + -s +) \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/Main.cpp b/msmonitor/dynolog_npu/dynolog/src/Main.cpp new file mode 100644 index 0000000000000000000000000000000000000000..758d9db3ed9a2a153d94ee9f167811cc0d9a69f8 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/Main.cpp @@ -0,0 +1,212 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +// Dynolog : A portable telemetry monitoring daemon. + +#include +#include +#include +#include +#include +#include "dynolog/src/CompositeLogger.h" +#include "dynolog/src/FBRelayLogger.h" +#include "dynolog/src/KernelCollector.h" +#include "dynolog/src/Logger.h" +#include "dynolog/src/ODSJsonLogger.h" +#include "dynolog/src/PerfMonitor.h" +#include "dynolog/src/ScubaLogger.h" +#include "dynolog/src/ServiceHandler.h" +#include "dynolog/src/gpumon/DcgmGroupInfo.h" +#include "dynolog/src/rpc/SimpleJsonServer.h" +#include "dynolog/src/rpc/SimpleJsonServerInl.h" +#include "dynolog/src/tracing/IPCMonitor.h" +#include "hbt/src/perf_event/BuiltinMetrics.h" + +#ifdef USE_PROMETHEUS +#include "dynolog/src/PrometheusLogger.h" +#endif + +using namespace dynolog; +using json = nlohmann::json; +namespace hbt = facebook::hbt; + +DEFINE_int32(port, 1778, "Port for listening RPC requests."); +DEFINE_bool(use_JSON, false, "Emit metrics to JSON file through JSON logger"); +#ifdef USE_PROMETHEUS +DEFINE_bool(use_prometheus, false, "Emit metrics to Prometheus"); +#endif +DEFINE_bool(use_fbrelay, false, "Emit metrics to FB Relay on Lab machines"); +DEFINE_bool(use_ODS, false, "Emit 
metrics to ODS through ODS logger"); +DEFINE_bool(use_scuba, false, "Emit metrics to Scuba through Scuba logger"); +DEFINE_int32( + kernel_monitor_reporting_interval_s, + 60, + "Duration in seconds to read and report metrics for kernel monitor"); +DEFINE_int32( + perf_monitor_reporting_interval_s, + 60, + "Duration in seconds to read and report metrics for performance monitor"); +DEFINE_int32( + dcgm_reporting_interval_s, + 10, + "Duration in seconds to read and report metrics for DCGM"); +DEFINE_bool( + enable_ipc_monitor, + false, + "Enable IPC monitor for on-system tracing requests."); +DEFINE_bool( + enable_gpu_monitor, + false, + "Enable GPU monitoring, currently supports NVIDIA GPUs."); +DEFINE_bool(enable_perf_monitor, false, "Enable heartbeat perf monitoring."); + +std::unique_ptr getLogger(const std::string& scribe_category = "") { + std::vector> loggers; +#ifdef USE_PROMETHEUS + if (FLAGS_use_prometheus) { + loggers.push_back(std::make_unique()); + } +#endif + if (FLAGS_use_fbrelay) { + loggers.push_back(std::make_unique()); + } + if (FLAGS_use_ODS) { + loggers.push_back(std::make_unique()); + } + if (FLAGS_use_JSON) { + loggers.push_back(std::make_unique()); + } + if (FLAGS_use_scuba && !scribe_category.empty()) { + loggers.push_back(std::make_unique(scribe_category)); + } + return std::make_unique(std::move(loggers)); +} + +auto next_wakeup(int sec) { + return std::chrono::steady_clock::now() + std::chrono::seconds(sec); +} + +void kernel_monitor_loop() { + KernelCollector kc; + + LOG(INFO) << "Running kernel monitor loop : interval = " + << FLAGS_kernel_monitor_reporting_interval_s << " s."; + + while (1) { + auto logger = getLogger(); + auto wakeup_timepoint = + next_wakeup(FLAGS_kernel_monitor_reporting_interval_s); + + kc.step(); + kc.log(*logger); + logger->finalize(); + + /* sleep override */ + std::this_thread::sleep_until(wakeup_timepoint); + } +} + +void perf_monitor_loop() { + PerfMonitor pm( + hbt::CpuSet::makeAllOnline(), +
std::vector{"instructions", "cycles"}, + getDefaultPmuDeviceManager(), + getDefaultMetrics()); + + LOG(INFO) << "Running perf monitor loop : interval = " + << FLAGS_perf_monitor_reporting_interval_s << " s."; + + while (1) { + auto logger = getLogger(); + auto wakeup_timepoint = + next_wakeup(FLAGS_perf_monitor_reporting_interval_s); + + pm.step(); + pm.log(*logger); + + logger->finalize(); + /* sleep override */ + std::this_thread::sleep_until(wakeup_timepoint); + } +} + +auto setup_server(std::shared_ptr handler) { + return std::make_unique>( + handler, FLAGS_port); +} + +void gpu_monitor_loop(std::shared_ptr dcgm) { + auto logger = getLogger(FLAGS_scribe_category); + + LOG(INFO) << "Running DCGM loop : interval = " + << FLAGS_dcgm_reporting_interval_s << " s."; + LOG(INFO) << "DCGM fields: " << gpumon::FLAGS_dcgm_fields; + + while (1) { + auto wakeup_timepoint = next_wakeup(FLAGS_dcgm_reporting_interval_s); + + dcgm->update(); + dcgm->log(*logger); + + /* sleep override */ + std::this_thread::sleep_until(wakeup_timepoint); + } +} + +int main(int argc, char** argv) { + gflags::ParseCommandLineFlags(&argc, &argv, true); + FLAGS_logtostderr = 1; + google::InitGoogleLogging(argv[0]); + + LOG(INFO) << "Starting Ascend Extension for dynolog, version = " DYNOLOG_VERSION + << ", build git-hash = " DYNOLOG_GIT_REV; + + std::shared_ptr dcgm; + + std::unique_ptr ipcmon; + std::unique_ptr ipcmon_thread, data_ipcmon_thread, gpumon_thread, pm_thread; + + if (FLAGS_enable_ipc_monitor) { + LOG(INFO) << "Starting IPC Monitor"; + ipcmon = std::make_unique(); + ipcmon->setLogger(std::move(getLogger())); + ipcmon_thread = + std::make_unique([&ipcmon]() { ipcmon->loop(); }); + data_ipcmon_thread = + std::make_unique([&ipcmon]() { ipcmon->dataLoop(); }); + } + + if (FLAGS_enable_gpu_monitor) { + dcgm = gpumon::DcgmGroupInfo::factory( + gpumon::FLAGS_dcgm_fields, FLAGS_dcgm_reporting_interval_s * 1000); + gpumon_thread = std::make_unique(gpu_monitor_loop, dcgm); + } + std::thread 
km_thread{kernel_monitor_loop}; + if (FLAGS_enable_perf_monitor) { + pm_thread = std::make_unique(perf_monitor_loop); + } + + // setup service + auto handler = std::make_shared(dcgm); + + // use simple json RPC server for now + auto server = setup_server(handler); + server->run(); + + if (km_thread.joinable()) { + km_thread.join(); + } + + if (pm_thread && pm_thread->joinable()) { + pm_thread->join(); + } + if (gpumon_thread && gpumon_thread->joinable()) { + gpumon_thread->join(); + } + + server->stop(); + + return 0; +} diff --git a/msmonitor/dynolog_npu/dynolog/src/Metric.cpp b/msmonitor/dynolog_npu/dynolog/src/Metric.cpp new file mode 100644 index 0000000000000000000000000000000000000000..f6fd4d80de13f3819abc0e519e31c0890bd8c141 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/Metric.cpp @@ -0,0 +1,37 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. 
+ +#include "dynolog/src/Metrics.h" + +#include +#include + +namespace dynolog { + +const std::vector getAllMetrics() { + static std::vector metrics_ = { + {.name = "kindName", + .type = MetricType::Instant, + .desc = "Report data kind name"}, + {.name = "duration", + .type = MetricType::Delta, + .desc = "Total execution time for corresponding kind"}, + {.name = "timestamp", + .type = MetricType::Instant, + .desc = "The timestamp of the reported data"}, + {.name = "deviceId", + .type = MetricType::Instant, + .desc = "The ID of the device for reporting data"}, + }; + return metrics_; +} + +// These metrics are dynamic per network drive +const std::vector getNetworkMetrics() { + static std::vector metrics_ = {}; + return metrics_; +} + +} // namespace dynolog \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/rpc/CMakeLists.txt b/msmonitor/dynolog_npu/dynolog/src/rpc/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..a0b74f82cf9be5cec400e6477183b15a52b76cdc --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/rpc/CMakeLists.txt @@ -0,0 +1,20 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. +find_package(OpenSSL REQUIRED) + +add_library(dynolog_rpc_lib STATIC + SimpleJsonServer.cpp SimpleJsonServer.h + ${CMAKE_CURRENT_SOURCE_DIR}/../ServiceHandler.h +) +target_include_directories(dynolog_rpc_lib + INTERFACE ${CMAKE_CURRENT_SOURCE_DIR} +) + +target_include_directories(dynolog_rpc_lib + PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/.. 
+) +target_link_libraries(dynolog_rpc_lib PRIVATE dynolog_lib) +target_link_libraries(dynolog_rpc_lib PUBLIC gflags::gflags) +target_link_libraries(dynolog_rpc_lib PUBLIC glog::glog) +target_link_libraries(dynolog_rpc_lib PUBLIC nlohmann_json::nlohmann_json) +target_link_libraries(dynolog_rpc_lib PUBLIC fmt::fmt) +target_link_libraries(dynolog_rpc_lib PRIVATE OpenSSL::SSL OpenSSL::Crypto) \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.cpp b/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.cpp new file mode 100644 index 0000000000000000000000000000000000000000..17a6d42895b8d81a8818defa6defd3a5f3ffd1c6 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.cpp @@ -0,0 +1,290 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +#include "dynolog/src/rpc/SimpleJsonServer.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include + +DEFINE_string(certs_dir, "", "TLS certs dir"); + +constexpr int CLIENT_QUEUE_LEN = 50; + +namespace dynolog { + +SimpleJsonServerBase::SimpleJsonServerBase(int port) : port_(port) { + initSocket(); + init_openssl(); + ctx_ = create_context(); + configure_context(ctx_); +} + +SimpleJsonServerBase::~SimpleJsonServerBase() { + if (thread_) { + stop(); + } + close(sock_fd_); +} + +void SimpleJsonServerBase::initSocket() { + struct sockaddr_in6 server_addr; + + /* Create socket for listening (client requests).*/ + sock_fd_ = ::socket(AF_INET6, SOCK_STREAM, 0); + if (sock_fd_ == -1) { + std::perror("socket()"); + return; + } + + /* Set socket to reuse address in case server is restarted.*/ + int flag = 1; + int ret = + ::setsockopt(sock_fd_, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(flag)); + if (ret == -1) { + std::perror("setsockopt()"); + return; + } + + // in6addr_any allows us to bind to both IPv4
and IPv6 clients. + server_addr.sin6_addr = in6addr_any; + server_addr.sin6_family = AF_INET6; + server_addr.sin6_port = htons(port_); + + /* Bind address and socket together */ + ret = ::bind(sock_fd_, (struct sockaddr*)&server_addr, sizeof(server_addr)); + if (ret == -1) { + std::perror("bind()"); + close(sock_fd_); + return; + } + + /* Create listening queue (client requests) */ + ret = ::listen(sock_fd_, CLIENT_QUEUE_LEN); + if (ret == -1) { + std::perror("listen()"); + close(sock_fd_); + return; + } + + /* Get port if assigned 0 */ + if (port_ == 0) { + socklen_t len_out = sizeof(server_addr); + ret = ::getsockname(sock_fd_, (struct sockaddr*)&server_addr, &len_out); + if (ret < 0 || len_out != sizeof(server_addr)) { + std::perror("getsockname()"); + } else { + port_ = ntohs(server_addr.sin6_port); + LOG(INFO) << "System assigned port = " << ntohs(server_addr.sin6_port); + } + } + + LOG(INFO) << "Listening to connections on port " << port_; + initSuccess_ = true; +} + +/* A simple wrapper to accept connections and read data + * + * Messages are prefixed using the length so we know how long a message + * to actually read. 
+ * : int32_t len + * : char json[] + */ +class ClientSocketWrapper { + public: + ~ClientSocketWrapper() { + if (ssl_) { + SSL_shutdown(ssl_); + SSL_free(ssl_); + } + if (client_sock_fd_ != -1) { + ::close(client_sock_fd_); + } + } + + bool accept(int server_socket, SSL_CTX* ctx) { + struct sockaddr_in6 client_addr; + socklen_t client_addr_len = sizeof(client_addr); + std::array client_addr_str; + + client_sock_fd_ = ::accept( + server_socket, (struct sockaddr*)&client_addr, &client_addr_len); + if (client_sock_fd_ == -1) { + std::perror("accept()"); + return false; + } + + inet_ntop( + AF_INET6, + &(client_addr.sin6_addr), + client_addr_str.data(), + client_addr_str.size()); + LOG(INFO) << "Received connection from " << client_addr_str.data(); + + ssl_ = SSL_new(ctx); + SSL_set_fd(ssl_, client_sock_fd_); + if (SSL_accept(ssl_) <= 0) { + ERR_print_errors_fp(stderr); + return false; + } + LOG(INFO) << "SSL handshake success"; + return true; + } + + std::string get_message() { + int32_t msg_size = -1; + if (!read_helper((uint8_t*)&msg_size, sizeof(msg_size)) || msg_size <= 0) { + LOG(ERROR) << "Invalid message size = " << msg_size; + return ""; + } + std::string message; + message.resize(msg_size); + int recv = 0; + int ret = 1; + while (recv < msg_size && ret > 0) { + ret = read_helper((uint8_t*)&message[recv], msg_size - recv); + recv += ret > 0 ? 
ret : 0; + } + if (recv != msg_size) { + LOG(ERROR) << "Received partial message, expected size " << msg_size + << " found : " << recv; + LOG(ERROR) << "Message received = " << message; + return ""; + } + return message; + } + + bool send_response(const std::string& response) { + int32_t size = response.size(); + int ret = SSL_write(ssl_, (void*)&size, sizeof(size)); + if (ret <= 0) { + ERR_print_errors_fp(stderr); + return false; + } + int sent = 0; + while (sent < size && ret > 0) { + ret = SSL_write(ssl_, (void*)&response[sent], size - sent); + if (ret <= 0) { + ERR_print_errors_fp(stderr); + } else { + sent += ret; + } + } + if (sent < response.size()) { + LOG(ERROR) << "Unable to write full response"; + return false; + } + return ret > 0; + } + + private: + int read_helper(uint8_t* buf, int size) { + int ret = SSL_read(ssl_, (void*)buf, size); + if (ret <= 0) { + ERR_print_errors_fp(stderr); + } + return ret; + } + + int client_sock_fd_ = -1; + SSL* ssl_ = nullptr; +}; + +/* Accepts socket connections and processes the payloads. 
+ * This will in turn call the Handler functions*/ +void SimpleJsonServerBase::loop() noexcept { + if (sock_fd_ == -1 || !initSuccess_) { + return; + } + + while (run_) { + processOne(); + } +} + +void SimpleJsonServerBase::processOne() noexcept { + LOG(INFO) << "Waiting for connection."; + ClientSocketWrapper client; + if (!client.accept(sock_fd_, ctx_)) { + return; + } + std::string request_str = client.get_message(); + LOG(INFO) << "RPC message received = " << request_str; + auto response_str = processOneImpl(request_str); + if (response_str.empty()) { + return; + } + if (!client.send_response(response_str)) { + LOG(ERROR) << "Failed to send response"; + } +} + +void SimpleJsonServerBase::run() { + LOG(INFO) << "Launching RPC thread"; + thread_ = std::make_unique([this]() { this->loop(); }); +} + +void SimpleJsonServerBase::init_openssl() +{ + SSL_load_error_strings(); + OpenSSL_add_ssl_algorithms(); +} + +SSL_CTX* SimpleJsonServerBase::create_context() +{ + const SSL_METHOD* method = TLS_server_method(); + SSL_CTX* ctx = SSL_CTX_new(method); + if (!ctx) { + perror("Unable to create SSL context"); + ERR_print_errors_fp(stderr); + exit(EXIT_FAILURE); + } + return ctx; +} + +void SimpleJsonServerBase::configure_context(SSL_CTX* ctx) +{ + if (FLAGS_certs_dir.empty()) { + LOG(ERROR) << "--certs-dir must be specified!"; + exit(EXIT_FAILURE); + } + + std::string certs_dir = FLAGS_certs_dir; + if (!certs_dir.empty() && certs_dir.back() != '/') + certs_dir += '/'; + + std::string server_cert = certs_dir + "server.crt"; + std::string server_key = certs_dir + "server.key"; + std::string ca_cert = certs_dir + "ca.crt"; + + LOG(INFO) << "Loading server cert: " << server_cert; + LOG(INFO) << "Loading server key: " << server_key; + LOG(INFO) << "Loading CA cert: " << ca_cert; + + // Load the server certificate + if (SSL_CTX_use_certificate_file(ctx, server_cert.c_str(), SSL_FILETYPE_PEM) <= 0) { + ERR_print_errors_fp(stderr); + exit(EXIT_FAILURE); + } + // Load the server private key + if
(SSL_CTX_use_PrivateKey_file(ctx, server_key.c_str(), SSL_FILETYPE_PEM) <= 0 ) { + ERR_print_errors_fp(stderr); + exit(EXIT_FAILURE); + } + // Load the CA certificate used to verify client certificates + if (SSL_CTX_load_verify_locations(ctx, ca_cert.c_str(), NULL) <= 0) { + ERR_print_errors_fp(stderr); + exit(EXIT_FAILURE); + } + // Require the client to present a certificate + SSL_CTX_set_verify(ctx, SSL_VERIFY_PEER | SSL_VERIFY_FAIL_IF_NO_PEER_CERT, NULL); +} + +} // namespace dynolog \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.h b/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.h new file mode 100644 index 0000000000000000000000000000000000000000..df5d66f75b54e88dd4c0dff01b7c28ef545cb106 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/rpc/SimpleJsonServer.h @@ -0,0 +1,71 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include "dynolog/src/ServiceHandler.h" + +DECLARE_string(certs_dir); + +namespace dynolog { + +// This is a simple service built using UNIX Sockets +// with remote procedure calls implemented via JSON string.
+ +class SimpleJsonServerBase { + public: + explicit SimpleJsonServerBase(int port); + virtual ~SimpleJsonServerBase(); + + int getPort() const { + return port_; + } + + bool initSuccessful() const { + return initSuccess_; + } + // spin up a new thread to process requests + void run(); + + void stop() { + run_ = 0; + thread_->join(); + } + + // synchronously processes a request + void processOne() noexcept; + + protected: + void initSocket(); + void init_openssl(); + SSL_CTX* create_context(); + void configure_context(SSL_CTX* ctx); + + // process requests in a loop + void loop() noexcept; + + // implement processing of request using the handler + virtual std::string processOneImpl(const std::string& request_str) { + return ""; + } + + int port_; + int sock_fd_{-1}; + bool initSuccess_{false}; + + std::atomic run_{true}; + std::unique_ptr thread_; + + SSL_CTX* ctx_{nullptr}; +}; + +} // namespace dynolog \ No newline at end of file diff --git a/msmonitor/dynolog_npu/dynolog/src/tracing/CMakeLists.txt b/msmonitor/dynolog_npu/dynolog/src/tracing/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..4afd436bcc378db13f6b925fbd319c7b381a5f2b --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/tracing/CMakeLists.txt @@ -0,0 +1,16 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. + +add_library (dynolog_ipcmonitor_lib IPCMonitor.cpp IPCMonitor.h + ${CMAKE_CURRENT_SOURCE_DIR}/../LibkinetoConfigManager.h +) + +target_include_directories(dynolog_ipcmonitor_lib + INTERFACE ${CMAKE_CURRENT_SOURCE_DIR} +) +target_include_directories(dynolog_ipcmonitor_lib + PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/..
+) + +target_link_libraries(dynolog_ipcmonitor_lib PUBLIC glog::glog) +target_link_libraries(dynolog_ipcmonitor_lib PUBLIC dynolog_ipcfabric_lib) +target_link_libraries(dynolog_ipcmonitor_lib PUBLIC nlohmann_json::nlohmann_json) diff --git a/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.cpp b/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.cpp new file mode 100644 index 0000000000000000000000000000000000000000..811bae4e0dea1b72b6512f7d3e1819433cb1b14a --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.cpp @@ -0,0 +1,180 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +#include "dynolog/src/tracing/IPCMonitor.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "dynolog/src/LibkinetoConfigManager.h" +#include "dynolog/src/ipcfabric/Utils.h" + +namespace dynolog { +namespace tracing { + +constexpr int kSleepUs = 10000; +constexpr int kDataMsgSleepUs = 1000; +const std::string kLibkinetoRequest = "req"; +const std::string kLibkinetoContext = "ctxt"; +const std::string kLibkinetoData = "data"; + +IPCMonitor::IPCMonitor(const std::string& ipc_fabric_name) { + ipc_manager_ = FabricManager::factory(ipc_fabric_name); + data_ipc_manager_ = FabricManager::factory(ipc_fabric_name + "_data"); + // below ensures singleton exists + LOG(INFO) << "Kineto config manager : active processes = " + << LibkinetoConfigManager::getInstance()->processCount("0"); +} + +void IPCMonitor::loop() { + while (ipc_manager_) { + if (ipc_manager_->recv()) { + std::unique_ptr msg = ipc_manager_->retrieve_msg(); + processMsg(std::move(msg)); + } + /* sleep override */ + usleep(kSleepUs); + } +} + +void IPCMonitor::dataLoop() { + while (data_ipc_manager_) { + if (data_ipc_manager_->recv()) { + std::unique_ptr msg = data_ipc_manager_->retrieve_msg(); + 
processDataMsg(std::move(msg)); + } + /* sleep override */ + usleep(kDataMsgSleepUs); + } +} + +void IPCMonitor::processMsg(std::unique_ptr msg) { + if (!ipc_manager_) { + LOG(ERROR) << "Fabric Manager not initialized"; + return; + } + // sizeof(msg->metadata.type) = 32, well above the size of the constant + // strings we are comparing against. memcmp is safe + if (memcmp( // NOLINT(facebook-security-vulnerable-memcmp) + msg->metadata.type, + kLibkinetoContext.data(), + kLibkinetoContext.size()) == 0) { + registerLibkinetoContext(std::move(msg)); + } else if ( + memcmp( // NOLINT(facebook-security-vulnerable-memcmp) + msg->metadata.type, + kLibkinetoRequest.data(), + kLibkinetoRequest.size()) == 0) { + getLibkinetoOnDemandRequest(std::move(msg)); + } else { + LOG(ERROR) << "TYPE UNKNOWN: " << msg->metadata.type; + } +} + +void tracing::IPCMonitor::setLogger(std::unique_ptr logger) +{ + logger_ = std::move(logger); +} + +void IPCMonitor::LogData(const nlohmann::json& result) +{ + auto timestamp = result["timestamp"].get(); + logger_->logUint("timestamp", timestamp); + auto duration = result["duration"].get(); + logger_->logUint("duration", duration); + auto deviceId = result["deviceId"].get(); + logger_->logUint("deviceId", deviceId); + logger_->finalize(); +} + +void IPCMonitor::processDataMsg(std::unique_ptr msg) +{ + if (!data_ipc_manager_) { + LOG(ERROR) << "Fabric Manager not initialized"; + return; + } + if (memcmp( // NOLINT(facebook-security-vulnerable-memcmp) + msg->metadata.type, + kLibkinetoData.data(), + kLibkinetoData.size()) == 0) { + std::string message = std::string((char*)msg->buf.get(), msg->metadata.size); + try { + nlohmann::json result = nlohmann::json::parse(message); + LOG(INFO) << "Received data message : " << result; + LogData(result); + } catch (nlohmann::json::parse_error&) { + LOG(ERROR) << "Error parsing message = " << message; + return; + } + } else { + LOG(ERROR) << "TYPE UNKNOWN: " << msg->metadata.type; + } +} + +void
IPCMonitor::getLibkinetoOnDemandRequest( + std::unique_ptr msg) { + if (!ipc_manager_) { + LOG(ERROR) << "Fabric Manager not initialized"; + return; + } + std::string ret_config = ""; + ipcfabric::LibkinetoRequest* req = + (ipcfabric::LibkinetoRequest*)msg->buf.get(); + if (req->n == 0) { + LOG(ERROR) << "Missing pids parameter for type " << req->type; + return; + } + std::vector pids(req->pids, req->pids + req->n); + try { + ret_config = LibkinetoConfigManager::getInstance()->obtainOnDemandConfig( + std::to_string(req->jobid), pids, req->type); + VLOG(0) << "getLibkinetoOnDemandRequest() : job id " << req->jobid + << " pids = " << pids[0]; + } catch (const std::runtime_error& ex) { + LOG(ERROR) << "Kineto config manager exception : " << ex.what(); + } + std::unique_ptr ret = + ipcfabric::Message::constructMessage( + ret_config, kLibkinetoRequest); + if (!ipc_manager_->sync_send(*ret, msg->src)) { + LOG(ERROR) << "Failed to return config to libkineto: IPC sync_send fail"; + } + + return; +} + +void IPCMonitor::registerLibkinetoContext( + std::unique_ptr msg) { + if (!ipc_manager_) { + LOG(ERROR) << "Fabric Manager not initialized"; + return; + } + ipcfabric::LibkinetoContext* ctxt = + (ipcfabric::LibkinetoContext*)msg->buf.get(); + int32_t size = -1; + try { + size = LibkinetoConfigManager::getInstance()->registerLibkinetoContext( + std::to_string(ctxt->jobid), ctxt->pid, ctxt->gpu); + } catch (const std::runtime_error& ex) { + LOG(ERROR) << "Kineto config manager exception : " << ex.what(); + } + std::unique_ptr ret = + ipcfabric::Message::constructMessage( + size, kLibkinetoContext); + if (!ipc_manager_->sync_send(*ret, msg->src)) { + LOG(ERROR) << "Failed to send ctxt from dyno: IPC sync_send fail"; + } + + return; +} + +} // namespace tracing +} // namespace dynolog diff --git a/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.h b/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.h new file mode 100644 index 
0000000000000000000000000000000000000000..1dc0cd2345fd7d7e556bc5c95361206e0fe2d7f2 --- /dev/null +++ b/msmonitor/dynolog_npu/dynolog/src/tracing/IPCMonitor.h @@ -0,0 +1,45 @@ +// Copyright (c) Meta Platforms, Inc. and affiliates. +// +// This source code is licensed under the MIT license found in the +// LICENSE file in the root directory of this source tree. + +#pragma once + +#include + +// Use glog for FabricManager.h +#define USE_GOOGLE_LOG + +#include "dynolog/src/ipcfabric/FabricManager.h" +#include "dynolog/src/Logger.h" + +namespace dynolog { +namespace tracing { + +class IPCMonitor { + public: + using FabricManager = dynolog::ipcfabric::FabricManager; + IPCMonitor(const std::string& ipc_fabric_name = "dynolog"); + virtual ~IPCMonitor() {} + + void loop(); + void dataLoop(); + + public: + virtual void processMsg(std::unique_ptr msg); + virtual void processDataMsg(std::unique_ptr msg); + void getLibkinetoOnDemandRequest(std::unique_ptr msg); + void registerLibkinetoContext(std::unique_ptr msg); + void setLogger(std::unique_ptr logger); + void LogData(const nlohmann::json& result); + + std::unique_ptr ipc_manager_; + std::unique_ptr data_ipc_manager_; + std::unique_ptr logger_; + + // friend class test_case_name##_##test_name##_Test + friend class IPCMonitorTest_LibkinetoRegisterAndOndemandTest_Test; +}; + +} // namespace tracing +} // namespace dynolog diff --git a/msmonitor/plugin/CMakeLists.txt b/msmonitor/plugin/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..65c8eeb08394f66818ae8d0713999e9a8e11fb41 --- /dev/null +++ b/msmonitor/plugin/CMakeLists.txt @@ -0,0 +1,63 @@ +cmake_minimum_required(VERSION 3.16) +project(IPCMonitor) + +set(CMAKE_SKIP_RPATH TRUE) + +set(CMAKE_CXX_STANDARD 14) +set(CMAKE_CXX_STANDARD_REQUIRED ON) +set(CMAKE_CXX_EXTENSIONS OFF) + +find_package(pybind11 REQUIRED) +find_package(Python REQUIRED COMPONENTS Interpreter Development) + +include_directories( + ${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor + 
${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor/metric + ${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor/mspti_monitor + ${DYNOLOG_PATH}/third_party/glog/src + ${DYNOLOG_PATH}/build/third_party/glog + ${DYNOLOG_PATH}/third_party/json/single_include +) + +file(GLOB_RECURSE IPC_SOURCES + ${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor/*.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor/metric/*.cpp + ${CMAKE_CURRENT_SOURCE_DIR}/ipc_monitor/mspti_monitor/*.cpp +) + +set(SOURCES + bindings.cpp + ${IPC_SOURCES} +) + +add_library(IPCMonitor MODULE ${SOURCES}) + +set_target_properties(IPCMonitor + PROPERTIES + OUTPUT_NAME IPCMonitor + PREFIX "" +) + +target_link_libraries(IPCMonitor PRIVATE + pybind11::module + pthread + ${CMAKE_CURRENT_SOURCE_DIR}/stub/libmspti.so +) + +target_link_libraries(IPCMonitor PRIVATE ${DYNOLOG_PATH}/build/third_party/glog/libglog.a) + +target_compile_options(IPCMonitor PRIVATE + -fPIC + -fstack-protector-all + -ftrapv + $<$>:-O2> +) + +target_link_options(IPCMonitor PRIVATE + -Wl,-z,relro,-z,now,-z,noexecstack + -s +) + +install(TARGETS IPCMonitor + DESTINATION ${CMAKE_INSTALL_PREFIX}/python-package +) \ No newline at end of file diff --git a/msmonitor/plugin/README.md b/msmonitor/plugin/README.md new file mode 100644 index 0000000000000000000000000000000000000000..9cc4a78eea1445f90b843e6358463cb3c9b387d2 --- /dev/null +++ b/msmonitor/plugin/README.md @@ -0,0 +1,48 @@ + + +# Plugins for msMonitor +## Module description +### IPCMonitor +Provides an IPC (Inter-Process Communication) interface that implements: +1. IPC control channel: the profiler backend fetches the profiler configuration from the dynolog daemon +2. IPC data channel: the mspti monitor sends performance data to the dynolog daemon + +__PyDynamicMonitorProxy__: +* `init_dyno` sends a registration request to the dynolog daemon + * input: npuId(int) + * return: bool, whether registration with the dynolog daemon succeeded +* `poll_dyno` fetches the Profiler control parameters from the dynolog daemon + * input: None + * return: str, the control parameters +* `enable_dyno_npu_monitor` enables mspti monitoring + * input: cfg_map(Dict[str,str]), the configuration + * return: None + +## Installation +### 1. One-step installation via shell script +``` +chmod +x build.sh +./build.sh +``` +### 2. 
Manual installation +* Install dependencies +``` +pip install wheel +pip install pybind11 +``` +* Build the whl package +``` +python3 setup.py bdist_wheel +``` +After the command above completes, the msMonitor plugin wheel msmonitor-plugin-{version}.whl is generated in the plugin/dist directory +* Install +``` +pip install dist/{msmonitor-plugin-{version}.whl} +``` +* Uninstall +``` +pip uninstall msmonitor-plugin +``` + +## Logging +* Set the MSMONITOR_LOG_PATH environment variable to choose the log file path; the default is msmonitor_log under the current directory diff --git a/msmonitor/plugin/bindings.cpp b/msmonitor/plugin/bindings.cpp new file mode 100644 index 0000000000000000000000000000000000000000..b08f7e3e3df0c9fb0d2905cd7463480bf1b17b7d --- /dev/null +++ b/msmonitor/plugin/bindings.cpp @@ -0,0 +1,30 @@ +#include +#include +#include "ipc_monitor/PyDynamicMonitorProxy.h" + +namespace py = pybind11; + +void init_IPCMonitor(PyObject *module) { + py::class_(module, "PyDynamicMonitorProxy") + .def(py::init<>()) + .def("init_dyno", &dynolog_npu::ipc_monitor::PyDynamicMonitorProxy::InitDyno, py::arg("npuId")) + .def("poll_dyno", &dynolog_npu::ipc_monitor::PyDynamicMonitorProxy::PollDyno) + .def("enable_dyno_npu_monitor", &dynolog_npu::ipc_monitor::PyDynamicMonitorProxy::EnableMsptiMonitor, py::arg("cfg_map")) + .def("finalize_dyno", &dynolog_npu::ipc_monitor::PyDynamicMonitorProxy::FinalizeDyno); +} + +static PyMethodDef g_moduleMethods[] = {}; + +static struct PyModuleDef ipcMonitor_module = { + PyModuleDef_HEAD_INIT, + "IPCMonitor", + nullptr, + -1, + g_moduleMethods +}; + +PyMODINIT_FUNC PyInit_IPCMonitor(void) { + PyObject* m = PyModule_Create(&ipcMonitor_module); + init_IPCMonitor(m); + return m; +} \ No newline at end of file diff --git a/msmonitor/plugin/build.sh b/msmonitor/plugin/build.sh new file mode 100644 index 0000000000000000000000000000000000000000..ec20536715a9b2bd1fd8ab7a694ca9eac26f3101 --- /dev/null +++ b/msmonitor/plugin/build.sh @@ -0,0 +1,24 @@ +#!/bin/bash + +# install pybind11 +pip install pybind11 + +# build stub +sh ./stub/build_stub.sh + +# build msmonitor_plugin wheel +python3 setup.py bdist_wheel + +# find .whl files in 
dist +files=$(find dist -type f -name "*.whl" 2>/dev/null) +count=$(echo "$files" | wc -l) +if [ "$count" -eq 1 ]; then + echo "find .whl in dist: $files" +else + echo "find no or multi .whl in dist" + exit 1 +fi + +# pip install whl +echo "pip install ${files}" +pip install ${files} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.cpp b/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.cpp new file mode 100644 index 0000000000000000000000000000000000000000..ce7d4b4bf2b286fc3bdc76db5493aaf638152078 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.cpp @@ -0,0 +1,88 @@ +#include "DynoLogNpuMonitor.h" +#include +#include +#include +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +bool DynoLogNpuMonitor::Init() +{ + + if (isInitialized_) { + LOG(WARNING) << "DynoLog npu monitor already initialized"; + return true; + } + bool res = ipcClient_.RegisterInstance(npuId_); + if (res) { + isInitialized_ = true; + LOG(INFO) << "DynoLog npu monitor initialized successfully"; + } + return res; +} + +ErrCode DynoLogNpuMonitor::DealMonitorReq(const MsptiMonitorCfg& cmd) +{ + if (cmd.monitorStop) { + if (msptiMonitor_.IsStarted()) { + LOG(INFO) << "Stop mspti monitor thread successfully"; + msptiMonitor_.Stop(); + } + return ErrCode::SUC; + } + + if (cmd.monitorStart && !msptiMonitor_.IsStarted()) { + LOG(INFO) << "Start mspti monitor thread successfully"; + msptiMonitor_.Start(); + } + + if (msptiMonitor_.IsStarted() && !cmd.enableActivities.empty()) { + auto curActivities = msptiMonitor_.GetEnabledActivities(); + std::vector enableKinds, disableKinds; + std::set_difference(cmd.enableActivities.begin(), cmd.enableActivities.end(), curActivities.begin(), curActivities.end(), + std::back_inserter(enableKinds)); + std::set_difference(curActivities.begin(), curActivities.end(), cmd.enableActivities.begin(), cmd.enableActivities.end(), + std::back_inserter(disableKinds)); + for (auto activity : enableKinds) 
{ + msptiMonitor_.EnableActivity(activity); + } + for (auto activity : disableKinds) { + msptiMonitor_.DisableActivity(activity); + } + } + msptiMonitor_.SetFlushInterval(cmd.reportIntervals); + return ErrCode::SUC; +} + +std::string DynoLogNpuMonitor::Poll() +{ + std::string res = ipcClient_.IpcClientNpuConfig(); + if (res.size() == 4) { // a 4-byte response means the process registered with dynolog successfully + LOG(INFO) << "Registered with dynolog daemon successfully"; + return ""; + } + if (res.empty()) { + return ""; + } + LOG(INFO) << "Received NPU configuration successfully"; + return res; +} + +void DynoLogNpuMonitor::EnableMsptiMonitor(std::unordered_map& cfg_map) +{ + auto cmd = InputParser::GetInstance()->DynoLogGetOpts(cfg_map); + if (cmd.isMonitor) { + auto ans = DealMonitorReq(cmd); + if (ans != ErrCode::SUC) { + LOG(ERROR) << "Deal monitor request failed, because " << IPC_ERROR(ans); + } + } +} + +void DynoLogNpuMonitor::Finalize() +{ + msptiMonitor_.Uninit(); +} +} // namespace ipc_monitor +} // namespace dynolog_npu diff --git a/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.h b/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.h new file mode 100644 index 0000000000000000000000000000000000000000..c26061a32bc72c21b3ee2ebdc93ae71aa383293d --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/DynoLogNpuMonitor.h @@ -0,0 +1,43 @@ +#ifndef DYNOLOG_NPU_MONITOR_H +#define DYNOLOG_NPU_MONITOR_H + +#include "MonitorBase.h" +#include "NpuIpcClient.h" +#include "MsptiMonitor.h" +#include "singleton.h" +#include "InputParser.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +class DynoLogNpuMonitor : public MonitorBase, public Singleton { + friend class Singleton; + +public: + DynoLogNpuMonitor() = default; + bool Init() override; + ErrCode DealMonitorReq(const MsptiMonitorCfg& cmd); + std::string Poll() override; + void EnableMsptiMonitor(std::unordered_map& cfg_map); + void Finalize(); + void SetNpuId(int id) override + { + npuId_ = id; + } + + IpcClient *GetIpcClient() + { + return &ipcClient_; + } + +private: + 
bool isInitialized_ = false; + int32_t npuId_ = 0; + IpcClient ipcClient_; + MsptiMonitor msptiMonitor_; +}; + +} // namespace ipc_monitor +} // namespace dynolog_npu + +#endif // DYNOLOG_NPU_MONITOR_H diff --git a/msmonitor/plugin/ipc_monitor/InputParser.cpp b/msmonitor/plugin/ipc_monitor/InputParser.cpp new file mode 100644 index 0000000000000000000000000000000000000000..bc77d33f1ae2f029e2fe548f8f9d9a7f5a594935 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/InputParser.cpp @@ -0,0 +1,61 @@ +#include "InputParser.h" +#include +#include +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +const std::string MSPTI_ACTIVITY_KIND_KEY = "MSPTI_ACTIVITY_KIND"; +const std::string REPORT_INTERVAL_S_KEY = "REPORT_INTERVAL_S"; +const std::string NPU_MONITOR_START_KEY = "NPU_MONITOR_START"; +const std::string NPU_MONITOR_STOP_KEY = "NPU_MONITOR_STOP"; + +const std::unordered_set cfgMap { + "MSPTI_ACTIVITY_KIND", + "REPORT_INTERVAL_S", + "NPU_MONITOR_START", + "NPU_MONITOR_STOP", + "REQUEST_TRACE_ID" +}; + +const std::unordered_map kindStrMap { + {"Marker", MSPTI_ACTIVITY_KIND_MARKER}, + {"Kernel", MSPTI_ACTIVITY_KIND_KERNEL}, + {"API", MSPTI_ACTIVITY_KIND_API}, + {"Hccl", MSPTI_ACTIVITY_KIND_HCCL}, + {"Memory", MSPTI_ACTIVITY_KIND_MEMORY}, + {"MemSet", MSPTI_ACTIVITY_KIND_MEMSET}, + {"MemCpy", MSPTI_ACTIVITY_KIND_MEMCPY} +}; + +std::set str2Kinds(const std::string& kindStrs) +{ + std::set res; + auto kindStrList = split(kindStrs, ','); + for (auto& kindStr : kindStrList) { + auto kind = kindStrMap.find(kindStr); + if (kind == kindStrMap.end()) { + return {MSPTI_ACTIVITY_KIND_INVALID}; + } + res.insert(kind->second); + } + return res; +} + +MsptiMonitorCfg InputParser::DynoLogGetOpts(std::unordered_map& cmd) +{ + if (cmd.count("NPU_MONITOR_SRART")) { + return {{MSPTI_ACTIVITY_KIND_INVALID}, 0, false, false, false}; + } + auto activityKinds = str2Kinds(cmd[MSPTI_ACTIVITY_KIND_KEY]); + uint32_t reportTimes = 0; + Str2Uint32(reportTimes, 
cmd[REPORT_INTERVAL_S_KEY]); + bool startSwitch = false; + Str2Bool(startSwitch, cmd[NPU_MONITOR_START_KEY]); + bool endSwitch = false; + Str2Bool(endSwitch, cmd[NPU_MONITOR_STOP_KEY]); + return {activityKinds, reportTimes, startSwitch, endSwitch, true}; +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/InputParser.h b/msmonitor/plugin/ipc_monitor/InputParser.h new file mode 100644 index 0000000000000000000000000000000000000000..e5f674e1605b3721a75372113ee5d7f012c5e506 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/InputParser.h @@ -0,0 +1,30 @@ +#ifndef INPUT_PARSER_H +#define INPUT_PARSER_H + +#include +#include +#include +#include + +namespace dynolog_npu { +namespace ipc_monitor { + +struct MsptiMonitorCfg +{ + std::set enableActivities; + uint32_t reportIntervals; + bool monitorStart; + bool monitorStop; + bool isMonitor; +}; + + +class InputParser: public dynolog_npu::ipc_monitor::Singleton { +public: + MsptiMonitorCfg DynoLogGetOpts(std::unordered_map& cmd); +}; + +} // namespace ipc_monitor +} // namespace dynolog_npu + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/MonitorBase.h b/msmonitor/plugin/ipc_monitor/MonitorBase.h new file mode 100644 index 0000000000000000000000000000000000000000..29be0b6be04083babb8d20e5386e93c053a41357 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/MonitorBase.h @@ -0,0 +1,18 @@ +#ifndef MONITOR_BASE_H +#define MONITOR_BASE_H + +#include + +namespace dynolog_npu { +namespace ipc_monitor { + +class MonitorBase { +public: + virtual bool Init() = 0; + virtual std::string Poll() = 0; + virtual void SetNpuId(int id) = 0; +}; + +} // namespace ipc_monitor +} // namespace dynolog_npu +#endif // MONITOR_BASE_H diff --git a/msmonitor/plugin/ipc_monitor/NpuIpcClient.cpp b/msmonitor/plugin/ipc_monitor/NpuIpcClient.cpp new file mode 100644 index 0000000000000000000000000000000000000000..2c2d8a03eb46c915abd7be4bbe259cbb1b7b33f8 --- /dev/null +++ 
b/msmonitor/plugin/ipc_monitor/NpuIpcClient.cpp @@ -0,0 +1,145 @@ +#include "NpuIpcClient.h" +#include + +namespace dynolog_npu { +namespace ipc_monitor { + +bool IpcClient::RegisterInstance(int32_t id) +{ + NpuContext context{ + .npu = id, + .pid = getpid(), + .jobId = JOB_ID, + }; + std::unique_ptr message = Message::ConstructMessage(context, MSG_TYPE_CONTEXT); + try { + if (!SyncSendMessage(*message, DYNO_IPC_NAME)) { + LOG(WARNING) << "Failed to send register ctxt for pid " << context.pid << " with dyno"; + return false; + } + } catch (const std::exception &e) { + LOG(WARNING) << "Error when SyncSendMessage: " << e.what(); + return false; + } + LOG(INFO) << "Registered pid " << context.pid << " with dynolog successfully"; + return true; +} + +std::string IpcClient::IpcClientNpuConfig() +{ + auto size = pids_.size(); + auto *req = ReinterpretConvert(malloc(sizeof(NpuRequest) + sizeof(int32_t) * size)); + if (req == nullptr) { + LOG(ERROR) << "Malloc for NpuRequest failed"; + return ""; + } + req->type = DYNO_IPC_TYPE; + req->pidSize = size; + req->jobId = JOB_ID; + for (size_t i = 0; i < size; i++) { + req->pids[i] = pids_[i]; + } + std::unique_ptr message = Message::ConstructMessage(*req, MSG_TYPE_REQUEST, size); + if (!SyncSendMessage(*message, DYNO_IPC_NAME)) { + LOG(WARNING) << "Failed to send config to dyno server"; + free(req); + req = nullptr; + return ""; + } + free(req); + req = nullptr; + message = PollRecvMessage(MAX_IPC_RETRIES, MAX_SLEEP_US); + if (!message) { + LOG(WARNING) << "Failed to receive on-demand config"; + return ""; + } + std::string res = std::string(ReinterpretConvert(message->buf.get()), message->metadata.size); + return res; +} + +std::unique_ptr IpcClient::ReceiveMessage() +{ + std::lock_guard wguard(dequeLock_); + if (msgDynoDeque_.empty()) { + return nullptr; + } + std::unique_ptr message = std::move(msgDynoDeque_.front()); + msgDynoDeque_.pop_front(); + return message; +} + +bool IpcClient::SyncSendMessage(const Message &message, 
const std::string &destName, int numRetry, int seepTimeUs) +{ + if (destName.empty()) { + LOG(WARNING) << "Can not send to empty socket name!"; + return false; + } + int i = 0; + std::vector npuPayLoad{ NpuPayLoad(sizeof(struct Metadata), (void *)&message.metadata), + NpuPayLoad(message.metadata.size, message.buf.get()) }; + try { + auto ctxt = ep_.BuildSendNpuCtxt(destName, npuPayLoad, std::vector()); + while (!ep_.TrySendMessage(*ctxt) && i < numRetry) { + i++; + usleep(seepTimeUs); + seepTimeUs *= 2; // 2: double sleep time + } + } catch (const std::exception &e) { + LOG(ERROR) << "Error when SyncSendMessage: " << e.what(); + return false; + } + return i < numRetry; +} + +bool IpcClient::Recv() +{ + try { + Metadata recvMetadata; + std::vector PeekNpuPayLoad{ NpuPayLoad(sizeof(struct Metadata), &recvMetadata) }; + auto peekCtxt = ep_.BuildNpuRcvCtxt(PeekNpuPayLoad); + bool successFlag = false; + try { + successFlag = ep_.TryPeekMessage(*peekCtxt); + } catch (std::exception &e) { + LOG(ERROR) << "Error when TryPeekMessage: " << e.what(); + return false; + } + if (successFlag) { + std::unique_ptr npuMessage = std::make_unique(Message()); + npuMessage->metadata = recvMetadata; + npuMessage->buf = std::make_unique(recvMetadata.size); + npuMessage->src = std::string(ep_.GetName(*peekCtxt)); + std::vector npuPayLoad{ NpuPayLoad(sizeof(struct Metadata), (void *)&npuMessage->metadata), + NpuPayLoad(recvMetadata.size, npuMessage->buf.get()) }; + auto recvCtxt = ep_.BuildNpuRcvCtxt(npuPayLoad); + try { + successFlag = ep_.TryRcvMessage(*recvCtxt); + } catch (std::exception &e) { + LOG(ERROR) << "Error when TryRecvMsg: " << e.what(); + return false; + } + if (successFlag) { + std::lock_guard wguard(dequeLock_); + msgDynoDeque_.push_back(std::move(npuMessage)); + return true; + } + } + } catch (std::exception &e) { + LOG(ERROR) << "Error in Recv(): " << e.what(); + return false; + } + return false; +} + +std::unique_ptr IpcClient::PollRecvMessage(int maxRetry, int 
sleeTimeUs) +{ + for (int i = 0; i < maxRetry; i++) { + if (Recv()) { + return ReceiveMessage(); + } + usleep(sleeTimeUs); + } + return nullptr; +} +} // namespace ipc_monitor +} // namespace dynolog_npu diff --git a/msmonitor/plugin/ipc_monitor/NpuIpcClient.h b/msmonitor/plugin/ipc_monitor/NpuIpcClient.h new file mode 100644 index 0000000000000000000000000000000000000000..90827777a91eeac49251e246ce75f4eec4f24942 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/NpuIpcClient.h @@ -0,0 +1,119 @@ +#ifndef NPU_IPC_CLIENT_H +#define NPU_IPC_CLIENT_H + +#include +#include +#include +#include "NpuIpcEndPoint.h" +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +constexpr int TYPE_SIZE = 32; +constexpr int JOB_ID = 0; +constexpr const int DYNO_IPC_TYPE = 3; +constexpr const int MAX_IPC_RETRIES = 5; +constexpr const int MAX_SLEEP_US = 10000; +const std::string DYNO_IPC_NAME = "dynolog"; +const std::string MSG_TYPE_REQUEST = "req"; +const std::string MSG_TYPE_CONTEXT = "ctxt"; +const std::string MSG_TYPE_DATA = "data"; + +struct NpuRequest { + int type; + int pidSize; + int64_t jobId; + int32_t pids[0]; +}; + +struct NpuContext { + int32_t npu; + pid_t pid; + int64_t jobId; +}; + +struct Metadata { + size_t size = 0; + char type[TYPE_SIZE] = ""; +}; + +struct Message { + Metadata metadata; + std::unique_ptr buf; + std::string src; + template static std::unique_ptr ConstructMessage(const T &data, const std::string &type) + { + std::unique_ptr ipcNpuMessage = std::make_unique(Message()); + if (type.size() + 1 > sizeof(ipcNpuMessage->metadata.type)) { + throw std::runtime_error("Type string is too long to fit in metadata.type" + IPC_ERROR(ErrCode::PARAM)); + } + memcpy(ipcNpuMessage->metadata.type, type.c_str(), type.size() + 1); +#if __cplusplus >= 201703L + if constexpr (std::is_same::value == true) { + ipcNpuMessage->metadata.size = data.size(); + ipcNpuMessage->buf = std::make_unique(ipcNpuMessage->metadata.size); + memcpy(ipcNpuMessage->buf.get(), 
data.c_str(), ipcNpuMessage->metadata.size); + return ipcNpuMessage; + } +#endif + static_assert(std::is_trivially_copyable::value); + ipcNpuMessage->metadata.size = sizeof(data); + ipcNpuMessage->buf = std::make_unique(ipcNpuMessage->metadata.size); + memcpy(ipcNpuMessage->buf.get(), &data, sizeof(data)); + return ipcNpuMessage; + } + + template + static std::unique_ptr ConstructMessage(const T &data, const std::string &type, int n) + { + std::unique_ptr ipcNpuMessage = std::make_unique(Message()); + if (type.size() + 1 > sizeof(ipcNpuMessage->metadata.type)) { + throw std::runtime_error("Type string is too long to fit in metadata.type" + IPC_ERROR(ErrCode::PARAM)); + } + memcpy(ipcNpuMessage->metadata.type, type.c_str(), type.size() + 1); + static_assert(std::is_trivially_copyable::value); + static_assert(std::is_trivially_copyable::value); + ipcNpuMessage->metadata.size = sizeof(data) + sizeof(U) * n; + ipcNpuMessage->buf = std::make_unique(ipcNpuMessage->metadata.size); + memcpy(ipcNpuMessage->buf.get(), &data, ipcNpuMessage->metadata.size); + return ipcNpuMessage; + } + + static std::unique_ptr ConstructStrMessage(const std::string &data, const std::string &type) + { + std::unique_ptr ipcNpuMessage = std::make_unique(Message()); + if (type.size() + 1 > sizeof(ipcNpuMessage->metadata.type)) { + throw std::runtime_error("Type string is too long to fit in metadata.type" + IPC_ERROR(ErrCode::PARAM)); + } + memcpy(ipcNpuMessage->metadata.type, type.c_str(), type.size() + 1); + ipcNpuMessage->metadata.size = data.size(); + ipcNpuMessage->buf = std::make_unique(ipcNpuMessage->metadata.size); + memcpy(ipcNpuMessage->buf.get(), data.c_str(), ipcNpuMessage->metadata.size); + return ipcNpuMessage; + } +}; + +class IpcClient { +public: + IpcClient(const IpcClient &) = delete; + IpcClient &operator = (const IpcClient &) = delete; + IpcClient() = default; + bool RegisterInstance(int32_t npu); + std::string IpcClientNpuConfig(); + bool SyncSendMessage(const Message &message, const 
std::string &destName, int numRetry = 10, + int seepTimeUs = 10000); + +private: + std::vector pids_ = GetPids(); + NpuIpcEndPoint<0> ep_{ "dynoconfigclient" + GenerateUuidV4() }; + std::mutex dequeLock_; + std::deque> msgDynoDeque_; + std::unique_ptr ReceiveMessage(); + bool Recv(); + std::unique_ptr PollRecvMessage(int maxRetry, int sleeTimeUs); +}; +} // namespace ipc_monitor +} // namespace dynolog_npu + +#endif // NPU_IPC_CLIENT_H diff --git a/msmonitor/plugin/ipc_monitor/NpuIpcEndPoint.h b/msmonitor/plugin/ipc_monitor/NpuIpcEndPoint.h new file mode 100644 index 0000000000000000000000000000000000000000..79cca415f4c1a644416a174477a25846b58319ea --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/NpuIpcEndPoint.h @@ -0,0 +1,212 @@ +#ifndef NPU_IPC_ENDPOINT_H +#define NPU_IPC_ENDPOINT_H + +#include +#include +#include +#include +#include +#include +#include +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +using fileDesT = int; +constexpr const char STR_END_CHAR = '\0'; +constexpr int SOCKET_FD_CHMOD = 0666; + +struct NpuPayLoad { + size_t size; + void *data; + NpuPayLoad(size_t size, void *data) : size(size), data(data) {} +}; + +template struct NpuIpcEndPointCtxt { + struct sockaddr_un messageName; + size_t messageLen; + fileDesT *fileDesPtr; + struct msghdr msghdr; + std::vector iov; + char ancillaryBuf[CMSG_SPACE(MaxNumFileDes * sizeof(fileDesT))]; + explicit NpuIpcEndPointCtxt(size_t num) : iov(std::vector(num)){}; +}; + +template class NpuIpcEndPoint final { + using Ctxt = NpuIpcEndPointCtxt; + +public: + constexpr static size_t addressMaxLen = 108 - 2; // Max unix socket path length + explicit NpuIpcEndPoint(const std::string &addressName) + { + socketFd = socket(AF_UNIX, SOCK_DGRAM, 0); + if (socketFd == -1) { + throw std::runtime_error(std::strerror(errno) + IPC_ERROR(ErrCode::PARAM)); + } + int ret = 0; + struct sockaddr_un address; + size_t addressLen = SetSocketAdress(addressName, address); + if (address.sun_path[0] != 
STR_END_CHAR) { + ret = unlink(address.sun_path); + } + if (ret == -1) { + throw std::runtime_error("Unlink failed, error is " + std::string(strerror(errno)) + IPC_ERROR(ErrCode::PARAM)); + } + + ret = bind(socketFd, ReinterpretConvert(&address), addressLen); + if (ret == -1) { + throw std::runtime_error("Bind socket failed." + IPC_ERROR(ErrCode::PARAM)); + } + + if (address.sun_path[0] != STR_END_CHAR) { + ret = chmod(address.sun_path, SOCKET_FD_CHMOD); + } + if (ret == -1) { + throw std::runtime_error("Chmod failed, error is " + std::string(strerror(errno)) + IPC_ERROR(ErrCode::PARAM)); + } + } + + ~NpuIpcEndPoint() + { + close(socketFd); + } + + [[nodiscard]] auto BuildSendNpuCtxt(const std::string &desAddrName, const std::vector &npuPayLoad, + const std::vector &fileDes) + { + if (fileDes.size() > MaxNumFileDes) { + throw std::runtime_error("Request to fill more than max connections " + IPC_ERROR(ErrCode::PARAM)); + } + if (desAddrName.empty()) { + throw std::runtime_error("Can not send to dest point, because dest socket name is empty " + + IPC_ERROR(ErrCode::PARAM)); + } + auto ctxt = BuildNpuCtxt_(npuPayLoad, fileDes.size()); + ctxt->msghdr.msg_namelen = SetSocketAdress(desAddrName, ctxt->messageName); + if (!fileDes.empty()) { + if (fileDes.size() * sizeof(fileDesT) > sizeof(ctxt->fileDesPtr)) { + throw std::runtime_error("Memcpy failed when fileDes size large than ctxt fileDesPtr " + + IPC_ERROR(ErrCode::PARAM)); + } + memcpy(ctxt->fileDesPtr, fileDes.data(), fileDes.size() * sizeof(fileDesT)); + } + return ctxt; + } + + [[nodiscard]] bool TrySendMessage(Ctxt const & ctxt, bool retryOnConnRefused = true) + { + ssize_t retCode = sendmsg(socketFd, &ctxt.msghdr, MSG_DONTWAIT); + if (retCode > 0) { + return true; + } + if ((errno == EAGAIN || errno == EWOULDBLOCK) && retCode == -1) { + return false; + } + if (retryOnConnRefused && errno == ECONNREFUSED && retCode == -1) { + return false; + } + throw std::runtime_error("TrySendMessage occur " + 
std::string(std::strerror(errno)) + " " + + IPC_ERROR(ErrCode::PARAM)); + } + + [[nodiscard]] auto BuildNpuRcvCtxt(const std::vector &npuPayLoad) + { + return BuildNpuCtxt_(npuPayLoad, MaxNumFileDes); + } + + [[nodiscard]] bool TryRcvMessage(Ctxt &ctxt) + { + auto retCode = recvmsg(socketFd, &ctxt.msghdr, MSG_DONTWAIT); + if (retCode > 0) { + return true; + } + if (retCode == 0) { + return false; + } + if (errno == EWOULDBLOCK || errno == EAGAIN) { + return false; + } + throw std::runtime_error("TryRcvMessage occur " + std::string(std::strerror(errno)) + " " + + IPC_ERROR(ErrCode::PARAM)); + } + + [[nodiscard]] bool TryPeekMessage(Ctxt &ctxt) + { + ssize_t ret = recvmsg(socketFd, &ctxt.msghdr, MSG_DONTWAIT | MSG_PEEK); + if (ret > 0) { + return true; + } + if (ret == 0) { + return false; + } + if (errno == EAGAIN || errno == EWOULDBLOCK) { + return false; + } + throw std::runtime_error("TryPeekMessage occur " + std::string(std::strerror(errno))); + } + + const char *GetName(Ctxt const & ctxt) const + { + if (ctxt.messageName.sun_path[0] != STR_END_CHAR) { + throw std::runtime_error("GetName() expected an abstract socket, but got " + + std::string(ctxt.messageName.sun_path)); + } + return ctxt.messageName.sun_path + 1; + } + + std::vector GetFileDes(const Ctxt &ctxt) const + { + struct cmsghdr *cmg = CMSG_FIRSTHDR(&ctxt.msghdr); + unsigned numFileDes = (cmg->cmsg_len - sizeof(struct cmsghdr)) / sizeof(fileDesT); + return { ctxt.fileDesPtr, ctxt.fileDesPtr + numFileDes }; + } + +protected: + fileDesT socketFd; + size_t SetSocketAdress(const std::string &srcSocket, struct sockaddr_un &destSocket) + { + if (srcSocket.size() > addressMaxLen) { + throw std::runtime_error("Abstract UNIX Socket path cannot be larger than addressMaxLen"); + } + destSocket.sun_family = AF_UNIX; + destSocket.sun_path[0] = STR_END_CHAR; + if (srcSocket.empty()) { + return sizeof(sa_family_t); + } + srcSocket.copy(destSocket.sun_path + 1, srcSocket.size()); + 
destSocket.sun_path[srcSocket.size() + 1] = STR_END_CHAR; + return sizeof(sa_family_t) + srcSocket.size() + 2; // 2 + } + + auto BuildNpuCtxt_(const std::vector &npuPayLoad, unsigned numFileDes) + { + auto ctxt = std::make_unique(npuPayLoad.size()); + std::fill_n(ReinterpretConvert(&ctxt->msghdr), sizeof(msghdr), 0); + for (size_t i = 0; i < npuPayLoad.size(); i++) { + ctxt->iov[i] = {npuPayLoad[i].data, npuPayLoad[i].size}; + } + ctxt->msghdr.msg_name = &ctxt->messageName; + ctxt->msghdr.msg_namelen = sizeof(decltype(ctxt->messageName)); + ctxt->msghdr.msg_iov = ctxt->iov.data(); + ctxt->msghdr.msg_iovlen = npuPayLoad.size(); + ctxt->fileDesPtr = nullptr; + if (numFileDes == 0) { + return ctxt; + } + const size_t fileDesSize = sizeof(fileDesT) * numFileDes; + ctxt->msghdr.msg_control = ctxt->ancillaryBuf; + ctxt->msghdr.msg_controllen = CMSG_SPACE(fileDesSize); + + struct cmsghdr *cmsg = CMSG_FIRSTHDR(&ctxt->msghdr); + cmsg->cmsg_level = SOL_SOCKET; + cmsg->cmsg_type = SCM_RIGHTS; + cmsg->cmsg_len = CMSG_LEN(fileDesSize); + ctxt->fileDesPtr = ReinterpretConvert(CMSG_DATA(cmsg)); + return ctxt; + } +}; +} // namespace ipc_monitor +} // namespace dynolog_npu + +#endif // NPU_IPC_ENDPOINT_H diff --git a/msmonitor/plugin/ipc_monitor/PyDynamicMonitorProxy.h b/msmonitor/plugin/ipc_monitor/PyDynamicMonitorProxy.h new file mode 100644 index 0000000000000000000000000000000000000000..df10512a06549319895e2e08e0c28dc5df1a107f --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/PyDynamicMonitorProxy.h @@ -0,0 +1,58 @@ +#ifndef PYDYNAMIC_MONITOR_PROXY_H +#define PYDYNAMIC_MONITOR_PROXY_H + +#include +#include "MonitorBase.h" +#include "DynoLogNpuMonitor.h" + +namespace dynolog_npu { +namespace ipc_monitor { + +class PyDynamicMonitorProxy { +public: + PyDynamicMonitorProxy() = default; + bool InitDyno(int npuId) + { + try { + if (!google::IsGoogleLoggingInitialized()) { + std::string logPath; + if (CreateMsmonitorLogPath(logPath)) { + logPath = logPath + "/msmonitor_"; + 
google::InitGoogleLogging("MsMonitor"); + google::SetLogDestination(google::GLOG_INFO, logPath.c_str()); + google::SetLogFilenameExtension(".log"); + } else { + fprintf(stderr, "Failed to create log path, logs will not be recorded\n"); + } + } + monitor_ = DynoLogNpuMonitor::GetInstance(); + monitor_->SetNpuId(npuId); + bool res = monitor_->Init(); + return res; + } catch (const std::exception &e) { + LOG(ERROR) << "Error when init dyno " << e.what(); + return false; + } + } + + std::string PollDyno() + { + return monitor_ != nullptr ? monitor_->Poll() : std::string(); + } + + void EnableMsptiMonitor(std::unordered_map& config_map) + { + DynoLogNpuMonitor::GetInstance()->EnableMsptiMonitor(config_map); + } + + void FinalizeDyno() + { + DynoLogNpuMonitor::GetInstance()->Finalize(); + } +private: + MonitorBase *monitor_ = nullptr; +}; + +} // namespace ipc_monitor +} // namespace dynolog_npu +#endif // PYDYNAMIC_MONITOR_PROXY_H diff --git a/msmonitor/plugin/ipc_monitor/TimerTask.h b/msmonitor/plugin/ipc_monitor/TimerTask.h new file mode 100644 index 0000000000000000000000000000000000000000..369712ec823a516eaec7595f317f73dc3ef70fc4 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/TimerTask.h @@ -0,0 +1,100 @@ +#ifndef TIMER_TASK_H +#define TIMER_TASK_H + +#include +#include +#include +#include +#include +#include + +namespace dynolog_npu { +namespace ipc_monitor { +class TimerTask { +public: + TimerTask(const std::string& name, int interval) + : interval(interval), name(name), manual_trigger(false), running(false) {} + + ~TimerTask() + { + Stop(); + } + + void Run() + { + if (running) { + LOG(ERROR) << name << " Timer task is already running."; + return; + } + running = true; + taskThread = std::thread(&TimerTask::TaskRun, this); + } + + void Trigger() + { + std::unique_lock lock(cv_mutex); + manual_trigger = true; + cv.notify_one(); + } + + // Stop the timer task + void Stop() + { + if (!running) { + LOG(ERROR) << name << " Timer task is not running."; + return; + } + + running = false; + cv.notify_one(); + if 
(taskThread.joinable()) { + taskThread.join(); + } + } + + void SetInterval(int intervalTimes) + { + interval.store(intervalTimes); + } + + virtual void InitResource() {}; + virtual void ReleaseResource() {}; + virtual void ExecuteTask() = 0; +private: + // Thread function of the timer task + void TaskRun() + { + LOG(INFO) << name << " Timer task started."; + InitResource(); + while (running) { + std::unique_lock lock(cv_mutex); + if (interval.load()) { + cv.wait_for(lock, std::chrono::seconds(interval.load()), [&] {return manual_trigger || !running;}); + } else { + cv.wait(lock, [&] {return manual_trigger || !running;}); + } + if (!running) { + break; + } + if (manual_trigger) { + manual_trigger = false; + } else if (running) { + ExecuteTask(); + } + } + ReleaseResource(); + LOG(INFO) << name << " Timer task stopped."; + } + + std::atomic interval; + std::string name; + std::condition_variable cv; + std::mutex cv_mutex; + std::atomic manual_trigger; + std::atomic running; + std::thread taskThread; +}; + +} +} +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..0c88fdc270514592678bcd4e065ca3e28a07e4bb --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.cpp @@ -0,0 +1,67 @@ +#include "MetricApiProcess.h" + +#include +#include + +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string ApiMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "API"; + jsonMsg["deviceId"] = -1; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricApiProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityApi* apiData = ReinterpretConvert(record); + msptiActivityApi* tmp = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityApi), 8)); + memcpy(tmp, apiData, 
sizeof(msptiActivityApi)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(tmp); + } + +} + +std::vector MetricApiProcess::AggregatedData() +{ + std::vector> copyRecords; + { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + ApiMetric apiMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr api) { + return acc + api->end - api->start; + }); + apiMetric.duration = ans; + apiMetric.deviceId = -1; + apiMetric.timestamp = getCurrentTimestamp64(); + return {apiMetric}; +} + +void MetricApiProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricApiProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.h new file mode 100644 index 0000000000000000000000000000000000000000..6939f2a0d55cffd3a7998447001f3b9f7c704f0f --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricApiProcess.h @@ -0,0 +1,37 @@ +#ifndef METRIC_API_PROCESS_H +#define METRIC_API_PROCESS_H + +#include +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct ApiMetric { + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricApiProcess: public MetricProcessBase +{ +public: + MetricApiProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.cpp 
b/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..1b36e379bdd68af971d141d672fe11fcbcb904fc --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.cpp @@ -0,0 +1,68 @@ +#include "MetricHcclProcess.h" +#include +#include +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string HcclMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "Hccl"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricHcclProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityHccl* hcclData = ReinterpretConvert(record); + msptiActivityHccl* tmp = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityHccl), ALIGN_SIZE)); + memcpy(tmp, hcclData, sizeof(msptiActivityHccl)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(tmp); + } + +} + +std::vector MetricHcclProcess::AggregatedData() +{ + std::vector> copyRecords; + { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + if (copyRecords.empty()) { + return {}; + } + HcclMetric hcclMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr hccl) { + return acc + hccl->end - hccl->start; + }); + hcclMetric.duration = ans; + hcclMetric.deviceId = copyRecords[0]->ds.deviceId; + hcclMetric.timestamp = getCurrentTimestamp64(); + return {hcclMetric}; +} + +void MetricHcclProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricHcclProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.h new file mode 100644 
index 0000000000000000000000000000000000000000..d3753cca1e98bb8b6f80076a29419dc36d3cd1ad --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricHcclProcess.h @@ -0,0 +1,38 @@ +#ifndef METRIC_HCCL_PROCESS_H +#define METRIC_HCCL_PROCESS_H + +#include +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct HcclMetric { + std::string kindName; + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricHcclProcess: public MetricProcessBase +{ +public: + MetricHcclProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..23353b16a982f87b421f0db8f4bfcd216f25d622 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.cpp @@ -0,0 +1,67 @@ +#include "MetricKernelProcess.h" + +#include + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string KernelMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "Kernel"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricKernelProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityKernel* kernel = ReinterpretConvert(record); + msptiActivityKernel* ptr = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityKernel), ALIGN_SIZE)); + memcpy(ptr, kernel, sizeof(msptiActivityKernel)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(ptr); + } +} + +std::vector MetricKernelProcess::AggregatedData() +{ + 
std::vector> copyRecords; + { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + if (copyRecords.empty()) { + return {}; + } + auto deviceId = copyRecords[0]->ds.deviceId; + KernelMetric kernelMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr kernel) { + return acc + kernel->end - kernel->start; + }); + kernelMetric.duration = ans; + kernelMetric.deviceId = deviceId; + kernelMetric.timestamp = getCurrentTimestamp64(); + return {kernelMetric}; +} + +void MetricKernelProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricKernelProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.h new file mode 100644 index 0000000000000000000000000000000000000000..0107a26c283804002bd7ae7eab06e92c1a6ebbbf --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricKernelProcess.h @@ -0,0 +1,36 @@ +#ifndef METRIC_KERNEL_PROCESS_H +#define METRIC_KERNEL_PROCESS_H + +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct KernelMetric { + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricKernelProcess: public MetricProcessBase +{ +public: + MetricKernelProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricManager.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricManager.cpp new file mode 
100644 index 0000000000000000000000000000000000000000..c7e8fc5d4fd7c9e29a4a5586169d3be0a221701d --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricManager.cpp @@ -0,0 +1,77 @@ +#include "MetricManager.h" +#include "MetricKernelProcess.h" +#include "MetricApiProcess.h" +#include "MetricMemCpyProcess.h" +#include "MetricHcclProcess.h" +#include "MetricMarkProcess.h" +#include "MetricMemSetProcess.h" +#include "MetricMemProcess.h" + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +MetricManager::MetricManager(): TimerTask("MetricManager", 0), +kindSwitchs_(MSPTI_ACTIVITY_KIND_COUNT), consumeStatus_(MSPTI_ACTIVITY_KIND_COUNT){ + metrics.resize(MSPTI_ACTIVITY_KIND_COUNT); + metrics[MSPTI_ACTIVITY_KIND_KERNEL] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_API] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_MEMCPY] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_MARKER] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_MEMSET] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_HCCL] = std::make_shared(); + metrics[MSPTI_ACTIVITY_KIND_MEMORY] = std::make_shared(); +} + +void MetricManager::ReleaseResource() +{ + for (int i = 0; i < MSPTI_ACTIVITY_KIND_COUNT; i++) { + if (kindSwitchs_[i].load()) { + kindSwitchs_[i] = false; + metrics[i]->Clear(); + } + } +} + +ErrCode MetricManager::ConsumeMsptiData(msptiActivity *record) +{ + if (!kindSwitchs_[record->kind]) { + return ErrCode::PERMISSION; + } + auto metricProcess = metrics[record->kind]; + consumeStatus_[record->kind] = true; + metricProcess->ConsumeMsptiData(record); + consumeStatus_[record->kind] = false; + return ErrCode::SUC; +} + +void MetricManager::SetReportInterval(uint32_t intervalTimes) +{ + if (reportInterval_.load() != intervalTimes) { + SendMetricMsg(); + SetInterval(intervalTimes); + reportInterval_.store(intervalTimes); + } +} + +void MetricManager::ExecuteTask() +{ + SendMetricMsg(); +} + +void MetricManager::SendMetricMsg() +{ + for (int i = 0; i < 
MSPTI_ACTIVITY_KIND_COUNT; i++) { + if (kindSwitchs_[i].load()) { + metrics[i]->SendProcessMessage(); + } + } +} + +void MetricManager::EnableKindSwitch_(msptiActivityKind kind, bool flag) +{ + kindSwitchs_[kind] = flag; +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricManager.h b/msmonitor/plugin/ipc_monitor/metric/MetricManager.h new file mode 100644 index 0000000000000000000000000000000000000000..42b6d088fb382c0cef0aa1b19dbe1c1285babb51 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricManager.h @@ -0,0 +1,36 @@ +#ifndef METRIC_MANAGER_H +#define METRIC_MANAGER_H + +#include +#include + +#include "utils.h" +#include "singleton.h" +#include "mspti.h" +#include "TimerTask.h" +#include "MetricProcessBase.h" + +namespace dynolog_npu { +namespace ipc_monitor { +namespace metric { +class MetricManager: public ipc_monitor::Singleton, public TimerTask +{ +public: + MetricManager(); + ~MetricManager() = default; + ErrCode ConsumeMsptiData(msptiActivity *record); + void SetReportInterval(uint32_t intervalTimes); + void SendMetricMsg(); + void ExecuteTask() override; + void EnableKindSwitch_(msptiActivityKind kind, bool flag); + void ReleaseResource() override; +private: + std::vector> kindSwitchs_; + std::vector> consumeStatus_; + std::atomic reportInterval_; + std::vector> metrics; +}; +} +} +} +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..7eed7f5a0112b9e7b5c3cf2a99173b745dbdda8a --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.cpp @@ -0,0 +1,136 @@ +#include "MetricMarkProcess.h" + +#include +#include +#include + +#include "utils.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +constexpr size_t COMPLETE_RANGE_DATA_SIZE = 4; + +std::string MarkMetric::seriesToJson() +{ + 
nlohmann::json jsonMsg; + jsonMsg["kind"] = "Marker"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["domain"] = domain; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +bool MetricMarkProcess::TransMarkData2Range(const std::vector>& markDatas, + RangeMarkData& rangemarkData) { + if(markDatas.size() != COMPLETE_RANGE_DATA_SIZE) { + return false; + } + + for (auto& activityMarker: markDatas) { + if (activityMarker->flag == MSPTI_ACTIVITY_FLAG_MARKER_START_WITH_DEVICE) { + if (activityMarker->sourceKind == MSPTI_ACTIVITY_SOURCE_KIND_DEVICE) { + rangemarkData.deviceStart = activityMarker->timestamp; + } else { + rangemarkData.start = activityMarker->timestamp; + } + } + if (activityMarker->flag == MSPTI_ACTIVITY_FLAG_MARKER_END_WITH_DEVICE) { + if (activityMarker->sourceKind == MSPTI_ACTIVITY_SOURCE_KIND_DEVICE) { + rangemarkData.deviceEnd = activityMarker->timestamp; + } else { + rangemarkData.end = activityMarker->timestamp; + } + } + } + auto markId = markDatas[0]->id; + std::string domainName = "default"; + auto it = domainMsg.find(markId); + if (it != domainMsg.end()) { + domainName = *it->second; + } + rangemarkData.domain = domainName; + id2Marker.erase(markId); + domainMsg.erase(markId); + return true; +} + +void MetricMarkProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityMarker* apiData = ReinterpretConvert(record); + msptiActivityMarker* tmp = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityMarker), ALIGN_SIZE)); + memcpy(tmp, apiData, sizeof(msptiActivityMarker)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(tmp); + if (apiData->flag == MSPTI_ACTIVITY_FLAG_MARKER_START_WITH_DEVICE && + apiData->sourceKind == MSPTI_ACTIVITY_SOURCE_KIND_HOST) { + std::string domainStr = apiData->domain; + auto markId = apiData->id; + domainMsg.emplace(markId, std::make_shared(domainStr)); + } + } +} + +std::vector MetricMarkProcess::AggregatedData() +{ + std::vector> copyRecords; + { + 
std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + for (auto& record: copyRecords) { + id2Marker[record->id].emplace_back(std::move(record)); + } + std::vector rangeDatas; + for (auto pair = id2Marker.rbegin(); pair != id2Marker.rend(); ++pair) { + auto markId = pair->first; + auto markDatas = pair->second; + RangeMarkData rangeMark{}; + if (TransMarkData2Range(markDatas, rangeMark)) { + rangeDatas.emplace_back(rangeMark); + } + } + + std::unordered_map> domain2RangeData = + groupby(rangeDatas, [](const RangeMarkData& data) -> std::string { + return data.domain; + }); + std::vector ans; + for (auto& pair: domain2RangeData) { + MarkMetric markMetric{}; + auto domainName = pair.first; + auto rangeDatas = pair.second; + markMetric.deviceId = rangeDatas[0].deviceId; + markMetric.domain = domainName; + markMetric.timestamp = getCurrentTimestamp64(); + markMetric.duration = std::accumulate(rangeDatas.begin(), rangeDatas.end(), 0ULL, + [](uint64_t acc, const RangeMarkData& rangeData) { + return acc + rangeData.deviceEnd - rangeData.deviceStart; + }); + ans.emplace_back(markMetric); + } + return ans; +} + +void MetricMarkProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricMarkProcess::Clear() +{ + records.clear(); + domainMsg.clear(); + id2Marker.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.h new file mode 100644 index 0000000000000000000000000000000000000000..f57cc3a0d328dfd33051d436e62127fea0844dd9 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMarkProcess.h @@ -0,0 +1,57 @@ +#ifndef METRIC_MARK_PROCESS_H +#define METRIC_MARK_PROCESS_H + +#include +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + 
+struct MarkMetric { + std::string name; + std::string domain; + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +struct RangeMarkData +{ + std::string domain; + uint64_t duration; + uint64_t start{0}; + uint64_t end{0}; + uint64_t deviceStart{0}; + uint64_t deviceEnd{0}; + msptiActivitySourceKind sourceKind; + uint32_t deviceId; +}; + + +class MetricMarkProcess: public MetricProcessBase +{ +public: + MetricMarkProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + bool TransMarkData2Range(const std::vector>& markDatas, + RangeMarkData& rangemarkData); +private: + std::mutex dataMutex; + std::unordered_map> domainMsg; + std::vector> records; + std::map>> id2Marker; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..06ef5f71a0898795d4761b3ad4ddd3bc8488569f --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.cpp @@ -0,0 +1,67 @@ +#include "MetricMemCpyProcess.h" + +#include + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string MemCpyMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "MemCpy"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricMemCpyProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityMemcpy* kernel = ReinterpretConvert(record); + msptiActivityMemcpy* ptr = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityMemcpy), ALIGN_SIZE)); + memcpy(ptr, kernel, sizeof(msptiActivityMemcpy)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(ptr); + } +} + +std::vector 
MetricMemCpyProcess::AggregatedData() +{ + std::vector> copyRecords; + { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + if (copyRecords.empty()) { + return {}; + } + auto deviceId = copyRecords[0]->deviceId; + MemCpyMetric memCpyMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr memcpy) { + return acc + memcpy->end - memcpy->start; + }); + memCpyMetric.duration = ans; + memCpyMetric.deviceId = deviceId; + memCpyMetric.timestamp = getCurrentTimestamp64(); + return {memCpyMetric}; +} + +void MetricMemCpyProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricMemCpyProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.h new file mode 100644 index 0000000000000000000000000000000000000000..30ba8731923d9924a29f2145266c7d39cbcc0912 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemCpyProcess.h @@ -0,0 +1,36 @@ +#ifndef METRIC_MEMCPY_PROCESS_H +#define METRIC_MEMCPY_PROCESS_H + +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct MemCpyMetric { + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricMemCpyProcess: public MetricProcessBase +{ +public: + MetricMemCpyProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.cpp 
b/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..80e43852fa6bde5cd421cd5ae0765bd24ebd28b9 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.cpp @@ -0,0 +1,67 @@ +#include "MetricMemProcess.h" + +#include + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string MemMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "Memory"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricMemProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityMemory* mem = ReinterpretConvert(record); + msptiActivityMemory* ptr = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityMemory), ALIGN_SIZE)); + memcpy(ptr, mem, sizeof(msptiActivityMemory)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(ptr); + } +} + +std::vector MetricMemProcess::AggregatedData() +{ + std::vector> copyRecords; + { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + if (copyRecords.empty()) { + return {}; + } + auto deviceId = copyRecords[0]->deviceId; + MemMetric memMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr mem) { + return acc + mem->end - mem->start; + }); + memMetric.duration = ans; + memMetric.deviceId = deviceId; + memMetric.timestamp = getCurrentTimestamp64(); + return {memMetric}; +} + +void MetricMemProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricMemProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.h new file mode 100644 index 
0000000000000000000000000000000000000000..c6193c89e729c4d07cdd26252ddf6b7004fb8ea0 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemProcess.h @@ -0,0 +1,37 @@ +#ifndef METRIC_MEM_PROCESS_H +#define METRIC_MEM_PROCESS_H + +#include +#include "MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct MemMetric { + std::string name; + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricMemProcess: public MetricProcessBase +{ +public: + MetricMemProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.cpp b/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.cpp new file mode 100644 index 0000000000000000000000000000000000000000..cd10df769a556769c3454f57ba74c622b6bcb16e --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.cpp @@ -0,0 +1,67 @@ +#include "MetricMemSetProcess.h" + +#include + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +std::string MemSetMetric::seriesToJson() +{ + nlohmann::json jsonMsg; + jsonMsg["kind"] = "MemSet"; + jsonMsg["deviceId"] = deviceId; + jsonMsg["duration"] = duration; + jsonMsg["timestamp"] = timestamp; + return jsonMsg.dump(); +} + +void MetricMemSetProcess::ConsumeMsptiData(msptiActivity *record) +{ + msptiActivityMemset* memSet = ReinterpretConvert(record); + msptiActivityMemset* ptr = ReinterpretConvert(MsptiMalloc(sizeof(msptiActivityMemset), ALIGN_SIZE)); + memcpy(ptr, memSet, sizeof(msptiActivityMemset)); + { + std::unique_lock lock(dataMutex); + records.emplace_back(ptr); + } +} + +std::vector MetricMemSetProcess::AggregatedData() +{ + std::vector> copyRecords; 
+ { + std::unique_lock lock(dataMutex); + copyRecords = std::move(records); + records.clear(); + } + if (copyRecords.empty()) { + return {}; + } + auto deviceId = copyRecords[0]->deviceId; + MemSetMetric memSetMetric{}; + auto ans = std::accumulate(copyRecords.begin(), copyRecords.end(), 0ULL, + [](uint64_t acc, std::shared_ptr memSet) { + return acc + memSet->end - memSet->start; + }); + memSetMetric.duration = ans; + memSetMetric.deviceId = deviceId; + memSetMetric.timestamp = getCurrentTimestamp64(); + return {memSetMetric}; +} + +void MetricMemSetProcess::SendProcessMessage() +{ + auto afterAggregated = AggregatedData(); + for (auto& metric: afterAggregated) { + SendMessage(metric.seriesToJson()); + } +} + +void MetricMemSetProcess::Clear() +{ + records.clear(); +} +} +} +} \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.h b/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.h new file mode 100644 index 0000000000000000000000000000000000000000..c702a19c05c90d121278f4efeb569c975bf9e96c --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricMemSetProcess.h @@ -0,0 +1,37 @@ +#ifndef METRIC_MEM_SET_PROCESS_H +#define METRIC_MEM_SET_PROCESS_H + +#include +#include "metric/MetricProcessBase.h" + + +namespace dynolog_npu { +namespace ipc_monitor{ +namespace metric { + +struct MemSetMetric { + std::string name; + uint64_t duration; + uint64_t timestamp; + uint32_t deviceId; +public: + std::string seriesToJson(); +}; + +class MetricMemSetProcess: public MetricProcessBase +{ +public: + MetricMemSetProcess() = default; + void ConsumeMsptiData(msptiActivity *record) override; + std::vector AggregatedData(); + void SendProcessMessage() override; + void Clear() override; +private: + std::mutex dataMutex; + std::vector> records; +}; +} +} +} + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/metric/MetricProcessBase.h b/msmonitor/plugin/ipc_monitor/metric/MetricProcessBase.h new file 
mode 100644 index 0000000000000000000000000000000000000000..1e74431c3ca8bcd920e748af96121c0d8551342c --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/metric/MetricProcessBase.h @@ -0,0 +1,46 @@ +#ifndef METRIC_PROCESS_BASE_H +#define METRIC_PROCESS_BASE_H + +#include +#include + +#include "DynoLogNpuMonitor.h" +#include "NpuIpcClient.h" +#include "mspti.h" + +namespace dynolog_npu { +namespace ipc_monitor { +namespace metric { +class MetricProcessBase +{ +public: + void SendMessage(std::string message) + { + if (message.empty()) { + LOG(ERROR) << "SendMessage message is empty"; + return; + } + static const std::string destName = DYNO_IPC_NAME + "_data"; + static const int maxRetry = 5, retryWaitTimeUs = 1000; + auto msg = Message::ConstructStrMessage(message, MSG_TYPE_DATA); + if (!msg) { + LOG(ERROR) << "ConstructStrMessage failed, message: " << message; + return; + } + auto ipcClient = DynoLogNpuMonitor::GetInstance()->GetIpcClient(); + if (!ipcClient) { + LOG(ERROR) << "DynoLogNpuMonitor ipcClient is nullptr"; + return; + } + if (!ipcClient->SyncSendMessage(*msg, destName, maxRetry, retryWaitTimeUs)) { + LOG(ERROR) << "send mspti message failed: " << message; + } + } + virtual void ConsumeMsptiData(msptiActivity *record) = 0; + virtual void Clear() = 0; + virtual void SendProcessMessage() = 0; +}; +} +} +} +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.cpp b/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.cpp new file mode 100644 index 0000000000000000000000000000000000000000..f7dac309c5e72ce1c631933d0d50ec3c2444a8f3 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.cpp @@ -0,0 +1,227 @@ +#include "MsptiMonitor.h" + +#include +#include +#include +#include + +#include "DynoLogNpuMonitor.h" +#include "MetricManager.h" +#include "utils.h" + +namespace { +constexpr size_t DEFAULT_BUFFER_SIZE = 8 * 1024 * 1024; +constexpr size_t MAX_BUFFER_SIZE = 256 * 1024 * 1024; +constexpr 
uint32_t MAX_ALLOC_CNT = MAX_BUFFER_SIZE / DEFAULT_BUFFER_SIZE; + +void MsptiFree(uint8_t *ptr) +{ + if (ptr != nullptr) { + free(ptr); + } +} +} + +namespace dynolog_npu { +namespace ipc_monitor { + +MsptiMonitor::MsptiMonitor() + : start_(false), + subscriber_(nullptr), + checkFlush_(false), + flushInterval_(0) {} + +MsptiMonitor::~MsptiMonitor() +{ + Uninit(); +} + +void MsptiMonitor::Start() +{ + if (start_.load()) { + return; + } + SetThreadName("MsptiMonitor"); + if (Thread::Start() != 0) { + LOG(ERROR) << "MsptiMonitor start failed"; + return; + } + start_.store(true); + metric::MetricManager::GetInstance()->Run(); + LOG(INFO) << "MsptiMonitor start successfully"; +} + +void MsptiMonitor::Stop() +{ + if (!start_.load()) { + LOG(WARNING) << "MsptiMonitor is not running"; + return; + } + Uninit(); + if (msptiActivityFlushAll(1) != MSPTI_SUCCESS) { + LOG(WARNING) << "MsptiMonitor stop msptiActivityFlushAll failed"; + } + LOG(INFO) << "MsptiMonitor stop successfully"; +} + +void MsptiMonitor::Uninit() +{ + if (!start_.load()) { + return; + } + metric::MetricManager::GetInstance()->Stop(); + start_.store(false); + cv_.notify_one(); + Thread::Stop(); +} + +void MsptiMonitor::EnableActivity(msptiActivityKind kind) +{ + if (MSPTI_ACTIVITY_KIND_INVALID < kind && kind < MSPTI_ACTIVITY_KIND_COUNT) { + std::lock_guard lock(activityMtx_); + if (msptiActivityEnable(kind) == MSPTI_SUCCESS) { + enabledActivities_.insert(kind); + } else { + LOG(ERROR) << "MsptiMonitor enableActivity failed, kind: " << static_cast(kind); + } + metric::MetricManager::GetInstance()->EnableKindSwitch_(kind, true); + } +} + +void MsptiMonitor::DisableActivity(msptiActivityKind kind) +{ + if (MSPTI_ACTIVITY_KIND_INVALID < kind && kind < MSPTI_ACTIVITY_KIND_COUNT) { + std::lock_guard lock(activityMtx_); + if (msptiActivityDisable(kind) == MSPTI_SUCCESS) { + enabledActivities_.erase(kind); + } else { + LOG(ERROR) << "MsptiMonitor disableActivity failed, kind: " << static_cast(kind); + } + 
metric::MetricManager::GetInstance()->EnableKindSwitch_(kind, false); + } +} + +void MsptiMonitor::SetFlushInterval(uint32_t interval) +{ + flushInterval_.store(interval); + checkFlush_.store(true); + if (start_.load()) { + cv_.notify_one(); + } + metric::MetricManager::GetInstance()->SetReportInterval(interval); + metric::MetricManager::GetInstance()->Trigger(); +} + +bool MsptiMonitor::IsStarted() +{ + return start_.load(); +} + +std::set MsptiMonitor::GetEnabledActivities() +{ + std::lock_guard lock(activityMtx_); + return enabledActivities_; +} + +void MsptiMonitor::Run() +{ + if (msptiSubscribe(&subscriber_, nullptr, nullptr) != MSPTI_SUCCESS) { + LOG(ERROR) << "MsptiMonitor run failed, msptiSubscribe failed"; + return; + } + if (msptiActivityRegisterCallbacks(BufferRequest, BufferComplete) != MSPTI_SUCCESS) { + LOG(ERROR) << "MsptiMonitor run failed, msptiActivityRegisterCallbacks failed"; + return; + } + while (true) + { + std::unique_lock lock(cvMtx_); + if (flushInterval_.load() > 0) { + cv_.wait_for(lock, std::chrono::seconds(flushInterval_.load()), + [&]() { return checkFlush_.load() || !start_.load();}); + } else { + cv_.wait(lock, [&]() { return checkFlush_.load () || !start_.load();}); + } + if (!start_.load()) { + break; + } + if (checkFlush_.load()) { + checkFlush_.store(false); + } + if (flushInterval_.load() > 0) { + if (msptiActivityFlushAll(1) != MSPTI_SUCCESS) { + LOG(ERROR) << "MsptiMonitor run msptiActivityFlushAll failed"; + } + } + } + if (msptiUnsubscribe(subscriber_) != MSPTI_SUCCESS) { + LOG(ERROR) << "MsptiMonitor run failed, msptiUnsubscribe failed"; + } + { + std::lock_guard lock(activityMtx_); + for (auto kind : enabledActivities_) { + msptiActivityDisable(kind); + } + enabledActivities_.clear(); + } + checkFlush_.store(false); + flushInterval_.store(0); +} + +std::atomic MsptiMonitor::allocCnt{0}; + +void MsptiMonitor::BufferRequest(uint8_t **buffer, size_t *size, size_t *maxNumRecords) +{ + if (buffer == nullptr || size == nullptr 
|| maxNumRecords == nullptr) { + return; + } + *maxNumRecords = 0; + if (allocCnt.load() >= MAX_ALLOC_CNT) { + *buffer = nullptr; + *size = 0; + LOG(ERROR) << "MsptiMonitor BufferRequest failed, allocCnt: " << allocCnt.load(); + return; + } + uint8_t *pBuffer = ReinterpretConvert(MsptiMalloc(DEFAULT_BUFFER_SIZE, ALIGN_SIZE)); + if (pBuffer == nullptr) { + *buffer = nullptr; + *size = 0; + } else { + *buffer = pBuffer; + *size = DEFAULT_BUFFER_SIZE; + allocCnt++; + LOG(INFO) << "MsptiMonitor BufferRequest, size: " << *size; + } +} + +void MsptiMonitor::BufferComplete(uint8_t *buffer, size_t size, size_t validSize) +{ + if (validSize > 0 && buffer != nullptr) { + LOG(INFO) << "MsptiMonitor BufferComplete, size: " << size << ", validSize: " << validSize; + msptiActivity *record = nullptr; + msptiResult status = MSPTI_SUCCESS; + do { + status = msptiActivityGetNextRecord(buffer, validSize, &record); + if (status == MSPTI_SUCCESS) { + BufferConsume(record); + } else if (status == MSPTI_ERROR_MAX_LIMIT_REACHED) { + break; + } else { + LOG(ERROR) << "MsptiMonitor BufferComplete failed, status: " << static_cast(status); + break; + } + } while (true); + allocCnt--; + } + MsptiFree(buffer); +} + +void MsptiMonitor::BufferConsume(msptiActivity *record) +{ + if (record == nullptr) { + return; + } + metric::MetricManager::GetInstance()->ConsumeMsptiData(record); +} +} // namespace ipc_monitor +} // namespace dynolog_npu diff --git a/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.h b/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.h new file mode 100644 index 0000000000000000000000000000000000000000..f459703fbf7b5027604d7afeac2b3653b8886089 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/mspti_monitor/MsptiMonitor.h @@ -0,0 +1,48 @@ +#ifndef MSPTI_MONITOR_H +#define MSPTI_MONITOR_H + +#include +#include +#include +#include +#include "mspti.h" +#include "thread.h" + + +namespace dynolog_npu { +namespace ipc_monitor { +class MsptiMonitor : public Thread { 
+public: + explicit MsptiMonitor(); + virtual ~MsptiMonitor(); + void Start(); + void Stop(); + void EnableActivity(msptiActivityKind kind); + void DisableActivity(msptiActivityKind kind); + void SetFlushInterval(uint32_t interval); + bool IsStarted(); + std::set GetEnabledActivities(); + void Uninit(); + +private: + static void BufferRequest(uint8_t **buffer, size_t *size, size_t *maxNumRecords); + static void BufferComplete(uint8_t *buffer, size_t size, size_t validSize); + static void BufferConsume(msptiActivity *record); + static std::atomic allocCnt; + +private: + void Run() override; + +private: + std::atomic start_; + std::mutex cvMtx_; + std::condition_variable cv_; + msptiSubscriberHandle subscriber_; + std::mutex activityMtx_; + std::set enabledActivities_; + std::atomic checkFlush_; + std::atomic flushInterval_; +}; +} // namespace ipc_monitor +} // namespace dynolog_npu +#endif // MSPTI_MONITOR_H diff --git a/msmonitor/plugin/ipc_monitor/mspti_monitor/mspti.h b/msmonitor/plugin/ipc_monitor/mspti_monitor/mspti.h new file mode 100644 index 0000000000000000000000000000000000000000..225dc3b9cb99a8ab1a5cf87322923a27107a0318 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/mspti_monitor/mspti.h @@ -0,0 +1,244 @@ +#ifndef MSPTI_STUB_H +#define MSPTI_STUB_H + +constexpr int ACTIVITY_STRUCT_ALIGNMENT = 8; +#if defined(_WIN32) +#define START_PACKED_ALIGNMENT __pragma(pack(push, 1)) +#define PACKED_ALIGNMENT __declspec(align(ACTIVITY_STRUCT_ALIGNMENT)) +#define END_PACKED_ALIGNMENT __pragma(pack(pop)) +#elif defined(__GNUC__) +#define START_PACKED_ALIGNMENT +#define PACKED_ALIGNMENT __attribute__((__packed__)) __attribute__((aligned(ACTIVITY_STRUCT_ALIGNMENT))) +#define END_PACKED_ALIGNMENT +#else +#define START_PACKED_ALIGNMENT +#define PACKED_ALIGNMENT +#define END_PACKED_ALIGNMENT +#endif + +#include +#include + +#define MSPTI_INVALID_DEVICE_ID ((uint32_t) 0xFFFFFFFFU) +#define MSPTI_INVALID_STREAM_ID ((uint32_t) 0xFFFFFFFFU) +#define 
MSPTI_INVALID_CORRELATION_ID ((uint64_t) 0) +using msptiCallbackId = uint32_t; + +#ifdef __cplusplus +extern "C" { +#endif // __cplusplus + +typedef enum { + MSPTI_SUCCESS = 0, + MSPTI_ERROR_INVALID_PARAMETER = 1, + MSPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED = 2, + MSPTI_ERROR_MAX_LIMIT_REACHED = 3, + MSPTI_ERROR_DEVICE_OFFLINE = 4, + MSPTI_ERROR_QUERY_EMPTY = 5, + MSPTI_ERROR_INNER = 999, + MSPTI_ERROR_FOECE_INT = 0x7fffffff +} msptiResult; + +typedef enum { + MSPTI_CB_DOMAIN_INVALID = 0, + MSPTI_CB_DOMAIN_RUNTIME = 1, + MSPTI_CB_DOMAIN_HCCL = 2, + MSPTI_CB_DOMAIN_SIZE, + MSPTI_CB_DOMAIN_FORCE_INT = 0x7fffffff +} msptiCallbackDomain; + +typedef enum { + MSPTI_API_ENTER = 0, + MSPTI_API_EXIT = 1, + MSPTI_API_CBSITE_FORCE_INT = 0x7fffffff +} msptiApiCallbackSite; + +typedef struct { + msptiApiCallbackSite callbackSite; + const char *functionName; + const void *functionParams; + const void *functionReturnValue; + const char *symbolName; + uint64_t correlationId; + uint64_t reserved1; + uint64_t reserved2; + uint64_t *correlationData; +} msptiCallbackData; + +typedef enum { + MSPTI_ACTIVITY_KIND_INVALID = 0, + MSPTI_ACTIVITY_KIND_MARKER = 1, + MSPTI_ACTIVITY_KIND_KERNEL = 2, + MSPTI_ACTIVITY_KIND_API = 3, + MSPTI_ACTIVITY_KIND_HCCL = 4, + MSPTI_ACTIVITY_KIND_MEMORY = 5, + MSPTI_ACTIVITY_KIND_MEMSET = 6, + MSPTI_ACTIVITY_KIND_MEMCPY = 7, + MSPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION = 8, + MSPTI_ACTIVITY_KIND_COUNT, + MSPTI_ACTIVITY_KIND_FORCE_INT = 0x7fffffff +} msptiActivityKind; + +typedef enum { + MSPTI_ACTIVITY_FLAG_NONE = 0, + MSPTI_ACTIVITY_FLAG_MARKER_INSTANTANEOUS = 1 << 0, + MSPTI_ACTIVITY_FLAG_MARKER_START = 1 << 1, + MSPTI_ACTIVITY_FLAG_MARKER_END = 1 << 2, + MSPTI_ACTIVITY_FLAG_MARKER_INSTANTANEOUS_WITH_DEVICE = 1 << 3, + MSPTI_ACTIVITY_FLAG_MARKER_START_WITH_DEVICE = 1 << 4, + MSPTI_ACTIVITY_FLAG_MARKER_END_WITH_DEVICE = 1 << 5 +} msptiActivityFlag; + +typedef enum { + MSPTI_ACTIVITY_SOURCE_KIND_HOST = 0, + MSPTI_ACTIVITY_SOURCE_KIND_DEVICE = 1 +} 
msptiActivitySourceKind; + +typedef enum { + MSPTI_ACTIVITY_MEMORY_OPERATION_TYPE_ALLOCATATION = 0, + MSPTI_ACTIVITY_MEMORY_OPERATION_TYPE_RELEASE = 1 +} msptiActivityMemoryOperationType; + +typedef enum { + MSPTI_ACTIVITY_MEMORY_KIND_UNKNOWN = 0, + MSPTI_ACTIVITY_MEMORY_KIND_DEVICE = 1 +} msptiActivityMemoryKind; + +typedef enum { + MSPTI_ACTIVITY_MEMCPY_KIND_UNKNOWN = 0, + MSPTI_ACTIVITY_MEMCPY_KIND_HTOH = 1, + MSPTI_ACTIVITY_MEMCPY_KIND_HTOD = 2, + MSPTI_ACTIVITY_MEMCPY_KIND_DTOH = 3, + MSPTI_ACTIVITY_MEMCPY_KIND_DTOD = 4, + MSPTI_ACTIVITY_MEMCPY_KIND_DEFAULT = 5 +} msptiActivityMemcpyKind; + +START_PACKED_ALIGNMENT + +typedef union PACKED_ALIGNMENT { + struct { + uint32_t processId; + uint32_t threadId; + } pt; + struct { + uint32_t deviceId; + uint32_t streamId; + } ds; +} msptiObjectId; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; +} msptiActivity; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + uint64_t start; + uint64_t end; + struct { + uint32_t processId; + uint32_t threadId; + } pt; + uint64_t correlationId; + const char* name; +} msptiActivityApi; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + uint64_t start; + uint64_t end; + struct { + uint32_t deviceId; + uint32_t streamId; + } ds; + uint64_t correlationId; + const char *type; + const char *name; +} msptiActivityKernel; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + msptiActivityFlag flag; + msptiActivitySourceKind sourceKind; + uint64_t timestamp; + uint64_t id; + msptiObjectId objectId; + const char *name; + const char *domain; +} msptiActivityMarker; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + uint64_t start; + uint64_t end; + struct { + uint32_t deviceId; + uint32_t streamId; + } ds; + double bandWidth; + const char *name; + const char *commName; +} msptiActivityHccl; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + msptiActivityMemoryOperationType memoryOperationType; + 
msptiActivityMemoryKind memoryKind; + uint64_t correlationId; + uint64_t start; + uint64_t end; + uint64_t address; + uint64_t bytes; + uint32_t processId; + uint32_t deviceId; + uint32_t streamId; +} msptiActivityMemory; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + uint32_t value; + uint64_t bytes; + uint64_t start; + uint64_t end; + uint32_t deviceId; + uint32_t streamId; + uint64_t correlationId; + uint8_t isAsync; +} msptiActivityMemset; + +typedef struct PACKED_ALIGNMENT { + msptiActivityKind kind; + msptiActivityMemcpyKind copyKind; + uint64_t bytes; + uint64_t start; + uint64_t end; + uint32_t deviceId; + uint32_t streamId; + uint64_t correlationId; + uint8_t isAsync; +} msptiActivityMemcpy; + +END_PACKED_ALIGNMENT + +typedef void(*msptiCallbackFunc)(void* userdata, msptiCallbackDomain domain, msptiCallbackId cbid, const msptiCallbackData *cbdata); +typedef void(*msptiBuffersCallbackRequestFunc)(uint8_t **buffer, size_t *size, size_t *maxNumRecords); +typedef void(*msptiBuffersCallbackCompleteFunc)(uint8_t *buffer, size_t size, size_t validSize); + +struct msptiSubscriber_st { + msptiCallbackFunc callback; + void *userdata; +}; + +typedef struct msptiSubscriber_st *msptiSubscriberHandle; + +msptiResult msptiSubscribe(msptiSubscriberHandle *subscriber, msptiCallbackFunc callback, void *userdata); +msptiResult msptiUnsubscribe(msptiSubscriberHandle subscriber); +msptiResult msptiActivityRegisterCallbacks(msptiBuffersCallbackRequestFunc funcBufferRequested, msptiBuffersCallbackCompleteFunc funcBufferCompleted); +msptiResult msptiActivityEnable(msptiActivityKind kind); +msptiResult msptiActivityDisable(msptiActivityKind kind); +msptiResult msptiActivityGetNextRecord(uint8_t *buffer, size_t validBufferSizeBytes, msptiActivity **record); +msptiResult msptiActivityFlushAll(uint32_t flag); + +#ifdef __cplusplus +} +#endif // __cplusplus +#endif // MSPTI_STUB_H diff --git a/msmonitor/plugin/ipc_monitor/singleton.h 
b/msmonitor/plugin/ipc_monitor/singleton.h new file mode 100644 index 0000000000000000000000000000000000000000..b2e874dc04f4720571ea178047e34b23641ae08c --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/singleton.h @@ -0,0 +1,31 @@ +#ifndef SINGLETON_H +#define SINGLETON_H +#include + +namespace dynolog_npu { +namespace ipc_monitor { + +template +class Singleton { +public: + static T *GetInstance() noexcept(std::is_nothrow_constructible::value) { + static T instance; + return &instance; + } + + virtual ~Singleton() = default; + +protected: + explicit Singleton() = default; + +private: + explicit Singleton(const Singleton &obj) = delete; + Singleton& operator=(const Singleton &obj) = delete; + explicit Singleton(Singleton &&obj) = delete; + Singleton& operator=(Singleton &&obj) = delete; +}; + +} // ipc_monitor +} // dynolog_npu + +#endif \ No newline at end of file diff --git a/msmonitor/plugin/ipc_monitor/thread.h b/msmonitor/plugin/ipc_monitor/thread.h new file mode 100644 index 0000000000000000000000000000000000000000..9e1926917af380ec14cb517cb9efe57bf110405a --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/thread.h @@ -0,0 +1,75 @@ +#ifndef IPC_MONITOR_THREAD_H +#define IPC_MONITOR_THREAD_H + +#include +#include +#include +#include +#include "utils.h" + +namespace dynolog_npu { +namespace ipc_monitor { +class Thread { +public: + Thread() + : is_alive_(false), + pid_(0), + thread_name_("IPCMonitor") {} + + ~Thread() + { + if (is_alive_) { + (void)pthread_cancel(pid_); + (void)pthread_join(pid_, nullptr); + } + } + + void SetThreadName(const std::string &name) + { + if (!name.empty()) { + thread_name_ = name; + } + } + + std::string GetThreadName() + { + return thread_name_; + } + + int Start() + { + int ret = pthread_create(&pid_, nullptr, Execute, ReinterpretConvert(this)); + is_alive_ = (ret == 0) ? true : false; + return ret; + } + + int Stop() + { + return Join(); + } + + int Join() + { + int ret = pthread_join(pid_, nullptr); + is_alive_ = (ret == 0) ? 
false : true; + return ret; + } + +private: + static void* Execute(void *args) + { + Thread *thr = ReinterpretConvert(args); + prctl(PR_SET_NAME, ReinterpretConvert(thr->GetThreadName().data())); + thr->Run(); + return nullptr; + } + virtual void Run() = 0; + +private: + bool is_alive_; + pthread_t pid_; + std::string thread_name_; +}; +} // ipc_monitor +} // dynolog_npu +#endif // IPC_MONITOR_THREAD_H diff --git a/msmonitor/plugin/ipc_monitor/utils.cpp b/msmonitor/plugin/ipc_monitor/utils.cpp new file mode 100644 index 0000000000000000000000000000000000000000..a47248ef0b1e3da14e6b28bcde2a229a7b617412 --- /dev/null +++ b/msmonitor/plugin/ipc_monitor/utils.cpp @@ -0,0 +1,427 @@ +#include "utils.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +namespace dynolog_npu { +namespace ipc_monitor { +std::unordered_map submoduleMap = { + {SubModule::IPC, "IPC"}, +}; + +std::unordered_map errCodeMap = { + {ErrCode::SUC, "success"}, + {ErrCode::PARAM, "invalid parameter"}, + {ErrCode::TYPE, "invalid type"}, + {ErrCode::VALUE, "invalid value"}, + {ErrCode::PTR, "invalid pointer"}, + {ErrCode::INTERNAL, "internal error"}, + {ErrCode::MEMORY, "memory error"}, + {ErrCode::NOT_SUPPORT, "feature not supported"}, + {ErrCode::NOT_FOUND, "resource not found"}, + {ErrCode::UNAVAIL, "resource unavailable"}, + {ErrCode::SYSCALL, "system call failed"}, + {ErrCode::TIMEOUT, "timeout error"}, + {ErrCode::PERMISSION, "permission error"}, +}; + +std::string getCurrentTimestamp() +{ + auto now = std::chrono::system_clock::now(); + auto micros = std::chrono::duration_cast(now.time_since_epoch()); + + std::time_t currentTime = std::chrono::system_clock::to_time_t(now); + std::tm* timeInfo = std::localtime(&currentTime); + + auto milli_time = std::chrono::duration_cast(micros).count() % 1000; + auto micro_time = micros.count() % 1000; + + std::ostringstream oss; + oss << 
std::put_time(timeInfo, "%Y-%m-%d-%H:%M:%S"); + return oss.str(); +} + +uint64_t getCurrentTimestamp64() +{ + auto now = std::chrono::system_clock::now(); + auto micros = std::chrono::duration_cast(now.time_since_epoch()); + auto milli_time = std::chrono::duration_cast(micros).count(); + return milli_time; +} + +std::string formatErrorCode(SubModule submodule, ErrCode errorCode) +{ + std::ostringstream oss; + oss << "\n[ERROR] " << getCurrentTimestamp() << " (PID:" << getpid() << ")"; + oss << "ERR" << std::setw(2) << std::setfill('0') << static_cast(submodule); // 2: field width + oss << std::setw(3) << std::setfill('0') << static_cast(errorCode); // 3: field width + oss << " " << submoduleMap[submodule] << " " << errCodeMap[errorCode]; + return oss.str(); +}; + +int32_t GetProcessId() +{ + return static_cast(getpid()); +} + +bool ParseProcStat(const std::string& line, std::string& command, int& parentPid) +{ + size_t lparen = line.find('('); + size_t rparen = line.rfind(')'); + if (lparen == std::string::npos || rparen == std::string::npos || rparen <= lparen + 1) { + LOG(WARNING) << "cannot find command name: " << line; + return false; + } + command = line.substr(lparen + 1, rparen - lparen - 1); + + std::string afterCmd = line.substr(rparen + 1); + std::istringstream iss(afterCmd); + std::string state; + int ppid; + if (!(iss >> state >> ppid)) { + LOG(WARNING) << "Failed to parse state/ppid from: " << afterCmd; + return false; + } + parentPid = ppid; + return true; +} + +std::pair GetParentPidAndCommand(int32_t pid) +{ + std::string fileName = "/proc/" + std::to_string(pid) + "/stat"; + std::ifstream statFile(fileName); + if (!statFile) { + return std::make_pair(0, ""); + } + int32_t parentPid = 0; + std::string command; + std::string line; + if (std::getline(statFile, line)) { + bool ret = ParseProcStat(line, command, parentPid); + if (ret) { + return std::make_pair(parentPid, command); + } + } + LOG(WARNING) << "Failed to parse /proc/" << pid << "/stat"; + return 
std::make_pair(0, ""); +} + +std::vector> GetPidCommandPairsofAncestors() +{ + std::vector> process_pids_and_cmds; + process_pids_and_cmds.reserve(MaxParentPids + 1); + int32_t current_pid = GetProcessId(); + for (int i = 0; i <= MaxParentPids && (i == 0 || current_pid > 1); i++) { + std::pair parent_pid_and_cmd = GetParentPidAndCommand(current_pid); + process_pids_and_cmds.push_back(std::make_pair(current_pid, parent_pid_and_cmd.second)); + current_pid = parent_pid_and_cmd.first; + } + return process_pids_and_cmds; +} + +std::vector GetPids() +{ + const auto &pids = GetPidCommandPairsofAncestors(); + std::vector res; + res.reserve(pids.size()); + for (const auto &pidPair : pids) { + res.push_back(pidPair.first); + } + LOG(INFO) << "Success to get parent pid: " << res; + return res; +} + +std::string GenerateUuidV4() +{ + static std::random_device randomDevice; + static std::mt19937 gen(randomDevice()); + static std::uniform_int_distribution<> dis(0, 15); // range (0, 15) + static std::uniform_int_distribution<> dis2(8, 11); // range (8, 11) + + std::stringstream stringStream; + stringStream << std::hex; + for (int i = 0; i < 8; i++) { // 8 times + stringStream << dis(gen); + } + stringStream << "-"; + for (int j = 0; j < 4; j++) { // 4 times + stringStream << dis(gen); + } + stringStream << "-4"; // add -4 + for (int k = 0; k < 3; k++) { // 3 times + stringStream << dis(gen); + } + stringStream << "-"; + stringStream << dis2(gen); + for (int m = 0; m < 3; m++) { // 3 times + stringStream << dis(gen); + } + stringStream << "-"; + for (int n = 0; n < 12; n++) { // 12 times + stringStream << dis(gen); + } + return stringStream.str(); +} + +bool Str2Uint32(uint32_t& dest, const std::string& str) +{ + if (str.empty()) { + LOG(ERROR) << "Str to uint32 failed, input string is null"; + return false; + } + size_t pos = 0; + try { + dest = static_cast(std::stoul(str, &pos)); + } catch(...) 
{ + LOG(ERROR) << "Str to uint32 failed, input string is " << str; + return false; + } + if (pos != str.size()) { + LOG(ERROR) << "Str to uint32 failed, input string is " << str; + return false; + } + return true; +} + +bool Str2Bool(bool& dest, const std::string& str) +{ + std::string lower_str = str; + std::transform(lower_str.begin(), lower_str.end(), lower_str.begin(), ::tolower); + + if (lower_str == "true" || lower_str == "1") { + dest = true; + return true; + } + + if (lower_str == "false" || lower_str == "0") { + dest = false; + return true; + } + LOG(ERROR) << "Str to bool failed, input string is " << str; + return false; +} + +std::string& trim(std::string& str) +{ + if (str.empty()) { + return str; + } + str.erase(0, str.find_first_not_of(" ")); + str.erase(str.find_last_not_of(" ") + 1); + return str; +} + +// split function +std::vector split(const std::string& str, char delimiter) +{ + std::vector tokens; + std::string token; + std::istringstream tokenStream(str); + + while (std::getline(tokenStream, token, delimiter)) { + tokens.push_back(token); + } + + return tokens; +} + +void *MsptiMalloc(size_t size, size_t alignment) +{ + if (alignment > 0) { + size = (size + alignment - 1) / alignment * alignment; + } +#if defined(_POSIX_C_SOURCE) && _POSIX_C_SOURCE >= 200112L + void *ptr = nullptr; + if (posix_memalign(&ptr, alignment, size) != 0) { + ptr = nullptr; + } + return ptr; +#else + return malloc(size); +#endif +} + +bool PathUtils::IsFileExist(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return false; + } + return access(path.c_str(), F_OK) == 0; +} + +bool PathUtils::IsFileWritable(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return false; + } + return access(path.c_str(), W_OK) == 0; +} + +bool PathUtils::IsDir(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return false; + } + struct stat st{}; + int ret = lstat(path.c_str(), &st); + if (ret != 0) { + return 
false; + } + return S_ISDIR(st.st_mode); +} + +bool PathUtils::CreateDir(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return false; + } + if (IsFileExist(path)) { + return IsDir(path); + } + size_t pos = 0; + while ((pos = path.find_first_of('/', pos)) != std::string::npos) { + std::string baseDir = path.substr(0, ++pos); + if (IsFileExist(baseDir)) { + if (IsDir(baseDir)) { + continue; + } else { + return false; + } + } + if (mkdir(baseDir.c_str(), DATA_DIR_AUTHORITY) != 0) { + if (errno != EEXIST) { + return false; + } + } + } + auto ret = mkdir(path.c_str(), DATA_DIR_AUTHORITY); + return (ret == 0 || errno == EEXIST) ? true : false; +} + +std::string PathUtils::RealPath(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return ""; + } + char realPath[PATH_MAX] = {0}; + if (realpath(path.c_str(), realPath) == nullptr) { + return ""; + } + return std::string(realPath); +} + +std::string PathUtils::RelativeToAbsPath(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX) { + return ""; + } + if (path[0] != '/') { + char pwdPath[PATH_MAX] = {0}; + if (getcwd(pwdPath, PATH_MAX) != nullptr) { + return std::string(pwdPath) + "/" + path; + } + return ""; + } + return std::string(path); +} + +std::string PathUtils::DirName(const std::string &path) +{ + if (path.empty()) { + return ""; + } + char tempPath[PATH_MAX] = {0}; + strncpy(tempPath, path.c_str(), path.size() < PATH_MAX ? path.size() : PATH_MAX); + char* cPath = dirname(tempPath); + return cPath ? std::string(cPath) : ""; +} + +bool PathUtils::CreateFile(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX || !CreateDir(DirName(path))) { + return false; + } + int fd = creat(path.c_str(), DATA_FILE_AUTHORITY); + return (fd < 0 || close(fd) != 0) ? 
false : true; +} + +bool PathUtils::IsSoftLink(const std::string &path) +{ + if (path.empty() || path.size() > PATH_MAX || !IsFileExist(path)) { + return false; + } + struct stat st{}; + if (lstat(path.c_str(), &st) != 0) { + return false; + } + return S_ISLNK(st.st_mode); +} + +bool PathUtils::DirPathCheck(const std::string& absPath) +{ + if (absPath.empty() || absPath.size() > PATH_MAX) { + fprintf(stderr, "[ERROR] The length of Path %s is invalid.\n", absPath.c_str()); + return false; + } + if (IsSoftLink(absPath)) { + fprintf(stderr, "[ERROR] Path %s is soft link.\n", absPath.c_str()); + return false; + } + if (!IsFileExist(absPath) && !CreateDir(absPath)) { + fprintf(stderr, "[ERROR] Path %s not exist and create failed.\n", absPath.c_str()); + return false; + } + if (!IsDir(absPath) || !IsFileWritable(absPath)) { + fprintf(stderr, "[ERROR] %s is not a directory or is not writable.\n", absPath.c_str()); + return false; + } + return true; +} + +bool CreateMsmonitorLogPath(std::string& path) +{ + const char* logPathEnvVal = getenv("MSMONITOR_LOG_PATH"); + std::string logPath; + if (logPathEnvVal != nullptr) { + logPath = logPathEnvVal; + } + if (logPath.empty()) { + char cwdPath[PATH_MAX] = {0}; + if (getcwd(cwdPath, PATH_MAX) != nullptr) { + logPath = cwdPath; + } + } + if (logPath.empty()) { + fprintf(stderr, "[ERROR] Failed to get msmonitor log path.\n"); + return false; + } + logPath = logPath + "/msmonitor_log"; + std::string absPath = PathUtils::RelativeToAbsPath(logPath); + if (PathUtils::DirPathCheck(absPath)) { + std::string realPath = PathUtils::RealPath(absPath); + if (PathUtils::CreateDir(realPath)) { + path = realPath; + fprintf(stderr, "[INFO] Msmonitor log will record to %s.\n", realPath.c_str()); + return true; + } + fprintf(stderr, "[ERROR] Create LOG_PATH: %s failed.\n", realPath.c_str()); + } else { + fprintf(stderr, "[ERROR] LOG_PATH: %s of Msmonitor is invalid.\n", absPath.c_str()); + } + return false; +} +} // namespace ipc_monitor +} // 
namespace dynolog_npu diff --git a/profiler/msprof_analyze/osrt_trace/src/utils.h b/msmonitor/plugin/ipc_monitor/utils.h similarity index 38% rename from profiler/msprof_analyze/osrt_trace/src/utils.h rename to msmonitor/plugin/ipc_monitor/utils.h index 129c062d5f2898d0b33db33f4716ae497c6ad8d1..58da4b2b3e12c1ab69fd2428a34f434b73af1ce4 100644 --- a/profiler/msprof_analyze/osrt_trace/src/utils.h +++ b/msmonitor/plugin/ipc_monitor/utils.h @@ -1,5 +1,5 @@ -/** - * Copyright 2024 Huawei Technologies Co., Ltd +/* + * Copyright (C) 2025-2025. Huawei Technologies Co., Ltd. All rights reserved. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -13,28 +13,77 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -#pragma once -#include -#include +#ifndef IPC_MONITOR_UTILS_H +#define IPC_MONITOR_UTILS_H + +#include +#include #include +#include +#include +#include -#define LIKELY(x) (__builtin_expect(!!(x), 1)) -#define UNLIKELY(x) (__builtin_expect(!!(x), 0)) +namespace dynolog_npu { +namespace ipc_monitor { +constexpr int MaxParentPids = 5; +int32_t GetProcessId(); +std::string GenerateUuidV4(); +std::vector GetPids(); +std::pair GetParentPidAndCommand(int32_t pid); +std::vector> GetPidCommandPairsofAncestors(); +std::string getCurrentTimestamp(); +uint64_t getCurrentTimestamp64(); +bool Str2Uint32(uint32_t& dest, const std::string& str); +bool Str2Bool(bool& dest, const std::string& str); +std::string& trim(std::string& str); +std::vector split(const std::string& str, char delimiter); +constexpr size_t ALIGN_SIZE = 8; +void *MsptiMalloc(size_t size, size_t alignment); const mode_t DATA_FILE_AUTHORITY = 0640; const mode_t DATA_DIR_AUTHORITY = 0750; -inline uint64_t nsec_now() -{ - static const uint64_t S_TO_NS = 1000 * 1000 * 1000; - struct timespec ts; - clock_gettime(CLOCK_REALTIME, &ts); - return static_cast(ts.tv_sec * S_TO_NS + 
ts.tv_nsec); -} +enum class SubModule { + IPC = 0 +}; + +enum class ErrCode { + SUC = 0, + PARAM = 1, + TYPE = 2, + VALUE = 3, + PTR = 4, + INTERNAL = 5, + MEMORY = 6, + NOT_SUPPORT = 7, + NOT_FOUND = 8, + UNAVAIL = 9, + SYSCALL = 10, + TIMEOUT = 11, + PERMISSION = 12, +}; -int str_to_i64(const std::string& str, int64_t& num); +std::string formatErrorCode(SubModule submodule, ErrCode errorCode); + +#define IPC_ERROR(error) formatErrorCode(SubModule::IPC, error) + +template +inline T ReinterpretConvert(V ptr) { + return reinterpret_cast(ptr); +} +template +auto groupby(const Container& vec, KeyFunc keyFunc) { + using KeyType = decltype(keyFunc(*vec.begin())); + using ValueType = typename Container::value_type; + std::unordered_map> grouped; + for (const auto& item : vec) { + grouped[keyFunc(item)].push_back(item); + } + return grouped; +} +bool CreateMsmonitorLogPath(std::string& path); struct PathUtils { static bool IsFileExist(const std::string &path); @@ -48,3 +97,6 @@ struct PathUtils { static bool IsSoftLink(const std::string &path); static bool DirPathCheck(const std::string &path); }; +} // namespace ipc_monitor +} // namespace dynolog_npu +#endif // IPC_MONITOR_UTILS_H diff --git a/msmonitor/plugin/setup.py b/msmonitor/plugin/setup.py new file mode 100644 index 0000000000000000000000000000000000000000..2e257a48ada719a56d3cd0299f56f61351f249f4 --- /dev/null +++ b/msmonitor/plugin/setup.py @@ -0,0 +1,69 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +import os +import sys + +import subprocess +import pybind11 + +from setuptools import setup, Extension +from setuptools.command.build_ext import build_ext + + +class CMakeExtension(Extension): + def __init__(self, name, sourcedir=""): + super().__init__(name, sources=[]) + self.sourcedir = os.path.abspath(sourcedir) + + +class CMakeBuild(build_ext): + def run(self): + for ext in self.extensions: + self.build_extension(ext) + + def build_extension(self, ext): + cfg = 'Debug' if self.debug else 'Release' + build_args = ['--config', cfg] + + ext_dir = os.path.abspath(os.path.dirname(self.get_ext_fullpath(ext.name))) + cmake_args = [ + '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + ext_dir, + '-DPYTHON_EXECUTABLE=' + sys.executable, + '-DCMAKE_PREFIX_PATH=' + pybind11.get_cmake_dir(), + '-DCMAKE_INSTALL_PREFIX=' + ext_dir, + '-DDYNOLOG_PATH=' + os.path.join(os.path.dirname(BASE_DIR), "third_party", "dynolog"), + '-DCMAKE_BUILD_TYPE=' + cfg + ] + + env = os.environ.copy() + env['CXXFLAGS'] = '{} -DVERSION_INFO=\\"{}\\"'.format(env.get('CXXFLAGS', ''), + self.distribution.get_version()) + + if not os.path.exists(self.build_temp): + os.makedirs(self.build_temp) + subprocess.check_call(['cmake', ext.sourcedir] + cmake_args, cwd=self.build_temp, env=env) + subprocess.check_call(['cmake', '--build', '.', '--target', 'install', '-j', '8'] + build_args, + cwd=self.build_temp) + +BASE_DIR = os.path.dirname(os.path.realpath(__file__)) + +setup( + name="msmonitor_plugin", + version="0.1", + description="msMonitor plugins", + ext_modules=[CMakeExtension('IPCMonitor')], + cmdclass=dict(build_ext=CMakeBuild), + install_requires=["pybind11"], +) diff --git a/msmonitor/plugin/stub/build_stub.sh b/msmonitor/plugin/stub/build_stub.sh new file mode 100644 index 0000000000000000000000000000000000000000..97ec0699aec5923497ee32a7252b0337db059f7f --- /dev/null +++ 
b/msmonitor/plugin/stub/build_stub.sh @@ -0,0 +1,7 @@ +#!/bin/bash + +CDIR="$(cd "$(dirname "$0")" ; pwd -P)" + +cd ${CDIR} + +gcc -fPIC -shared -o libmspti.so -I../ipc_monitor/mspti_monitor mspti.cpp diff --git a/msmonitor/plugin/stub/mspti.cpp b/msmonitor/plugin/stub/mspti.cpp new file mode 100644 index 0000000000000000000000000000000000000000..d0c73b74430b9d33bff35e9e0f0a2912bda0b354 --- /dev/null +++ b/msmonitor/plugin/stub/mspti.cpp @@ -0,0 +1,36 @@ +#include "mspti.h" + +msptiResult msptiSubscribe(msptiSubscriberHandle *subscriber, msptiCallbackFunc callback, void *userdata) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiUnsubscribe(msptiSubscriberHandle subscriber) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiActivityRegisterCallbacks(msptiBuffersCallbackRequestFunc funcBufferRequested, msptiBuffersCallbackCompleteFunc funcBufferCompleted) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiActivityEnable(msptiActivityKind kind) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiActivityDisable(msptiActivityKind kind) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiActivityGetNextRecord(uint8_t *buffer, size_t validBufferSizeBytes, msptiActivity **record) +{ + return MSPTI_SUCCESS; +} + +msptiResult msptiActivityFlushAll(uint32_t flag) +{ + return MSPTI_SUCCESS; +} diff --git a/msmonitor/scripts/apply_dyno_patches.sh b/msmonitor/scripts/apply_dyno_patches.sh new file mode 100644 index 0000000000000000000000000000000000000000..c492db74a2a56948433a47e9cffcccd4ac71e098 --- /dev/null +++ b/msmonitor/scripts/apply_dyno_patches.sh @@ -0,0 +1,36 @@ +#! /bin/bash +set -e + +apply_ascend_patches() { + cd ./third_party/dynolog || return 1 + + if [ ! -d "../../patches" ]; then + echo "ERROR: patches directory not found" + cd ../.. + return 1 + fi + + for patch_file in ../../patches/*.patch; do + if [ -f "$patch_file" ]; then + echo "Applying patch: $patch_file" + git apply --check -p1 "$patch_file" + if [ $? 
-ne 0 ]; then + echo "ERROR: Failed to apply patch: $(basename $patch_file)" + cd ../.. + return 1 + fi + git apply -p1 "$patch_file" + if [ $? -ne 0 ]; then + echo "ERROR: Failed to apply patch: $(basename $patch_file)" + cd ../.. + return 1 + fi + fi + done + + cd ../.. + echo "Successfully applied all Ascend patches" + return 0 +} + +apply_ascend_patches \ No newline at end of file diff --git a/msmonitor/scripts/build.sh b/msmonitor/scripts/build.sh new file mode 100644 index 0000000000000000000000000000000000000000..52cd5ad4f133bf5ca95f1a624fd4ad556a6b62f5 --- /dev/null +++ b/msmonitor/scripts/build.sh @@ -0,0 +1,119 @@ +#!/bin/bash +set -e +export BUILD_PROMETHEUS=1 + +check_gcc_version() { + if ! command -v gcc >/dev/null 2>&1; then + echo "ERROR: gcc command not found" + return 1 + fi + + local GCC_VERSION=$(gcc -dumpversion) + local GCC_MAJOR=$(echo $GCC_VERSION | cut -d. -f1) + local GCC_MINOR=$(echo $GCC_VERSION | cut -d. -f2) + + if [ "$GCC_MAJOR" -lt 8 ] || ([ "$GCC_MAJOR" -eq 8 ] && [ "$GCC_MINOR" -lt 5 ]); then + echo "ERROR: gcc version must be greater than or equal to 8.5.0" + echo "Current gcc version: $GCC_VERSION" + return 1 + fi + echo "Check pass: current gcc version is $GCC_VERSION" + return 0 +} + +check_rust_version() { + if ! command -v rustc >/dev/null 2>&1; then + echo "ERROR: rustc command not found" + return 1 + fi + + local RUST_VERSION=$(rustc --version | cut -d' ' -f2) + local RUST_MAJOR=$(echo $RUST_VERSION | cut -d. -f1) + local RUST_MINOR=$(echo $RUST_VERSION | cut -d. -f2) + + if [ "$RUST_MAJOR" -lt 1 ] || ([ "$RUST_MAJOR" -eq 1 ] && [ "$RUST_MINOR" -lt 81 ]); then + echo "ERROR: Rust version must be greater than or equal to 1.81" + echo "Current Rust version: $RUST_VERSION" + return 1 + fi + echo "Check pass: current Rust version is $RUST_VERSION" + return 0 +} + +update_and_checkout_submodule() { + DYNLOG_COMMIT_ID="a9b6aeddcd6363252f5388cb0dd942981a09a24b" + + git submodule update --init --recursive + if [ $? 
-ne 0 ]; then + echo "ERROR: update git submodule failed" + return 1 + fi + + cd ./third_party/dynolog + git checkout ${DYNLOG_COMMIT_ID} + if [ $? -ne 0 ]; then + echo "ERROR: switch to dynolog specified commit failed" + cd .. + return 1 + fi + echo "Check pass: switch to dynolog specified commit ${DYNLOG_COMMIT_ID}" + cd ../../ + return 0 +} + +PACKAGE_TYPE="" +while getopts "t:" opt; do + case $opt in + t) + PACKAGE_TYPE="$OPTARG" + if [[ "$PACKAGE_TYPE" != "deb" && "$PACKAGE_TYPE" != "rpm" ]]; then + echo "ERROR: Invalid package type. Supported types: deb, rpm" + exit 1 + fi + ;; + \?) + echo "Usage: $0 [-t package_type]" + echo "package_type: deb or rpm (optional, if not specified will only build)" + exit 1 + ;; + esac +done + +echo "------------------ Check GCC and Rust version ----------------------" +check_gcc_version +check_rust_version + +echo "------------------ Update and checkout submodule -------------------" +update_and_checkout_submodule + +echo "------------------ Generate patch for Ascend -----------------------" +bash scripts/gen_dyno_patches.sh + +echo "------------------ Apply patch for Ascend --------------------------" +bash scripts/apply_dyno_patches.sh + +echo "------------------ Build dynolog patch for Ascend-------------------" +cd third_party/dynolog +rm -rf build +if [ -z "$PACKAGE_TYPE" ]; then + bash scripts/build.sh + echo "Build dynolog success without packaging" +elif [ "$PACKAGE_TYPE" = "deb" ]; then + ARCHITECTURE=$(uname -m) + CONTROL_FILE="scripts/debian/control" + ARCH="amd64" + if [[ "$ARCHITECTURE" == "aarch64" ]]; then + sed -i 's/^Architecture: .*/Architecture: arm64/' "$CONTROL_FILE" + ARCH="arm64" + echo "dpkg Architecture set to arm64" + fi + export ARCH=$ARCH + bash scripts/debian/make_deb.sh + unset ARCH + mv dynolog_*.deb ../../ + echo "Build dynolog deb package success" +elif [ "$PACKAGE_TYPE" = "rpm" ]; then + bash scripts/rpm/make_rpm.sh + mv dynolog-*.rpm ../../ + echo "Build dynolog rpm package success" +fi diff 
--git a/msmonitor/scripts/gen_dyno_patches.sh b/msmonitor/scripts/gen_dyno_patches.sh new file mode 100644 index 0000000000000000000000000000000000000000..5ade74dbcfcf88dfbc072c9de790ec4f3ec451d9 --- /dev/null +++ b/msmonitor/scripts/gen_dyno_patches.sh @@ -0,0 +1,63 @@ +#!/bin/bash +set -e + +WORK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" +PATCHES_DIR="${WORK_DIR}/patches" +DYNOLOG_DIR="${WORK_DIR}/third_party/dynolog" +MODIFIED_FILES_DIR="${WORK_DIR}/dynolog_npu" + +mkdir -p "${PATCHES_DIR}" + +generate_patches() { + echo "Generating patches from modified files..." + + # Check that the modified-files directory exists + if [ ! -d "${MODIFIED_FILES_DIR}" ]; then + echo "ERROR: dynolog_npu directory not found" + return 1 + fi + + # Remove old patch files + rm -f "${PATCHES_DIR}"/*.patch + + # Iterate over the modified-files directory + find "${MODIFIED_FILES_DIR}" -type f | while read modified_file; do + # Get the relative path + rel_path=$(realpath --relative-to="${MODIFIED_FILES_DIR}" "${modified_file}") + original_file="${DYNOLOG_DIR}/${rel_path}" + + echo "original_file: ${original_file}" + # Check whether the original file exists + if [ ! -f "${original_file}" ]; then + echo "WARN: Original file not found: ${original_file}" + + cp "${modified_file}" "${original_file}" + echo "Copied ${modified_file} to ${original_file}" + continue + fi + + # Build the patch file name (replace slashes in the path with underscores) + patch_name=$(echo "${rel_path}" | sed 's/\//_/g') + patch_file="${PATCHES_DIR}/${patch_name}.patch" + + echo "Generating patch for: ${rel_path}" + + ( + cd "${WORK_DIR}" + diff -u "third_party/dynolog/${rel_path}" "dynolog_npu/${rel_path}" > "${patch_file}" || true + ) + + # Check the patch file size + if [ ! 
-s "${patch_file}" ]; then + rm "${patch_file}" + echo "No differences found for: ${rel_path}" + else + echo "Successfully generated patch: ${patch_file}" + fi + done + + echo "Patch generation completed" + return 0 +} + +generate_patches \ No newline at end of file diff --git a/profiler/msprof_analyze/prof_exports/slow_link_export.py b/msmonitor/test/test_dynolog_build.py similarity index 31% rename from profiler/msprof_analyze/prof_exports/slow_link_export.py rename to msmonitor/test/test_dynolog_build.py index c584ceb2b2afbbe89c180b5887a6b99e961d96e6..6e38899defa8c38a277f259cc4ce6b7522db2b1c 100644 --- a/profiler/msprof_analyze/prof_exports/slow_link_export.py +++ b/msmonitor/test/test_dynolog_build.py @@ -12,43 +12,46 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +import os +import glob +import subprocess +import unittest -from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport - -QUERY = """ - SELECT - si.value AS groupName, - co.endNs - co.startNs AS communicationTime, - sii.value AS opName, - op.value AS opType, - et.name AS dataType, - CASE - WHEN et.name = 'INT8' THEN 1 * co.count - WHEN et.name = 'INT16' THEN 2 * co.count - WHEN et.name = 'INT32' THEN 4 * co.count - WHEN et.name = 'INT64' THEN 8 * co.count - WHEN et.name = 'UINT64' THEN 8 * co.count - WHEN et.name = 'UINT8' THEN 1 * co.count - WHEN et.name = 'UINT16' THEN 2 * co.count - WHEN et.name = 'UINT32' THEN 4 * co.count - WHEN et.name = 'FP16' THEN 2 * co.count - WHEN et.name = 'FP32' THEN 4 * co.count - WHEN et.name = 'FP64' THEN 8 * co.count - WHEN et.name = 'BFP16' THEN 2 * co.count - WHEN et.name = 'INT128' THEN 16 * co.count - END AS dataSize - FROM - COMMUNICATION_OP co - CROSS - JOIN STRING_IDS si ON co.groupName = si.id - JOIN STRING_IDS sii ON co.opName = sii.id - JOIN ENUM_HCCL_DATA_TYPE et ON co.dataType = et.id - JOIN STRING_IDS op ON 
co.opType = op.id -""" - - -class SlowLinkExport(BaseStatsExport): - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + +def excute_cmd(cmd, timeout=30 * 60): + return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout) + + +class TestBuildDynolog(unittest.TestCase): + def test_build_dynolog_bin_and_plugin_whl_should_success(self): + result = excute_cmd(["bash", "scripts/build.sh"]) + self.assertEqual( + result.returncode, + 0, + f"Build dynolog failed stdout: {result.stdout}, stderr: {result.stderr}", + ) + + dyno_path = "third_party/dynolog/build/bin/dyno" + dynolog_path = "third_party/dynolog/build/bin/dynolog" + + self.assertTrue(os.path.exists(dyno_path), f"{dyno_path} does not exist") + self.assertTrue(os.path.exists(dynolog_path), f"{dynolog_path} does not exist") + + ori_dir = os.getcwd() + os.chdir("plugin") + result = excute_cmd(["python3", "setup.py", "bdist_wheel"]) + self.assertEqual( + result.returncode, + 0, + f"Build msMonitor plugin whl failed stdout: {result.stdout}, stderr: {result.stderr}", + ) + + plugin_whl_path = glob.glob("dist/msmonitor_plugin-*.whl")[0] + self.assertTrue( + os.path.exists(plugin_whl_path), f"{plugin_whl_path} does not exist" + ) + os.chdir(ori_dir) + + +if __name__ == "__main__": + unittest.main() diff --git a/plugins/mindstudio-vscode-plugins/OWNERS b/plugins/mindstudio-vscode-plugins/OWNERS new file mode 100644 index 0000000000000000000000000000000000000000..2c4ada94aa198321313f24bc0b0f289eba360c33 --- /dev/null +++ b/plugins/mindstudio-vscode-plugins/OWNERS @@ -0,0 +1,9 @@ +options: + no_parent_owners: true +approvers: +- lee314 +- linxi9527 +reviewers: +- jzc_23 +- duanhaomiao +- yangqingliang4 \ No newline at end of file diff --git a/plugins/tensorboard-plugins/.github/workflows/libkineto_ci.yml b/plugins/tensorboard-plugins/.github/workflows/libkineto_ci.yml deleted file mode 100644 index 
3133d6400fb0b3ca0ee9b38c311c2db6d1167c7e..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/.github/workflows/libkineto_ci.yml +++ /dev/null @@ -1,56 +0,0 @@ -name: LIBKINETOCI - -on: - push: - branches: - - main - pull_request: - branches: - - main - -jobs: - build: - runs-on: ${{ matrix.os }} - strategy: - matrix: - os: [ubuntu-latest] - - steps: - - uses: actions/checkout@v2 - - name: Checkout submodules - shell: bash - run: | - auth_header="$(git config --local --get http.https://github.com/.extraheader)" - git submodule sync --recursive - git -c "http.extraheader=$auth_header" -c protocol.version=2 submodule update --init --force --recursive --depth=1 - - - name: Get env vars - run: | - echo GITHUB_WORKFLOW = $GITHUB_WORKFLOW - echo HOME = $HOME - echo GITHUB_ACTION = $GITHUB_ACTION - echo GITHUB_ACTIONS = $GITHUB_ACTIONS - echo GITHUB_REPOSITORY = $GITHUB_REPOSITORY - echo GITHUB_EVENT_NAME = $GITHUB_EVENT_NAME - echo GITHUB_EVENT_PATH = $GITHUB_EVENT_PATH - echo GITHUB_WORKSPACE = $GITHUB_WORKSPACE - echo GITHUB_SHA = $GITHUB_SHA - echo GITHUB_REF = $GITHUB_REF - c++ --verbose - - # TODO: Figure out how to install cupti headers T84637671 - - name: Build static lib - run: | - set -e - mkdir build_static - cd build_static - cmake -DKINETO_LIBRARY_TYPE=static ../libkineto/ - make -j - - - name: Build shared lib - run: | - set -e - mkdir build_shared - cd build_shared - cmake -DKINETO_LIBRARY_TYPE=shared ../libkineto/ - make -j diff --git a/plugins/tensorboard-plugins/.github/workflows/tb_plugin_build_pip_package.yml b/plugins/tensorboard-plugins/.github/workflows/tb_plugin_build_pip_package.yml deleted file mode 100644 index 9bdafcc442635eaff19fc7a7505f5231cf6e5cf7..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/.github/workflows/tb_plugin_build_pip_package.yml +++ /dev/null @@ -1,19 +0,0 @@ -name: Build torch-tb-profiler Pip Package - -on: - # TODO: Add an on_release trigger to build on tags - 
workflow_dispatch: - -jobs: - build-package: - runs-on: ubuntu-latest - steps: - - uses: actions/checkout@v2 - - name: build pip package - run: | - set -e - cd tb_plugin - python setup.py sdist bdist_wheel - cd dist/ - pip install *.whl - python -c "import torch_tb_profiler;print(torch_tb_profiler.__version__)" diff --git a/plugins/tensorboard-plugins/.github/workflows/tb_plugin_ci.yml b/plugins/tensorboard-plugins/.github/workflows/tb_plugin_ci.yml deleted file mode 100644 index 1b59a7bf90a6009caa41d4ac0e3d5545dc8b6c7c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/.github/workflows/tb_plugin_ci.yml +++ /dev/null @@ -1,57 +0,0 @@ -name: TB_Plugin_CI - -on: - push: - branches: - - main - - release/** - - plugin/** - - pull_request: - branches: - - main - - release/** - - plugin/** - -jobs: - generate-matrix: - runs-on: ubuntu-latest - outputs: - matrix: ${{ steps.set-matrix.outputs.matrix }} - steps: - - id: set-matrix - run: | - echo $GITHUB_BASE_REF - if [ $GITHUB_BASE_REF == "plugin/vnext" ] - then - echo "::set-output name=matrix::{\"python-version\":[3.7, 3.8, 3.9], \"cuda-version\":[\"cpu\"], \"pytorch-version\":[\"nightly\"]}" - else - echo "::set-output name=matrix::{\"python-version\":[3.7, 3.8, 3.9], \"cuda-version\":[\"cpu\"], \"pytorch-version\":[\"nightly\", \"1.11rc\", \"stable\"]}" - fi - - build: - needs: generate-matrix - runs-on: ubuntu-latest - strategy: - matrix: ${{fromJSON(needs.generate-matrix.outputs.matrix)}} - steps: - - uses: actions/checkout@v2 - - name: Set up Python ${{ matrix.python-version }} - uses: actions/setup-python@v2 - with: - python-version: ${{ matrix.python-version }} - architecture: 'x64' - - name: Test - env: - CUDA_VERSION: ${{ matrix.cuda-version }} - PYTORCH_VERSION: ${{ matrix.pytorch-version }} - TORCH_PROFILER_LOG_LEVEL: DEBUG - GRPC_VERBOSITY: DEBUG - GRPC_ENABLE_FORK_SUPPORT: 'False' - run: | - set -e - cd tb_plugin - sh ./ci_scripts/install_env.sh - pip install .[gs] - cd test - 
pytest diff --git a/plugins/tensorboard-plugins/.gitignore b/plugins/tensorboard-plugins/.gitignore deleted file mode 100644 index ce186381c0b566e0ca225be70cbf8ac233d7aa6b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/.gitignore +++ /dev/null @@ -1,3 +0,0 @@ -# ignore common items -.idea -.vscode diff --git a/plugins/tensorboard-plugins/.gitmodules b/plugins/tensorboard-plugins/.gitmodules deleted file mode 100644 index 4660ee8bc9e6a4be4f4fbb007b8e66058122d716..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/.gitmodules +++ /dev/null @@ -1,6 +0,0 @@ -[submodule "libkineto/third_party/googletest"] - path = libkineto/third_party/googletest - url = https://github.com/google/googletest.git -[submodule "libkineto/third_party/fmt"] - path = libkineto/third_party/fmt - url = https://github.com/fmtlib/fmt.git diff --git a/plugins/tensorboard-plugins/CODE_OF_CONDUCT.md b/plugins/tensorboard-plugins/CODE_OF_CONDUCT.md deleted file mode 100644 index a0cbeaab7650bf08267fbdbc9bb54e845c88f392..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/CODE_OF_CONDUCT.md +++ /dev/null @@ -1,77 +0,0 @@ -# Code of Conduct - -## Our Pledge - -In the interest of fostering an open and welcoming environment, we as -contributors and maintainers pledge to make participation in our project and -our community a harassment-free experience for everyone, regardless of age, body -size, disability, ethnicity, sex characteristics, gender identity and expression, -level of experience, education, socio-economic status, nationality, personal -appearance, race, religion, or sexual identity and orientation. 
- -## Our Standards - -Examples of behavior that contributes to creating a positive environment -include: - -* Using welcoming and inclusive language -* Being respectful of differing viewpoints and experiences -* Gracefully accepting constructive criticism -* Focusing on what is best for the community -* Showing empathy towards other community members - -Examples of unacceptable behavior by participants include: - -* The use of sexualized language or imagery and unwelcome sexual attention or - advances -* Trolling, insulting/derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others' private information, such as a physical or electronic - address, without explicit permission -* Other conduct which could reasonably be considered inappropriate in a - professional setting - -## Our Responsibilities - -Project maintainers are responsible for clarifying the standards of acceptable -behavior and are expected to take appropriate and fair corrective action in -response to any instances of unacceptable behavior. - -Project maintainers have the right and responsibility to remove, edit, or -reject comments, commits, code, wiki edits, issues, and other contributions -that are not aligned to this Code of Conduct, or to ban temporarily or -permanently any contributor for other behaviors that they deem inappropriate, -threatening, offensive, or harmful. - -## Scope - -This Code of Conduct applies within all project spaces, and it also applies when -an individual is representing the project or its community in public spaces. -Examples of representing a project or community include using an official -project e-mail address, posting via an official social media account, or acting -as an appointed representative at an online or offline event. Representation of -a project may be further defined and clarified by project maintainers. 
- -## Enforcement - -Instances of abusive, harassing, or otherwise unacceptable behavior may be -reported by contacting the project team at . All -complaints will be reviewed and investigated and will result in a response that -is deemed necessary and appropriate to the circumstances. The project team is -obligated to maintain confidentiality with regard to the reporter of an incident. -Further details of specific enforcement policies may be posted separately. - -Project maintainers who do not follow or enforce the Code of Conduct in good -faith may face temporary or permanent repercussions as determined by other -members of the project's leadership. - -## Attribution - -This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, -available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html - -[homepage]: https://www.contributor-covenant.org - -For answers to common questions about this code of conduct, see -https://www.contributor-covenant.org/faq - diff --git a/plugins/tensorboard-plugins/CONTRIBUTING.md b/plugins/tensorboard-plugins/CONTRIBUTING.md deleted file mode 100644 index a2e931bb6f0cc82ff030cee10ee1c99fbbbda07b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/CONTRIBUTING.md +++ /dev/null @@ -1,34 +0,0 @@ -# Contributing to Kineto -We want to make contributing to this project as easy and transparent as -possible. - -## Code of Conduct -The code of conduct is described in [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md). - -## Pull Requests -We actively welcome your pull requests. - -1. Fork the repo and create your branch from `main`. -2. If you've added code that should be tested, add tests. -3. If you've changed APIs, update the documentation. -4. Ensure the test suite passes. -5. Make sure your code lints. -6. If you haven't already, complete the Contributor License Agreement ("CLA"). 
- -## Contributor License Agreement ("CLA") -In order to accept your pull request, we need you to submit a CLA. You only need -to do this once to work on any of Facebook's open source projects. - -Complete your CLA here: - -## Issues -We use GitHub issues to track public bugs. Please ensure your description is -clear and has sufficient instructions to be able to reproduce the issue. - -Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe -disclosure of security bugs. In those cases, please go through the process -outlined on that page and do not file a public issue. - -## License -By contributing to Kineto, you agree that your contributions will be licensed -under the LICENSE file in the root directory of this source tree. diff --git a/plugins/tensorboard-plugins/LICENSE b/plugins/tensorboard-plugins/LICENSE deleted file mode 100644 index edb179715b5213644cfe903d43294f54892e707e..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/LICENSE +++ /dev/null @@ -1,33 +0,0 @@ -BSD License - -For Kineto software - -Copyright (c) Facebook, Inc. and its affiliates. All rights reserved. - -All contributions by Microsoft: -Copyright (c) Microsoft Corporation. (The Azure AI Platform team) - -Redistribution and use in source and binary forms, with or without modification, -are permitted provided that the following conditions are met: - - * Redistributions of source code must retain the above copyright notice, this - list of conditions and the following disclaimer. - - * Redistributions in binary form must reproduce the above copyright notice, - this list of conditions and the following disclaimer in the documentation - and/or other materials provided with the distribution. - - * Neither the name Facebook nor the names of its contributors may be used to - endorse or promote products derived from this software without specific - prior written permission. 
- -THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND -ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR -ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES -(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; -LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON -ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. diff --git a/plugins/tensorboard-plugins/ OWNERS b/plugins/tensorboard-plugins/OWNERS similarity index 61% rename from plugins/tensorboard-plugins/ OWNERS rename to plugins/tensorboard-plugins/OWNERS index 34c383beaf138da92df0991b472135496450a827..c2bfbd7ff938f9f2cb7abc5b2e27b6b7e7893786 100644 --- a/plugins/tensorboard-plugins/ OWNERS +++ b/plugins/tensorboard-plugins/OWNERS @@ -1,9 +1,12 @@ -options: - no_parent_owners: true -approvers: -- wo-wenjie -- ly-qianxiao -reviewers: -- wo-wenjie -- ly-qianxiao -- leo920320 +options: + no_parent_owners: true +approvers: +- wo-wenjie +- ly-qianxiao +- leo920320 +- ninghuang +reviewers: +- leo920320 +- ninghuang +- xiao_yao2459 +- a-qiny \ No newline at end of file diff --git a/plugins/tensorboard-plugins/README.md b/plugins/tensorboard-plugins/README.md deleted file mode 100644 index 3a18f4c6239f353c10362c9e0ba5aae052cb2c07..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/README.md +++ /dev/null @@ -1,38 +0,0 @@ -# Kineto - -Kineto is part of the PyTorch Profiler. 
- -The Kineto project was started to help enable -- **performance observability and diagnostics** across common ML bottleneck components -- **actionable recommendations** for common issues -- integration of external system-level profiling tools -- integration with popular visualization platforms and analysis pipelines - -A central component is libkineto, a profiling library with special focus on low-overhead GPU timeline tracing. - -The PyTorch Profiler TensorBoard plugin provides powerful and intuitive visualizations of profiling results, as well as actionable recommendations, and is the best way to experience the new PyTorch Profiler. - -## Libkineto -Libkineto is an in-process profiling library integrated with the PyTorch Profiler. Please refer to the [README](libkineto/README.md) file in the `libkineto` folder as well as documentation on the [new PyTorch Profiler API](https://pytorch.org/docs/master/profiler.html). - -## PyTorch TensorBoard Profiler NPU Plugin -The goal of the PyTorch TensorBoard Profiler is to provide a seamless and intuitive end-to-end profiling experience, including straightforward collection from PyTorch and insightful visualizations and recommendations in the TensorBoard UI. -Please refer to the [README](tb_plugin/README.md) file in the `tb_plugin` folder. - -## Future Development Direction: -Some areas we're currently working on: -- Support for tracing distributed workloads -- Trace processing, analysis and recommendation engine -- System-level activities, multiple tracing sources -- Profiling and monitoring daemon for larger scale deployments - -## Releases and Contributing -We will follow the PyTorch release schedule which roughly happens on a 3 month basis. - -We appreciate all contributions. If you are planning to contribute back bug-fixes, please do so without any further discussion. - -If you plan to contribute new features, please first open an issue and discuss the feature with us. 
Sending a PR without discussion might end up resulting in a rejected PR because we might be taking the infrastructure in a different direction than you might be aware of. We expect the architecture to keep evolving. - -## License -Kineto has a BSD-style license, as found in the [LICENSE](LICENSE) file. - diff --git a/plugins/tensorboard-plugins/libkineto/CMakeLists.txt b/plugins/tensorboard-plugins/libkineto/CMakeLists.txt deleted file mode 100644 index 63966de803a786913b104419776aa94bb00b74b0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/CMakeLists.txt +++ /dev/null @@ -1,198 +0,0 @@ -cmake_minimum_required(VERSION 3.5 FATAL_ERROR) - -list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_SOURCE_DIR}/cmake/modules") - -#install libraries into correct locations on all platforms -include(GNUInstallDirs) - -# function to extract filelists from libkineto_defs.bzl file -find_package(PythonInterp) -function(get_filelist name outputvar) - execute_process( - COMMAND "${PYTHON_EXECUTABLE}" -c - "exec(open('libkineto_defs.bzl').read());print(';'.join(${name}))" - WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" - OUTPUT_VARIABLE _tempvar) - string(REPLACE "\n" "" _tempvar "${_tempvar}") - set(${outputvar} ${_tempvar} PARENT_SCOPE) -endfunction() - -project(kineto VERSION 0.1 LANGUAGES CXX C) - -set(KINETO_LIBRARY_TYPE "default" CACHE STRING - "Type of library (default, static or shared) to build") -set_property(CACHE KINETO_LIBRARY_TYPE PROPERTY STRINGS default shared) -option(KINETO_BUILD_TESTS "Build kineto unit tests" ON) - -set(LIBKINETO_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/src") -set(LIBKINETO_INCLUDE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/include") -set(LIBKINETO_BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}) -set(LIBKINETO_THIRDPARTY_DIR "${CMAKE_CURRENT_SOURCE_DIR}/third_party") -set(CMAKE_EXPORT_COMPILE_COMMANDS ON) - -#We should default to a Release build -if (NOT CMAKE_BUILD_TYPE OR CMAKE_BUILD_TYPE STREQUAL "") - set(CMAKE_BUILD_TYPE 
"Release" CACHE STRING "" FORCE) -endif() - -if (NOT CUDA_SOURCE_DIR) - set(CUDA_SOURCE_DIR "$ENV{CUDA_SOURCE_DIR}") - message(INFO " CUDA_SOURCE_DIR = ${CUDA_SOURCE_DIR}") -endif() - -if (NOT ROCM_SOURCE_DIR) - set(ROCM_SOURCE_DIR "$ENV{ROCM_SOURCE_DIR}") - message(INFO " ROCM_SOURCE_DIR = ${ROCM_SOURCE_DIR}") -endif() - -# Set LIBKINETO_NOCUPTI to explicitly disable CUPTI -# Otherwise, CUPTI is disabled if not found -IF (NOT CUDA_SOURCE_DIR OR NOT CUPTI_INCLUDE_DIR OR NOT CUDA_cupti_LIBRARY) - set(LIBKINETO_NOCUPTI ON CACHE BOOL "" FORCE) -endif() - -IF (NOT ROCM_SOURCE_DIR AND NOT ROCTRACER_INCLUDE_DIR) - set(LIBKINETO_NOROCTRACER ON CACHE BOOL "" FORCE) -endif() - -# Define file lists -if (LIBKINETO_NOCUPTI AND LIBKINETO_NOROCTRACER) - get_filelist("get_libkineto_cpu_only_srcs(with_api=False)" LIBKINETO_SRCS) - message(INFO " CUPTI unavailable or disabled - not building GPU profilers") -elseif(NOT LIBKINETO_NOROCTRACER) - get_filelist("get_libkineto_roctracer_srcs()" LIBKINETO_SRCS) - message(INFO " Building with roctracer") -else() - get_filelist("get_libkineto_cupti_srcs(with_api=False)" LIBKINETO_SRCS) -endif() -get_filelist("get_libkineto_public_headers()" LIBKINETO_PUBLIC_HEADERS) -get_filelist("get_libkineto_api_srcs()" LIBKINETO_API_SRCS) - -add_library(kineto_base OBJECT ${LIBKINETO_SRCS}) -add_library(kineto_api OBJECT ${LIBKINETO_API_SRCS}) - -# Make libraries depend on libkineto_defs.bzl -add_custom_target(libkineto_defs.bzl DEPENDS libkineto_defs.bzl) -add_dependencies(kineto_base libkineto_defs.bzl) - -set_target_properties(kineto_base kineto_api PROPERTIES - CXX_STANDARD 14 - CXX_STANDARD_REQUIRED YES - CXX_EXTENSIONS NO - CXX_VISIBILITY_PRESET hidden) - -set(KINETO_COMPILE_OPTIONS "-DKINETO_NAMESPACE=libkineto") -list(APPEND KINETO_COMPILE_OPTIONS "-DFMT_HEADER_ONLY") -if(NOT MSVC) - list(APPEND KINETO_COMPILE_OPTIONS "-std=c++14") -else() - list(APPEND KINETO_COMPILE_OPTIONS "/std:c++14") - list(APPEND KINETO_COMPILE_OPTIONS 
"-DWIN32_LEAN_AND_MEAN") - list(APPEND KINETO_COMPILE_OPTIONS "-DNOGDI") -endif() -if (NOT LIBKINETO_NOCUPTI) - list(APPEND KINETO_COMPILE_OPTIONS "-DHAS_CUPTI") -endif() -if (NOT LIBKINETO_NOROCTRACER) - target_compile_options(kineto_base PRIVATE "-DHAS_ROCTRACER") - target_compile_options(kineto_base PRIVATE "-D__HIP_PLATFORM_HCC__") - target_compile_options(kineto_base PRIVATE "-D__HIP_PLATFORM_AMD__") -endif() - -target_compile_options(kineto_base PRIVATE "${KINETO_COMPILE_OPTIONS}") -target_compile_options(kineto_api PRIVATE "${KINETO_COMPILE_OPTIONS}") - -if(NOT TARGET fmt) - if(NOT FMT_SOURCE_DIR) - set(FMT_SOURCE_DIR "${LIBKINETO_THIRDPARTY_DIR}/fmt" - CACHE STRING "fmt source directory from submodules") - endif() - - # Build FMT. - # FMT and some other libraries use BUILD_SHARED_LIBS to control - # the library type. - # Save and restore the value after configuring FMT - set(TEMP_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}) - set(BUILD_SHARED_LIBS OFF CACHE BOOL "Build shared libs" FORCE) - set(FMT_LIBRARY_TYPE static CACHE STRING "Set lib type to static") - add_subdirectory("${FMT_SOURCE_DIR}" "${LIBKINETO_BINARY_DIR}/fmt") - set_property(TARGET fmt PROPERTY POSITION_INDEPENDENT_CODE ON) - set(BUILD_SHARED_LIBS ${TEMP_BUILD_SHARED_LIBS} CACHE BOOL "Build shared libs" FORCE) -endif() - -set(FMT_INCLUDE_DIR "${FMT_SOURCE_DIR}/include") -message(STATUS "Kineto: FMT_SOURCE_DIR = ${FMT_SOURCE_DIR}") -message(STATUS "Kineto: FMT_INCLUDE_DIR = ${FMT_INCLUDE_DIR}") -if (NOT CUPTI_INCLUDE_DIR) - set(CUPTI_INCLUDE_DIR "${CUDA_SOURCE_DIR}/extras/CUPTI/include") -endif() -if (NOT CUDA_INCLUDE_DIRS) - set(CUDA_INCLUDE_DIRS "${CUDA_SOURCE_DIR}/include") -endif() -if (NOT ROCTRACER_INCLUDE_DIR) - set(ROCTRACER_INCLUDE_DIR "${ROCM_SOURCE_DIR}/roctracer/include") -endif() -if (NOT ROCM_INCLUDE_DIRS) - set(ROCM_INCLUDE_DIRS "${ROCM_SOURCE_DIR}/include") -endif() - -message(INFO " CUPTI_INCLUDE_DIR = ${CUPTI_INCLUDE_DIR}") -message(INFO " ROCTRACER_INCLUDE_DIR = 
${ROCTRACER_INCLUDE_DIR}") - -target_include_directories(kineto_base PUBLIC - $ - $ - $ - $ - $ - $ - $) - -target_include_directories(kineto_api PUBLIC - $ - $) - -if(KINETO_LIBRARY_TYPE STREQUAL "default") - add_library(kineto - $ - $) -elseif(KINETO_LIBRARY_TYPE STREQUAL "static") - add_library(kineto STATIC - $ - $) -elseif(KINETO_LIBRARY_TYPE STREQUAL "shared") - add_library(kineto SHARED - $) - set_property(TARGET kineto_base PROPERTY POSITION_INDEPENDENT_CODE ON) - set_target_properties(kineto PROPERTIES - CXX_VISIBILITY_PRESET hidden) -else() - message(FATAL_ERROR "Unsupported library type ${KINETO_LIBRARY_TYPE}") -endif() - -if(NOT LIBKINETO_NOROCTRACER) - find_library(ROCTRACER_LIBRARY NAMES libroctracer64.so HINTS /opt/rocm/roctracer/lib) - target_link_libraries(kineto "${ROCTRACER_LIBRARY}") - find_library(KINETO_HIP_LIBRARY NAMES libamdhip64.so HINTS /opt/rocm/lib) - target_link_libraries(kineto "${KINETO_HIP_LIBRARY}") -endif() - -if(NOT LIBKINETO_NOCUPTI) - target_link_libraries(kineto "${CUDA_cupti_LIBRARY}") -endif() -target_link_libraries(kineto $) -add_dependencies(kineto fmt::fmt-header-only) - -install(TARGETS kineto EXPORT kinetoLibraryConfig - ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR} - LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}) - -install(FILES ${LIBKINETO_PUBLIC_HEADERS} - DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/kineto") - -install(EXPORT kinetoLibraryConfig DESTINATION share/cmake/kineto - FILE kinetoLibraryConfig.cmake) - -if(KINETO_BUILD_TESTS) - add_subdirectory(test) -endif() diff --git a/plugins/tensorboard-plugins/libkineto/README.md b/plugins/tensorboard-plugins/libkineto/README.md deleted file mode 100644 index 37127ca5aa821217da48aad38cb82eb36f8735c2..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/README.md +++ /dev/null @@ -1,65 +0,0 @@ -# Libkineto - -Libkineto is an in-process profiling library, part of the Kineto performance -tools project. 
- -The library provides a way to collect GPU traces and metrics from the host -process, either via the library public API or by sending a signal, if enabled. - -Currently only NVIDIA GPUs are supported. - -## Build Notes -Libkineto uses the standard CMAKE-based build flow. - -### Dependencies -Libkineto requires gcc 5+ and: - -- NVIDIA CUPTI: used to collect traces and metrics from NVIDIA GPUs. -- fmt: used for its convenient and lightweight string formatting functionality. -- googletest: required to build and run Kineto's tests. - - **googletest is not required** if you don't want to run Kineto tests. -By default, building of tests is **on**. Turn it off by setting `KINETO_BUILD_TESTS` to **off**. - -You can download [NVIDIA CUPTI][1], [fmt][2], [googletest][3] and set -`CUDA_SOURCE_DIR`, `FMT_SOURCE_DIR`, `GOOGLETEST_SOURCE_DIR` respectively for -cmake to find these libraries. If the fmt and googletest variables are not set, cmake will -build the git submodules found in the `third_party` directory. -If `CUDA_SOURCE_DIR` is not set, libkineto will fail to build. - -### Building Libkineto - -``` -# Check out repo and sub modules -git clone --recursive https://github.com/pytorch/kineto.git -# Build libkineto with cmake -cd kineto/libkineto -mkdir build && cd build -cmake .. -make -``` - -To run the tests after building libkineto (if tests are built), use the following -command: -``` -make test -``` - -### Installing Libkineto -``` -make install -``` - -## How Libkineto works -We will provide a high-level overview, design philosophy and brief descriptions of various -parts of Libkineto in upcoming blogs. - -## Full documentation -We strive to keep our source files readable. The best and up-to-date -documentation is available in the source files. - -## License -Libkineto is BSD licensed, as detailed in the [LICENSE](../LICENSE) file. 
- -[1]:https://developer.nvidia.com/CUPTI-CTK10_2 -[2]:https://github.com/fmt -[3]:https://github.com/google/googletest diff --git a/plugins/tensorboard-plugins/libkineto/include/AbstractConfig.h b/plugins/tensorboard-plugins/libkineto/include/AbstractConfig.h deleted file mode 100644 index 1cadf4906c11c3b5f59e290295048cee7fd63acf..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/AbstractConfig.h +++ /dev/null @@ -1,113 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -class AbstractConfig { - public: - AbstractConfig& operator=(const AbstractConfig&) = delete; - AbstractConfig(AbstractConfig&&) = delete; - AbstractConfig& operator=(AbstractConfig&&) = delete; - - virtual ~AbstractConfig() { - for (const auto& p : featureConfigs_) { - delete p.second; - } - } - - // Return a copy of the full derived class - virtual AbstractConfig* cloneDerived(AbstractConfig& parent) const = 0; - - // Returns true if successfully parsed the config string - bool parse(const std::string& conf); - - // Default setup for signal-triggered profiling - virtual void setSignalDefaults() { - for (auto& p : featureConfigs_) { - p.second->setSignalDefaults(); - } - } - - // Default setup for client-triggered profiling - virtual void setClientDefaults() { - for (auto& p : featureConfigs_) { - p.second->setClientDefaults(); - } - } - - // Time config was created / updated - std::chrono::time_point timestamp() const { - return timestamp_; - } - - // Source config string that this was parsed from - const std::string& source() const { - return source_; - } - - AbstractConfig& feature(std::string name) const { - const auto& pos = featureConfigs_.find(name); - return *pos->second; - } - - // Transfers ownership of cfg arg - void addFeature(const std::string& name, AbstractConfig* cfg) { - featureConfigs_[name] = cfg; - } - 
- protected: - AbstractConfig() {} - AbstractConfig(const AbstractConfig& other) = default; - - // Return true if the option was recognized and successfully parsed. - // Throw std::invalid_argument if val is invalid. - virtual bool handleOption(const std::string& name, std::string& val); - - // Perform post-validation checks, typically conditons involving - // multiple options. - // Throw std::invalid_argument if automatic correction can not be made. - // - // @param fallbackProfileStartTime Specify a fallback profile start timestamp in case it was never specified by the client - virtual void validate(const std::chrono::time_point& fallbackProfileStartTime) = 0; - - // TODO: Separate out each profiler type into features? - virtual void printActivityProfilerConfig(std::ostream& s) const; - - // Helpers for use in handleOption - // Split a string by delimiter and remove external white space - std::vector splitAndTrim(const std::string& s, char delim) const; - // Lowercase for case-insensitive comparisons - std::string toLower(std::string& s) const; - // Does string end with suffix - bool endsWith(const std::string& s, const std::string& suffix) const; - // Conversions - int64_t toIntRange(const std::string& val, int64_t min, int64_t max) const; - int32_t toInt32(const std::string& val) const; - int64_t toInt64(const std::string& val) const; - bool toBool(std::string& val) const; - - void cloneFeaturesInto(AbstractConfig& cfg) const { - for (const auto& feature : featureConfigs_) { - cfg.featureConfigs_[feature.first] = feature.second->cloneDerived(cfg); - } - } - - private: - // Time config was created / updated - std::chrono::time_point timestamp_{}; - - // Original configuration string, used for comparison - std::string source_{""}; - - // Configuration objects for optional features - std::map featureConfigs_{}; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/include/ActivityProfilerInterface.h 
b/plugins/tensorboard-plugins/libkineto/include/ActivityProfilerInterface.h deleted file mode 100644 index 29871e47ab8af87888ccb8e20403bc26c433b5cc..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ActivityProfilerInterface.h +++ /dev/null @@ -1,91 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include - -#include "ActivityType.h" -#include "ActivityTraceInterface.h" -#include "IActivityProfiler.h" - -namespace libkineto { - -class ActivityProfilerController; -struct CpuTraceBuffer; -class Config; - -class ActivityProfilerInterface { - - public: - virtual ~ActivityProfilerInterface() {}; - - virtual void init() {} - virtual bool isInitialized() { - return false; - } - virtual bool isActive(){ - return false; - } - - // *** Asynchronous API *** - // Instead of starting and stopping the trace manually, provide a start time - // and duration and / or iteration stop criterion. - // Tracing terminates when either condition is met. - virtual void scheduleTrace(const std::string& configStr) {} - - // *** Synchronous API *** - // These must be called in order: - // prepareTrace -> startTrace -> stopTrace. - - // Many tracing structures are lazily initialized during trace collection, - // with potentially high overhead. - // Call prepareTrace to enable tracing, then run the region to trace - // at least once (and ideally run the same code that is to be traced) to - // allow tracing structures to be initialized. - virtual void prepareTrace( - const std::set& activityTypes, - const std::string& configStr = "") {} - - // Start recording, potentially reusing any buffers allocated since - // prepareTrace was called. - virtual void startTrace() {} - - // Stop and process trace, producing an in-memory list of trace records. - // The processing will be done synchronously (using the calling thread.) 
- virtual std::unique_ptr stopTrace() { - return nullptr; - } - - // Re-evaluate internal state to allow for triggering operations based - // on number of iteration. each implicitly increments the iteration count - virtual void step() {} - - // *** TraceActivity API *** - // FIXME: Pass activityProfiler interface into clientInterface? - virtual void pushCorrelationId(uint64_t id){} - virtual void popCorrelationId(){} - virtual void transferCpuTrace( - std::unique_ptr traceBuffer){} - - // Correlation ids for user defined spans - virtual void pushUserCorrelationId(uint64_t){} - virtual void popUserCorrelationId(){} - - // Saves information for the current thread to be used in profiler output - // Client must record any new kernel thread where the activity has occured. - virtual void recordThreadInfo() {} - - // Record trace metadata, currently supporting only string key and values, - // values with the same key are overwritten - virtual void addMetadata(const std::string& key, const std::string& value) = 0; - - // Add a child activity profiler, this enables frameworks in the application - // to enable custom framework events. - virtual void addChildActivityProfiler( - std::unique_ptr profiler) {} -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/ActivityTraceInterface.h b/plugins/tensorboard-plugins/libkineto/include/ActivityTraceInterface.h deleted file mode 100644 index 23d4edab00ce2fa90427e13818ac09c8541835ac..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ActivityTraceInterface.h +++ /dev/null @@ -1,21 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include - -namespace libkineto { - -struct ITraceActivity; - -class ActivityTraceInterface { - public: - virtual ~ActivityTraceInterface() {} - virtual const std::vector* activities() { - return nullptr; - } - virtual void save(const std::string& path) {} -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/ActivityType.h b/plugins/tensorboard-plugins/libkineto/include/ActivityType.h deleted file mode 100644 index 74c6a2531d6a9cee3196f9f889517926afea823f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ActivityType.h +++ /dev/null @@ -1,34 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include - -namespace libkineto { - -enum class ActivityType { - CPU_OP = 0, // cpu side ops - USER_ANNOTATION, - GPU_USER_ANNOTATION, - GPU_MEMCPY, - GPU_MEMSET, - CONCURRENT_KERNEL, // on-device kernels - EXTERNAL_CORRELATION, - CUDA_RUNTIME, // host side cuda runtime events - CUDA_PROFILER_RANGE, // CUPTI Profiler range for performance metrics - GLOW_RUNTIME, // host side glow runtime events - CPU_INSTANT_EVENT, // host side point-like events - PYTHON_FUNCTION, - OVERHEAD, // CUPTI induced overhead events sampled from its overhead API. - ENUM_COUNT // This is to add buffer and not used for any profiling logic. Add your new type before it. 
-}; - -const char* toString(ActivityType t); -ActivityType toActivityType(const std::string& str); - -// Return an array of all activity types except COUNT -constexpr int activityTypeCount = (int)ActivityType::ENUM_COUNT; -const std::array activityTypes(); - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/ClientInterface.h b/plugins/tensorboard-plugins/libkineto/include/ClientInterface.h deleted file mode 100644 index 06dc075838164f80e9481b34a5d5d3c136b92efd..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ClientInterface.h +++ /dev/null @@ -1,16 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -namespace libkineto { - -class ClientInterface { - public: - virtual ~ClientInterface() {} - virtual void init() = 0; - virtual void warmup(bool setupOpInputsCollection) = 0; - virtual void start() = 0; - virtual void stop() = 0; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/Config.h b/plugins/tensorboard-plugins/libkineto/include/Config.h deleted file mode 100644 index 040e96c9f75ab3ab768aaebac28f959f12a3ea06..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/Config.h +++ /dev/null @@ -1,433 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include "AbstractConfig.h" -#include "ActivityType.h" - -#include -#include -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -class Config : public AbstractConfig { - public: - Config(); - Config& operator=(const Config&) = delete; - Config(Config&&) = delete; - Config& operator=(Config&&) = delete; - - // Return a full copy including feature config object - std::unique_ptr clone() const { - auto cfg = std::unique_ptr(new Config(*this)); - cloneFeaturesInto(*cfg); - return cfg; - } - - bool handleOption(const std::string& name, std::string& val) override; - - void setClientDefaults() override; - - // Log events to this file - const std::string& eventLogFile() const { - return eventLogFile_; - } - - bool activityProfilerEnabled() const { - return activityProfilerEnabled_ || - activitiesOnDemandTimestamp_.time_since_epoch().count() > 0; - } - - // Log activitiy trace to this file - const std::string& activitiesLogFile() const { - return activitiesLogFile_; - } - - // Log activitiy trace to this url - const std::string& activitiesLogUrl() const { - return activitiesLogUrl_; - } - - void setActivitiesLogUrl(const std::string& url) { - activitiesLogUrl_ = url; - } - - bool activitiesLogToMemory() const { - return activitiesLogToMemory_; - } - - // Is profiling enabled for the given device? - bool eventProfilerEnabledForDevice(uint32_t dev) const { - return 0 != (eventProfilerDeviceMask_ & (1 << dev)); - } - - // Take a sample (read hardware counters) at this frequency. - // This controls how often counters are read - if all counters cannot - // be collected simultaneously then multiple samples are needed to - // collect all requested counters - see multiplex period. 
- std::chrono::milliseconds samplePeriod() const { - return samplePeriod_; - } - - void setSamplePeriod(std::chrono::milliseconds period) { - samplePeriod_ = period; - } - - // When all requested counters cannot be collected simultaneously, - // counters will be multiplexed at this frequency. - // Multiplexing can have a large performance impact if done frequently. - // To avoid a perf impact, keep this at 1s or above. - std::chrono::milliseconds multiplexPeriod() const { - return multiplexPeriod_; - } - - void setMultiplexPeriod(std::chrono::milliseconds period) { - multiplexPeriod_ = period; - } - - // Report counters at this frequency. Note that several samples can - // be reported each time, see samplesPerReport. - std::chrono::milliseconds reportPeriod() const { - return reportPeriod_; - } - - void setReportPeriod(std::chrono::milliseconds msecs); - - // Number of samples dispatched each report period. - // Must be in the range [1, report period / sample period]. - // In other words, aggregation is supported but not interpolation. 
- int samplesPerReport() const { - return samplesPerReport_; - } - - void setSamplesPerReport(int count) { - samplesPerReport_ = count; - } - - // The names of events to collect - const std::set& eventNames() const { - return eventNames_; - } - - // Add additional events to be profiled - void addEvents(const std::set& names) { - eventNames_.insert(names.begin(), names.end()); - } - - // The names of metrics to collect - const std::set& metricNames() const { - return metricNames_; - } - - // Add additional metrics to be profiled - void addMetrics(const std::set& names) { - metricNames_.insert(names.begin(), names.end()); - } - - const std::vector& percentiles() const { - return eventReportPercentiles_; - } - - // Profile for this long, then revert to base config - std::chrono::seconds eventProfilerOnDemandDuration() const { - return eventProfilerOnDemandDuration_; - } - - void setEventProfilerOnDemandDuration(std::chrono::seconds duration) { - eventProfilerOnDemandDuration_ = duration; - } - - // Too many event profilers on a single system can overload the driver. - // At some point, latencies shoot through the roof and collection of samples - // becomes impossible. To avoid this situation we have a limit of profilers - // per GPU. - // NOTE: Communication with a daemon is needed for this feature. - // Library must be built with an active DaemonConfigLoader. - int maxEventProfilersPerGpu() const { - return eventProfilerMaxInstancesPerGpu_; - } - - // On Cuda11 we've seen occasional hangs when reprogramming counters - // Monitor profiling threads and report when a thread is not responding - // for a given number of seconds. - // A period of 0 means disable. 
- std::chrono::seconds eventProfilerHeartbeatMonitorPeriod() const { - return eventProfilerHeartbeatMonitorPeriod_; - } - - // The types of activities selected in the configuration file - const std::set& selectedActivityTypes() const { - return selectedActivityTypes_; - } - - void setSelectedActivityTypes(const std::set& types) { - selectedActivityTypes_ = types; - } - - bool isOpInputsCollectionEnabled() const { - return enableOpInputsCollection_; - } - - // Trace for this long - std::chrono::milliseconds activitiesDuration() const { - return activitiesDuration_; - } - - // Trace for this many iterations, determined by external API - int activitiesRunIterations() const { - return activitiesRunIterations_; - } - - std::chrono::milliseconds activitiesDurationDefault() const; - - void setActivitiesDuration(std::chrono::milliseconds duration) { - activitiesDuration_ = duration; - } - - int activitiesMaxGpuBufferSize() const { - return activitiesMaxGpuBufferSize_; - } - - std::chrono::seconds activitiesWarmupDuration() const { - return activitiesWarmupDuration_; - } - - int activitiesWarmupIterations() const { - return activitiesWarmupIterations_; - } - - // Timestamp at which the profiling to start, requested by the user. 
- const std::chrono::time_point requestTimestamp() - const { - if (profileStartTime_.time_since_epoch().count()) { - return profileStartTime_; - } - - // TODO(T94634890): Deperecate requestTimestamp - return requestTimestamp_ + maxRequestAge() + activitiesWarmupDuration(); - } - - bool hasProfileStartTime() const { - return requestTimestamp_.time_since_epoch().count() > 0 || - profileStartTime_.time_since_epoch().count() > 0; - } - - int profileStartIteration() const { - return profileStartIteration_; - } - - bool hasProfileStartIteration() const { - return profileStartIteration_ >= 0 && activitiesRunIterations_ > 0; - } - - void setProfileStartIteration(int iter) { - profileStartIteration_ = iter; - } - - int profileStartIterationRoundUp() const { - return profileStartIterationRoundUp_; - } - - // calculate the start iteration accounting for warmup - int startIterationIncludingWarmup() const { - if (!hasProfileStartIteration()) { - return -1; - } - return profileStartIteration_ - activitiesWarmupIterations_; - } - - const std::chrono::seconds maxRequestAge() const; - - // All VLOG* macros will log if the verbose log level is >= - // the verbosity specified for the verbose log message. - // Default value is -1, so messages with log level 0 will log by default. - int verboseLogLevel() const { - return verboseLogLevel_; - } - - // Modules for which verbose logging is enabled. - // If empty, logging is enabled for all modules. 
- const std::vector& verboseLogModules() const { - return verboseLogModules_; - } - - bool sigUsr2Enabled() const { - return enableSigUsr2_; - } - - bool ipcFabricEnabled() const { - return enableIpcFabric_; - } - - static std::chrono::milliseconds alignUp( - std::chrono::milliseconds duration, - std::chrono::milliseconds alignment) { - duration += alignment; - return duration - (duration % alignment); - } - - std::chrono::time_point - eventProfilerOnDemandStartTime() const { - return eventProfilerOnDemandTimestamp_; - } - - std::chrono::time_point - eventProfilerOnDemandEndTime() const { - return eventProfilerOnDemandTimestamp_ + eventProfilerOnDemandDuration_; - } - - std::chrono::time_point - activityProfilerRequestReceivedTime() const { - return activitiesOnDemandTimestamp_; - } - - // Users may request and set trace id and group trace id. - const std::string& requestTraceID() const { - return requestTraceID_; - } - - void setRequestTraceID(const std::string& tid) { - requestTraceID_ = tid; - } - - const std::string& requestGroupTraceID() const { - return requestGroupTraceID_; - } - - void setRequestGroupTraceID(const std::string& gtid) { - requestGroupTraceID_ = gtid; - } - - void updateActivityProfilerRequestReceivedTime(); - - void printActivityProfilerConfig(std::ostream& s) const override; - - void validate( - const std::chrono::time_point& fallbackProfileStartTime) override; - - static void addConfigFactory( - std::string name, - std::function factory); - - void print(std::ostream& s) const; - - private: - explicit Config(const Config& other) = default; - - AbstractConfig* cloneDerived(AbstractConfig& parent) const override { - // Clone from AbstractConfig not supported - assert(false); - return nullptr; - } - - uint8_t createDeviceMask(const std::string& val); - - // Adds valid activity types from the user defined string list in the - // configuration file - void setActivityTypes(const std::vector& selected_activities); - - // Sets the default activity 
types to be traced - void selectDefaultActivityTypes() { - // If the user has not specified an activity list, add all types - for (ActivityType t : activityTypes()) { - // Do no enable this by default - // TODO: introduce optional types - if (t != ActivityType::OVERHEAD) { - selectedActivityTypes_.insert(t); - } - } - } - - int verboseLogLevel_; - std::vector verboseLogModules_; - - // Event profiler - // These settings are also supported in on-demand mode - std::chrono::milliseconds samplePeriod_; - std::chrono::milliseconds reportPeriod_; - int samplesPerReport_; - std::set eventNames_; - std::set metricNames_; - - // On-demand duration - std::chrono::seconds eventProfilerOnDemandDuration_; - // Last on-demand request - std::chrono::time_point - eventProfilerOnDemandTimestamp_; - - int eventProfilerMaxInstancesPerGpu_; - - // Monitor whether event profiler threads are stuck - // at this frequency - std::chrono::seconds eventProfilerHeartbeatMonitorPeriod_; - - // These settings can not be changed on-demand - std::string eventLogFile_; - std::vector eventReportPercentiles_ = {5, 25, 50, 75, 95}; - uint8_t eventProfilerDeviceMask_ = ~0; - std::chrono::milliseconds multiplexPeriod_; - - // Activity profiler - bool activityProfilerEnabled_; - std::set selectedActivityTypes_; - - // The activity profiler settings are all on-demand - std::string activitiesLogFile_; - - std::string activitiesLogUrl_; - - // Log activities to memory buffer - bool activitiesLogToMemory_{false}; - - int activitiesMaxGpuBufferSize_; - std::chrono::seconds activitiesWarmupDuration_; - int activitiesWarmupIterations_; - - // Client Interface - // Enable inputs collection when tracing ops - bool enableOpInputsCollection_{true}; - - // Profile for specified iterations and duration - std::chrono::milliseconds activitiesDuration_; - int activitiesRunIterations_; - - // Below are not used - // Use this net name for iteration count - std::string activitiesExternalAPIIterationsTarget_; - // Only 
profile nets that includes this in the name - std::vector activitiesExternalAPIFilter_; - // Only profile nets with at least this many operators - int activitiesExternalAPINetSizeThreshold_; - // Only profile nets with at least this many GPU operators - int activitiesExternalAPIGpuOpCountThreshold_; - // Last activity profiler request - std::chrono::time_point - activitiesOnDemandTimestamp_; - - // Synchronized start timestamp - std::chrono::time_point profileStartTime_; - // or start iteration - int profileStartIteration_; - int profileStartIterationRoundUp_; - - // DEPRECATED - std::chrono::time_point requestTimestamp_; - - // Enable profiling via SIGUSR2 - bool enableSigUsr2_; - - // Enable IPC Fabric instead of thrift communication - bool enableIpcFabric_; - - // Logger Metadata - std::string requestTraceID_; - std::string requestGroupTraceID_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/include/GenericTraceActivity.h b/plugins/tensorboard-plugins/libkineto/include/GenericTraceActivity.h deleted file mode 100644 index 4272cf1efa4e7613a46c3684270b4e803853345b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/GenericTraceActivity.h +++ /dev/null @@ -1,125 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include - -#include "ThreadUtil.h" -#include "ITraceActivity.h" -#include "TraceSpan.h" - -namespace libkineto { - -// Link type, used in GenericTraceActivity.flow.type -constexpr unsigned int kLinkFwdBwd = 1; -constexpr unsigned int kLinkAsyncCpuGpu = 2; - -// @lint-ignore-every CLANGTIDY cppcoreguidelines-non-private-member-variables-in-classes -// @lint-ignore-every CLANGTIDY cppcoreguidelines-pro-type-member-init -class GenericTraceActivity : public ITraceActivity { - - public: - GenericTraceActivity() : activityType(ActivityType::ENUM_COUNT), traceSpan_(NULL) {} - - GenericTraceActivity( - const TraceSpan& trace, ActivityType type, const std::string& name) - : activityType(type), activityName(name), traceSpan_(&trace) { - } - - int64_t deviceId() const override { - return device; - } - - int64_t resourceId() const override { - return resource; - } - - int32_t getThreadId() const override { - return threadId; - } - - int64_t timestamp() const override { - return startTime; - } - - int64_t duration() const override { - return endTime - startTime; - } - - int64_t correlationId() const override { - return id; - } - - ActivityType type() const override { - return activityType; - } - - const ITraceActivity* linkedActivity() const override { - return nullptr; - } - - int flowType() const override { - return flow.type; - } - - int flowId() const override { - return flow.id; - } - - bool flowStart() const override { - return flow.start; - } - - const std::string name() const override { - return activityName; - } - - const TraceSpan* traceSpan() const override { - return traceSpan_; - } - - void log(ActivityLogger& logger) const override; - - //Encode client side metadata as a key/value - template - void addMetadata(const std::string& key, const ValType& value) { - metadata_.push_back(fmt::format("\"{}\": {}", key, value)); - } - - void addMetadataQuoted(const std::string& key, const std::string& value) { - 
metadata_.push_back(fmt::format("\"{}\": \"{}\"", key, value)); - } - - const std::string metadataJson() const override { - return fmt::format("{}", fmt::join(metadata_, ", ")); - } - - virtual ~GenericTraceActivity() {}; - - int64_t startTime{0}; - int64_t endTime{0}; - int32_t id{0}; - int32_t device{0}; - int32_t resource{0}; - int32_t threadId{0}; - ActivityType activityType; - std::string activityName; - struct Flow { - Flow(): id(0), type(0), start(0) {} - // Ids must be unique within each type - uint32_t id : 27; - // Type will be used to connect flows between profilers, as - // well as look up flow information (name etc) - uint32_t type : 4; - uint32_t start : 1; - } flow; - - private: - const TraceSpan* traceSpan_; - std::vector metadata_; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/IActivityProfiler.h b/plugins/tensorboard-plugins/libkineto/include/IActivityProfiler.h deleted file mode 100644 index f5d4b3fb828a3348d948c6487acc6a9e5a18f836..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/IActivityProfiler.h +++ /dev/null @@ -1,104 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include - -#include "Config.h" -#include "GenericTraceActivity.h" - -/* This file includes an abstract base class for an activity profiler - * that can be implemented by multiple tracing agents in the application. - * The high level Kineto profiler can co-ordinate start and end of tracing - * and combine together events from multiple such activity profilers. 
- */ - -namespace libkineto { - -using namespace KINETO_NAMESPACE; - -#ifdef _MSC_VER -// workaround for the predefined ERROR macro on Windows -#undef ERROR -#endif // _MSC_VER - -enum class TraceStatus { - READY, // Accepting trace requests - WARMUP, // Performing trace warmup - RECORDING, // Actively collecting activities - PROCESSING, // Recording is complete, preparing results - ERROR, // One or more errors (and possibly also warnings) occurred. - WARNING, // One or more warnings occurred. -}; - -/* IActivityProfilerSession: - * an opaque object that can be used by a high level profiler to - * start/stop and return trace events. - */ -class IActivityProfilerSession { - - public: - virtual ~IActivityProfilerSession() {} - - // start the trace collection synchronously - virtual void start() = 0; - - // stop the trace collection synchronously - virtual void stop() = 0; - - TraceStatus status() { - return status_; - } - - // returns list of Trace Activities - virtual std::vector& activities() = 0; - - // returns errors with this trace - virtual std::vector errors() = 0; - - // processes trace activities using logger - virtual void processTrace(ActivityLogger& logger) = 0; - - // XXX define trace formats - // virtual save(string name, TraceFormat format) - - protected: - TraceStatus status_ = TraceStatus::READY; -}; - - -/* Activity Profiler Plugins: - * These allow other frameworks to integrate into Kineto's primariy - * activity profiler. While the primary activity profiler handles - * timing the trace collections and correlating events the plugins - * can become source of new trace activity types. - */ -class IActivityProfiler { - - public: - - virtual ~IActivityProfiler() {} - - // name of profiler - virtual const std::string& name() const = 0; - - // returns activity types this profiler supports - virtual const std::set& availableActivities() const = 0; - - // Calls prepare() on registered tracer providers passing in the relevant - // activity types. 
Returns a profiler session handle - virtual std::unique_ptr configure( - const std::set& activity_types, - const Config& config) = 0; - - // asynchronous version of the above with future timestamp and duration. - virtual std::unique_ptr configure( - int64_t ts_ms, - int64_t duration_ms, - const std::set& activity_types, - const Config& config) = 0; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/ILoggerObserver.h b/plugins/tensorboard-plugins/libkineto/include/ILoggerObserver.h deleted file mode 100644 index 4fce7851b9669ff93a3f3a772140b0466674853c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ILoggerObserver.h +++ /dev/null @@ -1,50 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -// Stages in libkineto used when pushing logs to UST Logger. -constexpr char kWarmUpStage[] = "Warm Up"; -constexpr char kCollectionStage[] = "Collection"; -constexpr char kPostProcessingStage[] = "Post Processing"; - -#if !USE_GOOGLE_LOG - -#include -#include - -namespace libkineto { - -enum LoggerOutputType { - VERBOSE = 0, - INFO = 1, - WARNING = 2, - ERROR = 3, - STAGE = 4, - ENUM_COUNT = 5 -}; - -const char* toString(LoggerOutputType t); -LoggerOutputType toLoggerOutputType(const std::string& str); - -constexpr int LoggerTypeCount = (int) LoggerOutputType::ENUM_COUNT; - -class ILoggerObserver { - public: - virtual ~ILoggerObserver() = default; - virtual void write(const std::string& message, LoggerOutputType ot) = 0; - virtual const std::map> extractCollectorMetadata() = 0; - virtual void reset() = 0; - virtual void addDevice(const int64_t device) = 0; - virtual void setTraceDurationMS(const int64_t duration) = 0; - virtual void addEventCount(const int64_t count) = 0; - virtual void setTraceID(const std::string&) {} - virtual void setGroupTraceID(const std::string&) {} - virtual void addDestination(const std::string& dest) = 0; - 
-}; - -} // namespace libkineto - -#endif // !USE_GOOGLE_LOG diff --git a/plugins/tensorboard-plugins/libkineto/include/ITraceActivity.h b/plugins/tensorboard-plugins/libkineto/include/ITraceActivity.h deleted file mode 100644 index a477ed814662cb4c57738b7e40ec6052e9f65288..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ITraceActivity.h +++ /dev/null @@ -1,53 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -#include "ActivityType.h" - -namespace libkineto { - -class ActivityLogger; -struct TraceSpan; - -// Generic activity interface is borrowed from tensorboard protobuf format. -struct ITraceActivity { - virtual ~ITraceActivity() {} - // Device is a physical or logical entity, e.g. CPU, GPU or process - virtual int64_t deviceId() const = 0; - // A resource is something on the device, h/w thread, - // functional units etc. - virtual int64_t resourceId() const = 0; - // s/w thread - virtual int32_t getThreadId() const = 0; - // Start timestamp in mucrosecond - virtual int64_t timestamp() const = 0; - // Duration in microseconds - virtual int64_t duration() const = 0; - // Used to link up async activities - virtual int64_t correlationId() const = 0; - // Part of a flow, identified by flow id and type - virtual int flowType() const = 0; - virtual int flowId() const = 0; - virtual bool flowStart() const = 0; - virtual ActivityType type() const = 0; - virtual const std::string name() const = 0; - // Optional linked activity - virtual const ITraceActivity* linkedActivity() const = 0; - // Optional containing trace object - virtual const TraceSpan* traceSpan() const = 0; - // Log activity - virtual void log(ActivityLogger& logger) const = 0; - // Return json formatted metadata - // FIXME: Return iterator to dynamic type map here instead - virtual const std::string metadataJson() const = 0; - - static int64_t nsToUs(int64_t ns) { - // It's important that this 
conversion is the same everywhere. - // No rounding! - return ns / 1000; - } -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/ThreadUtil.h b/plugins/tensorboard-plugins/libkineto/include/ThreadUtil.h deleted file mode 100644 index d1dc80ad2ab0dfd3bea313363fb0e6565349889c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/ThreadUtil.h +++ /dev/null @@ -1,22 +0,0 @@ -#pragma once - -#include -#include -#include -#include - -namespace libkineto { - -int32_t systemThreadId(); -int32_t threadId(); -bool setThreadName(const std::string& name); -std::string getThreadName(); - -int32_t processId(); -std::string processName(int32_t pid); - -// Return a list of pids and process names for the current process -// and its parents. -std::vector> pidCommandPairsOfAncestors(); - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/TraceSpan.h b/plugins/tensorboard-plugins/libkineto/include/TraceSpan.h deleted file mode 100644 index af9a9d5ee556830ac34568e6c81ec4f8f00da2e3..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/TraceSpan.h +++ /dev/null @@ -1,36 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include - -namespace libkineto { - -struct TraceSpan { - TraceSpan() = delete; - TraceSpan( - int64_t startTime, int64_t endTime, std::string name) - : startTime(startTime), endTime(endTime), name(std::move(name)) { - } - TraceSpan( - int opCount, int it, std::string name, std::string prefix) - : opCount(opCount), - iteration(it), - name(std::move(name)), - prefix(std::move(prefix)) { - } - - // FIXME: change to duration? 
- int64_t startTime{0}; - int64_t endTime{0}; - int opCount{0}; - int iteration{-1}; - // Name is used to identify timeline - std::string name; - // Prefix used to distinguish trace spans on the same timeline - std::string prefix; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/libkineto.h b/plugins/tensorboard-plugins/libkineto/include/libkineto.h deleted file mode 100644 index 87c3d64f638dad9d1c2d24c013135db60d477642..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/libkineto.h +++ /dev/null @@ -1,138 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -// Mediator for initialization and profiler control - -#pragma once - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "ActivityProfilerInterface.h" -#include "ActivityType.h" -#include "ClientInterface.h" -#include "GenericTraceActivity.h" -#include "TraceSpan.h" -#include "IActivityProfiler.h" -#include "ActivityTraceInterface.h" - -#include "ThreadUtil.h" - -extern "C" { - void suppressLibkinetoLogMessages(); - int InitializeInjection(void); - bool libkineto_init(bool cpuOnly, bool logOnError); -} - -namespace libkineto { - -class Config; -class ConfigLoader; - -struct CpuTraceBuffer { - TraceSpan span{0, 0, "none"}; - int gpuOpCount; - std::deque activities; -}; - -using ChildActivityProfilerFactory = - std::function()>; - -class LibkinetoApi { - public: - - explicit LibkinetoApi(ConfigLoader& configLoader) - : configLoader_(configLoader) { - } - - // Called by client that supports tracing API. - // libkineto can still function without this. 
- void registerClient(ClientInterface* client); - - // Called by libkineto on init - void registerProfiler(std::unique_ptr profiler) { - activityProfiler_ = std::move(profiler); - initClientIfRegistered(); - } - - ActivityProfilerInterface& activityProfiler() { - return *activityProfiler_; - } - - ClientInterface* client() { - return client_; - } - - void initProfilerIfRegistered() { - static std::once_flag once; - if (activityProfiler_) { - std::call_once(once, [this] { - if (!activityProfiler_->isInitialized()) { - activityProfiler_->init(); - initChildActivityProfilers(); - } - }); - } - } - - bool isProfilerInitialized() const { - return activityProfiler_ && activityProfiler_->isInitialized(); - } - - bool isProfilerRegistered() const { - return activityProfiler_ != nullptr; - } - - void suppressLogMessages() { - suppressLibkinetoLogMessages(); - } - - // Provides access to profier configuration manaegement - ConfigLoader& configLoader() { - return configLoader_; - } - - void registerProfilerFactory( - ChildActivityProfilerFactory factory) { - if (isProfilerInitialized()) { - activityProfiler_->addChildActivityProfiler(factory()); - } else { - childProfilerFactories_.push_back(factory); - } - } - - private: - - void initChildActivityProfilers() { - if (!isProfilerInitialized()) { - return; - } - for (const auto& factory : childProfilerFactories_) { - activityProfiler_->addChildActivityProfiler(factory()); - } - childProfilerFactories_.clear(); - } - - // Client is initialized once both it and libkineto has registered - void initClientIfRegistered(); - - ConfigLoader& configLoader_; - std::unique_ptr activityProfiler_{}; - ClientInterface* client_{}; - int32_t clientRegisterThread_{0}; - - bool isLoaded_{false}; - std::vector childProfilerFactories_; -}; - -// Singleton -LibkinetoApi& api(); - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/include/time_since_epoch.h b/plugins/tensorboard-plugins/libkineto/include/time_since_epoch.h 
deleted file mode 100644 index caa6b4d92760d384eca2b1383a679fe7435c53b3..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/include/time_since_epoch.h +++ /dev/null @@ -1,16 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -namespace libkineto { - -inline int64_t timeSinceEpoch( - const std::chrono::time_point& t) { - return std::chrono::duration_cast( - t.time_since_epoch()) - .count(); -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/libkineto_defs.bzl b/plugins/tensorboard-plugins/libkineto/libkineto_defs.bzl deleted file mode 100644 index 330c54a22dfcedf895f0eba4077713a7c4cd8072..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/libkineto_defs.bzl +++ /dev/null @@ -1,77 +0,0 @@ -# Copyright (c) Facebook, Inc. and its affiliates. -# All rights reserved. -# This source code is licensed under the BSD-style license found in the -# LICENSE file in the root directory of this source tree. 
- -def get_libkineto_api_srcs(): - return [ - "src/ThreadUtil.cpp", - "src/libkineto_api.cpp", - ] - -def get_libkineto_cupti_srcs(with_api = True): - return [ - "src/CudaDeviceProperties.cpp", - "src/CuptiActivityApi.cpp", - "src/CuptiActivityPlatform.cpp", - "src/CuptiCallbackApi.cpp", - "src/CuptiEventApi.cpp", - "src/CuptiMetricApi.cpp", - "src/CuptiRangeProfilerApi.cpp", - "src/Demangle.cpp", - "src/EventProfiler.cpp", - "src/EventProfilerController.cpp", - "src/WeakSymbols.cpp", - "src/cupti_strings.cpp", - ] + (get_libkineto_cpu_only_srcs(with_api)) - -def get_libkineto_roctracer_srcs(with_api = True): - return [ - "src/RoctracerActivityApi.cpp", - ] + (get_libkineto_cpu_only_srcs(with_api)) - -def get_libkineto_cpu_only_srcs(with_api = True): - return [ - "src/AbstractConfig.cpp", - "src/CuptiActivityProfiler.cpp", - "src/ActivityProfilerController.cpp", - "src/ActivityProfilerProxy.cpp", - "src/ActivityType.cpp", - "src/Config.cpp", - "src/ConfigLoader.cpp", - "src/CuptiActivityApi.cpp", - "src/Demangle.cpp", - "src/GenericTraceActivity.cpp", - "src/ILoggerObserver.cpp", - "src/Logger.cpp", - "src/init.cpp", - "src/output_csv.cpp", - "src/output_json.cpp", - ] + (get_libkineto_api_srcs() if with_api else []) - -def get_libkineto_public_headers(): - return [ - "include/AbstractConfig.h", - "include/ActivityProfilerInterface.h", - "include/ActivityType.h", - "include/Config.h", - "include/ClientInterface.h", - "include/GenericTraceActivity.h", - "include/GenericTraceActivity.h", - "include/IActivityProfiler.h", - "include/ILoggerObserver.h", - "include/ITraceActivity.h", - "include/TraceSpan.h", - "include/ThreadUtil.h", - "include/libkineto.h", - "include/time_since_epoch.h", - ] - -# kineto code should be updated to not have to -# suppress these warnings. 
-KINETO_COMPILER_FLAGS = [ - "-fexceptions", - "-Wno-deprecated-declarations", - "-Wno-unused-function", - "-Wno-unused-private-field", -] diff --git a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cpp b/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cpp deleted file mode 100644 index 780047912ed09996d3952901267d46aab99cf78c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cpp +++ /dev/null @@ -1,38 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include -#include - -#include -#include - -#include "kineto/libkineto/sample_programs/kineto_playground.cuh" - -using namespace kineto; - -static const std::string kFileName = "/tmp/kineto_playground_trace.json"; - -int main() { - warmup(); - - // Kineto config - - // Empty types set defaults to all types - std::set types; - - auto& profiler = libkineto::api().activityProfiler(); - libkineto::api().initProfilerIfRegistered(); - profiler.prepareTrace(types); - - // Good to warm up after prepareTrace to get cupti initialization to settle - warmup(); - profiler.startTrace(); - playground(); - - auto trace = profiler.stopTrace(); - LOG(INFO) << "Stopped and processed trace. Got " << trace->activities()->size() << " activities."; - trace->save(kFileName); - return 0; -} - diff --git a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cu b/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cu deleted file mode 100644 index 54c6f82ff4be2e468c0e868b49b3a9130de97490..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cu +++ /dev/null @@ -1,60 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include - -#include "kineto_playground.cuh" - - -namespace kineto { - -void warmup(void) { - // Inititalizing CUDA can take a while which we normally do not want to see in Kineto traces. - // This is done in various ways that take Kineto as dependency. This is our way of doing warmup - // for kineto_playground - size_t bytes = 1000; - float* mem = NULL; - auto error = cudaMalloc(&mem, bytes); - if (error != cudaSuccess) { - printf("cudaMalloc failed during kineto_playground warmup. error code: %d", error); - return; - } - - cudaFree(mem); -} - -void basicMemcpyMemset(void) { - size_t size = (1 << 8) * sizeof(float); - float *hostMemSrc, *deviceMem, *hostMemDst; - cudaError_t err; - - hostMemSrc = (float*)malloc(size); - hostMemDst = (float*)malloc(size); - err = cudaMalloc(&deviceMem, size); - if (err != cudaSuccess) { - printf("cudaMalloc failed during %s", __func__); - return; - } - - memset(hostMemSrc, 1, size); - cudaMemcpy(deviceMem, hostMemSrc, size, cudaMemcpyHostToDevice); - if (err != cudaSuccess) { - printf("cudaMemcpy failed during %s", __func__); - return; - } - - cudaMemcpy(hostMemDst, deviceMem, size, cudaMemcpyDeviceToHost); - if (err != cudaSuccess) { - printf("cudaMemcpy failed during %s", __func__); - return; - } - - free(hostMemSrc); - free(hostMemDst); - cudaFree(deviceMem); -} - -void playground(void) { - // Add your experimental CUDA implementation here. -} - -} diff --git a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cuh b/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cuh deleted file mode 100644 index 54e1ee59ada9ae88370b38146567ed87be2b914b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/sample_programs/kineto_playground.cuh +++ /dev/null @@ -1,18 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include - -namespace kineto { - -// Warms up CUDA before the tracing starts -void warmup(void); - -// Basic usage of cudaMemcpy and cudaMemset -void basicMemcpyMemset(void); - -// Your experimental code goes in here! -void playground(void); - -} diff --git a/plugins/tensorboard-plugins/libkineto/src/AbstractConfig.cpp b/plugins/tensorboard-plugins/libkineto/src/AbstractConfig.cpp deleted file mode 100644 index d60ab43c9a3e198167beb7987d619b0bb8e9ed13..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/AbstractConfig.cpp +++ /dev/null @@ -1,188 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "AbstractConfig.h" - -#include -#include -#include - -#include "Logger.h" - -using namespace std::chrono; - -using std::string; -using std::vector; - -namespace KINETO_NAMESPACE { - -constexpr char kWhitespace[] = "\t\n "; - -static bool isWhitespace(string& s) { - return s.find_first_not_of(kWhitespace) == string::npos; -} - -// Remove whitespace from both end of string -static inline string trim(string& s) { - if (s.empty()) { - return s; - } else if (isWhitespace(s)) { - return ""; - } - auto start = s.find_first_not_of(kWhitespace); - auto end = s.find_last_not_of(kWhitespace); - return s.substr(start, end - start + 1); -} - -// Helper function for split. -// Return the index of char d in string s. -// If not found, returns the length of the string. -static int find(const char* s, char delim) { - int i; - for (i = 0; s[i]; i++) { - if (s[i] == delim) { - break; - } - } - return i; -} - -// Split a string by delimiter -static vector split(const string& s, char delim) { - vector res; - const char* cs = s.c_str(); - for (int i = find(cs, delim); cs[i]; cs += i + 1, i = find(cs, delim)) { - res.emplace_back(cs, i); - } - res.emplace_back(cs); - return res; -} - -// Remove a trailing comment. 
-static inline string stripComment(const string& s) { - std::size_t pos = s.find("#"); - return s.substr(0, pos); -} - -string AbstractConfig::toLower(string& s) const { - string res = s; - for (int i = 0; i < res.size(); i++) { - if (res[i] >= 'A' && res[i] <= 'Z') { - res[i] += ('a' - 'A'); - } - } - return res; -} - -bool AbstractConfig::endsWith(const string& s, const string& suffix) const { - if (suffix.size() > s.size()) { - return false; - } - return s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0; -} - -vector AbstractConfig::splitAndTrim(const string& s, char delim) const { - auto res = split(s, delim); - for (string& x : res) { - x = trim(x); - } - return res; -} - -int64_t AbstractConfig::toIntRange(const string& val, int64_t min, int64_t max) - const { - char* invalid; - int64_t res = strtoll(val.c_str(), &invalid, 10); - if (val.empty() || *invalid) { - throw std::invalid_argument(fmt::format("Invalid integer: {}", val)); - } else if (res < min || res > max) { - throw std::invalid_argument(fmt::format( - "Invalid argument: {} - expected range [{}, {}]", res, min, max)); - } - return res; -} - -int32_t AbstractConfig::toInt32(const string& val) const { - return toIntRange(val, 0, ~0u / 2); -} - -int64_t AbstractConfig::toInt64(const string& val) const { - return toIntRange(val, 0, ~0ul / 2); -} - -bool AbstractConfig::toBool(string& val) const { - const std::array bool_vals{ - "n", "y", "no", "yes", "f", "t", "false", "true"}; - const string lower_val = toLower(val); - for (int i = 0; i < bool_vals.size(); i++) { - if (lower_val == bool_vals[i]) { - return i % 2; - } - } - throw std::invalid_argument(fmt::format("Invalid bool argument: {}", val)); - return false; -} - -bool AbstractConfig::parse(const string& conf) { - std::istringstream iss(conf); - string line; - - timestamp_ = system_clock::now(); - - // Read the string stream 1 line at a time to parse. 
- while (std::getline(iss, line)) { - line = stripComment(line); - if (isWhitespace(line)) { - continue; - } - vector key_val = splitAndTrim(line, '='); - if (key_val.size() != 2) { - LOG(ERROR) << "Invalid config line: " << line; - return false; - } else { - bool handled = false; - try { - handled = handleOption(key_val[0], key_val[1]); - if (!handled) { - for (auto& feature_cfg : featureConfigs_) { - if (feature_cfg.second->handleOption(key_val[0], key_val[1])) { - handled = true; - break; - } - } - } - } catch (const std::exception& e) { - LOG(ERROR) << "Failed to parse config line: " << line; - LOG(ERROR) << e.what(); - return false; - } - if (!handled) { - // This might be due to using a newer config option on an - // older binary where it is not supported. In this case, - // print a warning message - but it is expected to work! - LOG(WARNING) << "Unrecognized config line: " << line; - } - } - } - - validate(timestamp_); - - // Store original text, used to detect updates - source_ = conf; - timestamp_ = system_clock::now(); - return true; -} - -bool AbstractConfig::handleOption( - const std::string& /* unused */, - std::string& /* unused */) { - LOG(ERROR) << "handleOption unimplemented"; - return false; -} - -void AbstractConfig::printActivityProfilerConfig(std::ostream& s) const { - for (const auto& feature_cfg : featureConfigs_) { - feature_cfg.second->printActivityProfilerConfig(s); - } -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityBuffers.h b/plugins/tensorboard-plugins/libkineto/src/ActivityBuffers.h deleted file mode 100644 index 157af879379a5f5fc5e274f22604987a97f17af4..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityBuffers.h +++ /dev/null @@ -1,29 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - - -#include -#include - -#include "libkineto.h" -#include "CuptiActivityBuffer.h" - -namespace KINETO_NAMESPACE { - -struct ActivityBuffers { - std::list> cpu; - std::unique_ptr gpu; - - // Add a wrapper object to the underlying struct stored in the buffer - template - const ITraceActivity& addActivityWrapper(const T& act) { - wrappers_.push_back(std::make_unique(act)); - return *wrappers_.back().get(); - } - - private: - std::vector> wrappers_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityLoggerFactory.h b/plugins/tensorboard-plugins/libkineto/src/ActivityLoggerFactory.h deleted file mode 100644 index 0d1bf642cd68051e487004d33e19c5eb181e1c41..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityLoggerFactory.h +++ /dev/null @@ -1,60 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -class ActivityLogger; - -class ActivityLoggerFactory { - - public: - using FactoryFunc = - std::function(const std::string& url)>; - - // Add logger factory for a protocol prefix - void addProtocol(const std::string& protocol, FactoryFunc f) { - factories_[tolower(protocol)] = f; - } - - // Create a logger, invoking the factory for the protocol specified in url - std::unique_ptr makeLogger(const std::string& url) const { - std::string protocol = extractProtocol(url); - auto it = factories_.find(tolower(protocol)); - if (it != factories_.end()) { - return it->second(stripProtocol(url)); - } - throw std::invalid_argument(fmt::format( - "No logger registered for the {} protocol prefix", - protocol)); - return nullptr; - } - - private: - static std::string tolower(std::string s) { - std::transform(s.begin(), s.end(), s.begin(), - [](unsigned char c) { return std::tolower(c); } - ); - return s; - } - - static std::string 
extractProtocol(std::string url) { - return url.substr(0, url.find("://")); - } - - static std::string stripProtocol(std::string url) { - size_t pos = url.find("://"); - return pos == url.npos ? url : url.substr(pos + 3); - } - - std::map factories_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.cpp b/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.cpp deleted file mode 100644 index c85d41ed73ff059bcd7ee69c36a0bcc6c3d5c4ca..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.cpp +++ /dev/null @@ -1,246 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "ActivityProfilerController.h" - -#include -#include - -#include "ActivityLoggerFactory.h" -#include "ActivityTrace.h" -#include "CuptiActivityApi.h" -#ifdef HAS_ROCTRACER -#include "RoctracerActivityApi.h" -#endif -#include "ThreadUtil.h" -#include "output_json.h" -#include "output_membuf.h" - -#include "Logger.h" - -using namespace std::chrono; - -namespace KINETO_NAMESPACE { - -constexpr milliseconds kProfilerIntervalMsecs(1000); - -ActivityProfilerController::ActivityProfilerController( - ConfigLoader& configLoader, bool cpuOnly) - : configLoader_(configLoader) { -#ifdef HAS_ROCTRACER - profiler_ = std::make_unique( - RoctracerActivityApi::singleton(), cpuOnly); -#else - profiler_ = std::make_unique( - CuptiActivityApi::singleton(), cpuOnly); -#endif - configLoader_.addHandler(ConfigLoader::ConfigKind::ActivityProfiler, this); -} - -ActivityProfilerController::~ActivityProfilerController() { - configLoader_.removeHandler( - ConfigLoader::ConfigKind::ActivityProfiler, this); - if (profilerThread_) { - // signaling termination of the profiler loop - stopRunloop_ = true; - profilerThread_->join(); - delete profilerThread_; - profilerThread_ = nullptr; - } -} - -static ActivityLoggerFactory initLoggerFactory() { - 
ActivityLoggerFactory factory; - factory.addProtocol("file", [](const std::string& url) { - return std::unique_ptr(new ChromeTraceLogger(url)); - }); - return factory; -} - -static ActivityLoggerFactory& loggerFactory() { - static ActivityLoggerFactory factory = initLoggerFactory(); - return factory; -} - -void ActivityProfilerController::addLoggerFactory( - const std::string& protocol, ActivityLoggerFactory::FactoryFunc factory) { - loggerFactory().addProtocol(protocol, factory); -} - -static std::unique_ptr makeLogger(const Config& config) { - if (config.activitiesLogToMemory()) { - return std::make_unique(config); - } - return loggerFactory().makeLogger(config.activitiesLogUrl()); -} - -bool ActivityProfilerController::canAcceptConfig() { - return !profiler_->isActive(); -} - -void ActivityProfilerController::acceptConfig(const Config& config) { - VLOG(1) << "acceptConfig"; - if (config.activityProfilerEnabled()) { - scheduleTrace(config); - } -} - -void ActivityProfilerController::profilerLoop() { - setThreadName("Kineto Activity Profiler"); - VLOG(0) << "Entering activity profiler loop"; - - auto now = system_clock::now(); - auto next_wakeup_time = now + kProfilerIntervalMsecs; - - while (!stopRunloop_) { - now = system_clock::now(); - - while (now < next_wakeup_time) { - /* sleep override */ - std::this_thread::sleep_for(next_wakeup_time - now); - now = system_clock::now(); - } - - if (!profiler_->isActive()) { - std::lock_guard lock(asyncConfigLock_); - if (asyncRequestConfig_ - && !asyncRequestConfig_->hasProfileStartIteration()) { - // Note on now + kProfilerIntervalMsecs - // Profiler interval does not align perfectly upto startTime - warmup. Waiting until the next tick - // won't allow sufficient time for the profiler to warm up. 
So check if we are very close to the warmup time and trigger warmup - if (now + kProfilerIntervalMsecs - >= (asyncRequestConfig_->requestTimestamp() - asyncRequestConfig_->activitiesWarmupDuration())) { - LOG(INFO) << "Received on-demand activity trace request by " - << " profile timestamp = " - << asyncRequestConfig_-> - requestTimestamp().time_since_epoch().count(); - activateConfig(now); - } - } - } - - while (next_wakeup_time < now) { - next_wakeup_time += kProfilerIntervalMsecs; - } - - if (profiler_->isActive()) { - next_wakeup_time = profiler_->performRunLoopStep(now, next_wakeup_time); - VLOG(1) << "Profiler loop: " - << duration_cast(system_clock::now() - now).count() - << "ms"; - } - } - - VLOG(0) << "Exited activity profiling loop"; -} - -void ActivityProfilerController::step() { - int64_t currentIter = ++iterationCount_; - VLOG(0) << "Step called , iteration = " << currentIter; - - // optimization to not take the lock unless necessary - if (asyncRequestConfig_ && !profiler_->isActive()) { - std::lock_guard lock(asyncConfigLock_); - auto startIter = asyncRequestConfig_->startIterationIncludingWarmup(); - - if (asyncRequestConfig_->hasProfileStartIteration() - && currentIter >= startIter) { - LOG(INFO) << "Received on-demand activity trace request by profile" - << " start iteration = " - << asyncRequestConfig_->profileStartIteration() - << " current iteration = " << currentIter; - - if (currentIter > startIter) { - // adjust the start iteration if it is in the past - auto newProfileStart = currentIter + - asyncRequestConfig_->activitiesWarmupIterations(); - LOG(INFO) << "Start iteration updated to " << newProfileStart; - asyncRequestConfig_->setProfileStartIteration(newProfileStart); - } - activateConfig(system_clock::now()); - } - } - - if (profiler_->isActive()) { - auto now = system_clock::now(); - auto next_wakeup_time = now + kProfilerIntervalMsecs; - profiler_->performRunLoopStep(now, next_wakeup_time, currentIter); - } -} - -void 
ActivityProfilerController::activateConfig( - std::chrono::time_point now) { - logger_ = makeLogger(*asyncRequestConfig_); - profiler_->setLogger(logger_.get()); - profiler_->configure(*asyncRequestConfig_, now); - asyncRequestConfig_ = nullptr; -} - -void ActivityProfilerController::scheduleTrace(const Config& config) { - VLOG(1) << "scheduleTrace"; - if (profiler_->isActive()) { - LOG(ERROR) << "Ignored request - profiler busy"; - return; - } - int64_t currentIter = iterationCount_; - if (config.hasProfileStartIteration() && currentIter < 0) { - LOG(ERROR) << "Ignored profile iteration count based request as " - << "application is not updating iteration count"; - return; - } - std::lock_guard lock(asyncConfigLock_); - asyncRequestConfig_ = config.clone(); - - auto startIter = asyncRequestConfig_->startIterationIncludingWarmup(); - - if (asyncRequestConfig_->hasProfileStartIteration() - && (currentIter > startIter) - && asyncRequestConfig_->profileStartIterationRoundUp() > 0) { - auto newProfileStart - = currentIter + asyncRequestConfig_->activitiesWarmupIterations(); - // round up to nearest multiple - auto divisor = asyncRequestConfig_->profileStartIterationRoundUp(); - auto rem = newProfileStart % divisor; - newProfileStart += ((rem == 0) ? 0 : divisor - rem); - LOG(INFO) << "Rounding up profiler start iteration to : " << newProfileStart; - asyncRequestConfig_->setProfileStartIteration(newProfileStart); - } - - // start a profilerLoop() thread to handle request - if (!profilerThread_) { - profilerThread_ = - new std::thread(&ActivityProfilerController::profilerLoop, this); - } -} - -void ActivityProfilerController::prepareTrace(const Config& config) { - // Requests from ActivityProfilerApi have higher priority than - // requests from other sources (signal, daemon). - // Cancel any ongoing request and refuse new ones. 
- auto now = system_clock::now(); - if (profiler_->isActive()) { - LOG(WARNING) << "Cancelling current trace request in order to start " - << "higher priority synchronous request"; - if (libkineto::api().client()) { - libkineto::api().client()->stop(); - } - profiler_->stopTrace(now); - profiler_->reset(); - } - - profiler_->configure(config, now); -} - -std::unique_ptr ActivityProfilerController::stopTrace() { - profiler_->stopTrace(std::chrono::system_clock::now()); - auto logger = std::make_unique(profiler_->config()); - profiler_->processTrace(*logger); - profiler_->reset(); - return std::make_unique(std::move(logger), loggerFactory()); -} - -void ActivityProfilerController::addMetadata( - const std::string& key, const std::string& value) { - profiler_->addMetadata(key, value); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.h b/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.h deleted file mode 100644 index 415f107cbed6aab4777c65e9e51d65686002e762..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerController.h +++ /dev/null @@ -1,84 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include - -#include "ActivityLoggerFactory.h" -#include "CuptiActivityProfiler.h" -#include "ActivityProfilerInterface.h" -#include "ActivityTraceInterface.h" -#include "ConfigLoader.h" -#include "CuptiActivityApi.h" - -namespace KINETO_NAMESPACE { - -class Config; - -class ActivityProfilerController : public ConfigLoader::ConfigHandler { - public: - explicit ActivityProfilerController(ConfigLoader& configLoader, bool cpuOnly); - ActivityProfilerController(const ActivityProfilerController&) = delete; - ActivityProfilerController& operator=(const ActivityProfilerController&) = - delete; - - ~ActivityProfilerController(); - - static void addLoggerFactory( - const std::string& protocol, - ActivityLoggerFactory::FactoryFunc factory); - - bool canAcceptConfig() override; - void acceptConfig(const Config& config) override; - - void scheduleTrace(const Config& config); - - void prepareTrace(const Config& config); - - void startTrace() { - profiler_->startTrace(std::chrono::system_clock::now()); - } - - void step(); - - std::unique_ptr stopTrace(); - - bool isActive() { - return profiler_->isActive(); - } - - void transferCpuTrace( - std::unique_ptr cpuTrace) { - return profiler_->transferCpuTrace(std::move(cpuTrace)); - } - - void recordThreadInfo() { - profiler_->recordThreadInfo(); - } - - void addChildActivityProfiler( - std::unique_ptr profiler) { - profiler_->addChildActivityProfiler(std::move(profiler)); - } - - void addMetadata(const std::string& key, const std::string& value); - - private: - void profilerLoop(); - void activateConfig(std::chrono::time_point now); - - std::unique_ptr asyncRequestConfig_; - std::mutex asyncConfigLock_; - std::unique_ptr profiler_; - std::unique_ptr logger_; - std::thread* profilerThread_{nullptr}; - std::atomic_bool stopRunloop_{false}; - std::atomic iterationCount_{-1}; - ConfigLoader& configLoader_; -}; - -} // namespace KINETO_NAMESPACE diff --git 
a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.cpp b/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.cpp deleted file mode 100644 index b2d36b7b3abf9c3e0aed838a10e4054a5d292139..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.cpp +++ /dev/null @@ -1,119 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "ActivityProfilerProxy.h" - -#include "ActivityProfilerController.h" -#include "Config.h" -#include "CuptiActivityApi.h" -#include "Logger.h" -#include - -namespace KINETO_NAMESPACE { - -ActivityProfilerProxy::ActivityProfilerProxy( - bool cpuOnly, ConfigLoader& configLoader) - : cpuOnly_(cpuOnly), configLoader_(configLoader) { -} - -ActivityProfilerProxy::~ActivityProfilerProxy() { - delete controller_; -}; - -void ActivityProfilerProxy::init() { - if (!controller_) { - controller_ = new ActivityProfilerController(configLoader_, cpuOnly_); - } -} - -void ActivityProfilerProxy::scheduleTrace(const std::string& configStr) { - Config config; - config.parse(configStr); - controller_->scheduleTrace(config); -} - -void ActivityProfilerProxy::scheduleTrace(const Config& config) { - controller_->scheduleTrace(config); -} - -void ActivityProfilerProxy::prepareTrace( - const std::set& activityTypes, - const std::string& configStr) { - Config config; - bool validate_required = true; - - // allow user provided config to override default options - if (!configStr.empty()) { - if (!config.parse(configStr)) { - LOG(WARNING) << "Failed to parse config : " << configStr; - } - // parse also runs validate - validate_required = false; - } - - config.setClientDefaults(); - config.setSelectedActivityTypes(activityTypes); - - if (validate_required) { - config.validate(std::chrono::system_clock::now()); - } - - controller_->prepareTrace(config); -} - -void ActivityProfilerProxy::startTrace() { - controller_->startTrace(); -} - -std::unique_ptr 
-ActivityProfilerProxy::stopTrace() { - return controller_->stopTrace(); -} - -void ActivityProfilerProxy::step() { - controller_->step(); -} - -bool ActivityProfilerProxy::isActive() { - return controller_->isActive(); -} - -void ActivityProfilerProxy::pushCorrelationId(uint64_t id) { - CuptiActivityApi::pushCorrelationID(id, - CuptiActivityApi::CorrelationFlowType::Default); -} - -void ActivityProfilerProxy::popCorrelationId() { - CuptiActivityApi::popCorrelationID( - CuptiActivityApi::CorrelationFlowType::Default); -} - -void ActivityProfilerProxy::pushUserCorrelationId(uint64_t id) { - CuptiActivityApi::pushCorrelationID(id, - CuptiActivityApi::CorrelationFlowType::User); -} - -void ActivityProfilerProxy::popUserCorrelationId() { - CuptiActivityApi::popCorrelationID( - CuptiActivityApi::CorrelationFlowType::User); -} - -void ActivityProfilerProxy::transferCpuTrace( - std::unique_ptr traceBuffer) { - controller_->transferCpuTrace(std::move(traceBuffer)); -} - -void ActivityProfilerProxy::addMetadata( - const std::string& key, const std::string& value) { - controller_->addMetadata(key, value); -} - -void ActivityProfilerProxy::recordThreadInfo() { - controller_->recordThreadInfo(); -} - -void ActivityProfilerProxy::addChildActivityProfiler( - std::unique_ptr profiler) { - controller_->addChildActivityProfiler(std::move(profiler)); -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.h b/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.h deleted file mode 100644 index b5cf84b2f1ddb005060fea0927c99fc63d144d99..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityProfilerProxy.h +++ /dev/null @@ -1,73 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include "ActivityProfilerInterface.h" - -#include -#include -#include - -#include "ActivityType.h" -#include "ITraceActivity.h" - -namespace libkineto { - // previous declaration is struct so this one must be too. - struct CpuTraceBuffer; -} - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -class ActivityProfilerController; -class Config; -class ConfigLoader; - -class ActivityProfilerProxy : public ActivityProfilerInterface { - - public: - ActivityProfilerProxy(bool cpuOnly, ConfigLoader& configLoader); - ~ActivityProfilerProxy() override; - - void init() override; - bool isInitialized() override { - return controller_ != nullptr; - } - - bool isActive() override; - - void recordThreadInfo() override; - - void scheduleTrace(const std::string& configStr) override; - void scheduleTrace(const Config& config); - - void prepareTrace( - const std::set& activityTypes, - const std::string& configStr = "") override; - - void startTrace() override; - void step() override; - std::unique_ptr stopTrace() override; - - void pushCorrelationId(uint64_t id) override; - void popCorrelationId() override; - - void pushUserCorrelationId(uint64_t id) override; - void popUserCorrelationId() override; - - void transferCpuTrace( - std::unique_ptr traceBuffer) override; - - void addMetadata(const std::string& key, const std::string& value) override; - - virtual void addChildActivityProfiler( - std::unique_ptr profiler) override; - - private: - bool cpuOnly_{true}; - ConfigLoader& configLoader_; - ActivityProfilerController* controller_{nullptr}; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityTrace.h b/plugins/tensorboard-plugins/libkineto/src/ActivityTrace.h deleted file mode 100644 index 0be76af08e47c16ebee2ac1d1ad01c4425ff17a5..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityTrace.h +++ /dev/null @@ -1,45 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. 
Confidential and proprietary. - -#pragma once - -#include -#include - -#include "ActivityLoggerFactory.h" -#include "ActivityTraceInterface.h" -#include "output_json.h" -#include "output_membuf.h" - -namespace libkineto { - -class ActivityTrace : public ActivityTraceInterface { - public: - ActivityTrace( - std::unique_ptr tmpLogger, - const ActivityLoggerFactory& factory) - : memLogger_(std::move(tmpLogger)), - loggerFactory_(factory) { - } - - const std::vector* activities() override { - return memLogger_->traceActivities(); - }; - - void save(const std::string& url) override { - std::string prefix; - // if no protocol is specified, default to file - if (url.find("://") == url.npos) { - prefix = "file://"; - } - memLogger_->log(*loggerFactory_.makeLogger(prefix + url)); - }; - - private: - // Activities are logged into a buffer - std::unique_ptr memLogger_; - - // Alternative logger used by save() if protocol prefix is specified - const ActivityLoggerFactory& loggerFactory_; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/ActivityType.cpp b/plugins/tensorboard-plugins/libkineto/src/ActivityType.cpp deleted file mode 100644 index 18856b72370abdb6d9cf4309b32be4cae10805de..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ActivityType.cpp +++ /dev/null @@ -1,58 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "ActivityType.h" - -#include - -namespace libkineto { - -struct ActivityTypeName { - const char* name; - ActivityType type; -}; - -static constexpr std::array map{{ - {"cpu_op", ActivityType::CPU_OP}, - {"user_annotation", ActivityType::USER_ANNOTATION}, - {"gpu_user_Annotation", ActivityType::GPU_USER_ANNOTATION}, - {"gpu_memcpy", ActivityType::GPU_MEMCPY}, - {"gpu_memset", ActivityType::GPU_MEMSET}, - {"kernel", ActivityType::CONCURRENT_KERNEL}, - {"external_correlation", ActivityType::EXTERNAL_CORRELATION}, - {"cuda_runtime", ActivityType::CUDA_RUNTIME}, - {"cuda_profiler_range", ActivityType::CUDA_PROFILER_RANGE}, - {"glow_runtime", ActivityType::GLOW_RUNTIME}, - {"cpu_instant_event", ActivityType::CPU_INSTANT_EVENT}, - {"python_function", ActivityType::PYTHON_FUNCTION}, - {"overhead", ActivityType::OVERHEAD}, - {"ENUM_COUNT", ActivityType::ENUM_COUNT} -}}; - -static constexpr bool matchingOrder(int idx = 0) { - return map[idx].type == ActivityType::ENUM_COUNT || - ((idx == (int) map[idx].type) && matchingOrder(idx + 1)); -} -static_assert(matchingOrder(), "ActivityTypeName map is out of order"); - -const char* toString(ActivityType t) { - return map[(int)t].name; -} - -ActivityType toActivityType(const std::string& str) { - for (int i = 0; i < activityTypeCount; i++) { - if (str == map[i].name) { - return map[i].type; - } - } - throw std::invalid_argument(fmt::format("Invalid activity type: {}", str)); -} - -const std::array activityTypes() { - std::array res; - for (int i = 0; i < activityTypeCount; i++) { - res[i] = map[i].type; - } - return res; -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/Config.cpp b/plugins/tensorboard-plugins/libkineto/src/Config.cpp deleted file mode 100644 index 95538840f378e83b2b44161823042c620b34fe93..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/Config.cpp +++ /dev/null @@ -1,473 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. 
Confidential and proprietary. - -#include "Config.h" - -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "Logger.h" -#include "ThreadUtil.h" - -using namespace std::chrono; - -using std::string; -using std::vector; - -namespace KINETO_NAMESPACE { - -constexpr milliseconds kDefaultSamplePeriodMsecs(1000); -constexpr milliseconds kDefaultMultiplexPeriodMsecs(1000); -constexpr milliseconds kDefaultActivitiesProfileDurationMSecs(500); -constexpr int kDefaultActivitiesMaxGpuBufferSize(128 * 1024 * 1024); -constexpr seconds kDefaultActivitiesWarmupDurationSecs(5); -constexpr seconds kDefaultBufferUntilWarmup(10); -constexpr seconds kDefaultReportPeriodSecs(1); -constexpr int kDefaultSamplesPerReport(1); -constexpr int kDefaultMaxEventProfilersPerGpu(1); -constexpr int kDefaultEventProfilerHearbeatMonitorPeriod(0); -constexpr seconds kMaxRequestAge(10); - -// Event Profiler -constexpr char kEventsKey[] = "EVENTS"; -constexpr char kMetricsKey[] = "METRICS"; -constexpr char kSamplePeriodKey[] = "SAMPLE_PERIOD_MSECS"; -constexpr char kMultiplexPeriodKey[] = "MULTIPLEX_PERIOD_MSECS"; -constexpr char kReportPeriodKey[] = "REPORT_PERIOD_SECS"; -constexpr char kSamplesPerReportKey[] = "SAMPLES_PER_REPORT"; -constexpr char kEventsLogFileKey[] = "EVENTS_LOG_FILE"; -constexpr char kEventsEnabledDevicesKey[] = "EVENTS_ENABLED_DEVICES"; -constexpr char kOnDemandDurationKey[] = "EVENTS_DURATION_SECS"; -constexpr char kMaxEventProfilersPerGpuKey[] = "MAX_EVENT_PROFILERS_PER_GPU"; -constexpr char kHeartbeatMonitorPeriodKey[] = - "EVENTS_HEARTBEAT_MONITOR_PERIOD_SECS"; - -// Activity Profiler -constexpr char kActivitiesEnabledKey[] = "ACTIVITIES_ENABLED"; -constexpr char kActivityTypesKey[] = "ACTIVITY_TYPES"; -constexpr char kActivitiesLogFileKey[] = "ACTIVITIES_LOG_FILE"; -constexpr char kActivitiesDurationKey[] = "ACTIVITIES_DURATION_SECS"; -constexpr char kActivitiesDurationMsecsKey[] = 
"ACTIVITIES_DURATION_MSECS"; -constexpr char kActivitiesWarmupDurationSecsKey[] = "ACTIVITIES_WARMUP_PERIOD_SECS"; -constexpr char kActivitiesMaxGpuBufferSizeKey[] = - "ACTIVITIES_MAX_GPU_BUFFER_SIZE_MB"; - -// Client Interface -constexpr char kClientInterfaceEnableOpInputsCollection[] = "CLIENT_INTERFACE_ENABLE_OP_INPUTS_COLLECTION"; - -constexpr char kActivitiesWarmupIterationsKey[] = "ACTIVITIES_WARMUP_ITERATIONS"; -constexpr char kActivitiesIterationsKey[] = "ACTIVITIES_ITERATIONS"; -// Common - -// Client-side timestamp used for synchronized start across hosts for -// distributed workloads. -// Specified in milliseconds Unix time (milliseconds since epoch). -// To use, compute a future timestamp as follows: -// * C++: + duration_cast( -// system_clock::now().time_since_epoch()).count() -// * Python: + int(time.time() * 1000) -// * Bash: $(( + $(date +%s%3N))) -// If used for a tracing request, timestamp must be far enough in the future -// to accommodate ACTIVITIES_WARMUP_PERIOD_SECS as well as any delays in -// propagating the request to the profiler. -// If the request can not be honored, it is up to the profilers to report -// an error somehow - no checks are done at config parse time. -// Note PROFILE_START_ITERATION has higher precedence -constexpr char kProfileStartTimeKey[] = "PROFILE_START_TIME"; -// DEPRECATED - USE PROFILE_START_TIME instead -constexpr char kRequestTimestampKey[] = "REQUEST_TIMESTAMP"; - -// Alternatively if the application supports reporting iterations -// start the profile at specific iteration. If the iteration count -// is >= this value the profile is started immediately. -// A value >= 0 is valid for this config option to take effect. -// Note PROFILE_START_ITERATION will take precedence over PROFILE_START_TIME. -constexpr char kProfileStartIterationKey[] = "PROFILE_START_ITERATION"; - -// Users can also start the profile on an integer multiple of the config -// value PROFILE_START_ITERATION_ROUNDUP. 
This knob behaves similar to -// PROFILE_START_ITERATION but instead of saying : "start collection trace on -// iteration 500", one can configure it to "start collecting trace on the next -// 100th iteration". -// -// For example, -// PROFILE_START_ITERATION_ROUNDUP = 1000, and the current iteration is 2010 -// The profile will then be collected on the next multiple of 1000 ie. 3000 -// Note PROFILE_START_ITERATION_ROUNDUP will also take precedence over -// PROFILE_START_TIME. -constexpr char kProfileStartIterationRoundUpKey[] - = "PROFILE_START_ITERATION_ROUNDUP"; - -// Enable on-demand trigger via kill -USR2 -// When triggered in this way, /tmp/libkineto.conf will be used as config. -constexpr char kEnableSigUsr2Key[] = "ENABLE_SIGUSR2"; - -// Enable communication through IPC Fabric -// and disable thrift communication with dynolog daemon -constexpr char kEnableIpcFabricKey[] = "ENABLE_IPC_FABRIC"; - -// Verbose log level -// The actual glog is not used and --v and --vmodule has no effect. -// Instead set the verbose level and modules in the config file. -constexpr char kLogVerboseLevelKey[] = "VERBOSE_LOG_LEVEL"; -// By default, all modules will log verbose messages >= verboseLogLevel. -// But to reduce noise we can specify one or more modules of interest. -// A module is a C/C++ object file (source file name), -// Example argument: ActivityProfiler.cpp,output_json.cpp -constexpr char kLogVerboseModulesKey[] = "VERBOSE_LOG_MODULES"; - -// Max devices supported on any system -constexpr uint8_t kMaxDevices = 8; - -namespace { - -struct FactoryMap { - - void addFactory( - std::string name, - std::function factory) { - std::lock_guard lock(lock_); - factories_[name] = factory; - } - - void addFeatureConfigs(Config& cfg) { - std::lock_guard lock(lock_); - for (const auto& p : factories_) { - cfg.addFeature(p.first, p.second(cfg)); - } - } - -// Config factories are shared between objects and since -// config objects can be created by multiple threads, we need a lock. 
- std::mutex lock_; - std::map> factories_; -}; - -std::shared_ptr configFactories() { - // Ensure this is safe to call during shutdown, even as static - // destructors are invoked. Once factories destructor has been - // invoked, weak_ptr.lock() will return nullptr. - // But calls before that point will have a valid shared_ptr, - // delaying destruction of the underlying FactoryMap. - static auto factories = std::make_shared(); - static std::weak_ptr weak_ptr = factories; - return weak_ptr.lock(); -} - -} // namespace - -void Config::addConfigFactory( - std::string name, - std::function factory) { - auto factories = configFactories(); - if (factories) { - factories->addFactory(name, factory); - } -} - -static string defaultTraceFileName() { - return fmt::format("/tmp/libkineto_activities_{}.json", processId()); -} - -Config::Config() - : verboseLogLevel_(-1), - samplePeriod_(kDefaultSamplePeriodMsecs), - reportPeriod_(duration_cast(kDefaultReportPeriodSecs)), - samplesPerReport_(kDefaultSamplesPerReport), - eventProfilerOnDemandDuration_(seconds(0)), - eventProfilerMaxInstancesPerGpu_(kDefaultMaxEventProfilersPerGpu), - eventProfilerHeartbeatMonitorPeriod_( - kDefaultEventProfilerHearbeatMonitorPeriod), - multiplexPeriod_(kDefaultMultiplexPeriodMsecs), - activityProfilerEnabled_(true), - activitiesLogFile_(defaultTraceFileName()), - activitiesLogUrl_(fmt::format("file://{}", activitiesLogFile_)), - activitiesMaxGpuBufferSize_(kDefaultActivitiesMaxGpuBufferSize), - activitiesWarmupDuration_(kDefaultActivitiesWarmupDurationSecs), - activitiesWarmupIterations_(0), - activitiesDuration_(kDefaultActivitiesProfileDurationMSecs), - activitiesRunIterations_(0), - activitiesOnDemandTimestamp_(milliseconds(0)), - profileStartTime_(milliseconds(0)), - profileStartIteration_(-1), - profileStartIterationRoundUp_(-1), - requestTimestamp_(milliseconds(0)), - enableSigUsr2_(false), - enableIpcFabric_(false) { - auto factories = configFactories(); - if (factories) { - 
factories->addFeatureConfigs(*this); - } -} - -uint8_t Config::createDeviceMask(const string& val) { - uint8_t res = 0; - for (const auto& d : splitAndTrim(val, ',')) { - res |= 1 << toIntRange(d, 0, kMaxDevices - 1); - } - return res; -} - -const seconds Config::maxRequestAge() const { - return kMaxRequestAge; -} - -static std::string getTimeStr(time_point t) { - std::time_t t_c = system_clock::to_time_t(t); - return fmt::format("{:%H:%M:%S}", fmt::localtime(t_c)); -} - -static time_point handleRequestTimestamp(int64_t ms) { - auto t = time_point(milliseconds(ms)); - auto now = system_clock::now(); - if (t > now) { - throw std::invalid_argument(fmt::format( - "Invalid {}: {} - time is in future", - kRequestTimestampKey, - getTimeStr(t))); - } else if ((now - t) > kMaxRequestAge) { - throw std::invalid_argument(fmt::format( - "Invalid {}: {} - time is more than {}s in the past", - kRequestTimestampKey, - getTimeStr(t), - kMaxRequestAge.count())); - } - return t; -} - -void Config::setActivityTypes( - const std::vector& selected_activities) { - selectedActivityTypes_.clear(); - if (selected_activities.size() > 0) { - for (const auto& activity : selected_activities) { - if (activity == "") { - continue; - } - selectedActivityTypes_.insert(toActivityType(activity)); - } - } -} - -bool Config::handleOption(const std::string& name, std::string& val) { - // Event Profiler - if (!name.compare(kEventsKey)) { - vector event_names = splitAndTrim(val, ','); - eventNames_.insert(event_names.begin(), event_names.end()); - } else if (!name.compare(kMetricsKey)) { - vector metric_names = splitAndTrim(val, ','); - metricNames_.insert(metric_names.begin(), metric_names.end()); - } else if (!name.compare(kSamplePeriodKey)) { - samplePeriod_ = milliseconds(toInt32(val)); - } else if (!name.compare(kMultiplexPeriodKey)) { - multiplexPeriod_ = milliseconds(toInt32(val)); - } else if (!name.compare(kReportPeriodKey)) { - setReportPeriod(seconds(toInt32(val))); - } else if 
(!name.compare(kSamplesPerReportKey)) { - samplesPerReport_ = toInt32(val); - } else if (!name.compare(kEventsLogFileKey)) { - eventLogFile_ = val; - } else if (!name.compare(kEventsEnabledDevicesKey)) { - eventProfilerDeviceMask_ = createDeviceMask(val); - } else if (!name.compare(kOnDemandDurationKey)) { - eventProfilerOnDemandDuration_ = seconds(toInt32(val)); - eventProfilerOnDemandTimestamp_ = timestamp(); - } else if (!name.compare(kMaxEventProfilersPerGpuKey)) { - eventProfilerMaxInstancesPerGpu_ = toInt32(val); - } else if (!name.compare(kHeartbeatMonitorPeriodKey)) { - eventProfilerHeartbeatMonitorPeriod_ = seconds(toInt32(val)); - } - - // Activity Profiler - else if (!name.compare(kActivitiesDurationKey)) { - activitiesDuration_ = - duration_cast(seconds(toInt32(val))); - activitiesOnDemandTimestamp_ = timestamp(); - } else if (!name.compare(kActivityTypesKey)) { - vector activity_types = splitAndTrim(toLower(val), ','); - setActivityTypes(activity_types); - } else if (!name.compare(kActivitiesDurationMsecsKey)) { - activitiesDuration_ = milliseconds(toInt32(val)); - activitiesOnDemandTimestamp_ = timestamp(); - } else if (!name.compare(kActivitiesIterationsKey)) { - activitiesRunIterations_ = toInt32(val); - activitiesOnDemandTimestamp_ = timestamp(); - } else if (!name.compare(kLogVerboseLevelKey)) { - verboseLogLevel_ = toInt32(val); - } else if (!name.compare(kLogVerboseModulesKey)) { - verboseLogModules_ = splitAndTrim(val, ','); - } else if (!name.compare(kActivitiesEnabledKey)) { - activityProfilerEnabled_ = toBool(val); - } else if (!name.compare(kActivitiesLogFileKey)) { - activitiesLogFile_ = val; - activitiesLogUrl_ = fmt::format("file://{}", val); - activitiesOnDemandTimestamp_ = timestamp(); - } else if (!name.compare(kActivitiesMaxGpuBufferSizeKey)) { - activitiesMaxGpuBufferSize_ = toInt32(val) * 1024 * 1024; - } else if (!name.compare(kActivitiesWarmupDurationSecsKey)) { - activitiesWarmupDuration_ = seconds(toInt32(val)); - } else if 
(!name.compare(kActivitiesWarmupIterationsKey)) { - activitiesWarmupIterations_ = toInt32(val); - } - - // Client Interface - else if (!name.compare(kClientInterfaceEnableOpInputsCollection)) { - enableOpInputsCollection_ = toBool(val); - } - - // Common - else if (!name.compare(kRequestTimestampKey)) { - VLOG(0) << kRequestTimestampKey - << " has been deprecated - please use " - << kProfileStartTimeKey; - requestTimestamp_ = handleRequestTimestamp(toInt64(val)); - } else if (!name.compare(kProfileStartTimeKey)) { - profileStartTime_ = - time_point(milliseconds(toInt64(val))); - } else if (!name.compare(kProfileStartIterationKey)) { - profileStartIteration_ = toInt32(val); - } else if (!name.compare(kProfileStartIterationRoundUpKey)) { - profileStartIterationRoundUp_ = toInt32(val); - } else if (!name.compare(kEnableSigUsr2Key)) { - enableSigUsr2_ = toBool(val); - } else if (!name.compare(kEnableIpcFabricKey)) { - enableIpcFabric_ = toBool(val); - } else { - return false; - } - return true; -} - -std::chrono::milliseconds Config::activitiesDurationDefault() const { - return kDefaultActivitiesProfileDurationMSecs; -}; - -void Config::updateActivityProfilerRequestReceivedTime() { - activitiesOnDemandTimestamp_ = system_clock::now(); -} - -void Config::setClientDefaults() { - AbstractConfig::setClientDefaults(); - activitiesLogToMemory_ = true; -} - -void Config::validate( - const time_point& fallbackProfileStartTime) { - if (samplePeriod_.count() == 0) { - LOG(WARNING) << "Sample period must be greater than 0, setting to 1ms"; - samplePeriod_ = milliseconds(1); - } - - if (multiplexPeriod_ < samplePeriod_) { - LOG(WARNING) << "Multiplex period can not be smaller " - << "than sample period"; - LOG(WARNING) << "Setting multiplex period to " << samplePeriod_.count() - << "ms"; - multiplexPeriod_ = samplePeriod_; - } - - if ((multiplexPeriod_ % samplePeriod_).count() != 0) { - LOG(WARNING) << "Multiplex period must be a " - << "multiple of sample period"; - 
multiplexPeriod_ = alignUp(multiplexPeriod_, samplePeriod_); - LOG(WARNING) << "Setting multiplex period to " << multiplexPeriod_.count() - << "ms"; - } - - if ((reportPeriod_ % multiplexPeriod_).count() != 0 || - reportPeriod_.count() == 0) { - LOG(WARNING) << "Report period must be a " - << "multiple of multiplex period"; - reportPeriod_ = alignUp(reportPeriod_, multiplexPeriod_); - LOG(WARNING) << "Setting report period to " << reportPeriod_.count() - << "ms"; - } - - if (samplesPerReport_ < 1) { - LOG(WARNING) << "Samples per report must be in the range " - << "[1, report period / sample period]"; - LOG(WARNING) << "Setting samples per report to 1"; - samplesPerReport_ = 1; - } - - int max_samples_per_report = reportPeriod_ / samplePeriod_; - if (samplesPerReport_ > max_samples_per_report) { - LOG(WARNING) << "Samples per report must be in the range " - << "[1, report period / sample period] ([1, " - << reportPeriod_.count() << "ms / " << samplePeriod_.count() - << "ms = " << max_samples_per_report << "])"; - LOG(WARNING) << "Setting samples per report to " << max_samples_per_report; - samplesPerReport_ = max_samples_per_report; - } - - if (!hasProfileStartTime()) { - VLOG(0) - << "No explicit timestamp has been set. 
" - << "Defaulting it to now + activitiesWarmupDuration with buffer."; - profileStartTime_ = fallbackProfileStartTime + - activitiesWarmupDuration() + kDefaultBufferUntilWarmup; - } - - if (profileStartIterationRoundUp_ == 0) { - // setting to 0 will mess up modulo arithmetic, set it to -1 so it has no effect - LOG(WARNING) << "Profiler start iteration round up should be >= 1."; - profileStartIterationRoundUp_ = -1; - } - - if (profileStartIterationRoundUp_ > 0 && !hasProfileStartIteration()) { - VLOG(0) << "Setting profiler start iteration to 0 so this config is " - << "triggered via iteration count."; - profileStartIteration_ = 0; - } - - if (selectedActivityTypes_.size() == 0) { - selectDefaultActivityTypes(); - } -} - -void Config::setReportPeriod(milliseconds msecs) { - reportPeriod_ = msecs; -} - -void Config::printActivityProfilerConfig(std::ostream& s) const { - s << "Log file: " << activitiesLogFile() << std::endl; - if (hasProfileStartIteration()) { - s << "Trace start Iteration: " << profileStartIteration() << std::endl; - s << "Trace warmup Iterations: " << activitiesWarmupIterations() << std::endl; - s << "Trace profile Iterations: " << activitiesRunIterations() << std::endl; - if (profileStartIterationRoundUp() > 0) { - s << "Trace start iteration roundup : " << profileStartIterationRoundUp() - << std::endl; - } - } else if (hasProfileStartTime()) { - std::time_t t_c = system_clock::to_time_t(requestTimestamp()); - LOG(INFO) << "Trace start time: " - << fmt::format("{:%Y-%m-%d %H:%M:%S}", fmt::localtime(t_c)); - s << "Trace duration: " << activitiesDuration().count() << "ms" - << std::endl; - s << "Warmup duration: " << activitiesWarmupDuration().count() << "s" - << std::endl; - } - - s << "Max GPU buffer size: " << activitiesMaxGpuBufferSize() / 1024 / 1024 - << "MB" << std::endl; - - std::vector activities; - for (const auto& activity : selectedActivityTypes_) { - activities.push_back(toString(activity)); - } - s << "Enabled activities: " - << 
fmt::format("{}", fmt::join(activities, ",")) << std::endl; - - AbstractConfig::printActivityProfilerConfig(s); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.cpp b/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.cpp deleted file mode 100644 index 4080b678d371e98757897d4d7726c159887377e1..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.cpp +++ /dev/null @@ -1,300 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "ConfigLoader.h" - -#ifdef __linux__ -#include -#endif - -#include -#include -#include -#include -#include - -#include "DaemonConfigLoader.h" - -#include "Logger.h" - -using namespace std::chrono; -using std::string; - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -constexpr char kConfigFileEnvVar[] = "KINETO_CONFIG"; -#ifdef __linux__ -constexpr char kConfigFile[] = "/etc/libkineto.conf"; -constexpr char kOnDemandConfigFile[] = "/tmp/libkineto.conf"; -#else -constexpr char kConfigFile[] = "libkineto.conf"; -constexpr char kOnDemandConfigFile[] = "libkineto.conf"; -#endif - -constexpr std::chrono::seconds kConfigUpdateIntervalSecs(300); -constexpr std::chrono::seconds kOnDemandConfigUpdateIntervalSecs(5); - -#ifdef __linux__ -static struct sigaction originalUsr2Handler = {}; -#endif - -// Use SIGUSR2 to initiate profiling. -// Look for an on-demand config file. -// If none is found, default to base config. 
-// Try to not affect existing handlers -static bool hasOriginalSignalHandler() { -#ifdef __linux__ - return originalUsr2Handler.sa_handler != nullptr || - originalUsr2Handler.sa_sigaction != nullptr; -#else - return false; -#endif -} - -static void handle_signal(int signal) { -#ifdef __linux__ - if (signal == SIGUSR2) { - ConfigLoader::instance().handleOnDemandSignal(); - if (hasOriginalSignalHandler()) { - // Invoke original handler and reinstate ours - struct sigaction act; - sigaction(SIGUSR2, &originalUsr2Handler, &act); - raise(SIGUSR2); - sigaction(SIGUSR2, &act, &originalUsr2Handler); - } - } -#endif -} - -static void setupSignalHandler(bool enableSigUsr2) { -#ifdef __linux__ - if (enableSigUsr2) { - struct sigaction act = {}; - act.sa_handler = &handle_signal; - act.sa_flags = SA_NODEFER; - if (sigaction(SIGUSR2, &act, &originalUsr2Handler) < 0) { - PLOG(ERROR) << "Failed to register SIGUSR2 handler"; - } - if (originalUsr2Handler.sa_handler == &handle_signal) { - originalUsr2Handler = {}; - } - } else if (hasOriginalSignalHandler()) { - sigaction(SIGUSR2, &originalUsr2Handler, nullptr); - originalUsr2Handler = {}; - } -#endif -} - -// return an empty string if reading gets any errors. Otherwise a config string. -static std::string readConfigFromConfigFile(const char* filename) { - // Read whole file into a string. 
- std::ifstream file(filename); - std::string conf; - try { - conf.assign( - std::istreambuf_iterator(file), std::istreambuf_iterator()); - } catch (std::exception& e) { - VLOG(0) << "Error reading " << filename << ": " - << e.what(); - conf = ""; - } - return conf; -} - -static std::function()>& -daemonConfigLoaderFactory() { - static std::function()> factory = nullptr; - return factory; -} - -void ConfigLoader::setDaemonConfigLoaderFactory( - std::function()> factory) { - daemonConfigLoaderFactory() = factory; -} - -ConfigLoader& ConfigLoader::instance() { - static ConfigLoader config_loader; - return config_loader; -} - -// return an empty string if polling gets any errors. Otherwise a config string. -std::string ConfigLoader::readOnDemandConfigFromDaemon( - time_point now) { - if (!daemonConfigLoader_) { - return ""; - } - bool events = canHandlerAcceptConfig(ConfigKind::EventProfiler); - bool activities = canHandlerAcceptConfig(ConfigKind::ActivityProfiler); - return daemonConfigLoader_->readOnDemandConfig(events, activities); -} - -int ConfigLoader::contextCountForGpu(uint32_t device) { - if (!daemonConfigLoader_) { - // FIXME: Throw error? 
- return 0; - } - return daemonConfigLoader_->gpuContextCount(device); -} - -ConfigLoader::ConfigLoader() - : configUpdateIntervalSecs_(kConfigUpdateIntervalSecs), - onDemandConfigUpdateIntervalSecs_(kOnDemandConfigUpdateIntervalSecs), - stopFlag_(false), - onDemandSignal_(false) { -} - -void ConfigLoader::startThread() { - if (!updateThread_) { - // Create default base config here - at this point static initializers - // of extensions should have run and registered all config feature factories - std::lock_guard lock(configLock_); - if (!config_) { - config_ = std::make_unique(); - } - updateThread_ = - std::make_unique(&ConfigLoader::updateConfigThread, this); - } -} - -ConfigLoader::~ConfigLoader() { - if (updateThread_) { - stopFlag_ = true; - { - std::lock_guard lock(updateThreadMutex_); - updateThreadCondVar_.notify_one(); - } - updateThread_->join(); - } -#if !USE_GOOGLE_LOG - Logger::clearLoggerObservers(); -#endif // !USE_GOOGLE_LOG -} - -void ConfigLoader::handleOnDemandSignal() { - onDemandSignal_ = true; - { - std::lock_guard lock(updateThreadMutex_); - updateThreadCondVar_.notify_one(); - } -} - -const char* ConfigLoader::configFileName() { - if (!configFileName_) { - configFileName_ = getenv(kConfigFileEnvVar); - if (configFileName_ == nullptr) { - configFileName_ = kConfigFile; - } - } - return configFileName_; -} - -DaemonConfigLoader* ConfigLoader::daemonConfigLoader() { - if (!daemonConfigLoader_ && daemonConfigLoaderFactory()) { - daemonConfigLoader_ = daemonConfigLoaderFactory()(); - daemonConfigLoader_->setCommunicationFabric(config_->ipcFabricEnabled()); - } - return daemonConfigLoader_.get(); -} - -void ConfigLoader::updateBaseConfig() { - // First try reading local config file - // If that fails, read from daemon - // TODO: Invert these once daemon path fully rolled out - std::string config_str = readConfigFromConfigFile(configFileName()); - if (config_str.empty() && daemonConfigLoader()) { - // If local config file was not successfully 
loaded (e.g. not found) - // then try the daemon - config_str = daemonConfigLoader()->readBaseConfig(); - } - if (config_str != config_->source()) { - std::lock_guard lock(configLock_); - config_ = std::make_unique<Config>(); - config_->parse(config_str); - if (daemonConfigLoader()) { - daemonConfigLoader()->setCommunicationFabric(config_->ipcFabricEnabled()); - } - setupSignalHandler(config_->sigUsr2Enabled()); - SET_LOG_VERBOSITY_LEVEL( - config_->verboseLogLevel(), - config_->verboseLogModules()); - VLOG(0) << "Detected base config change"; - } -} - -void ConfigLoader::configureFromSignal( - time_point<system_clock> now, - Config& config) { - LOG(INFO) << "Received on-demand profiling signal, " - << "reading config from " << kOnDemandConfigFile; - // Reset start time to 0 in order to compute new default start time - const std::string config_str = "PROFILE_START_TIME=0\n" - + readConfigFromConfigFile(kOnDemandConfigFile); - config.parse(config_str); - config.setSignalDefaults(); - notifyHandlers(config); -} - -void ConfigLoader::configureFromDaemon( - time_point<system_clock> now, - Config& config) { - const std::string config_str = readOnDemandConfigFromDaemon(now); - if (config_str.empty()) { - return; - } - - LOG(INFO) << "Received config from dyno:\n" << config_str; - config.parse(config_str); - notifyHandlers(config); -} - -void ConfigLoader::updateConfigThread() { - auto now = system_clock::now(); - auto next_config_load_time = now; - auto next_on_demand_load_time = now + onDemandConfigUpdateIntervalSecs_; - seconds interval = configUpdateIntervalSecs_; - if (interval > onDemandConfigUpdateIntervalSecs_) { - interval = onDemandConfigUpdateIntervalSecs_; - } - auto onDemandConfig = std::make_unique<Config>(); - - // This can potentially sleep for long periods of time, so allow - // the destructor to wake it to avoid a 5-minute long destruct period.
- for (;;) { - { - std::unique_lock lock(updateThreadMutex_); - updateThreadCondVar_.wait_for(lock, interval); - } - if (stopFlag_) { - break; - } - now = system_clock::now(); - if (now > next_config_load_time) { - updateBaseConfig(); - next_config_load_time = now + configUpdateIntervalSecs_; - } - if (onDemandSignal_.exchange(false)) { - onDemandConfig = config_->clone(); - configureFromSignal(now, *onDemandConfig); - } else if (now > next_on_demand_load_time) { - onDemandConfig = std::make_unique(); - configureFromDaemon(now, *onDemandConfig); - next_on_demand_load_time = now + onDemandConfigUpdateIntervalSecs_; - } - if (onDemandConfig->verboseLogLevel() >= 0) { - LOG(INFO) << "Setting verbose level to " - << onDemandConfig->verboseLogLevel() - << " from on-demand config"; - SET_LOG_VERBOSITY_LEVEL( - onDemandConfig->verboseLogLevel(), - onDemandConfig->verboseLogModules()); - } - } -} - -bool ConfigLoader::hasNewConfig(const Config& oldConfig) { - std::lock_guard lock(configLock_); - return config_->timestamp() > oldConfig.timestamp(); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.h b/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.h deleted file mode 100644 index 4ce3468e48db116b2a40d992f000a3af1338e70a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ConfigLoader.h +++ /dev/null @@ -1,147 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include -#include -#include - -#include "Config.h" - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ILoggerObserver.h" - -namespace libkineto { - class LibkinetoApi; -} - -namespace KINETO_NAMESPACE { - -using namespace libkineto; -class DaemonConfigLoader; - -class ConfigLoader { - public: - - static ConfigLoader& instance(); - - enum ConfigKind { - ActivityProfiler = 0, - EventProfiler, - NumConfigKinds - }; - - struct ConfigHandler { - virtual ~ConfigHandler() {} - virtual bool canAcceptConfig() = 0; - virtual void acceptConfig(const Config& cfg) = 0; - }; - - void addHandler(ConfigKind kind, ConfigHandler* handler) { - std::lock_guard lock(updateThreadMutex_); - handlers_[kind].push_back(handler); - startThread(); - } - - void removeHandler(ConfigKind kind, ConfigHandler* handler) { - std::lock_guard lock(updateThreadMutex_); - auto it = std::find( - handlers_[kind].begin(), handlers_[kind].end(), handler); - if (it != handlers_[kind].end()) { - handlers_[kind].erase(it); - } - } - - void notifyHandlers(const Config& cfg) { - std::lock_guard lock(updateThreadMutex_); - for (auto& key_val : handlers_) { - for (ConfigHandler* handler : key_val.second) { - handler->acceptConfig(cfg); - } - } - } - - bool canHandlerAcceptConfig(ConfigKind kind) { - std::lock_guard lock(updateThreadMutex_); - for (ConfigHandler* handler : handlers_[kind]) { - if (!handler->canAcceptConfig()) { - return false; - } - } - return true; - } - - void initBaseConfig() { - bool init = false; - { - std::lock_guard lock(configLock_); - init = !config_ || config_->source().empty(); - } - if (init) { - updateBaseConfig(); - } - } - - inline std::unique_ptr getConfigCopy() { - std::lock_guard lock(configLock_); - return config_->clone(); - } - - bool hasNewConfig(const Config& oldConfig); - int contextCountForGpu(uint32_t gpu); - - void handleOnDemandSignal(); - - static void 
setDaemonConfigLoaderFactory( - std::function()> factory); - - private: - ConfigLoader(); - ~ConfigLoader(); - - const char* configFileName(); - DaemonConfigLoader* daemonConfigLoader(); - - void startThread(); - void updateConfigThread(); - void updateBaseConfig(); - - // Create configuration when receiving SIGUSR2 - void configureFromSignal( - std::chrono::time_point now, - Config& config); - - // Create configuration when receiving request from a daemon - void configureFromDaemon( - std::chrono::time_point now, - Config& config); - - std::string readOnDemandConfigFromDaemon( - std::chrono::time_point now); - - std::mutex configLock_; - std::atomic configFileName_{nullptr}; - std::unique_ptr config_; - std::unique_ptr daemonConfigLoader_; - std::map> handlers_; - - std::chrono::seconds configUpdateIntervalSecs_; - std::chrono::seconds onDemandConfigUpdateIntervalSecs_; - std::unique_ptr updateThread_; - std::condition_variable updateThreadCondVar_; - std::mutex updateThreadMutex_; - std::atomic_bool stopFlag_{false}; - std::atomic_bool onDemandSignal_{false}; - -#if !USE_GOOGLE_LOG - std::unique_ptr> loggerObservers_; - std::mutex loggerObserversMutex_; -#endif // !USE_GOOGLE_LOG -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.cpp b/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.cpp deleted file mode 100644 index 1e909d5f9cfda13b95cc4abab547d964fe47b48a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.cpp +++ /dev/null @@ -1,130 +0,0 @@ -/* - * Copyright (c) Kineto Contributors - * All rights reserved. - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of this source tree. 
- */ - -#include "CudaDeviceProperties.h" - -#include -#include - -#include -#include - -#include "Logger.h" - -namespace KINETO_NAMESPACE { - -static const std::vector createDeviceProps() { - std::vector props; - int device_count; - cudaError_t error_id = cudaGetDeviceCount(&device_count); - // Return empty vector if error. - if (error_id != cudaSuccess) { - LOG(ERROR) << "cudaGetDeviceCount failed with code " << error_id; - return {}; - } - VLOG(0) << "Device count is " << device_count; - for (size_t i = 0; i < device_count; ++i) { - cudaDeviceProp prop; - error_id = cudaGetDeviceProperties(&prop, i); - // Return empty vector if any device property fail to get. - if (error_id != cudaSuccess) { - LOG(ERROR) << "cudaGetDeviceProperties failed with " << error_id; - return {}; - } - props.push_back(prop); - LOGGER_OBSERVER_ADD_DEVICE(i); - } - return props; -} - -static const std::vector& deviceProps() { - static const std::vector props = createDeviceProps(); - return props; -} - -static const std::string createDevicePropertiesJson( - size_t id, const cudaDeviceProp& props) { - return fmt::format(R"JSON( - {{ - "id": {}, "name": "{}", "totalGlobalMem": {}, - "computeMajor": {}, "computeMinor": {}, - "maxThreadsPerBlock": {}, "maxThreadsPerMultiprocessor": {}, - "regsPerBlock": {}, "regsPerMultiprocessor": {}, "warpSize": {}, - "sharedMemPerBlock": {}, "sharedMemPerMultiprocessor": {}, - "numSms": {}, "sharedMemPerBlockOptin": {} - }})JSON", - id, props.name, props.totalGlobalMem, - props.major, props.minor, - props.maxThreadsPerBlock, props.maxThreadsPerMultiProcessor, - props.regsPerBlock, props.regsPerMultiprocessor, props.warpSize, - props.sharedMemPerBlock, props.sharedMemPerMultiprocessor, - props.multiProcessorCount, props.sharedMemPerBlockOptin); -} - -static const std::string createDevicePropertiesJson() { - std::vector jsonProps; - const auto& props = deviceProps(); - for (size_t i = 0; i < props.size(); i++) { - 
jsonProps.push_back(createDevicePropertiesJson(i, props[i])); - } - return fmt::format("{}", fmt::join(jsonProps, ",")); -} - -const std::string& devicePropertiesJson() { - static std::string devicePropsJson = createDevicePropertiesJson(); - return devicePropsJson; -} - -int smCount(uint32_t deviceId) { - const std::vector &props = deviceProps(); - return deviceId >= props.size() ? 0 : - props[deviceId].multiProcessorCount; -} - -float kernelOccupancy( - uint32_t deviceId, - uint16_t registersPerThread, - int32_t staticSharedMemory, - int32_t dynamicSharedMemory, - int32_t blockX, - int32_t blockY, - int32_t blockZ, - float blocksPerSm) { - // Calculate occupancy - float occupancy = -1.0; - const std::vector &props = deviceProps(); - if (deviceId < props.size()) { - cudaOccFuncAttributes occFuncAttr; - occFuncAttr.maxThreadsPerBlock = INT_MAX; - occFuncAttr.numRegs = registersPerThread; - occFuncAttr.sharedSizeBytes = staticSharedMemory; - occFuncAttr.partitionedGCConfig = PARTITIONED_GC_OFF; - occFuncAttr.shmemLimitConfig = FUNC_SHMEM_LIMIT_DEFAULT; - occFuncAttr.maxDynamicSharedSizeBytes = 0; - const cudaOccDeviceState occDeviceState = {}; - int blockSize = blockX * blockY * blockZ; - size_t dynamicSmemSize = dynamicSharedMemory; - cudaOccResult occ_result; - cudaOccDeviceProp prop(props[deviceId]); - cudaOccError status = cudaOccMaxActiveBlocksPerMultiprocessor( - &occ_result, &prop, &occFuncAttr, &occDeviceState, - blockSize, dynamicSmemSize); - if (status == CUDA_OCC_SUCCESS) { - if (occ_result.activeBlocksPerMultiprocessor < blocksPerSm) { - blocksPerSm = occ_result.activeBlocksPerMultiprocessor; - } - occupancy = blocksPerSm * blockSize / - (float) props[deviceId].maxThreadsPerMultiProcessor; - } else { - LOG_EVERY_N(ERROR, 1000) << "Failed to calculate occupancy, status = " - << status; - } - } - return occupancy; -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.h 
b/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.h deleted file mode 100644 index b731fde0c2aab4c9bd3e97f475d204dad02986e7..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CudaDeviceProperties.h +++ /dev/null @@ -1,31 +0,0 @@ -/* - * Copyright (c) Kineto Contributors - * All rights reserved. - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of this source tree. - */ - -#pragma once - -#include -#include - -namespace KINETO_NAMESPACE { - -int smCount(uint32_t deviceId); - -// Return estimated achieved occupancy for a kernel -float kernelOccupancy( - uint32_t deviceId, - uint16_t registersPerThread, - int32_t staticSharedMemory, - int32_t dynamicSharedMemory, - int32_t blockX, - int32_t blockY, - int32_t blockZ, - float blocks_per_sm); - -// Return compute properties for each device as a json string -const std::string& devicePropertiesJson(); - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.h b/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.h deleted file mode 100644 index 09c29504060ecbbac609aa2d021ff643f45c143e..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.h +++ /dev/null @@ -1,114 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -#include "ITraceActivity.h" -#include "CuptiActivityPlatform.h" -#include "ThreadUtil.h" -#include "cupti_strings.h" - -namespace libkineto { - class ActivityLogger; -} - -namespace KINETO_NAMESPACE { - -using namespace libkineto; -struct TraceSpan; - -// These classes wrap the various CUPTI activity types -// into subclasses of ITraceActivity so that they can all be accessed -// using the ITraceActivity interface and logged via ActivityLogger. 
- -// Abstract base class, templated on Cupti activity type -template -struct CuptiActivity : public ITraceActivity { - explicit CuptiActivity(const T* activity, const ITraceActivity* linked) - : activity_(*activity), linked_(linked) {} - int64_t timestamp() const override { - return nsToUs(unixEpochTimestamp(activity_.start)); - } - int64_t duration() const override { - return nsToUs(activity_.end - activity_.start); - } - // TODO(T107507796): Deprecate ITraceActivity - int64_t correlationId() const override {return 0;} - int32_t getThreadId() const override {return 0;} - const ITraceActivity* linkedActivity() const override {return linked_;} - int flowType() const override {return kLinkAsyncCpuGpu;} - int flowId() const override {return correlationId();} - const T& raw() const {return activity_;} - const TraceSpan* traceSpan() const override {return nullptr;} - - protected: - const T& activity_; - const ITraceActivity* linked_{nullptr}; -}; - -// CUpti_ActivityAPI - CUDA runtime activities -struct RuntimeActivity : public CuptiActivity { - explicit RuntimeActivity( - const CUpti_ActivityAPI* activity, - const ITraceActivity* linked, - int32_t threadId) - : CuptiActivity(activity, linked), threadId_(threadId) {} - int64_t correlationId() const override {return activity_.correlationId;} - int64_t deviceId() const override {return processId();} - int64_t resourceId() const override {return threadId_;} - ActivityType type() const override {return ActivityType::CUDA_RUNTIME;} - bool flowStart() const override; - const std::string name() const override {return runtimeCbidName(activity_.cbid);} - void log(ActivityLogger& logger) const override; - const std::string metadataJson() const override; - - private: - const int32_t threadId_; -}; - -// CUpti_ActivityAPI - CUDA runtime activities -struct OverheadActivity : public CuptiActivity { - explicit OverheadActivity( - const CUpti_ActivityOverhead* activity, - const ITraceActivity* linked, - int32_t threadId=0) - : 
CuptiActivity(activity, linked), threadId_(threadId) {} - - int64_t timestamp() const override { - return nsToUs(unixEpochTimestamp(activity_.start)); - } - int64_t duration() const override { - return nsToUs(activity_.end - activity_.start); - } - // TODO: Update this with PID ordering - int64_t deviceId() const override {return -1;} - int64_t resourceId() const override {return threadId_;} - ActivityType type() const override {return ActivityType::OVERHEAD;} - bool flowStart() const override; - const std::string name() const override {return overheadKindString(activity_.overheadKind);} - void log(ActivityLogger& logger) const override; - const std::string metadataJson() const override; - - private: - const int32_t threadId_; -}; - -// Base class for GPU activities. -// Can also be instantiated directly. -template -struct GpuActivity : public CuptiActivity { - explicit GpuActivity(const T* activity, const ITraceActivity* linked) - : CuptiActivity(activity, linked) {} - int64_t correlationId() const override {return raw().correlationId;} - int64_t deviceId() const override {return raw().deviceId;} - int64_t resourceId() const override {return raw().streamId;} - ActivityType type() const override; - bool flowStart() const override {return false;} - const std::string name() const override; - void log(ActivityLogger& logger) const override; - const std::string metadataJson() const override; - const T& raw() const {return CuptiActivity::raw();} -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.tpp b/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.tpp deleted file mode 100644 index 1ff2dafe06b0016ce7b904ef4b55e047c69bcc1c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivity.tpp +++ /dev/null @@ -1,111 +0,0 @@ - /* - * Copyright (c) Facebook, Inc. and its affiliates. - * All rights reserved. 
- * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of this source tree. - */ - -#include "CuptiActivity.h" - -#include - -#include "Demangle.h" -#include "output_base.h" - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -template<> -inline const std::string GpuActivity::name() const { - return demangle(raw().name); -} - -template<> -inline ActivityType GpuActivity::type() const { - return ActivityType::CONCURRENT_KERNEL; -} - -static inline std::string memcpyName(uint8_t kind, uint8_t src, uint8_t dst) { - return fmt::format( - "Memcpy {} ({} -> {})", - memcpyKindString((CUpti_ActivityMemcpyKind)kind), - memoryKindString((CUpti_ActivityMemoryKind)src), - memoryKindString((CUpti_ActivityMemoryKind)dst)); -} - -template<> -inline ActivityType GpuActivity::type() const { - return ActivityType::GPU_MEMCPY; -} - -template<> -inline const std::string GpuActivity::name() const { - return memcpyName(raw().copyKind, raw().srcKind, raw().dstKind); -} - -template<> -inline ActivityType GpuActivity::type() const { - return ActivityType::GPU_MEMCPY; -} - -template<> -inline const std::string GpuActivity::name() const { - return memcpyName(raw().copyKind, raw().srcKind, raw().dstKind); -} - -template<> -inline const std::string GpuActivity::name() const { - const char* memory_kind = - memoryKindString((CUpti_ActivityMemoryKind)raw().memoryKind); - return fmt::format("Memset ({})", memory_kind); -} - -template<> -inline ActivityType GpuActivity::type() const { - return ActivityType::GPU_MEMSET; -} - -inline void RuntimeActivity::log(ActivityLogger& logger) const { - logger.handleActivity(*this); -} - -inline void OverheadActivity::log(ActivityLogger& logger) const { - logger.handleActivity(*this); -} - -inline bool OverheadActivity::flowStart() const { - return false; -} - -inline const std::string OverheadActivity::metadataJson() const { - return ""; -} - -template -inline void 
GpuActivity::log(ActivityLogger& logger) const { - logger.handleGpuActivity(*this); -} - -inline bool RuntimeActivity::flowStart() const { - return activity_.cbid == CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000 || - (activity_.cbid >= CUPTI_RUNTIME_TRACE_CBID_cudaMemcpy_v3020 && - activity_.cbid <= CUPTI_RUNTIME_TRACE_CBID_cudaMemset2DAsync_v3020) || - activity_.cbid == - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchCooperativeKernel_v9000 || - activity_.cbid == - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchCooperativeKernelMultiDevice_v9000; -} - -inline const std::string RuntimeActivity::metadataJson() const { - return fmt::format(R"JSON( - "cbid": {}, "correlation": {})JSON", - activity_.cbid, activity_.correlationId); -} - -template -inline const std::string GpuActivity::metadataJson() const { - return ""; -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.cpp deleted file mode 100644 index 5718bed2f89b06cc702d1b82976cd42e5fceebd0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.cpp +++ /dev/null @@ -1,343 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "CuptiActivityApi.h" - -#include -#include - -#include "cupti_call.h" -#include "Logger.h" - -using namespace std::chrono; - -namespace KINETO_NAMESPACE { - -// TODO: do we want this to be configurable? -// Set to 2MB to avoid constantly creating buffers (espeically for networks -// that has many small memcpy such as sparseNN) -// Consider putting this on huge pages? 
-constexpr size_t kBufSize(2 * 1024 * 1024); - -CuptiActivityApi& CuptiActivityApi::singleton() { - static CuptiActivityApi instance; - return instance; -} - -void CuptiActivityApi::pushCorrelationID(int id, CorrelationFlowType type) { -#ifdef HAS_CUPTI - if (!singleton().externalCorrelationEnabled_) { - return; - } - VLOG(2) << "pushCorrelationID(" << id << ")"; - switch(type) { - case Default: - CUPTI_CALL(cuptiActivityPushExternalCorrelationId( - CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM0, id)); - break; - case User: - CUPTI_CALL(cuptiActivityPushExternalCorrelationId( - CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1, id)); - } -#endif -} - -void CuptiActivityApi::popCorrelationID(CorrelationFlowType type) { -#ifdef HAS_CUPTI - if (!singleton().externalCorrelationEnabled_) { - return; - } - switch(type) { - case Default: - CUPTI_CALL(cuptiActivityPopExternalCorrelationId( - CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM0, nullptr)); - break; - case User: - CUPTI_CALL(cuptiActivityPopExternalCorrelationId( - CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1, nullptr)); - } -#endif -} - -static int getSMCount() { -#ifdef HAS_CUPTI - // There may be a simpler way to get the number of SMs.... 
- // Look for domain_d - this has 80 instances on Volta and - // 56 instances on Pascal, corresponding to the number of SMs - // FIXME: This does not work on Turing and later - uint32_t domainCount{0}; - CUPTI_CALL(cuptiDeviceGetNumEventDomains(0, &domainCount)); - std::vector ids(domainCount); - size_t sz = sizeof(CUpti_EventDomainID) * domainCount; - CUPTI_CALL(cuptiDeviceEnumEventDomains(0, &sz, ids.data())); - for (CUpti_EventDomainID id : ids) { - char name[16]; - name[0] = '\0'; - sz = sizeof(name); - CUPTI_CALL(cuptiEventDomainGetAttribute( - id, CUPTI_EVENT_DOMAIN_ATTR_NAME, &sz, name)); - if (strncmp(name, "domain_d", sz) == 0) { - uint32_t count{0}; - sz = sizeof(count); - CUPTI_CALL(cuptiDeviceGetEventDomainAttribute( - 0, id, CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT, &sz, &count)); - return count; - } - } -#endif - - return -1; -} - -int CuptiActivityApi::smCount() { - static int sm_count = getSMCount(); - return sm_count; -} - -static bool nextActivityRecord( - uint8_t* buffer, - size_t valid_size, - CUpti_Activity*& record) { -#ifdef HAS_CUPTI - CUptiResult status = CUPTI_CALL_NOWARN( - cuptiActivityGetNextRecord(buffer, valid_size, &record)); - if (status != CUPTI_SUCCESS) { - if (status != CUPTI_ERROR_MAX_LIMIT_REACHED) { - CUPTI_CALL(status); - } - record = nullptr; - } -#endif - return record != nullptr; -} - -void CuptiActivityApi::setMaxBufferSize(int size) { - maxGpuBufferCount_ = 1 + size / kBufSize; -} - -void CuptiActivityApi::forceLoadCupti() { -#ifdef HAS_CUPTI - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL)); -#endif -} - -#ifdef HAS_CUPTI -void CUPTIAPI CuptiActivityApi::bufferRequestedTrampoline( - uint8_t** buffer, - size_t* size, - size_t* maxNumRecords) { - singleton().bufferRequested(buffer, size, maxNumRecords); -} - -void CuptiActivityApi::bufferRequested( - uint8_t** buffer, size_t* size, size_t* maxNumRecords) { - std::lock_guard guard(mutex_); - if (allocatedGpuTraceBuffers_.size() >= 
maxGpuBufferCount_) { - stopCollection = true; - LOG(WARNING) << "Exceeded max GPU buffer count (" - << allocatedGpuTraceBuffers_.size() - << " > " << maxGpuBufferCount_ - << ") - terminating tracing"; - } - - auto buf = std::make_unique(kBufSize); - *buffer = buf->data(); - *size = kBufSize; - - allocatedGpuTraceBuffers_[*buffer] = std::move(buf); - - *maxNumRecords = 0; -} -#endif - -std::unique_ptr -CuptiActivityApi::activityBuffers() { - { - std::lock_guard guard(mutex_); - if (allocatedGpuTraceBuffers_.empty()) { - return nullptr; - } - } - -#ifdef HAS_CUPTI - VLOG(1) << "Flushing GPU activity buffers"; - time_point t1; - if (VLOG_IS_ON(1)) { - t1 = system_clock::now(); - } - // Can't hold mutex_ during this call, since bufferCompleted - // will be called by libcupti and mutex_ is acquired there. - CUPTI_CALL(cuptiActivityFlushAll(CUPTI_ACTIVITY_FLAG_FLUSH_FORCED)); - if (VLOG_IS_ON(1)) { - flushOverhead = - duration_cast(system_clock::now() - t1).count(); - } -#endif - std::lock_guard guard(mutex_); - // Transfer ownership of buffers to caller. A new map is created on-demand. 
- return std::move(readyGpuTraceBuffers_); -} - -#ifdef HAS_CUPTI -int CuptiActivityApi::processActivitiesForBuffer( - uint8_t* buf, - size_t validSize, - std::function handler) { - int count = 0; - if (buf && validSize) { - CUpti_Activity* record{nullptr}; - while ((nextActivityRecord(buf, validSize, record))) { - handler(record); - ++count; - } - } - return count; -} -#endif - -const std::pair CuptiActivityApi::processActivities( - CuptiActivityBufferMap& buffers, - std::function handler) { - std::pair res{0, 0}; -#ifdef HAS_CUPTI - for (auto& pair : buffers) { - // No lock needed - only accessed from this thread - auto& buf = pair.second; - res.first += processActivitiesForBuffer(buf->data(), buf->size(), handler); - res.second += buf->size(); - } -#endif - return res; -} - -void CuptiActivityApi::clearActivities() { - { - std::lock_guard guard(mutex_); - if (allocatedGpuTraceBuffers_.empty()) { - return; - } - } - // Can't hold mutex_ during this call, since bufferCompleted - // will be called by libcupti and mutex_ is acquired there. -#ifdef HAS_CUPTI - CUPTI_CALL(cuptiActivityFlushAll(0)); -#endif - // FIXME: We might want to make sure we reuse - // the same memory during warmup and tracing. - // Also, try to use the amount of memory required - // for active tracing during warmup. 
- std::lock_guard guard(mutex_); - // Throw away ready buffers as a result of above flush - readyGpuTraceBuffers_ = nullptr; -} - -#ifdef HAS_CUPTI -void CUPTIAPI CuptiActivityApi::bufferCompletedTrampoline( - CUcontext ctx, - uint32_t streamId, - uint8_t* buffer, - size_t /* unused */, - size_t validSize) { - singleton().bufferCompleted(ctx, streamId, buffer, 0, validSize); -} - -void CuptiActivityApi::bufferCompleted( - CUcontext ctx, - uint32_t streamId, - uint8_t* buffer, - size_t /* unused */, - size_t validSize) { - - std::lock_guard guard(mutex_); - auto it = allocatedGpuTraceBuffers_.find(buffer); - if (it == allocatedGpuTraceBuffers_.end()) { - LOG(ERROR) << "bufferCompleted called with unknown buffer: " - << (void*) buffer; - return; - } - - if (!readyGpuTraceBuffers_) { - readyGpuTraceBuffers_ = std::make_unique(); - } - // Set valid size of buffer before moving to ready map - it->second->setSize(validSize); - (*readyGpuTraceBuffers_)[it->first] = std::move(it->second); - allocatedGpuTraceBuffers_.erase(it); - - // report any records dropped from the queue; to avoid unnecessary cupti - // API calls, we make it report only in verbose mode (it doesn't happen - // often in our testing anyways) - if (VLOG_IS_ON(1)) { - size_t dropped = 0; - CUPTI_CALL(cuptiActivityGetNumDroppedRecords(ctx, streamId, &dropped)); - if (dropped != 0) { - LOG(WARNING) << "Dropped " << dropped << " activity records"; - } - } -} -#endif - -void CuptiActivityApi::enableCuptiActivities( - const std::set& selected_activities) { -#ifdef HAS_CUPTI - static bool registered = false; - if (!registered) { - CUPTI_CALL( - cuptiActivityRegisterCallbacks(bufferRequestedTrampoline, bufferCompletedTrampoline)); - } - - externalCorrelationEnabled_ = false; - for (const auto& activity : selected_activities) { - if (activity == ActivityType::GPU_MEMCPY) { - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_MEMCPY)); - } - if (activity == ActivityType::GPU_MEMSET) { - 
CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_MEMSET)); - } - if (activity == ActivityType::CONCURRENT_KERNEL) { - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL)); - } - if (activity == ActivityType::EXTERNAL_CORRELATION) { - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION)); - externalCorrelationEnabled_ = true; - } - if (activity == ActivityType::CUDA_RUNTIME) { - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_RUNTIME)); - } - if (activity == ActivityType::OVERHEAD) { - CUPTI_CALL(cuptiActivityEnable(CUPTI_ACTIVITY_KIND_OVERHEAD)); - } - } -#endif - - // Explicitly enabled, so reset this flag if set - stopCollection = false; -} - -void CuptiActivityApi::disableCuptiActivities( - const std::set& selected_activities) { -#ifdef HAS_CUPTI - for (const auto& activity : selected_activities) { - if (activity == ActivityType::GPU_MEMCPY) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_MEMCPY)); - } - if (activity == ActivityType::GPU_MEMSET) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_MEMSET)); - } - if (activity == ActivityType::CONCURRENT_KERNEL) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL)); - } - if (activity == ActivityType::EXTERNAL_CORRELATION) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION)); - } - if (activity == ActivityType::CUDA_RUNTIME) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_RUNTIME)); - } - if (activity == ActivityType::OVERHEAD) { - CUPTI_CALL(cuptiActivityDisable(CUPTI_ACTIVITY_KIND_OVERHEAD)); - } - } - externalCorrelationEnabled_ = false; -#endif -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.h b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.h deleted file mode 100644 index 92af51ecac9ec99181c4726c3849894de9e32b33..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityApi.h +++ 
/dev/null @@ -1,100 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include -#endif - -#include "ActivityType.h" -#include "CuptiActivityBuffer.h" - - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -#ifndef HAS_CUPTI -using CUpti_Activity = void; -#endif - -class CuptiActivityApi { - public: - enum CorrelationFlowType { - Default, - User - }; - - CuptiActivityApi() = default; - CuptiActivityApi(const CuptiActivityApi&) = delete; - CuptiActivityApi& operator=(const CuptiActivityApi&) = delete; - - virtual ~CuptiActivityApi() {} - - static CuptiActivityApi& singleton(); - - virtual int smCount(); - static void pushCorrelationID(int id, CorrelationFlowType type); - static void popCorrelationID(CorrelationFlowType type); - - void enableCuptiActivities( - const std::set& selected_activities); - void disableCuptiActivities( - const std::set& selected_activities); - void clearActivities(); - - virtual std::unique_ptr activityBuffers(); - - virtual const std::pair processActivities( - CuptiActivityBufferMap&, - std::function handler); - - void setMaxBufferSize(int size); - - std::atomic_bool stopCollection{false}; - int64_t flushOverhead{0}; - - static void forceLoadCupti(); - - private: -#ifdef HAS_CUPTI - int processActivitiesForBuffer( - uint8_t* buf, - size_t validSize, - std::function handler); - static void CUPTIAPI - bufferRequestedTrampoline(uint8_t** buffer, size_t* size, size_t* maxNumRecords); - static void CUPTIAPI bufferCompletedTrampoline( - CUcontext ctx, - uint32_t streamId, - uint8_t* buffer, - size_t /* unused */, - size_t validSize); -#endif // HAS_CUPTI - - int maxGpuBufferCount_{0}; - CuptiActivityBufferMap allocatedGpuTraceBuffers_; - std::unique_ptr readyGpuTraceBuffers_; - std::mutex mutex_; - bool externalCorrelationEnabled_{false}; - - protected: -#ifdef HAS_CUPTI - void bufferRequested(uint8_t** 
buffer, size_t* size, size_t* maxNumRecords); - void bufferCompleted( - CUcontext ctx, - uint32_t streamId, - uint8_t* buffer, - size_t /* unused */, - size_t validSize); -#endif -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityBuffer.h b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityBuffer.h deleted file mode 100644 index 1c3fbef62c8d8f42ff5da1718e20315cc1ba95d5..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityBuffer.h +++ /dev/null @@ -1,51 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include -#include -#include - -#include "ITraceActivity.h" - -namespace KINETO_NAMESPACE { - -class CuptiActivityBuffer { - public: - explicit CuptiActivityBuffer(size_t size) : size_(size) { - buf_.reserve(size); - } - CuptiActivityBuffer() = delete; - CuptiActivityBuffer& operator=(const CuptiActivityBuffer&) = delete; - CuptiActivityBuffer(CuptiActivityBuffer&&) = default; - CuptiActivityBuffer& operator=(CuptiActivityBuffer&&) = default; - - size_t size() const { - return size_; - } - - void setSize(size_t size) { - assert(size <= buf_.capacity()); - size_ = size; - } - - uint8_t* data() { - return buf_.data(); - } - - private: - - std::vector buf_; - size_t size_; - - std::vector> wrappers_; -}; - -using CuptiActivityBufferMap = - std::map>; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.cpp deleted file mode 100644 index fa2ef2f3a8c9cbb7f10567c158d6ee3e8e26eed0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.cpp +++ /dev/null @@ -1,31 +0,0 @@ -#include - -namespace chrono = std::chrono; - -namespace KINETO_NAMESPACE { - -#ifdef _WIN32 -uint64_t epochs_diff() { - // 
On Windows, steady_clock wraps the QueryPerformanceCounter function. - // https://docs.microsoft.com/en-us/cpp/standard-library/steady-clock-struct?view=msvc-160 - auto steady = - chrono::time_point_cast(chrono::steady_clock::now()); - auto system = - chrono::time_point_cast(chrono::system_clock::now()); - - auto time_since_unix = system.time_since_epoch().count(); - auto time_since_boot = steady.time_since_epoch().count(); - return time_since_unix - time_since_boot; -} - -uint64_t unixEpochTimestamp(uint64_t ts) { - static uint64_t diff = epochs_diff(); - return ts + diff; -} -#else -uint64_t unixEpochTimestamp(uint64_t ts) { - return ts; -} -#endif // _WIN32 - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.h b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.h deleted file mode 100644 index 78de8373d5fe391d48edffc897aff6893aa6f54f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityPlatform.h +++ /dev/null @@ -1,12 +0,0 @@ -#pragma once - -#include - -namespace KINETO_NAMESPACE { - -// cupti's timestamps are platform specific. This function convert the raw -// cupti timestamp to time since unix epoch. So that on different platform, -// correction can work correctly. -uint64_t unixEpochTimestamp(uint64_t ts); - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.cpp deleted file mode 100644 index 97c23ef047d75aff75b56773a20801ce83fb1653..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.cpp +++ /dev/null @@ -1,841 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "CuptiActivityProfiler.h" - -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include -#endif - -#include "Config.h" -#include "time_since_epoch.h" -#ifdef HAS_CUPTI -#include "CuptiActivity.h" -#include "CuptiActivity.tpp" -#include "CuptiActivityApi.h" -#endif // HAS_CUPTI -#ifdef HAS_ROCTRACER -#include "RoctracerActivityApi.h" -#endif -#include "output_base.h" - -#include "Logger.h" -#include "ThreadUtil.h" - -using namespace std::chrono; -using namespace libkineto; -using std::string; - -namespace KINETO_NAMESPACE { - -void CuptiActivityProfiler::transferCpuTrace( - std::unique_ptr cpuTrace) { - std::lock_guard guard(mutex_); - const string& trace_name = cpuTrace->span.name; - if (currentRunloopState_ != RunloopState::CollectTrace && - currentRunloopState_ != RunloopState::ProcessTrace) { - VLOG(0) << "Trace collection not in progress - discarding span " - << trace_name; - return; - } - - cpuTrace->span.iteration = iterationCountMap_[trace_name]++; - - VLOG(0) << "Received iteration " << cpuTrace->span.iteration << " of span " - << trace_name << " (" << cpuTrace->activities.size() << " activities / " - << cpuTrace->gpuOpCount << " gpu activities)"; - traceBuffers_->cpu.push_back(std::move(cpuTrace)); -} - -#ifdef HAS_ROCTRACER -CuptiActivityProfiler::CuptiActivityProfiler(RoctracerActivityApi& cupti, bool cpuOnly) -#else -CuptiActivityProfiler::CuptiActivityProfiler(CuptiActivityApi& cupti, bool cpuOnly) -#endif - : cupti_(cupti), - flushOverhead_{0, 0}, - setupOverhead_{0, 0}, - cpuOnly_{cpuOnly}, - currentRunloopState_{RunloopState::WaitForRequest}, - stopCollection_{false} {} - -void CuptiActivityProfiler::processTraceInternal(ActivityLogger& logger) { - LOG(INFO) << "Processing " << traceBuffers_->cpu.size() - << " CPU buffers"; - VLOG(0) << "Profile time range: " << captureWindowStartTime_ << " - " - << captureWindowEndTime_; - logger.handleTraceStart(metadata_); - for (auto& cpu_trace : 
traceBuffers_->cpu) { - string trace_name = cpu_trace->span.name; - VLOG(0) << "Processing CPU buffer for " << trace_name << " (" - << cpu_trace->span.iteration << ") - " - << cpu_trace->activities.size() << " records"; - VLOG(0) << "Span time range: " << cpu_trace->span.startTime << " - " - << cpu_trace->span.endTime; - processCpuTrace(*cpu_trace, logger); - LOGGER_OBSERVER_ADD_EVENT_COUNT(cpu_trace->activities.size()); - } - -#ifdef HAS_CUPTI - if (!cpuOnly_) { - VLOG(0) << "Retrieving GPU activity buffers"; - traceBuffers_->gpu = cupti_.activityBuffers(); - if (VLOG_IS_ON(1)) { - addOverheadSample(flushOverhead_, cupti_.flushOverhead); - } - if (traceBuffers_->gpu) { - const auto count_and_size = cupti_.processActivities( - *traceBuffers_->gpu, - std::bind(&CuptiActivityProfiler::handleCuptiActivity, this, std::placeholders::_1, &logger)); - LOG(INFO) << "Processed " << count_and_size.first - << " GPU records (" << count_and_size.second << " bytes)"; - LOGGER_OBSERVER_ADD_EVENT_COUNT(count_and_size.first); - } - } -#endif // HAS_CUPTI -#ifdef HAS_ROCTRACER - if (!cpuOnly_) { - VLOG(0) << "Retrieving GPU activity buffers"; - const int count = cupti_.processActivities(logger); - LOG(INFO) << "Processed " << count - << " GPU records"; - LOGGER_OBSERVER_ADD_EVENT_COUNT(count); - } -#endif // HAS_ROCTRACER - - for (const auto& session : sessions_){ - LOG(INFO) << "Processing child profiler trace"; - session->processTrace(logger); - } - - finalizeTrace(*config_, logger); -} - -CuptiActivityProfiler::CpuGpuSpanPair& CuptiActivityProfiler::recordTraceSpan( - TraceSpan& span, int gpuOpCount) { - TraceSpan gpu_span(gpuOpCount, span.iteration, span.name, "GPU: "); - auto& iterations = traceSpans_[span.name]; - iterations.push_back({span, gpu_span}); - return iterations.back(); -} - -void CuptiActivityProfiler::processCpuTrace( - libkineto::CpuTraceBuffer& cpuTrace, - ActivityLogger& logger) { - if (cpuTrace.activities.size() == 0) { - LOG(WARNING) << "CPU trace is empty!"; 
- return; - } - - CpuGpuSpanPair& span_pair = recordTraceSpan(cpuTrace.span, cpuTrace.gpuOpCount); - TraceSpan& cpu_span = span_pair.first; - for (auto const& act : cpuTrace.activities) { - VLOG(2) << act.correlationId() << ": OP " << act.activityName; - if (config_->selectedActivityTypes().count(act.type())) { - act.log(logger); - } - clientActivityTraceMap_[act.correlationId()] = &span_pair; - activityMap_[act.correlationId()] = &act; - - recordThreadInfo(act.resourceId(), act.getThreadId(), act.deviceId()); - } - logger.handleTraceSpan(cpu_span); -} - -#ifdef HAS_CUPTI -inline void CuptiActivityProfiler::handleCorrelationActivity( - const CUpti_ActivityExternalCorrelation* correlation) { - if (correlation->externalKind == CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM0) { - cpuCorrelationMap_[correlation->correlationId] = correlation->externalId; - } else if (correlation->externalKind == CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1){ - userCorrelationMap_[correlation->correlationId] = correlation->externalId; - } else { - LOG(ERROR) << "Invalid CUpti_ActivityExternalCorrelation sent to handleCuptiActivity"; - } -} -#endif // HAS_CUPTI - -static GenericTraceActivity createUserGpuSpan( - const libkineto::ITraceActivity& cpuTraceActivity, - const libkineto::ITraceActivity& gpuTraceActivity) { - GenericTraceActivity res( - *cpuTraceActivity.traceSpan(), - ActivityType::GPU_USER_ANNOTATION, - cpuTraceActivity.name()); - res.startTime = gpuTraceActivity.timestamp(); - res.device = gpuTraceActivity.deviceId(); - res.resource = gpuTraceActivity.resourceId(); - res.endTime = - gpuTraceActivity.timestamp() + gpuTraceActivity.duration(); - res.id = cpuTraceActivity.correlationId(); - return res; -} - -void CuptiActivityProfiler::GpuUserEventMap::insertOrExtendEvent( - const ITraceActivity& userActivity, - const ITraceActivity& gpuActivity) { - StreamKey key(gpuActivity.deviceId(), gpuActivity.resourceId()); - CorrelationSpanMap& correlationSpanMap = streamSpanMap_[key]; - auto it = 
correlationSpanMap.find(userActivity.correlationId()); - if (it == correlationSpanMap.end()) { - auto it_success = correlationSpanMap.insert({ - userActivity.correlationId(), createUserGpuSpan(userActivity, gpuActivity) - }); - it = it_success.first; - } - GenericTraceActivity& span = it->second; - if (gpuActivity.timestamp() < span.startTime || span.startTime == 0) { - span.startTime = gpuActivity.timestamp(); - } - int64_t gpu_activity_end = gpuActivity.timestamp() + gpuActivity.duration(); - if (gpu_activity_end > span.endTime) { - span.endTime = gpu_activity_end; - } -} - -const CuptiActivityProfiler::CpuGpuSpanPair& CuptiActivityProfiler::defaultTraceSpan() { - static TraceSpan span(0, 0, "Unknown", ""); - static CpuGpuSpanPair span_pair(span, span); - return span_pair; -} - -void CuptiActivityProfiler::GpuUserEventMap::logEvents(ActivityLogger *logger) { - for (auto const& streamMapPair : streamSpanMap_) { - for (auto const& correlationSpanPair : streamMapPair.second) { - correlationSpanPair.second.log(*logger); - } - } -} - -#ifdef HAS_CUPTI -inline bool CuptiActivityProfiler::outOfRange(const ITraceActivity& act) { - bool out_of_range = act.timestamp() < captureWindowStartTime_ || - (act.timestamp() + act.duration()) > captureWindowEndTime_; - if (out_of_range) { - VLOG(2) << "TraceActivity outside of profiling window: " << act.name() - << " (" << act.timestamp() << " < " << captureWindowStartTime_ << " or " - << (act.timestamp() + act.duration()) << " > " << captureWindowEndTime_; - } - return out_of_range; -} - -inline static bool isBlockListedRuntimeCbid(CUpti_CallbackId cbid) { - // Some CUDA calls that are very frequent and also not very interesting. - // Filter these out to reduce trace size. 
- if (cbid == CUPTI_RUNTIME_TRACE_CBID_cudaGetDevice_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaSetDevice_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaGetLastError_v3020 || - // Don't care about cudaEvents - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaEventCreate_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaEventCreateWithFlags_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaEventRecord_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaEventDestroy_v3020 || - cbid == CUPTI_RUNTIME_TRACE_CBID_cudaEventSynchronize_v3020) { - return true; - } - - return false; -} - -void CuptiActivityProfiler::handleRuntimeActivity( - const CUpti_ActivityAPI* activity, - ActivityLogger* logger) { - if (isBlockListedRuntimeCbid(activity->cbid)) { - return; - } - VLOG(2) << activity->correlationId - << ": CUPTI_ACTIVITY_KIND_RUNTIME, cbid=" << activity->cbid - << " tid=" << activity->threadId; - int32_t tid = activity->threadId; - const auto& it = resourceInfo_.find({processId(), tid}); - if (it != resourceInfo_.end()) { - tid = it->second.id; - } - const ITraceActivity* linked = linkedActivity( - activity->correlationId, cpuCorrelationMap_); - const auto& runtime_activity = - traceBuffers_->addActivityWrapper(RuntimeActivity(activity, linked, tid)); - checkTimestampOrder(&runtime_activity); - if (outOfRange(runtime_activity)) { - return; - } - runtime_activity.log(*logger); -} - -void CuptiActivityProfiler::handleOverheadActivity( - const CUpti_ActivityOverhead* activity, - ActivityLogger* logger) { - VLOG(2) << ": CUPTI_ACTIVITY_KIND_OVERHEAD" << " overheadKind=" << activity->overheadKind; - - const auto& overhead_activity = - traceBuffers_->addActivityWrapper(OverheadActivity(activity, nullptr)); - overhead_activity.log(*logger); -} - - -inline void CuptiActivityProfiler::updateGpuNetSpan( - const ITraceActivity& gpuOp) { - if (!gpuOp.linkedActivity()) { - VLOG(0) << "Missing linked activity"; - return; - } - const auto& it = clientActivityTraceMap_.find( - 
gpuOp.linkedActivity()->correlationId()); - if (it == clientActivityTraceMap_.end()) { - // No correlation id mapping? - return; - } - TraceSpan& gpu_span = it->second->second; - if (gpuOp.timestamp() < gpu_span.startTime || gpu_span.startTime == 0) { - gpu_span.startTime = gpuOp.timestamp(); - } - if ((gpuOp.timestamp() + gpuOp.duration()) > gpu_span.endTime) { - gpu_span.endTime = gpuOp.timestamp() + gpuOp.duration(); - } -} - -// I've observed occasional broken timestamps attached to GPU events... -void CuptiActivityProfiler::checkTimestampOrder(const ITraceActivity* act1) { - // Correlated GPU runtime activity cannot - // have timestamp greater than the GPU activity's - const auto& it = correlatedCudaActivities_.find(act1->correlationId()); - if (it == correlatedCudaActivities_.end()) { - correlatedCudaActivities_.insert({act1->correlationId(), act1}); - return; - } - - // Activities may be appear in the buffers out of order. - // If we have a runtime activity in the map, it should mean that we - // have a GPU activity passed in, and vice versa. - const ITraceActivity* act2 = it->second; - if (act2->type() == ActivityType::CUDA_RUNTIME) { - // Buffer is out-of-order. - // Swap so that runtime activity is first for the comparison below. 
- std::swap(act1, act2); - } - if (act1->timestamp() > act2->timestamp()) { - LOG(WARNING) << "GPU op timestamp (" << act2->timestamp() - << ") < runtime timestamp (" << act1->timestamp() << ") by " - << act1->timestamp() - act2->timestamp() << "us"; - LOG(WARNING) << "Name: " << act2->name() - << " Device: " << act2->deviceId() - << " Stream: " << act2->resourceId(); - } -} - -inline void CuptiActivityProfiler::handleGpuActivity( - const ITraceActivity& act, - ActivityLogger* logger) { - if (outOfRange(act)) { - return; - } - checkTimestampOrder(&act); - VLOG(2) << act.correlationId() << ": " - << act.name(); - recordStream(act.deviceId(), act.resourceId(), ""); - act.log(*logger); - updateGpuNetSpan(act); - if (config_->selectedActivityTypes().count(ActivityType::GPU_USER_ANNOTATION)) { - const auto& it = userCorrelationMap_.find(act.correlationId()); - if (it != userCorrelationMap_.end()) { - const auto& it2 = activityMap_.find(it->second); - if (it2 != activityMap_.end()) { - recordStream(act.deviceId(), act.resourceId(), "context"); - gpuUserEventMap_.insertOrExtendEvent(*it2->second, act); - } - } - } -} - -const ITraceActivity* CuptiActivityProfiler::linkedActivity( - int32_t correlationId, - const std::unordered_map& correlationMap) { - const auto& it = correlationMap.find(correlationId); - if (it != correlationMap.end()) { - const auto& it2 = activityMap_.find(it->second); - if (it2 != activityMap_.end()) { - return it2->second; - } - } - return nullptr; -} - -template -inline void CuptiActivityProfiler::handleGpuActivity( - const T* act, ActivityLogger* logger) { - const ITraceActivity* linked = linkedActivity( - act->correlationId, cpuCorrelationMap_); - const auto& gpu_activity = - traceBuffers_->addActivityWrapper(GpuActivity(act, linked)); - handleGpuActivity(gpu_activity, logger); -} - -void CuptiActivityProfiler::handleCuptiActivity(const CUpti_Activity* record, ActivityLogger* logger) { - switch (record->kind) { - case 
CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION: - handleCorrelationActivity( - reinterpret_cast( - record)); - break; - case CUPTI_ACTIVITY_KIND_RUNTIME: - handleRuntimeActivity( - reinterpret_cast(record), logger); - break; - case CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL: - handleGpuActivity( - reinterpret_cast(record), logger); - break; - case CUPTI_ACTIVITY_KIND_MEMCPY: - handleGpuActivity( - reinterpret_cast(record), logger); - break; - case CUPTI_ACTIVITY_KIND_MEMCPY2: - handleGpuActivity( - reinterpret_cast(record), logger); - break; - case CUPTI_ACTIVITY_KIND_MEMSET: - handleGpuActivity( - reinterpret_cast(record), logger); - break; - case CUPTI_ACTIVITY_KIND_OVERHEAD: - handleOverheadActivity (reinterpret_cast(record), logger); - break; - default: - LOG(WARNING) << "Unexpected activity type: " << record->kind; - break; - } -} -#endif // HAS_CUPTI - -void CuptiActivityProfiler::configureChildProfilers() { - // If child profilers are enabled create profiler sessions - for (auto& profiler: profilers_) { - int64_t start_time_ms = duration_cast( - profileStartTime_.time_since_epoch()).count(); - LOG(INFO) << "Running child profiler " << profiler->name() << " for " - << config_->activitiesDuration().count() << " ms"; - auto session = profiler->configure( - start_time_ms, - config_->activitiesDuration().count(), - config_->selectedActivityTypes(), - *config_ - ); - if (session) { - sessions_.push_back(std::move(session)); - } - } -} - -void CuptiActivityProfiler::configure( - const Config& config, - const time_point& now) { - std::lock_guard guard(mutex_); - if (isActive()) { - LOG(ERROR) << "CuptiActivityProfiler already busy, terminating"; - return; - } - - config_ = config.clone(); - - if (config_->activitiesDuration().count() == 0) { - // Use default if not specified - config_->setActivitiesDuration( - config_->activitiesDurationDefault()); - } - - // Ensure we're starting in a clean state - resetTraceData(); - -#if !USE_GOOGLE_LOG - // Add a LoggerObserverCollector to 
collect all logs during the trace. - loggerCollectorMetadata_ = std::make_unique(); - Logger::addLoggerObserver(loggerCollectorMetadata_.get()); -#endif // !USE_GOOGLE_LOG - - profileStartTime_ = config_->requestTimestamp(); - - if (config_->hasProfileStartIteration()) { - profileStartIter_ = config_->profileStartIteration(); - profileEndIter_ = profileStartIter_ + config_->activitiesRunIterations(); - } else { - - profileStartIter_ = -1; - profileEndIter_ = (std::numeric_limits::max)(); - - if (profileStartTime_ < now) { - LOG(ERROR) << "Not starting tracing - start timestamp is in the past. Time difference (ms): " << duration_cast(now - profileStartTime_).count(); - return; - } else if ((profileStartTime_ - now) < config_->activitiesWarmupDuration()) { - LOG(ERROR) << "Not starting tracing - insufficient time for warmup. Time to warmup (ms): " << duration_cast(profileStartTime_ - now).count() ; - return; - } - } - - if (LOG_IS_ON(INFO)) { - config_->printActivityProfilerConfig(LIBKINETO_DBG_STREAM); - } - if (!cpuOnly_ && !libkineto::api().client()) { - if (profileStartIter_ < 0) { - LOG(INFO) << "GPU-only tracing for " - << config_->activitiesDuration().count() << "ms"; - } else { - LOG(INFO) << "GPU-only tracing for " - << config_->activitiesRunIterations() << " iterations"; - } - } - - // Set useful metadata into the logger. - LOGGER_OBSERVER_SET_TRACE_DURATION_MS(config_->activitiesDuration().count()); - if (!config_->requestTraceID().empty()) { - LOGGER_OBSERVER_SET_TRACE_ID(config_->requestTraceID()); - } - if (!config_->requestGroupTraceID().empty()) { - LOGGER_OBSERVER_SET_GROUP_TRACE_ID(config_->requestGroupTraceID()); - } - LOGGER_OBSERVER_ADD_DESTINATION(config_->activitiesLogUrl()); - -#if defined(HAS_CUPTI) || defined(HAS_ROCTRACER) - if (!cpuOnly_) { - // Enabling CUPTI activity tracing incurs a larger perf hit at first, - // presumably because structures are allocated and initialized, callbacks - // are activated etc. 
After a while the overhead decreases and stabilizes. - // It's therefore useful to perform some warmup before starting recording. - LOG(INFO) << "Enabling GPU tracing"; - cupti_.setMaxBufferSize(config_->activitiesMaxGpuBufferSize()); - - time_point timestamp; - if (VLOG_IS_ON(1)) { - timestamp = system_clock::now(); - } -#ifdef HAS_CUPTI - cupti_.enableCuptiActivities(config_->selectedActivityTypes()); -#else - cupti_.enableActivities(config_->selectedActivityTypes()); -#endif - if (VLOG_IS_ON(1)) { - auto t2 = system_clock::now(); - addOverheadSample( - setupOverhead_, duration_cast(t2 - timestamp).count()); - } - } -#endif // HAS_CUPTI || HAS_ROCTRACER - - if (profilers_.size() > 0) { - configureChildProfilers(); - } - - if (libkineto::api().client()) { - libkineto::api().client()->warmup(config_->isOpInputsCollectionEnabled()); - } - if (profileStartIter_ >= 0) { - LOG(INFO) << "Tracing starting on iteration = " << profileStartIter_; - } else { - LOG(INFO) << "Tracing starting in " - << duration_cast(profileStartTime_ - now).count() << "s"; - } - - traceBuffers_ = std::make_unique(); - captureWindowStartTime_ = captureWindowEndTime_ = 0; - currentRunloopState_ = RunloopState::Warmup; -} - -void CuptiActivityProfiler::startTraceInternal(const time_point& now) { - captureWindowStartTime_ = libkineto::timeSinceEpoch(now); - VLOG(0) << "Warmup -> CollectTrace"; - for (auto& session: sessions_){ - LOG(INFO) << "Starting child profiler session"; - session->start(); - } - currentRunloopState_ = RunloopState::CollectTrace; -} - -void CuptiActivityProfiler::stopTraceInternal(const time_point& now) { - if (captureWindowEndTime_ == 0) { - captureWindowEndTime_ = libkineto::timeSinceEpoch(now); - } -#if defined(HAS_CUPTI) || defined(HAS_ROCTRACER) - if (!cpuOnly_) { - time_point timestamp; - if (VLOG_IS_ON(1)) { - timestamp = system_clock::now(); - } -#ifdef HAS_CUPTI - cupti_.disableCuptiActivities(config_->selectedActivityTypes()); -#else - 
cupti_.disableActivities(config_->selectedActivityTypes()); -#endif - if (VLOG_IS_ON(1)) { - auto t2 = system_clock::now(); - addOverheadSample( - setupOverhead_, duration_cast(t2 - timestamp).count()); - } - } -#endif // HAS_CUPTI || HAS_ROCTRACER - - if (currentRunloopState_ == RunloopState::CollectTrace) { - VLOG(0) << "CollectTrace -> ProcessTrace"; - } else { - LOG(WARNING) << "Called stopTrace with state == " << - static_cast::type>( - currentRunloopState_.load()); - } - for (auto& session: sessions_){ - LOG(INFO) << "Stopping child profiler session"; - session->stop(); - } - currentRunloopState_ = RunloopState::ProcessTrace; -} - -void CuptiActivityProfiler::resetInternal() { - resetTraceData(); - currentRunloopState_ = RunloopState::WaitForRequest; -} - -bool CuptiActivityProfiler::isWarmupDone( - const time_point& now, - int64_t currentIter) const { - // is it a time based config - if (profileStartIter_ < 0) { - // qualify that this check is not being called from application step() API - // this avoids races between the step() API and periodically invoked - // profiler run loop step() method - return (currentIter < 0) && (now >= profileStartTime_); - } - // this is an iteration based config - if (currentIter < 0) { - return false; - } - return currentIter >= profileStartIter_; -} - -bool CuptiActivityProfiler::isCollectionDone( - const time_point& now, - int64_t currentIter) const { - // is it a time based config - if (profileStartIter_ < 0) { - // qualify that this check is not being called from application step() API - return (currentIter < 0) && (now >= profileEndTime_); - } - // this is an iteration based config - if (currentIter < 0) { - return false; - } - return currentIter >= profileEndIter_; -} - -const time_point CuptiActivityProfiler::performRunLoopStep( - const time_point& now, - const time_point& nextWakeupTime, - int64_t currentIter) { - auto new_wakeup_time = nextWakeupTime; - bool warmup_done = false, collection_done = false; - - VLOG_IF(1, 
currentIter >= 0) << "Run loop on application step(), iteration = " - << currentIter; - - switch (currentRunloopState_) { - case RunloopState::WaitForRequest: - VLOG(1) << "State: WaitForRequest"; - // Nothing to do - break; - - case RunloopState::Warmup: - VLOG(1) << "State: Warmup"; - warmup_done = isWarmupDone(now, currentIter); -#if defined(HAS_CUPTI) || defined(HAS_ROCTRACER) - // Flushing can take a while so avoid doing it close to the start time - if (!cpuOnly_ && currentIter < 0 && - (profileStartIter_ >= 0 || nextWakeupTime < profileStartTime_)) { - cupti_.clearActivities(); - } - - if (cupti_.stopCollection) { - // Go to process trace to clear any outstanding buffers etc - LOG(WARNING) << "Trace terminated during warmup"; - std::lock_guard guard(mutex_); - stopTraceInternal(now); - resetInternal(); - VLOG(0) << "Warmup -> WaitForRequest"; - break; - } -#endif // HAS_CUPTI || HAS_ROCTRACER - - if (warmup_done) { - UST_LOGGER_MARK_COMPLETED(kWarmUpStage); - if (profileStartIter_ < 0 && - (now > profileStartTime_ + milliseconds(10))) { - LOG(WARNING) - << "Tracing started " - << duration_cast(now - profileStartTime_).count() - << "ms late!"; - } else { - LOG(INFO) << "Tracing started"; - } - startTrace(now); - if (libkineto::api().client()) { - libkineto::api().client()->start(); - } - if (nextWakeupTime > profileEndTime_) { - new_wakeup_time = profileEndTime_; - } - } else if (nextWakeupTime > profileStartTime_) { - new_wakeup_time = profileStartTime_; - } - - break; - - case RunloopState::CollectTrace: - VLOG(1) << "State: CollectTrace"; - // captureWindowStartTime_ can be set by external threads, - // so recompute end time. - // FIXME: Is this a good idea for synced start? 
- if (profileStartIter_ < 0) { - std::lock_guard guard(mutex_); - profileEndTime_ = time_point( - microseconds(captureWindowStartTime_)) + - config_->activitiesDuration(); - } - - collection_done = isCollectionDone(now, currentIter); - - // TODO revisit stopCollection_ is not used right now - if (collection_done || stopCollection_.exchange(false) -#if defined(HAS_CUPTI) || defined(HAS_ROCTRACER) - || cupti_.stopCollection -#endif // HAS_CUPTI || HAS_ROCTRACER - ){ - // Update runloop state first to prevent further updates to shared state - LOG(INFO) << "Tracing complete."; - if (currentIter > 0) { - LOG(INFO) << "This state change was invoked by application's step() call"; - } - // FIXME: Need to communicate reason for stopping on errors - if (libkineto::api().client()) { - libkineto::api().client()->stop(); - } - std::lock_guard guard(mutex_); - stopTraceInternal(now); - VLOG_IF(0, collection_done) << "Reached profile end time"; - - UST_LOGGER_MARK_COMPLETED(kCollectionStage); - } else if (profileStartIter_ >= 0) { - // nothing to do here - } else if (now < profileEndTime_ && profileEndTime_ < nextWakeupTime) { - new_wakeup_time = profileEndTime_; - } - - break; - - case RunloopState::ProcessTrace: - VLOG(1) << "State: ProcessTrace"; - // skip this state transition if it called from the step() api - // of the profiler. 
- // else it could lead to a race between the profiler thread and an - // application thread calling step() - if (currentIter >= 0) { - return new_wakeup_time; - } - // FIXME: Probably want to allow interruption here - // for quickly handling trace request via synchronous API - std::lock_guard guard(mutex_); - processTraceInternal(*logger_); - UST_LOGGER_MARK_COMPLETED(kPostProcessingStage); - resetInternal(); - VLOG(0) << "ProcessTrace -> WaitForRequest"; - break; - } - - return new_wakeup_time; -} - -void CuptiActivityProfiler::finalizeTrace(const Config& config, ActivityLogger& logger) { - LOG(INFO) << "Recorded nets:"; - { - for (const auto& it : iterationCountMap_) { - LOG(INFO) << it.first << ": " << it.second << " iterations"; - } - iterationCountMap_.clear(); - } - - // Process names - int32_t pid = processId(); - string process_name = processName(pid); - if (!process_name.empty()) { - logger.handleDeviceInfo( - {pid, process_name, "CPU"}, captureWindowStartTime_); - if (!cpuOnly_) { - // GPU events use device id as pid (0-7). - constexpr int kMaxGpuCount = 8; - for (int gpu = 0; gpu < kMaxGpuCount; gpu++) { - logger.handleDeviceInfo( - {gpu, process_name, fmt::format("GPU {}", gpu)}, - captureWindowStartTime_); - } - } - } - - // Thread & stream info - for (auto pair : resourceInfo_) { - const auto& resource = pair.second; - logger.handleResourceInfo(resource, captureWindowStartTime_); - } - - for (const auto& iterations : traceSpans_) { - for (const auto& span_pair : iterations.second) { - const TraceSpan& gpu_span = span_pair.second; - if (gpu_span.opCount > 0) { - logger.handleTraceSpan(gpu_span); - } - } - } - - // Overhead info - overheadInfo_.push_back(ActivityLogger::OverheadInfo("CUPTI Overhead")); - for(const auto& info : overheadInfo_) { - logger.handleOverheadInfo(info, captureWindowStartTime_); - } - - gpuUserEventMap_.logEvents(&logger); - -#if !USE_GOOGLE_LOG - // Save logs from LoggerCollector objects into Trace metadata. 
- auto LoggerMD = loggerCollectorMetadata_->extractCollectorMetadata(); - std::unordered_map> LoggerMDString; - for (auto& md : LoggerMD) { - LoggerMDString[toString(md.first)] = md.second; - } -#endif // !USE_GOOGLE_LOG - - logger.finalizeTrace(config, std::move(traceBuffers_), captureWindowEndTime_, LoggerMDString); -} - -void CuptiActivityProfiler::resetTraceData() { -#if defined(HAS_CUPTI) || defined(HAS_ROCTRACER) - if (!cpuOnly_) { - cupti_.clearActivities(); - } -#endif // HAS_CUPTI || HAS_ROCTRACER - activityMap_.clear(); - cpuCorrelationMap_.clear(); - correlatedCudaActivities_.clear(); - gpuUserEventMap_.clear(); - traceSpans_.clear(); - clientActivityTraceMap_.clear(); - traceBuffers_ = nullptr; - metadata_.clear(); - sessions_.clear(); -#if !USE_GOOGLE_LOG - Logger::removeLoggerObserver(loggerCollectorMetadata_.get()); -#endif // !USE_GOOGLE_LOG -} - - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.h b/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.h deleted file mode 100644 index 208833a4db720429982a63ed72ffa4762ef00bd0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiActivityProfiler.h +++ /dev/null @@ -1,364 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ThreadUtil.h" -#include "TraceSpan.h" -#include "libkineto.h" -#include "output_base.h" -#include "GenericTraceActivity.h" -#include "IActivityProfiler.h" -#include "LoggerCollector.h" - -namespace KINETO_NAMESPACE { - -class Config; -class CuptiActivityApi; -class RoctracerActivityApi; - -class CuptiActivityProfiler { - public: - CuptiActivityProfiler(CuptiActivityApi& cupti, bool cpuOnly); - CuptiActivityProfiler(RoctracerActivityApi& rai, bool cpuOnly); - CuptiActivityProfiler(const CuptiActivityProfiler&) = delete; - CuptiActivityProfiler& operator=(const CuptiActivityProfiler&) = delete; - - bool isActive() const { - return currentRunloopState_ != RunloopState::WaitForRequest; - } - - // Invoke at a regular interval to perform profiling activities. - // When not active, an interval of 1-5 seconds is probably fine, - // depending on required warm-up time and delayed start time. - // When active, it's a good idea to invoke more frequently to stay below - // memory usage limit (ACTIVITIES_MAX_GPU_BUFFER_SIZE_MB) during warmup. 
- const std::chrono::time_point performRunLoopStep( - const std::chrono::time_point& now, - const std::chrono::time_point& nextWakeupTime, - int64_t currentIter = -1); - - // Used for async requests - void setLogger(ActivityLogger* logger) { - logger_ = logger; - } - - // Synchronous control API - void startTrace( - const std::chrono::time_point& now) { - std::lock_guard guard(mutex_); - startTraceInternal(now); - } - - void stopTrace(const std::chrono::time_point& now) { - std::lock_guard guard(mutex_); - stopTraceInternal(now); - } - - // Process CPU and GPU traces - void processTrace(ActivityLogger& logger) { - std::lock_guard guard(mutex_); - processTraceInternal(logger); - } - - void reset() { - std::lock_guard guard(mutex_); - resetInternal(); - } - - // Set up profiler as specified in config. - void configure( - const Config& config, - const std::chrono::time_point& now); - - // Registered with client API to pass CPU trace events over - void transferCpuTrace( - std::unique_ptr cpuTrace); - - Config& config() { - return *config_; - } - - inline void recordThreadInfo() { - int32_t sysTid = systemThreadId(); - // Note we're using the lower 32 bits of the (opaque) pthread id - // as key, because that's what CUPTI records. 
- int32_t tid = threadId(); - int32_t pid = processId(); - std::lock_guard guard(mutex_); - recordThreadInfo(sysTid, tid, pid); - } - - // T107508020: We can deprecate the recordThreadInfo(void) once we optimized profiler_kineto - void recordThreadInfo(int32_t sysTid, int32_t tid, int32_t pid) { - if (resourceInfo_.find({pid, tid}) == resourceInfo_.end()) { - resourceInfo_.emplace( - std::make_pair(pid, tid), - ActivityLogger::ResourceInfo( - pid, - sysTid, - sysTid, // sortindex - fmt::format("thread {} ({})", sysTid, getThreadName()))); - } - } - - void addMetadata(const std::string& key, const std::string& value) { - std::lock_guard guard(mutex_); - metadata_[key] = value; - } - - void addChildActivityProfiler( - std::unique_ptr profiler) { - std::lock_guard guard(mutex_); - profilers_.push_back(std::move(profiler)); - } - - protected: - - using CpuGpuSpanPair = std::pair; - static const CpuGpuSpanPair& defaultTraceSpan(); - - private: - - // Map of gpu activities to user defined events - class GpuUserEventMap { - public: - // Insert a user defined event which maps to the gpu trace activity. - // If the user defined event mapping already exists this will update the - // gpu side span to include the span of gpuTraceActivity. - void insertOrExtendEvent(const ITraceActivity& cpuTraceActivity, - const ITraceActivity& gpuTraceActivity); - // Log out the events to the logger - void logEvents(ActivityLogger *logger); - - void clear() { - streamSpanMap_.clear(); - } - - private: - // device id and stream name - using StreamKey = std::pair; - - // map of correlation id to TraceSpan - using CorrelationSpanMap = - std::unordered_map; - std::map streamSpanMap_; - }; - - GpuUserEventMap gpuUserEventMap_; - // id -> activity* - std::unordered_map activityMap_; - // cuda runtime id -> pytorch op id - // CUPTI provides a mechanism for correlating Cuda events to arbitrary - // external events, e.g.operator activities from PyTorch. 
- std::unordered_map cpuCorrelationMap_; - // CUDA runtime <-> GPU Activity - std::unordered_map - correlatedCudaActivities_; - std::unordered_map userCorrelationMap_; - - // data structure to collect cuptiActivityFlushAll() latency overhead - struct profilerOverhead { - int64_t overhead; - int cntr; - }; - - bool isWarmupDone( - const std::chrono::time_point& now, - int64_t currentIter) const; - - bool isCollectionDone( - const std::chrono::time_point& now, - int64_t currentIter) const; - - void startTraceInternal( - const std::chrono::time_point& now); - - void stopTraceInternal( - const std::chrono::time_point& now); - - void processTraceInternal(ActivityLogger& logger); - - void resetInternal(); - - void finalizeTrace(const Config& config, ActivityLogger& logger); - - void configureChildProfilers(); - - // Process a single CPU trace - void processCpuTrace( - libkineto::CpuTraceBuffer& cpuTrace, - ActivityLogger& logger); - - // Create resource names for streams - inline void recordStream(int device, int id, const char* postfix) { - if (resourceInfo_.find({device, id}) == resourceInfo_.end()) { - resourceInfo_.emplace( - std::make_pair(device, id), - ActivityLogger::ResourceInfo( - device, id, id, fmt::format( - "stream {} {}", id, postfix))); - } - } - - // Record client trace span for subsequent lookups from activities - // Also creates a corresponding GPU-side span. - CpuGpuSpanPair& recordTraceSpan(TraceSpan& span, int gpuOpCount); - - // Returns true if net name is to be tracked for a specified number of - // iterations. 
- bool iterationTargetMatch(libkineto::CpuTraceBuffer& trace); - - // net name to id - int netId(const std::string& netName); - - const ITraceActivity* linkedActivity( - int32_t correlationId, - const std::unordered_map& correlationMap); - -#ifdef HAS_CUPTI - // Process generic CUPTI activity - void handleCuptiActivity(const CUpti_Activity* record, ActivityLogger* logger); - - // Process specific GPU activity types - void updateGpuNetSpan(const ITraceActivity& gpuOp); - bool outOfRange(const ITraceActivity& act); - void handleCorrelationActivity( - const CUpti_ActivityExternalCorrelation* correlation); - void handleRuntimeActivity( - const CUpti_ActivityAPI* activity, ActivityLogger* logger); - void handleOverheadActivity( - const CUpti_ActivityOverhead* activity, ActivityLogger* logger); - void handleGpuActivity(const ITraceActivity& act, - ActivityLogger* logger); - template - void handleGpuActivity(const T* act, ActivityLogger* logger); -#endif // HAS_CUPTI - - void resetTraceData(); - - void addOverheadSample(profilerOverhead& counter, int64_t overhead) { - counter.overhead += overhead; - counter.cntr++; - } - int64_t getOverhead(const profilerOverhead& counter) { - if (counter.cntr == 0) { - return 0; - } - return counter.overhead / counter.cntr; - } - - void checkTimestampOrder(const ITraceActivity* act1); - - // On-demand request configuration - std::unique_ptr config_; - - // Logger used during trace processing - ActivityLogger* logger_; - - // Calls to CUPTI is encapsulated behind this interface -#ifdef HAS_ROCTRACER - RoctracerActivityApi& cupti_; // Design failure here -#else - CuptiActivityApi& cupti_; -#endif - - enum class RunloopState { - WaitForRequest, - Warmup, - CollectTrace, - ProcessTrace - }; - - // Start and end time used for triggering and stopping profiling - std::chrono::time_point profileStartTime_; - std::chrono::time_point profileEndTime_; - int64_t profileStartIter_ = -1, profileEndIter_ = -1; - - - // All recorded trace spans, both 
CPU and GPU - // Trace Id -> list of iterations. - // Using map of lists for the iterator semantics, since we are recording - // pointers to the elements in this structure. - std::map> traceSpans_; - - // Maintain a map of client trace activity to trace span. - // Maps correlation id -> TraceSpan* held by traceSpans_. - using ActivityTraceMap = std::unordered_map; - ActivityTraceMap clientActivityTraceMap_; - - // Cache thread names and system thread ids for pthread ids, - // and stream ids for GPU streams - std::map< - std::pair, - ActivityLogger::ResourceInfo> resourceInfo_; - - std::vector overheadInfo_; - - // the overhead to flush the activity buffer - profilerOverhead flushOverhead_; - // the overhead to enable/disable activity tracking - profilerOverhead setupOverhead_; - - bool cpuOnly_{false}; - - // *************************************************************************** - // Below state is shared with external threads. - // These need to either be atomic, accessed under lock or only used - // by external threads in separate runloop phases from the profiler thread. - // *************************************************************************** - - // Mutex to protect non-atomic access to below state - std::mutex mutex_; - - // Runloop phase - std::atomic currentRunloopState_{RunloopState::WaitForRequest}; - - // Keep track of the start time of the first net in the current trace. - // This is only relevant to Caffe2 as PyTorch does not have nets. - // All CUDA events before this time will be removed - // Can be written by external threads during collection. - int64_t captureWindowStartTime_{0}; - // Similarly, all CUDA API events after the last net event will be removed - int64_t captureWindowEndTime_{0}; - - // span name -> iteration count - std::map iterationCountMap_; - // Flag used to stop tracing from external api callback. - // Needs to be atomic since it's set from a different thread. 
- std::atomic_bool stopCollection_{false}; - - // Buffers where trace data is stored - std::unique_ptr traceBuffers_; - - // Trace metadata - std::unordered_map metadata_; - - // child activity profilers - std::vector> profilers_; - - // a vector of active profiler plugin sessions - std::vector> sessions_; - - // LoggerCollector to collect all LOGs during the trace -#if !USE_GOOGLE_LOG - std::unique_ptr loggerCollectorMetadata_; -#endif // !USE_GOOGLE_LOG -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.cpp deleted file mode 100644 index 1876003998dc0c66f882d939ca8100750cfd046a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.cpp +++ /dev/null @@ -1,260 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "CuptiCallbackApi.h" - -#include -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include "cupti_call.h" -#endif -#include "Logger.h" - - -namespace KINETO_NAMESPACE { - -// limit on number of handles per callback type -constexpr size_t MAX_CB_FNS_PER_CB = 8; - -// Reader Writer lock types -using ReaderWriterLock = std::shared_timed_mutex; -using ReaderLockGuard = std::shared_lock; -using WriteLockGuard = std::unique_lock; - -static ReaderWriterLock callbackLock_; - -/* Callback Table : - * Overall goal of the design is to optimize the lookup of function - * pointers. The table is structured at two levels and the leaf - * elements in the table are std::list to enable fast access/inserts/deletes - * - * | - * -> cb id 0 -> std::list of callbacks - * ... - * -> cb id n -> std::list of callbacks - * | - * ... - * CallbackTable is the finaly table type above - * See type declrartions in header file. - */ - - -/* callback_switchboard : is the global callback handler we register - * with CUPTI. 
The goal is to make it as efficient as possible - * to re-direct to the registered callback(s). - * - * Few things to care about : - * a) use if/then switches rather than map/hash structures - * b) avoid dynamic memory allocations - * c) be aware of locking overheads - */ -#ifdef HAS_CUPTI -static void CUPTIAPI callback_switchboard( -#else -static void callback_switchboard( -#endif - void* /* unused */, - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo) { - - // below statement is likey going to call a mutex - // on the singleton access - CuptiCallbackApi::singleton().__callback_switchboard( - domain, cbid, cbInfo); -} - - -void CuptiCallbackApi::__callback_switchboard( - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo) { - VLOG(0) << "Callback: domain = " << domain << ", cbid = " << cbid; - CallbackList *cblist = nullptr; - - switch (domain) { - - // add the fastest path for kernel launch callbacks - // as these are the most frequent ones - case CUPTI_CB_DOMAIN_RUNTIME_API: - switch (cbid) { - case CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000: - cblist = &callbacks_.runtime[ - CUDA_LAUNCH_KERNEL - __RUNTIME_CB_DOMAIN_START]; - break; - default: - break; - } - break; - - case CUPTI_CB_DOMAIN_RESOURCE: - switch (cbid) { - case CUPTI_CBID_RESOURCE_CONTEXT_CREATED: - cblist = &callbacks_.resource[ - RESOURCE_CONTEXT_CREATED - __RESOURCE_CB_DOMAIN_START]; - break; - case CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING: - cblist = &callbacks_.resource[ - RESOURCE_CONTEXT_DESTROYED - __RESOURCE_CB_DOMAIN_START]; - break; - default: - break; - } - break; - - default: - return; - } - - // ignore callbacks that are not handled - if (cblist == nullptr) { - return; - } - - // make a copy of the callback list so we avoid holding lock - // in common case this should be just one func pointer copy - std::array callbacks; - int num_cbs = 0; - { - ReaderLockGuard rl(callbackLock_); - int i = 0; - for 
(auto it = cblist->begin(); - it != cblist->end() && i < MAX_CB_FNS_PER_CB; - it++, i++) { - callbacks[i] = *it; - } - num_cbs = i; - } - - for (int i = 0; i < num_cbs; i++) { - auto fn = callbacks[i]; - fn(domain, cbid, cbInfo); - } -} - -CuptiCallbackApi& CuptiCallbackApi::singleton() { - static CuptiCallbackApi instance; - return instance; -} - -CuptiCallbackApi::CuptiCallbackApi() { -#ifdef HAS_CUPTI - lastCuptiStatus_ = CUPTI_ERROR_UNKNOWN; - lastCuptiStatus_ = CUPTI_CALL_NOWARN( - cuptiSubscribe(&subscriber_, - (CUpti_CallbackFunc)callback_switchboard, - nullptr)); - - initSuccess_ = (lastCuptiStatus_ == CUPTI_SUCCESS); -#endif -} - -CuptiCallbackApi::CallbackList* CuptiCallbackApi::CallbackTable::lookup( - CUpti_CallbackDomain domain, CuptiCallBackID cbid) { - size_t idx; - - switch (domain) { - - case CUPTI_CB_DOMAIN_RESOURCE: - assert(cbid >= __RESOURCE_CB_DOMAIN_START); - assert(cbid < __RESOURCE_CB_DOMAIN_END); - idx = cbid - __RESOURCE_CB_DOMAIN_START; - return &resource.at(idx); - - case CUPTI_CB_DOMAIN_RUNTIME_API: - assert(cbid >= __RUNTIME_CB_DOMAIN_START); - assert(cbid < __RUNTIME_CB_DOMAIN_END); - idx = cbid - __RUNTIME_CB_DOMAIN_START; - return &runtime.at(idx); - - default: - LOG(WARNING) << " Unsupported callback domain : " << domain; - return nullptr; - } -} - -bool CuptiCallbackApi::registerCallback( - CUpti_CallbackDomain domain, - CuptiCallBackID cbid, - CuptiCallbackFn cbfn) { - CallbackList* cblist = callbacks_.lookup(domain, cbid); - - if (!cblist) { - LOG(WARNING) << "Could not register callback -- domain = " << domain - << " callback id = " << cbid; - return false; - } - - // avoid duplicates - auto it = std::find(cblist->begin(), cblist->end(), cbfn); - if (it != cblist->end()) { - LOG(WARNING) << "Adding duplicate callback -- domain = " << domain - << " callback id = " << cbid; - return true; - } - - if (cblist->size() == MAX_CB_FNS_PER_CB) { - LOG(WARNING) << "Already registered max callback -- domain = " << domain - << " callback 
id = " << cbid; - } - - WriteLockGuard wl(callbackLock_); - cblist->push_back(cbfn); - return true; -} - -bool CuptiCallbackApi::deleteCallback( - CUpti_CallbackDomain domain, - CuptiCallBackID cbid, - CuptiCallbackFn cbfn) { - CallbackList* cblist = callbacks_.lookup(domain, cbid); - if (!cblist) { - LOG(WARNING) << "Attempting to remove unsupported callback -- domain = " << domain - << " callback id = " << cbid; - return false; - } - - // Locks are not required here as - // https://en.cppreference.com/w/cpp/container/list/erase - // "References and iterators to the erased elements are invalidated. - // Other references and iterators are not affected." - auto it = std::find(cblist->begin(), cblist->end(), cbfn); - if (it == cblist->end()) { - LOG(WARNING) << "Could not find callback to remove -- domain = " << domain - << " callback id = " << cbid; - return false; - } - - WriteLockGuard wl(callbackLock_); - cblist->erase(it); - return true; -} - -bool CuptiCallbackApi::enableCallback( - CUpti_CallbackDomain domain, CUpti_CallbackId cbid) { -#ifdef HAS_CUPTI - if (initSuccess_) { - lastCuptiStatus_ = CUPTI_CALL_NOWARN( - cuptiEnableCallback(1, subscriber_, domain, cbid)); - return (lastCuptiStatus_ == CUPTI_SUCCESS); - } -#endif - return false; -} - -bool CuptiCallbackApi::disableCallback( - CUpti_CallbackDomain domain, CUpti_CallbackId cbid) { -#ifdef HAS_CUPTI - if (initSuccess_) { - lastCuptiStatus_ = CUPTI_CALL_NOWARN( - cuptiEnableCallback(0, subscriber_, domain, cbid)); - return (lastCuptiStatus_ == CUPTI_SUCCESS); - } -#endif - return false; -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.h b/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.h deleted file mode 100644 index 4526f3750b4a134bc888843b8ff347a1f2bf8d5f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApi.h +++ /dev/null @@ -1,130 +0,0 @@ -// (c) Meta Platforms, Inc. 
and affiliates. Confidential and proprietary. - -#pragma once - -#include -#ifdef HAS_CUPTI -#include -#endif -#include -#include -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "CuptiCallbackApiMock.h" - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - - -/* CuptiCallbackApi : Provides an abstraction over CUPTI callback - * interface. This enables various callback functions to be registered - * with this class. The class registers a global callback handler that - * redirects to the respective callbacks. - * - * Note: one design choice we made is to only support simple function pointers - * in order to speed up the implementation for fast path. - */ - -using CuptiCallbackFn = void(*)( - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo); - - -class CuptiCallbackApi { - - public: - - /* Global list of supported callback ids - * use the class namespace to avoid confusing with CUPTI enums*/ - enum CuptiCallBackID { - CUDA_LAUNCH_KERNEL = 0, - // can possibly support more callback ids per domain - // - __RUNTIME_CB_DOMAIN_START = CUDA_LAUNCH_KERNEL, - - // Callbacks under Resource CB domain - RESOURCE_CONTEXT_CREATED, - RESOURCE_CONTEXT_DESTROYED, - - __RUNTIME_CB_DOMAIN_END = RESOURCE_CONTEXT_CREATED, - __RESOURCE_CB_DOMAIN_START = RESOURCE_CONTEXT_CREATED, - - __RESOURCE_CB_DOMAIN_END = RESOURCE_CONTEXT_DESTROYED + 1, - }; - - - CuptiCallbackApi(const CuptiCallbackApi&) = delete; - CuptiCallbackApi& operator=(const CuptiCallbackApi&) = delete; - - static CuptiCallbackApi& singleton(); - - bool initSuccess() const { - return initSuccess_; - } - -#ifdef HAS_CUPTI - CUptiResult getCuptiStatus() const { - return lastCuptiStatus_; - } -#endif - - bool registerCallback( - CUpti_CallbackDomain domain, - CuptiCallBackID cbid, - CuptiCallbackFn cbfn); - - // returns false if callback was not found - bool deleteCallback( - CUpti_CallbackDomain domain, 
- CuptiCallBackID cbid, - CuptiCallbackFn cbfn); - - bool enableCallback(CUpti_CallbackDomain domain, CUpti_CallbackId cbid); - bool disableCallback(CUpti_CallbackDomain domain, CUpti_CallbackId cbid); - - - // Please do not use this method. This has to be exposed as public - // so it is accessible from the callback handler - void __callback_switchboard( - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo); - - private: - - explicit CuptiCallbackApi(); - - // For callback table design overview see the .cpp file - using CallbackList = std::list; - - // level 2 tables sizes are known at compile time - constexpr static size_t RUNTIME_CB_DOMAIN_SIZE - = (__RUNTIME_CB_DOMAIN_END - __RUNTIME_CB_DOMAIN_START); - - constexpr static size_t RESOURCE_CB_DOMAIN_SIZE - = (__RESOURCE_CB_DOMAIN_END - __RESOURCE_CB_DOMAIN_START); - - // level 1 table is a struct - struct CallbackTable { - std::array runtime; - std::array resource; - - CallbackList* lookup(CUpti_CallbackDomain domain, CuptiCallBackID cbid); - }; - - CallbackTable callbacks_; - bool initSuccess_ = false; - -#ifdef HAS_CUPTI - CUptiResult lastCuptiStatus_; - CUpti_SubscriberHandle subscriber_; -#endif -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApiMock.h b/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApiMock.h deleted file mode 100644 index fd51267274f99a0c9949eaac6fdae2dff917c7a0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiCallbackApiMock.h +++ /dev/null @@ -1,32 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -// Provides data structures to mock CUPTI Callback API -#ifndef HAS_CUPTI - -enum CUpti_CallbackDomain { - CUPTI_CB_DOMAIN_RESOURCE, - CUPTI_CB_DOMAIN_RUNTIME_API, -}; -enum CUpti_CallbackId { - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000, - CUPTI_CBID_RESOURCE_CONTEXT_CREATED, - CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING, -}; - -using CUcontext = void*; - -struct CUpti_ResourceData { - CUcontext context; -}; - -constexpr int CUPTI_API_ENTER = 0; -constexpr int CUPTI_API_EXIT = 0; - -struct CUpti_CallbackData { - CUcontext context; - const char* symbolName; - int callbackSite; -}; -#endif // HAS_CUPTI diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.cpp deleted file mode 100644 index 7f1d48c1d00bb7defb6b622c13da55da99312a3b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.cpp +++ /dev/null @@ -1,112 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "CuptiEventApi.h" - -#include - -#include "Logger.h" -#include "cupti_call.h" - -using namespace std::chrono; -using std::vector; - -namespace KINETO_NAMESPACE { - -CuptiEventApi::CuptiEventApi(CUcontext context) - : context_(context) { - CUPTI_CALL(cuptiGetDeviceId(context_, (uint32_t*)&device_)); -} - -CUpti_EventGroupSets* CuptiEventApi::createGroupSets( - vector& ids) { - CUpti_EventGroupSets* group_sets = nullptr; - CUptiResult res = CUPTI_CALL(cuptiEventGroupSetsCreate( - context_, sizeof(CUpti_EventID) * ids.size(), ids.data(), &group_sets)); - - if (res != CUPTI_SUCCESS || group_sets == nullptr) { - const char* errstr = nullptr; - CUPTI_CALL(cuptiGetResultString(res, &errstr)); - throw std::system_error(EINVAL, std::generic_category(), errstr); - } - - return group_sets; -} - -void CuptiEventApi::destroyGroupSets(CUpti_EventGroupSets* sets) { - CUPTI_CALL(cuptiEventGroupSetsDestroy(sets)); -} - -bool CuptiEventApi::setContinuousMode() { - // Avoid logging noise for CUPTI_ERROR_LEGACY_PROFILER_NOT_SUPPORTED - CUptiResult res = CUPTI_CALL_NOWARN(cuptiSetEventCollectionMode( - context_, CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS)); - if (res == CUPTI_ERROR_LEGACY_PROFILER_NOT_SUPPORTED) { - return false; - } - // Log warning on other errors - CUPTI_CALL(res); - return (res == CUPTI_SUCCESS); -} - -void CuptiEventApi::enablePerInstance(CUpti_EventGroup eventGroup) { - uint32_t profile_all = 1; - CUPTI_CALL(cuptiEventGroupSetAttribute( - eventGroup, - CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES, - sizeof(profile_all), - &profile_all)); -} - -uint32_t CuptiEventApi::instanceCount(CUpti_EventGroup eventGroup) { - uint32_t instance_count = 0; - size_t s = sizeof(instance_count); - CUPTI_CALL(cuptiEventGroupGetAttribute( - eventGroup, CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT, &s, &instance_count)); - return instance_count; -} - -void CuptiEventApi::enableGroupSet(CUpti_EventGroupSet& set) { - CUptiResult res = 
CUPTI_CALL_NOWARN(cuptiEventGroupSetEnable(&set)); - if (res != CUPTI_SUCCESS) { - const char* errstr = nullptr; - CUPTI_CALL(cuptiGetResultString(res, &errstr)); - throw std::system_error(EIO, std::generic_category(), errstr); - } -} - -void CuptiEventApi::disableGroupSet(CUpti_EventGroupSet& set) { - CUPTI_CALL(cuptiEventGroupSetDisable(&set)); -} - -void CuptiEventApi::readEvent( - CUpti_EventGroup grp, - CUpti_EventID id, - vector& vals) { - size_t s = sizeof(int64_t) * vals.size(); - CUPTI_CALL(cuptiEventGroupReadEvent( - grp, - CUPTI_EVENT_READ_FLAG_NONE, - id, - &s, - reinterpret_cast(vals.data()))); -} - -vector CuptiEventApi::eventsInGroup(CUpti_EventGroup grp) { - uint32_t group_size = 0; - size_t s = sizeof(group_size); - CUPTI_CALL(cuptiEventGroupGetAttribute( - grp, CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS, &s, &group_size)); - size_t events_size = group_size * sizeof(CUpti_EventID); - vector res(group_size); - CUPTI_CALL(cuptiEventGroupGetAttribute( - grp, CUPTI_EVENT_GROUP_ATTR_EVENTS, &events_size, res.data())); - return res; -} - -CUpti_EventID CuptiEventApi::eventId(const std::string& name) { - CUpti_EventID id{0}; - CUPTI_CALL(cuptiEventGetIdFromName(device_, name.c_str(), &id)); - return id; -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.h b/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.h deleted file mode 100644 index 79610f93f0ecfa62a9508d4caddfa876518169d3..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiEventApi.h +++ /dev/null @@ -1,49 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include - -namespace KINETO_NAMESPACE { - -// C++ interface to CUPTI Events C API. -// Virtual methods are here mainly to allow easier testing. 
-class CuptiEventApi { - public: - explicit CuptiEventApi(CUcontext context_); - virtual ~CuptiEventApi() {} - - CUdevice device() { - return device_; - } - - virtual CUpti_EventGroupSets* createGroupSets( - std::vector& ids); - virtual void destroyGroupSets(CUpti_EventGroupSets* sets); - - virtual bool setContinuousMode(); - - virtual void enablePerInstance(CUpti_EventGroup eventGroup); - virtual uint32_t instanceCount(CUpti_EventGroup eventGroup); - - virtual void enableGroupSet(CUpti_EventGroupSet& set); - virtual void disableGroupSet(CUpti_EventGroupSet& set); - - virtual void - readEvent(CUpti_EventGroup g, CUpti_EventID id, std::vector& vals); - virtual std::vector eventsInGroup(CUpti_EventGroup g); - - virtual CUpti_EventID eventId(const std::string& name); - - protected: - // Unit testing - CuptiEventApi() : context_(nullptr), device_(0) {} - - private: - CUcontext context_; - CUdevice device_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.cpp deleted file mode 100644 index 36401e7434108d1da079aa4ba0264192c5d62838..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.cpp +++ /dev/null @@ -1,107 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "CuptiMetricApi.h" - -#include - -#include "Logger.h" -#include "cupti_call.h" - -using namespace std::chrono; -using std::vector; - -namespace KINETO_NAMESPACE { - -CUpti_MetricID CuptiMetricApi::idFromName(const std::string& name) { - CUpti_MetricID metric_id{~0u}; - CUptiResult res = - CUPTI_CALL(cuptiMetricGetIdFromName(device_, name.c_str(), &metric_id)); - if (res == CUPTI_ERROR_INVALID_METRIC_NAME) { - LOG(WARNING) << "Invalid metric name: " << name; - } - return metric_id; -} - -// Return a map of event IDs and names for a given metric id. -// Note that many events don't have a name. 
In that case the name will -// be set to the empty string. -std::map CuptiMetricApi::events( - CUpti_MetricID metric_id) { - uint32_t num_events = 0; - CUPTI_CALL(cuptiMetricGetNumEvents(metric_id, &num_events)); - vector ids(num_events); - size_t array_size = num_events * sizeof(CUpti_EventID); - CUPTI_CALL(cuptiMetricEnumEvents(metric_id, &array_size, ids.data())); - std::map res; - for (CUpti_EventID id : ids) { - // Attempt to lookup name from CUPTI - constexpr size_t kMaxEventNameLength = 64; - char cupti_name[kMaxEventNameLength]; - size_t size = kMaxEventNameLength; - CUPTI_CALL( - cuptiEventGetAttribute(id, CUPTI_EVENT_ATTR_NAME, &size, cupti_name)); - cupti_name[kMaxEventNameLength - 1] = 0; - - // CUPTI "helpfully" returns "event_name" when the event is unnamed. - if (size > 0 && strcmp(cupti_name, "event_name") != 0) { - res.emplace(id, cupti_name); - } else { - res.emplace(id, ""); - } - } - return res; -} - -CUpti_MetricValueKind CuptiMetricApi::valueKind(CUpti_MetricID metric) { - CUpti_MetricValueKind res{CUPTI_METRIC_VALUE_KIND_FORCE_INT}; - size_t value_kind_size = sizeof(res); - CUPTI_CALL(cuptiMetricGetAttribute( - metric, CUPTI_METRIC_ATTR_VALUE_KIND, &value_kind_size, &res)); - return res; -} - -CUpti_MetricEvaluationMode CuptiMetricApi::evaluationMode( - CUpti_MetricID metric) { - CUpti_MetricEvaluationMode eval_mode{ - CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE}; - size_t eval_mode_size = sizeof(eval_mode); - CUPTI_CALL(cuptiMetricGetAttribute( - metric, CUPTI_METRIC_ATTR_EVALUATION_MODE, &eval_mode_size, &eval_mode)); - return eval_mode; -} - -// FIXME: Consider caching value kind here -SampleValue CuptiMetricApi::calculate( - CUpti_MetricID metric, - CUpti_MetricValueKind kind, - vector& events, - vector& values, - int64_t duration) { - CUpti_MetricValue metric_value; - CUPTI_CALL(cuptiMetricGetValue( - device_, - metric, - events.size() * sizeof(CUpti_EventID), - events.data(), - values.size() * sizeof(int64_t), - 
reinterpret_cast(values.data()), - duration, - &metric_value)); - - switch (kind) { - case CUPTI_METRIC_VALUE_KIND_DOUBLE: - case CUPTI_METRIC_VALUE_KIND_PERCENT: - return SampleValue(metric_value.metricValueDouble); - case CUPTI_METRIC_VALUE_KIND_UINT64: - case CUPTI_METRIC_VALUE_KIND_INT64: - case CUPTI_METRIC_VALUE_KIND_THROUGHPUT: - return SampleValue(metric_value.metricValueUint64); - case CUPTI_METRIC_VALUE_KIND_UTILIZATION_LEVEL: - return SampleValue((int)metric_value.metricValueUtilizationLevel); - default: - assert(false); - } - return SampleValue(-1); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.h b/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.h deleted file mode 100644 index f45d38cd6169dc7fd30208dbb7dac09fd8a9dee5..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiMetricApi.h +++ /dev/null @@ -1,38 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -#include -#include - -#include "SampleListener.h" - -namespace KINETO_NAMESPACE { - -// C++ interface to CUPTI Metrics C API. -// Virtual methods are here mainly to allow easier testing. 
-class CuptiMetricApi { - public: - explicit CuptiMetricApi(CUdevice device) : device_(device) {} - virtual ~CuptiMetricApi() {} - - virtual CUpti_MetricID idFromName(const std::string& name); - virtual std::map events(CUpti_MetricID metric_id); - - virtual CUpti_MetricValueKind valueKind(CUpti_MetricID metric); - virtual CUpti_MetricEvaluationMode evaluationMode(CUpti_MetricID metric); - - virtual SampleValue calculate( - CUpti_MetricID metric, - CUpti_MetricValueKind kind, - std::vector& events, - std::vector& values, - int64_t duration); - - private: - CUdevice device_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.cpp deleted file mode 100644 index d1b08ab2c13d0615221e71f43f07c3d3fe102a2f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.cpp +++ /dev/null @@ -1,504 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#ifdef HAS_CUPTI -#include -#if defined(CUDART_VERSION) && CUDART_VERSION > 10000 && CUDART_VERSION < 11040 -#include -#include -#include -#endif // cuda version > 10.00 and < 11.04 -#endif // HAS_CUPTI - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ScopeExit.h" -#include "CuptiNvPerfMetric.h" -#include "Logger.h" - -namespace KINETO_NAMESPACE { - -// Add a namespace to isolate these utility functions that are only -// going to be used by the CuptiRangeProfiler. These included calls -// to NVIDIA PerfWorks APIs. -namespace nvperf { - - -// Largely based on NVIDIA sample code provided with CUDA release -// files Metric.cpp and Eval.cpp - -// ------------------------------------------------- -// Metric and Counter Data Configuration -// ------------------------------------------------- - - -// Note: Be carful before modifying the code below. 
There is a specific -// sequence one needs to follow to program the metrics else things may -// stop working. We tried to keep the flow consistent with the example -// code from NVIDIA. Since most of the programmability comes from -// the CUPTI profiler metric names this should be okay. - -// Only supported on CUDA RT Version between 10.0 and 11.04. -// After CUDA RT 11.04, the structure has changed. -// TODO update the structure NVPA_RawMetricsConfig to support 11.04 -#if defined(CUDART_VERSION) && CUDART_VERSION > 10000 && CUDART_VERSION < 11040 - -bool getRawMetricRequests( - NVPA_MetricsContext* metricsContext, - std::vector metricNames, - std::vector& rawMetricsDeps, - std::vector& rawMetricRequests) { - bool isolated = true; - /* Bug in collection with collection of metrics without instances, keep it - * to true*/ - bool keepInstances = true; - - for (const auto& metricName : metricNames) { - - NVPW_MetricsContext_GetMetricProperties_Begin_Params - getMetricPropertiesBeginParams = { - NVPW_MetricsContext_GetMetricProperties_Begin_Params_STRUCT_SIZE, nullptr}; - getMetricPropertiesBeginParams.pMetricsContext = metricsContext; - getMetricPropertiesBeginParams.pMetricName = metricName.c_str(); - - if (!NVPW_CALL( - NVPW_MetricsContext_GetMetricProperties_Begin( - &getMetricPropertiesBeginParams))) { - return false; - } - - for (const char** metricDepsIt = - getMetricPropertiesBeginParams.ppRawMetricDependencies; - *metricDepsIt; - ++metricDepsIt) { - rawMetricsDeps.push_back(*metricDepsIt); - } - - NVPW_MetricsContext_GetMetricProperties_End_Params - getMetricPropertiesEndParams = { - NVPW_MetricsContext_GetMetricProperties_End_Params_STRUCT_SIZE, nullptr}; - getMetricPropertiesEndParams.pMetricsContext = metricsContext; - - if (!NVPW_CALL(NVPW_MetricsContext_GetMetricProperties_End( - &getMetricPropertiesEndParams))) { - return false; - } - } - - for (const auto& rawMetricName : rawMetricsDeps) { - NVPA_RawMetricRequest metricRequest = 
{NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE, nullptr}; - metricRequest.pMetricName = rawMetricName.c_str(); - metricRequest.isolated = isolated; - metricRequest.keepInstances = keepInstances; - rawMetricRequests.push_back(metricRequest); - VLOG(1) << "Adding raw metric struct : raw metric = " << rawMetricName - << " isolated = " << isolated << " keepinst = " << keepInstances; - } - - if (rawMetricRequests.size() == 0) { - LOG(WARNING) << "CUPTI Profiler was unable to configure any metrics"; - return false; - } - return true; -} - -// Setup CUPTI Profiler Config Image -bool getProfilerConfigImage( - const std::string& chipName, - const std::vector& metricNames, - std::vector& configImage, - const uint8_t* counterAvailabilityImage) { - - NVPW_CUDA_MetricsContext_Create_Params metricsContextCreateParams = { - NVPW_CUDA_MetricsContext_Create_Params_STRUCT_SIZE, nullptr}; - metricsContextCreateParams.pChipName = chipName.c_str(); - - if (!NVPW_CALL( - NVPW_CUDA_MetricsContext_Create(&metricsContextCreateParams))) { - return false; - } - - NVPW_MetricsContext_Destroy_Params metricsContextDestroyParams = { - NVPW_MetricsContext_Destroy_Params_STRUCT_SIZE, nullptr}; - metricsContextDestroyParams.pMetricsContext = - metricsContextCreateParams.pMetricsContext; - - SCOPE_EXIT([&]() { - NVPW_MetricsContext_Destroy( - (NVPW_MetricsContext_Destroy_Params*)&metricsContextDestroyParams); - }); - - // Get all raw metrics required for given metricNames list - std::vector rawMetricRequests; - - // note: we need a variable at this functions scope to hold the string - // pointers for underlying C char arrays. 
- std::vector rawMetricDeps; - - if (!getRawMetricRequests( - metricsContextCreateParams.pMetricsContext, - metricNames, - rawMetricDeps, - rawMetricRequests)) { - return false; - } - - NVPA_RawMetricsConfigOptions metricsConfigOptions = { - NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE, nullptr}; - metricsConfigOptions.activityKind = NVPA_ACTIVITY_KIND_PROFILER; - metricsConfigOptions.pChipName = chipName.c_str(); - NVPA_RawMetricsConfig* rawMetricsConfig; - if (!NVPW_CALL( - NVPA_RawMetricsConfig_Create( - &metricsConfigOptions, &rawMetricsConfig))) { - return false; - } - - // TODO check if this is required - if (counterAvailabilityImage) { - NVPW_RawMetricsConfig_SetCounterAvailability_Params - setCounterAvailabilityParams = { - NVPW_RawMetricsConfig_SetCounterAvailability_Params_STRUCT_SIZE, nullptr}; - setCounterAvailabilityParams.pRawMetricsConfig = rawMetricsConfig; - setCounterAvailabilityParams.pCounterAvailabilityImage = - counterAvailabilityImage; - if (!NVPW_CALL( - NVPW_RawMetricsConfig_SetCounterAvailability( - &setCounterAvailabilityParams))) { - return false; - } - } - - NVPW_RawMetricsConfig_Destroy_Params rawMetricsConfigDestroyParams = { - NVPW_RawMetricsConfig_Destroy_Params_STRUCT_SIZE, nullptr}; - rawMetricsConfigDestroyParams.pRawMetricsConfig = rawMetricsConfig; - SCOPE_EXIT([&]() { - NVPW_RawMetricsConfig_Destroy( - (NVPW_RawMetricsConfig_Destroy_Params*)&rawMetricsConfigDestroyParams); - }); - - // Start a Raw Metric Pass group - NVPW_RawMetricsConfig_BeginPassGroup_Params beginPassGroupParams = { - NVPW_RawMetricsConfig_BeginPassGroup_Params_STRUCT_SIZE, nullptr}; - beginPassGroupParams.pRawMetricsConfig = rawMetricsConfig; - if (!NVPW_CALL( - NVPW_RawMetricsConfig_BeginPassGroup(&beginPassGroupParams))) { - return false; - } - - // Add all raw metrics - NVPW_RawMetricsConfig_AddMetrics_Params addMetricsParams = { - NVPW_RawMetricsConfig_AddMetrics_Params_STRUCT_SIZE, nullptr}; - addMetricsParams.pRawMetricsConfig = rawMetricsConfig; - 
addMetricsParams.pRawMetricRequests = rawMetricRequests.data(); - addMetricsParams.numMetricRequests = rawMetricRequests.size(); - if (!NVPW_CALL( - NVPW_RawMetricsConfig_AddMetrics(&addMetricsParams))) { - return false; - } - - // End pass group - NVPW_RawMetricsConfig_EndPassGroup_Params endPassGroupParams = { - NVPW_RawMetricsConfig_EndPassGroup_Params_STRUCT_SIZE, nullptr}; - endPassGroupParams.pRawMetricsConfig = rawMetricsConfig; - if (!NVPW_CALL( - NVPW_RawMetricsConfig_EndPassGroup(&endPassGroupParams))) { - return false; - } - - // Setup Config Image generation - NVPW_RawMetricsConfig_GenerateConfigImage_Params generateConfigImageParams = { - NVPW_RawMetricsConfig_GenerateConfigImage_Params_STRUCT_SIZE, nullptr}; - generateConfigImageParams.pRawMetricsConfig = rawMetricsConfig; - if (!NVPW_CALL( - NVPW_RawMetricsConfig_GenerateConfigImage(&generateConfigImageParams))) { - return false; - } - - // Get the Config Image size... nearly there - NVPW_RawMetricsConfig_GetConfigImage_Params getConfigImageParams = { - NVPW_RawMetricsConfig_GetConfigImage_Params_STRUCT_SIZE, nullptr}; - getConfigImageParams.pRawMetricsConfig = rawMetricsConfig; - getConfigImageParams.bytesAllocated = 0; - getConfigImageParams.pBuffer = nullptr; - if (!NVPW_CALL( - NVPW_RawMetricsConfig_GetConfigImage(&getConfigImageParams))) { - return false; - } - - configImage.resize(getConfigImageParams.bytesCopied); - - // Write the Config image binary - getConfigImageParams.bytesAllocated = configImage.size(); - getConfigImageParams.pBuffer = configImage.data(); - if (!NVPW_CALL( - NVPW_RawMetricsConfig_GetConfigImage(&getConfigImageParams))) { - return false; - } - - return true; -} - -bool getCounterDataPrefixImage( - const std::string& chipName, - const std::vector& metricNames, - std::vector& counterDataImagePrefix) { - - NVPW_CUDA_MetricsContext_Create_Params metricsContextCreateParams = { - NVPW_CUDA_MetricsContext_Create_Params_STRUCT_SIZE, nullptr}; - 
metricsContextCreateParams.pChipName = chipName.c_str(); - - if (!NVPW_CALL( - NVPW_CUDA_MetricsContext_Create(&metricsContextCreateParams))) { - return false; - } - - NVPW_MetricsContext_Destroy_Params metricsContextDestroyParams = { - NVPW_MetricsContext_Destroy_Params_STRUCT_SIZE, nullptr}; - metricsContextDestroyParams.pMetricsContext = - metricsContextCreateParams.pMetricsContext; - - - SCOPE_EXIT([&]() { - NVPW_MetricsContext_Destroy( - (NVPW_MetricsContext_Destroy_Params*)&metricsContextDestroyParams); - }); - - // Get all raw metrics required for given metricNames list - std::vector rawMetricRequests; - - // note: we need a variable at this functions scope to hold the string - // pointers for underlying C char arrays. - std::vector rawMetricDeps; - - if (!getRawMetricRequests( - metricsContextCreateParams.pMetricsContext, - metricNames, - rawMetricDeps, - rawMetricRequests)) { - return false; - } - - // Setup Counter Data builder - NVPW_CounterDataBuilder_Create_Params counterDataBuilderCreateParams = { - NVPW_CounterDataBuilder_Create_Params_STRUCT_SIZE, nullptr}; - counterDataBuilderCreateParams.pChipName = chipName.c_str(); - if (!NVPW_CALL( - NVPW_CounterDataBuilder_Create(&counterDataBuilderCreateParams))) { - return false; - } - - NVPW_CounterDataBuilder_Destroy_Params counterDataBuilderDestroyParams = { - NVPW_CounterDataBuilder_Destroy_Params_STRUCT_SIZE, nullptr}; - counterDataBuilderDestroyParams.pCounterDataBuilder = - counterDataBuilderCreateParams.pCounterDataBuilder; - SCOPE_EXIT([&]() { - NVPW_CounterDataBuilder_Destroy(( - NVPW_CounterDataBuilder_Destroy_Params*)&counterDataBuilderDestroyParams); - }); - - // Add metrics to counter data image prefix - NVPW_CounterDataBuilder_AddMetrics_Params addMetricsParams = { - NVPW_CounterDataBuilder_AddMetrics_Params_STRUCT_SIZE, nullptr}; - addMetricsParams.pCounterDataBuilder = - counterDataBuilderCreateParams.pCounterDataBuilder; - addMetricsParams.pRawMetricRequests = rawMetricRequests.data(); - 
addMetricsParams.numMetricRequests = rawMetricRequests.size(); - if (!NVPW_CALL( - NVPW_CounterDataBuilder_AddMetrics(&addMetricsParams))) { - return false; - } - - // Get image prefix size - NVPW_CounterDataBuilder_GetCounterDataPrefix_Params - getCounterDataPrefixParams = { - NVPW_CounterDataBuilder_GetCounterDataPrefix_Params_STRUCT_SIZE, nullptr}; - getCounterDataPrefixParams.pCounterDataBuilder = - counterDataBuilderCreateParams.pCounterDataBuilder; - getCounterDataPrefixParams.bytesAllocated = 0; - getCounterDataPrefixParams.pBuffer = nullptr; - if (!NVPW_CALL( - NVPW_CounterDataBuilder_GetCounterDataPrefix( - &getCounterDataPrefixParams))) { - return false; - } - - counterDataImagePrefix.resize(getCounterDataPrefixParams.bytesCopied); - - // Now write counter data image prefix - getCounterDataPrefixParams.bytesAllocated = counterDataImagePrefix.size(); - getCounterDataPrefixParams.pBuffer = counterDataImagePrefix.data(); - if (!NVPW_CALL( - NVPW_CounterDataBuilder_GetCounterDataPrefix( - &getCounterDataPrefixParams))) { - return false; - } - - return true; -} - -// ------------------------------------------------- -// Metric and Counter Evaluation Utilities -// ------------------------------------------------- - -std::string getRangeDescription( - const std::vector& counterDataImage, - int rangeIndex) { - std::vector descriptionPtrs; - - NVPW_Profiler_CounterData_GetRangeDescriptions_Params getRangeDescParams = { - NVPW_Profiler_CounterData_GetRangeDescriptions_Params_STRUCT_SIZE, nullptr}; - getRangeDescParams.pCounterDataImage = counterDataImage.data(); - getRangeDescParams.rangeIndex = rangeIndex; - - if (!NVPW_CALL( - NVPW_Profiler_CounterData_GetRangeDescriptions(&getRangeDescParams))) { - return ""; - } - - descriptionPtrs.resize(getRangeDescParams.numDescriptions); - getRangeDescParams.ppDescriptions = descriptionPtrs.data(); - - if (!NVPW_CALL( - NVPW_Profiler_CounterData_GetRangeDescriptions(&getRangeDescParams))) { - return ""; - } - - std::string 
rangeName; - - for (size_t i = 0; i < getRangeDescParams.numDescriptions; i++) { - if (i > 0) { - rangeName.append("/"); - } - rangeName.append(descriptionPtrs[i]); - } - return rangeName; -} - -CuptiProfilerResult evalMetricValues( - const std::string& chipName, - const std::vector& counterDataImage, - const std::vector& metricNames, - bool verbose) { - - if (!counterDataImage.size()) { - LOG(ERROR) << "Counter Data Image is empty!"; - return {}; - } - - NVPW_CUDA_MetricsContext_Create_Params metricsContextCreateParams = { - NVPW_CUDA_MetricsContext_Create_Params_STRUCT_SIZE, nullptr}; - metricsContextCreateParams.pChipName = chipName.c_str(); - if (!NVPW_CALL( - NVPW_CUDA_MetricsContext_Create(&metricsContextCreateParams))) { - return {}; - } - - NVPW_MetricsContext_Destroy_Params metricsContextDestroyParams = { - NVPW_MetricsContext_Destroy_Params_STRUCT_SIZE, nullptr}; - metricsContextDestroyParams.pMetricsContext = - metricsContextCreateParams.pMetricsContext; - SCOPE_EXIT([&]() { - NVPW_MetricsContext_Destroy( - (NVPW_MetricsContext_Destroy_Params*)&metricsContextDestroyParams); - }); - - NVPW_CounterData_GetNumRanges_Params getNumRangesParams = { - NVPW_CounterData_GetNumRanges_Params_STRUCT_SIZE, nullptr}; - getNumRangesParams.pCounterDataImage = counterDataImage.data(); - if (!NVPW_CALL( - NVPW_CounterData_GetNumRanges(&getNumRangesParams))) { - return {}; - } - - // TBD in the future support special chars in metric name - // for now these are default - const bool isolated = true; - - // API takes a 2D array of chars - std::vector metricNamePtrs; - - for (const auto& metric : metricNames) { - metricNamePtrs.push_back(metric.c_str()); - } - - CuptiProfilerResult result{ - .metricNames = metricNames}; - - for (size_t rangeIndex = 0; rangeIndex < getNumRangesParams.numRanges; - ++rangeIndex) { - - CuptiRangeMeasurement rangeData { - .rangeName = getRangeDescription(counterDataImage, rangeIndex)}; - rangeData.values.resize(metricNames.size()); - - // First set 
Counter data image with current range - NVPW_MetricsContext_SetCounterData_Params setCounterDataParams = { - NVPW_MetricsContext_SetCounterData_Params_STRUCT_SIZE, nullptr}; - - setCounterDataParams.pMetricsContext = - metricsContextCreateParams.pMetricsContext; - setCounterDataParams.pCounterDataImage = counterDataImage.data(); - setCounterDataParams.isolated = isolated; - setCounterDataParams.rangeIndex = rangeIndex; - - NVPW_CALL(NVPW_MetricsContext_SetCounterData(&setCounterDataParams)); - - - // Now we can evaluate GPU metrics - NVPW_MetricsContext_EvaluateToGpuValues_Params evalToGpuParams = { - NVPW_MetricsContext_EvaluateToGpuValues_Params_STRUCT_SIZE, nullptr}; - evalToGpuParams.pMetricsContext = - metricsContextCreateParams.pMetricsContext; - evalToGpuParams.numMetrics = metricNamePtrs.size(); - evalToGpuParams.ppMetricNames = metricNamePtrs.data(); - evalToGpuParams.pMetricValues = rangeData.values.data(); - - if (!NVPW_CALL(NVPW_MetricsContext_EvaluateToGpuValues(&evalToGpuParams))) { - LOG(WARNING) << "Failed to evaluate metris for range : " - << rangeData.rangeName; - continue; - } - - if (verbose) { - for (size_t i = 0; i < metricNames.size(); i++) { - LOG(INFO) << "rangeName: " << rangeData.rangeName - << "\tmetricName: " << metricNames[i] - << "\tgpuValue: " << rangeData.values[i]; - } - } - - result.rangeVals.emplace_back(std::move(rangeData)); - } - - return result; -} - -#else - -bool getProfilerConfigImage( - const std::string& /*chipName*/, - const std::vector& /*metricNames*/, - std::vector& /*configImage*/, - const uint8_t* /*counterAvailabilityImage*/) { - return false; -} - -bool getCounterDataPrefixImage( - const std::string& /*chipName*/, - const std::vector& /*metricNames*/, - std::vector& /*counterDataImagePrefix*/) { - return false; -} - -CuptiProfilerResult evalMetricValues( - const std::string& /*chipName*/, - const std::vector& /*counterDataImage*/, - const std::vector& /*metricNames*/, - bool /*verbose*/) { - return {}; -} - 
-#endif // cuda version > 10.00 and < 11.04 - -} // namespace nvperf -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.h b/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.h deleted file mode 100644 index d5dd1b1c1d20b066891f8be679e6d6371d4f4a9b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiNvPerfMetric.h +++ /dev/null @@ -1,71 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "Logger.h" - -namespace KINETO_NAMESPACE { - -struct CuptiRangeMeasurement { - std::string rangeName; - std::vector values; -}; - -struct CuptiProfilerResult { - std::vector metricNames; - // rangeName, list values - std::vector rangeVals; -}; - -/* Utilities for CUPTI and NVIDIA PerfWorks Metric API - */ - -#define NVPW_CALL(call) \ - [&]() -> bool { \ - NVPA_Status _status_ = call; \ - if (_status_ != NVPA_STATUS_SUCCESS) { \ - LOG(WARNING) << fmt::format( \ - "function {} failed with error ({})", \ - #call, \ - (int)_status_); \ - return false; \ - } \ - return true; \ - }() - -// fixme - add a results string -// nvpperfGetResultString(_status_, &_errstr_); - -namespace nvperf { - -// Setup CUPTI profiler configuration blob and counter data image prefix -bool getProfilerConfigImage( - const std::string& chipName, - const std::vector& metricNames, - std::vector& configImage, - const uint8_t* counterAvailabilityImage = nullptr); - -// Setup CUPTI profiler configuration blob and counter data image prefix -bool getCounterDataPrefixImage( - const std::string& chipName, - const std::vector& metricNames, - std::vector& counterDataImagePrefix); - -/* NV Perf Metric Evaluation helpers - * - utilities to read binary data and obtain metrics for ranges - */ -CuptiProfilerResult evalMetricValues( - const std::string& 
chipName, - const std::vector& counterDataImage, - const std::vector& metricNames, - bool verbose = false); - - -} // namespace nvperf -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.cpp deleted file mode 100644 index e5f18ed7b0b70963eb2deab126ff4f7119ed582b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.cpp +++ /dev/null @@ -1,751 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include -#ifdef HAS_CUPTI -#include -#include -#endif // HAS_CUPTI -#include -#include - -#ifdef HAS_CUPTI -#include "cupti_call.h" -#endif - -#include "time_since_epoch.h" -#include "Logger.h" -#include "Demangle.h" - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "CuptiCallbackApiMock.h" -#include "CuptiRangeProfilerApi.h" - -#if HAS_CUPTI_RANGE_PROFILER -#include -#include -#include "cupti_call.h" -#endif // HAS_CUPTI_RANGE_PROFILER - -namespace KINETO_NAMESPACE { - -#if HAS_CUPTI_RANGE_PROFILER -constexpr char kRootUserRangeName[] = "__profile__"; -constexpr int kCallbacksCountToFlush = 500; - -// Should we set Counter availability image ourselves? 
-// Disabled this right now as this call conflicts with DCGM -// It is not clear why it should conflict except it being a profiler API call -// TODO Revisit -constexpr bool kSetCounterAvail = false; - -// Shared state to track one Cupti Profiler API per Device -namespace { -// per device profiler maps -std::unordered_map profiler_map; -std::unordered_map enable_flag; -std::unordered_map disable_flag; - -std::mutex contextMutex_; -std::unordered_map ctx_to_dev; -std::set active_devices; -} - -// forward declarations -void __trackCudaCtx(CUcontext ctx, uint32_t device_id, CUpti_CallbackId cbid); -void __trackCudaKernelLaunch(CUcontext ctx, const char* kernelName); - -/// Helper functions - -// Available raw counters -std::vector getCounterAvailiability(CUcontext cuContext) { - std::vector counterAvailabilityImage; - CUpti_Profiler_GetCounterAvailability_Params getCounterAvailabilityParams = { - CUpti_Profiler_GetCounterAvailability_Params_STRUCT_SIZE, nullptr}; - getCounterAvailabilityParams.ctx = cuContext; - CUPTI_CALL( - cuptiProfilerGetCounterAvailability(&getCounterAvailabilityParams)); - - counterAvailabilityImage.clear(); - counterAvailabilityImage.resize( - getCounterAvailabilityParams.counterAvailabilityImageSize); - - getCounterAvailabilityParams.pCounterAvailabilityImage = - counterAvailabilityImage.data(); - CUPTI_CALL( - cuptiProfilerGetCounterAvailability(&getCounterAvailabilityParams)); - - return counterAvailabilityImage; -} - -std::string getChipName(int deviceId) { - // Get chip name for the cuda device - CUpti_Device_GetChipName_Params getChipNameParams = { - CUpti_Device_GetChipName_Params_STRUCT_SIZE, nullptr}; - - getChipNameParams.deviceIndex = deviceId; - CUPTI_CALL(cuptiDeviceGetChipName(&getChipNameParams)); - - return getChipNameParams.pChipName; -} - -inline uint32_t getDevID(CUcontext ctx) { - uint32_t device_id = UINT32_MAX; - CUPTI_CALL(cuptiGetDeviceId(ctx, &device_id)); - if (device_id == UINT32_MAX) { - LOG(ERROR) << "Could not 
determine dev id for = " << ctx; - } - return device_id; -} - -// We use CUPTI Callback functions in three ways : -// 1. Track cuda contexts and maintain a list of active GPUs to profile -// 2. Callbacks on kernel launches to track the name of automatic -// ranges that correspond to names of kernels -// 3. Lastly CUPTI profiler has to be enabled on the same thread executing -// the CUDA kernels. We use Callbacks to enable the profiler -// asynchronously from another thread. - -void disableKernelCallbacks(); - -void trackCudaCtx( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo) { - auto *d = reinterpret_cast(cbInfo); - auto ctx = d->context; - uint32_t device_id = getDevID(ctx); - - if (device_id == UINT32_MAX) { - return; - } - - __trackCudaCtx(ctx, device_id, cbid); -} - -void __trackCudaCtx(CUcontext ctx, uint32_t device_id, CUpti_CallbackId cbid) { - std::lock_guard g(contextMutex_); - if (cbid == CUPTI_CBID_RESOURCE_CONTEXT_CREATED) { - VLOG(0) << "CUPTI Profiler observed CUDA Context created = " - << ctx << " device id = " << device_id; - active_devices.insert(device_id); - if constexpr (kSetCounterAvail) { - if (active_devices.size() == 1) { - CuptiRBProfilerSession::setCounterAvailabilityImage( - getCounterAvailiability(ctx)); - } - } - ctx_to_dev[ctx] = device_id; - - } else if (cbid == CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING) { - VLOG(0) << "CUPTI Profiler observed CUDA Context destroyed = " - << ctx << " device id = " << device_id; - auto it = active_devices.find(device_id); - if (it != active_devices.end()) { - active_devices.erase(it); - ctx_to_dev.erase(ctx); - } - } -} - -void trackCudaKernelLaunch( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId /*cbid*/, - const CUpti_CallbackData* cbInfo) { - VLOG(1) << " Trace : Callback name = " - << (cbInfo->symbolName ? 
cbInfo->symbolName: "") - << " context ptr = " << cbInfo->context; - auto ctx = cbInfo->context; - // should be in CUPTI_API_ENTER call site - if (cbInfo->callbackSite != CUPTI_API_ENTER) { - return; - } - __trackCudaKernelLaunch(ctx, cbInfo->symbolName); -} - -void __trackCudaKernelLaunch( - CUcontext ctx, - const char* kernelName) { - VLOG(0) << " Tracking kernel name = " << (kernelName ? kernelName : "") - << " context ptr = " << ctx; - - uint32_t device_id = 0; - auto it = ctx_to_dev.find(ctx); - if (it == ctx_to_dev.end()) { - // Warning here could be too noisy - VLOG(0) << " Could not find corresponding device to ctx = " << ctx; - return; - } else { - device_id = it->second; - } - - auto pit = profiler_map.find(device_id); - if (pit == profiler_map.end() || pit->second == nullptr) { - return; - } - auto profiler = pit->second; - - if (enable_flag[device_id]) { - LOG(INFO) << "Callback handler is enabling cupti profiler"; - profiler->startAndEnable(); - enable_flag[device_id] = false; - - } else if (disable_flag[device_id]) { - LOG(INFO) << "Callback handler is disabling cupti profiler"; - profiler->disableAndStop(); - return; - } - - if (profiler->curRange_ == CUPTI_AutoRange) { - profiler->logKernelName(kernelName ? 
kernelName : "__missing__"); - } - - /* TODO add per kernel time logging - if (measure_per_kernel) { - profiler->kernelStartTs_.push_back( - std::chrono::high_resolution_clock::now()); - } - */ - - // periodically flush profiler data from GPU - if (profiler->numCallbacks_ % kCallbacksCountToFlush == 0) { - profiler->flushCounterData(); - } - profiler->numCallbacks_++; -} - -void enableKernelCallbacks() { - auto& cbapi = CuptiCallbackApi::singleton(); - bool status = cbapi.enableCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000); - if (!status) { - LOG(WARNING) << "CUPTI Range Profiler unable to " - << "enable cuda kernel launch callback"; - return; - } - LOG(INFO) << "CUPTI Profiler kernel callbacks enabled"; -} - -void disableKernelCallbacks() { - auto& cbapi = CuptiCallbackApi::singleton(); - bool status = cbapi.disableCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000); - if (!status) { - LOG(WARNING) << "CUPTI Range Profiler unable to " - << "disable cuda kernel launch callback"; - return; - } - LOG(INFO) << "CUPTI Profiler kernel callbacks disabled"; -} - -// static -std::set CuptiRBProfilerSession::getActiveDevices() { - std::lock_guard g(contextMutex_); - return active_devices; -} - -// static -void CuptiRBProfilerSession::initCupti() { - CUpti_Profiler_Initialize_Params profilerInitializeParams = { - CUpti_Profiler_Initialize_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerInitialize(&profilerInitializeParams)); -} - -// static -void CuptiRBProfilerSession::deInitCupti() { - CUpti_Profiler_DeInitialize_Params profilerDeInitializeParams = { - CUpti_Profiler_DeInitialize_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerDeInitialize(&profilerDeInitializeParams)); -} - -// static -void CuptiRBProfilerSession::staticInit() { - CuptiRBProfilerSession::initCupti(); - - // Register CUPTI callbacks - auto& cbapi = CuptiCallbackApi::singleton(); - CUpti_CallbackDomain 
domain = CUPTI_CB_DOMAIN_RESOURCE; - bool status = cbapi.registerCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_CREATED, trackCudaCtx); - status = status && cbapi.registerCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_DESTROYED, trackCudaCtx); - status = status && cbapi.enableCallback( - domain, CUPTI_CBID_RESOURCE_CONTEXT_CREATED); - status = status && cbapi.enableCallback( - domain, CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING); - - if (!status) { - LOG(WARNING) << "CUPTI Range Profiler unable to attach cuda context " - << "create and destroy callbacks"; - CUPTI_CALL(cbapi.getCuptiStatus()); - return; - } - - domain = CUPTI_CB_DOMAIN_RUNTIME_API; - status = cbapi.registerCallback( - domain, CuptiCallbackApi::CUDA_LAUNCH_KERNEL, trackCudaKernelLaunch); - - if (!status) { - LOG(WARNING) << "CUPTI Range Profiler unable to attach cuda kernel " - << "launch callback"; - return; - } -} - -// static -std::vector& CuptiRBProfilerSession::counterAvailabilityImage() { - static std::vector counterAvailabilityImage_; - return counterAvailabilityImage_; -} - - -// Setup the profiler sessions -CuptiRBProfilerSession::CuptiRBProfilerSession( - const std::vector& metricNames, - int deviceId, - int maxRanges, - int numNestingLevels, - CUcontext cuContext) - : metricNames_(metricNames), - chipName_(getChipName(deviceId)), - deviceId_(deviceId), - maxRanges_(maxRanges), - numNestingLevels_(numNestingLevels), - cuContext_(cuContext) { - CuptiRBProfilerSession::initCupti(); - - LOG(INFO) << "Initializing CUPTI profiler session : device = " << deviceId - << " chip = " << chipName_; - /* Generate configuration for metrics, this can also be done offline*/ - NVPW_InitializeHost_Params initializeHostParams = { - NVPW_InitializeHost_Params_STRUCT_SIZE, nullptr}; - NVPW_CALL(NVPW_InitializeHost(&initializeHostParams)); - - if (metricNames.size()) { - if (!nvperf::getProfilerConfigImage( - chipName_, - metricNames, - configImage, - 
CuptiRBProfilerSession::counterAvailabilityImage().data())) { - LOG(ERROR) << "Failed to create configImage or counterDataImagePrefix"; - return; - } - if (!nvperf::getCounterDataPrefixImage( - chipName_, - metricNames, - counterDataImagePrefix)) { - LOG(ERROR) << "Failed to create counterDataImagePrefix"; - return; - } - } else { - LOG(ERROR) << "No metrics provided to profile"; - return; - } - - if (!createCounterDataImage()) { - LOG(ERROR) << "Failed to create counterDataImage"; - return; - } - - LOG(INFO) << "Size of structs\n" - << " config image size = " << configImage.size() << " B" - << " counter data image prefix = " - << counterDataImagePrefix.size() << " B" - << " counter data image size = " << counterDataImage.size() / 1024 - << " KB" - << " counter sb image size = " - << counterDataScratchBuffer.size() << " B"; - - beginPassParams_ = {CUpti_Profiler_BeginPass_Params_STRUCT_SIZE, nullptr}; - endPassParams_ = {CUpti_Profiler_EndPass_Params_STRUCT_SIZE, nullptr}; - - initSuccess_ = true; - profiler_map[deviceId] = this; -} - -// used in unittests only -CuptiRBProfilerSession::CuptiRBProfilerSession(int deviceId, CUcontext ctx) - : deviceId_(deviceId), cuContext_(ctx) { - initSuccess_ = true; - profiler_map[deviceId] = this; -} - -void CuptiRBProfilerSession::startInternal( - CUpti_ProfilerRange profilerRange, - CUpti_ProfilerReplayMode profilerReplayMode) { - LOG(INFO) << "Starting profiler session: profiler range = " - << ((profilerRange == CUPTI_AutoRange) ? "autorange" : "userrange") - << " replay mode = " - << ((profilerReplayMode == CUPTI_KernelReplay) ? 
"kernel" : "user"); - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return; - } - - if (cuContext_ == nullptr) { - for (const auto& it : ctx_to_dev) { - if (it.second == deviceId_) { - cuContext_ = it.first; - break; - } - } - LOG(INFO) << " Cupti Profiler using CUDA context = " << cuContext_; - } - - profilerStartTs_ = std::chrono::high_resolution_clock::now(); - curRange_ = profilerRange; - curReplay_ = profilerReplayMode; - - CUpti_Profiler_BeginSession_Params beginSessionParams = { - CUpti_Profiler_BeginSession_Params_STRUCT_SIZE, nullptr}; - - beginSessionParams.ctx = cuContext_; - beginSessionParams.counterDataImageSize = counterDataImage.size(); - beginSessionParams.pCounterDataImage = counterDataImage.data(); - beginSessionParams.counterDataScratchBufferSize = - counterDataScratchBuffer.size(); - beginSessionParams.pCounterDataScratchBuffer = counterDataScratchBuffer.data(); - beginSessionParams.range = profilerRange; - beginSessionParams.replayMode = profilerReplayMode; - beginSessionParams.maxRangesPerPass = maxRanges_; - beginSessionParams.maxLaunchesPerPass = maxRanges_; - - auto status = CUPTI_CALL(cuptiProfilerBeginSession(&beginSessionParams)); - if (status != CUPTI_SUCCESS) { - LOG(WARNING) << "Failed to start CUPTI profiler"; - initSuccess_ = false; - return; - } - - // Set counter configuration - CUpti_Profiler_SetConfig_Params setConfigParams = { - CUpti_Profiler_SetConfig_Params_STRUCT_SIZE, nullptr}; - - setConfigParams.ctx = cuContext_; - setConfigParams.pConfig = configImage.data(); - setConfigParams.configSize = configImage.size(); - setConfigParams.passIndex = 0; - setConfigParams.minNestingLevel = 1; - setConfigParams.numNestingLevels = numNestingLevels_; - status = CUPTI_CALL(cuptiProfilerSetConfig(&setConfigParams)); - - if (status != CUPTI_SUCCESS) { - LOG(WARNING) << "Failed to configure CUPTI profiler"; - initSuccess_ = false; - return; - } - profilerInitDoneTs_ = 
std::chrono::high_resolution_clock::now(); - - if (curRange_ == CUPTI_AutoRange) { - enableKernelCallbacks(); - } - profilingActive_ = true; -} - -void CuptiRBProfilerSession::stop() { - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return; - } - LOG(INFO) << "Stop profiler session on device = " << deviceId_; - - CUpti_Profiler_UnsetConfig_Params unsetConfigParams = { - CUpti_Profiler_UnsetConfig_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerUnsetConfig(&unsetConfigParams)); - - CUpti_Profiler_EndSession_Params endSessionParams = { - CUpti_Profiler_EndSession_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerEndSession(&endSessionParams)); - - disableKernelCallbacks(); - - profilerStopTs_ = std::chrono::high_resolution_clock::now(); - profilingActive_ = false; -} - -void CuptiRBProfilerSession::beginPass() { - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return; - } - CUPTI_CALL(cuptiProfilerBeginPass(&beginPassParams_)); -} - -bool CuptiRBProfilerSession::endPass() { - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return true; - } - CUPTI_CALL(cuptiProfilerEndPass(&endPassParams_)); - return endPassParams_.allPassesSubmitted; -} - -void CuptiRBProfilerSession::flushCounterData() { - LOG(INFO) << "Flushing counter data on device = " << deviceId_; - CUpti_Profiler_FlushCounterData_Params flushCounterDataParams = { - CUpti_Profiler_FlushCounterData_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerFlushCounterData(&flushCounterDataParams)); -} - -/// Enable and disable the profiler -void CuptiRBProfilerSession::enable() { - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return; - } - CUpti_Profiler_EnableProfiling_Params enableProfilingParams = { - CUpti_Profiler_EnableProfiling_Params_STRUCT_SIZE, nullptr}; - 
CUPTI_CALL(cuptiProfilerEnableProfiling(&enableProfilingParams)); -} - -void CuptiRBProfilerSession::disable() { - if (!initSuccess_) { - LOG(WARNING) << __func__ << "() bailing out since initialization failed"; - return; - } - CUpti_Profiler_DisableProfiling_Params disableProfilingParams = { - CUpti_Profiler_DisableProfiling_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerDisableProfiling(&disableProfilingParams)); -} - -/// User range based profiling -void CuptiRBProfilerSession::pushRange(const std::string& rangeName) { - LOG(INFO) << " CUPTI pushrange ( " << rangeName << " )"; - CUpti_Profiler_PushRange_Params pushRangeParams = { - CUpti_Profiler_PushRange_Params_STRUCT_SIZE, nullptr}; - pushRangeParams.pRangeName = rangeName.c_str(); - CUPTI_CALL(cuptiProfilerPushRange(&pushRangeParams)); -} - -void CuptiRBProfilerSession::popRange() { - LOG(INFO) << " CUPTI pop range"; - CUpti_Profiler_PopRange_Params popRangeParams = { - CUpti_Profiler_PopRange_Params_STRUCT_SIZE, nullptr}; - CUPTI_CALL(cuptiProfilerPopRange(&popRangeParams)); -} - -void CuptiRBProfilerSession::startAndEnable() { - startInternal(curRange_, curReplay_); - if (curReplay_ == CUPTI_UserReplay) { - beginPass(); - } - enable(); - if (curRange_ == CUPTI_UserRange) { - pushRange(kRootUserRangeName); - } - enable_flag[deviceId_] = false; -} - -void CuptiRBProfilerSession::disableAndStop() { - if (curRange_ == CUPTI_UserRange) { - popRange(); - } - disable(); - if (curReplay_ == CUPTI_UserReplay) { - endPass(); - flushCounterData(); - } - stop(); - disable_flag[deviceId_] = false; -} - -void CuptiRBProfilerSession::asyncStartAndEnable( - CUpti_ProfilerRange profilerRange, - CUpti_ProfilerReplayMode profilerReplayMode) { - LOG(INFO) << "Starting CUPTI profiler asynchronously on device = " - << deviceId_ << " profiler range = " - << ((profilerRange == CUPTI_AutoRange) ? "autorange" : "userrange") - << " replay mode = " - << ((profilerReplayMode == CUPTI_KernelReplay) ? 
"kernel" : "user"); - curReplay_ = profilerReplayMode; - curRange_ = profilerRange; - enable_flag[deviceId_] = true; - enableKernelCallbacks(); -} - -void CuptiRBProfilerSession::asyncDisableAndStop() { - LOG(INFO) << "Stopping CUPTI profiler asynchronously on device = " - << deviceId_ << " cu context = " << cuContext_; - disable_flag[deviceId_] = true; -} - - -CuptiProfilerResult CuptiRBProfilerSession::evaluateMetrics( - bool verbose) { - if (!initSuccess_) { - LOG(WARNING) << "Profiling failed, no results to return"; - return {}; - } - if (profilingActive_) { - disableAndStop(); - } - - LOG(INFO) << "Total kernels logged = " << kernelNames_.size(); - if (verbose) { - for (const auto& kernel : kernelNames_) { - std::cout << demangle(kernel) << std::endl; - } - LOG(INFO) << "Profiler Range data : "; - } - - auto results = nvperf::evalMetricValues( - chipName_, counterDataImage, metricNames_, verbose /*verbose*/); - - // profiler end-end duration - auto duration_ms = std::chrono::duration_cast( - profilerStopTs_ - profilerStartTs_); - - auto init_dur_ms = std::chrono::duration_cast( - profilerInitDoneTs_ - profilerStartTs_); - LOG(INFO) << "Total profiler time = " << duration_ms.count() << " ms"; - LOG(INFO) << "Total profiler init time = " << init_dur_ms.count() << " ms"; - - return results; -} - -std::unique_ptr CuptiRBProfilerSession::getProfilerTraceSpan() { - return std::make_unique( - timeSinceEpoch(profilerStartTs_), - timeSinceEpoch(profilerStopTs_), - "__cupti_profiler__" - ); -} - -void CuptiRBProfilerSession::saveCounterData( - const std::string& /*CounterDataFileName*/, - const std::string& /*CounterDataSBFileName*/) { - /* TBD write binary files for counter data and counter scratch buffer */ -} - -/// Setup counter data -bool CuptiRBProfilerSession::createCounterDataImage() { - CUpti_Profiler_CounterDataImageOptions counterDataImageOptions; - counterDataImageOptions.pCounterDataPrefix = counterDataImagePrefix.data(); - 
counterDataImageOptions.counterDataPrefixSize = counterDataImagePrefix.size(); - counterDataImageOptions.maxNumRanges = maxRanges_; - counterDataImageOptions.maxNumRangeTreeNodes = maxRanges_; - counterDataImageOptions.maxRangeNameLength = 64; - - // Calculate size of counter data image - CUpti_Profiler_CounterDataImage_CalculateSize_Params calculateSizeParams = { - CUpti_Profiler_CounterDataImage_CalculateSize_Params_STRUCT_SIZE, nullptr}; - calculateSizeParams.pOptions = &counterDataImageOptions; - calculateSizeParams.sizeofCounterDataImageOptions = - CUpti_Profiler_CounterDataImageOptions_STRUCT_SIZE; - - CUPTI_CALL( - cuptiProfilerCounterDataImageCalculateSize(&calculateSizeParams)); - counterDataImage.resize(calculateSizeParams.counterDataImageSize); - - // Initialize counter data image - CUpti_Profiler_CounterDataImage_Initialize_Params initializeParams = { - CUpti_Profiler_CounterDataImage_Initialize_Params_STRUCT_SIZE, nullptr}; - initializeParams.sizeofCounterDataImageOptions = - CUpti_Profiler_CounterDataImageOptions_STRUCT_SIZE; - initializeParams.pOptions = &counterDataImageOptions; - initializeParams.counterDataImageSize = - calculateSizeParams.counterDataImageSize; - initializeParams.pCounterDataImage = counterDataImage.data(); - CUPTI_CALL(cuptiProfilerCounterDataImageInitialize(&initializeParams)); - - // Calculate counter Scratch Buffer size - CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params - scratchBufferSizeParams = { - CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params_STRUCT_SIZE, nullptr}; - - scratchBufferSizeParams.counterDataImageSize = - calculateSizeParams.counterDataImageSize; - scratchBufferSizeParams.pCounterDataImage = - initializeParams.pCounterDataImage; - CUPTI_CALL(cuptiProfilerCounterDataImageCalculateScratchBufferSize( - &scratchBufferSizeParams)); - - counterDataScratchBuffer.resize( - scratchBufferSizeParams.counterDataScratchBufferSize); - - // Initialize scratch buffer - 
CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params - initScratchBufferParams = { - CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params_STRUCT_SIZE, nullptr}; - - initScratchBufferParams.counterDataImageSize = - calculateSizeParams.counterDataImageSize; - - initScratchBufferParams.pCounterDataImage = - initializeParams.pCounterDataImage; - initScratchBufferParams.counterDataScratchBufferSize = - scratchBufferSizeParams.counterDataScratchBufferSize; - initScratchBufferParams.pCounterDataScratchBuffer = - counterDataScratchBuffer.data(); - - CUPTI_CALL(cuptiProfilerCounterDataImageInitializeScratchBuffer( - &initScratchBufferParams)); - - return true; -} - -#elif defined(HAS_CUPTI) - -// Create empty stubs for the API when CUPTI is not present. -CuptiRBProfilerSession::CuptiRBProfilerSession( - const std::vector& metricNames, - int deviceId, - int maxRanges, - int numNestingLevels, - CUcontext cuContext) - : metricNames_(metricNames), - deviceId_(deviceId), - maxRanges_(maxRanges), - numNestingLevels_(numNestingLevels), - cuContext_(cuContext) {} -void CuptiRBProfilerSession::stop() {} -void CuptiRBProfilerSession::enable() {} -void CuptiRBProfilerSession::disable() {} -void CuptiRBProfilerSession::beginPass() {} -bool CuptiRBProfilerSession::endPass() { return true; } -void CuptiRBProfilerSession::flushCounterData() {} -void CuptiRBProfilerSession::pushRange(const std::string& /*rangeName*/) {} -void CuptiRBProfilerSession::popRange() {} -void CuptiRBProfilerSession::asyncStartAndEnable( - CUpti_ProfilerRange /*profilerRange*/, - CUpti_ProfilerReplayMode /*profilerReplayMode*/) {} -void CuptiRBProfilerSession::asyncDisableAndStop() {} -CuptiProfilerResult CuptiRBProfilerSession::evaluateMetrics(bool verbose) { - static CuptiProfilerResult res; - return res; -}; -void CuptiRBProfilerSession::saveCounterData( - const std::string& /*CounterDataFileName*/, - const std::string& /*CounterDataSBFileName*/) {} -void CuptiRBProfilerSession::initCupti() 
{} -void CuptiRBProfilerSession::deInitCupti() {} -void CuptiRBProfilerSession::staticInit() {} -bool CuptiRBProfilerSession::createCounterDataImage() { return true; } -void CuptiRBProfilerSession::startInternal( - CUpti_ProfilerRange /*profilerRange*/, - CUpti_ProfilerReplayMode /*profilerReplayMode*/) {} -std::vector& CuptiRBProfilerSession::counterAvailabilityImage() { - static std::vector _vec; - return _vec; -} -#endif // HAS_CUPTI_RANGE_PROFILER - -namespace testing { - -void trackCudaCtx(CUcontext ctx, uint32_t device_id, CUpti_CallbackId cbid) { -#if HAS_CUPTI_RANGE_PROFILER - __trackCudaCtx(ctx, device_id, cbid); -#endif // HAS_CUPTI_RANGE_PROFILER -} - -void trackCudaKernelLaunch(CUcontext ctx, const char* kernelName) { -#if HAS_CUPTI_RANGE_PROFILER - __trackCudaKernelLaunch(ctx, kernelName); -#endif // HAS_CUPTI_RANGE_PROFILER -} - -} // namespace testing -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.h b/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.h deleted file mode 100644 index 98a0b3ea5f4850dfa060e4e86d5ebf210692db1a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerApi.h +++ /dev/null @@ -1,220 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#ifdef HAS_CUPTI -#include -#include -// Using CUDA 11 and above due to usage of API: cuptiProfilerGetCounterAvailability. 
-#if defined(CUDART_VERSION) && CUDART_VERSION >= 10000 && CUDART_VERSION < 11040 && CUDA_VERSION >= 11000 -#define HAS_CUPTI_RANGE_PROFILER 1 -#endif // CUDART_VERSION > 10.00 and < 11.04 && CUDA_VERSION >= 11.00 -#endif // HAS_CUPTI - -#if HAS_CUPTI_RANGE_PROFILER -#include -#include -#include -#else -using CUpti_ProfilerRange = enum -{ - CUPTI_AutoRange, - CUPTI_UserRange, -}; - -using CUpti_ProfilerReplayMode = enum -{ - CUPTI_KernelReplay, - CUPTI_UserReplay, -}; -#endif // HAS_CUPTI_RANGE_PROFILER - -#include -#include -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "TraceSpan.h" -#include "CuptiCallbackApi.h" -#include "CuptiNvPerfMetric.h" - -/* Cupti Range based profiler session - * See : https://docs.nvidia.com/cupti/Cupti/r_main.html#r_profiler - */ - -namespace KINETO_NAMESPACE { - -class CuptiRBProfilerSession { - public: - // Initialize and configure CUPTI Profiler counters. - // - Metric names must be provided as string vector. - // - Supported values by CUPTI can be found at - - // https://docs.nvidia.com/cupti/Cupti/r_main.html#r_host_metrics_api - explicit CuptiRBProfilerSession( - const std::vector& metricNames, - int deviceId, - int maxRanges, - int numNestingLevels = 1, - CUcontext cuContext = 0); - - virtual ~CuptiRBProfilerSession() = default; - - // Start profiling session - // This function has to be called from the CPU thread running - // the CUDA context. 
If this is not the case, asyncStartAndEnable() - // can be used - void start( - CUpti_ProfilerRange profilerRange = CUPTI_AutoRange, - CUpti_ProfilerReplayMode profilerReplayMode = CUPTI_KernelReplay) { - startInternal(profilerRange, profilerReplayMode); - } - - // Stop profiling session - virtual void stop(); - - virtual void enable(); - virtual void disable(); - - // Profiler passes - // GPU hardware has limited performance monitoring resources - // the CUPTI profiler may need to run multiple passes to collect - // data for a given range - // If we use kernel replay model the kernels are automatically replayed - // else, you can use the beginPass() and endPass() functions below - // for user to manage the replays - - // starts a profiler pass with given kernels in between - virtual void beginPass(); - - // end a profiler pass with given kernels in between - // returns true if no more passes are required - virtual bool endPass(); - - // flushes the counter data - required if you use user replay - virtual void flushCounterData(); - - // Each pass can contain multiple ranges - // metrics configured in a pass are collected for each range-stack. 
- virtual void pushRange(const std::string& rangeName); - virtual void popRange(); - - // utilities for common operations - void startAndEnable(); - void disableAndStop(); - - // Async APIs : these will can be called from another thread - // outside the CUDA context being profiled - void asyncStartAndEnable( - CUpti_ProfilerRange profilerRange = CUPTI_AutoRange, - CUpti_ProfilerReplayMode profilerReplayMode = CUPTI_KernelReplay); - void asyncDisableAndStop(); - - void printMetrics() { - evaluateMetrics(true); - } - - std::unique_ptr getProfilerTraceSpan(); - - virtual CuptiProfilerResult evaluateMetrics(bool verbose = false); - - void saveCounterData( - const std::string& CounterDataFileName, - const std::string& CounterDataSBFileName); - - // This is not thread safe so please only call after - // profiling has stopped - const std::vector& getKernelNames() const { - return kernelNames_; - } - - int deviceId() const { - return deviceId_; - } - - bool profilingActive() const { - return profilingActive_; - } - - static std::set getActiveDevices(); - - static void initCupti(); - - static void deInitCupti(); - - static void staticInit(); - - static void setCounterAvailabilityImage(std::vector img) { - counterAvailabilityImage() = img; - } - protected: - CuptiRBProfilerSession(int deviceId, CUcontext ctx); - - virtual void startInternal( - CUpti_ProfilerRange profilerRange, - CUpti_ProfilerReplayMode profilerReplayMode); - - CUpti_ProfilerRange curRange_ = CUPTI_AutoRange; - CUpti_ProfilerReplayMode curReplay_ = CUPTI_KernelReplay; - - private: - - bool createCounterDataImage(); - - - // log kernel name that used with callbacks - void logKernelName(const char* kernel) { - std::lock_guard lg(kernelNamesMutex_); - kernelNames_.emplace_back(kernel); - } - - std::vector metricNames_; - std::string chipName_; - - uint32_t deviceId_ = 0; - int maxRanges_; - int numNestingLevels_; - CUcontext cuContext_; - - - // data buffers for configuration and counter data collection - 
std::vector counterDataImagePrefix; - std::vector configImage; - std::vector counterDataImage; - std::vector counterDataScratchBuffer; - - std::chrono::time_point profilerStartTs_; - std::chrono::time_point - profilerInitDoneTs_; - std::chrono::time_point profilerStopTs_; - - std::mutex kernelNamesMutex_; - // raw kernel names (not demangled) - std::vector kernelNames_; - - uint32_t numCallbacks_ = 0; - - static std::vector& counterAvailabilityImage(); - -#if HAS_CUPTI_RANGE_PROFILER - CUpti_Profiler_BeginPass_Params beginPassParams_; - CUpti_Profiler_EndPass_Params endPassParams_; -#endif - - bool initSuccess_ = false; - bool profilingActive_ = false; - - friend void __trackCudaKernelLaunch(CUcontext ctx, const char* kernelName); -}; - -// called directly only in unit tests -namespace testing { - -void trackCudaCtx(CUcontext ctx, uint32_t device_id, CUpti_CallbackId cbid); -void trackCudaKernelLaunch(CUcontext ctx, const char* kernelName); - -} // namespace testing - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.cpp b/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.cpp deleted file mode 100644 index 04b1ad0cb3f807cf87d32bc03de0ca9b552b0063..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.cpp +++ /dev/null @@ -1,68 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include - -#include -#include - -#include -#include - -using namespace std::chrono; - -namespace KINETO_NAMESPACE { - -// number of ranges affect the size of counter data binary used by -// the CUPTI Profiler. 
These defaults can be tuned -constexpr int KMaxAutoRanges = 1500; // supports 1500 kernels -constexpr int KMaxUserRanges = 10; // enable up to 10 sub regions marked by user - -constexpr char kCuptiProfilerMetricsKey[] = "CUPTI_PROFILER_METRICS"; -constexpr char kCuptiProfilerPerKernelKey[] = "CUPTI_PROFILER_ENABLE_PER_KERNEL"; -constexpr char kCuptiProfilerMaxRangesKey[] = "CUPTI_PROFILER_MAX_RANGES"; - -CuptiRangeProfilerConfig::CuptiRangeProfilerConfig(Config& cfg) - : parent_(&cfg), - cuptiProfilerPerKernel_(false), - cuptiProfilerMaxRanges_(0) {} - -bool CuptiRangeProfilerConfig::handleOption(const std::string& name, std::string& val) { - VLOG(0) << " handling : " << name << " = " << val; - // Cupti Range based Profiler configuration - if (!name.compare(kCuptiProfilerMetricsKey)) { - activitiesCuptiMetrics_ = splitAndTrim(val, ','); - } else if (!name.compare(kCuptiProfilerPerKernelKey)) { - cuptiProfilerPerKernel_ = toBool(val); - } else if (!name.compare(kCuptiProfilerMaxRangesKey)) { - cuptiProfilerMaxRanges_ = toInt64(val); - } else { - return false; - } - return true; -} - -void CuptiRangeProfilerConfig::setDefaults() { - if (activitiesCuptiMetrics_.size() > 0 && cuptiProfilerMaxRanges_ == 0) { - cuptiProfilerMaxRanges_ = - cuptiProfilerPerKernel_ ? 
KMaxAutoRanges : KMaxUserRanges; - } -} - -void CuptiRangeProfilerConfig::printActivityProfilerConfig(std::ostream& s) const { - if (activitiesCuptiMetrics_.size() > 0) { - s << "Cupti Profiler metrics : " - << fmt::format("{}", fmt::join(activitiesCuptiMetrics_, ", ")) << std::endl; - s << "Cupti Profiler measure per kernel : " - << cuptiProfilerPerKernel_ << std::endl; - s << "Cupti Profiler max ranges : " << cuptiProfilerMaxRanges_ << std::endl; - } -} - -void CuptiRangeProfilerConfig::registerFactory() { - Config::addConfigFactory( - kCuptiProfilerConfigName, - [](Config& cfg) { return new CuptiRangeProfilerConfig(cfg); }); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.h b/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.h deleted file mode 100644 index 549b8a4e8b40c66b59bae974eb87c7f64967344e..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/CuptiRangeProfilerConfig.h +++ /dev/null @@ -1,86 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include "Config.h" - -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -constexpr char kCuptiProfilerConfigName[] = "cupti_rb_profiler"; - -class CuptiRangeProfilerConfig : public AbstractConfig { - public: - bool handleOption(const std::string& name, std::string& val) override; - - void validate( - const std::chrono::time_point& - fallbackProfileStartTime) override {} - - static CuptiRangeProfilerConfig& get(const Config& cfg) { - return dynamic_cast(cfg.feature( - kCuptiProfilerConfigName)); - } - - Config& parent() const { - return *parent_; - } - - std::vector activitiesCuptiMetrics() const { - return activitiesCuptiMetrics_; - } - - bool cuptiProfilerPerKernel() const { - return cuptiProfilerPerKernel_; - } - - int64_t cuptiProfilerMaxRanges() const { - return cuptiProfilerMaxRanges_; - } - - void setSignalDefaults() override { - setDefaults(); - } - - void setClientDefaults() override { - setDefaults(); - } - - void printActivityProfilerConfig(std::ostream& s) const override; - - static void registerFactory(); - protected: - AbstractConfig* cloneDerived(AbstractConfig& parent) const override { - CuptiRangeProfilerConfig* clone = new CuptiRangeProfilerConfig(*this); - clone->parent_ = dynamic_cast(&parent); - return clone; - } - - private: - CuptiRangeProfilerConfig() = delete; - explicit CuptiRangeProfilerConfig(Config& parent); - explicit CuptiRangeProfilerConfig( - const CuptiRangeProfilerConfig& other) = default; - - // some defaults will depend on other configuration - void setDefaults(); - - // Associated Config object - Config* parent_; - - // Counter metrics exposed via CUPTI Profiler API - std::vector activitiesCuptiMetrics_; - - // Collect profiler metrics per kernel - autorange made - bool cuptiProfilerPerKernel_{false}; - - // max number of ranges to configure the profiler for. 
- // this has to be set beforehand to reserve space for the output - int64_t cuptiProfilerMaxRanges_ = 0; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/DaemonConfigLoader.h b/plugins/tensorboard-plugins/libkineto/src/DaemonConfigLoader.h deleted file mode 100644 index 9b0ed92863648824a57ce8193ddc16d7cf23622e..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/DaemonConfigLoader.h +++ /dev/null @@ -1,27 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include - -namespace KINETO_NAMESPACE { - -class DaemonConfigLoader { - public: - virtual ~DaemonConfigLoader() {} - - // Return the base config from the daemon - virtual std::string readBaseConfig() = 0; - - // Return a configuration string from the daemon, if one has been posted. - virtual std::string readOnDemandConfig(bool events, bool activities) = 0; - - // Returns the number of tracked contexts for this device. The daemon has a - // global view. If an unexpected error occurs, return -1. - virtual int gpuContextCount(uint32_t device) = 0; - - virtual void setCommunicationFabric(bool enabled) = 0; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/Demangle.cpp b/plugins/tensorboard-plugins/libkineto/src/Demangle.cpp deleted file mode 100644 index f84f0b8ec36f621061cb1e8bb8dd948cb8aed7b3..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/Demangle.cpp +++ /dev/null @@ -1,49 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "Demangle.h" - -#ifndef _MSC_VER -#include -#endif -#include -#include - -namespace KINETO_NAMESPACE { - -static constexpr int kMaxSymbolSize = 1024; - -std::string demangle(const char* name) { -#ifndef _MSC_VER - if (!name) { - return ""; - } - - if (strlen(name) > kMaxSymbolSize) { - return name; - } - - int status; - size_t len = 0; - char* demangled = abi::__cxa_demangle(name, nullptr, &len, &status); - if (status != 0) { - return name; - } - std::string res(demangled); - // The returned buffer must be freed! - free(demangled); - return res; -#else - // TODO: demangling on Windows - if (!name) { - return ""; - } else { - return name; - } -#endif -} - -std::string demangle(const std::string& name) { - return demangle(name.c_str()); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/Demangle.h b/plugins/tensorboard-plugins/libkineto/src/Demangle.h deleted file mode 100644 index 6dcf0776f1abf30e7e3614272fa02f6bae1bdf35..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/Demangle.h +++ /dev/null @@ -1,12 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -namespace KINETO_NAMESPACE { - -std::string demangle(const char* name); -std::string demangle(const std::string& name); - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/EventProfiler.cpp b/plugins/tensorboard-plugins/libkineto/src/EventProfiler.cpp deleted file mode 100644 index dbf2755238974392ff6205f05a5c80a1733bf2ee..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/EventProfiler.cpp +++ /dev/null @@ -1,635 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "EventProfiler.h" - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include "CuptiEventApi.h" -#include "Logger.h" - -using namespace std::chrono; -using std::accumulate; -using std::endl; -using std::map; -using std::ostream; -using std::string; -using std::unique_ptr; -using std::vector; - -namespace KINETO_NAMESPACE { - -static std::mutex& logMutex() { - static std::mutex instance; - return instance; -} - -// --------------------------------------------------------------------- -// class Event -// --------------------------------------------------------------------- - -// Compute domain instance percentiles -PercentileList& Event::percentiles( - PercentileList& pcs, - const SampleSlice& slice) const { - vector instance_values; - instance_values.reserve(instanceCount); - for (int i = 0; i < instanceCount; i++) { - instance_values.push_back(sumInstance(i, slice)); - } - return KINETO_NAMESPACE::percentiles(instance_values, pcs); -} - -// Add up all samples for a given domain instance -int64_t Event::sumInstance(int i, const SampleSlice& slice) const { - auto r = toIdxRange(slice); - auto start = samples_.cbegin(); - std::advance(start, r.first); - auto end = start; - std::advance(end, r.second); - return accumulate(start, end, 0ul, [i](int64_t a, const Sample& b) { - return a + b.second[i]; - }); -} - -// Add up all samples across all domain instances -int64_t Event::sumAll(const SampleSlice& slice) const { - int64_t res = 0; - for (int i = 0; i < instanceCount; i++) { - res += sumInstance(i, slice); - } - return res; -} - -// Print raw sample values for all domains -void Event::printSamples(ostream& s, CUdevice device) const { - // Don't mess up output with interleaved lines - // Probably OK to reuse logMutex() here since this is - // used for debugging, but need to keep an eye on it. 
- std::lock_guard lock(logMutex()); - s << "Device " << device << " " << name << ":" << endl; - for (const auto& sample : samples_) { - const auto& vals = sample.second; - for (int64_t val : vals) { - s << val << " "; - } - s << endl; - } -} - -// --------------------------------------------------------------------- -// class Metric -// --------------------------------------------------------------------- -Metric::Metric( - string name, - CUpti_MetricID id, - vector events, - CUpti_MetricEvaluationMode eval_mode, - CuptiMetricApi& cupti_metrics) - : name(std::move(name)), - id_(id), - events_(std::move(events)), - evalMode_(eval_mode), - cuptiMetrics_(cupti_metrics), - valueKind_(cuptiMetrics_.valueKind(id)) {} - -// Return per-SM vector as well as total -struct Metric::CalculatedValues Metric::calculate( - map& event_map, - nanoseconds sample_duration, - const SampleSlice& slice) { - vector metric_values; - vector ev_values; - ev_values.reserve(events_.size()); - if (evalMode_ & CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE) { - int instance_count = instanceCount(event_map); - metric_values.reserve(instance_count); - for (int i = 0; i < instance_count; i++) { - ev_values.clear(); - for (CUpti_EventID event_id : events_) { - ev_values.push_back(event_map[event_id].sumInstance(i, slice)); - } - metric_values.push_back(cuptiMetrics_.calculate( - id_, valueKind_, events_, ev_values, sample_duration.count())); - } - } - - // FIXME: Check assumption that all instances are profiled - ev_values.clear(); - for (CUpti_EventID event_id : events_) { - ev_values.push_back(event_map[event_id].sumAll(slice)); - } - SampleValue total = cuptiMetrics_.calculate( - id_, valueKind_, events_, ev_values, sample_duration.count()); - if (evalMode_ & CUPTI_METRIC_EVALUATION_MODE_AGGREGATE) { - metric_values.push_back(total); - } - return {metric_values, std::move(total)}; -} - -void Metric::printDescription(ostream& s) const { - s << fmt::format("{} ({})", name, fmt::join(events_, ",")) << 
endl; -} - -// --------------------------------------------------------------------- -// class EventGroupSet -// --------------------------------------------------------------------- - -// Each domain has a set of counters. -// Some counters in a domain can be collected simultaneously in a "group" -// Counters from different domains can also be collected at the same time -// Therefore we have a "set of groups", or group set, with counters that -// can all be collected at once. -EventGroupSet::EventGroupSet( - CUpti_EventGroupSet& set, - map& events, - CuptiEventApi& cupti) - : set_(set), events_(events), cuptiEvents_(cupti), enabled_(false) { - for (int g = 0; g < set.numEventGroups; g++) { - CUpti_EventGroup grp = set.eventGroups[g]; - // Profile all domain instances - cuptiEvents_.enablePerInstance(grp); - uint32_t instance_count = cuptiEvents_.instanceCount(grp); - for (const auto& id : cuptiEvents_.eventsInGroup(grp)) { - VLOG(0) << "Instance count for " << id << ":" << instance_count; - events_[id].instanceCount = instance_count; - } - } -} - -EventGroupSet::~EventGroupSet() { - // Disable EventGroupSet in Cupti. 
- if (enabled_) { - setEnabled(false); - } -} - -// Enable or disable this group set -void EventGroupSet::setEnabled(bool enabled) { - if (enabled && !enabled_) { - cuptiEvents_.enableGroupSet(set_); - } else if (!enabled && enabled_) { - cuptiEvents_.disableGroupSet(set_); - } - enabled_ = enabled; -} - -// Collect counter values for each counter in group set -void EventGroupSet::collectSample() { - auto timestamp = system_clock::now(); - for (int g = 0; g < set_.numEventGroups; g++) { - CUpti_EventGroup grp = set_.eventGroups[g]; - for (const auto& id : cuptiEvents_.eventsInGroup(grp)) { - Event& ev = events_[id]; - vector vals(ev.instanceCount); - // FIXME: Use cuptiEventGroupReadAllEvents - cuptiEvents_.readEvent(grp, id, vals); - - if (VLOG_IS_ON(0)) { - for (int64_t v : vals) { - if (v == CUPTI_EVENT_OVERFLOW) { - LOG(WARNING) << "Counter overflow detected " - << "- decrease sample period!" << endl; - } - } - } - - ev.addSample(timestamp, vals); - } - } - - if (VLOG_IS_ON(1)) { - auto t2 = system_clock::now(); - VLOG(1) << "Device " << cuptiEvents_.device() << " Sample (us): " - << duration_cast(t2 - timestamp).count(); - } -} - -// Print names of events in this group set, ordered by group -void EventGroupSet::printDescription(ostream& s) const { - for (int g = 0; g < set_.numEventGroups; g++) { - s << " Events in group " << g << ": "; - for (const auto& id : cuptiEvents_.eventsInGroup(set_.eventGroups[g])) { - s << id << " (" << events_[id].name << ") "; - } - s << endl; - } -} - -// --------------------------------------------------------------------- -// class EventProfiler -// --------------------------------------------------------------------- - -// Find nearest factor of a number by linear search, -// starting at hi and lo - hi searches up and lo searches down -static int nearestFactor(int hi, int lo, int number) { - return number % hi == 0 - ? hi - : number % lo == 0 ? 
lo : nearestFactor(hi + 1, lo - 1, number); -} - -static int nearestFactor(int count, int max) { - return nearestFactor(count, count, max); -} - -void EventProfiler::initEvents(const std::set& eventNames) { - events_.clear(); - // Build event map - for (const auto& name : eventNames) { - events_.emplace(cuptiEvents_->eventId(name), name); - } -} - -void EventProfiler::initMetrics(const std::set& metricNames) { - metrics_.clear(); - // Add events from metrics - metrics_.reserve(metricNames.size()); - for (const auto& metric_name : metricNames) { - CUpti_MetricID metric_id = cuptiMetrics_->idFromName(metric_name); - if (metric_id == ~0) { - continue; - } - - const auto& events = cuptiMetrics_->events(metric_id); - vector event_ids; - event_ids.reserve(events.size()); - for (const auto& pair : events) { - CUpti_EventID id = pair.first; - const string& event_name = pair.second; - if (event_name.empty()) { - // For unnamed events, use metric name and event id - // FIXME: For subsequent metrics using the same event, - // this will be confusing - events_.emplace(id, metric_name + "_" + event_name); - } else { - events_.emplace(id, event_name); - } - event_ids.push_back(id); - } - metrics_.emplace_back( - metric_name, - metric_id, - event_ids, - cuptiMetrics_->evaluationMode(metric_id), - *cuptiMetrics_); - } -} - -bool EventProfiler::initEventGroups() { - sets_.clear(); - if (eventGroupSets_) { - cuptiEvents_->destroyGroupSets(eventGroupSets_); - eventGroupSets_ = nullptr; - } - if (events_.empty()) { - return true; - } - - // Determine sets of groups to be collected - vector ids; - ids.reserve(events_.size()); - for (const auto& ev : events_) { - ids.push_back(ev.first); - } - eventGroupSets_ = cuptiEvents_->createGroupSets(ids); - VLOG(0) << "Number of group sets: " << eventGroupSets_->numSets; - for (int i = 0; i < eventGroupSets_->numSets; i++) { - sets_.push_back( - EventGroupSet(eventGroupSets_->sets[i], events_, *cuptiEvents_)); - } - return !sets_.empty(); -} - 
-static unique_ptr alignAndValidateConfigs( - Config& base, - Config* onDemand) { - auto now = system_clock::now(); - if (!onDemand || - now > - (onDemand->eventProfilerOnDemandStartTime() + - onDemand->eventProfilerOnDemandDuration())) { - base.validate(now); - return base.clone(); - } - - auto res = base.clone(); - res->addEvents(onDemand->eventNames()); - res->addMetrics(onDemand->metricNames()); - - int sample_period = - std::min(base.samplePeriod().count(), onDemand->samplePeriod().count()); - if (sample_period < base.samplePeriod().count() && - (base.samplePeriod().count() % sample_period) != 0) { - sample_period = nearestFactor(sample_period, base.samplePeriod().count()); - LOG(WARNING) - << "On-demand sample period must be a factor of base sample period. " - << "Adjusting from " << onDemand->samplePeriod().count() << "ms to " - << sample_period << "ms."; - } - base.setSamplePeriod(milliseconds(sample_period)); - base.validate(now); - res->setSamplePeriod(base.samplePeriod()); - res->setMultiplexPeriod(base.multiplexPeriod()); - res->validate(now); - onDemand->setSamplePeriod(base.samplePeriod()); - onDemand->setMultiplexPeriod(base.multiplexPeriod()); - onDemand->validate(now); - - return res; -} - -static milliseconds minReportPeriod(const Config& config, int num_sets) { - return config.multiplexPeriod() * num_sets; -} - -static bool canSupportReportPeriod(const Config& config, int num_sets) { - // Can we get through the groups an even number per report period? - milliseconds min_report_period = minReportPeriod(config, num_sets); - return (config.reportPeriod().count() % min_report_period.count()) == 0; -} - -static int completeSamplesPerReport(const Config& config, int num_sets) { - if (num_sets <= 1) { - return config.reportPeriod() / config.samplePeriod(); - } - // Numnber of complete sample collections in the report period - // E.g. 
if report period is 10000ms, sample period 500ms, - // multiplex period 2000ms and num_sets is 5 then # of complete samples is - // (2000ms / 500ms) * (10000ms / 2000ms / 5) = 4 * 1 = 4 - int samples_per_multiplex_period = - config.multiplexPeriod() / config.samplePeriod(); - int multiplex_periods_per_report = - config.reportPeriod() / config.multiplexPeriod(); - return (multiplex_periods_per_report / num_sets) * - samples_per_multiplex_period; -} - -static bool canSupportSamplesPerReport(const Config& config, int num_sets) { - // Can samples per report can be honored with an exact *full* set of samples? - // We don't support partial samples at this point. - int full_samples_per_report = completeSamplesPerReport(config, num_sets); - return (full_samples_per_report % config.samplesPerReport()) == 0; -} - -static void adjustConfig(Config& config, int num_sets) { - // Don't change sample period and multiplex period here, since that can - // cause overflows and perf degradation. Report period and samples per - // report is OK to change (with warning). 
- if (!canSupportReportPeriod(config, num_sets)) { - milliseconds min_report_period = minReportPeriod(config, num_sets); - LOG(WARNING) << "Report period must be a multiple of " - << min_report_period.count() << "ms (" << num_sets - << " event sets * " << config.multiplexPeriod().count() - << "ms multiplex period), in order to get complete samples."; - auto new_report_period = - Config::alignUp(config.reportPeriod(), min_report_period); - double sf = - ((double)new_report_period.count()) / config.reportPeriod().count(); - int new_samples_per_report = std::round(config.samplesPerReport() * sf); - LOG(WARNING) << "Adjusting report period from " - << config.reportPeriod().count() << "ms to " - << new_report_period.count() << "ms"; - if (new_samples_per_report != config.samplesPerReport()) { - LOG(WARNING) << "Adjusting samples per report from " - << config.samplesPerReport() << " to " - << new_samples_per_report; - } - config.setReportPeriod(new_report_period); - config.setSamplesPerReport(new_samples_per_report); - } - // Ensure that samples per report can be honored with - // an exact *full* set of samples. Don't support partial - // samples at this point. - if (!canSupportSamplesPerReport(config, num_sets)) { - int full_samples_per_report = completeSamplesPerReport(config, num_sets); - int adjusted_count = - nearestFactor(config.samplesPerReport(), full_samples_per_report); - LOG(WARNING) - << "Samples per report must be such that an even number of " - << "complete samples can be aggregated in each report period. 
Adjusting" - << " from " << config.samplesPerReport() << " to " << adjusted_count - << " (complete sample count is " << full_samples_per_report << ")"; - config.setSamplesPerReport(adjusted_count); - } -} - -// Prepare profiler -EventProfiler::EventProfiler( - std::unique_ptr cupti_events, - std::unique_ptr cupti_metrics, - vector>& loggers, - vector>& onDemandLoggers) - : cuptiEvents_(std::move(cupti_events)), - cuptiMetrics_(std::move(cupti_metrics)), - loggers_(loggers), - onDemandLoggers_(onDemandLoggers) {} - -void EventProfiler::reportSamples() { - dispatchSamples(*config_, loggers_, baseSamples_); - baseSamples_ += completeSamplesPerReport(*config_, sets_.size()); -} - -void EventProfiler::reportOnDemandSamples() { - dispatchSamples(*onDemandConfig_, onDemandLoggers_, onDemandSamples_); - onDemandSamples_ += completeSamplesPerReport(*onDemandConfig_, sets_.size()); -} - -EventProfiler::~EventProfiler() { - if (eventGroupSets_) { - for (auto& set : sets_) { - set.setEnabled(false); - } - cuptiEvents_->destroyGroupSets(eventGroupSets_); - } - VLOG(0) << "Stopped event profiler for device " << device(); -} - -void EventProfiler::updateLoggers(Config& config, Config* on_demand_config) { - // Update loggers. - for (auto& logger : loggers_) { - std::lock_guard lock(logMutex()); - logger->update(config); - } - - if (on_demand_config) { - // Update onDemand loggers. - for (auto& logger : onDemandLoggers_) { - std::lock_guard lock(logMutex()); - logger->update(*on_demand_config); - } - } -} - -bool EventProfiler::applyConfig(const Config& config) { - // Initialize events, metrics, and event group sets. 
- // TODO: Send warnings / errors back to dyno for onDemand config - try { - if (!initEventsAndMetrics(config)) { - return false; - } - } catch (const std::exception& ex) { - LOG(WARNING) << "Failed to apply config (" << ex.what() << ")"; - return false; - } - - return true; -} - -bool EventProfiler::initEventsAndMetrics(const Config& config) { - initEvents(config.eventNames()); - initMetrics(config.metricNames()); - // We now have the total list of events to collect - // They need to be organized into groups for multiplexing - if (!initEventGroups()) { - LOG(WARNING) << "No events/metrics initialized successfully"; - return false; - } - - if (VLOG_IS_ON(1)) { - printMetrics(LIBKINETO_DBG_STREAM); - printSets(LIBKINETO_DBG_STREAM); - } - return true; -} - -void EventProfiler::printSets(ostream& s) const { - for (int i = 0; i < sets_.size(); i++) { - s << "Set " << i << endl; - sets_[i].printDescription(s); - } -} - -void EventProfiler::printMetrics(ostream& s) const { - s << "Metrics:" << endl; - for (const Metric& m : metrics_) { - m.printDescription(s); - } -} - -void EventProfiler::printAllSamples(ostream& s, CUdevice device) const { - for (const auto& pair : events_) { - const Event& ev = pair.second; - ev.printSamples(s, device); - } -} - -void EventProfiler::enableNextCounterSet() { - if (sets_.size() > 1) { - auto t1 = system_clock::now(); - - VLOG(1) << "Disabling set " << curEnabledSet_; - sets_[curEnabledSet_].setEnabled(false); - curEnabledSet_ = (curEnabledSet_ + 1) % sets_.size(); - VLOG(1) << "Enabling set " << curEnabledSet_; - sets_[curEnabledSet_].setEnabled(true); - - if (VLOG_IS_ON(1)) { - auto t2 = system_clock::now(); - VLOG(1) << "Switch (us): " - << duration_cast(t2 - t1).count(); - } - } -} - -// Notify listeners of collected samples -void EventProfiler::dispatchSamples( - const Config& config, - const vector>& loggers, - int sample_offset) { - Sample sample(events_.size() + metrics_.size()); - // Normalize values to per second - auto delta 
= config.reportPeriod() / config.samplesPerReport(); - double sf = 1000.0 * sets_.size() / delta.count(); - for (int i = 0; i < config.samplesPerReport(); i++) { - sample.stats.clear(); - sample.deltaMsec = (delta * i).count(); - SampleSlice slice = {sample_offset, i, config.samplesPerReport()}; - VLOG(1) << "Slice: " << sample_offset << ", " << i << ", " - << config.samplesPerReport(); - for (const auto& pair : events_) { - const Event& ev = pair.second; - int64_t total = std::round(sf * ev.sumAll(slice)); - PercentileList pcs = initPercentiles(config.percentiles()); - normalize(ev.percentiles(pcs, slice), sf); - sample.stats.push_back({ev.name, std::move(pcs), SampleValue(total)}); - } - - for (auto& m : metrics_) { - // calculate returns a pair of per-SM vector and a total - auto vals = m.calculate(events_, delta, slice); - PercentileList pcs = initPercentiles(config.percentiles()); - sample.stats.push_back( - {m.name, std::move(percentiles(vals.perInstance, pcs)), vals.total}); - } - - for (auto& logger : loggers) { - std::lock_guard lock(logMutex()); - logger->handleSample(device(), sample, config.ipcFabricEnabled()); - } - } - - if (VLOG_IS_ON(2)) { - printAllSamples(LIBKINETO_DBG_STREAM, device()); - } -} - -void EventProfiler::configure(Config& config, Config* onDemandConfig) { - if (!sets_.empty()) { - sets_[curEnabledSet_].setEnabled(false); - clearSamples(); - } - - config_ = config.clone(); - onDemandConfig_ = onDemandConfig ? onDemandConfig->clone() : nullptr; - mergedConfig_ = alignAndValidateConfigs(*config_, onDemandConfig_.get()); - if (!applyConfig(*mergedConfig_)) { - LOG(WARNING) << "Failed to apply config!"; - mergedConfig_ = config_->clone(); - applyConfig(*config_); - } - if (!sets_.empty()) { - // Make timing adjustments based on multiplexing requirements. 
- adjustConfig(*config_, sets_.size()); - if (onDemandConfig_) { - int duration = onDemandConfig_->eventProfilerOnDemandDuration().count(); - LOG(INFO) << "On demand profiler activated for " << duration << " secs"; - adjustConfig(*onDemandConfig_, sets_.size()); - } - // If events or metrics were added or removed, need to tell loggers - updateLoggers(*config_, onDemandConfig_.get()); - } - - curEnabledSet_ = 0; - if (!sets_.empty()) { - sets_[0].setEnabled(true); - } else { - VLOG(0) << "No counters profiled!"; - } - - baseSamples_ = 0; - onDemandSamples_ = 0; -} - -void EventProfiler::collectSample() { - if (sets_.empty()) { - return; - } - sets_[curEnabledSet_].collectSample(); - if (VLOG_IS_ON(1)) { - printAllSamples(LIBKINETO_DBG_STREAM, device()); - } -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/EventProfiler.h b/plugins/tensorboard-plugins/libkineto/src/EventProfiler.h deleted file mode 100644 index fafd5b9bb8336b28b210ba58d588d3a798a73969..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/EventProfiler.h +++ /dev/null @@ -1,341 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include "Config.h" -#include "CuptiEventApi.h" -#include "CuptiMetricApi.h" -#include "SampleListener.h" - -namespace KINETO_NAMESPACE { - -// Helper function for computing percentiles (nearest-rank). -// Modifies the input. 
-template -inline PercentileList& percentiles(std::vector values, PercentileList& pcs) { - auto size = values.size(); - for (auto& x : pcs) { - int idx = std::min(size - 1, (x.first * size) / 100); - std::nth_element(values.begin(), values.begin() + idx, values.end()); - x.second = SampleValue(values[idx]); - } - return pcs; -} - -// Helper function for normalizing a percentile list -// Modifies the input -inline PercentileList& normalize(PercentileList& pcs, double sf) { - for (auto& pc : pcs) { - pc.second *= sf; - } - return pcs; -} - -// A slice of the sample buffer -struct SampleSlice { - // Start offset (samples) - int offset; - // Slice number - int index; - // Out of this many - int count; -}; - -// A sampled event -class Event { - public: - /* implicit */ Event(std::string name) : name(std::move(name)) {} - /* implicit */ Event(const char* name) : name(name) {} - Event() : name("INVALID") {} - - Event(const Event&) = delete; - Event& operator=(const Event&) = delete; - Event(Event&&) = default; - Event& operator=(Event&&) = default; - - void addSample( - std::chrono::time_point timestamp, - const std::vector& values) { - assert(values.size() == instanceCount); - samples_.emplace_back(timestamp, values); - } - - // Sum samples for a single domain instance - int64_t sumInstance(int i, const SampleSlice& slice) const; - - // Sum all samples across all domain instances - int64_t sumAll(const SampleSlice& slice) const; - - // Create list of percentiles - PercentileList& percentiles(PercentileList& pcs, const SampleSlice& slice) - const; - - void eraseSamples(int count) { - auto end = samples_.begin(); - std::advance(end, count); - samples_.erase(samples_.begin(), end); - } - - void clearSamples() { - samples_.clear(); - } - - int sampleCount() { - return samples_.size(); - } - - void printSamples(std::ostream& s, CUdevice device) const; - - // Event name (see nvprof --query-events) - std::string name; - - // Number of domain instances for this event, e.g. 
number of SMs - int instanceCount = 0; - - private: - std::pair toIdxRange(const SampleSlice& slice) const { - int size = (samples_.size() - slice.offset) / slice.count; - return std::make_pair(slice.offset + (slice.index * size), size); - } - - // List of collected samples, where each sample has values for - // one or more domain instances - using Sample = std::pair< - std::chrono::time_point, - std::vector>; - std::list samples_; -}; - -class Metric { - public: - Metric( - std::string name, - CUpti_MetricID id, - std::vector events, - CUpti_MetricEvaluationMode eval_mode, - CuptiMetricApi& cupti_metrics); - - struct CalculatedValues { - std::vector perInstance; - SampleValue total; - }; - - struct CalculatedValues calculate( - std::map& events, - std::chrono::nanoseconds sample_duration, - const SampleSlice& slice); - - int instanceCount(std::map& events) { - return events[events_[0]].instanceCount; - } - - void printDescription(std::ostream& s) const; - - std::string name; - - private: - CUpti_MetricID id_; - std::vector events_; - CUpti_MetricEvaluationMode evalMode_; - // Calls to CUPTI is encapsulated behind this interface - CuptiMetricApi& cuptiMetrics_; - CUpti_MetricValueKind valueKind_; -}; - -/** - * A set of event groups. - * Holds all the events that may be collected in a single pass. - * A group contains one or more counters for a single domain. - * A group set contains zero or one groups per domain. 
- */ -class EventGroupSet { - public: - EventGroupSet( - CUpti_EventGroupSet& set, - std::map& events, - CuptiEventApi& cupti); - ~EventGroupSet(); - - EventGroupSet(const EventGroupSet&) = delete; - EventGroupSet& operator=(const EventGroupSet&) = delete; - EventGroupSet(EventGroupSet&&) = default; - EventGroupSet& operator=(EventGroupSet&&) = delete; - - // Number of groups = number of domains profiled - int groupCount() const { - return set_.numEventGroups; - } - - void setEnabled(bool enabled); - // Take a sample of counters in this group set - void collectSample(); - void printDescription(std::ostream& s) const; - - private: - CUpti_EventGroupSet& set_; - std::map& events_; - // Calls to CUPTI is encapsulated behind this interface - CuptiEventApi& cuptiEvents_; - bool enabled_; -}; - -// The sampler -class EventProfiler { - public: - explicit EventProfiler( - std::unique_ptr cupti_events, - std::unique_ptr cupti_metrics, - std::vector>& loggers, - std::vector>& onDemandLoggers); - EventProfiler(const EventProfiler&) = delete; - EventProfiler& operator=(const EventProfiler&) = delete; - ~EventProfiler(); - - void configure(Config& config, Config* onDemandConfig); - - bool isOnDemandActive() { - return !!onDemandConfig_; - } - - // Print the counter sets. Multiple sets will be multiplexed. 
- void printSets(std::ostream& s) const; - - // Print metrics descriptions - void printMetrics(std::ostream& s) const; - - bool enableForDevice(Config& cfg); - - CUdevice device() { - return cuptiEvents_->device(); - } - - bool setContinuousMode() { - return cuptiEvents_->setContinuousMode(); - } - - std::chrono::milliseconds samplePeriod() { - return mergedConfig_->samplePeriod(); - } - - std::chrono::milliseconds multiplexPeriod() { - return mergedConfig_->multiplexPeriod(); - } - - std::chrono::milliseconds reportPeriod() { - return config_->reportPeriod(); - } - - std::chrono::milliseconds onDemandReportPeriod() { - return onDemandConfig_->reportPeriod(); - } - - // Read values of currently running counters. - void collectSample(); - - void reportSamples(); - void reportOnDemandSamples(); - - bool enabled() { - return sets_.size() > 0; - } - - bool multiplexEnabled() { - return sets_.size() > 1; - } - - // Multiplex counters. - void enableNextCounterSet(); - - void eraseReportedSamples() { - int erase_count = baseSamples_; - if (onDemandConfig_ && - onDemandConfig_->eventProfilerOnDemandDuration().count() > 0) { - erase_count = std::min(baseSamples_, onDemandSamples_); - } - eraseSamples(erase_count); - baseSamples_ -= erase_count; - onDemandSamples_ -= erase_count; - } - - void clearSamples() { - for (auto& pair : events_) { - pair.second.clearSamples(); - } - baseSamples_ = 0; - onDemandSamples_ = 0; - } - - private: - // Functions to initialize profiler based on Config settings. 
- bool applyConfig(const Config& config); - bool initEventsAndMetrics(const Config& config); - void initEvents(const std::set& eventNames); - void initMetrics(const std::set& metricNames); - bool initEventGroups(); - - PercentileList initPercentiles(const std::vector& percentiles) { - PercentileList res; - res.reserve(percentiles.size()); - for (int p : percentiles) { - res.emplace_back(p, SampleValue(0)); - } - return res; - } - - // Notify listeners of collected samples - void dispatchSamples( - const Config& config, - const std::vector>& loggers, - int report_nr); - - void eraseSamples(int count) { - for (auto& pair : events_) { - pair.second.eraseSamples(count); - } - } - - void updateLoggers(Config& config, Config* on_demand_config); - - // Print all collected samples since last clear. - void printAllSamples(std::ostream& s, CUdevice device) const; - - // Calls to CUPTI is encapsulated behind these interfaces - std::unique_ptr cuptiEvents_; - std::unique_ptr cuptiMetrics_; - // The CUpti API reports event IDs, we must map them to our event objects - std::map events_; - // List of metrics - std::vector metrics_; - // The countert sets needed to collect all counters - std::vector sets_; - // The event group set object returned by Cupti. - // Saved s.t. we can call cuptiEventGroupSetsDestroy to free memory when - // the object is no longer needed. 
- CUpti_EventGroupSets* eventGroupSets_ = nullptr; - // Current multiplexed counter set - int curEnabledSet_{0}; - - std::unique_ptr config_; - std::unique_ptr onDemandConfig_; - std::unique_ptr mergedConfig_; - int baseSamples_{0}; - int onDemandSamples_{0}; - - // Shared between profiler threads - // Vectors are read-only but calling loggers require lock - const std::vector>& loggers_; - const std::vector>& onDemandLoggers_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.cpp b/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.cpp deleted file mode 100644 index 0427cc7a90cbc49d31262bcce63f1f81c5b6293f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.cpp +++ /dev/null @@ -1,423 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "EventProfilerController.h" - -#include -#include -#include - -#include "ConfigLoader.h" -#include "CuptiEventApi.h" -#include "CuptiMetricApi.h" -#include "EventProfiler.h" -#include "output_csv.h" - -#include "Logger.h" -#include "ThreadUtil.h" - -using namespace std::chrono; -using std::unique_ptr; -using std::vector; - -namespace KINETO_NAMESPACE { - -namespace { - -vector(const Config&)>>& -loggerFactories() { - static vector(const Config&)>> - factories; - return factories; -} - -vector(const Config&)>>& -onDemandLoggerFactories() { - static vector(const Config&)>> - factories; - return factories; -} - -vector> makeLoggers(const Config& config) { - vector> loggers; - for (const auto& factory : loggerFactories()) { - loggers.push_back(factory(config)); - } - loggers.push_back(std::make_unique()); - loggers.push_back(std::make_unique()); - return loggers; -} - -vector> makeOnDemandLoggers( - const Config& config) { - vector> loggers; - for (const auto& factory : onDemandLoggerFactories()) { - loggers.push_back(factory(config)); - } - 
loggers.push_back(std::make_unique()); - return loggers; -} - -vector>& loggers(const Config& config) { - static auto res = makeLoggers(config); - return res; -} - -vector>& onDemandLoggers( - const Config& config) { - static auto res = makeOnDemandLoggers(config); - return res; -} - -} // anon namespace - -// Keep an eye on profiling threads. -// We've observed deadlocks in Cuda11 in libcuda / libcupti.. -namespace detail { - -class HeartbeatMonitor { - - public: - ~HeartbeatMonitor() { - stopMonitoring(); - } - - static HeartbeatMonitor& instance() { - static HeartbeatMonitor monitor; - return monitor; - } - - void profilerHeartbeat() { - int32_t tid = systemThreadId(); - std::lock_guard lock(mutex_); - profilerAliveMap_[tid]++; - } - - void setPeriod(seconds period) { - { - std::lock_guard lock(mutex_); - if (period_ == period) { - return; - } - period_ = period; - } - if (period == seconds(0)) { - stopMonitoring(); - } else { - startMonitoring(); - } - } - - private: - HeartbeatMonitor() = default; - - void monitorLoop() { - std::unique_lock lock(mutex_); - while(!stopMonitor_) { - auto cv_status = condVar_.wait_for(lock, seconds(period_)); - // Don't perform check on spurious wakeup or on notify - if (cv_status == std::cv_status::timeout) { - for (auto& pair : profilerAliveMap_) { - int32_t tid = pair.first; - int& i = pair.second; - if (i == 0) { - LOG(ERROR) << "Thread " << tid << " appears stuck!"; - } - i = 0; - } - } - } - } - - void startMonitoring() { - if (!monitorThread_) { - VLOG(0) << "Starting monitoring thread"; - stopMonitor_ = false; - monitorThread_ = std::make_unique( - &HeartbeatMonitor::monitorLoop, this); - } - } - - void stopMonitoring() { - if (monitorThread_) { - VLOG(0) << "Stopping monitoring thread"; - stopMonitor_ = true; - condVar_.notify_one(); - monitorThread_->join(); - monitorThread_ = nullptr; - VLOG(0) << "Monitoring thread terminated"; - } - } - - std::map profilerAliveMap_; - std::unique_ptr monitorThread_; - std::mutex 
mutex_; - std::condition_variable condVar_; - std::atomic_bool stopMonitor_{false}; - seconds period_{0}; -}; - -} // namespace detail - -namespace { -// Profiler map singleton -std::map>& profilerMap() { - static std::map> instance; - return instance; -} - -void reportLateSample( - int sleepMs, - int sampleMs, - int reportMs, - int reprogramMs) { - LOG_EVERY_N(WARNING, 10) << "Lost sample due to delays (ms): " << sleepMs - << ", " << sampleMs << ", " << reportMs << ", " - << reprogramMs; -} - -void configureHeartbeatMonitor( - detail::HeartbeatMonitor& monitor, const Config& base, const Config* onDemand) { - seconds base_period = - base.eventProfilerHeartbeatMonitorPeriod(); - seconds on_demand_period = !onDemand ? seconds(0) : - onDemand->eventProfilerHeartbeatMonitorPeriod(); - monitor.setPeriod( - on_demand_period > seconds(0) ? on_demand_period : base_period); -} - -} // anon namespace - -void EventProfilerController::addLoggerFactory( - std::function(const Config&)> factory) { - loggerFactories().push_back(factory); -} - -void EventProfilerController::addOnDemandLoggerFactory( - std::function(const Config&)> factory) { - onDemandLoggerFactories().push_back(factory); -} - -EventProfilerController::EventProfilerController( - CUcontext context, - ConfigLoader& configLoader, - detail::HeartbeatMonitor& heartbeatMonitor) - : configLoader_(configLoader), heartbeatMonitor_(heartbeatMonitor) { - auto cupti_events = std::make_unique(context); - auto cupti_metrics = - std::make_unique(cupti_events->device()); - configLoader_.addHandler( - ConfigLoader::ConfigKind::EventProfiler, this); - auto config = configLoader.getConfigCopy(); - profiler_ = std::make_unique( - std::move(cupti_events), - std::move(cupti_metrics), - loggers(*config), - onDemandLoggers(*config)); - profilerThread_ = std::make_unique( - &EventProfilerController::profilerLoop, this); -} - -EventProfilerController::~EventProfilerController() { - if (profilerThread_) { - // signaling termination of the 
profiler loop - stopRunloop_ = true; - profilerThread_->join(); - } - configLoader_.removeHandler( - ConfigLoader::ConfigKind::EventProfiler, this); - VLOG(0) << "Stopped event profiler"; -} - -// Must be called under lock -void EventProfilerController::start(CUcontext ctx, ConfigLoader& configLoader) { - profilerMap()[ctx] = unique_ptr( - new EventProfilerController( - ctx, configLoader, detail::HeartbeatMonitor::instance())); -} - -// Must be called under lock -void EventProfilerController::stop(CUcontext ctx) { - profilerMap()[ctx] = nullptr; -} - -bool EventProfilerController::canAcceptConfig() { - std::lock_guard guard(mutex_); - return !newOnDemandConfig_; -} - -void EventProfilerController::acceptConfig(const Config& config) { - if (config.eventProfilerOnDemandDuration().count() == 0) { - // Ignore - not for this profiler - return; - } - std::lock_guard guard(mutex_); - if (newOnDemandConfig_) { - LOG(ERROR) << "On demand request already queued - ignoring new request"; - return; - } - newOnDemandConfig_ = config.clone(); - LOG(INFO) << "Received new on-demand config"; -} - -bool EventProfilerController::enableForDevice(Config& cfg) { - // FIXME: Use device unique id! 
- if (!cfg.eventProfilerEnabledForDevice(profiler_->device())) { - return false; - } - // context count includes the new context - int instances = configLoader_.contextCountForGpu(profiler_->device()); - VLOG(0) << "Device context count: " << instances; - return instances >= 0 && instances <= cfg.maxEventProfilersPerGpu(); -} - -void EventProfilerController::profilerLoop() { - // We limit the number of profilers that can exist per GPU - auto config = configLoader_.getConfigCopy(); - if (!enableForDevice(*config)) { - VLOG(0) << "Not starting EventProfiler - profilers for GPU " - << profiler_->device() << " exceeds profilers per GPU limit (" - << config->maxEventProfilersPerGpu() << ")"; - return; - } - - if (!profiler_->setContinuousMode()) { - VLOG(0) << "Continuous mode not supported for GPU " - << profiler_->device() << ". Not starting Event Profiler."; - return; - } - - VLOG(0) << "Starting Event Profiler for GPU " << profiler_->device(); - setThreadName("CUPTI Event Profiler"); - - time_point next_sample_time; - time_point next_report_time; - time_point next_on_demand_report_time; - time_point next_multiplex_time; - std::unique_ptr on_demand_config = nullptr; - bool reconfigure = true; - bool restart = true; - int report_count = 0; - int on_demand_report_count = 0; - while (!stopRunloop_) { - heartbeatMonitor_.profilerHeartbeat(); - if (configLoader_.hasNewConfig(*config)) { - config = configLoader_.getConfigCopy(); - VLOG(0) << "Base config changed"; - report_count = 0; - reconfigure = true; - } - - auto now = system_clock::now(); - if (on_demand_config && - now > (on_demand_config->eventProfilerOnDemandStartTime() + - on_demand_config->eventProfilerOnDemandDuration())) { - on_demand_config = nullptr; - LOG(INFO) << "On-demand profiling complete"; - reconfigure = true; - } - - if (!profiler_->isOnDemandActive()) { - std::lock_guard lock(mutex_); - if (newOnDemandConfig_) { - VLOG(0) << "Received on-demand config, reconfiguring"; - on_demand_config = 
std::move(newOnDemandConfig_); - reconfigure = true; - on_demand_report_count = 0; - } - } - - if (reconfigure) { - try { - profiler_->configure(*config, on_demand_config.get()); - } catch (const std::exception& ex) { - LOG(ERROR) << "Encountered error while configuring event profiler: " - << ex.what(); - // Exit profiling entirely when encountering an error here - // as it indicates a serious problem or bug. - break; - } - configureHeartbeatMonitor( - heartbeatMonitor_, *config, on_demand_config.get()); - reconfigure = false; - restart = true; - } - - if (restart) { - now = system_clock::now(); - next_sample_time = now + profiler_->samplePeriod(); - next_report_time = now + profiler_->reportPeriod(); - if (profiler_->isOnDemandActive()) { - next_on_demand_report_time = now + profiler_->onDemandReportPeriod(); - } - next_multiplex_time = now + profiler_->multiplexPeriod(); - // Collect an initial sample and throw it away - // The next sample is the first valid one - profiler_->collectSample(); - profiler_->clearSamples(); - restart = false; - } - - auto start_sleep = now; - while (now < next_sample_time) { - /* sleep override */ - std::this_thread::sleep_for(next_sample_time - now); - now = system_clock::now(); - } - int sleep_time = duration_cast(now - start_sleep).count(); - - auto start_sample = now; - profiler_->collectSample(); - now = system_clock::now(); - int sample_time = duration_cast(now - start_sample).count(); - - next_sample_time += profiler_->samplePeriod(); - if (now > next_sample_time) { - reportLateSample(sleep_time, sample_time, 0, 0); - restart = true; - continue; - } - - auto start_report = now; - if (now > next_report_time) { - VLOG(1) << "Report #" << report_count++; - profiler_->reportSamples(); - next_report_time += profiler_->reportPeriod(); - } - if (profiler_->isOnDemandActive() && now > next_on_demand_report_time) { - VLOG(1) << "OnDemand Report #" << on_demand_report_count++; - profiler_->reportOnDemandSamples(); - 
next_on_demand_report_time += profiler_->onDemandReportPeriod(); - } - profiler_->eraseReportedSamples(); - now = system_clock::now(); - int report_time = duration_cast(now - start_report).count(); - - if (now > next_sample_time) { - reportLateSample(sleep_time, sample_time, report_time, 0); - restart = true; - continue; - } - - auto start_multiplex = now; - if (profiler_->multiplexEnabled() && now > next_multiplex_time) { - profiler_->enableNextCounterSet(); - next_multiplex_time += profiler_->multiplexPeriod(); - } - now = system_clock::now(); - int multiplex_time = - duration_cast(now - start_multiplex).count(); - - if (now > next_sample_time) { - reportLateSample(sleep_time, sample_time, report_time, multiplex_time); - restart = true; - } - - VLOG(0) << "Runloop execution time: " - << duration_cast(now - start_sample).count() << "ms"; - } - - VLOG(0) << "Device " << profiler_->device() - << ": Exited event profiling loop"; -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.h b/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.h deleted file mode 100644 index 007a82faa9289ada9256d09907167471eb6520b9..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/EventProfilerController.h +++ /dev/null @@ -1,63 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include - -#include - -#include "ConfigLoader.h" - -namespace KINETO_NAMESPACE { - -class Config; -class ConfigLoader; -class EventProfiler; -class SampleListener; - -namespace detail { -class HeartbeatMonitor; -} - -class EventProfilerController : public ConfigLoader::ConfigHandler { - public: - EventProfilerController(const EventProfilerController&) = delete; - EventProfilerController& operator=(const EventProfilerController&) = delete; - - ~EventProfilerController(); - - static void start(CUcontext ctx, ConfigLoader& configLoader); - static void stop(CUcontext ctx); - - static void addLoggerFactory( - std::function(const Config&)> factory); - - static void addOnDemandLoggerFactory( - std::function(const Config&)> factory); - - bool canAcceptConfig() override; - - void acceptConfig(const Config& config) override; - - private: - explicit EventProfilerController( - CUcontext context, - ConfigLoader& configLoader, - detail::HeartbeatMonitor& heartbeatMonitor); - bool enableForDevice(Config& cfg); - void profilerLoop(); - - ConfigLoader& configLoader_; - std::unique_ptr newOnDemandConfig_; - detail::HeartbeatMonitor& heartbeatMonitor_; - std::unique_ptr profiler_; - std::unique_ptr profilerThread_; - std::atomic_bool stopRunloop_{false}; - std::mutex mutex_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/GenericTraceActivity.cpp b/plugins/tensorboard-plugins/libkineto/src/GenericTraceActivity.cpp deleted file mode 100644 index 4e00b1256c4fa301e288e619ee9ef8c56c8b8569..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/GenericTraceActivity.cpp +++ /dev/null @@ -1,10 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "GenericTraceActivity.h" -#include "output_base.h" - -namespace libkineto { - void GenericTraceActivity::log(ActivityLogger& logger) const { - logger.handleGenericActivity(*this); - } -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/ILoggerObserver.cpp b/plugins/tensorboard-plugins/libkineto/src/ILoggerObserver.cpp deleted file mode 100644 index f0106578811837c9cc677def30d5697d43a94221..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ILoggerObserver.cpp +++ /dev/null @@ -1,54 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ILoggerObserver.h" - -#if !USE_GOOGLE_LOG - -#include -#include - -namespace libkineto { - -struct LoggerTypeName { - constexpr LoggerTypeName(const char* n, LoggerOutputType t) : name(n), type(t) {}; - const char* name; - LoggerOutputType type; -}; - -static constexpr std::array LoggerMap{{ - {"VERBOSE", LoggerOutputType::VERBOSE}, - {"INFO", LoggerOutputType::INFO}, - {"WARNING", LoggerOutputType::WARNING}, - {"ERROR", LoggerOutputType::ERROR}, - {"STAGE", LoggerOutputType::STAGE}, - {"???", LoggerOutputType::ENUM_COUNT} -}}; - -static constexpr bool matchingOrder(int idx = 0) { - return LoggerMap[idx].type == LoggerOutputType::ENUM_COUNT || - ((idx == (int) LoggerMap[idx].type) && matchingOrder(idx + 1)); -} -static_assert(matchingOrder(), "LoggerTypeName map is out of order"); - -const char* toString(LoggerOutputType t) { - if(t < VERBOSE || t >= ENUM_COUNT) { - return LoggerMap[ENUM_COUNT].name; - } - return LoggerMap[(int)t].name; -} - -LoggerOutputType toLoggerOutputType(const std::string& str) { - for (int i = 0; i < LoggerTypeCount; i++) { - if (str == LoggerMap[i].name) { - return LoggerMap[i].type; - } - } - throw std::invalid_argument(fmt::format("Invalid activity type: {}", str)); -} - -} // namespace libkineto - - -#endif 
// !USE_GOOGLE_LOG diff --git a/plugins/tensorboard-plugins/libkineto/src/Logger.cpp b/plugins/tensorboard-plugins/libkineto/src/Logger.cpp deleted file mode 100644 index dbde765f51f7a5f03c31a9c79e6d00ce9a2070b6..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/Logger.cpp +++ /dev/null @@ -1,136 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "Logger.h" -#include "ILoggerObserver.h" - -#ifndef USE_GOOGLE_LOG - -#include -#include -#include -#include -#include - -#include -#include - -#include "ThreadUtil.h" - -namespace KINETO_NAMESPACE { - -std::atomic_int Logger::severityLevel_{VERBOSE}; -std::atomic_int Logger::verboseLogLevel_{-1}; -std::atomic Logger::verboseLogModules_{~0ull}; - -#pragma GCC diagnostic push -#pragma GCC diagnostic ignored "-Wglobal-constructors" -std::mutex Logger::loggerObserversMutex_; -#pragma GCC diagnostic pop - - -Logger::Logger(int severity, int line, const char* filePath, int errnum) - : buf_(), out_(LIBKINETO_DBG_STREAM), errnum_(errnum), messageSeverity_(severity) { - buf_ << toString((LoggerOutputType) severity) << ":"; - - const auto tt = - std::chrono::system_clock::to_time_t(std::chrono::system_clock::now()); - const char* file = strrchr(filePath, '/'); - buf_ << fmt::format("{:%Y-%m-%d %H:%M:%S}", fmt::localtime(tt)) << " " - << processId() << ":" << systemThreadId() << " " - << (file ? file + 1 : filePath) << ":" << line << "] "; -} - -Logger::~Logger() { -#ifdef __linux__ - if (errnum_ != 0) { - thread_local char buf[1024]; - buf_ << " : " << strerror_r(errnum_, buf, sizeof(buf)); - } -#endif - - { - std::lock_guard guard(loggerObserversMutex_); - for (auto* observer : loggerObservers()) { - // Output to observers. Current Severity helps keep track of which bucket the output goes. 
- if (observer) { - observer->write(buf_.str(), (LoggerOutputType) messageSeverity_); - } - } - } - - // Finally, print to terminal or console. - out_ << buf_.str() << std::endl; -} - -void Logger::setVerboseLogModules(const std::vector& modules) { - uint64_t mask = 0; - if (modules.empty()) { - mask = ~0ull; - } else { - for (const std::string& name : modules) { - mask |= hash(name.c_str()); - } - } - verboseLogModules_ = mask; -} - -void Logger::addLoggerObserver(ILoggerObserver* observer) { - if (observer == nullptr) { - return; - } - std::lock_guard guard(loggerObserversMutex_); - loggerObservers().insert(observer); -} - -void Logger::removeLoggerObserver(ILoggerObserver* observer) { - std::lock_guard guard(loggerObserversMutex_); - loggerObservers().erase(observer); -} - -void Logger::addLoggerObserverDevice(int64_t device) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->addDevice(device); - } -} - -void Logger::addLoggerObserverEventCount(int64_t count) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->addEventCount(count); - } -} - -void Logger::setLoggerObserverTraceDurationMS(int64_t duration) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->setTraceDurationMS(duration); - } -} - -void Logger::setLoggerObserverTraceID(const std::string& tid) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->setTraceID(tid); - } -} - -void Logger::setLoggerObserverGroupTraceID(const std::string& gtid) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->setGroupTraceID(gtid); - } -} - -void Logger::addLoggerObserverDestination(const std::string& dest) { - std::lock_guard guard(loggerObserversMutex_); - for (auto observer : loggerObservers()) { - observer->addDestination(dest); - } -} - -} // 
namespace KINETO_NAMESPACE - -#endif // USE_GOOGLE_LOG diff --git a/plugins/tensorboard-plugins/libkineto/src/Logger.h b/plugins/tensorboard-plugins/libkineto/src/Logger.h deleted file mode 100644 index 868fc84b9f4ee86d88805bed81468a5df6988257..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/Logger.h +++ /dev/null @@ -1,244 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -#define LIBKINETO_DBG_STREAM std::cerr - -#if USE_GOOGLE_LOG - -#include - -#define SET_LOG_SEVERITY_LEVEL(level) -#define SET_LOG_VERBOSITY_LEVEL(level, modules) -#define LOGGER_OBSERVER_ADD_DEVICE(device) -#define LOGGER_OBSERVER_ADD_EVENT_COUNT(count) -#define LOGGER_OBSERVER_SET_TRACE_DURATION_MS(duration) -#define LOGGER_OBSERVER_SET_TRACE_ID(tid) -#define LOGGER_OBSERVER_SET_GROUP_TRACE_ID(gtid) -#define LOGGER_OBSERVER_ADD_DESTINATION(dest) -#define UST_LOGGER_MARK_COMPLETED(stage) - -#else // !USE_GOOGLE_LOG -#include -#include -#include -#include -#include -#include -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ILoggerObserver.h" - -#ifdef _MSC_VER -// unset a predefined ERROR (windows) -#undef ERROR -#endif // _MSC_VER - -namespace KINETO_NAMESPACE { - -class Logger { - public: - Logger(int severity, int line, const char* filePath, int errnum = 0); - ~Logger(); - - inline std::ostream& stream() { - return buf_; - } - - static inline void setSeverityLevel(int level) { - severityLevel_ = level; - } - - static inline int severityLevel() { - return severityLevel_; - } - - static inline void setVerboseLogLevel(int level) { - verboseLogLevel_ = level; - } - - static inline int verboseLogLevel() { - return verboseLogLevel_; - } - - // This is constexpr so that the hash for a file name is computed at compile - // time when used in the VLOG macros. 
- // This way, there is no string comparison for matching VLOG modules, - // only a comparison of pre-computed hashes. - // No fancy hashing needed here. It's pretty inefficient (one character - // at a time) but the strings are not large and it's not in the critical path. - static constexpr uint64_t rol(uint64_t val, int amount) { - return val << amount | val >> (63 - amount); - } - static constexpr uint64_t hash(const char* s) { - uint64_t hash = hash_rec(s, 0); - return hash & rol(0x41a0240682483014ull, hash & 63); - } - static constexpr uint64_t hash_rec(const char* s, int off) { - // Random constants! - return (!s[off] ? 57ull : (hash_rec(s, off + 1) * 293) ^ s[off]); - } - static constexpr const char* basename(const char* s, int off = 0) { - return !s[off] - ? s - : s[off] == '/' ? basename(&s[off + 1]) : basename(s, off + 1); - } - - static void setVerboseLogModules(const std::vector& modules); - - static inline uint64_t verboseLogModules() { - return verboseLogModules_; - } - - static void clearLoggerObservers() { - std::lock_guard g(loggerObserversMutex_); - loggerObservers().clear(); - } - - static void addLoggerObserver(ILoggerObserver* observer); - - static void removeLoggerObserver(ILoggerObserver* observer); - - static void addLoggerObserverDevice(int64_t device); - - static void addLoggerObserverEventCount(int64_t count); - - static void setLoggerObserverTraceDurationMS(int64_t duration); - - static void setLoggerObserverTraceID(const std::string& tid); - - static void setLoggerObserverGroupTraceID(const std::string& gtid); - - static void addLoggerObserverDestination(const std::string& dest); - - private: - std::stringstream buf_; - std::ostream& out_; - int errnum_; - int messageSeverity_; - static std::atomic_int severityLevel_; - static std::atomic_int verboseLogLevel_; - static std::atomic verboseLogModules_; - static std::set& loggerObservers() { - static auto* inst = new std::set(); - return *inst; - } - static std::mutex 
loggerObserversMutex_; -}; - -class VoidLogger { - public: - VoidLogger() {} - void operator&(std::ostream&) {} -}; - -} // namespace KINETO_NAMESPACE - -#ifdef LOG // Undefine in case these are already defined (quite likely) -#undef LOG -#undef LOG_IS_ON -#undef LOG_IF -#undef LOG_EVERY_N -#undef LOG_IF_EVERY_N -#undef DLOG -#undef DLOG_IF -#undef VLOG -#undef VLOG_IF -#undef VLOG_EVERY_N -#undef VLOG_IS_ON -#undef DVLOG -#undef LOG_FIRST_N -#undef CHECK -#undef DCHECK -#undef DCHECK_EQ -#undef PLOG -#undef PCHECK -#undef LOG_OCCURRENCES -#endif - -#define LOG_IS_ON(severity) \ - (severity >= libkineto::Logger::severityLevel()) - -#define LOG_IF(severity, condition) \ - !(LOG_IS_ON(severity) && (condition)) ? (void)0 : libkineto::VoidLogger() & \ - libkineto::Logger(severity, __LINE__, __FILE__).stream() - -#define LOG(severity) LOG_IF(severity, true) - -#define LOCAL_VARNAME_CONCAT(name, suffix) _##name##suffix##_ - -#define LOCAL_VARNAME(name) LOCAL_VARNAME_CONCAT(name, __LINE__) - -#define LOG_OCCURRENCES LOCAL_VARNAME(log_count) - -#define LOG_EVERY_N(severity, rate) \ - static int LOG_OCCURRENCES = 0; \ - LOG_IF(severity, LOG_OCCURRENCES++ % rate == 0) \ - << "(x" << LOG_OCCURRENCES << ") " - -template -struct __to_constant__ { - static const uint64_t val = n; -}; -#define FILENAME_HASH \ - __to_constant__::val -#define VLOG_IS_ON(verbosity) \ - (libkineto::Logger::verboseLogLevel() >= verbosity && \ - (libkineto::Logger::verboseLogModules() & FILENAME_HASH) == FILENAME_HASH) - -#define VLOG_IF(verbosity, condition) \ - LOG_IF(VERBOSE, VLOG_IS_ON(verbosity) && (condition)) - -#define VLOG(verbosity) VLOG_IF(verbosity, true) - -#define VLOG_EVERY_N(verbosity, rate) \ - static int LOG_OCCURRENCES = 0; \ - VLOG_IF(verbosity, LOG_OCCURRENCES++ % rate == 0) \ - << "(x" << LOG_OCCURRENCES << ") " - -#define PLOG(severity) \ - libkineto::Logger(severity, __LINE__, __FILE__, errno).stream() - -#define SET_LOG_SEVERITY_LEVEL(level) \ - 
libkineto::Logger::setSeverityLevel(level) - -#define SET_LOG_VERBOSITY_LEVEL(level, modules) \ - libkineto::Logger::setVerboseLogLevel(level); \ - libkineto::Logger::setVerboseLogModules(modules) - -// Logging the set of devices the trace is collect on. -#define LOGGER_OBSERVER_ADD_DEVICE(device_count) \ - libkineto::Logger::addLoggerObserverDevice(device_count) - -// Incrementing the number of events collected by this trace. -#define LOGGER_OBSERVER_ADD_EVENT_COUNT(count) \ - libkineto::Logger::addLoggerObserverEventCount(count) - -// Record duration of trace in milliseconds. -#define LOGGER_OBSERVER_SET_TRACE_DURATION_MS(duration) \ - libkineto::Logger::setLoggerObserverTraceDurationMS(duration) - -// Record the trace id when given. -#define LOGGER_OBSERVER_SET_TRACE_ID(tid) \ - libkineto::Logger::setLoggerObserverTraceID(tid) - -// Record the group trace id when given. -#define LOGGER_OBSERVER_SET_GROUP_TRACE_ID(gtid) \ - libkineto::Logger::setLoggerObserverGroupTraceID(gtid) - -// Log the set of destinations the trace is sent to. -#define LOGGER_OBSERVER_ADD_DESTINATION(dest) \ - libkineto::Logger::addLoggerObserverDestination(dest) - -// UST Logger Semantics to describe when a stage is complete. -#define UST_LOGGER_MARK_COMPLETED(stage) \ - LOG(libkineto::LoggerOutputType::STAGE) << "Completed Stage: " << stage - -#endif // USE_GOOGLE_LOG diff --git a/plugins/tensorboard-plugins/libkineto/src/LoggerCollector.h b/plugins/tensorboard-plugins/libkineto/src/LoggerCollector.h deleted file mode 100644 index bb05aab218dc137cfe2f0107694a049ee2ea6508..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/LoggerCollector.h +++ /dev/null @@ -1,70 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#if !USE_GOOGLE_LOG - -#include -#include -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "ILoggerObserver.h" - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -class LoggerCollector : public ILoggerObserver { - public: - LoggerCollector() : buckets_() {} - - void write(const std::string& message, LoggerOutputType ot = ERROR) override { - // Skip STAGE output type which is only used by USTLoggerCollector. - if (ot != STAGE) { - buckets_[ot].push_back(message); - } - } - - const std::map> extractCollectorMetadata() override { - return buckets_; - } - - void reset() override { - trace_duration_ms = 0; - event_count = 0; - destinations.clear(); - } - - void addDevice(const int64_t device) override { - devices.insert(device); - } - - void setTraceDurationMS(const int64_t duration) override { - trace_duration_ms = duration; - } - - void addEventCount(const int64_t count) override { - event_count += count; - } - - void addDestination(const std::string& dest) override { - destinations.insert(dest); - } - - protected: - std::map> buckets_; - - // These are useful metadata to collect from CUPTIActivityProfiler for internal tracking. - std::set devices; - int64_t trace_duration_ms{0}; - std::atomic event_count{0}; - std::set destinations; - -}; - -} // namespace KINETO_NAMESPACE - -#endif // !USE_GOOGLE_LOG diff --git a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.cpp b/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.cpp deleted file mode 100644 index 73eff13e2a08bcfecefb03f5b229bde89b7e96cb..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.cpp +++ /dev/null @@ -1,569 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "RoctracerActivityApi.h" - -#include -#include -#include - -#include "Demangle.h" -#include "output_base.h" -#include "ThreadUtil.h" - -typedef uint64_t timestamp_t; - -static timestamp_t timespec_to_ns(const timespec& time) { - return ((timestamp_t)time.tv_sec * 1000000000) + time.tv_nsec; - } - -using namespace std::chrono; - -namespace KINETO_NAMESPACE { - -constexpr size_t kBufSize(2 * 1024 * 1024); - -RoctracerActivityApi& RoctracerActivityApi::singleton() { - static RoctracerActivityApi instance; - return instance; -} - -RoctracerActivityApi::RoctracerActivityApi() { - gpuTraceBuffers_ = std::make_unique>(); -} - -RoctracerActivityApi::~RoctracerActivityApi() { - disableActivities(std::set()); - endTracing(); -} - -void RoctracerActivityApi::pushCorrelationID(int id, CorrelationFlowType type) { -#ifdef HAS_ROCTRACER - if (!singleton().externalCorrelationEnabled_) { - return; - } - // placeholder -#endif -} - -void RoctracerActivityApi::popCorrelationID(CorrelationFlowType type) { -#ifdef HAS_ROCTRACER - if (!singleton().externalCorrelationEnabled_) { - return; - } - // placeholder -#endif -} - -void RoctracerActivityApi::setMaxBufferSize(int size) { - maxGpuBufferCount_ = 1 + size / kBufSize; -} - -int RoctracerActivityApi::processActivities( - ActivityLogger& logger) { - // Find offset to map from monotonic clock to system clock. - // This will break time-ordering of events but is status quo. 
- - timespec t0, t1, t00; - clock_gettime(CLOCK_REALTIME, &t0); - clock_gettime(CLOCK_MONOTONIC, &t1); - clock_gettime(CLOCK_REALTIME, &t00); - - const timestamp_t toffset = (timespec_to_ns(t0) >> 1) + (timespec_to_ns(t00) >> 1) - timespec_to_ns(t1); - - int count = 0; - - // Basic Api calls - - for (auto &item : rows_) { - GenericTraceActivity a; - a.startTime = (item.begin + toffset) / 1000; - a.endTime = (item.end + toffset) / 1000; - a.id = item.id; - a.device = item.pid; - a.resource = item.tid; - a.activityType = ActivityType::CUDA_RUNTIME; - a.activityName = std::string(roctracer_op_string(ACTIVITY_DOMAIN_HIP_API, item.cid, 0)); - a.flow.id = item.id; - a.flow.type = kLinkAsyncCpuGpu; - a.flow.start = true; - - logger.handleGenericActivity(a); - ++count; - } - - // Malloc/Free calls - for (auto &item : mallocRows_) { - GenericTraceActivity a; - a.startTime = (item.begin + toffset) / 1000; - a.endTime = (item.end + toffset) / 1000; - a.id = item.id; - a.device = item.pid; - a.resource = item.tid; - a.activityType = ActivityType::CUDA_RUNTIME; - a.activityName = std::string(roctracer_op_string(ACTIVITY_DOMAIN_HIP_API, item.cid, 0)); - a.flow.id = item.id; - a.flow.type = kLinkAsyncCpuGpu; - a.flow.start = true; - - a.addMetadata("ptr", item.ptr); - if (item.cid == HIP_API_ID_hipMalloc) { - a.addMetadata("size", item.size); - } - - logger.handleGenericActivity(a); - ++count; - } - - // HipMemcpy calls - for (auto &item : copyRows_) { - GenericTraceActivity a; - a.startTime = (item.begin + toffset) / 1000; - a.endTime = (item.end + toffset) / 1000; - a.id = item.id; - a.device = item.pid; - a.resource = item.tid; - a.activityType = ActivityType::CUDA_RUNTIME; - a.activityName = std::string(roctracer_op_string(ACTIVITY_DOMAIN_HIP_API, item.cid, 0)); - a.flow.id = item.id; - a.flow.type = kLinkAsyncCpuGpu; - a.flow.start = true; - - a.addMetadata("src", item.src); - a.addMetadata("dst", item.dst); - a.addMetadata("size", item.size); - a.addMetadata("kind", 
item.kind); - if ((item.cid == HIP_API_ID_hipMemcpyAsync) || (item.cid == HIP_API_ID_hipMemcpyWithStream)) { - a.addMetadata("stream", fmt::format("{}", reinterpret_cast(item.stream))); - } - - logger.handleGenericActivity(a); - ++count; - } - - // Kernel Launch Api calls - - for (auto &item : kernelRows_) { - GenericTraceActivity a; - a.startTime = (item.begin + toffset) / 1000; - a.endTime = (item.end + toffset) / 1000; - a.id = item.id; - a.device = item.pid; - a.resource = item.tid; - a.activityType = ActivityType::CUDA_RUNTIME; - a.activityName = std::string(roctracer_op_string(ACTIVITY_DOMAIN_HIP_API, item.cid, 0)); - a.flow.id = item.id; - a.flow.type = kLinkAsyncCpuGpu; - a.flow.start = true; - - if (item.functionAddr != nullptr) { - a.addMetadataQuoted( - "kernel", demangle(hipKernelNameRefByPtr(item.functionAddr, item.stream))); - } - else if (item.function != nullptr) { - a.addMetadataQuoted( - "kernel", demangle(hipKernelNameRef(item.function))); - } - a.addMetadata("grid dim", fmt::format("[{}, {}, {}]", item.gridX, item.gridY, item.gridZ)); - a.addMetadata("block dim", fmt::format("[{}, {}, {}]", item.workgroupX, item.workgroupY, item.workgroupZ)); - a.addMetadata("shared size", item.groupSegmentSize); - a.addMetadata("stream", fmt::format("{}", reinterpret_cast(item.stream))); - - // Stash launches to tie to the async ops - kernelLaunches_[a.id] = a; - - // Stash kernel names to tie to the async ops - std::string name; - if (item.functionAddr != nullptr) { - name = demangle(hipKernelNameRefByPtr(item.functionAddr, item.stream)); - } - else if (item.function != nullptr) { - name = demangle(hipKernelNameRef(item.function)); - } - if (!name.empty()) { - uint32_t string_id = reverseStrings_[name]; - if (string_id == 0) { - string_id = nextStringId_++; - reverseStrings_[name] = string_id; - strings_[string_id] = name; - } - kernelNames_[item.id] = string_id; - } - - logger.handleGenericActivity(a); - ++count; - } - - // Async Ops - - for (auto& buffer : 
*gpuTraceBuffers_) { - const roctracer_record_t* record = (const roctracer_record_t*)(buffer.data); - const roctracer_record_t* end_record = (const roctracer_record_t*)(buffer.data + buffer.validSize); - GenericTraceActivity a; - - while (record < end_record) { - if ((record->domain == ACTIVITY_DOMAIN_HIP_API) && (loggedIds_.contains(record->op))) { - const char *name = roctracer_op_string(record->domain, record->op, record->kind); - a.device = record->process_id; - a.resource = record->thread_id; - - a.startTime = (record->begin_ns + toffset) / 1000; - a.endTime = (record->end_ns + toffset) / 1000; - a.id = record->correlation_id; - - a.activityType = ActivityType::CUDA_RUNTIME; - a.activityName = std::string(name); - a.flow.id = record->correlation_id; - a.flow.type = kLinkAsyncCpuGpu; - a.flow.start = true; - - logger.handleGenericActivity(a); - ++count; - } - else if (record->domain == ACTIVITY_DOMAIN_HCC_OPS) { - // Overlay launch metadata for kernels - auto kit = kernelLaunches_.find(record->correlation_id); - if (kit != kernelLaunches_.end()) { - a = (*kit).second; - } - - const char *name = roctracer_op_string(record->domain, record->op, record->kind); - a.device = record->device_id; - a.resource = record->queue_id; - - a.startTime = (record->begin_ns + toffset) / 1000; - a.endTime = (record->end_ns + toffset) / 1000; - a.id = record->correlation_id; - - a.activityType = ActivityType::CONCURRENT_KERNEL; - a.activityName = std::string(name); - a.flow.id = record->correlation_id; - a.flow.type = kLinkAsyncCpuGpu; - - auto it = kernelNames_.find(record->correlation_id); - if (it != kernelNames_.end()) { - a.activityName = strings_[it->second]; - } - - logger.handleGenericActivity(a); - ++count; - } - - roctracer_next_record(record, &record); - } - } - return count; -} - -void RoctracerActivityApi::clearActivities() { - gpuTraceBuffers_->clear(); - rows_.clear(); - kernelRows_.clear(); - copyRows_.clear(); - mallocRows_.clear(); - kernelLaunches_.clear(); -} - 
-void RoctracerActivityApi::api_callback(uint32_t domain, uint32_t cid, const void* callback_data, void* arg) -{ - RoctracerActivityApi *dis = &singleton(); - - if (domain == ACTIVITY_DOMAIN_HIP_API && dis->loggedIds_.contains(cid)) { - const hip_api_data_t* data = (const hip_api_data_t*)(callback_data); - - // Pack callbacks into row structures - - static timespec timestamp; // FIXME verify thread safety - - if (data->phase == ACTIVITY_API_PHASE_ENTER) { - clock_gettime(CLOCK_MONOTONIC, &timestamp); // record proper clock - } - else { // (data->phase == ACTIVITY_API_PHASE_EXIT) - timespec endTime; - timespec startTime { timestamp }; - clock_gettime(CLOCK_MONOTONIC, &endTime); // record proper clock - - switch (cid) { - case HIP_API_ID_hipLaunchKernel: - case HIP_API_ID_hipExtLaunchKernel: - case HIP_API_ID_hipLaunchCooperativeKernel: // Should work here - { - auto &args = data->args.hipLaunchKernel; - dis->kernelRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - args.function_address, - nullptr, - args.numBlocks.x, - args.numBlocks.y, - args.numBlocks.z, - args.dimBlocks.x, - args.dimBlocks.y, - args.dimBlocks.z, - args.sharedMemBytes, - args.stream - ); - } - break; - case HIP_API_ID_hipHccModuleLaunchKernel: - case HIP_API_ID_hipModuleLaunchKernel: - case HIP_API_ID_hipExtModuleLaunchKernel: - { - auto &args = data->args.hipModuleLaunchKernel; - dis->kernelRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - nullptr, - args.f, - args.gridDimX, - args.gridDimY, - args.gridDimZ, - args.blockDimX, - args.blockDimY, - args.blockDimZ, - args.sharedMemBytes, - args.stream - ); - } - break; - case HIP_API_ID_hipLaunchCooperativeKernelMultiDevice: - case HIP_API_ID_hipExtLaunchMultiKernelMultiDevice: -#if 0 - { - auto &args =
data->args.hipLaunchCooperativeKernelMultiDevice.launchParamsList__val; - dis->kernelRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - args.function_address, - nullptr, - args.numBlocks.x, - args.numBlocks.y, - args.numBlocks.z, - args.dimBlocks.x, - args.dimBlocks.y, - args.dimBlocks.z, - args.sharedMemBytes, - args.stream - ); - } -#endif - break; - case HIP_API_ID_hipMalloc: - dis->mallocRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - data->args.hipMalloc.ptr__val, - data->args.hipMalloc.size - ); - break; - case HIP_API_ID_hipFree: - dis->mallocRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - data->args.hipFree.ptr, - 0 - ); - break; - case HIP_API_ID_hipMemcpy: - { - auto &args = data->args.hipMemcpy; - dis->copyRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - args.src, - args.dst, - args.sizeBytes, - args.kind, - static_cast(0) // use placeholder? 
- ); - } - break; - case HIP_API_ID_hipMemcpyAsync: - case HIP_API_ID_hipMemcpyWithStream: - { - auto &args = data->args.hipMemcpyAsync; - dis->copyRows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime), - args.src, - args.dst, - args.sizeBytes, - args.kind, - args.stream - ); - } - break; - default: - dis->rows_.emplace_back(data->correlation_id, - domain, - cid, - processId(), - systemThreadId(), - timespec_to_ns(startTime), - timespec_to_ns(endTime) - ); - break; - } - } - } -} - -void RoctracerActivityApi::activity_callback(const char* begin, const char* end, void* arg) -{ - size_t size = end - begin; - uint8_t *buffer = (uint8_t*) malloc(size); - auto &gpuTraceBuffers = singleton().gpuTraceBuffers_; - memcpy(buffer, begin, size); - gpuTraceBuffers->emplace_back(buffer, size); -} - -void RoctracerActivityApi::enableActivities( - const std::set& selected_activities) { -#ifdef HAS_ROCTRACER - if (!registered_) { - roctracer_set_properties(ACTIVITY_DOMAIN_HIP_API, nullptr); // Magic encantation - - // Set some api calls to ignore - loggedIds_.setInvertMode(true); // Omit the specified api - loggedIds_.add("hipGetDevice"); - loggedIds_.add("hipSetDevice"); - loggedIds_.add("hipGetLastError"); - loggedIds_.add("__hipPushCallConfiguration"); - loggedIds_.add("__hipPopCallConfiguration"); - loggedIds_.add("hipCtxSetCurrent"); - loggedIds_.add("hipEventRecord"); - loggedIds_.add("hipEventQuery"); - loggedIds_.add("hipGetDeviceProperties"); - loggedIds_.add("hipPeekAtLastError"); - loggedIds_.add("hipModuleGetFunction"); - loggedIds_.add("hipEventCreateWithFlags"); - - // Enable API callbacks - if (loggedIds_.invertMode() == true) { - // exclusion list - enable entire domain and turn off things in list - roctracer_enable_domain_callback(ACTIVITY_DOMAIN_HIP_API, api_callback, nullptr); - const std::unordered_map &filter = loggedIds_.filterList(); - for (auto it = filter.begin(); 
it != filter.end(); ++it) { - roctracer_disable_op_callback(ACTIVITY_DOMAIN_HIP_API, it->first); - } - } - else { - // inclusion list - only enable things in the list - const std::unordered_map &filter = loggedIds_.filterList(); - roctracer_disable_domain_callback(ACTIVITY_DOMAIN_HIP_API); - for (auto it = filter.begin(); it != filter.end(); ++it) { - roctracer_enable_op_callback(ACTIVITY_DOMAIN_HIP_API, it->first, api_callback, nullptr); - } - } - //roctracer_enable_domain_callback(ACTIVITY_DOMAIN_ROCTX, api_callback, nullptr); - - // Allocate default tracing pool - roctracer_properties_t properties; - memset(&properties, 0, sizeof(roctracer_properties_t)); - properties.buffer_size = 0x1000; - roctracer_open_pool(&properties); - - // Enable async op collection - roctracer_properties_t hcc_cb_properties; - memset(&hcc_cb_properties, 0, sizeof(roctracer_properties_t)); - hcc_cb_properties.buffer_size = 0x4000; - hcc_cb_properties.buffer_callback_fun = activity_callback; - roctracer_open_pool_expl(&hcc_cb_properties, &hccPool_); - roctracer_enable_domain_activity_expl(ACTIVITY_DOMAIN_HCC_OPS, hccPool_); - - registered_ = true; - } - - for (const auto& activity : selected_activities) { - if (activity == ActivityType::EXTERNAL_CORRELATION) { - externalCorrelationEnabled_ = true; - } - } - - roctracer_start(); -#endif -} - -void RoctracerActivityApi::disableActivities( - const std::set& selected_activities) { -#ifdef HAS_ROCTRACER - roctracer_stop(); - roctracer_flush_activity_expl(hccPool_); - - for (const auto& activity : selected_activities) { - if (activity == ActivityType::EXTERNAL_CORRELATION) { - externalCorrelationEnabled_ = false; - } - } -#endif -} - -void RoctracerActivityApi::endTracing() { - if (registered_ == true) { - roctracer_disable_domain_callback(ACTIVITY_DOMAIN_HIP_API); - //roctracer_disable_domain_callback(ACTIVITY_DOMAIN_ROCTX); - - roctracer_disable_domain_activity(ACTIVITY_DOMAIN_HCC_OPS); - roctracer_close_pool_expl(hccPool_); - } -} - - 
-ApiIdList::ApiIdList() -: invert_(true) -{ -} - -void ApiIdList::add(std::string apiName) -{ - uint32_t cid = 0; - if (roctracer_op_code(ACTIVITY_DOMAIN_HIP_API, apiName.c_str(), &cid, nullptr) == ROCTRACER_STATUS_SUCCESS) { - filter_[cid] = 1; - } -} -void ApiIdList::remove(std::string apiName) -{ - uint32_t cid = 0; - if (roctracer_op_code(ACTIVITY_DOMAIN_HIP_API, apiName.c_str(), &cid, nullptr) == ROCTRACER_STATUS_SUCCESS) { - filter_.erase(cid); - } -} - -bool ApiIdList::loadUserPrefs() -{ - // placeholder - return false; -} -bool ApiIdList::contains(uint32_t apiId) -{ - return (filter_.find(apiId) != filter_.end()) ? !invert_ : invert_; // XOR -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.h b/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.h deleted file mode 100644 index 28280253e7c8426e85c11d679785bcd74fa2a0c7..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityApi.h +++ /dev/null @@ -1,171 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#ifdef HAS_ROCTRACER -#include -#include -#include -#include -#include -#endif - -#include "ActivityType.h" -#include "GenericTraceActivity.h" -#include "RoctracerActivityBuffer.h" - - -namespace KINETO_NAMESPACE { - -using namespace libkineto; - -class ApiIdList -{ -public: - ApiIdList(); - bool invertMode() { return invert_; } - void setInvertMode(bool invert) { invert_ = invert; } - void add(std::string apiName); - void remove(std::string apiName); - bool loadUserPrefs(); - bool contains(uint32_t apiId); - const std::unordered_map &filterList() { return filter_; } - -private: - std::unordered_map filter_; - bool invert_; -}; - -struct roctracerRow { - roctracerRow(uint64_t id, uint32_t domain, uint32_t cid, uint32_t pid - , uint32_t tid, uint64_t begin, uint64_t end) - : id(id), domain(domain), cid(cid), pid(pid), tid(tid), begin(begin), end(end) {} - uint64_t id; // correlation_id - uint32_t domain; - uint32_t cid; - uint32_t pid; - uint32_t tid; - uint64_t begin; - uint64_t end; -}; - -struct kernelRow : public roctracerRow { - kernelRow(uint64_t id, uint32_t domain, uint32_t cid, uint32_t pid - , uint32_t tid, uint64_t begin, uint64_t end - , const void *faddr, hipFunction_t function - , unsigned int gx, unsigned int gy, unsigned int gz - , unsigned int wx, unsigned int wy, unsigned int wz - , size_t gss, hipStream_t stream) - : roctracerRow(id, domain, cid, pid, tid, begin, end), functionAddr(faddr) - , function(function), gridX(gx), gridY(gy), gridZ(gz) - , workgroupX(wx), workgroupY(wy), workgroupZ(wz), groupSegmentSize(gss) - , stream(stream) {} - const void* functionAddr; - hipFunction_t function; - unsigned int gridX; - unsigned int gridY; - unsigned int gridZ; - unsigned int workgroupX; - unsigned int workgroupY; - unsigned int workgroupZ; - size_t groupSegmentSize; - hipStream_t stream; -}; - -struct copyRow : public roctracerRow { - 
copyRow(uint64_t id, uint32_t domain, uint32_t cid, uint32_t pid - , uint32_t tid, uint64_t begin, uint64_t end - , const void* src, const void *dst, size_t size, hipMemcpyKind kind - , hipStream_t stream) - : roctracerRow(id, domain, cid, pid, tid, begin, end) - , src(src), dst(dst), size(size), kind(kind), stream(stream) {} - const void *src; - const void *dst; - size_t size; - hipMemcpyKind kind; - hipStream_t stream; -}; - -struct mallocRow : public roctracerRow { - mallocRow(uint64_t id, uint32_t domain, uint32_t cid, uint32_t pid - , uint32_t tid, uint64_t begin, uint64_t end - , const void* ptr, size_t size) - : roctracerRow(id, domain, cid, pid, tid, begin, end) - , ptr(ptr), size(size) {} - const void *ptr; - size_t size; -}; - - -class RoctracerActivityApi { - public: - enum CorrelationFlowType { - Default, - User - }; - - RoctracerActivityApi(); - RoctracerActivityApi(const RoctracerActivityApi&) = delete; - RoctracerActivityApi& operator=(const RoctracerActivityApi&) = delete; - - virtual ~RoctracerActivityApi(); - - static RoctracerActivityApi& singleton(); - - static void pushCorrelationID(int id, CorrelationFlowType type); - static void popCorrelationID(CorrelationFlowType type); - - void enableActivities( - const std::set& selected_activities); - void disableActivities( - const std::set& selected_activities); - void clearActivities(); - - int processActivities(ActivityLogger& logger); - - void setMaxBufferSize(int size); - - std::atomic_bool stopCollection{false}; - - private: - bool registered_{false}; - void endTracing(); - -#ifdef HAS_ROCTRACER - roctracer_pool_t *hccPool_{NULL}; - static void api_callback(uint32_t domain, uint32_t cid, const void* callback_data, void* arg); - static void activity_callback(const char* begin, const char* end, void* arg); - - //Name cache - uint32_t nextStringId_{2}; - std::map strings_; - std::map reverseStrings_; - std::map kernelNames_; - - ApiIdList loggedIds_; - - // Api callback data - std::deque rows_; - 
std::deque kernelRows_; - std::deque copyRows_; - std::deque mallocRows_; - std::map kernelLaunches_; -#endif - - int maxGpuBufferCount_{0}; - std::unique_ptr> gpuTraceBuffers_; - bool externalCorrelationEnabled_{true}; -}; - -} // namespace KINETO_NAMESPACE - diff --git a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityBuffer.h b/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityBuffer.h deleted file mode 100644 index cd8a5709a841b7c988ab3f2d1f3108d693343584..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/RoctracerActivityBuffer.h +++ /dev/null @@ -1,30 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -class RoctracerActivityBuffer { - public: - // data must be allocated using malloc. - // Ownership is transferred to this object. - RoctracerActivityBuffer(uint8_t* data, size_t validSize) - : data(data), validSize(validSize) {} - - ~RoctracerActivityBuffer() { - free(data); - } - - // Allocated by malloc - uint8_t* data{nullptr}; - - // Number of bytes used - size_t validSize; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/SampleListener.h b/plugins/tensorboard-plugins/libkineto/src/SampleListener.h deleted file mode 100644 index bff86ad122a051d4f3dfdbdd329a3b63d93a7c77..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/SampleListener.h +++ /dev/null @@ -1,146 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include - -namespace KINETO_NAMESPACE { - -class Config; - -class SampleValue { - public: - template - explicit SampleValue(T v) { - init(v); - } - - SampleValue(const SampleValue&) = default; - SampleValue& operator=(const SampleValue&) = delete; - SampleValue(SampleValue&&) = default; - SampleValue& operator=(SampleValue&&) = default; - - bool isInt() const { - return type_ == INT64; - } - - int64_t getInt() const { - assert(isInt()); - return int_; - } - - bool isDouble() const { - return type_ == DOUBLE; - } - - double getDouble() const { - assert(isDouble()); - return dbl_; - } - - inline void operator*=(double x) { - assert(isDouble() || isInt()); - if (isDouble()) { - dbl_ *= x; - } else { - int_ = std::round(int_ * x); - } - } - - inline bool operator<(const SampleValue& o) const { - if (type_ != o.type_) { - return type_ < o.type_; - } else if (type_ == INT64) { - return int_ < o.int_; - } else if (type_ == DOUBLE) { - return dbl_ < o.dbl_; - } - assert(false); - return true; - } - - void print(std::ostream& s) const { - if (type_ == INT64) { - s << int_; - } else if (type_ == DOUBLE) { - s << dbl_; - } else { - assert(false); - } - } - - private: - enum Type { INT64, DOUBLE }; - - template - void init(T v); - - Type type_{INT64}; - union { - int64_t int_{0}; - double dbl_; - }; -}; - -template <> -inline void SampleValue::init(uint64_t v) { - int_ = v, type_ = INT64; -} -template <> -inline void SampleValue::init(int64_t v) { - int_ = v, type_ = INT64; -} -template <> -inline void SampleValue::init(int v) { - int_ = v, type_ = INT64; -} -template <> -inline void SampleValue::init(double v) { - dbl_ = v, type_ = DOUBLE; -} - -inline std::ostream& operator<<(std::ostream& out, const SampleValue& s) { - s.print(out); - return out; -} - -using PercentileList = std::vector>; - -struct Stat { - const std::string& name; - const PercentileList percentileValues; - SampleValue total; -}; - -struct Sample { 
- Sample(int stats_count) { - stats.reserve(stats_count); - } - - // Offset in milliseconds from first sample in report - int deltaMsec; - std::vector stats; -}; - -// Inherit from this to be notified of samples -class SampleListener { - public: - SampleListener(const SampleListener&) = delete; - SampleListener& operator=(const SampleListener&) = delete; - - virtual ~SampleListener(){}; - - // Report bucketed & aggregated values for event - virtual void handleSample(int device, const Sample& sample, bool from_new_version) = 0; - - virtual void update(const Config& config) = 0; - - protected: - SampleListener() = default; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/ScopeExit.h b/plugins/tensorboard-plugins/libkineto/src/ScopeExit.h deleted file mode 100644 index b9a6bc83ef942c7fb0e4b198b0396e5d75aa5a3a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ScopeExit.h +++ /dev/null @@ -1,29 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -// Implement a simple scope handler allowing a function to release -// resources when an error or exception occurs - -template -class ScopeExit { - public: - explicit ScopeExit(T t) : t(t) {} - ~ScopeExit() { - t(); - } - T t; -}; - -template -ScopeExit makeScopeExit(T t) { - return ScopeExit(t); -}; - -// Add a level of indirection so __LINE__ is expanded -#define __kINETO_CONCAT(name, line) name##line -#define ANON_VAR(name, line) __kINETO_CONCAT(name, line) - -#define SCOPE_EXIT(func) \ - const auto ANON_VAR(SCOPE_BLOCK, __LINE__) = \ - makeScopeExit([=]() { func; }) diff --git a/plugins/tensorboard-plugins/libkineto/src/ThreadUtil.cpp b/plugins/tensorboard-plugins/libkineto/src/ThreadUtil.cpp deleted file mode 100644 index 0f67d54d58512aa47b05aed69748a6894aa06b1c..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/ThreadUtil.cpp +++ /dev/null @@ -1,203 +0,0 @@ -#include "ThreadUtil.h" - -#ifndef _MSC_VER -#include -#include -#include -#include -#else // _MSC_VER -#include -#include -#define WIN32_LEAN_AND_MEAN -#define NOGDI -#include -#include -#undef ERROR -#endif // _MSC_VER - -#ifdef __ANDROID__ -#include -#endif - -#include -#include -#include - -namespace libkineto { - -namespace { -thread_local int32_t _pid = 0; -thread_local int32_t _tid = 0; -thread_local int32_t _sysTid = 0; -} - -int32_t processId() { - if (!_pid) { -#ifndef _MSC_VER - _pid = (int32_t)getpid(); -#else - _pid = (int32_t)GetCurrentProcessId(); -#endif - } - return _pid; -} - -int32_t systemThreadId() { - if (!_sysTid) { -#ifdef __APPLE__ - _sysTid = (int32_t)syscall(SYS_thread_selfid); -#elif defined _MSC_VER - _sysTid = (int32_t)GetCurrentThreadId(); -#else - _sysTid = (int32_t)syscall(SYS_gettid); -#endif - } - return _sysTid; -} - -int32_t threadId() { - if (!_tid) { -#ifdef __APPLE__ - uint64_t tid; - pthread_threadid_np(nullptr, &tid); - _tid = tid; -#elif defined _MSC_VER - _tid = (int32_t)GetCurrentThreadId(); -#else - 
pthread_t pth = pthread_self(); - int32_t* ptr = reinterpret_cast(&pth); - _tid = *ptr; -#endif - } - return _tid; -} - -namespace { -static constexpr size_t kMaxThreadNameLength = 16; - -static constexpr const char* basename(const char* s, int off = 0) { - return !s[off] - ? s - : s[off] == '/' ? basename(&s[off + 1]) : basename(s, off + 1); -} -#if defined(_MSC_VER) -void *getKernel32Func(const char* procName) { - return GetProcAddress(GetModuleHandleA("KERNEL32.DLL"), procName); -} -#endif -} - -bool setThreadName(const std::string& name) { -#ifdef __APPLE__ - return 0 == pthread_setname_np(name.c_str()); -#elif defined _MSC_VER - // Per https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription - // Use runtime linking to set thread description - static auto _SetThreadDescription = reinterpret_cast(getKernel32Func("SetThreadDescription")); - if (!_SetThreadDescription) { - return false; - } - std::wstring_convert> conv; - std::wstring wname = conv.from_bytes(name); - HRESULT hr = _SetThreadDescription(GetCurrentThread(), wname.c_str()); - return SUCCEEDED(hr); -#else - return 0 == pthread_setname_np(pthread_self(), name.c_str()); -#endif -} - -std::string getThreadName() { -#ifndef _MSC_VER - char buf[kMaxThreadNameLength] = ""; - if ( -#ifndef __ANDROID__ - pthread_getname_np(pthread_self(), buf, kMaxThreadNameLength) != 0 -#else - prctl(PR_GET_NAME, buf, kMaxThreadNameLength) != 0 -#endif - ) { - return "Unknown"; - } - return buf; -#else // _MSC_VER - static auto _GetThreadDescription = reinterpret_cast(getKernel32Func("GetThreadDescription")); - if (!_GetThreadDescription) { - return "Unknown"; - } - PWSTR data; - HRESULT hr = _GetThreadDescription(GetCurrentThread(), &data); - if (!SUCCEEDED(hr)) { - return ""; - } - std::wstring_convert> conv; - std::string name = conv.to_bytes(data); - LocalFree(data); - return name; -#endif -} - -// Linux: -// Extract process name from /proc/pid/cmdline. 
This does not have -// the 16 character limit that /proc/pid/status and /prod/pid/comm has. -std::string processName(int32_t pid) { -#ifdef __linux__ - FILE* cmdfile = fopen(fmt::format("/proc/{}/cmdline", pid).c_str(), "r"); - if (cmdfile != nullptr) { - char* command = nullptr; - int scanned = fscanf(cmdfile, "%ms", &command); - fclose(cmdfile); - if (scanned > 0 && command) { - std::string ret(basename(command)); - free(command); - return ret; - } - } - std::cerr << "Failed to read process name for pid " << pid << std::endl; -#endif - return ""; -} - -// Max number of parent pids to collect, just for extra safeguarding. -constexpr int kMaxParentPids = 10; - -// Return a pair of -static std::pair parentPidAndCommand(int32_t pid) { -#ifdef __linux__ - FILE* statfile = fopen(fmt::format("/proc/{}/stat", pid).c_str(), "r"); - if (statfile == nullptr) { - return std::make_pair(0, ""); - } - int32_t parent_pid; - char* command = nullptr; - int scanned = fscanf(statfile, "%*d (%m[^)]) %*c %d", &command, &parent_pid); - fclose(statfile); - std::pair ret; - if (scanned == 2) { - ret = std::make_pair(parent_pid, std::string(command)); - } else { - std::cerr << "Failed to parse /proc/" << pid << "/stat" << std::endl; - ret = std::make_pair(0, ""); - } - - // The 'm' character in the format tells fscanf to allocate memory - // for the parsed string, which we need to free here. 
- free(command); - return ret; -#else - return std::make_pair(0, ""); -#endif -} - -std::vector> pidCommandPairsOfAncestors() { - std::vector> pairs; - pairs.reserve(kMaxParentPids + 1); - int32_t curr_pid = processId(); - for (int i = 0; i <= kMaxParentPids && curr_pid > 1; i++) { - std::pair ppid_and_comm = parentPidAndCommand(curr_pid); - pairs.push_back(std::make_pair(curr_pid, ppid_and_comm.second)); - curr_pid = ppid_and_comm.first; - } - return pairs; -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/WeakSymbols.cpp b/plugins/tensorboard-plugins/libkineto/src/WeakSymbols.cpp deleted file mode 100644 index 540a5ac8f97c8f38c7ee3d31ea285a3ab7c9f375..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/WeakSymbols.cpp +++ /dev/null @@ -1,12 +0,0 @@ -#include - -#ifndef _MSC_VER -extern "C" { -// This function is needed to avoid superfluous dependency on GNU OpenMP library when cuPTI is linked statically -// For more details see https://github.com/pytorch/pytorch/issues/51026 -__attribute__((weak)) int acc_get_device_type() { - throw std::runtime_error("Dummy implementation of acc_get_device_type is not supposed to be called!"); -} - -} // extern "C" -#endif diff --git a/plugins/tensorboard-plugins/libkineto/src/cupti_call.h b/plugins/tensorboard-plugins/libkineto/src/cupti_call.h deleted file mode 100644 index fd6ebae7691ed607867db5717248ba22f4efa5c0..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/cupti_call.h +++ /dev/null @@ -1,33 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include - -#ifdef HAS_CUPTI - -#include - -#define CUPTI_CALL(call) \ - [&]() -> CUptiResult { \ - CUptiResult _status_ = call; \ - if (_status_ != CUPTI_SUCCESS) { \ - const char* _errstr_ = nullptr; \ - cuptiGetResultString(_status_, &_errstr_); \ - LOG(WARNING) << fmt::format( \ - "function {} failed with error {} ({})", \ - #call, \ - _errstr_, \ - (int)_status_); \ - } \ - return _status_; \ - }() - -#define CUPTI_CALL_NOWARN(call) call - -#else - -#define CUPTI_CALL(call) call -#define CUPTI_CALL_NOWARN(call) call - -#endif // HAS_CUPTI diff --git a/plugins/tensorboard-plugins/libkineto/src/cupti_strings.cpp b/plugins/tensorboard-plugins/libkineto/src/cupti_strings.cpp deleted file mode 100644 index 4535273a277e04b0b6f98b539df82955ef62468f..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/cupti_strings.cpp +++ /dev/null @@ -1,502 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "cupti_strings.h" - -namespace libkineto { - -const char* memcpyKindString( - CUpti_ActivityMemcpyKind kind) { - switch (kind) { - case CUPTI_ACTIVITY_MEMCPY_KIND_HTOD: - return "HtoD"; - case CUPTI_ACTIVITY_MEMCPY_KIND_DTOH: - return "DtoH"; - case CUPTI_ACTIVITY_MEMCPY_KIND_HTOA: - return "HtoA"; - case CUPTI_ACTIVITY_MEMCPY_KIND_ATOH: - return "AtoH"; - case CUPTI_ACTIVITY_MEMCPY_KIND_ATOA: - return "AtoA"; - case CUPTI_ACTIVITY_MEMCPY_KIND_ATOD: - return "AtoD"; - case CUPTI_ACTIVITY_MEMCPY_KIND_DTOA: - return "DtoA"; - case CUPTI_ACTIVITY_MEMCPY_KIND_DTOD: - return "DtoD"; - case CUPTI_ACTIVITY_MEMCPY_KIND_HTOH: - return "HtoH"; - case CUPTI_ACTIVITY_MEMCPY_KIND_PTOP: - return "PtoP"; - default: - break; - } - return ""; -} - -const char* memoryKindString( - CUpti_ActivityMemoryKind kind) { - switch (kind) { - case CUPTI_ACTIVITY_MEMORY_KIND_UNKNOWN: - return "Unknown"; - case CUPTI_ACTIVITY_MEMORY_KIND_PAGEABLE: - return "Pageable"; - case CUPTI_ACTIVITY_MEMORY_KIND_PINNED: - 
return "Pinned"; - case CUPTI_ACTIVITY_MEMORY_KIND_DEVICE: - return "Device"; - case CUPTI_ACTIVITY_MEMORY_KIND_ARRAY: - return "Array"; - case CUPTI_ACTIVITY_MEMORY_KIND_MANAGED: - return "Managed"; - case CUPTI_ACTIVITY_MEMORY_KIND_DEVICE_STATIC: - return "Device Static"; - case CUPTI_ACTIVITY_MEMORY_KIND_MANAGED_STATIC: - return "Managed Static"; - case CUPTI_ACTIVITY_MEMORY_KIND_FORCE_INT: - return "Force Int"; - default: - return "Unrecognized"; - } -} - -const char* overheadKindString( - CUpti_ActivityOverheadKind kind) { - switch (kind) { - case CUPTI_ACTIVITY_OVERHEAD_UNKNOWN: - return "Unknown"; - case CUPTI_ACTIVITY_OVERHEAD_DRIVER_COMPILER: - return "Driver Compiler"; - case CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH: - return "Buffer Flush"; - case CUPTI_ACTIVITY_OVERHEAD_CUPTI_INSTRUMENTATION: - return "Instrumentation"; - case CUPTI_ACTIVITY_OVERHEAD_CUPTI_RESOURCE: - return "Resource"; - case CUPTI_ACTIVITY_OVERHEAD_FORCE_INT: - return "Force Int"; - default: - return "Unrecognized"; - } -} - - - -static const char* runtimeCbidNames[] = { - "INVALID", - "cudaDriverGetVersion", - "cudaRuntimeGetVersion", - "cudaGetDeviceCount", - "cudaGetDeviceProperties", - "cudaChooseDevice", - "cudaGetChannelDesc", - "cudaCreateChannelDesc", - "cudaConfigureCall", - "cudaSetupArgument", - "cudaGetLastError", - "cudaPeekAtLastError", - "cudaGetErrorString", - "cudaLaunch", - "cudaFuncSetCacheConfig", - "cudaFuncGetAttributes", - "cudaSetDevice", - "cudaGetDevice", - "cudaSetValidDevices", - "cudaSetDeviceFlags", - "cudaMalloc", - "cudaMallocPitch", - "cudaFree", - "cudaMallocArray", - "cudaFreeArray", - "cudaMallocHost", - "cudaFreeHost", - "cudaHostAlloc", - "cudaHostGetDevicePointer", - "cudaHostGetFlags", - "cudaMemGetInfo", - "cudaMemcpy", - "cudaMemcpy2D", - "cudaMemcpyToArray", - "cudaMemcpy2DToArray", - "cudaMemcpyFromArray", - "cudaMemcpy2DFromArray", - "cudaMemcpyArrayToArray", - "cudaMemcpy2DArrayToArray", - "cudaMemcpyToSymbol", - "cudaMemcpyFromSymbol", 
- "cudaMemcpyAsync", - "cudaMemcpyToArrayAsync", - "cudaMemcpyFromArrayAsync", - "cudaMemcpy2DAsync", - "cudaMemcpy2DToArrayAsync", - "cudaMemcpy2DFromArrayAsync", - "cudaMemcpyToSymbolAsync", - "cudaMemcpyFromSymbolAsync", - "cudaMemset", - "cudaMemset2D", - "cudaMemsetAsync", - "cudaMemset2DAsync", - "cudaGetSymbolAddress", - "cudaGetSymbolSize", - "cudaBindTexture", - "cudaBindTexture2D", - "cudaBindTextureToArray", - "cudaUnbindTexture", - "cudaGetTextureAlignmentOffset", - "cudaGetTextureReference", - "cudaBindSurfaceToArray", - "cudaGetSurfaceReference", - "cudaGLSetGLDevice", - "cudaGLRegisterBufferObject", - "cudaGLMapBufferObject", - "cudaGLUnmapBufferObject", - "cudaGLUnregisterBufferObject", - "cudaGLSetBufferObjectMapFlags", - "cudaGLMapBufferObjectAsync", - "cudaGLUnmapBufferObjectAsync", - "cudaWGLGetDevice", - "cudaGraphicsGLRegisterImage", - "cudaGraphicsGLRegisterBuffer", - "cudaGraphicsUnregisterResource", - "cudaGraphicsResourceSetMapFlags", - "cudaGraphicsMapResources", - "cudaGraphicsUnmapResources", - "cudaGraphicsResourceGetMappedPointer", - "cudaGraphicsSubResourceGetMappedArray", - "cudaVDPAUGetDevice", - "cudaVDPAUSetVDPAUDevice", - "cudaGraphicsVDPAURegisterVideoSurface", - "cudaGraphicsVDPAURegisterOutputSurface", - "cudaD3D11GetDevice", - "cudaD3D11GetDevices", - "cudaD3D11SetDirect3DDevice", - "cudaGraphicsD3D11RegisterResource", - "cudaD3D10GetDevice", - "cudaD3D10GetDevices", - "cudaD3D10SetDirect3DDevice", - "cudaGraphicsD3D10RegisterResource", - "cudaD3D10RegisterResource", - "cudaD3D10UnregisterResource", - "cudaD3D10MapResources", - "cudaD3D10UnmapResources", - "cudaD3D10ResourceSetMapFlags", - "cudaD3D10ResourceGetSurfaceDimensions", - "cudaD3D10ResourceGetMappedArray", - "cudaD3D10ResourceGetMappedPointer", - "cudaD3D10ResourceGetMappedSize", - "cudaD3D10ResourceGetMappedPitch", - "cudaD3D9GetDevice", - "cudaD3D9GetDevices", - "cudaD3D9SetDirect3DDevice", - "cudaD3D9GetDirect3DDevice", - "cudaGraphicsD3D9RegisterResource", - 
"cudaD3D9RegisterResource", - "cudaD3D9UnregisterResource", - "cudaD3D9MapResources", - "cudaD3D9UnmapResources", - "cudaD3D9ResourceSetMapFlags", - "cudaD3D9ResourceGetSurfaceDimensions", - "cudaD3D9ResourceGetMappedArray", - "cudaD3D9ResourceGetMappedPointer", - "cudaD3D9ResourceGetMappedSize", - "cudaD3D9ResourceGetMappedPitch", - "cudaD3D9Begin", - "cudaD3D9End", - "cudaD3D9RegisterVertexBuffer", - "cudaD3D9UnregisterVertexBuffer", - "cudaD3D9MapVertexBuffer", - "cudaD3D9UnmapVertexBuffer", - "cudaThreadExit", - "cudaSetDoubleForDevice", - "cudaSetDoubleForHost", - "cudaThreadSynchronize", - "cudaThreadGetLimit", - "cudaThreadSetLimit", - "cudaStreamCreate", - "cudaStreamDestroy", - "cudaStreamSynchronize", - "cudaStreamQuery", - "cudaEventCreate", - "cudaEventCreateWithFlags", - "cudaEventRecord", - "cudaEventDestroy", - "cudaEventSynchronize", - "cudaEventQuery", - "cudaEventElapsedTime", - "cudaMalloc3D", - "cudaMalloc3DArray", - "cudaMemset3D", - "cudaMemset3DAsync", - "cudaMemcpy3D", - "cudaMemcpy3DAsync", - "cudaThreadSetCacheConfig", - "cudaStreamWaitEvent", - "cudaD3D11GetDirect3DDevice", - "cudaD3D10GetDirect3DDevice", - "cudaThreadGetCacheConfig", - "cudaPointerGetAttributes", - "cudaHostRegister", - "cudaHostUnregister", - "cudaDeviceCanAccessPeer", - "cudaDeviceEnablePeerAccess", - "cudaDeviceDisablePeerAccess", - "cudaPeerRegister", - "cudaPeerUnregister", - "cudaPeerGetDevicePointer", - "cudaMemcpyPeer", - "cudaMemcpyPeerAsync", - "cudaMemcpy3DPeer", - "cudaMemcpy3DPeerAsync", - "cudaDeviceReset", - "cudaDeviceSynchronize", - "cudaDeviceGetLimit", - "cudaDeviceSetLimit", - "cudaDeviceGetCacheConfig", - "cudaDeviceSetCacheConfig", - "cudaProfilerInitialize", - "cudaProfilerStart", - "cudaProfilerStop", - "cudaDeviceGetByPCIBusId", - "cudaDeviceGetPCIBusId", - "cudaGLGetDevices", - "cudaIpcGetEventHandle", - "cudaIpcOpenEventHandle", - "cudaIpcGetMemHandle", - "cudaIpcOpenMemHandle", - "cudaIpcCloseMemHandle", - "cudaArrayGetInfo", - 
"cudaFuncSetSharedMemConfig", - "cudaDeviceGetSharedMemConfig", - "cudaDeviceSetSharedMemConfig", - "cudaCreateTextureObject", - "cudaDestroyTextureObject", - "cudaGetTextureObjectResourceDesc", - "cudaGetTextureObjectTextureDesc", - "cudaCreateSurfaceObject", - "cudaDestroySurfaceObject", - "cudaGetSurfaceObjectResourceDesc", - "cudaMallocMipmappedArray", - "cudaGetMipmappedArrayLevel", - "cudaFreeMipmappedArray", - "cudaBindTextureToMipmappedArray", - "cudaGraphicsResourceGetMappedMipmappedArray", - "cudaStreamAddCallback", - "cudaStreamCreateWithFlags", - "cudaGetTextureObjectResourceViewDesc", - "cudaDeviceGetAttribute", - "cudaStreamDestroy", - "cudaStreamCreateWithPriority", - "cudaStreamGetPriority", - "cudaStreamGetFlags", - "cudaDeviceGetStreamPriorityRange", - "cudaMallocManaged", - "cudaOccupancyMaxActiveBlocksPerMultiprocessor", - "cudaStreamAttachMemAsync", - "cudaGetErrorName", - "cudaOccupancyMaxActiveBlocksPerMultiprocessor", - "cudaLaunchKernel", - "cudaGetDeviceFlags", - "cudaLaunch_ptsz", - "cudaLaunchKernel_ptsz", - "cudaMemcpy_ptds", - "cudaMemcpy2D_ptds", - "cudaMemcpyToArray_ptds", - "cudaMemcpy2DToArray_ptds", - "cudaMemcpyFromArray_ptds", - "cudaMemcpy2DFromArray_ptds", - "cudaMemcpyArrayToArray_ptds", - "cudaMemcpy2DArrayToArray_ptds", - "cudaMemcpyToSymbol_ptds", - "cudaMemcpyFromSymbol_ptds", - "cudaMemcpyAsync_ptsz", - "cudaMemcpyToArrayAsync_ptsz", - "cudaMemcpyFromArrayAsync_ptsz", - "cudaMemcpy2DAsync_ptsz", - "cudaMemcpy2DToArrayAsync_ptsz", - "cudaMemcpy2DFromArrayAsync_ptsz", - "cudaMemcpyToSymbolAsync_ptsz", - "cudaMemcpyFromSymbolAsync_ptsz", - "cudaMemset_ptds", - "cudaMemset2D_ptds", - "cudaMemsetAsync_ptsz", - "cudaMemset2DAsync_ptsz", - "cudaStreamGetPriority_ptsz", - "cudaStreamGetFlags_ptsz", - "cudaStreamSynchronize_ptsz", - "cudaStreamQuery_ptsz", - "cudaStreamAttachMemAsync_ptsz", - "cudaEventRecord_ptsz", - "cudaMemset3D_ptds", - "cudaMemset3DAsync_ptsz", - "cudaMemcpy3D_ptds", - "cudaMemcpy3DAsync_ptsz", - 
"cudaStreamWaitEvent_ptsz", - "cudaStreamAddCallback_ptsz", - "cudaMemcpy3DPeer_ptds", - "cudaMemcpy3DPeerAsync_ptsz", - "cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags", - "cudaMemPrefetchAsync", - "cudaMemPrefetchAsync_ptsz", - "cudaMemAdvise", - "cudaDeviceGetP2PAttribute", - "cudaGraphicsEGLRegisterImage", - "cudaEGLStreamConsumerConnect", - "cudaEGLStreamConsumerDisconnect", - "cudaEGLStreamConsumerAcquireFrame", - "cudaEGLStreamConsumerReleaseFrame", - "cudaEGLStreamProducerConnect", - "cudaEGLStreamProducerDisconnect", - "cudaEGLStreamProducerPresentFrame", - "cudaEGLStreamProducerReturnFrame", - "cudaGraphicsResourceGetMappedEglFrame", - "cudaMemRangeGetAttribute", - "cudaMemRangeGetAttributes", - "cudaEGLStreamConsumerConnectWithFlags", - "cudaLaunchCooperativeKernel", - "cudaLaunchCooperativeKernel_ptsz", - "cudaEventCreateFromEGLSync", - "cudaLaunchCooperativeKernelMultiDevice", - "cudaFuncSetAttribute", - "cudaImportExternalMemory", - "cudaExternalMemoryGetMappedBuffer", - "cudaExternalMemoryGetMappedMipmappedArray", - "cudaDestroyExternalMemory", - "cudaImportExternalSemaphore", - "cudaSignalExternalSemaphoresAsync", - "cudaSignalExternalSemaphoresAsync_ptsz", - "cudaWaitExternalSemaphoresAsync", - "cudaWaitExternalSemaphoresAsync_ptsz", - "cudaDestroyExternalSemaphore", - "cudaLaunchHostFunc", - "cudaLaunchHostFunc_ptsz", - "cudaGraphCreate", - "cudaGraphKernelNodeGetParams", - "cudaGraphKernelNodeSetParams", - "cudaGraphAddKernelNode", - "cudaGraphAddMemcpyNode", - "cudaGraphMemcpyNodeGetParams", - "cudaGraphMemcpyNodeSetParams", - "cudaGraphAddMemsetNode", - "cudaGraphMemsetNodeGetParams", - "cudaGraphMemsetNodeSetParams", - "cudaGraphAddHostNode", - "cudaGraphHostNodeGetParams", - "cudaGraphAddChildGraphNode", - "cudaGraphChildGraphNodeGetGraph", - "cudaGraphAddEmptyNode", - "cudaGraphClone", - "cudaGraphNodeFindInClone", - "cudaGraphNodeGetType", - "cudaGraphGetRootNodes", - "cudaGraphNodeGetDependencies", - 
"cudaGraphNodeGetDependentNodes", - "cudaGraphAddDependencies", - "cudaGraphRemoveDependencies", - "cudaGraphDestroyNode", - "cudaGraphInstantiate", - "cudaGraphLaunch", - "cudaGraphLaunch_ptsz", - "cudaGraphExecDestroy", - "cudaGraphDestroy", - "cudaStreamBeginCapture", - "cudaStreamBeginCapture_ptsz", - "cudaStreamIsCapturing", - "cudaStreamIsCapturing_ptsz", - "cudaStreamEndCapture", - "cudaStreamEndCapture_ptsz", - "cudaGraphHostNodeSetParams", - "cudaGraphGetNodes", - "cudaGraphGetEdges", - "cudaStreamGetCaptureInfo", - "cudaStreamGetCaptureInfo_ptsz", - "cudaGraphExecKernelNodeSetParams", - "cudaThreadExchangeStreamCaptureMode", - "cudaDeviceGetNvSciSyncAttributes", - "cudaOccupancyAvailableDynamicSMemPerBlock", - "cudaStreamSetFlags", - "cudaStreamSetFlags_ptsz", - "cudaGraphExecMemcpyNodeSetParams", - "cudaGraphExecMemsetNodeSetParams", - "cudaGraphExecHostNodeSetParams", - "cudaGraphExecUpdate", - "cudaGetFuncBySymbol", - "cudaCtxResetPersistingL2Cache", - "cudaGraphKernelNodeCopyAttributes", - "cudaGraphKernelNodeGetAttribute", - "cudaGraphKernelNodeSetAttribute", - "cudaStreamCopyAttributes", - "cudaStreamCopyAttributes_ptsz", - "cudaStreamGetAttribute", - "cudaStreamGetAttribute_ptsz", - "cudaStreamSetAttribute", - "cudaStreamSetAttribute_ptsz", - "cudaDeviceGetTexture1DLinearMaxWidth", - "cudaGraphUpload", - "cudaGraphUpload_ptsz", - "cudaGraphAddMemcpyNodeToSymbol", - "cudaGraphAddMemcpyNodeFromSymbol", - "cudaGraphAddMemcpyNode1D", - "cudaGraphMemcpyNodeSetParamsToSymbol", - "cudaGraphMemcpyNodeSetParamsFromSymbol", - "cudaGraphMemcpyNodeSetParams1D", - "cudaGraphExecMemcpyNodeSetParamsToSymbol", - "cudaGraphExecMemcpyNodeSetParamsFromSymbol", - "cudaGraphExecMemcpyNodeSetParams1D", - "cudaArrayGetSparseProperties", - "cudaMipmappedArrayGetSparseProperties", - "cudaGraphExecChildGraphNodeSetParams", - "cudaGraphAddEventRecordNode", - "cudaGraphEventRecordNodeGetEvent", - "cudaGraphEventRecordNodeSetEvent", - "cudaGraphAddEventWaitNode", - 
"cudaGraphEventWaitNodeGetEvent", - "cudaGraphEventWaitNodeSetEvent", - "cudaGraphExecEventRecordNodeSetEvent", - "cudaGraphExecEventWaitNodeSetEvent", - "cudaEventRecordWithFlags", - "cudaEventRecordWithFlags_ptsz", - "cudaDeviceGetDefaultMemPool", - "cudaMallocAsync", - "cudaMallocAsync_ptsz", - "cudaFreeAsync", - "cudaFreeAsync_ptsz", - "cudaMemPoolTrimTo", - "cudaMemPoolSetAttribute", - "cudaMemPoolGetAttribute", - "cudaMemPoolSetAccess", - "cudaArrayGetPlane", - "cudaMemPoolGetAccess", - "cudaMemPoolCreate", - "cudaMemPoolDestroy", - "cudaDeviceSetMemPool", - "cudaDeviceGetMemPool", - "cudaMemPoolExportToShareableHandle", - "cudaMemPoolImportFromShareableHandle", - "cudaMemPoolExportPointer", - "cudaMemPoolImportPointer", - "cudaMallocFromPoolAsync", - "cudaMallocFromPoolAsync_ptsz", - "cudaSignalExternalSemaphoresAsync", - "cudaSignalExternalSemaphoresAsync", - "cudaWaitExternalSemaphoresAsync", - "cudaWaitExternalSemaphoresAsync", - "cudaGraphAddExternalSemaphoresSignalNode", - "cudaGraphExternalSemaphoresSignalNodeGetParams", - "cudaGraphExternalSemaphoresSignalNodeSetParams", - "cudaGraphAddExternalSemaphoresWaitNode", - "cudaGraphExternalSemaphoresWaitNodeGetParams", - "cudaGraphExternalSemaphoresWaitNodeSetParams", - "cudaGraphExecExternalSemaphoresSignalNodeSetParams", - "cudaGraphExecExternalSemaphoresWaitNodeSetParams", - "SIZE" -}; - -const char* runtimeCbidName(CUpti_CallbackId cbid) { - constexpr int names_size = - sizeof(runtimeCbidNames) / sizeof(runtimeCbidNames[0]); - if (cbid < 0 || cbid >= names_size) { - return runtimeCbidNames[CUPTI_RUNTIME_TRACE_CBID_INVALID]; - } - return runtimeCbidNames[cbid]; -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/cupti_strings.h b/plugins/tensorboard-plugins/libkineto/src/cupti_strings.h deleted file mode 100644 index bbfebb983648005d8268d9a29d613d369d6a5384..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/cupti_strings.h +++ 
/dev/null @@ -1,14 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include - -namespace libkineto { - -const char* memoryKindString(CUpti_ActivityMemoryKind kind); -const char* memcpyKindString(CUpti_ActivityMemcpyKind kind); -const char* runtimeCbidName(CUpti_CallbackId cbid); -const char* overheadKindString(CUpti_ActivityOverheadKind kind); - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/init.cpp b/plugins/tensorboard-plugins/libkineto/src/init.cpp deleted file mode 100644 index 4e1022485ac5d17b5af1e0676b6a4595a138e1b5..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/init.cpp +++ /dev/null @@ -1,139 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include - -#include "ActivityProfilerProxy.h" -#include "Config.h" -#ifdef HAS_CUPTI -#include "CuptiCallbackApi.h" -#include "CuptiActivityApi.h" -#include "EventProfilerController.h" -#endif -#include "cupti_call.h" -#include "libkineto.h" - -#include "Logger.h" - -namespace KINETO_NAMESPACE { - -#ifdef HAS_CUPTI -static bool initialized = false; -static std::mutex initMutex; - -static void initProfilers( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId /*cbid*/, - const CUpti_CallbackData* cbInfo) { - CUpti_ResourceData* d = (CUpti_ResourceData*)cbInfo; - CUcontext ctx = d->context; - - VLOG(0) << "CUDA Context created"; - std::lock_guard lock(initMutex); - - if (!initialized) { - libkineto::api().initProfilerIfRegistered(); - initialized = true; - VLOG(0) << "libkineto profilers activated"; - } - if (getenv("KINETO_DISABLE_EVENT_PROFILER") != nullptr) { - VLOG(0) << "Event profiler disabled via env var"; - } else { - ConfigLoader& config_loader = libkineto::api().configLoader(); - config_loader.initBaseConfig(); - EventProfilerController::start(ctx, config_loader); - } -} - -// Some models suffer from excessive instrumentation code gen 
-// on dynamic attach which can hang for more than 5+ seconds. -// If the workload was meant to be traced, preload the CUPTI -// to take the performance hit early on. -// https://docs.nvidia.com/cupti/r_main.html#r_overhead -static bool shouldPreloadCuptiInstrumentation() { - return getenv("PRELOAD_CUPTI_INSTRUMENTATION"); -} - -static void stopProfiler( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId /*cbid*/, - const CUpti_CallbackData* cbInfo) { - CUpti_ResourceData* d = (CUpti_ResourceData*)cbInfo; - CUcontext ctx = d->context; - - LOG(INFO) << "CUDA Context destroyed"; - std::lock_guard lock(initMutex); - EventProfilerController::stop(ctx); -} -#endif // HAS_CUPTI - -} // namespace KINETO_NAMESPACE - -// Callback interface with CUPTI and library constructors -using namespace KINETO_NAMESPACE; -extern "C" { - -// Return true if no CUPTI errors occurred during init -bool libkineto_init(bool cpuOnly, bool logOnError) { - bool success = true; -#ifdef HAS_CUPTI - if (!cpuOnly) { - // libcupti will be lazily loaded on this call. - // If it is not available (e.g. CUDA is not installed), - // then this call will return an error and we just abort init. 
- auto& cbapi = CuptiCallbackApi::singleton(); - bool status = false; - - if (cbapi.initSuccess()){ - const CUpti_CallbackDomain domain = CUPTI_CB_DOMAIN_RESOURCE; - status = cbapi.registerCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_CREATED, initProfilers); - status = status && cbapi.registerCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_DESTROYED, stopProfiler); - - if (status) { - status = cbapi.enableCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_CREATED); - status = status && cbapi.enableCallback( - domain, CuptiCallbackApi::RESOURCE_CONTEXT_DESTROYED); - } - } - - if (!cbapi.initSuccess() || !status) { - success = false; - cpuOnly = true; - if (logOnError) { - CUPTI_CALL(cbapi.getCuptiStatus()); - LOG(WARNING) << "CUPTI initialization failed - " - << "CUDA profiler activities will be missing"; - LOG(INFO) << "If you see CUPTI_ERROR_INSUFFICIENT_PRIVILEGES, refer to " - << "https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-cupti"; - } - } - } - - if (shouldPreloadCuptiInstrumentation()) { - CuptiActivityApi::forceLoadCupti(); - } -#endif // HAS_CUPTI - - ConfigLoader& config_loader = libkineto::api().configLoader(); - libkineto::api().registerProfiler( - std::make_unique(cpuOnly, config_loader)); - - return success; -} - -// The cuda driver calls this function if the CUDA_INJECTION64_PATH environment -// variable is set -int InitializeInjection(void) { - LOG(INFO) << "Injection mode: Initializing libkineto"; - libkineto_init(false /*cpuOnly*/, true /*logOnError*/); - return 1; -} - -void suppressLibkinetoLogMessages() { - SET_LOG_SEVERITY_LEVEL(ERROR); -} - -} // extern C diff --git a/plugins/tensorboard-plugins/libkineto/src/libkineto_api.cpp b/plugins/tensorboard-plugins/libkineto/src/libkineto_api.cpp deleted file mode 100644 index 9a622e4f5e5cfd54848cb8c6dc05b98da2fb6011..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/libkineto_api.cpp +++ /dev/null @@ -1,41 
+0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "libkineto.h" - -#include "ConfigLoader.h" -#include "ThreadUtil.h" - -namespace libkineto { - -LibkinetoApi& api() { - static LibkinetoApi instance(ConfigLoader::instance()); - return instance; -} - -void LibkinetoApi::initClientIfRegistered() { - if (client_) { - if (clientRegisterThread_ != threadId()) { - fprintf( - stderr, - "ERROR: External init callback must run in same thread as registerClient " - "(%d != %d)\n", - threadId(), - (int)clientRegisterThread_); - } else { - client_->init(); - } - } -} - -void LibkinetoApi::registerClient(ClientInterface* client) { - client_ = client; - if (client && activityProfiler_) { - // Can initialize straight away - client->init(); - } - // Assume here that the external init callback is *not* threadsafe - // and only call it if it's the same thread that called registerClient - clientRegisterThread_ = threadId(); -} - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/src/output_base.h b/plugins/tensorboard-plugins/libkineto/src/output_base.h deleted file mode 100644 index 29d0d57768c91b8593f202cea51071a1affcd88d..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_base.h +++ /dev/null @@ -1,104 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include -#include "CuptiActivity.h" -#endif // HAS_CUPTI -#include "ActivityBuffers.h" -#include "GenericTraceActivity.h" -#include "ThreadUtil.h" -#include "TraceSpan.h" - -namespace KINETO_NAMESPACE { - class Config; - class GpuKernelActivity; - struct RuntimeActivity; -} - -namespace libkineto { - -using namespace KINETO_NAMESPACE; - -class ActivityLogger { - public: - - virtual ~ActivityLogger() = default; - - struct DeviceInfo { - DeviceInfo(int64_t id, const std::string& name, const std::string& label) : - id(id), name(name), label(label) {} - int64_t id; - const std::string name; - const std::string label; - }; - - struct ResourceInfo { - ResourceInfo( - int64_t deviceId, - int64_t id, - int64_t sortIndex, - const std::string& name) : - id(id), sortIndex(sortIndex), deviceId(deviceId), name(name) {} - int64_t id; - int64_t sortIndex; - int64_t deviceId; - const std::string name; - }; - - struct OverheadInfo { - explicit OverheadInfo(const std::string& name) : name(name) {} - const std::string name; - }; - - virtual void handleDeviceInfo( - const DeviceInfo& info, - uint64_t time) = 0; - - virtual void handleResourceInfo(const ResourceInfo& info, int64_t time) = 0; - - virtual void handleOverheadInfo(const OverheadInfo& info, int64_t time) = 0; - - virtual void handleTraceSpan(const TraceSpan& span) = 0; - - virtual void handleActivity( - const libkineto::ITraceActivity& activity) = 0; - virtual void handleGenericActivity( - const libkineto::GenericTraceActivity& activity) = 0; - -#ifdef HAS_CUPTI - virtual void handleGpuActivity( - const GpuActivity& activity) = 0; - virtual void handleGpuActivity( - const GpuActivity& activity) = 0; - virtual void handleGpuActivity( - const GpuActivity& activity) = 0; - virtual void handleGpuActivity( - const GpuActivity& activity) = 0; -#endif // HAS_CUPTI - - virtual void handleTraceStart( - const std::unordered_map& metadata) = 0; - 
- void handleTraceStart() { - handleTraceStart(std::unordered_map()); - } - - virtual void finalizeTrace( - const KINETO_NAMESPACE::Config& config, - std::unique_ptr buffers, - int64_t endTime, - std::unordered_map>& metadata) = 0; - - protected: - ActivityLogger() = default; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/output_csv.cpp b/plugins/tensorboard-plugins/libkineto/src/output_csv.cpp deleted file mode 100644 index e56c02293982745ed0c013b83bd04d9f42ea7305..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_csv.cpp +++ /dev/null @@ -1,88 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "output_csv.h" - -#include -#include -#include - -#include -#include - -#include "Config.h" -#include "Logger.h" - -namespace KINETO_NAMESPACE { - -static void write_header( - std::ostream& out, - const std::vector& percentiles) { - out << "timestamp,delta_ms,device,event_name"; - for (int p : percentiles) { - out << ",p" << p; - } - out << ",total" << std::endl; -} - -void EventCSVLogger::update(const Config& config) { - eventNames_.clear(); - eventNames_.insert(config.eventNames().begin(), config.eventNames().end()); - eventNames_.insert(config.metricNames().begin(), config.metricNames().end()); - if (config.percentiles() != percentiles_) { - percentiles_ = config.percentiles(); - if (out_) { - write_header(*out_, percentiles_); - } - } -} - -void EventCSVLogger::handleSample(int device, const Sample& sample, bool from_new_version) { - using namespace std::chrono; - if (out_) { - auto now = system_clock::now(); - auto time = system_clock::to_time_t(now); - for (const Stat& s : sample.stats) { - if (eventNames_.find(s.name) == eventNames_.end()) { - continue; - } - *out_ << fmt::format("{:%Y-%m-%d %H:%M:%S}", fmt::localtime(time)) << ","; - *out_ << sample.deltaMsec << ","; - *out_ << device << ","; - *out_ << s.name; - for (const auto& p 
: s.percentileValues) { - *out_ << "," << p.second; - } - *out_ << "," << s.total << std::endl; - } - } -} - -void EventCSVFileLogger::update(const Config& config) { - if (config.eventLogFile() != filename_) { - if (of_.is_open()) { - of_.close(); - out_ = nullptr; - percentiles_.clear(); - } - filename_ = config.eventLogFile(); - if (!filename_.empty()) { - of_.open(filename_, std::ios::out | std::ios::trunc); - out_ = &of_; - } - } - EventCSVLogger::update(config); -} - -void EventCSVDbgLogger::update(const Config& config) { - if (out_ && config.verboseLogLevel() < 0) { - out_ = nullptr; - } else if (!out_ && config.verboseLogLevel() >= 0) { - out_ = &LIBKINETO_DBG_STREAM; - } - if (config.verboseLogLevel() >= 0) { - percentiles_.clear(); - EventCSVLogger::update(config); - } -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/output_csv.h b/plugins/tensorboard-plugins/libkineto/src/output_csv.h deleted file mode 100644 index bca29f4db99af8aedf031aed869ff2efd3df6155..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_csv.h +++ /dev/null @@ -1,39 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once -#include "SampleListener.h" - -#include -#include -#include - -namespace KINETO_NAMESPACE { - -class EventCSVLogger : public SampleListener { - public: - void update(const Config& config) override; - void handleSample(int device, const Sample& sample, bool from_new_version) override; - - protected: - EventCSVLogger() : out_(nullptr) {} - - std::ostream* out_; - std::set eventNames_; - std::vector percentiles_; -}; - -class EventCSVFileLogger : public EventCSVLogger { - public: - void update(const Config& config) override; - - private: - std::ofstream of_; - std::string filename_; -}; - -class EventCSVDbgLogger : public EventCSVLogger { - public: - void update(const Config& config) override; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/output_json.cpp b/plugins/tensorboard-plugins/libkineto/src/output_json.cpp deleted file mode 100644 index 0ef22339fad15d6a78e43d7fcb7761fbbc97333b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_json.cpp +++ /dev/null @@ -1,583 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "output_json.h" - -#include -#include -#include -#include - -#include "Config.h" -#ifdef HAS_CUPTI -#include "CuptiActivity.h" -#include "CuptiActivity.tpp" -#include "CuptiActivityApi.h" -#include "CudaDeviceProperties.h" -#endif // HAS_CUPTI -#include "Demangle.h" -#include "TraceSpan.h" - -#include "Logger.h" - -using std::endl; -using namespace libkineto; - -namespace KINETO_NAMESPACE { - -static constexpr int kSchemaVersion = 1; -static constexpr char kFlowStart = 's'; -static constexpr char kFlowEnd = 'f'; - -#ifdef __linux__ -static constexpr char kDefaultLogFileFmt[] = - "/tmp/libkineto_activities_{}.json"; -#else -static constexpr char kDefaultLogFileFmt[] = "libkineto_activities_{}.json"; -#endif - -std::string& ChromeTraceLogger::sanitizeStrForJSON(std::string& value) { -// Replace all backslashes with forward slash because Windows paths causing JSONDecodeError. -#ifdef _WIN32 - std::replace(value.begin(), value.end(), '\\', '/'); -#endif - return value; -} - -void ChromeTraceLogger::metadataToJSON( - const std::unordered_map& metadata) { - for (const auto& kv : metadata) { - traceOf_ << fmt::format(R"JSON( - "{}": {},)JSON", kv.first, kv.second); - } -} - -void ChromeTraceLogger::handleTraceStart( - const std::unordered_map& metadata) { - traceOf_ << fmt::format(R"JSON( -{{ - "schemaVersion": {},)JSON", kSchemaVersion); - -#ifdef HAS_CUPTI - traceOf_ << fmt::format(R"JSON( - "deviceProperties": [{} - ],)JSON", devicePropertiesJson()); -#endif - - metadataToJSON(metadata); - traceOf_ << R"JSON( - "traceEvents": [)JSON"; -} - -static std::string defaultFileName() { - return fmt::format(kDefaultLogFileFmt, processId()); -} - -void ChromeTraceLogger::openTraceFile() { - traceOf_.open(fileName_, std::ofstream::out | std::ofstream::trunc); - if (!traceOf_) { - PLOG(ERROR) << "Failed to open '" << fileName_ << "'"; - } else { - LOG(INFO) << "Tracing to " << fileName_; - } -} - -ChromeTraceLogger::ChromeTraceLogger(const std::string& traceFileName) 
{ - fileName_ = traceFileName.empty() ? defaultFileName() : traceFileName; - traceOf_.clear(std::ios_base::badbit); - openTraceFile(); -} - -static int64_t us(int64_t timestamp) { - // It's important that this conversion is the same here and in the CPU trace. - // No rounding! - return timestamp / 1000; -} - -void ChromeTraceLogger::handleDeviceInfo( - const DeviceInfo& info, - uint64_t time) { - if (!traceOf_) { - return; - } - - // M is for metadata - // process_name needs a pid and a name arg - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "name": "process_name", "ph": "M", "ts": {}, "pid": {}, "tid": 0, - "args": {{ - "name": "{}" - }} - }}, - {{ - "name": "process_labels", "ph": "M", "ts": {}, "pid": {}, "tid": 0, - "args": {{ - "labels": "{}" - }} - }}, - {{ - "name": "process_sort_index", "ph": "M", "ts": {}, "pid": {}, "tid": 0, - "args": {{ - "sort_index": {} - }} - }},)JSON", - time, info.id, - info.name, - time, info.id, - info.label, - time, info.id, - info.id < 8 ? 
info.id + 0x1000000ll : info.id); - // clang-format on -} - -void ChromeTraceLogger::handleResourceInfo( - const ResourceInfo& info, - int64_t time) { - if (!traceOf_) { - return; - } - - // M is for metadata - // thread_name needs a pid and a name arg - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "name": "thread_name", "ph": "M", "ts": {}, "pid": {}, "tid": {}, - "args": {{ - "name": "{}" - }} - }}, - {{ - "name": "thread_sort_index", "ph": "M", "ts": {}, "pid": {}, "tid": {}, - "args": {{ - "sort_index": {} - }} - }},)JSON", - time, info.deviceId, info.id, - info.name, - time, info.deviceId, info.id, - info.sortIndex); - // clang-format on -} - -void ChromeTraceLogger::handleOverheadInfo( - const OverheadInfo& info, - int64_t time) { - if (!traceOf_) { - return; - } - - // TOOD: reserve pid = -1 for overhead but we need to rethink how to scale this for - // other metadata - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "name": "process_name", "ph": "M", "ts": {}, "pid": -1, "tid": 0, - "args": {{ - "name": "{}" - }} - }}, - {{ - "name": "process_sort_index", "ph": "M", "ts": {}, "pid": -1, "tid": 0, - "args": {{ - "sort_index": {} - }} - }},)JSON", - time, - info.name, - time, - 0x100000All); - // clang-format on -} - -void ChromeTraceLogger::handleTraceSpan(const TraceSpan& span) { - if (!traceOf_) { - return; - } - - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "X", "cat": "Trace", "ts": {}, "dur": {}, - "pid": "Spans", "tid": "{}", - "name": "{}{} ({})", - "args": {{ - "Op count": {} - }} - }}, - {{ - "name": "process_sort_index", "ph": "M", "ts": {}, - "pid": "Spans", "tid": 0, - "args": {{ - "sort_index": {} - }} - }},)JSON", - span.startTime, span.endTime - span.startTime, - span.name, - span.prefix, span.name, span.iteration, - span.opCount, - span.startTime, - // Large sort index to appear at the bottom - 0x20000000ll); - // clang-format on - - addIterationMarker(span); -} - -void 
ChromeTraceLogger::addIterationMarker(const TraceSpan& span) { - if (!traceOf_) { - return; - } - - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "name": "Iteration Start: {}", "ph": "i", "s": "g", - "pid": "Traces", "tid": "Trace {}", "ts": {} - }},)JSON", - span.name, - span.name, span.startTime); - // clang-format on -} - -static std::string traceActivityJson(const ITraceActivity& activity) { - // clang-format off - int64_t ts = activity.timestamp(); - int64_t duration = activity.duration(); - if (activity.type() == ActivityType::GPU_USER_ANNOTATION) { - // The GPU user annotations start at the same time as the - // first associated GPU activity. Since they appear later - // in the trace file, this causes a visualization issue in Chrome. - // Make it start one us earlier. - ts--; - duration++; // Still need it to end at the orginal point - } - return fmt::format(R"JSON( - "name": "{}", "pid": {}, "tid": {}, - "ts": {}, "dur": {})JSON", - activity.name(), activity.deviceId(), activity.resourceId(), - ts, duration); - // clang-format on -} - -void ChromeTraceLogger::handleGenericInstantEvent( - const libkineto::ITraceActivity& op) { - if (!traceOf_) { - return; - } - - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "i", "s": "t", "name": "{}", - "pid": {}, "tid": {}, - "ts": {}, - "args": {{ - {} - }} - }},)JSON", - op.name(), op.deviceId(), op.resourceId(), - op.timestamp(), op.metadataJson()); -} - -void ChromeTraceLogger::handleActivity( - const libkineto::ITraceActivity& op) { - if (!traceOf_) { - return; - } - - if (op.type() == ActivityType::CPU_INSTANT_EVENT) { - handleGenericInstantEvent(op); - return; - } - - const std::string op_metadata = op.metadataJson(); - std::string separator = ""; - if (op_metadata.find_first_not_of(" \t\n") != std::string::npos) { - separator = ",\n "; - } - std::string span = ""; - if (op.traceSpan()) { - span = fmt::format(R"JSON( - "Trace name": "{}", "Trace iteration": {},)JSON", - op.traceSpan()->name, - 
op.traceSpan()->iteration); - } - - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "X", "cat": "{}", {}, - "args": {{{} - "External id": {}{}{} - }} - }},)JSON", - toString(op.type()), traceActivityJson(op), - // args - span, - op.correlationId(), separator, op_metadata); - // clang-format on - if (op.flowId() > 0) { - handleGenericLink(op); - } -} - -void ChromeTraceLogger::handleGenericActivity( - const libkineto::GenericTraceActivity& op) { - handleActivity(op); -} - -void ChromeTraceLogger::handleGenericLink(const ITraceActivity& act) { - static struct { - int type; - char longName[24]; - char shortName[16]; - } flow_names[] = { - {kLinkFwdBwd, "forward_backward", "fwd_bwd"}, - {kLinkAsyncCpuGpu, "async_cpu_to_gpu", "async_gpu"} - }; - for (auto& flow : flow_names) { - if (act.flowType() == flow.type) { - // Link the activities via flow ID in source and destination. - // The source node must return true from flowStart() - // and the destination node false. - if (act.flowStart()) { - handleLink(kFlowStart, act, act.flowId(), flow.longName, flow.shortName); - } else { - handleLink(kFlowEnd, act, act.flowId(), flow.longName, flow.shortName); - } - return; - } - } - LOG(ERROR) << "Unknown flow type: " << act.flowType(); -} - -void ChromeTraceLogger::handleLink( - char type, - const ITraceActivity& e, - int64_t id, - const std::string& cat, - const std::string& name) { - if (!traceOf_) { - return; - } - - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "{}", "id": {}, "pid": {}, "tid": {}, "ts": {}, - "cat": "{}", "name": "{}", "bp": "e" - }},)JSON", - type, id, e.deviceId(), e.resourceId(), e.timestamp(), cat, name); - // clang-format on -} - -#ifdef HAS_CUPTI -// GPU side kernel activity -void ChromeTraceLogger::handleGpuActivity( - const GpuActivity& activity) { - if (!traceOf_) { - return; - } - const CUpti_ActivityKernel4* kernel = &activity.raw(); - constexpr int threads_per_warp = 32; - float blocks_per_sm = -1.0; - 
float warps_per_sm = -1.0; - int sm_count = smCount(kernel->deviceId); - if (sm_count) { - blocks_per_sm = - (kernel->gridX * kernel->gridY * kernel->gridZ) / (float) sm_count; - warps_per_sm = - blocks_per_sm * (kernel->blockX * kernel->blockY * kernel->blockZ) - / threads_per_warp; - } - - // Calculate occupancy - float occupancy = KINETO_NAMESPACE::kernelOccupancy( - kernel->deviceId, - kernel->registersPerThread, - kernel->staticSharedMemory, - kernel->dynamicSharedMemory, - kernel->blockX, - kernel->blockY, - kernel->blockZ, - blocks_per_sm); - - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "X", "cat": "Kernel", {}, - "args": {{ - "queued": {}, "device": {}, "context": {}, - "stream": {}, "correlation": {}, - "registers per thread": {}, - "shared memory": {}, - "blocks per SM": {}, - "warps per SM": {}, - "grid": [{}, {}, {}], - "block": [{}, {}, {}], - "est. achieved occupancy %": {} - }} - }},)JSON", - traceActivityJson(activity), - // args - us(kernel->queued), kernel->deviceId, kernel->contextId, - kernel->streamId, kernel->correlationId, - kernel->registersPerThread, - kernel->staticSharedMemory + kernel->dynamicSharedMemory, - blocks_per_sm, - warps_per_sm, - kernel->gridX, kernel->gridY, kernel->gridZ, - kernel->blockX, kernel->blockY, kernel->blockZ, - (int) (0.5 + occupancy * 100.0)); - // clang-format on - - auto to_id = activity.correlationId(); - handleLink(kFlowEnd, activity, to_id, "async_cpu_to_gpu", "async_gpu"); -} - -static std::string bandwidth(uint64_t bytes, uint64_t duration) { - return duration == 0 ? 
"\"N/A\"" : fmt::format("{}", bytes * 1.0 / duration); -} - -// GPU side memcpy activity -void ChromeTraceLogger::handleGpuActivity( - const GpuActivity& activity) { - if (!traceOf_) { - return; - } - const CUpti_ActivityMemcpy& memcpy = activity.raw(); - VLOG(2) << memcpy.correlationId << ": MEMCPY"; - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "X", "cat": "Memcpy", {}, - "args": {{ - "device": {}, "context": {}, - "stream": {}, "correlation": {}, - "bytes": {}, "memory bandwidth (GB/s)": {} - }} - }},)JSON", - traceActivityJson(activity), - // args - memcpy.deviceId, memcpy.contextId, - memcpy.streamId, memcpy.correlationId, - memcpy.bytes, bandwidth(memcpy.bytes, memcpy.end - memcpy.start)); - // clang-format on - - int64_t to_id = activity.correlationId(); - handleLink(kFlowEnd, activity, to_id, "async_cpu_to_gpu", "async_gpu"); -} - -// GPU side memcpy activity -void ChromeTraceLogger::handleGpuActivity( - const GpuActivity& activity) { - if (!traceOf_) { - return; - } - const CUpti_ActivityMemcpy2& memcpy = activity.raw(); - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "ph": "X", "cat": "Memcpy", {}, - "args": {{ - "fromDevice": {}, "inDevice": {}, "toDevice": {}, - "fromContext": {}, "inContext": {}, "toContext": {}, - "stream": {}, "correlation": {}, - "bytes": {}, "memory bandwidth (GB/s)": {} - }} - }},)JSON", - traceActivityJson(activity), - // args - memcpy.srcDeviceId, memcpy.deviceId, memcpy.dstDeviceId, - memcpy.srcContextId, memcpy.contextId, memcpy.dstContextId, - memcpy.streamId, memcpy.correlationId, - memcpy.bytes, bandwidth(memcpy.bytes, memcpy.end - memcpy.start)); - // clang-format on - - int64_t to_id = activity.correlationId(); - handleLink(kFlowEnd, activity, to_id, "async_cpu_to_gpu", "async_gpu"); -} - -void ChromeTraceLogger::handleGpuActivity( - const GpuActivity& activity) { - if (!traceOf_) { - return; - } - const CUpti_ActivityMemset& memset = activity.raw(); - // clang-format off - traceOf_ 
<< fmt::format(R"JSON( - {{ - "ph": "X", "cat": "Memset", {}, - "args": {{ - "device": {}, "context": {}, - "stream": {}, "correlation": {}, - "bytes": {}, "memory bandwidth (GB/s)": {} - }} - }},)JSON", - traceActivityJson(activity), - // args - memset.deviceId, memset.contextId, - memset.streamId, memset.correlationId, - memset.bytes, bandwidth(memset.bytes, memset.end - memset.start)); - // clang-format on - - int64_t to_id = activity.correlationId(); - handleLink(kFlowEnd, activity, to_id, "async_cpu_to_gpu", "async_gpu"); -} -#endif // HAS_CUPTI - -void ChromeTraceLogger::finalizeTrace( - const Config& /*unused*/, - std::unique_ptr /*unused*/, - int64_t endTime, - std::unordered_map>& metadata) { - if (!traceOf_) { - LOG(ERROR) << "Failed to write to log file!"; - return; - } - LOG(INFO) << "Chrome Trace written to " << fileName_; - // clang-format off - traceOf_ << fmt::format(R"JSON( - {{ - "name": "Record Window End", "ph": "i", "s": "g", - "pid": "", "tid": "", "ts": {} - }} - ],)JSON", - endTime); - -#if !USE_GOOGLE_LOG - std::unordered_map PreparedMetadata; - for (const auto& kv : metadata) { - // Skip empty log buckets, ex. skip ERROR if its empty. - if (!kv.second.empty()) { - std::string value = "["; - // Ex. Each metadata from logger is a list of strings, expressed in JSON as - // "ERROR": ["Error 1", "Error 2"], - // "WARNING": ["Warning 1", "Warning 2", "Warning 3"], - // ... - int mdv_count = kv.second.size(); - for (const auto& v : kv.second) { - value.append("\"" + v + "\""); - if(mdv_count > 1) { - value.append(","); - mdv_count--; - } - } - value.append("]"); - PreparedMetadata[kv.first] = sanitizeStrForJSON(value); - } - } - metadataToJSON(PreparedMetadata); -#endif // !USE_GOOGLE_LOG - - // Putting this here because the last entry MUST not end with a comma. 
- traceOf_ << fmt::format(R"JSON( - "traceName": "{}" -}})JSON", sanitizeStrForJSON(fileName_)); - // clang-format on - - traceOf_.close(); -} - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/output_json.h b/plugins/tensorboard-plugins/libkineto/src/output_json.h deleted file mode 100644 index 5a8a81e4a9fdeef09b0e9ace59b964d5ab99b7ad..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_json.h +++ /dev/null @@ -1,91 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#pragma once - -#include -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include -#endif -#include "GenericTraceActivity.h" -#include "output_base.h" - -namespace KINETO_NAMESPACE { - // Previous declaration of TraceSpan is struct. Must match the same here. - struct TraceSpan; -} - -namespace KINETO_NAMESPACE { - -class Config; - -class ChromeTraceLogger : public libkineto::ActivityLogger { - public: - explicit ChromeTraceLogger(const std::string& traceFileName); - - // Note: the caller of these functions should handle concurrency - // i.e., we these functions are not thread-safe - void handleDeviceInfo( - const DeviceInfo& info, - uint64_t time) override; - - void handleOverheadInfo(const OverheadInfo& info, int64_t time) override; - - void handleResourceInfo(const ResourceInfo& info, int64_t time) override; - - void handleTraceSpan(const TraceSpan& span) override; - - void handleActivity(const ITraceActivity& activity) override; - void handleGenericActivity(const GenericTraceActivity& activity) override; - -#ifdef HAS_CUPTI - void handleGpuActivity(const GpuActivity& activity) override; - void handleGpuActivity(const GpuActivity& activity) override; - void handleGpuActivity(const GpuActivity& activity) override; - void handleGpuActivity(const GpuActivity& activity) override; -#endif // HAS_CUPTI - - void handleTraceStart( - const std::unordered_map& metadata) override; - 
- void finalizeTrace( - const Config& config, - std::unique_ptr buffers, - int64_t endTime, - std::unordered_map>& metadata) override; - - std::string traceFileName() const { - return fileName_; - } - - private: - - // Create a flow event (arrow) - void handleLink( - char type, - const ITraceActivity& e, - int64_t id, - const std::string& cat, - const std::string& name); - - void addIterationMarker(const TraceSpan& span); - - void openTraceFile(); - - void handleGenericInstantEvent(const ITraceActivity& op); - - void handleGenericLink(const ITraceActivity& activity); - - void metadataToJSON(const std::unordered_map& metadata); - - std::string& sanitizeStrForJSON(std::string& value); - - std::string fileName_; - std::ofstream traceOf_; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/src/output_membuf.h b/plugins/tensorboard-plugins/libkineto/src/output_membuf.h deleted file mode 100644 index ef6aadeb65728e0e05e454f98b32ccecca229cf4..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/src/output_membuf.h +++ /dev/null @@ -1,130 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include -#include - -#ifdef HAS_CUPTI -#include -#endif - -#include "Config.h" -#include "GenericTraceActivity.h" -#ifdef HAS_CUPTI -#include "CuptiActivity.h" -#include "CuptiActivity.tpp" -#endif // HAS_CUPTI -#include "output_base.h" - -namespace KINETO_NAMESPACE { - -class Config; - -class MemoryTraceLogger : public ActivityLogger { - public: - MemoryTraceLogger(const Config& config) : config_(config.clone()) { - activities_.reserve(100000); - } - - // Note: the caller of these functions should handle concurrency - // i.e., these functions are not thread-safe - void handleDeviceInfo( - const DeviceInfo& info, - uint64_t time) override { - deviceInfoList_.emplace_back(info, time); - } - - void handleResourceInfo(const ResourceInfo& info, int64_t time) override { - resourceInfoList_.emplace_back(info, time); - } - - void handleOverheadInfo(const OverheadInfo& info, int64_t time) override {} - - void handleTraceSpan(const TraceSpan& span) override { - // Handled separately - } - - template - void addActivityWrapper(const T& act) { - wrappers_.push_back(std::make_unique(act)); - activities_.push_back(wrappers_.back().get()); - } - - // Just add the pointer to the list - ownership of the underlying - // objects must be transferred in ActivityBuffers via finalizeTrace - void handleActivity(const ITraceActivity& activity) override { - activities_.push_back(&activity); - } - void handleGenericActivity(const GenericTraceActivity& activity) override { - addActivityWrapper(activity); - } - -#ifdef HAS_CUPTI - void handleGpuActivity(const GpuActivity& activity) override { - addActivityWrapper(activity); - } - void handleGpuActivity(const GpuActivity& activity) override { - addActivityWrapper(activity); - } - void handleGpuActivity(const GpuActivity& activity) override { - addActivityWrapper(activity); - } - void handleGpuActivity(const GpuActivity& activity) override { - addActivityWrapper(activity); - } -#endif // HAS_CUPTI - - void 
handleTraceStart( - const std::unordered_map& metadata) override { - metadata_ = metadata; - } - - void finalizeTrace( - const Config& config, - std::unique_ptr buffers, - int64_t endTime, - std::unordered_map>& metadata) override { - buffers_ = std::move(buffers); - endTime_ = endTime; - } - - const std::vector* traceActivities() { - return &activities_; - } - - void log(ActivityLogger& logger) { - logger.handleTraceStart(metadata_); - for (auto& activity : activities_) { - activity->log(logger); - } - for (auto& p : deviceInfoList_) { - logger.handleDeviceInfo(p.first, p.second); - } - for (auto& p : resourceInfoList_) { - logger.handleResourceInfo(p.first, p.second); - } - for (auto& cpu_trace_buffer : buffers_->cpu) { - logger.handleTraceSpan(cpu_trace_buffer->span); - } - // Hold on to the buffers - logger.finalizeTrace(*config_, nullptr, endTime_, loggerMetadata_); - } - - private: - - std::unique_ptr config_; - // Optimization: Remove unique_ptr by keeping separate vector per type - std::vector activities_; - std::vector> wrappers_; - std::vector> deviceInfoList_; - std::vector> resourceInfoList_; - std::unique_ptr buffers_; - std::unordered_map metadata_; - std::unordered_map> loggerMetadata_; - int64_t endTime_{0}; -}; - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/test/CMakeLists.txt b/plugins/tensorboard-plugins/libkineto/test/CMakeLists.txt deleted file mode 100644 index ca54460b36cd4ade93918c8512f1309b48552e65..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CMakeLists.txt +++ /dev/null @@ -1,3 +0,0 @@ -cmake_minimum_required(VERSION 3.5 FATAL_ERROR) - -# TODO diff --git a/plugins/tensorboard-plugins/libkineto/test/ConfigTest.cpp b/plugins/tensorboard-plugins/libkineto/test/ConfigTest.cpp deleted file mode 100644 index 16bc86e751cefdbee1d48aeb79fc849b7d151a18..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/ConfigTest.cpp +++ 
/dev/null @@ -1,315 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "include/Config.h" - -#include -#include -#include -#include - -using namespace std::chrono; -using namespace KINETO_NAMESPACE; - -TEST(ParseTest, Whitespace) { - Config cfg; - // Check that various types of whitespace is ignored - EXPECT_TRUE(cfg.parse("")); - EXPECT_TRUE(cfg.parse(" ")); - EXPECT_TRUE(cfg.parse("\t")); - EXPECT_TRUE(cfg.parse("\n")); - EXPECT_TRUE(cfg.parse(" ")); - EXPECT_TRUE(cfg.parse("\t \n \t\t\n\n")); - // Only the above characters are supported - EXPECT_FALSE(cfg.parse("\r\n")); -} - -TEST(ParseTest, Comment) { - Config cfg; - // Anything following a '#' should be ignored, up to a newline - EXPECT_TRUE(cfg.parse("# comment")); - EXPECT_TRUE(cfg.parse(" # ~!@#$")); - EXPECT_TRUE(cfg.parse("\t#abc")); - EXPECT_TRUE(cfg.parse("###\n##")); - EXPECT_TRUE(cfg.parse("EVENTS=util ##ok")); - EXPECT_TRUE(cfg.parse("EVENTS=util ## EVENTS=instruction")); - // Whatever appears before the comment must be valid format - EXPECT_FALSE(cfg.parse("util ## not ok")); - EXPECT_FALSE(cfg.parse("## ok \n blah # not OK")); - // Check that a comment does not affect config parsing - EXPECT_TRUE(cfg.parse("SAMPLE_PERIOD_MSECS = 1 # Sample every millisecond")); - EXPECT_EQ(cfg.samplePeriod(), milliseconds(1)); -} - -TEST(ParseTest, Format) { - Config cfg; - // The basic format is just "name = value". - // Where both value and name can be almost anything. - // Leading and trailing whitespace should be removed - // for both 'name' and 'value', but internal whitespace is not. 
- EXPECT_FALSE(cfg.parse("events")); - EXPECT_TRUE(cfg.parse("events=")); - EXPECT_FALSE(cfg.parse("=events=")); - EXPECT_TRUE(cfg.parse("events=1,2,3")); - // Only one setting per line - EXPECT_FALSE(cfg.parse("events = 1,2,3 ; metrics = 4,5,6")); - // Names are case sensitive - EXPECT_TRUE(cfg.parse("EVENTS = 1,2,3 \n metrics = 4,5,6")); - EXPECT_EQ(cfg.eventNames(), std::set({"1", "2", "3"})); - EXPECT_EQ(cfg.metricNames().size(), 0); - // Leading and trailing whitespace removed for event and metric names, - // but not internal. - EXPECT_TRUE( - cfg.parse("EVENTS = 1, 2, 3 \n \tMETRICS\t = \t4,\t5\t,\ts i x ")); - EXPECT_EQ(cfg.eventNames(), std::set({"1", "2", "3"})); - EXPECT_EQ(cfg.metricNames(), std::set({"4", "5", "s i x"})); -} - -TEST(ParseTest, DefaultActivityTypes) { - Config cfg; - cfg.validate(std::chrono::system_clock::now()); - auto all_activities = activityTypes(); - // TODO: introduce optional activities - EXPECT_EQ(cfg.selectedActivityTypes(), - std::set(all_activities.begin(), all_activities.end() - 1)); -} - -TEST(ParseTest, ActivityTypes) { - Config cfg; - EXPECT_FALSE(cfg.parse("ACTIVITY_TYPES")); - EXPECT_TRUE(cfg.parse("ACTIVITY_TYPES=")); - EXPECT_FALSE(cfg.parse("=ACTIVITY_TYPES=")); - - EXPECT_EQ(cfg.selectedActivityTypes(), - std::set({ActivityType::CPU_OP, - ActivityType::CPU_INSTANT_EVENT, - ActivityType::PYTHON_FUNCTION, - ActivityType::USER_ANNOTATION, - ActivityType::GPU_USER_ANNOTATION, - ActivityType::GPU_MEMCPY, - ActivityType::GPU_MEMSET, - ActivityType::CONCURRENT_KERNEL, - ActivityType::EXTERNAL_CORRELATION, - ActivityType::GLOW_RUNTIME, - ActivityType::CUDA_RUNTIME, - ActivityType::CUDA_PROFILER_RANGE})); - - Config cfg2; - EXPECT_TRUE(cfg2.parse("ACTIVITY_TYPES=gpu_memcpy,gpu_MeMsEt,kernel")); - EXPECT_EQ(cfg2.selectedActivityTypes(), - std::set({ActivityType::GPU_MEMCPY, - ActivityType::GPU_MEMSET, - ActivityType::CONCURRENT_KERNEL})); - - EXPECT_TRUE(cfg2.parse("ACTIVITY_TYPES = cuda_Runtime,")); - 
EXPECT_EQ(cfg2.selectedActivityTypes(), - std::set({ActivityType::CUDA_RUNTIME})); - - // Should throw an exception because incorrect activity name - EXPECT_FALSE(cfg2.parse("ACTIVITY_TYPES = memcopy,cuda_runtime")); - - EXPECT_TRUE(cfg2.parse("ACTIVITY_TYPES = cpu_op")); - EXPECT_EQ(cfg2.selectedActivityTypes(), - std::set({ActivityType::CPU_OP})); -} - -TEST(ParseTest, SamplePeriod) { - Config cfg; - EXPECT_TRUE(cfg.parse("SAMPLE_PERIOD_MSECS=10")); - EXPECT_EQ(cfg.samplePeriod(), milliseconds(10)); - EXPECT_TRUE(cfg.parse("SAMPLE_PERIOD_MSECS=0")); - cfg.validate(std::chrono::system_clock::now()); - // 0 should be adjustd up to 1 - EXPECT_EQ(cfg.samplePeriod(), milliseconds(1)); - // Negative and non-int values should fail - EXPECT_FALSE(cfg.parse("SAMPLE_PERIOD_MSECS=-10")); - EXPECT_FALSE(cfg.parse("SAMPLE_PERIOD_MSECS=1.5")); - EXPECT_FALSE(cfg.parse("SAMPLE_PERIOD_MSECS=")); - EXPECT_FALSE(cfg.parse("SAMPLE_PERIOD_MSECS=string")); - EXPECT_EQ(cfg.samplePeriod(), milliseconds(1)); -} - -TEST(ParseTest, MultiplexPeriod) { - Config cfg; - auto now = std::chrono::system_clock::now(); - - EXPECT_TRUE(cfg.parse("SAMPLE_PERIOD_MSECS=100\nMULTIPLEX_PERIOD_MSECS=100")); - EXPECT_EQ(cfg.multiplexPeriod(), milliseconds(100)); - EXPECT_TRUE(cfg.parse("MULTIPLEX_PERIOD_MSECS = 0")); - cfg.validate(now); - // Adjusted to match sample period - EXPECT_EQ(cfg.multiplexPeriod(), milliseconds(100)); - EXPECT_TRUE(cfg.parse("MULTIPLEX_PERIOD_MSECS \t= \t 750 \n")); - cfg.validate(now); - // Adjusted to match multiple of sample period - EXPECT_EQ(cfg.multiplexPeriod(), milliseconds(800)); - EXPECT_FALSE(cfg.parse("MULTIPLEX_PERIOD_MSECS=-10")); - EXPECT_FALSE(cfg.parse("MULTIPLEX_PERIOD_MSECS=1.5")); - EXPECT_FALSE(cfg.parse("MULTIPLEX_PERIOD_MSECS=")); - EXPECT_FALSE(cfg.parse("MULTIPLEX_PERIOD_MSECS=string")); - // Previous value not affected - EXPECT_EQ(cfg.multiplexPeriod(), milliseconds(800)); -} - -TEST(ParseTest, ReportPeriod) { - Config cfg; - 
EXPECT_TRUE(cfg.parse("REPORT_PERIOD_SECS=1")); - EXPECT_EQ(cfg.reportPeriod(), seconds(1)); - // Whitespace - EXPECT_TRUE(cfg.parse("REPORT_PERIOD_SECS = \t100")); - EXPECT_EQ(cfg.reportPeriod(), seconds(100)); - // Invalid types - EXPECT_FALSE(cfg.parse("REPORT_PERIOD_SECS=-1")); - EXPECT_EQ(cfg.reportPeriod(), seconds(100)); -} - -TEST(ParseTest, SamplesPerReport) { - Config cfg; - auto now = std::chrono::system_clock::now(); - - EXPECT_TRUE(cfg.parse(R"( - SAMPLE_PERIOD_MSECS = 1000 - REPORT_PERIOD_SECS = 1 - SAMPLES_PER_REPORT = 10)")); - cfg.validate(now); - // Adjusted down to one sample per report - EXPECT_EQ(cfg.samplesPerReport(), 1); - EXPECT_TRUE(cfg.parse(R"( - SAMPLE_PERIOD_MSECS = 1000 - REPORT_PERIOD_SECS = 10 - SAMPLES_PER_REPORT = 10)")); - cfg.validate(now); - // No adjustment needed - EXPECT_EQ(cfg.samplesPerReport(), 10); - EXPECT_TRUE(cfg.parse(R"( - SAMPLE_PERIOD_MSECS = 1000 - REPORT_PERIOD_SECS = 2 - SAMPLES_PER_REPORT = 10)")); - cfg.validate(now); - // Adjusted to 2 samples per report - EXPECT_EQ(cfg.samplesPerReport(), 2); - EXPECT_TRUE(cfg.parse(R"( - SAMPLE_PERIOD_MSECS = 200 - REPORT_PERIOD_SECS = 2 - SAMPLES_PER_REPORT = 10)")); - cfg.validate(now); - // No adjustment needed - EXPECT_EQ(cfg.samplesPerReport(), 10); - EXPECT_TRUE(cfg.parse("SAMPLES_PER_REPORT=0")); - cfg.validate(now); - // Adjusted up to 1 - EXPECT_EQ(cfg.samplesPerReport(), 1); - // Invalid value types - EXPECT_FALSE(cfg.parse("SAMPLES_PER_REPORT=-10")); - EXPECT_FALSE(cfg.parse("SAMPLES_PER_REPORT=1.5")); - EXPECT_EQ(cfg.samplesPerReport(), 1); - - EXPECT_TRUE(cfg.parse(R"( - SAMPLE_PERIOD_MSECS=1000 - MULTIPLEX_PERIOD_MSECS=500 # Must be a multiple of sample period - REPORT_PERIOD_SECS=0 # Must be non-zero multiple of multiplex period - SAMPLES_PER_REPORT=5 # Max report period / multiplex period)")); - cfg.validate(now); - // Multiple adjustments - EXPECT_EQ(cfg.samplePeriod(), milliseconds(1000)); - EXPECT_EQ(cfg.multiplexPeriod(), milliseconds(1000)); - 
EXPECT_EQ(cfg.reportPeriod(), seconds(1)); - EXPECT_EQ(cfg.samplesPerReport(), 1); -} - -TEST(ParseTest, EnableSigUsr2) { - Config cfg; - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=yes")); - EXPECT_TRUE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=no")); - EXPECT_FALSE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=YES")); - EXPECT_TRUE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=NO")); - EXPECT_FALSE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=Y")); - EXPECT_TRUE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=N")); - EXPECT_FALSE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=T")); - EXPECT_TRUE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=F")); - EXPECT_FALSE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=true")); - EXPECT_TRUE(cfg.sigUsr2Enabled()); - EXPECT_TRUE(cfg.parse("ENABLE_SIGUSR2=false")); - EXPECT_FALSE(cfg.sigUsr2Enabled()); - EXPECT_FALSE(cfg.parse("ENABLE_SIGUSR2= ")); - EXPECT_FALSE(cfg.parse("ENABLE_SIGUSR2=2")); - EXPECT_FALSE(cfg.parse("ENABLE_SIGUSR2=-1")); - EXPECT_FALSE(cfg.parse("ENABLE_SIGUSR2=yep")); -} - -TEST(ParseTest, DeviceMask) { - Config cfg; - // Single device - EXPECT_TRUE(cfg.parse("EVENTS_ENABLED_DEVICES = 0")); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(0)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(1)); - - // Two devices, internal whitespace - EXPECT_TRUE(cfg.parse("EVENTS_ENABLED_DEVICES = 1, 2")); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(0)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(1)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(2)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(3)); - - // Three devices, check that previous devices are ignored - EXPECT_TRUE(cfg.parse("EVENTS_ENABLED_DEVICES = 0, 2,4")); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(0)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(1)); - 
EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(2)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(3)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(4)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(5)); - - // Repeated numbers have no effect - EXPECT_TRUE(cfg.parse("EVENTS_ENABLED_DEVICES = 0,1,1,1,2,3,2,1,3,7,7,3")); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(0)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(1)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(2)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(3)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(4)); - EXPECT_FALSE(cfg.eventProfilerEnabledForDevice(6)); - EXPECT_TRUE(cfg.eventProfilerEnabledForDevice(7)); - - // 8 is larger than the max allowed - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = 3,8")); - - // 300 cannot be held in an uint8_t - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = 300")); - - // Various illegal cases - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = 0,1,two,three")); - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = 0,1,,2")); - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = -1")); - EXPECT_FALSE(cfg.parse("EVENTS_ENABLED_DEVICES = 1.0")); -} - -TEST(ParseTest, RequestTime) { - Config cfg; - system_clock::time_point now = system_clock::now(); - int64_t tgood_ms = - duration_cast(now.time_since_epoch()).count(); - EXPECT_TRUE(cfg.parse(fmt::format("REQUEST_TIMESTAMP = {}", tgood_ms))); - - tgood_ms = duration_cast((now - seconds(5)).time_since_epoch()) - .count(); - EXPECT_TRUE(cfg.parse(fmt::format("REQUEST_TIMESTAMP = {}", tgood_ms))); - - int64_t tbad_ms = - duration_cast((now - seconds(20)).time_since_epoch()) - .count(); - EXPECT_FALSE(cfg.parse(fmt::format("REQUEST_TIMESTAMP = {}", tbad_ms))); - - EXPECT_FALSE(cfg.parse("REQUEST_TIMESTAMP = 0")); - EXPECT_FALSE(cfg.parse("REQUEST_TIMESTAMP = -1")); - - tbad_ms = duration_cast((now + seconds(10)).time_since_epoch()) - .count(); - EXPECT_FALSE(cfg.parse(fmt::format("REQUEST_TIMESTAMP = {}", 
tbad_ms))); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiActivityProfilerTest.cpp b/plugins/tensorboard-plugins/libkineto/test/CuptiActivityProfilerTest.cpp deleted file mode 100644 index 6e67980ee31a3386580974033201b7acae75d22b..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiActivityProfilerTest.cpp +++ /dev/null @@ -1,629 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include -#include -#include -#include -#include - -#ifdef __linux__ -#include -#include -#include -#endif - -#include "include/libkineto.h" -#include "include/Config.h" -#include "src/CuptiActivityProfiler.h" -#include "src/ActivityTrace.h" -#include "src/CuptiActivityApi.h" -#include "src/output_base.h" -#include "src/output_json.h" -#include "src/output_membuf.h" - -#include "src/Logger.h" -#include "test/MockActivitySubProfiler.h" - -using namespace std::chrono; -using namespace KINETO_NAMESPACE; - -#define CUDA_LAUNCH_KERNEL CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000 -#define CUDA_MEMCPY CUPTI_RUNTIME_TRACE_CBID_cudaMemcpy_v3020 - -namespace { -const TraceSpan& defaultTraceSpan() { - static TraceSpan span(0, 0, "Unknown", ""); - return span; -} -} - -// Provides ability to easily create a few test CPU-side ops -struct MockCpuActivityBuffer : public CpuTraceBuffer { - MockCpuActivityBuffer(int64_t startTime, int64_t endTime) { - span = TraceSpan(startTime, endTime,"Test trace"); - gpuOpCount = 0; - } - - void addOp(std::string name, int64_t startTime, int64_t endTime, int64_t correlation) { - GenericTraceActivity op(span, ActivityType::CPU_OP, name); - op.startTime = startTime; - op.endTime = endTime; - op.resource = systemThreadId(); - op.id = correlation; - activities.push_back(std::move(op)); - span.opCount++; - } -}; - -// Provides ability to easily create a few test CUPTI ops -struct MockCuptiActivityBuffer { - void addCorrelationActivity(int64_t correlation, 
CUpti_ExternalCorrelationKind externalKind, int64_t externalId) { - auto& act = *(CUpti_ActivityExternalCorrelation*) malloc(sizeof(CUpti_ActivityExternalCorrelation)); - act.kind = CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION; - act.externalId = externalId; - act.externalKind = externalKind; - act.correlationId = correlation; - activities.push_back(reinterpret_cast(&act)); - } - - void addRuntimeActivity( - CUpti_runtime_api_trace_cbid_enum cbid, - int64_t start_us, int64_t end_us, int64_t correlation) { - auto& act = createActivity( - start_us, end_us, correlation); - act.kind = CUPTI_ACTIVITY_KIND_RUNTIME; - act.cbid = cbid; - act.threadId = threadId(); - activities.push_back(reinterpret_cast(&act)); - } - - void addKernelActivity( - int64_t start_us, int64_t end_us, int64_t correlation) { - auto& act = createActivity( - start_us, end_us, correlation); - act.kind = CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL; - act.deviceId = 0; - act.streamId = 1; - act.name = "kernel"; - act.gridX = act.gridY = act.gridZ = 1; - act.blockX = act.blockY = act.blockZ = 1; - activities.push_back(reinterpret_cast(&act)); - } - - void addMemcpyActivity( - int64_t start_us, int64_t end_us, int64_t correlation) { - auto& act = createActivity( - start_us, end_us, correlation); - act.kind = CUPTI_ACTIVITY_KIND_MEMCPY; - act.deviceId = 0; - act.streamId = 2; - act.copyKind = CUPTI_ACTIVITY_MEMCPY_KIND_HTOD; - act.srcKind = CUPTI_ACTIVITY_MEMORY_KIND_PINNED; - act.dstKind = CUPTI_ACTIVITY_MEMORY_KIND_DEVICE; - activities.push_back(reinterpret_cast(&act)); - } - - template - T& createActivity( - int64_t start_us, int64_t end_us, int64_t correlation) { - T& act = *static_cast(malloc(sizeof(T))); - bzero(&act, sizeof(act)); - act.start = start_us * 1000; - act.end = end_us * 1000; - act.correlationId = correlation; - return act; - } - - ~MockCuptiActivityBuffer() { - for (CUpti_Activity* act : activities) { - free(act); - } - } - - std::vector activities; -}; - -// Mock parts of the CuptiActivityApi 
-class MockCuptiActivities : public CuptiActivityApi { - public: - virtual int smCount() override { - return 10; - } - - virtual const std::pair processActivities( - CuptiActivityBufferMap&, /*unused*/ - std::function handler) override { - for (CUpti_Activity* act : activityBuffer->activities) { - handler(act); - } - return {activityBuffer->activities.size(), 100}; - } - - virtual std::unique_ptr - activityBuffers() override { - auto map = std::make_unique(); - auto buf = std::make_unique(100); - uint8_t* addr = buf->data(); - (*map)[addr] = std::move(buf); - return map; - } - - void bufferRequestedOverride(uint8_t** buffer, size_t* size, size_t* maxNumRecords) { - this->bufferRequested(buffer, size, maxNumRecords); - } - - std::unique_ptr activityBuffer; -}; - - -// Common setup / teardown and helper functions -class CuptiActivityProfilerTest : public ::testing::Test { - protected: - void SetUp() override { - profiler_ = std::make_unique( - cuptiActivities_, /*cpu only*/ false); - cfg_ = std::make_unique(); - cfg_->validate(std::chrono::system_clock::now()); - loggerFactory.addProtocol("file", [](const std::string& url) { - return std::unique_ptr(new ChromeTraceLogger(url)); - }); - } - - std::unique_ptr cfg_; - MockCuptiActivities cuptiActivities_; - std::unique_ptr profiler_; - ActivityLoggerFactory loggerFactory; -}; - -void checkTracefile(const char* filename) { -#ifdef __linux__ - // Check that the expected file was written and that it has some content - int fd = open(filename, O_RDONLY); - if (!fd) { - perror(filename); - } - EXPECT_TRUE(fd); - // Should expect at least 100 bytes - struct stat buf{}; - fstat(fd, &buf); - EXPECT_GT(buf.st_size, 100); - close(fd); -#endif -} - -TEST(CuptiActivityProfiler, AsyncTrace) { - std::vector log_modules( - {"CuptiActivityProfiler.cpp", "output_json.cpp"}); - SET_LOG_VERBOSITY_LEVEL(1, log_modules); - - MockCuptiActivities activities; - CuptiActivityProfiler profiler(activities, /*cpu only*/ true); - - char filename[] = 
"/tmp/libkineto_testXXXXXX.json"; - mkstemps(filename, 5); - - Config cfg; - - int iter = 0; - int warmup = 5; - auto now = system_clock::now(); - auto startTime = now + seconds(10); - - bool success = cfg.parse(fmt::format(R"CFG( - ACTIVITIES_WARMUP_PERIOD_SECS = {} - ACTIVITIES_DURATION_SECS = 1 - ACTIVITIES_LOG_FILE = {} - PROFILE_START_TIME = {} - )CFG", warmup, filename, duration_cast(startTime.time_since_epoch()).count())); - - EXPECT_TRUE(success); - EXPECT_FALSE(profiler.isActive()); - - auto logger = std::make_unique(cfg.activitiesLogFile()); - - // Usually configuration is done when now is startTime - warmup to kick off warmup - // but start right away in the test - profiler.configure(cfg, now); - profiler.setLogger(logger.get()); - - EXPECT_TRUE(profiler.isActive()); - - // fast forward in time and we have reached the startTime - now = startTime; - - // Run the profiler - // Warmup - // performRunLoopStep is usually called by the controller loop and takes - // the current time and the controller's next wakeup time. - profiler.performRunLoopStep( - /* Current time */ now, /* Next wakeup time */ now); - - auto next = now + milliseconds(1000); - - // performRunLoopStep can also be called by an application thread to update iteration count - // since this config does not use iteration this should have no effect on the state - while (++iter < 20) { - profiler.performRunLoopStep(now, now, iter); - } - - // Runloop should now be in collect state, so start workload - // Perform another runloop step, passing in the end profile time as current. - // This should terminate collection - profiler.performRunLoopStep( - /* Current time */ next, /* Next wakeup time */ next); - // One step needed for each of the Process and Finalize phases - // Doesn't really matter what times we pass in here. 
- - EXPECT_TRUE(profiler.isActive()); - - auto nextnext = next + milliseconds(1000); - - while (++iter < 40) { - profiler.performRunLoopStep(next, next, iter); - } - - EXPECT_TRUE(profiler.isActive()); - - profiler.performRunLoopStep(nextnext,nextnext); - profiler.performRunLoopStep(nextnext,nextnext); - - // Assert that tracing has completed - EXPECT_FALSE(profiler.isActive()); - - checkTracefile(filename); -} - -TEST(CuptiActivityProfiler, AsyncTraceUsingIter) { - std::vector log_modules( - {"CuptiActivityProfiler.cpp", "output_json.cpp"}); - SET_LOG_VERBOSITY_LEVEL(1, log_modules); - - auto runIterTest = [&]( - int start_iter, int warmup_iters, int trace_iters) { - - LOG(INFO ) << "Async Trace Test: start_iteration = " << start_iter - << " warmup iterations = " << warmup_iters - << " trace iterations = " << trace_iters; - - MockCuptiActivities activities; - CuptiActivityProfiler profiler(activities, /*cpu only*/ true); - - char filename[] = "/tmp/libkineto_testXXXXXX.json"; - mkstemps(filename, 5); - - Config cfg; - - int iter = 0; - auto now = system_clock::now(); - - bool success = cfg.parse(fmt::format(R"CFG( - PROFILE_START_ITERATION = {} - ACTIVITIES_WARMUP_ITERATIONS={} - ACTIVITIES_ITERATIONS={} - ACTIVITIES_DURATION_SECS = 1 - ACTIVITIES_LOG_FILE = {} - )CFG", start_iter, warmup_iters, trace_iters, filename)); - - EXPECT_TRUE(success); - EXPECT_FALSE(profiler.isActive()); - - auto logger = std::make_unique(cfg.activitiesLogFile()); - - // Usually configuration is done when now is startIter - warmup iter to kick off warmup - // but start right away in the test - while (iter < (start_iter - warmup_iters)) { - profiler.performRunLoopStep(now, now, iter++); - } - - profiler.configure(cfg, now); - profiler.setLogger(logger.get()); - - EXPECT_TRUE(profiler.isActive()); - - // fast forward in time, mimicking what will happen in reality - now += seconds(10); - auto next = now + milliseconds(1000); - - // this call to runloop step should not be effecting the 
state - profiler.performRunLoopStep(now, next); - EXPECT_TRUE(profiler.isActive()); - - // start trace collection - while (iter < start_iter) { - profiler.performRunLoopStep(now, next, iter++); - } - - // Runloop should now be in collect state, so start workload - - while (iter < (start_iter + trace_iters)) { - profiler.performRunLoopStep(now, next, iter++); - } - - // One step is required for each of the Process and Finalize phases - // Doesn't really matter what times we pass in here. - if (iter >= (start_iter + trace_iters)) { - profiler.performRunLoopStep(now, next, iter++); - } - EXPECT_TRUE(profiler.isActive()); - - auto nextnext = next + milliseconds(1000); - - profiler.performRunLoopStep(nextnext, nextnext); - profiler.performRunLoopStep(nextnext, nextnext); - - // Assert that tracing has completed - EXPECT_FALSE(profiler.isActive()); - - checkTracefile(filename); - }; - - // start iter = 50, warmup iters = 5, trace iters = 10 - runIterTest(50, 5, 10); - // should be able to start at 0 iteration - runIterTest(0, 0, 2); - runIterTest(0, 5, 5); -} - -TEST_F(CuptiActivityProfilerTest, SyncTrace) { - using ::testing::Return; - using ::testing::ByMove; - - // Verbose logging is useful for debugging - std::vector log_modules( - {"CuptiActivityProfiler.cpp"}); - SET_LOG_VERBOSITY_LEVEL(2, log_modules); - - // Start and stop profiling - CuptiActivityProfiler profiler(cuptiActivities_, /*cpu only*/ false); - int64_t start_time_us = 100; - int64_t duration_us = 300; - auto start_time = time_point(microseconds(start_time_us)); - profiler.configure(*cfg_, start_time); - profiler.startTrace(start_time); - profiler.stopTrace(start_time + microseconds(duration_us)); - - profiler.recordThreadInfo(); - - // Log some cpu ops - auto cpuOps = std::make_unique( - start_time_us, start_time_us + duration_us); - cpuOps->addOp("op1", 120, 150, 1); - cpuOps->addOp("op2", 130, 140, 2); - cpuOps->addOp("op3", 200, 250, 3); - profiler.transferCpuTrace(std::move(cpuOps)); - - // And 
some GPU ops - auto gpuOps = std::make_unique(); - gpuOps->addRuntimeActivity(CUDA_LAUNCH_KERNEL, 133, 138, 1); - gpuOps->addRuntimeActivity(CUDA_MEMCPY, 210, 220, 2); - gpuOps->addRuntimeActivity(CUDA_LAUNCH_KERNEL, 230, 245, 3); - gpuOps->addKernelActivity(150, 170, 1); - gpuOps->addMemcpyActivity(240, 250, 2); - gpuOps->addKernelActivity(260, 320, 3); - cuptiActivities_.activityBuffer = std::move(gpuOps); - - // Have the profiler process them - auto logger = std::make_unique(*cfg_); - profiler.processTrace(*logger); - - // Profiler can be reset at this point - logger owns the activities - profiler_->reset(); - - // Wrapper that allows iterating over the activities - ActivityTrace trace(std::move(logger), loggerFactory); - EXPECT_EQ(trace.activities()->size(), 9); - std::map activityCounts; - std::map resourceIds; - for (auto& activity : *trace.activities()) { - activityCounts[activity->name()]++; - resourceIds[activity->resourceId()]++; - } - for (const auto& p : activityCounts) { - LOG(INFO) << p.first << ": " << p.second; - } - EXPECT_EQ(activityCounts["op1"], 1); - EXPECT_EQ(activityCounts["op2"], 1); - EXPECT_EQ(activityCounts["op3"], 1); - EXPECT_EQ(activityCounts["cudaLaunchKernel"], 2); - EXPECT_EQ(activityCounts["cudaMemcpy"], 1); - EXPECT_EQ(activityCounts["kernel"], 2); - EXPECT_EQ(activityCounts["Memcpy HtoD (Pinned -> Device)"], 1); - - auto sysTid = systemThreadId(); - // Ops and runtime events are on thread sysTid - EXPECT_EQ(resourceIds[sysTid], 6); - // Kernels are on stream 1, memcpy on stream 2 - EXPECT_EQ(resourceIds[1], 2); - EXPECT_EQ(resourceIds[2], 1); - -#ifdef __linux__ - char filename[] = "/tmp/libkineto_testXXXXXX.json"; - mkstemps(filename, 5); - trace.save(filename); - // Check that the expected file was written and that it has some content - int fd = open(filename, O_RDONLY); - if (!fd) { - perror(filename); - } - EXPECT_TRUE(fd); - // Should expect at least 100 bytes - struct stat buf{}; - fstat(fd, &buf); - EXPECT_GT(buf.st_size, 
100); -#endif -} - -TEST_F(CuptiActivityProfilerTest, GpuUserAnnotationTest) { - // Verbose logging is useful for debugging - std::vector log_modules( - {"CuptiActivityProfiler.cpp"}); - SET_LOG_VERBOSITY_LEVEL(2, log_modules); - - // Start and stop profiling - CuptiActivityProfiler profiler(cuptiActivities_, /*cpu only*/ false); - int64_t start_time_us = 100; - int64_t duration_us = 300; - auto start_time = time_point(microseconds(start_time_us)); - profiler.configure(*cfg_, start_time); - profiler.startTrace(start_time); - profiler.stopTrace(start_time + microseconds(duration_us)); - - int64_t kernelLaunchTime = 120; - profiler.recordThreadInfo(); - - // set up CPU event - auto cpuOps = std::make_unique( - start_time_us, start_time_us + duration_us); - cpuOps->addOp("annotation", kernelLaunchTime, kernelLaunchTime + 10, 1); - profiler.transferCpuTrace(std::move(cpuOps)); - - // set up a couple of GPU events and correlate with above CPU event. - // CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1 is used for user annotations. - auto gpuOps = std::make_unique(); - gpuOps->addCorrelationActivity(1, CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1, 1); - gpuOps->addKernelActivity(kernelLaunchTime + 5, kernelLaunchTime + 10, 1); - gpuOps->addCorrelationActivity(1, CUPTI_EXTERNAL_CORRELATION_KIND_CUSTOM1, 1); - gpuOps->addKernelActivity(kernelLaunchTime + 15, kernelLaunchTime + 25, 1); - cuptiActivities_.activityBuffer = std::move(gpuOps); - - // process trace - auto logger = std::make_unique(*cfg_); - profiler.processTrace(*logger); - - ActivityTrace trace(std::move(logger), loggerFactory); - std::map counts; - for (auto& activity : *trace.activities()) { - counts[activity->name()]++; - } - - // We should now have an additional annotation activity created - // on the GPU timeline. 
- EXPECT_EQ(counts["annotation"], 2); - EXPECT_EQ(counts["kernel"], 2); - - auto& annotation = trace.activities()->at(0); - auto& kernel1 = trace.activities()->at(1); - auto& kernel2 = trace.activities()->at(2); - auto& gpu_annotation = trace.activities()->at(3); - EXPECT_EQ(gpu_annotation->type(), ActivityType::GPU_USER_ANNOTATION); - EXPECT_EQ(gpu_annotation->timestamp(), kernel1->timestamp()); - EXPECT_EQ( - gpu_annotation->duration(), - kernel2->timestamp() + kernel2->duration() - kernel1->timestamp()); - EXPECT_EQ(gpu_annotation->deviceId(), kernel1->deviceId()); - EXPECT_EQ(gpu_annotation->resourceId(), kernel1->resourceId()); - EXPECT_EQ(gpu_annotation->correlationId(), annotation->correlationId()); - EXPECT_EQ(gpu_annotation->name(), annotation->name()); -} - -TEST_F(CuptiActivityProfilerTest, SubActivityProfilers) { - using ::testing::Return; - using ::testing::ByMove; - - // Verbose logging is useful for debugging - std::vector log_modules( - {"CuptiActivityProfiler.cpp"}); - SET_LOG_VERBOSITY_LEVEL(2, log_modules); - - // Setup example events to test - GenericTraceActivity ev{defaultTraceSpan(), ActivityType::GLOW_RUNTIME, ""}; - ev.device = 1; - ev.resource = 0; - - int64_t start_time_us = 100; - int64_t duration_us = 1000; - auto start_time = time_point(microseconds(start_time_us)); - - std::vector test_activities{3, ev}; - test_activities[0].startTime = start_time_us; - test_activities[0].endTime = start_time_us + 5000; - test_activities[0].activityName = "SubGraph A execution"; - test_activities[1].startTime = start_time_us; - test_activities[1].endTime = start_time_us + 2000; - test_activities[1].activityName = "Operator foo"; - test_activities[2].startTime = start_time_us + 2500; - test_activities[2].endTime = start_time_us + 2900; - test_activities[2].activityName = "Operator bar"; - - auto mock_activity_profiler = - std::make_unique(test_activities); - - MockCuptiActivities activities; - CuptiActivityProfiler profiler(activities, /*cpu only*/ 
true); - profiler.addChildActivityProfiler( - std::move(mock_activity_profiler)); - - profiler.configure(*cfg_, start_time); - profiler.startTrace(start_time); - EXPECT_TRUE(profiler.isActive()); - - profiler.stopTrace(start_time + microseconds(duration_us)); - EXPECT_TRUE(profiler.isActive()); - - char filename[] = "/tmp/libkineto_testXXXXXX.json"; - mkstemps(filename, 5); - LOG(INFO) << "Logging to tmp file " << filename; - - // process trace - auto logger = std::make_unique(*cfg_); - profiler.processTrace(*logger); - profiler.setLogger(logger.get()); - - ActivityTrace trace(std::move(logger), loggerFactory); - trace.save(filename); - const auto& traced_activites = trace.activities(); - - // Test we have all the events - EXPECT_EQ(traced_activites->size(), test_activities.size()); - - // Check that the expected file was written and that it has some content - int fd = open(filename, O_RDONLY); - if (!fd) { - perror(filename); - } - EXPECT_TRUE(fd); - - // Should expect at least 100 bytes - struct stat buf{}; - fstat(fd, &buf); - EXPECT_GT(buf.st_size, 100); -} - -TEST_F(CuptiActivityProfilerTest, BufferSizeLimitTestWarmup) { - CuptiActivityProfiler profiler(cuptiActivities_, /*cpu only*/ false); - - auto now = system_clock::now(); - auto startTime = now + seconds(10); - - int maxBufferSizeMB = 3; - - auto startTimeEpoch = std::to_string(duration_cast(startTime.time_since_epoch()).count()); - std::string maxBufferSizeMBStr = std::to_string(maxBufferSizeMB); - cfg_->handleOption("ACTIVITIES_MAX_GPU_BUFFER_SIZE_MB", maxBufferSizeMBStr); - cfg_->handleOption("PROFILE_START_TIME", startTimeEpoch); - - - EXPECT_FALSE(profiler.isActive()); - profiler.configure(*cfg_, now); - EXPECT_TRUE(profiler.isActive()); - - for (size_t i = 0; i < maxBufferSizeMB; i++) { - uint8_t* buf; - size_t gpuBufferSize; - size_t maxNumRecords; - cuptiActivities_.bufferRequestedOverride(&buf, &gpuBufferSize, &maxNumRecords); - } - - // fast forward to startTime and profiler is now running - now 
= startTime; - - profiler.performRunLoopStep(now, now); - - auto next = now + milliseconds(1000); - profiler.performRunLoopStep(next, next); - profiler.performRunLoopStep(next, next); - profiler.performRunLoopStep(next, next); - - EXPECT_FALSE(profiler.isActive()); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiCallbackApiTest.cpp b/plugins/tensorboard-plugins/libkineto/test/CuptiCallbackApiTest.cpp deleted file mode 100644 index 253b696da54d1919e9c0076c5691a11e35345686..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiCallbackApiTest.cpp +++ /dev/null @@ -1,239 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "src/Logger.h" -#include "src/CuptiCallbackApi.h" - -#include -#include -#include -#include - -using namespace std::chrono; -using namespace KINETO_NAMESPACE; -using namespace libkineto; - -const size_t some_data = 42; - -std::atomic simple_cb_calls = 0; - -void simple_cb( - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - const CUpti_CallbackData* cbInfo) { - - // simple arg check - EXPECT_EQ(domain, CUPTI_CB_DOMAIN_RUNTIME_API); - EXPECT_EQ(cbid, CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000); - EXPECT_EQ(*reinterpret_cast(cbInfo), some_data); - - simple_cb_calls++; -} - -void atomic_cb( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId /*cbid*/, - const CUpti_CallbackData* /*cbInfo)*/) { - // do some atomics in a loop - for (int i = 0; i < 1000; i++) { - // would have used release consistency but this is fine - simple_cb_calls++; - } -} - -void empty_cb( - CUpti_CallbackDomain /*domain*/, - CUpti_CallbackId /*cbid*/, - const CUpti_CallbackData* /*cbInfo*/) { -} - -TEST(CuptiCallbackApiTest, SimpleTest) { - auto& api = CuptiCallbackApi::singleton(); - - auto addSimpleCallback = [&]() -> bool { - bool ret = api.registerCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CuptiCallbackApi::CUDA_LAUNCH_KERNEL, - &simple_cb - ); - return ret; - }; - 
EXPECT_TRUE(addSimpleCallback()) << "Failed to add callback"; - - // duplicate add should be okay - EXPECT_TRUE(addSimpleCallback()) << "Failed to re-add callback"; - - simple_cb_calls = 0; - - // simulate callback - api.__callback_switchboard( - CUPTI_CB_DOMAIN_RUNTIME_API, - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000, - reinterpret_cast(&some_data)); - - EXPECT_EQ(simple_cb_calls, 1); - - bool ret = api.deleteCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CuptiCallbackApi::CUDA_LAUNCH_KERNEL, - &simple_cb - ); - - EXPECT_TRUE(ret) << "Failed to remove callback"; - - ret = api.deleteCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CuptiCallbackApi::CUDA_LAUNCH_KERNEL, - &atomic_cb - ); - - EXPECT_FALSE(ret) << "oops! deleted a callback that was never added"; -} - -TEST(CuptiCallbackApiTest, AllCallbacks) { - auto& api = CuptiCallbackApi::singleton(); - - auto testCallback = [&]( - CUpti_CallbackDomain domain, - CUpti_CallbackId cbid, - CuptiCallbackApi::CuptiCallBackID kineto_cbid) -> bool { - - bool ret = api.registerCallback(domain, kineto_cbid, atomic_cb); - EXPECT_TRUE(ret) << "Failed to add callback"; - - if (!ret) { - return false; - } - - simple_cb_calls = 0; - api.__callback_switchboard(domain, cbid, nullptr); - EXPECT_EQ(simple_cb_calls, 1000); - ret = simple_cb_calls == 1000; - - EXPECT_TRUE(api.deleteCallback(domain, kineto_cbid, atomic_cb)); - - return ret; - }; - - EXPECT_TRUE( - testCallback( - CUPTI_CB_DOMAIN_RESOURCE, - CUPTI_CBID_RESOURCE_CONTEXT_CREATED, - CuptiCallbackApi::RESOURCE_CONTEXT_CREATED)) - << "Failed to run callback for RESOURCE_CONTEXT_CREATED"; - - EXPECT_TRUE( - testCallback( - CUPTI_CB_DOMAIN_RESOURCE, - CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING, - CuptiCallbackApi::RESOURCE_CONTEXT_DESTROYED)) - << "Failed to run callback for RESOURCE_CONTEXT_DESTROYED"; - - EXPECT_TRUE( - testCallback( - CUPTI_CB_DOMAIN_RUNTIME_API, - CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000, - CuptiCallbackApi::CUDA_LAUNCH_KERNEL)) - << "Failed to run 
callback for CUDA_LAUNCH_KERNEL"; - -} - -TEST(CuptiCallbackApiTest, ContentionTest) { - auto& api = CuptiCallbackApi::singleton(); - const CUpti_CallbackDomain domain = CUPTI_CB_DOMAIN_RUNTIME_API; - const CUpti_CallbackId cbid = CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000; - const CuptiCallbackApi::CuptiCallBackID kineto_cbid = - CuptiCallbackApi::CUDA_LAUNCH_KERNEL; - - bool ret = api.registerCallback(domain, kineto_cbid, empty_cb); - EXPECT_TRUE(ret) << "Failed to add callback"; - - const int iters = 10000; - const int num_readers = 8; - - simple_cb_calls = 0; - - // simulate callbacks being executed on multiple threads in parallel - // during this interval add a new atomic_callback. - // this test ensured mutual exclusion is working fine - auto read_fn = [&](int tid){ - auto start_ts = high_resolution_clock::now(); - for (int i = 0; i < iters; i++) { - api.__callback_switchboard(domain, cbid, nullptr); - } - auto runtime_ms = duration_cast( - high_resolution_clock::now() - start_ts); - LOG(INFO) << "th " << tid << " done in " << runtime_ms.count() << " ms"; - }; - - - std::vector read_ths; - for (int i = 0; i< num_readers; i++) { - read_ths.emplace_back(read_fn, i); - } - - ret = api.registerCallback(domain, kineto_cbid, atomic_cb); - EXPECT_TRUE(ret) << "Failed to add callback"; - - for (auto& t : read_ths) { - t.join(); - } - - //EXPECT_GT(simple_cb_calls, 0) - // << "Atomic callback should have been called at least once."; - - api.deleteCallback(domain, kineto_cbid, empty_cb); - api.deleteCallback(domain, kineto_cbid, atomic_cb); -} - -TEST(CuptiCallbackApiTest, Bechmark) { - - constexpr int iters = 1000; - // atomic bench a number of times to get a baseline - - const CUpti_CallbackDomain domain = CUPTI_CB_DOMAIN_RUNTIME_API; - const CUpti_CallbackId cbid = CUPTI_RUNTIME_TRACE_CBID_cudaLaunchKernel_v7000; - const CuptiCallbackApi::CuptiCallBackID kineto_cbid = - CuptiCallbackApi::CUDA_LAUNCH_KERNEL; - - LOG(INFO) << "Iteration count = " << iters; - - 
const bool use_empty = true; - auto cbfn = use_empty ? &empty_cb : &atomic_cb; - - // warmup - for (int i = 0; i < 50; i++) { - (*cbfn)(domain, cbid, nullptr); - } - - auto start_ts = high_resolution_clock::now(); - for (int i = 0; i < iters; i++) { - (*cbfn)(domain, cbid, nullptr); - } - auto delta_baseline_ns = duration_cast<nanoseconds>( - high_resolution_clock::now() - start_ts); - LOG(INFO) << "Baseline runtime = " << delta_baseline_ns.count() << " ns"; - - - auto& api = CuptiCallbackApi::singleton(); - bool ret = api.registerCallback(domain, kineto_cbid, cbfn); - EXPECT_TRUE(ret) << "Failed to add callback"; - - // warmup - for (int i = 0; i < 50; i++) { - api.__callback_switchboard(domain, cbid, nullptr); - } - - start_ts = high_resolution_clock::now(); - for (int i = 0; i < iters; i++) { - api.__callback_switchboard(domain, cbid, nullptr); - } - - auto delta_callback_ns = duration_cast<nanoseconds>( - high_resolution_clock::now() - start_ts); - LOG(INFO) << "Callback runtime = " << delta_callback_ns.count() << " ns"; - - LOG(INFO) << "Callback runtime per iteration = " << - (delta_callback_ns.count() - delta_baseline_ns.count()) / (double) iters - << " ns"; - -} diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiProfilerApiTest.cu b/plugins/tensorboard-plugins/libkineto/test/CuptiProfilerApiTest.cu deleted file mode 100644 index 54ad51b0a1fc9a6a54585d1cad4674943c874b98..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiProfilerApiTest.cu +++ /dev/null @@ -1,353 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include -#include -#include - -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "src/Logger.h" -#include "src/CuptiRangeProfilerApi.h" - -#define DRIVER_API_CALL(apiFuncCall) \ - do { \ - CUresult _status = apiFuncCall; \ - if (_status != CUDA_SUCCESS) { \ - LOG(ERROR) << "Failed invoking CUDA driver function " \ - << #apiFuncCall << " status = " \ - << _status; \ - exit(-1); \ - } \ - } while (0) - -#define EXPECT(expr)\ - if (!(expr)) {\ - }; - -using namespace KINETO_NAMESPACE; - -static int numRanges = 1; - -using Type = double; - -// Device code -__global__ void VecAdd(const Type* A, const Type* B, Type* C, int N) { - int i = blockDim.x * blockIdx.x + threadIdx.x; - if (i < N) { - C[i] = A[i] + B[i]; - } -} - -// Device code -__global__ void VecSub(const Type* A, const Type* B, Type* C, int N) { - int i = blockDim.x * blockIdx.x + threadIdx.x; - if (i < N) { - C[i] = A[i] - B[i]; - } -} - -static void initVec(Type* vec, int n) { - for (int i = 0; i < n; i++) { - vec[i] = i; - } -} - -static void cleanUp( - Type* h_A, - Type* h_B, - Type* h_C, - Type* h_D, - Type* d_A, - Type* d_B, - Type* d_C, - Type* d_D) { - if (d_A) - cudaFree(d_A); - if (d_B) - cudaFree(d_B); - if (d_C) - cudaFree(d_C); - if (d_D) - cudaFree(d_D); - - // Free host memory - if (h_A) - free(h_A); - if (h_B) - free(h_B); - if (h_C) - free(h_C); - if (h_D) - free(h_D); -} - -/* Benchmark application used to test profiler measurements - * This simply runs two kernels vector Add and Vector Subtract - */ - -void VectorAddSubtract() { - int N = 50000; - size_t size = N * sizeof(Type); - int threadsPerBlock = 0; - int blocksPerGrid = 0; - Type *h_A, *h_B, *h_C, *h_D; - Type *d_A, *d_B, *d_C, *d_D; - int i; - Type sum, diff; - - // Allocate input vectors h_A and h_B in host memory - h_A = (Type*)malloc(size); - h_B = (Type*)malloc(size); - h_C = (Type*)malloc(size); - h_D = (Type*)malloc(size); - - // Initialize input vectors - initVec(h_A, 
N); - initVec(h_B, N); - memset(h_C, 0, size); - memset(h_D, 0, size); - - // Allocate vectors in device memory - cudaMalloc((void**)&d_A, size); - cudaMalloc((void**)&d_B, size); - cudaMalloc((void**)&d_C, size); - cudaMalloc((void**)&d_D, size); - - // Copy vectors from host memory to device memory - cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice); - cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice); - - // Invoke kernel - threadsPerBlock = 256; - blocksPerGrid = (N + threadsPerBlock - 1) / threadsPerBlock; - LOG(INFO) << fmt::format( - "Launching kernel: blocks {}, thread/block {}", - blocksPerGrid, - threadsPerBlock); - - VecAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, N); - - VecSub<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_D, N); - - // Copy result from device memory to host memory - // h_C contains the result in host memory - cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost); - cudaMemcpy(h_D, d_D, size, cudaMemcpyDeviceToHost); - - // Verify result - for (i = 0; i < N; ++i) { - sum = h_A[i] + h_B[i]; - diff = h_A[i] - h_B[i]; - if (h_C[i] != sum || h_D[i] != diff) { - LOG(ERROR) << "Result verification failed"; - break; - } - } - - cleanUp(h_A, h_B, h_C, h_D, d_A, d_B, d_C, d_D); -} - -#if HAS_CUPTI_RANGE_PROFILER -bool runTestWithAutoRange( - int deviceNum, - const std::vector<std::string>& metricNames, - CUcontext cuContext, - bool async) { - - // create a CUPTI range based profiling profiler - // this configures the counter data as well - CuptiRBProfilerSession profiler( - metricNames, deviceNum, 2, 1, async ? 
nullptr : cuContext); - - CUpti_ProfilerRange profilerRange = CUPTI_AutoRange; - CUpti_ProfilerReplayMode profilerReplayMode = CUPTI_KernelReplay; - - if (async) { - profiler.asyncStartAndEnable(profilerRange, profilerReplayMode); - } else { - profiler.start(profilerRange, profilerReplayMode); - profiler.enable(); - } - - VectorAddSubtract(); - - if (!async) { - profiler.disable(); - // stop profiler - profiler.stop(); - } else { - profiler.asyncDisableAndStop(); - } - - auto result = profiler.evaluateMetrics(true); - - // check results - EXPECT_EQ(result.metricNames.size(), 3); - EXPECT_EQ(result.rangeVals.size(), 2); - - for (const auto& measurement : result.rangeVals) { - EXPECT_EQ(measurement.values.size(), 3); - - if (measurement.values.size() == 3) { - // smsp__warps_launched.avg - EXPECT_NE(measurement.values[0], 0); - // smsp__sass_thread_inst_executed_op_dadd_pred_on.sum - // each kernel has 50000 dadd ops - EXPECT_EQ(measurement.values[1], 50000); - // sm__inst_executed_pipe_tensor.sum - //EXPECT_EQ(measurement.values[2], 0); - } - } - return true; -} - -bool runTestWithUserRange( - int deviceNum, - const std::vector& metricNames, - CUcontext cuContext, - bool async = false) { - - // create a CUPTI range based profiling profiler - // this configures the counter data as well - CuptiRBProfilerSession profiler( - metricNames, deviceNum, numRanges, 1, async ? 
nullptr : cuContext); - - CUpti_ProfilerRange profilerRange = CUPTI_UserRange; - CUpti_ProfilerReplayMode profilerReplayMode = CUPTI_UserReplay; - - if (async) { - profiler.asyncStartAndEnable(profilerRange, profilerReplayMode); - { VectorAddSubtract(); } - profiler.disableAndStop(); - } else { - profiler.start(profilerRange, profilerReplayMode); - - /* User takes the resposiblity of replaying the kernel launches */ - bool replay = true; - do { - profiler.beginPass(); - { - profiler.enable(); - - std::string rangeName = "vecAddSub"; - profiler.pushRange(rangeName); - - { VectorAddSubtract(); } - - profiler.popRange(); - profiler.disable(); - } - LOG(INFO) << "Replay starting."; - replay = profiler.endPass(); - - } while (!replay); - - // stop profiler - profiler.stop(); - } - VectorAddSubtract(); - auto result = profiler.evaluateMetrics(true); - - // check results - EXPECT_EQ(result.metricNames.size(), 3); - EXPECT_EQ(result.rangeVals.size(), 1); - - if (result.rangeVals.size() > 0) { - const auto& measurement = result.rangeVals[0]; - EXPECT_EQ(measurement.values.size(), 3); - - if (measurement.values.size() == 3) { - // smsp__warps_launched.avg - EXPECT_NE(measurement.values[0], 0); - // smsp__sass_thread_inst_executed_op_dadd_pred_on.sum - // in async mode multiple passes are not supported yet - if (!async) { - EXPECT_EQ(measurement.values[1], 100000); - } - // sm__inst_executed_pipe_tensor.sum - //EXPECT_EQ(measurement.values[2], 0); - } - } - return true; -} -#endif // HAS_CUPTI_RANGE_PROFILER - -int main(int argc, char* argv[]) { - - CUdevice cuDevice; - - int deviceCount, deviceNum; - int computeCapabilityMajor = 0, computeCapabilityMinor = 0; - - printf("Usage: %s [device_num]\n", argv[0]); - - DRIVER_API_CALL(cuInit(0)); - DRIVER_API_CALL(cuDeviceGetCount(&deviceCount)); - - if (deviceCount == 0) { - LOG(ERROR) << "There is no device supporting CUDA."; - return -2; - } - - if (argc > 1) - deviceNum = atoi(argv[1]); - else - deviceNum = 0; - LOG(INFO) << 
"CUDA Device Number: " << deviceNum; - - DRIVER_API_CALL(cuDeviceGet(&cuDevice, deviceNum)); - DRIVER_API_CALL(cuDeviceGetAttribute( - &computeCapabilityMajor, - CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, - cuDevice)); - DRIVER_API_CALL(cuDeviceGetAttribute( - &computeCapabilityMinor, - CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, - cuDevice)); - - LOG(INFO) << "Compute Cabapbility = " - << fmt::format("{},{}",computeCapabilityMajor, computeCapabilityMinor); - - if (computeCapabilityMajor < 7) { - LOG(ERROR) << "CUPTI Profiler is not supported with compute capability < 7.0"; - return -2; - } - - CuptiRBProfilerSession::staticInit(); - - // metrics to profile - std::vector metricNames = { - "smsp__warps_launched.avg", - "smsp__sass_thread_inst_executed_op_dadd_pred_on.sum", - "sm__inst_executed_pipe_tensor.sum", - }; - - CUcontext cuContext; - DRIVER_API_CALL(cuCtxCreate(&cuContext, 0, cuDevice)); - - VectorAddSubtract(); - -#if HAS_CUPTI_RANGE_PROFILER - CuptiRBProfilerSession::staticInit(); - - if (!runTestWithUserRange(deviceNum, metricNames, cuContext, false)) { - LOG(ERROR) << "Failed to profiler test benchmark in user range"; - } else if (!runTestWithAutoRange(deviceNum, metricNames, cuContext, false)) { - LOG(ERROR) << "Failed to profiler test benchmark in auto range"; - } else if (!runTestWithUserRange(deviceNum, metricNames, cuContext, true)) { - LOG(ERROR) << "Failed to profiler test benchmark in user range async"; - } else if (!runTestWithAutoRange(deviceNum, metricNames, cuContext, true)) { - LOG(ERROR) << "Failed to profiler test benchmark in auto range async"; - } - - CuptiRBProfilerSession::deInitCupti(); -#else - LOG(WARNING) << "CuptiRBProfilerSession is not supported."; -#endif // HAS_CUPTI_RANGE_PROFILER - DRIVER_API_CALL(cuCtxDestroy(cuContext)); - - - return 0; -} diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerApiTest.cpp b/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerApiTest.cpp deleted file mode 
100644 index 28cad722c53ee5defaa7c24cbe0d6b2cbc840a30..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerApiTest.cpp +++ /dev/null @@ -1,113 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include -#include - -#include "include/libkineto.h" -#include "include/Config.h" -#include "src/CuptiRangeProfilerApi.h" - -#include "src/Logger.h" -#include "test/CuptiRangeProfilerTestUtil.h" - -using namespace KINETO_NAMESPACE; - -#if HAS_CUPTI_PROFILER - -TEST(CuptiRangeProfilerApiTest, contextTracking) { - std::vector log_modules( - {"CuptiRangeProfilerApi.cpp"}); - SET_LOG_VERBOSITY_LEVEL(1, log_modules); - - std::array data; - std::array contexts; - for (int i = 0; i < data.size(); i++) { - contexts[i] = reinterpret_cast(&data[i]); - } - - // simulate creating contexts, this calls the trackCudaContexts - // function that would otherwise be called via a callback - uint32_t dev = 0; - for (auto ctx : contexts) { - simulateCudaContextCreate(ctx, dev++); - } - - EXPECT_EQ( - CuptiRBProfilerSession::getActiveDevices(), - std::set({0, 1, 2})); - - simulateCudaContextDestroy(contexts[1], 1); - - EXPECT_EQ( - CuptiRBProfilerSession::getActiveDevices(), - std::set({0, 2})); - - simulateCudaContextDestroy(contexts[0], 0); - simulateCudaContextDestroy(contexts[2], 2); - - EXPECT_TRUE( - CuptiRBProfilerSession::getActiveDevices().empty()); -} - -TEST(CuptiRangeProfilerApiTest, asyncLaunchUserRange) { - std::vector log_modules( - {"CuptiRangeProfilerApi.cpp"}); - SET_LOG_VERBOSITY_LEVEL(1, log_modules); - - // this is bad but the pointer is never accessed - CUcontext ctx0 = reinterpret_cast(10); - simulateCudaContextCreate(ctx0, 0 /*device_id*/); - - auto session = std::make_unique(0, ctx0); - session->asyncStartAndEnable(CUPTI_UserRange, CUPTI_UserReplay); - - simulateKernelLaunch(ctx0, "hello"); - simulateKernelLaunch(ctx0, "foo"); - simulateKernelLaunch(ctx0, "bar"); - - 
session->asyncDisableAndStop(); - // stop happens after next kernel is run - simulateKernelLaunch(ctx0, "bar"); - simulateCudaContextDestroy(ctx0, 0 /*device_id*/); - - EXPECT_EQ(session->passes_ended, 1); - EXPECT_EQ(session->ranges_ended, 1); - EXPECT_TRUE(session->enabled); -} - -TEST(CuptiRangeProfilerApiTest, asyncLaunchAutoRange) { - std::vector log_modules( - {"CuptiRangeProfilerApi.cpp"}); - SET_LOG_VERBOSITY_LEVEL(1, log_modules); - - // this is bad but the pointer is never accessed - CUcontext ctx0 = reinterpret_cast(10); - CUcontext ctx1 = reinterpret_cast(11); - - simulateCudaContextCreate(ctx0, 0 /*device_id*/); - - auto session = std::make_unique(0, ctx0); - session->asyncStartAndEnable(CUPTI_AutoRange, CUPTI_KernelReplay); - - simulateKernelLaunch(ctx0, "hello"); - simulateKernelLaunch(ctx0, "foo"); - simulateKernelLaunch(ctx1, "kernel_on_different_device"); - simulateKernelLaunch(ctx0, "bar"); - - session->asyncDisableAndStop(); - // stop happens after next kernel is run - simulateKernelLaunch(ctx0, "bar"); - simulateCudaContextDestroy(ctx0, 0 /*device_id*/); - - EXPECT_EQ(session->passes_ended, 0); - EXPECT_EQ(session->ranges_ended, 0); - EXPECT_TRUE(session->enabled); - - EXPECT_EQ( - session->getKernelNames(), - std::vector({"hello", "foo", "bar"})) - << "Kernel names were not tracked"; -} - -#endif // HAS_CUPTI_PROFILER diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerConfigTest.cpp b/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerConfigTest.cpp deleted file mode 100644 index 3f568968238a0e376ab3bae621af00a162af0d25..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerConfigTest.cpp +++ /dev/null @@ -1,67 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include "include/Config.h" -#include "src/CuptiRangeProfilerConfig.h" - -#include -#include -#include -#include - -using namespace std::chrono; -using namespace KINETO_NAMESPACE; - -class CuptiRangeProfilerConfigTest : public ::testing::Test { - protected: - void SetUp() override { - CuptiRangeProfilerConfig::registerFactory(); - } -}; - -TEST_F(CuptiRangeProfilerConfigTest, ConfigureProfiler) { - Config cfg; - std::vector metrics = { - "kineto__cuda_core_flops", - "sm__inst_executed.sum", - "l1tex__data_bank_conflicts_pipe_lsu.sum", - }; - auto metricsConfigStr = - fmt::format("CUPTI_PROFILER_METRICS = {}", fmt::join(metrics, ",")); - - EXPECT_TRUE(cfg.parse(metricsConfigStr)); - EXPECT_TRUE(cfg.parse("CUPTI_PROFILER_ENABLE_PER_KERNEL = true")); - EXPECT_TRUE(cfg.parse("CUPTI_PROFILER_MAX_RANGES = 42")); - - const CuptiRangeProfilerConfig& cupti_cfg = - CuptiRangeProfilerConfig::get(cfg); - - EXPECT_EQ(cupti_cfg.activitiesCuptiMetrics(), metrics); - EXPECT_EQ(cupti_cfg.cuptiProfilerPerKernel(), true); - EXPECT_EQ(cupti_cfg.cuptiProfilerMaxRanges(), 42); - -} - -TEST_F(CuptiRangeProfilerConfigTest, RangesDefaults) { - Config cfg, cfg_auto; - - // do not set max ranges in config, check defaults are sane - EXPECT_TRUE(cfg.parse("CUPTI_PROFILER_METRICS = kineto__cuda_core_flops")); - EXPECT_TRUE(cfg.parse("CUPTI_PROFILER_ENABLE_PER_KERNEL = false")); - - cfg.setSignalDefaults(); - - EXPECT_TRUE(cfg_auto.parse("CUPTI_PROFILER_METRICS = kineto__cuda_core_flops")); - EXPECT_TRUE(cfg_auto.parse("CUPTI_PROFILER_ENABLE_PER_KERNEL = true")); - - cfg_auto.setClientDefaults(); - - int user_ranges, auto_ranges; - - user_ranges = CuptiRangeProfilerConfig::get(cfg).cuptiProfilerMaxRanges(); - auto_ranges = CuptiRangeProfilerConfig::get(cfg_auto).cuptiProfilerMaxRanges(); - - EXPECT_GE(user_ranges, 1) << " in user range mode default to at least 1 ranges"; - EXPECT_GE(auto_ranges, 1000) << " in auto range mode default to at least 1000 ranges"; - - EXPECT_GT(auto_ranges, 
user_ranges); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerTestUtil.h b/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerTestUtil.h deleted file mode 100644 index 861b65fd701bf69373df657ab2a22d9dba0b27df..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiRangeProfilerTestUtil.h +++ /dev/null @@ -1,96 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "CuptiRangeProfilerApi.h" - -namespace KINETO_NAMESPACE { - -#if HAS_CUPTI_PROFILER - -class MockCuptiRBProfilerSession : public CuptiRBProfilerSession { - public: - MockCuptiRBProfilerSession(int deviceId, CUcontext ctx) - : CuptiRBProfilerSession(deviceId, ctx) {} - - void beginPass() override { - LOG(INFO) << " Mock CUPTI begin pass"; - passes_started++; - } - - bool endPass() override { - passes_ended++; - return true; - } - - void flushCounterData() override {} - - void pushRange(const std::string& rangeName) override { - LOG(INFO) << " Mock CUPTI pushrange ( " << rangeName << " )"; - ranges_started++; - } - - void popRange() override { - LOG(INFO) << " Mock CUPTI poprange"; - ranges_ended++; - } - - void stop() override { - runChecks(); - } - - void enable() override { - enabled = true; - } - void disable() override {} - - CuptiProfilerResult evaluateMetrics(bool /*verbose*/) override { - return result; - } - -protected: - void startInternal( - CUpti_ProfilerRange profilerRange, - CUpti_ProfilerReplayMode profilerReplayMode) override { - curRange_ = profilerRange; - curReplay_ = profilerReplayMode; - } - -private: - void runChecks() { - EXPECT_EQ(passes_started, passes_ended); - EXPECT_EQ(ranges_started, ranges_ended); - } - - public: - int passes_started = 0; - int passes_ended = 0; - int ranges_started = 0; - int ranges_ended = 0; - bool enabled = false; - - CuptiProfilerResult 
result; - -}; - -inline void simulateCudaContextCreate(CUcontext context, uint32_t dev) { - testing::trackCudaCtx( - context, dev, CUPTI_CBID_RESOURCE_CONTEXT_CREATED); -} - -inline void simulateCudaContextDestroy(CUcontext context, uint32_t dev) { - testing::trackCudaCtx( - context, dev, CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING); -} - -inline void simulateKernelLaunch( - CUcontext context, const std::string& kernelName) { - testing::trackCudaKernelLaunch(context, kernelName.c_str()); -} - -#endif // HAS_CUPTI_PROFILER - -} // namespace KINETO_NAMESPACE diff --git a/plugins/tensorboard-plugins/libkineto/test/CuptiStringsTest.cpp b/plugins/tensorboard-plugins/libkineto/test/CuptiStringsTest.cpp deleted file mode 100644 index 405f9404a49a5bf8b7433930b0ad2fe898ea2d89..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/CuptiStringsTest.cpp +++ /dev/null @@ -1,29 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include - -#include "src/cupti_strings.h" - -using namespace KINETO_NAMESPACE; - -TEST(CuptiStringsTest, Valid) { - ASSERT_STREQ( - runtimeCbidName(CUPTI_RUNTIME_TRACE_CBID_INVALID), "INVALID"); - ASSERT_STREQ( - runtimeCbidName(CUPTI_RUNTIME_TRACE_CBID_cudaDriverGetVersion_v3020), - "cudaDriverGetVersion"); - ASSERT_STREQ(runtimeCbidName - (CUPTI_RUNTIME_TRACE_CBID_cudaDeviceSynchronize_v3020), - "cudaDeviceSynchronize"); - ASSERT_STREQ( - runtimeCbidName(CUPTI_RUNTIME_TRACE_CBID_cudaStreamSetAttribute_ptsz_v11000), - "cudaStreamSetAttribute_ptsz"); -} - -TEST(CuptiStringsTest, Invalid) { - ASSERT_STREQ(runtimeCbidName(-1), "INVALID"); - // We can't actually use CUPTI_RUNTIME_TRACE_CBID_SIZE here until we - // auto-generate the string table, since it may have more entries than - // the enum in the version used to compile. 
- ASSERT_STREQ(runtimeCbidName(1000), "INVALID"); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/EventProfilerTest.cpp b/plugins/tensorboard-plugins/libkineto/test/EventProfilerTest.cpp deleted file mode 100644 index cb36c826a7f32b2fe6732e73eae3b6a006b0cd3d..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/EventProfilerTest.cpp +++ /dev/null @@ -1,578 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include "src/EventProfiler.h" - -#include -#include -#include - -using namespace std::chrono; -using namespace KINETO_NAMESPACE; - -TEST(PercentileTest, Create) { - PercentileList pct = {{10, SampleValue(0)}, - {49, SampleValue(0)}, - {50, SampleValue(0)}, - {90, SampleValue(0)}}; - - percentiles({0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100}, pct); - EXPECT_EQ(pct[0].second.getInt(), 10); - EXPECT_EQ(pct[1].second.getInt(), 50); - EXPECT_EQ(pct[2].second.getInt(), 50); - EXPECT_EQ(pct[3].second.getInt(), 90); - - percentiles({80, 10, 20, 70, 60, 40, 90, 30, 50, 0, 100}, pct); - EXPECT_EQ(pct[0].second.getInt(), 10); - EXPECT_EQ(pct[1].second.getInt(), 50); - EXPECT_EQ(pct[2].second.getInt(), 50); - EXPECT_EQ(pct[3].second.getInt(), 90); - - percentiles({80}, pct); - EXPECT_EQ(pct[0].second.getInt(), 80); - EXPECT_EQ(pct[1].second.getInt(), 80); - EXPECT_EQ(pct[2].second.getInt(), 80); - EXPECT_EQ(pct[3].second.getInt(), 80); - - percentiles({80, 50}, pct); - EXPECT_EQ(pct[0].second.getInt(), 50); - EXPECT_EQ(pct[1].second.getInt(), 50); - EXPECT_EQ(pct[2].second.getInt(), 80); - EXPECT_EQ(pct[3].second.getInt(), 80); -} - -TEST(PercentileTest, Normalize) { - PercentileList pct = { - {10, SampleValue(10)}, {50, SampleValue(100.0)}, {90, SampleValue(2000)}}; - - normalize(pct, 2.5); - - EXPECT_EQ(pct[0].second.getInt(), 25); - EXPECT_EQ((int)pct[1].second.getDouble(), 250); - EXPECT_EQ(pct[2].second.getInt(), 5000); -} - -TEST(EventTest, SumSamples) { - Event ev; - ev.instanceCount = 
4; - auto t = system_clock::now(); - ev.addSample(t, {1, 2, 3, 4}); - ev.addSample(t, {10, 20, 30, 40}); - ev.addSample(t, {100, 200, 300, 400}); - - EXPECT_EQ(ev.sumInstance(0, {0, 0, 3}), 1); - EXPECT_EQ(ev.sumInstance(0, {0, 1, 3}), 10); - EXPECT_EQ(ev.sumInstance(0, {0, 2, 3}), 100); - - EXPECT_EQ(ev.sumInstance(0, {0, 0, 1}), 111); - - EXPECT_EQ(ev.sumInstance(3, {0, 0, 1}), 444); - - // Non-zero offset - EXPECT_EQ(ev.sumInstance(0, {1, 0, 2}), 10); - EXPECT_EQ(ev.sumInstance(0, {1, 1, 2}), 100); - EXPECT_EQ(ev.sumInstance(0, {1, 0, 1}), 110); - - ev.addSample(t, {1000, 2000, 3000, 4000}); - - EXPECT_EQ(ev.sumInstance(0, {1, 0, 3}), 10); - EXPECT_EQ(ev.sumInstance(0, {1, 1, 3}), 100); - EXPECT_EQ(ev.sumInstance(0, {2, 1, 2}), 1000); - EXPECT_EQ(ev.sumInstance(0, {2, 0, 1}), 1100); - - EXPECT_EQ(ev.sumAll({0, 0, 4}), 10); - EXPECT_EQ(ev.sumAll({1, 0, 3}), 100); - EXPECT_EQ(ev.sumAll({2, 1, 2}), 10000); - EXPECT_EQ(ev.sumAll({0, 1, 2}), 11000); - EXPECT_EQ(ev.sumAll({0, 0, 1}), 11110); -} - -TEST(EventTest, Percentiles) { - Event ev; - ev.instanceCount = 4; - auto t = system_clock::now(); - ev.addSample(t, {3, 2, 1, 4}); - ev.addSample(t, {30, 20, 10, 40}); - ev.addSample(t, {300, 200, 100, 400}); - - PercentileList pct = { - {10, SampleValue(0)}, {50, SampleValue(0)}, {90, SampleValue(0)}}; - - ev.percentiles(pct, {0, 0, 3}); - EXPECT_EQ(pct[0].second.getInt(), 1); - EXPECT_EQ(pct[1].second.getInt(), 3); - EXPECT_EQ(pct[2].second.getInt(), 4); - - ev.percentiles(pct, {0, 0, 1}); - EXPECT_EQ(pct[0].second.getInt(), 111); - EXPECT_EQ(pct[1].second.getInt(), 333); - EXPECT_EQ(pct[2].second.getInt(), 444); -} - -class MockCuptiMetrics : public CuptiMetricApi { - public: - MockCuptiMetrics() : CuptiMetricApi(0) {} - MOCK_METHOD1(idFromName, CUpti_MetricID(const std::string& name)); - MOCK_METHOD1( - events, - std::map(CUpti_MetricID metric_id)); - MOCK_METHOD1(valueKind, CUpti_MetricValueKind(CUpti_MetricID metric)); - MOCK_METHOD1( - evaluationMode, - 
CUpti_MetricEvaluationMode(CUpti_MetricID metric)); - MOCK_METHOD5( - calculate, - SampleValue( - CUpti_MetricID metric, - CUpti_MetricValueKind kind, - std::vector& events, - std::vector& values, - int64_t duration)); -}; - -TEST(MetricTest, Calculate) { - using ::testing::Return; - MockCuptiMetrics metrics; - - // The events used for the ipc metrics: instructions and cycles - // Pretend we have 2 SMs and 2 samples of each event - Event instr("instructions"); - instr.instanceCount = 2; - auto t = system_clock::now(); - instr.addSample(t, {100, 200}); - instr.addSample(t, {300, 400}); - - Event cycles("cycles"); - cycles.instanceCount = 2; - cycles.addSample(t, {1000, 1200}); - cycles.addSample(t, {1300, 1300}); - - // 2 & 3 are the event ids we specified in the metric - std::map events; - events[2] = std::move(instr); - events[3] = std::move(cycles); - - // Define an ipc metric - EXPECT_CALL(metrics, valueKind(1)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_VALUE_KIND_DOUBLE)); - Metric m( - "ipc", 1, {2, 3}, CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE, metrics); - - // Calculate metric for first sample - // Since evaluation mode is CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE, - // Cupti API will be called three times: once for each SM (2) and once - // to get the total across SMs. 
- std::vector ids = {2, 3}; - std::vector vals = {100, 1000}; - EXPECT_CALL( - metrics, calculate(1, CUPTI_METRIC_VALUE_KIND_DOUBLE, ids, vals, 1000)) - .Times(1) - .WillOnce(Return(SampleValue(0.1))); - vals = {200, 1200}; - EXPECT_CALL( - metrics, calculate(1, CUPTI_METRIC_VALUE_KIND_DOUBLE, ids, vals, 1000)) - .Times(1) - .WillOnce(Return(SampleValue(0.17))); - vals = {300, 2200}; - EXPECT_CALL( - metrics, calculate(1, CUPTI_METRIC_VALUE_KIND_DOUBLE, ids, vals, 1000)) - .Times(1) - .WillOnce(Return(SampleValue(0.14))); - auto v = m.calculate(events, nanoseconds(1000), {0, 0, 2}); - - EXPECT_EQ(v.perInstance.size(), 2); - EXPECT_EQ(v.perInstance[0].getDouble(), 0.1); - EXPECT_EQ(v.perInstance[1].getDouble(), 0.17); - EXPECT_EQ(v.total.getDouble(), 0.14); - - // Calculate second sample. - // Change evaluation mode to CUPTI_METRIC_EVALUATION_MODE_AGGREGATE. - // Now we should get only one call to the Cupti API for the total. - EXPECT_CALL(metrics, valueKind(1)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_VALUE_KIND_DOUBLE)); - Metric m2("ipc", 1, {2, 3}, CUPTI_METRIC_EVALUATION_MODE_AGGREGATE, metrics); - vals = {700, 2600}; - EXPECT_CALL( - metrics, calculate(1, CUPTI_METRIC_VALUE_KIND_DOUBLE, ids, vals, 1000)) - .Times(1) - .WillOnce(Return(SampleValue(0.27))); - v = m2.calculate(events, nanoseconds(1000), {0, 1, 2}); - - EXPECT_EQ(v.perInstance.size(), 1); - EXPECT_EQ(v.perInstance[0].getDouble(), 0.27); - EXPECT_EQ(v.total.getDouble(), 0.27); -} - -class MockCuptiEvents : public CuptiEventApi { - public: - MOCK_METHOD1( - createGroupSets, - CUpti_EventGroupSets*(std::vector& ids)); - MOCK_METHOD1(destroyGroupSets, void(CUpti_EventGroupSets* sets)); - MOCK_METHOD0(setContinuousMode, bool()); - MOCK_METHOD1(enablePerInstance, void(CUpti_EventGroup eventGroup)); - MOCK_METHOD1(instanceCount, uint32_t(CUpti_EventGroup eventGroup)); - MOCK_METHOD1(enableGroupSet, void(CUpti_EventGroupSet& set)); - MOCK_METHOD1(disableGroupSet, void(CUpti_EventGroupSet& set)); - 
MOCK_METHOD3( - readEvent, - void(CUpti_EventGroup g, CUpti_EventID id, std::vector& vals)); - MOCK_METHOD1(eventsInGroup, std::vector(CUpti_EventGroup g)); - MOCK_METHOD1(eventId, CUpti_EventID(const std::string& name)); -}; - -TEST(EventGroupSetTest, CollectSample) { - using ::testing::_; - using ::testing::Return; - using ::testing::SetArgPointee; - const CUpti_EventGroup g1{nullptr}; - const CUpti_EventGroup g2{reinterpret_cast(0x1000)}; - CUpti_EventGroup groups[] = {g1, g2}; - CUpti_EventGroupSet set; - set.eventGroups = groups; - set.numEventGroups = 2; - - std::map events; - Event instr("instructions"); - events[4] = std::move(instr); - Event cycles("cycles"); - events[5] = std::move(cycles); - Event branches("branches"); - events[10] = std::move(branches); - - MockCuptiEvents cupti_events; - EXPECT_CALL(cupti_events, enablePerInstance(g1)).Times(1); - EXPECT_CALL(cupti_events, enablePerInstance(g2)).Times(1); - EXPECT_CALL(cupti_events, instanceCount(g1)).Times(1).WillOnce(Return(80)); - EXPECT_CALL(cupti_events, instanceCount(g2)).Times(1).WillOnce(Return(40)); - std::vector events_in_group1 = {4, 5}; - EXPECT_CALL(cupti_events, eventsInGroup(g1)) - .Times(1) - .WillOnce(Return(events_in_group1)); - std::vector events_in_group2 = {10}; - EXPECT_CALL(cupti_events, eventsInGroup(g2)) - .Times(1) - .WillOnce(Return(events_in_group2)); - EventGroupSet group_set(set, events, cupti_events); - - EXPECT_EQ(group_set.groupCount(), 2); - EXPECT_EQ(events[4].instanceCount, 80); - EXPECT_EQ(events[5].instanceCount, 80); - EXPECT_EQ(events[10].instanceCount, 40); - - // This should not cause any Cupti API action as the group - // set is already disabled - group_set.setEnabled(false); - - // Activate group set - if activated twice, only the first - // should cause cupti API to be called - EXPECT_CALL(cupti_events, enableGroupSet(_)).Times(1); - group_set.setEnabled(false); - group_set.setEnabled(true); - - EXPECT_CALL(cupti_events, eventsInGroup(g1)) - .Times(1) - 
.WillOnce(Return(events_in_group1)); - EXPECT_CALL(cupti_events, eventsInGroup(g2)) - .Times(1) - .WillOnce(Return(events_in_group2)); - EXPECT_CALL(cupti_events, readEvent(g1, 4, _)).Times(1); - EXPECT_CALL(cupti_events, readEvent(g1, 5, _)).Times(1); - EXPECT_CALL(cupti_events, readEvent(g2, 10, _)).Times(1); - group_set.collectSample(); - - EXPECT_EQ(events[4].sampleCount(), 1); - EXPECT_EQ(events[5].sampleCount(), 1); - EXPECT_EQ(events[10].sampleCount(), 1); -} - -class MockLogger : public SampleListener { - public: - MOCK_METHOD3(handleSample, void(int device, const Sample& sample, bool from_new_version)); - MOCK_METHOD1(update, void(const Config& config)); -}; - -class EventProfilerTest : public ::testing::Test { - protected: - void SetUp() override { - auto cupti_events_ptr = std::make_unique(); - auto cupti_metrics_ptr = std::make_unique(); - cuptiEvents_ = cupti_events_ptr.get(); - cuptiMetrics_ = cupti_metrics_ptr.get(); - loggers_.push_back(std::make_unique()); - onDemandLoggers_.push_back(std::make_unique()); - profiler_ = std::make_unique( - std::move(cupti_events_ptr), - std::move(cupti_metrics_ptr), - loggers_, - onDemandLoggers_); - - for (int i = 0; i < kEventGroupCount; i++) { - eventGroups_[i] = &eventGroups_[i]; - } - for (int i = 0; i < kGroupSetCount; i++) { - // Default size to 1 but can be changed by test - groupSet_[i].numEventGroups = 1; - // Two groups per set - groupSet_[i].eventGroups = &eventGroups_[i * 2]; - } - groupSets_.numSets = 1; - groupSets_.sets = groupSet_; - } - - MockCuptiEvents* cuptiEvents_; - MockCuptiMetrics* cuptiMetrics_; - std::vector> loggers_; - std::vector> onDemandLoggers_; - constexpr static int kEventGroupCount = 4; - constexpr static int kGroupSetCount = 2; - CUpti_EventGroup eventGroups_[kEventGroupCount]; - CUpti_EventGroupSet groupSet_[kGroupSetCount]; - CUpti_EventGroupSets groupSets_; - std::unique_ptr profiler_; -}; - -TEST_F(EventProfilerTest, ConfigureFailure) { - using namespace testing; - - // 
Default config has no counters enabled. - // Check that profiler remains disabled. - Config cfg; - profiler_->configure(cfg, nullptr); - - EXPECT_FALSE(profiler_->enabled()); - - // There is no event named "cycles" - // In this case the profiler should print a warning and remain disabled - bool parsed = cfg.parse("EVENTS = cycles"); - EXPECT_TRUE(parsed); - - // EventProfiler should handle exception thrown from createGroupSets - // Configuration will be applied twice - once for combined base + on-demand - // and then again falling back to base - EXPECT_CALL(*cuptiEvents_, eventId("cycles")) - .Times(2) - .WillRepeatedly(Return(0)); - std::vector ids = {0}; - EXPECT_CALL(*cuptiEvents_, createGroupSets(ids)) - .Times(2) - .WillRepeatedly(Throw( - std::system_error(EINVAL, std::generic_category(), "Event ID"))); - profiler_->configure(cfg, nullptr); - - EXPECT_FALSE(profiler_->enabled()); -} - -TEST_F(EventProfilerTest, ConfigureBase) { - using namespace testing; - - // Test normal path, simple base config - Config cfg; - bool parsed = cfg.parse("EVENTS = elapsed_cycles_sm"); - EXPECT_TRUE(parsed); - - // One valid event - expect one call to eventId and createGroupSets - EXPECT_CALL(*cuptiEvents_, eventId("elapsed_cycles_sm")) - .Times(1) - .WillOnce(Return(5)); - std::vector ids = {5}; - EXPECT_CALL(*cuptiEvents_, createGroupSets(ids)) - .Times(1) - .WillOnce(Return(&groupSets_)); - EXPECT_CALL(*cuptiEvents_, enablePerInstance(eventGroups_[0])).Times(1); - EXPECT_CALL(*cuptiEvents_, instanceCount(eventGroups_[0])) - .Times(1) - .WillOnce(Return(80)); - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[0])) - .Times(1) - .WillOnce(Return(ids)); - EXPECT_CALL(*cuptiEvents_, enableGroupSet(_)).Times(1); - - profiler_->configure(cfg, nullptr); - - EXPECT_TRUE(profiler_->enabled()); -} - -TEST_F(EventProfilerTest, ConfigureOnDemand) { - using namespace testing; - - // Test base + on-demand config, one event and one metric - Config cfg, on_demand_cfg; - bool parsed = 
cfg.parse(R"( - EVENTS = active_cycles - SAMPLE_PERIOD_MSECS=500 - REPORT_PERIOD_SECS=10 - SAMPLES_PER_REPORT=5 - )"); - EXPECT_TRUE(parsed); - - parsed = on_demand_cfg.parse(R"( - METRICS = ipc - EVENTS_DURATION_SECS=60 - SAMPLE_PERIOD_MSECS=200 - MULTIPLEX_PERIOD_MSECS=2000 - REPORT_PERIOD_SECS=3 - SAMPLES_PER_REPORT=10 - )"); - EXPECT_TRUE(parsed); - - // One event - EXPECT_CALL(*cuptiEvents_, eventId("active_cycles")) - .Times(1) - .WillOnce(Return(3)); - // One metric - EXPECT_CALL(*cuptiMetrics_, idFromName("ipc")).Times(1).WillOnce(Return(10)); - std::map ipc_events; - ipc_events[4] = "instructions"; - ipc_events[5] = "elapsed_cycles_sm"; - EXPECT_CALL(*cuptiMetrics_, events(10)).Times(1).WillOnce(Return(ipc_events)); - EXPECT_CALL(*cuptiMetrics_, evaluationMode(10)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE)); - EXPECT_CALL(*cuptiMetrics_, valueKind(10)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_VALUE_KIND_DOUBLE)); - std::vector ids = {3, 4, 5}; - groupSet_[0].numEventGroups = 2; - groupSets_.numSets = 2; - EXPECT_CALL(*cuptiEvents_, createGroupSets(ids)) - .Times(1) - .WillOnce(Return(&groupSets_)); - // Specified CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE per instance above - // So check that it's enabled - EXPECT_CALL(*cuptiEvents_, enablePerInstance(eventGroups_[0])).Times(1); - EXPECT_CALL(*cuptiEvents_, enablePerInstance(eventGroups_[1])).Times(1); - EXPECT_CALL(*cuptiEvents_, enablePerInstance(eventGroups_[2])).Times(1); - std::vector ids_g1{3}, ids_g2{4}, ids_g3{5}; - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[0])) - .Times(1) - .WillOnce(Return(ids_g1)); - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[1])) - .Times(1) - .WillOnce(Return(ids_g2)); - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[2])) - .Times(1) - .WillOnce(Return(ids_g3)); - EXPECT_CALL(*cuptiEvents_, enableGroupSet(_)).Times(1); - - profiler_->configure(cfg, &on_demand_cfg); - - EXPECT_TRUE(profiler_->enabled()); - 
EXPECT_EQ(profiler_->samplePeriod().count(), 250); - EXPECT_EQ(profiler_->multiplexPeriod().count(), 1000); - EXPECT_EQ(profiler_->reportPeriod().count(), 10000); - EXPECT_EQ(profiler_->onDemandReportPeriod().count(), 4000); -} - -TEST_F(EventProfilerTest, ReportSample) { - using namespace testing; - - // Test base + on-demand config, one event and one metric - Config cfg, on_demand_cfg; - bool parsed = cfg.parse("EVENTS = active_cycles"); - EXPECT_TRUE(parsed); - - parsed = on_demand_cfg.parse(R"( - METRICS = ipc - EVENTS_DURATION_SECS=60 - )"); - EXPECT_TRUE(parsed); - - // One event - EXPECT_CALL(*cuptiEvents_, eventId("active_cycles")) - .Times(1) - .WillOnce(Return(3)); - // One metric - EXPECT_CALL(*cuptiMetrics_, idFromName("ipc")).Times(1).WillOnce(Return(10)); - std::map ipc_events; - ipc_events[4] = "instructions"; - ipc_events[5] = "elapsed_cycles_sm"; - EXPECT_CALL(*cuptiMetrics_, events(10)).Times(1).WillOnce(Return(ipc_events)); - EXPECT_CALL(*cuptiMetrics_, evaluationMode(10)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE)); - EXPECT_CALL(*cuptiMetrics_, valueKind(10)) - .Times(1) - .WillOnce(Return(CUPTI_METRIC_VALUE_KIND_DOUBLE)); - std::vector ids = {3, 4, 5}; - groupSet_[0].numEventGroups = 2; - groupSets_.numSets = 2; - EXPECT_CALL(*cuptiEvents_, createGroupSets(ids)) - .Times(1) - .WillOnce(Return(&groupSets_)); - EXPECT_CALL(*cuptiEvents_, instanceCount(_)) - .Times(3) - .WillRepeatedly(Return(4)); - std::vector ids_g1{3}, ids_g2{4}, ids_g3{5}; - // These will be called by collectSample() as well, which is called twice - // per group set - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[0])) - .Times(3) - .WillRepeatedly(Return(ids_g1)); - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[1])) - .Times(3) - .WillRepeatedly(Return(ids_g2)); - EXPECT_CALL(*cuptiEvents_, eventsInGroup(eventGroups_[2])) - .Times(3) - .WillRepeatedly(Return(ids_g3)); - EXPECT_CALL(*cuptiEvents_, enableGroupSet(_)).Times(1); - - 
profiler_->configure(cfg, &on_demand_cfg); - - EXPECT_TRUE(profiler_->enabled()); - - EXPECT_CALL(*cuptiEvents_, readEvent(_, _, _)) - .Times(6) - .WillRepeatedly(Invoke( - [](CUpti_EventGroup g, CUpti_EventID id, std::vector& vals) { - vals = {1, 2, 3, 4}; - })); - - // Need to collect four times - twice for each group set - profiler_->collectSample(); - profiler_->collectSample(); - EXPECT_CALL(*cuptiEvents_, disableGroupSet(_)).Times(1); - EXPECT_CALL(*cuptiEvents_, enableGroupSet(_)).Times(1); - profiler_->enableNextCounterSet(); - profiler_->collectSample(); - profiler_->collectSample(); - - std::vector ipc_ids = {4, 5}; - // Called once for each instance (4) and once for the total. - // x2 since we recompute per logger. - EXPECT_CALL( - *cuptiMetrics_, - calculate(10, CUPTI_METRIC_VALUE_KIND_DOUBLE, ipc_ids, _, 2000000000)) - .Times(10) - .WillRepeatedly(Return(SampleValue(0.3))); - auto& logger = dynamic_cast(*loggers_[0]); - EXPECT_CALL(logger, handleSample(0, _, _)) - .Times(1) - .WillOnce(Invoke([](int device, const Sample& sample, bool from_new_version) { - // Sample will include all stats - logger must pick the - // ones it wants. - EXPECT_EQ(sample.stats.size(), 4); - EXPECT_EQ(sample.stats[0].name, "active_cycles"); - EXPECT_EQ(sample.stats[1].name, "instructions"); - EXPECT_EQ(sample.stats[2].name, "elapsed_cycles_sm"); - EXPECT_EQ(sample.stats[3].name, "ipc"); - // 2 samples, each with values {1, 2, 3, 4} - // i.e. 
{2, 4, 6, 8} total - EXPECT_EQ(sample.stats[0].total.getInt(), 20); - EXPECT_EQ(sample.stats[0].percentileValues[0].second.getInt(), 2); - EXPECT_EQ(sample.stats[0].percentileValues.back().second.getInt(), 8); - // ipc is always 0.3 from mocked calculate function above - EXPECT_EQ(sample.stats[3].total.getDouble(), 0.3); - EXPECT_EQ(sample.stats[3].percentileValues[0].second.getDouble(), 0.3); - EXPECT_EQ( - sample.stats[3].percentileValues.back().second.getDouble(), 0.3); - })); - profiler_->reportSamples(); - - auto& on_demand_logger = dynamic_cast(*onDemandLoggers_[0]); - EXPECT_CALL(on_demand_logger, handleSample(0, _, _)).Times(1); - profiler_->reportOnDemandSamples(); - - EXPECT_CALL(*cuptiEvents_, disableGroupSet(_)).Times(1); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/LoggerObserverTest.cpp b/plugins/tensorboard-plugins/libkineto/test/LoggerObserverTest.cpp deleted file mode 100644 index 30ba4a824af10401a45100b0b39cec54fcf98680..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/LoggerObserverTest.cpp +++ /dev/null @@ -1,96 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. - -#include -#include - -// TODO(T90238193) -// @lint-ignore-every CLANGTIDY facebook-hte-RelativeInclude -#include "include/libkineto.h" -#include "src/Logger.h" -#include "LoggerCollector.h" - -using namespace KINETO_NAMESPACE; - -#if !USE_GOOGLE_LOG - -constexpr char InfoTestStr[] = "Checking LOG(INFO)"; -constexpr char WarningTestStr[] = "Checking LOG(WARNING)"; -constexpr char ErrorTestStr[] = "Checking LOG(ERROR)"; - -TEST(LoggerObserverTest, SingleCollectorObserver) { - // Add a LoggerObserverCollector to collect all logs during the trace. 
- std::unique_ptr lCollector = std::make_unique(); - Logger::addLoggerObserver(lCollector.get()); - - LOG(INFO) << InfoTestStr; - LOG(WARNING) << WarningTestStr; - LOG(ERROR) << ErrorTestStr; - - auto LoggerMD = lCollector->extractCollectorMetadata(); - EXPECT_TRUE(LoggerMD[LoggerOutputType::INFO][0].find(InfoTestStr) != std::string::npos); - EXPECT_TRUE(LoggerMD[LoggerOutputType::WARNING][0].find(WarningTestStr) != std::string::npos); - EXPECT_TRUE(LoggerMD[LoggerOutputType::ERROR][0].find(ErrorTestStr) != std::string::npos); - - Logger::removeLoggerObserver(lCollector.get()); -} - -#define NUM_OF_MESSAGES_FOR_EACH_TYPE 10 -#define NUM_OF_WRITE_THREADS 200 - -// Writes NUM_OF_MESSAGES_FOR_EACH_TYPE messages for each INFO, WARNING, and ERROR. -// NOLINTNEXTLINE(clang-diagnostic-unused-parameter) -void* writeSeveralMessages(void* ptr) { - for(int i=0; i lc1 = std::make_unique(); - std::unique_ptr lc2 = std::make_unique(); - std::unique_ptr lc3 = std::make_unique(); - std::unique_ptr lc4 = std::make_unique(); - Logger::addLoggerObserver(lc1.get()); - Logger::addLoggerObserver(lc2.get()); - Logger::addLoggerObserver(lc3.get()); - Logger::addLoggerObserver(lc4.get()); - - // Launch NUM_OF_WRITE_THREADS threads writing several messages. - pthread_t ListOfThreads[NUM_OF_WRITE_THREADS]; - for (int i=0; iextractCollectorMetadata(); - int InfoCount = 0, WarnCount = 0, ErrorCount = 0; - for (auto& md : lc1MD) { - InfoCount += md.first == LoggerOutputType::INFO ? md.second.size() : 0; - WarnCount += md.first == LoggerOutputType::WARNING ? md.second.size() : 0; - ErrorCount += md.first == LoggerOutputType::ERROR ? 
md.second.size() : 0; - } - - EXPECT_EQ(InfoCount, NUM_OF_WRITE_THREADS * NUM_OF_MESSAGES_FOR_EACH_TYPE); - EXPECT_EQ(WarnCount, NUM_OF_WRITE_THREADS * NUM_OF_MESSAGES_FOR_EACH_TYPE); - EXPECT_EQ(ErrorCount, NUM_OF_WRITE_THREADS * NUM_OF_MESSAGES_FOR_EACH_TYPE); - - Logger::removeLoggerObserver(lc1.get()); - Logger::removeLoggerObserver(lc2.get()); - Logger::removeLoggerObserver(lc3.get()); - Logger::removeLoggerObserver(lc4.get()); -} - -#endif // !USE_GOOGLE_LOG - -int main(int argc, char **argv) { - ::testing::InitGoogleTest(&argc, argv); - return RUN_ALL_TESTS(); -} diff --git a/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.cpp b/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.cpp deleted file mode 100644 index 89f1d536ca8d6d794b7ffc7402001d0e3d4d9c06..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.cpp +++ /dev/null @@ -1,49 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#include -#include -#include - -#include "test/MockActivitySubProfiler.h" - -namespace libkineto { - -const std::set supported_activities {ActivityType::CPU_OP}; -const std::string profile_name{"MockProfiler"}; - -void MockProfilerSession::processTrace(ActivityLogger& logger) { - for (const auto& activity: activities()) { - activity.log(logger); - } -} - -const std::string& MockActivityProfiler::name() const { - return profile_name; -} - -const std::set& MockActivityProfiler::availableActivities() const { - return supported_activities; -} - -MockActivityProfiler::MockActivityProfiler( - std::vector& activities) : - test_activities_(activities) {}; - -std::unique_ptr MockActivityProfiler::configure( - const std::set& /*activity_types*/, - const Config& /*config*/) { - auto session = std::make_unique(); - session->set_test_activities(std::move(test_activities_)); - return session; -}; - -std::unique_ptr MockActivityProfiler::configure( - int64_t /*ts_ms*/, - int64_t /*duration_ms*/, - const std::set& activity_types, - const Config& config) { - return configure(activity_types, config); -}; - -} // namespace libkineto - diff --git a/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.h b/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.h deleted file mode 100644 index 36eaa13d1a544c624a2f4bb053891d055686ebf4..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/MockActivitySubProfiler.h +++ /dev/null @@ -1,72 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
- -#pragma once - -#include -#include -#include - -#include "include/IActivityProfiler.h" - -namespace libkineto { - -class MockProfilerSession: public IActivityProfilerSession { - - public: - explicit MockProfilerSession() {} - - void start() override { - start_count++; - status_ = TraceStatus::RECORDING; - } - - void stop() override { - stop_count++; - status_ = TraceStatus::PROCESSING; - } - - std::vector& activities() override { - return test_activities_; - } - - std::vector errors() override { - return {}; - } - - void processTrace(ActivityLogger& logger) override; - - void set_test_activities(std::vector&& acs) { - test_activities_ = std::move(acs); - } - - int start_count = 0; - int stop_count = 0; - private: - std::vector test_activities_; -}; - - -class MockActivityProfiler: public IActivityProfiler { - - public: - explicit MockActivityProfiler(std::vector& activities); - - const std::string& name() const override; - - const std::set& availableActivities() const override; - - std::unique_ptr configure( - const std::set& activity_types, - const Config& config) override; - - std::unique_ptr configure( - int64_t ts_ms, - int64_t duration_ms, - const std::set& activity_types, - const Config& config) override; - - private: - std::vector test_activities_; -}; - -} // namespace libkineto diff --git a/plugins/tensorboard-plugins/libkineto/test/PidInfoTest.cpp b/plugins/tensorboard-plugins/libkineto/test/PidInfoTest.cpp deleted file mode 100644 index b86cfb36d0581ba9a8a03a09724b181c2fd2e88a..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/libkineto/test/PidInfoTest.cpp +++ /dev/null @@ -1,27 +0,0 @@ -// (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary. 
-
-#include "include/ThreadUtil.h"
-
-#include
-#include
-
-#include
-#include
-
-using namespace KINETO_NAMESPACE;
-
-TEST(ThreadNameTest, setAndGet) {
-  setThreadName("ThreadNameTest");
-  EXPECT_EQ(getThreadName(), "ThreadNameTest");
-
-  setThreadName("");
-  EXPECT_EQ(getThreadName(), "");
-
-  // Spaces etc are ok
-  setThreadName("Name w/ spaces");
-  EXPECT_EQ(getThreadName(), "Name w/ spaces");
-
-  // More than 16 chars is not OK
-  setThreadName("More than 16 characters");
-  EXPECT_EQ(getThreadName(), "Name w/ spaces");
-}
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/.gitignore b/plugins/tensorboard-plugins/tb_graph_ascend/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..70f4e767811d0d93c25fbb8ce2d2b29c4ba3b6e6
--- /dev/null
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/.gitignore
@@ -0,0 +1,11 @@
+node_modules/
+.npmrc
+yarn.lock
+dist/
+build/
+tb_graph_ascend.egg-info/
+__pycache__/
+/server/static/index.html
+report.html
+assets/
+/htmlcov/
\ No newline at end of file
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/README.md b/plugins/tensorboard-plugins/tb_graph_ascend/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..86fa749dbafbf46213cdf9cbe6d16da1de7683d3
--- /dev/null
+++ b/plugins/tensorboard-plugins/tb_graph_ascend/README.md
@@ -0,0 +1,152 @@
+# tb-graph-ascend
+
+## I. Introduction
+
+This tool is a TensorBoard plugin that visualizes model structure hierarchically. It visualizes a model's layer hierarchy together with its accuracy and performance data, and it can show the model under debug and the benchmark model in separate views with linked comparison, helping users locate accuracy problems quickly.
+
+## II. Quick Installation
+
+### 1. Dependencies
+
+  `python >= 3.7, tensorboard >= 2.11.2, numpy <= 1.26.3`
+
+### 2. Installation Methods
+
+#### 2.1 Install with pip (recommended)
+
+  - The plugin is published on PyPI, so it can be installed directly in a Python environment with the following pip command:
+    ```
+    pip install tb-graph-ascend
+    ```
+  - Alternatively, download the whl package from PyPI and transfer it to an environment without public network access for offline installation. Pick a whl package from the [download page](https://pypi.org/project/tb-graph-ascend/#files), then install it with the command below (`{version}` stands for the actual version of the whl package):
+    ```
+    pip install tb-graph_ascend_{version}-py3-none-any.whl
+    ```
+
+#### 2.2 Install from source
+
+1. 
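Download the source code from the repository and switch to the master branch:
+
+   ```
+   git clone https://gitee.com/ascend/mstt.git -b master
+   ```
+
+2. Change to the directory `plugins/tensorboard-plugins/tb_graph_ascend`.
+3. Build the front-end code, choosing the command that matches your operating system:
+
+   ```
+   cd fe
+   # Install the front-end dependencies
+   npm install --force
+   # Windows
+   npm run buildWin
+   # Other systems where the cp command is available, such as Linux or Mac
+   npm run buildLinux
+   ```
+
+   **Note**: this step requires Node.js to be installed.
+
+4. Go back to the parent directory and install directly:
+   ```
+   cd ../
+   python setup.py develop
+   ```
+   - Or build a whl package and install it:
+   ```
+   python setup.py bdist_wheel
+   ```
+   Take the whl package from the `plugins/tensorboard-plugins/tb_graph_ascend/dist` directory and install it with the following command (`{version}` stands for the actual version of the whl package):
+   ```
+   pip install tb-graph_ascend_{version}-py3-none-any.whl
+   ```
+
+### 3. Data to Be Parsed
+
+  Place the model structure files with the `.vis` suffix (the files themselves are in JSON format), collected with the graph-building feature of the [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe#10-%E5%88%86%E7%BA%A7%E5%8F%AF%E8%A7%86%E5%8C%96%E6%9E%84%E5%9B%BE%E6%AF%94%E5%AF%B9) tool, in a folder. The path of that folder is referred to below as `output_path`.
+  - E.g. \
+  `---output_path` \
+  `-----output.vis` \
+  `-----output2.vis`
+
+### 4. Startup
+
+1. Start TensorBoard
+
+   ```
+   tensorboard --logdir output_path
+   ```
+
+   Note: make sure the default port 6006 is reachable.
+
+   To use a different port, append it to the command, for example `--port=6007`:
+
+   ```
+   tensorboard --logdir output_path --port=6007
+   ```
+
+2. Open TensorBoard in a browser
+
+   Open the URL `http://localhost:6006` in the browser.
+
+   Note: if the files in the directory given by `--logdir` are very large or numerous, loading takes a while; wait, then refresh the browser to see the result.
+
+3. Starting TensorBoard locally is recommended. If the web browser and TensorBoard run on different machines, TensorBoard must be started for remote access; see [Viewing data remotely](#413-远程查看数据). Users must assess the **security risk** themselves.
+
+## III. Viewing in the Browser
+
+### 3.1 Main page
+
+
+![Image description](./doc/images/main-page.png)
+
+### 3.2 Controls
+
+- **Double-click a node to expand it; single-click to select it.**
+- **The selected node has a blue border; in comparison mode, if it has a matching node, that node's border is light blue.**
+- **On the keyboard, W/S zoom in and out around the mouse position, and A/D pan left and right.**
+- **The mouse wheel scrolls up and down, and the page can be dragged with the mouse.**
+- **In comparison mode, right-click a node to select it, then expand to and select the matching node on the other side.**
+
+![Image description](./doc/images/operator-image.png)
+### 3.3 Name search
+![Image description](./doc/images/vis_search_info.png)
+### 3.4 Accuracy filtering / overflow filtering
+Note: accuracy filtering and overflow filtering are not available in single-graph mode; the figure below shows two-graph comparison mode.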
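+
+![Image description](./doc/images/vis_precision_info.png)
+### 3.5 Unmatched node filtering
+See the matching description: nodes that do not satisfy the matching rules are unmatched nodes and are shown in gray. This helps when investigating structural differences between two models.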
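+
+![Image description](./doc/images/vis_unmatch_info.png)
+### 3.6 Manual node matching
+In the browser UI, use the mouse to select two unmatched gray nodes and match them. Real-data mode is not supported yet.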
+
+![Image description](./doc/images/vis_match_info.png)
+
+
+
+## IV. Appendix
+
+### 4.1 Security hardening recommendations
+
+#### 4.1.1 Disclaimer
+This tool is a plugin built on TensorBoard and runs on top of it. Users are responsible for TensorBoard's security configuration and security risks.
+#### 4.1.2 TensorBoard version
+Any TensorBoard version that meets the requirement in [Dependencies](#1-相关依赖) works with this plugin, but given TensorBoard's own security risks, the latest TensorBoard version is recommended.
+#### 4.1.3 Viewing data remotely
+
+If the web browser and TensorBoard run on different machines, TensorBoard can be started with options that allow remote viewing. This exposes the server's port on the local network, so users must assess the security risk themselves.
+
+ * Append `--bind_all` or `--host={server IP}` to the startup command to enable remote viewing, for example:
+
+   ```
+   tensorboard --logdir output_path --port=6006 --host=xxx.xxx.xxx.xxx
+   # or
+   tensorboard --logdir output_path --port=6006 --bind_all
+   ```
+
+ * When opening the page in the browser, replace the host name `localhost` in the URL with the host's IP address, for example `http://xxx.xxx.xxx.xxx:6006`
+
+### 4.2 Public network addresses
+[Public network addresses](./doc/公网地址说明.csv)
+
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/main-page.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/main-page.png
new file mode 100644
index 0000000000000000000000000000000000000000..b8e2a6dbcc5f55f3369406148dfc378890ccdc73
Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/main-page.png differ
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/operator-image.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/operator-image.png
new file mode 100644
index 0000000000000000000000000000000000000000..b4463c05dc0e6a379d68592ec4129bd397ae0dd6
Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/operator-image.png differ
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_match_info.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_match_info.png
new file mode 100644
index 0000000000000000000000000000000000000000..2d0c68cd12ab31c891be6f22de04f230472d4e2d
Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_match_info.png differ
diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_precision_info.png
b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_precision_info.png new file mode 100644 index 0000000000000000000000000000000000000000..79c6ff77f4fffedfcbaee47767d3f8a4f1b0d5b3 Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_precision_info.png differ diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_search_info.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_search_info.png new file mode 100644 index 0000000000000000000000000000000000000000..7c51a804862591005725e1c2e1da0ff0ac152df1 Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_search_info.png differ diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_unmatch_info.png b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_unmatch_info.png new file mode 100644 index 0000000000000000000000000000000000000000..4b123a4e7d06016cd76effd2cebcc30d6f4c2226 Binary files /dev/null and b/plugins/tensorboard-plugins/tb_graph_ascend/doc/images/vis_unmatch_info.png differ diff --git "a/plugins/tensorboard-plugins/tb_graph_ascend/doc/\345\205\254\347\275\221\345\234\260\345\235\200\350\257\264\346\230\216.csv" "b/plugins/tensorboard-plugins/tb_graph_ascend/doc/\345\205\254\347\275\221\345\234\260\345\235\200\350\257\264\346\230\216.csv" new file mode 100644 index 0000000000000000000000000000000000000000..c26fa5e50958bc9597ca2638ee814aa7e7dc918b --- /dev/null +++ "b/plugins/tensorboard-plugins/tb_graph_ascend/doc/\345\205\254\347\275\221\345\234\260\345\235\200\350\257\264\346\230\216.csv" @@ -0,0 +1,11 @@ +IPַ/URLַ//ַ,;˵ +http://www.apache.org/licenses/LICENSE-2.0,License +pmail_mindstudio@huawei.com,MindStudioٷ +https://gitee.com/ascend/mstt/tree/master/plugins/tensorboard-plugins/tb_graph_ascend,ֵַ +https://npms.io,npm߹ַ +http://codepen.io/shyndman/pen/,룬ע +https://github.com/webcomponents/shadycss/issues/193,룬ע +http://jsbin.com/temexa/4,룬ע 
+https://fonts.googleapis.com/,룬ʽļ +https://developer.mozilla.org/,룬ע +https://github.com/vaadin/vaadin-time-picker/issues/145,룬ע diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/.prettierrc b/plugins/tensorboard-plugins/tb_graph_ascend/fe/.prettierrc new file mode 100644 index 0000000000000000000000000000000000000000..e3d2acb00457084b2f6cccafb8c95740e0344485 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/.prettierrc @@ -0,0 +1,13 @@ +{ + "parser": "typescript", + "semi": true, + "singleQuote": true, + "jsxSingleQuote": false, + "bracketSpacing": true, + "tabWidth": 2, + "useTabs": false, + "trailingComma": "all", + "proseWrap": "always", + "endOfLine": "lf", + "printWidth": 120 +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/index.html b/plugins/tensorboard-plugins/tb_graph_ascend/fe/index.html new file mode 100644 index 0000000000000000000000000000000000000000..e91005f457dfc1076b8abc3c224d875e3dbdbb72 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/index.html @@ -0,0 +1,28 @@ + + + + + + + + + + + diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json new file mode 100644 index 0000000000000000000000000000000000000000..1ad4c73e884b479bfa99f95132880ba462fb1078 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package-lock.json @@ -0,0 +1,6474 @@ +{ + "name": "tb-graph-ascend", + "version": "0.1.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "tb-graph-ascend", + "version": "0.1.0", + "dependencies": { + "@polymer/decorators": "^3.0.0", + "@polymer/iron-behaviors": "^3.0.1", + "@polymer/iron-collapse": "^3.0.1", + "@polymer/iron-icon": "^3.0.1", + "@polymer/iron-icons": "^3.0.1", + "@polymer/iron-iconset-svg": "^3.0.1", + "@polymer/iron-list": "^3.1.0", + "@polymer/iron-resizable-behavior": "^3.0.1", + "@polymer/paper-behaviors": "^3.0.1", + 
"@polymer/paper-button": "^3.0.1", + "@polymer/paper-checkbox": "^3.1.0", + "@polymer/paper-dialog": "^3.0.1", + "@polymer/paper-dropdown-menu": "^3.1.0", + "@polymer/paper-icon-button": "^3.0.2", + "@polymer/paper-item": "^3.0.1", + "@polymer/paper-listbox": "^3.0.1", + "@polymer/paper-progress": "^3.0.1", + "@polymer/paper-tooltip": "^3.0.1", + "@polymer/polymer": "^3.5.1", + "@types/lodash": "^4.17.1", + "@vaadin/button": "24.6.5", + "@vaadin/combo-box": "24.6.5", + "@vaadin/details": "24.6.5", + "@vaadin/grid": "24.6.5", + "@vaadin/icon": "24.6.5", + "@vaadin/icons": "24.6.5", + "@vaadin/notification": "24.6.5", + "@vaadin/progress-bar": "24.6.5", + "@vaadin/select": "24.6.5", + "@vaadin/tabs": "24.6.5", + "@vaadin/tabsheet": "24.6.5", + "@vaadin/text-field": "24.6.5", + "@vaadin/tooltip": "24.6.5", + "clean-webpack-plugin": "^4.0.0", + "cross-env": "^7.0.3", + "css-loader": "^7.1.2", + "d3": "5.7.0", + "dagre": "^0.8.5", + "lodash": "^4.17.21", + "prettier": "^3.4.2", + "style-loader": "^4.0.0" + }, + "devDependencies": { + "@types/d3": "5.7.2", + "@types/lodash": "^4.14.172", + "@types/node": "^16.4.13", + "@types/offscreencanvas": "^2019.6.3", + "@types/requirejs": "^2.1.33", + "@types/resize-observer-browser": "^0.1.6", + "@types/three": "^0.131.0", + "html-loader": "^5.1.0", + "html-webpack-plugin": "^5.6.3", + "inline-chunk-html-plugin": "^1.1.1", + "ts-loader": "^9.5.1", + "tslib": "^2.6.2", + "typescript": "^5.4.5", + "webpack": "^5.96.1", + "webpack-cli": "^5.1.4", + "webpack-dev-server": "4.15.1", + "ws": "8.13.0" + } + }, + "node_modules/@discoveryjs/json-ext": { + "version": "0.5.7", + "resolved": "https://registry.npmmirror.com/@discoveryjs/json-ext/-/json-ext-0.5.7.tgz", + "integrity": "sha512-dBVuXR082gk3jsFp7Rd/JI4kytwGHecnCoTtXFb7DB6CNHp4rg5k1bhg0nWdLGLnOV71lmDzGQaLMy8iPLY0pw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10.0.0" + } + }, + "node_modules/@jridgewell/gen-mapping": { + "version": "0.3.8", + "resolved": 
"https://registry.npmmirror.com/@jridgewell/gen-mapping/-/gen-mapping-0.3.8.tgz", + "integrity": "sha512-imAbBGkb+ebQyxKgzv5Hu2nmROxoDOXHh80evxdoXNOrvAnVx7zimzc1Oo5h9RlfV4vPXaE2iM5pOFbvOCClWA==", + "license": "MIT", + "dependencies": { + "@jridgewell/set-array": "^1.2.1", + "@jridgewell/sourcemap-codec": "^1.4.10", + "@jridgewell/trace-mapping": "^0.3.24" + }, + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/@jridgewell/resolve-uri": { + "version": "3.1.2", + "resolved": "https://registry.npmmirror.com/@jridgewell/resolve-uri/-/resolve-uri-3.1.2.tgz", + "integrity": "sha512-bRISgCIjP20/tbWSPWMEi54QVPRZExkuD9lJL+UIxUKtwVJA8wW1Trb1jMs1RFXo1CBTNZ/5hpC9QvmKWdopKw==", + "license": "MIT", + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/@jridgewell/set-array": { + "version": "1.2.1", + "resolved": "https://registry.npmmirror.com/@jridgewell/set-array/-/set-array-1.2.1.tgz", + "integrity": "sha512-R8gLRTZeyp03ymzP/6Lil/28tGeGEzhx1q2k703KGWRAI1VdvPIXdG70VJc2pAMw3NA6JKL5hhFu1sJX0Mnn/A==", + "license": "MIT", + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/@jridgewell/source-map": { + "version": "0.3.6", + "resolved": "https://registry.npmmirror.com/@jridgewell/source-map/-/source-map-0.3.6.tgz", + "integrity": "sha512-1ZJTZebgqllO79ue2bm3rIGud/bOe0pP5BjSRCRxxYkEZS8STV7zN84UBbiYu7jy+eCKSnVIUgoWWE/tt+shMQ==", + "license": "MIT", + "dependencies": { + "@jridgewell/gen-mapping": "^0.3.5", + "@jridgewell/trace-mapping": "^0.3.25" + } + }, + "node_modules/@jridgewell/sourcemap-codec": { + "version": "1.5.0", + "resolved": "https://registry.npmmirror.com/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.5.0.tgz", + "integrity": "sha512-gv3ZRaISU3fjPAgNsriBRqGWQL6quFx04YMPW/zD8XMLsU32mhCCbfbO6KZFLjvYpCZ8zyDEgqsgf+PwPaM7GQ==", + "license": "MIT" + }, + "node_modules/@jridgewell/trace-mapping": { + "version": "0.3.25", + "resolved": "https://registry.npmmirror.com/@jridgewell/trace-mapping/-/trace-mapping-0.3.25.tgz", + "integrity": 
"sha512-vNk6aEwybGtawWmy/PzwnGDOjCkLWSD2wqvjGGAgOAwCGWySYXfYoxt00IJkTF+8Lb57DwOb3Aa0o9CApepiYQ==", + "license": "MIT", + "dependencies": { + "@jridgewell/resolve-uri": "^3.1.0", + "@jridgewell/sourcemap-codec": "^1.4.14" + } + }, + "node_modules/@leichtgewicht/ip-codec": { + "version": "2.0.5", + "resolved": "https://registry.npmmirror.com/@leichtgewicht/ip-codec/-/ip-codec-2.0.5.tgz", + "integrity": "sha512-Vo+PSpZG2/fmgmiNzYK9qWRh8h/CHrwD0mo1h1DzL4yzHNSfWYujGTYsWGreD000gcgmZ7K4Ys6Tx9TxtsKdDw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@lit-labs/ssr-dom-shim": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/@lit-labs/ssr-dom-shim/-/ssr-dom-shim-1.3.0.tgz", + "integrity": "sha512-nQIWonJ6eFAvUUrSlwyHDm/aE8PBDu5kRpL0vHMg6K8fK3Diq1xdPjTnsJSwxABhaZ+5eBi1btQB5ShUTKo4nQ==", + "license": "BSD-3-Clause" + }, + "node_modules/@lit/reactive-element": { + "version": "2.0.4", + "resolved": "https://registry.npmmirror.com/@lit/reactive-element/-/reactive-element-2.0.4.tgz", + "integrity": "sha512-GFn91inaUa2oHLak8awSIigYz0cU0Payr1rcFsrkf5OJ5eSPxElyZfKh0f2p9FsTiZWXQdWGJeXZICEfXXYSXQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@lit-labs/ssr-dom-shim": "^1.2.0" + } + }, + "node_modules/@open-wc/dedupe-mixin": { + "version": "1.4.0", + "resolved": "https://registry.npmmirror.com/@open-wc/dedupe-mixin/-/dedupe-mixin-1.4.0.tgz", + "integrity": "sha512-Sj7gKl1TLcDbF7B6KUhtvr+1UCxdhMbNY5KxdU5IfMFWqL8oy1ZeAcCANjoB1TL0AJTcPmcCFsCbHf8X2jGDUA==", + "license": "MIT" + }, + "node_modules/@polymer/decorators": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/@polymer/decorators/-/decorators-3.0.0.tgz", + "integrity": "sha512-qh+VID9nDV9q3ABvIfWgm7/+udl7v2HKsMLPXFm8tj1fI7qr7yWJMFwS3xWBkMmuNPtmkS8MDP0vqLAQIEOWzg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.5" + } + }, + "node_modules/@polymer/font-roboto": { + "version": "3.0.2", + "resolved": 
"https://registry.npmmirror.com/@polymer/font-roboto/-/font-roboto-3.0.2.tgz", + "integrity": "sha512-tx5TauYSmzsIvmSqepUPDYbs4/Ejz2XbZ1IkD7JEGqkdNUJlh+9KU85G56Tfdk/xjEZ8zorFfN09OSwiMrIQWA==", + "license": "BSD-3-Clause" + }, + "node_modules/@polymer/iron-a11y-announcer": { + "version": "3.2.0", + "resolved": "https://registry.npmmirror.com/@polymer/iron-a11y-announcer/-/iron-a11y-announcer-3.2.0.tgz", + "integrity": "sha512-We+hyaFHcg7Ke8ovsoxUpYEXFIJLHxMCDaLehTB4dELS+C+K0zMnGSiqQvb/YzGS+nSYpAfkQIyg1msOCdHMtA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-a11y-keys-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-a11y-keys-behavior/-/iron-a11y-keys-behavior-3.0.1.tgz", + "integrity": "sha512-lnrjKq3ysbBPT/74l0Fj0U9H9C35Tpw2C/tpJ8a+5g8Y3YJs1WSZYnEl1yOkw6sEyaxOq/1DkzH0+60gGu5/PQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-autogrow-textarea": { + "version": "3.0.3", + "resolved": "https://registry.npmmirror.com/@polymer/iron-autogrow-textarea/-/iron-autogrow-textarea-3.0.3.tgz", + "integrity": "sha512-5r0VkWrIlm0JIp5E5wlnvkw7slK72lFRZXncmrsLZF+6n1dg2rI8jt7xpFzSmUWrqpcyXwyKaGaDvUjl3j4JLA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/iron-validatable-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-behaviors": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-behaviors/-/iron-behaviors-3.0.1.tgz", + "integrity": "sha512-IMEwcv1lhf1HSQxuyWOUIL0lOBwmeaoSTpgCJeP9IBYnuB1SPQngmfRuHKgK6/m9LQ9F9miC7p3HeQQUdKAE0w==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + 
"node_modules/@polymer/iron-checked-element-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-checked-element-behavior/-/iron-checked-element-behavior-3.0.1.tgz", + "integrity": "sha512-aDr0cbCNVq49q+pOqa6CZutFh+wWpwPMLpEth9swx+GkAj+gCURhuQkaUYhIo5f2egDbEioR1aeHMnPlU9dQZA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-form-element-behavior": "^3.0.0-pre.26", + "@polymer/iron-validatable-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-collapse": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-collapse/-/iron-collapse-3.0.1.tgz", + "integrity": "sha512-yg6q5ZyckQR9VL9VmLrSTkSFXWy9AcJC8KtnD5cg0EHRPbakE8I9S/gVAgeP4nMWV2a/BjLLC4IBygcCMDhAGw==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-resizable-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-dropdown": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-dropdown/-/iron-dropdown-3.0.1.tgz", + "integrity": "sha512-22yLhepfcKjuQMfFmRHi/9MPKTqkzgRrmWWW0P5uqK++xle53k2QBO5VYUAYiCN3ZcxIi9lEhZ9YWGeQj2JBig==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-overlay-behavior": "^3.0.0-pre.27", + "@polymer/neon-animation": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-fit-behavior": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@polymer/iron-fit-behavior/-/iron-fit-behavior-3.1.0.tgz", + "integrity": "sha512-ABcgIYqrjhmUT8tiuolqeGttF/8pd3sEymUDrO1vXbZu4FWIvoLNndrMDFvs++AGd12Mjf5pYy84NJc6dB8Vig==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-flex-layout": { + "version": "3.0.1", + "resolved": 
"https://registry.npmmirror.com/@polymer/iron-flex-layout/-/iron-flex-layout-3.0.1.tgz", + "integrity": "sha512-7gB869czArF+HZcPTVSgvA7tXYFze9EKckvM95NB7SqYF+NnsQyhoXgKnpFwGyo95lUjUW9TFDLUwDXnCYFtkw==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-form-element-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-form-element-behavior/-/iron-form-element-behavior-3.0.1.tgz", + "integrity": "sha512-G/e2KXyL5AY7mMjmomHkGpgS0uAf4ovNpKhkuUTRnMuMJuf589bKqE85KN4ovE1Tzhv2hJoh/igyD6ekHiYU1A==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-icon": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-icon/-/iron-icon-3.0.1.tgz", + "integrity": "sha512-QLPwirk+UPZNaLnMew9VludXA4CWUCenRewgEcGYwdzVgDPCDbXxy6vRJjmweZobMQv/oVLppT2JZtJFnPxX6g==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/iron-meta": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-icons": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-icons/-/iron-icons-3.0.1.tgz", + "integrity": "sha512-xtEI8erH2GIBiF3QxEMyW81XuVjguu6Le5WjEEpX67qd9z7jjmc4T/ke3zRUlnDydex9p8ytcwVpMIKcyvjYAQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-icon": "^3.0.0-pre.26", + "@polymer/iron-iconset-svg": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-iconset-svg": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-iconset-svg/-/iron-iconset-svg-3.0.1.tgz", + "integrity": "sha512-XNwURbNHRw6u2fJe05O5fMYye6GSgDlDqCO+q6K1zAnKIrpgZwf2vTkBd5uCcZwsN0FyCB3mvNZx4jkh85dRDw==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-meta": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + 
"node_modules/@polymer/iron-input": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-input/-/iron-input-3.0.1.tgz", + "integrity": "sha512-WLx13kEcbH9GKbj9+pWR6pbJkA5kxn3796ynx6eQd2rueMyUfVTR3GzOvadBKsciUuIuzrxpBWZ2+3UcueVUQQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-announcer": "^3.0.0-pre.26", + "@polymer/iron-validatable-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-list": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@polymer/iron-list/-/iron-list-3.1.0.tgz", + "integrity": "sha512-Eiv6xd3h3oPmn8SXFntXVfC3ZnegH+KHAxiKLKcOASFSRY3mHnr2AdcnExUJ9ItoCMA5UzKaM/0U22eWzGERtA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-resizable-behavior": "^3.0.0-pre.26", + "@polymer/iron-scroll-target-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-menu-behavior": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/@polymer/iron-menu-behavior/-/iron-menu-behavior-3.0.2.tgz", + "integrity": "sha512-8dpASkFNBIkxAJWsFLWIO1M7tKM0+wKs3PqdeF/dDdBciwoaaFgC2K1XCZFZnbe2t9/nJgemXxVugGZAWpYCGg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/iron-selector": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-meta": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-meta/-/iron-meta-3.0.1.tgz", + "integrity": "sha512-pWguPugiLYmWFV9UWxLWzZ6gm4wBwQdDy4VULKwdHCqR7OP7u98h+XDdGZsSlDPv6qoryV/e3tGHlTIT0mbzJA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-overlay-behavior": { + "version": "3.0.3", + "resolved": 
"https://registry.npmmirror.com/@polymer/iron-overlay-behavior/-/iron-overlay-behavior-3.0.3.tgz", + "integrity": "sha512-Q/Fp0+uOQQ145ebZ7T8Cxl4m1tUKYjyymkjcL2rXUm+aDQGb1wA1M1LYxUF5YBqd+9lipE0PTIiYwA2ZL/sznA==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-fit-behavior": "^3.0.0-pre.26", + "@polymer/iron-resizable-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-range-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-range-behavior/-/iron-range-behavior-3.0.1.tgz", + "integrity": "sha512-+jtL9v45M/T1RJleWyQaNH84S9/mIIR+AjNbYIttbKGp1eG+98j8MDWe7LXNtg79V2LQnE/+VS82cBeELyGVeg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-resizable-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-resizable-behavior/-/iron-resizable-behavior-3.0.1.tgz", + "integrity": "sha512-FyHxRxFspVoRaeZSWpT3y0C9awomb4tXXolIJcZ7RvXhMP632V5lez+ch5G5SwK0LpnAPkg35eB0LPMFv+YMMQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-scroll-target-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-scroll-target-behavior/-/iron-scroll-target-behavior-3.0.1.tgz", + "integrity": "sha512-xg1WanG25BIkQE8rhuReqY9zx1K5M7F+YAIYpswEp5eyDIaZ1Y3vUmVeQ3KG+hiSugzI1M752azXN7kvyhOBcQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/iron-selector": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-selector/-/iron-selector-3.0.1.tgz", + "integrity": "sha512-sBVk2uas6prW0glUe2xEJJYlvxmYzM40Au9OKbfDK2Qekou/fLKcBRyIYI39kuI8zWRaip8f3CI8qXcUHnKb1A==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/polymer": 
"^3.0.0" + } + }, + "node_modules/@polymer/iron-validatable-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/iron-validatable-behavior/-/iron-validatable-behavior-3.0.1.tgz", + "integrity": "sha512-wwpYh6wOa4fNI+jH5EYKC7TVPYQ2OfgQqocWat7GsNWcsblKYhLYbwsvEY5nO0n2xKqNfZzDLrUom5INJN7msQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-meta": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/neon-animation": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/neon-animation/-/neon-animation-3.0.1.tgz", + "integrity": "sha512-cDDc0llpVCe0ATbDS3clDthI54Bc8YwZIeTGGmBJleKOvbRTUC5+ssJmRL+VwVh+VM5FlnQlx760ppftY3uprg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-resizable-behavior": "^3.0.0-pre.26", + "@polymer/iron-selector": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-behaviors": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-behaviors/-/paper-behaviors-3.0.1.tgz", + "integrity": "sha512-6knhj69fPJejv8qR0kCSUY+Q0XjaUf0OSnkjRjmTJPAwSrRYtgqE+l6P1FfA+py1X/cUjgne9EF5rMZAKJIg1g==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-checked-element-behavior": "^3.0.0-pre.26", + "@polymer/paper-ripple": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-button": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-button/-/paper-button-3.0.1.tgz", + "integrity": "sha512-JRNBc+Oj9EWnmyLr7FcCr8T1KAnEHPh6mosln9BUdkM+qYaYsudSICh3cjTIbnj6AuF5OJidoLkM1dlyj0j6Zg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/paper-behaviors": "^3.0.0-pre.27", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-checkbox": { + 
"version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@polymer/paper-checkbox/-/paper-checkbox-3.1.0.tgz", + "integrity": "sha512-kXm6yDG1tT8if0XuJ2cc9NF+g8Ev4wG+rnf0a+Sx+O7J6fn1jcnBlYn72FlrfjVjDQZDBFmT6nynhD5PvFw8iQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-checked-element-behavior": "^3.0.0-pre.26", + "@polymer/paper-behaviors": "^3.0.0-pre.27", + "@polymer/paper-ripple": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-dialog": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-dialog/-/paper-dialog-3.0.1.tgz", + "integrity": "sha512-KvglYbEq7AWJvui2j6WKLnOvgVMeGjovAydGrPRj7kVzCiD49Eq/hpYFJTRV5iDcalWH+mORUpw+jrFnG9+Kgw==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-overlay-behavior": "^3.0.0-pre.27", + "@polymer/neon-animation": "^3.0.0-pre.26", + "@polymer/paper-dialog-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-dialog-behavior": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-dialog-behavior/-/paper-dialog-behavior-3.0.1.tgz", + "integrity": "sha512-wbI4kCK8le/9MHT+IXzvHjoatxf3kd3Yn0tgozAiAwfSZ7N4Ubpi5MHrK0m9S9PeIxKokAgBYdTUrezSE5378A==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-overlay-behavior": "^3.0.0-pre.27", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-dropdown-menu": { + "version": "3.2.0", + "resolved": "https://registry.npmmirror.com/@polymer/paper-dropdown-menu/-/paper-dropdown-menu-3.2.0.tgz", + "integrity": "sha512-2ohwSHF+RLSK6kA0UkkMiMQF6EZcaEYWAA25kfisI6DWie7yozKrpQNsqvwfOEHU6DdDMIotrOtH1TM88YS8Zg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + 
"@polymer/iron-form-element-behavior": "^3.0.0-pre.26", + "@polymer/iron-icon": "^3.0.0-pre.26", + "@polymer/iron-iconset-svg": "^3.0.0-pre.26", + "@polymer/iron-validatable-behavior": "^3.0.0-pre.26", + "@polymer/paper-behaviors": "^3.0.0-pre.27", + "@polymer/paper-input": "^3.1.0", + "@polymer/paper-menu-button": "^3.1.0", + "@polymer/paper-ripple": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.3.1" + } + }, + "node_modules/@polymer/paper-icon-button": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/@polymer/paper-icon-button/-/paper-icon-button-3.0.2.tgz", + "integrity": "sha512-kOdxQgnKL097bggFF6PWvsBYuWg+MCcoHoTHX6bh/MuZoWFZNjrFntFqwuB4oEbpjCpfm4moA33muPJFj7CihQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-icon": "^3.0.0-pre.26", + "@polymer/paper-behaviors": "^3.0.0-pre.27", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-input": { + "version": "3.2.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-input/-/paper-input-3.2.1.tgz", + "integrity": "sha512-6ghgwQKM6mS0hAQxQqj+tkeEY1VUBqAsrasAm8V5RpNcfSWQC/hhRFxU0beGuKTAhndzezDzWYP6Zz4b8fExGg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-autogrow-textarea": "^3.0.0-pre.26", + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-form-element-behavior": "^3.0.0-pre.26", + "@polymer/iron-input": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-item": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-item/-/paper-item-3.0.1.tgz", + "integrity": "sha512-KTk2N+GsYiI/HuubL3sxebZ6tteQbBOAp4QVLAnbjSPmwl+mJSDWk+omuadesU0bpkCwaWVs3fHuQsmXxy4pkw==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-behaviors": "^3.0.0-pre.26", + 
"@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-listbox": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-listbox/-/paper-listbox-3.0.1.tgz", + "integrity": "sha512-vMLWFpYcggAPmEDBmK+96fFefacOG3GLB1EguTn8+ZkqI+328hNfw1MzHjH68rgCIIUtjmm+9qgB1Sy/MN0a/A==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-menu-behavior": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-menu-button": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@polymer/paper-menu-button/-/paper-menu-button-3.1.0.tgz", + "integrity": "sha512-q0G0/rvYD/FFmIBMGCQWjfXzRqwFw9+WHSYV4uOQzM1Ln8LMXSAd+2CENsbVwtMh6fmBePj15ZlU8SM2dt1WDQ==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/iron-behaviors": "^3.0.0-pre.26", + "@polymer/iron-dropdown": "^3.0.0-pre.26", + "@polymer/iron-fit-behavior": "^3.1.0", + "@polymer/neon-animation": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-progress": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-progress/-/paper-progress-3.0.1.tgz", + "integrity": "sha512-5nguG+tmnyoaWKVNG8Smtno2uLSPBgEsT3f20JY8yJTjUBYWaqa8E3l5RLkTRXgA4x9OnvLb8/CdlQWXQIogBg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/iron-range-behavior": "^3.0.0-pre.26", + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-ripple": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/@polymer/paper-ripple/-/paper-ripple-3.0.2.tgz", + "integrity": 
"sha512-DnLNvYIMsiayeICroYxx6Q6Hg1cUU8HN2sbutXazlemAlGqdq80qz3TIaVdbpbt/pvjcFGX2HtntMlPstCge8Q==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/iron-a11y-keys-behavior": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-styles": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-styles/-/paper-styles-3.0.1.tgz", + "integrity": "sha512-y6hmObLqlCx602TQiSBKHqjwkE7xmDiFkoxdYGaNjtv4xcysOTdVJsDR/R9UHwIaxJ7gHlthMSykir1nv78++g==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/font-roboto": "^3.0.1", + "@polymer/iron-flex-layout": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/paper-tooltip": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/@polymer/paper-tooltip/-/paper-tooltip-3.0.1.tgz", + "integrity": "sha512-yiUk09opTEnE1lK+tb501ENb+yQBi4p++Ep0eGJAHesVYKVMPNgPphVKkIizkDaU+n0SE+zXfTsRbYyOMDYXSg==", + "license": "BSD-3-Clause", + "dependencies": { + "@polymer/paper-styles": "^3.0.0-pre.26", + "@polymer/polymer": "^3.0.0" + } + }, + "node_modules/@polymer/polymer": { + "version": "3.5.2", + "resolved": "https://registry.npmmirror.com/@polymer/polymer/-/polymer-3.5.2.tgz", + "integrity": "sha512-fWwImY/UH4bb2534DVSaX+Azs2yKg8slkMBHOyGeU2kKx7Xmxp6Lee0jP8p6B3d7c1gFUPB2Z976dTUtX81pQA==", + "license": "BSD-3-Clause", + "dependencies": { + "@webcomponents/shadycss": "^1.9.1" + } + }, + "node_modules/@types/body-parser": { + "version": "1.19.5", + "resolved": "https://registry.npmmirror.com/@types/body-parser/-/body-parser-1.19.5.tgz", + "integrity": "sha512-fB3Zu92ucau0iQ0JMCFQE7b/dv8Ot07NI3KaZIkIUNXq82k4eBAqUaneXfleGY9JWskeS9y+u0nXMyspcuQrCg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/connect": "*", + "@types/node": "*" + } + }, + "node_modules/@types/bonjour": { + "version": "3.5.13", + "resolved": "https://registry.npmmirror.com/@types/bonjour/-/bonjour-3.5.13.tgz", + "integrity": 
"sha512-z9fJ5Im06zvUL548KvYNecEVlA7cVDkGUi6kZusb04mpyEFKCIZJvloCcmpmLaIahDpOQGHaHmG6imtPMmPXGQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@types/connect": { + "version": "3.4.38", + "resolved": "https://registry.npmmirror.com/@types/connect/-/connect-3.4.38.tgz", + "integrity": "sha512-K6uROf1LD88uDQqJCktA4yzL1YYAK6NgfsI0v/mTgyPKWsX1CnJ0XPSDhViejru1GcRkLWb8RlzFYJRqGUbaug==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@types/connect-history-api-fallback": { + "version": "1.5.4", + "resolved": "https://registry.npmmirror.com/@types/connect-history-api-fallback/-/connect-history-api-fallback-1.5.4.tgz", + "integrity": "sha512-n6Cr2xS1h4uAulPRdlw6Jl6s1oG8KrVilPN2yUITEs+K48EzMJJ3W1xy8K5eWuFvjp3R74AOIGSmp2UfBJ8HFw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/express-serve-static-core": "*", + "@types/node": "*" + } + }, + "node_modules/@types/d3": { + "version": "5.7.2", + "resolved": "https://registry.npmmirror.com/@types/d3/-/d3-5.7.2.tgz", + "integrity": "sha512-7/wClB8ycneWGy3jdvLfXKTd5SoTg9hji7IdJ0RuO9xTY54YpJ8zlcFADcXhY1J3kCBwxp+/1jeN6a5OMwgYOw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-array": "^1", + "@types/d3-axis": "*", + "@types/d3-brush": "*", + "@types/d3-chord": "*", + "@types/d3-collection": "*", + "@types/d3-color": "*", + "@types/d3-contour": "*", + "@types/d3-dispatch": "*", + "@types/d3-drag": "*", + "@types/d3-dsv": "*", + "@types/d3-ease": "*", + "@types/d3-fetch": "*", + "@types/d3-force": "*", + "@types/d3-format": "*", + "@types/d3-geo": "*", + "@types/d3-hierarchy": "*", + "@types/d3-interpolate": "*", + "@types/d3-path": "*", + "@types/d3-polygon": "*", + "@types/d3-quadtree": "*", + "@types/d3-random": "*", + "@types/d3-scale": "*", + "@types/d3-scale-chromatic": "*", + "@types/d3-selection": "*", + "@types/d3-shape": "*", + "@types/d3-time": "*", + "@types/d3-time-format": 
"*", + "@types/d3-timer": "*", + "@types/d3-transition": "*", + "@types/d3-voronoi": "*", + "@types/d3-zoom": "*" + } + }, + "node_modules/@types/d3-array": { + "version": "1.2.12", + "resolved": "https://registry.npmmirror.com/@types/d3-array/-/d3-array-1.2.12.tgz", + "integrity": "sha512-zIq9wCg/JO7MGC6vq3HRDaVYkqgSPIDjpo3JhAQxl7PHYVPA5D9SMiBfjW/ZoAvPd2a+rkovqBg0nS0QOChsJQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-axis": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-axis/-/d3-axis-3.0.6.tgz", + "integrity": "sha512-pYeijfZuBd87T0hGn0FO1vQ/cgLk6E1ALJjfkC0oJ8cbwkZl3TpgS8bVBLZN+2jjGgg38epgxb2zmoGtSfvgMw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-brush": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-brush/-/d3-brush-3.0.6.tgz", + "integrity": "sha512-nH60IZNNxEcrh6L1ZSMNA28rj27ut/2ZmI3r96Zd+1jrZD++zD3LsMIjWlvg4AYrHn/Pqz4CF3veCxGjtbqt7A==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-chord": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-chord/-/d3-chord-3.0.6.tgz", + "integrity": "sha512-LFYWWd8nwfwEmTZG9PfQxd17HbNPksHBiJHaKuY1XeqscXacsS2tyoo6OdRsjf+NQYeB6XrNL3a25E3gH69lcg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-collection": { + "version": "1.0.13", + "resolved": "https://registry.npmmirror.com/@types/d3-collection/-/d3-collection-1.0.13.tgz", + "integrity": "sha512-v0Rgw3IZebRyamcwVmtTDCZ8OmQcj4siaYjNc7wGMZT7PmdSHawGsCOQMxyLvZ7lWjfohYLK0oXtilMOMgfY8A==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-color": { + "version": "3.1.3", + "resolved": "https://registry.npmmirror.com/@types/d3-color/-/d3-color-3.1.3.tgz", + "integrity": "sha512-iO90scth9WAbmgv7ogoq57O9YpKmFBbmoEoCHDB2xMBY0+/KVrqAaCDyCE16dUspeOvIxFFRI+0sEtqDqy2b4A==", + "dev": true, 
+ "license": "MIT" + }, + "node_modules/@types/d3-contour": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-contour/-/d3-contour-3.0.6.tgz", + "integrity": "sha512-BjzLgXGnCWjUSYGfH1cpdo41/hgdWETu4YxpezoztawmqsvCeep+8QGfiY6YbDvfgHz/DkjeIkkZVJavB4a3rg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-array": "*", + "@types/geojson": "*" + } + }, + "node_modules/@types/d3-dispatch": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-dispatch/-/d3-dispatch-3.0.6.tgz", + "integrity": "sha512-4fvZhzMeeuBJYZXRXrRIQnvUYfyXwYmLsdiN7XXmVNQKKw1cM8a5WdID0g1hVFZDqT9ZqZEY5pD44p24VS7iZQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-drag": { + "version": "3.0.7", + "resolved": "https://registry.npmmirror.com/@types/d3-drag/-/d3-drag-3.0.7.tgz", + "integrity": "sha512-HE3jVKlzU9AaMazNufooRJ5ZpWmLIoc90A37WU2JMmeq28w1FQqCZswHZ3xR+SuxYftzHq6WU6KJHvqxKzTxxQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-dsv": { + "version": "3.0.7", + "resolved": "https://registry.npmmirror.com/@types/d3-dsv/-/d3-dsv-3.0.7.tgz", + "integrity": "sha512-n6QBF9/+XASqcKK6waudgL0pf/S5XHPPI8APyMLLUHd8NqouBGLsU8MgtO7NINGtPBtk9Kko/W4ea0oAspwh9g==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-ease": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/@types/d3-ease/-/d3-ease-3.0.2.tgz", + "integrity": "sha512-NcV1JjO5oDzoK26oMzbILE6HW7uVXOHLQvHshBUW4UMdZGfiY6v5BeQwh9a9tCzv+CeefZQHJt5SRgK154RtiA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-fetch": { + "version": "3.0.7", + "resolved": "https://registry.npmmirror.com/@types/d3-fetch/-/d3-fetch-3.0.7.tgz", + "integrity": "sha512-fTAfNmxSb9SOWNB9IoG5c8Hg6R+AzUHDRlsXsDZsNp6sxAEOP0tkP3gKkNSO/qmHPoBFTxNrjDprVHDQDvo5aA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-dsv": "*" + } + }, + 
"node_modules/@types/d3-force": { + "version": "3.0.10", + "resolved": "https://registry.npmmirror.com/@types/d3-force/-/d3-force-3.0.10.tgz", + "integrity": "sha512-ZYeSaCF3p73RdOKcjj+swRlZfnYpK1EbaDiYICEEp5Q6sUiqFaFQ9qgoshp5CzIyyb/yD09kD9o2zEltCexlgw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-format": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/@types/d3-format/-/d3-format-3.0.4.tgz", + "integrity": "sha512-fALi2aI6shfg7vM5KiR1wNJnZ7r6UuggVqtDA+xiEdPZQwy/trcQaHnwShLuLdta2rTymCNpxYTiMZX/e09F4g==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-geo": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@types/d3-geo/-/d3-geo-3.1.0.tgz", + "integrity": "sha512-856sckF0oP/diXtS4jNsiQw/UuK5fQG8l/a9VVLeSouf1/PPbBE1i1W852zVwKwYCBkFJJB7nCFTbk6UMEXBOQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/geojson": "*" + } + }, + "node_modules/@types/d3-hierarchy": { + "version": "3.1.7", + "resolved": "https://registry.npmmirror.com/@types/d3-hierarchy/-/d3-hierarchy-3.1.7.tgz", + "integrity": "sha512-tJFtNoYBtRtkNysX1Xq4sxtjK8YgoWUNpIiUee0/jHGRwqvzYxkq0hGVbbOGSz+JgFxxRu4K8nb3YpG3CMARtg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-interpolate": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/@types/d3-interpolate/-/d3-interpolate-3.0.4.tgz", + "integrity": "sha512-mgLPETlrpVV1YRJIglr4Ez47g7Yxjl1lj7YKsiMCb27VJH9W8NVM6Bb9d8kkpG/uAQS5AmbA48q2IAolKKo1MA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-color": "*" + } + }, + "node_modules/@types/d3-path": { + "version": "3.1.1", + "resolved": "https://registry.npmmirror.com/@types/d3-path/-/d3-path-3.1.1.tgz", + "integrity": "sha512-VMZBYyQvbGmWyWVea0EHs/BwLgxc+MKi1zLDCONksozI4YJMcTt8ZEuIR4Sb1MMTE8MMW49v0IwI5+b7RmfWlg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-polygon": { + "version": "3.0.2", + "resolved": 
"https://registry.npmmirror.com/@types/d3-polygon/-/d3-polygon-3.0.2.tgz", + "integrity": "sha512-ZuWOtMaHCkN9xoeEMr1ubW2nGWsp4nIql+OPQRstu4ypeZ+zk3YKqQT0CXVe/PYqrKpZAi+J9mTs05TKwjXSRA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-quadtree": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/@types/d3-quadtree/-/d3-quadtree-3.0.6.tgz", + "integrity": "sha512-oUzyO1/Zm6rsxKRHA1vH0NEDG58HrT5icx/azi9MF1TWdtttWl0UIUsjEQBBh+SIkrpd21ZjEv7ptxWys1ncsg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-random": { + "version": "3.0.3", + "resolved": "https://registry.npmmirror.com/@types/d3-random/-/d3-random-3.0.3.tgz", + "integrity": "sha512-Imagg1vJ3y76Y2ea0871wpabqp613+8/r0mCLEBfdtqC7xMSfj9idOnmBYyMoULfHePJyxMAw3nWhJxzc+LFwQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-scale": { + "version": "4.0.9", + "resolved": "https://registry.npmmirror.com/@types/d3-scale/-/d3-scale-4.0.9.tgz", + "integrity": "sha512-dLmtwB8zkAeO/juAMfnV+sItKjlsw2lKdZVVy6LRr0cBmegxSABiLEpGVmSJJ8O08i4+sGR6qQtb6WtuwJdvVw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-time": "*" + } + }, + "node_modules/@types/d3-scale-chromatic": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/@types/d3-scale-chromatic/-/d3-scale-chromatic-3.1.0.tgz", + "integrity": "sha512-iWMJgwkK7yTRmWqRB5plb1kadXyQ5Sj8V/zYlFGMUBbIPKQScw+Dku9cAAMgJG+z5GYDoMjWGLVOvjghDEFnKQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-selection": { + "version": "3.0.11", + "resolved": "https://registry.npmmirror.com/@types/d3-selection/-/d3-selection-3.0.11.tgz", + "integrity": "sha512-bhAXu23DJWsrI45xafYpkQ4NtcKMwWnAC/vKrd2l+nxMFuvOT3XMYTIj2opv8vq8AO5Yh7Qac/nSeP/3zjTK0w==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-shape": { + "version": "3.1.7", + "resolved": "https://registry.npmmirror.com/@types/d3-shape/-/d3-shape-3.1.7.tgz", + "integrity": 
"sha512-VLvUQ33C+3J+8p+Daf+nYSOsjB4GXp19/S/aGo60m9h1v6XaxjiT82lKVWJCfzhtuZ3yD7i/TPeC/fuKLLOSmg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-path": "*" + } + }, + "node_modules/@types/d3-time": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/@types/d3-time/-/d3-time-3.0.4.tgz", + "integrity": "sha512-yuzZug1nkAAaBlBBikKZTgzCeA+k1uy4ZFwWANOfKw5z5LRhV0gNA7gNkKm7HoK+HRN0wX3EkxGk0fpbWhmB7g==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-time-format": { + "version": "4.0.3", + "resolved": "https://registry.npmmirror.com/@types/d3-time-format/-/d3-time-format-4.0.3.tgz", + "integrity": "sha512-5xg9rC+wWL8kdDj153qZcsJ0FWiFt0J5RB6LYUNZjwSnesfblqrI/bJ1wBdJ8OQfncgbJG5+2F+qfqnqyzYxyg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-timer": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/@types/d3-timer/-/d3-timer-3.0.2.tgz", + "integrity": "sha512-Ps3T8E8dZDam6fUyNiMkekK3XUsaUEik+idO9/YjPtfj2qruF8tFBXS7XhtE4iIXBLxhmLjP3SXpLhVf21I9Lw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-transition": { + "version": "3.0.9", + "resolved": "https://registry.npmmirror.com/@types/d3-transition/-/d3-transition-3.0.9.tgz", + "integrity": "sha512-uZS5shfxzO3rGlu0cC3bjmMFKsXv+SmZZcgp0KD22ts4uGXp5EVYGzu/0YdwZeKmddhcAccYtREJKkPfXkZuCg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-selection": "*" + } + }, + "node_modules/@types/d3-voronoi": { + "version": "1.1.12", + "resolved": "https://registry.npmmirror.com/@types/d3-voronoi/-/d3-voronoi-1.1.12.tgz", + "integrity": "sha512-DauBl25PKZZ0WVJr42a6CNvI6efsdzofl9sajqZr2Gf5Gu733WkDdUGiPkUHXiUvYGzNNlFQde2wdZdfQPG+yw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/d3-zoom": { + "version": "3.0.8", + "resolved": "https://registry.npmmirror.com/@types/d3-zoom/-/d3-zoom-3.0.8.tgz", + "integrity": 
"sha512-iqMC4/YlFCSlO8+2Ii1GGGliCAY4XdeG748w5vQUbevlbDu0zSjH/+jojorQVBK/se0j6DUFNPBGSqD3YWYnDw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/d3-interpolate": "*", + "@types/d3-selection": "*" + } + }, + "node_modules/@types/eslint": { + "version": "9.6.1", + "resolved": "https://registry.npmmirror.com/@types/eslint/-/eslint-9.6.1.tgz", + "integrity": "sha512-FXx2pKgId/WyYo2jXw63kk7/+TY7u7AziEJxJAnSFzHlqTAS3Ync6SvgYAN/k4/PQpnnVuzoMuVnByKK2qp0ag==", + "license": "MIT", + "dependencies": { + "@types/estree": "*", + "@types/json-schema": "*" + } + }, + "node_modules/@types/eslint-scope": { + "version": "3.7.7", + "resolved": "https://registry.npmmirror.com/@types/eslint-scope/-/eslint-scope-3.7.7.tgz", + "integrity": "sha512-MzMFlSLBqNF2gcHWO0G1vP/YQyfvrxZ0bF+u7mzUdZ1/xK4A4sru+nraZz5i3iEIk1l1uyicaDVTB4QbbEkAYg==", + "license": "MIT", + "dependencies": { + "@types/eslint": "*", + "@types/estree": "*" + } + }, + "node_modules/@types/estree": { + "version": "1.0.7", + "resolved": "https://registry.npmmirror.com/@types/estree/-/estree-1.0.7.tgz", + "integrity": "sha512-w28IoSUCJpidD/TGviZwwMJckNESJZXFu7NBZ5YJ4mEUnNraUn9Pm8HSZm/jDF1pDWYKspWE7oVphigUPRakIQ==", + "license": "MIT" + }, + "node_modules/@types/express": { + "version": "4.17.21", + "resolved": "https://registry.npmmirror.com/@types/express/-/express-4.17.21.tgz", + "integrity": "sha512-ejlPM315qwLpaQlQDTjPdsUFSc6ZsP4AN6AlWnogPjQ7CVi7PYF3YVz+CY3jE2pwYf7E/7HlDAN0rV2GxTG0HQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/body-parser": "*", + "@types/express-serve-static-core": "^4.17.33", + "@types/qs": "*", + "@types/serve-static": "*" + } + }, + "node_modules/@types/express-serve-static-core": { + "version": "5.0.6", + "resolved": "https://registry.npmmirror.com/@types/express-serve-static-core/-/express-serve-static-core-5.0.6.tgz", + "integrity": "sha512-3xhRnjJPkULekpSzgtoNYYcTWgEZkp4myc+Saevii5JPnHNvHMRlBSHDbs7Bh1iPPoVTERHEZXyhyLbMEsExsA==", + "dev": true, + 
"license": "MIT", + "dependencies": { + "@types/node": "*", + "@types/qs": "*", + "@types/range-parser": "*", + "@types/send": "*" + } + }, + "node_modules/@types/express/node_modules/@types/express-serve-static-core": { + "version": "4.19.6", + "resolved": "https://registry.npmmirror.com/@types/express-serve-static-core/-/express-serve-static-core-4.19.6.tgz", + "integrity": "sha512-N4LZ2xG7DatVqhCZzOGb1Yi5lMbXSZcmdLDe9EzSndPV2HpWYWzRbaerl2n27irrm94EPpprqa8KpskPT085+A==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*", + "@types/qs": "*", + "@types/range-parser": "*", + "@types/send": "*" + } + }, + "node_modules/@types/geojson": { + "version": "7946.0.16", + "resolved": "https://registry.npmmirror.com/@types/geojson/-/geojson-7946.0.16.tgz", + "integrity": "sha512-6C8nqWur3j98U6+lXDfTUWIfgvZU+EumvpHKcYjujKH7woYyLj2sUmff0tRhrqM7BohUw7Pz3ZB1jj2gW9Fvmg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/glob": { + "version": "7.2.0", + "resolved": "https://registry.npmmirror.com/@types/glob/-/glob-7.2.0.tgz", + "integrity": "sha512-ZUxbzKl0IfJILTS6t7ip5fQQM/J3TJYubDm3nMbgubNNYS62eXeUpoLUC8/7fJNiFYHTrGPQn7hspDUzIHX3UA==", + "license": "MIT", + "dependencies": { + "@types/minimatch": "*", + "@types/node": "*" + } + }, + "node_modules/@types/html-minifier-terser": { + "version": "6.1.0", + "resolved": "https://registry.npmmirror.com/@types/html-minifier-terser/-/html-minifier-terser-6.1.0.tgz", + "integrity": "sha512-oh/6byDPnL1zeNXFrDXFLyZjkr1MsBG667IM792caf1L2UPOOMf65NFzjUH/ltyfwjAGfs1rsX1eftK0jC/KIg==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/http-errors": { + "version": "2.0.4", + "resolved": "https://registry.npmmirror.com/@types/http-errors/-/http-errors-2.0.4.tgz", + "integrity": "sha512-D0CFMMtydbJAegzOyHjtiKPLlvnm3iTZyZRSZoLq2mRhDdmLfIWOCYPfQJ4cu2erKghU++QvjcUjp/5h7hESpA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/http-proxy": { + "version": "1.17.16", + "resolved": 
"https://registry.npmmirror.com/@types/http-proxy/-/http-proxy-1.17.16.tgz", + "integrity": "sha512-sdWoUajOB1cd0A8cRRQ1cfyWNbmFKLAqBB89Y8x5iYyG/mkJHc0YUH8pdWBy2omi9qtCpiIgGjuwO0dQST2l5w==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@types/json-schema": { + "version": "7.0.15", + "resolved": "https://registry.npmmirror.com/@types/json-schema/-/json-schema-7.0.15.tgz", + "integrity": "sha512-5+fP8P8MFNC+AyZCDxrB2pkZFPGzqQWUzpSeuuVLvm8VMcorNYavBqoFcxK8bQz4Qsbn4oUEEem4wDLfcysGHA==", + "license": "MIT" + }, + "node_modules/@types/lodash": { + "version": "4.17.16", + "resolved": "https://registry.npmmirror.com/@types/lodash/-/lodash-4.17.16.tgz", + "integrity": "sha512-HX7Em5NYQAXKW+1T+FiuG27NGwzJfCX3s1GjOa7ujxZa52kjJLOr4FUxT+giF6Tgxv1e+/czV/iTtBw27WTU9g==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/mime": { + "version": "1.3.5", + "resolved": "https://registry.npmmirror.com/@types/mime/-/mime-1.3.5.tgz", + "integrity": "sha512-/pyBZWSLD2n0dcHE3hq8s8ZvcETHtEuF+3E7XVt0Ig2nvsVQXdghHVcEkIWjy9A0wKfTn97a/PSDYohKIlnP/w==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/minimatch": { + "version": "5.1.2", + "resolved": "https://registry.npmmirror.com/@types/minimatch/-/minimatch-5.1.2.tgz", + "integrity": "sha512-K0VQKziLUWkVKiRVrx4a40iPaxTUefQmjtkQofBkYRcoaaL/8rhwDWww9qWbrgicNOgnpIsMxyNIUM4+n6dUIA==", + "license": "MIT" + }, + "node_modules/@types/node": { + "version": "16.18.126", + "resolved": "https://registry.npmmirror.com/@types/node/-/node-16.18.126.tgz", + "integrity": "sha512-OTcgaiwfGFBKacvfwuHzzn1KLxH/er8mluiy8/uM3sGXHaRe73RrSIj01jow9t4kJEW633Ov+cOexXeiApTyAw==", + "license": "MIT" + }, + "node_modules/@types/node-forge": { + "version": "1.3.11", + "resolved": "https://registry.npmmirror.com/@types/node-forge/-/node-forge-1.3.11.tgz", + "integrity": "sha512-FQx220y22OKNTqaByeBGqHWYz4cl94tpcxeFdvBo3wjG6XPBuZ0BNgNZRV5J5TFmmcsJ4IzsLkmGRiQbnYsBEQ==", + "dev": true, + "license": 
"MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@types/offscreencanvas": { + "version": "2019.7.3", + "resolved": "https://registry.npmmirror.com/@types/offscreencanvas/-/offscreencanvas-2019.7.3.tgz", + "integrity": "sha512-ieXiYmgSRXUDeOntE1InxjWyvEelZGP63M+cGuquuRLuIKKT1osnkXjxev9B7d1nXSug5vpunx+gNlbVxMlC9A==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/qs": { + "version": "6.9.18", + "resolved": "https://registry.npmmirror.com/@types/qs/-/qs-6.9.18.tgz", + "integrity": "sha512-kK7dgTYDyGqS+e2Q4aK9X3D7q234CIZ1Bv0q/7Z5IwRDoADNU81xXJK/YVyLbLTZCoIwUoDoffFeF+p/eIklAA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/range-parser": { + "version": "1.2.7", + "resolved": "https://registry.npmmirror.com/@types/range-parser/-/range-parser-1.2.7.tgz", + "integrity": "sha512-hKormJbkJqzQGhziax5PItDUTMAM9uE2XXQmM37dyd4hVM+5aVl7oVxMVUiVQn2oCQFN/LKCZdvSM0pFRqbSmQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/requirejs": { + "version": "2.1.37", + "resolved": "https://registry.npmmirror.com/@types/requirejs/-/requirejs-2.1.37.tgz", + "integrity": "sha512-jmFgr3mwN2NSmtRP6IpZ2nfRS7ufSXuDYQ6YyPFArN8x5dARQcD/DXzT0J6NYbvquVT4pg9K9HWdi6e6DZR9iQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/resize-observer-browser": { + "version": "0.1.11", + "resolved": "https://registry.npmmirror.com/@types/resize-observer-browser/-/resize-observer-browser-0.1.11.tgz", + "integrity": "sha512-cNw5iH8JkMkb3QkCoe7DaZiawbDQEUX8t7iuQaRTyLOyQCR2h+ibBD4GJt7p5yhUHrlOeL7ZtbxNHeipqNsBzQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/retry": { + "version": "0.12.0", + "resolved": "https://registry.npmmirror.com/@types/retry/-/retry-0.12.0.tgz", + "integrity": "sha512-wWKOClTTiizcZhXnPY4wikVAwmdYHp8q6DmC+EJUzAMsycb7HB32Kh9RN4+0gExjmPmZSAQjgURXIGATPegAvA==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/send": { + "version": "0.17.4", + "resolved": 
"https://registry.npmmirror.com/@types/send/-/send-0.17.4.tgz", + "integrity": "sha512-x2EM6TJOybec7c52BX0ZspPodMsQUd5L6PRwOunVyVUhXiBSKf3AezDL8Dgvgt5o0UfKNfuA0eMLr2wLT4AiBA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/mime": "^1", + "@types/node": "*" + } + }, + "node_modules/@types/serve-index": { + "version": "1.9.4", + "resolved": "https://registry.npmmirror.com/@types/serve-index/-/serve-index-1.9.4.tgz", + "integrity": "sha512-qLpGZ/c2fhSs5gnYsQxtDEq3Oy8SXPClIXkW5ghvAvsNuVSA8k+gCONcUCS/UjLEYvYps+e8uBtfgXgvhwfNug==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/express": "*" + } + }, + "node_modules/@types/serve-static": { + "version": "1.15.7", + "resolved": "https://registry.npmmirror.com/@types/serve-static/-/serve-static-1.15.7.tgz", + "integrity": "sha512-W8Ym+h8nhuRwaKPaDw34QUkwsGi6Rc4yYqvKFo5rm2FUEhCFbzVWrxXUxuKK8TASjWsysJY0nsmNCGhCOIsrOw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/http-errors": "*", + "@types/node": "*", + "@types/send": "*" + } + }, + "node_modules/@types/sockjs": { + "version": "0.3.36", + "resolved": "https://registry.npmmirror.com/@types/sockjs/-/sockjs-0.3.36.tgz", + "integrity": "sha512-MK9V6NzAS1+Ud7JV9lJLFqW85VbC9dq3LmwZCuBe4wBDgKC0Kj/jd8Xl+nSviU+Qc3+m7umHHyHg//2KSa0a0Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@types/three": { + "version": "0.131.1", + "resolved": "https://registry.npmmirror.com/@types/three/-/three-0.131.1.tgz", + "integrity": "sha512-unnjsolcm7R90e4XK9qMq4JYEzly0XQNa0pG8RAOMZeVzj3FLIFPymAYUx4Osz0gY9jFZz8omIQplqiieEE7gw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/trusted-types": { + "version": "2.0.7", + "resolved": "https://registry.npmmirror.com/@types/trusted-types/-/trusted-types-2.0.7.tgz", + "integrity": "sha512-ScaPdn1dQczgbl0QFTeTOmVHFULt394XJgOQNoyVhZ6r2vLnMLJfBPd53SB52T/3G36VI1/g2MZaX0cwDuXsfw==", + "license": "MIT" + }, + "node_modules/@types/ws": 
{ + "version": "8.18.0", + "resolved": "https://registry.npmmirror.com/@types/ws/-/ws-8.18.0.tgz", + "integrity": "sha512-8svvI3hMyvN0kKCJMvTJP/x6Y/EoQbepff882wL+Sn5QsXb3etnamgrJq4isrBxSJj5L2AuXcI0+bgkoAXGUJw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node": "*" + } + }, + "node_modules/@vaadin/a11y-base": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/a11y-base/-/a11y-base-24.6.7.tgz", + "integrity": "sha512-CJYYTWPBEEaVt4AvBE8RzEn3hqUZbGUGLzqs6NGBFTw0c5cfkqoO2ZMkKhz5Z52QF+2mCXpEtyg6s+t0h171Qg==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/button": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/button/-/button-24.6.5.tgz", + "integrity": "sha512-i+pgR0Gn6EWxLgWEQOi7yXXQSQklsr7a+yotlet1GOB+DymE+w9RVp4WOZ6T8yaqTICKcDQldFkreTzFVxsHAQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/checkbox": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/checkbox/-/checkbox-24.6.7.tgz", + "integrity": "sha512-/Vl5codokNdN5ku1l/iAkdjUmYTUZGKyAleHjM7V3ZFpwkK2IoWN4HrbWyhPuf1gL3T85bKMLSPuYoOX/ymrFw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/field-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, 
+ "node_modules/@vaadin/combo-box": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/combo-box/-/combo-box-24.6.5.tgz", + "integrity": "sha512-u/xC9QegwWgmw9TutPRoIzeBpUgG6Kt9CmJbNZNeWBrP9Nicz/QAawApynvjWQtmm7zIKXp7SPzW1Gqwpe09mQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/field-base": "~24.6.5", + "@vaadin/input-container": "~24.6.5", + "@vaadin/item": "~24.6.5", + "@vaadin/lit-renderer": "~24.6.5", + "@vaadin/overlay": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/component-base": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/component-base/-/component-base-24.6.7.tgz", + "integrity": "sha512-LcZQZEwouPDHBoXfXRREb1mRScsPSPeKTUZdgrXh180Piy57VzpNzslIMrdfVFSye9lLMs2/g2o8HCUDgnY/OQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/vaadin-development-mode-detector": "^2.0.0", + "@vaadin/vaadin-usage-statistics": "^2.1.0", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/details": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/details/-/details-24.6.5.tgz", + "integrity": "sha512-V22OCdRnT7qOVsVpedGfrwDPE9dFWdhFDv66RfkiWGHpPoq0+dYUpP2Y5Iy7YRCxqVnogVBiE8qHPgZAO4U18A==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/button": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/field-base": { + 
"version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/field-base/-/field-base-24.6.7.tgz", + "integrity": "sha512-5MXpAQGZA15/hRdnZrJK5q5Mv8rgOraSyBpC/gjRJ1W1IQ5DrCcb3ltvPATguv0K3vpJwunXGXrGqm/+SGEk0w==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/grid": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/grid/-/grid-24.6.5.tgz", + "integrity": "sha512-BlZO8+oWTmrnCbZESa73IbMuXfxQu7Viotd88NXY/ixq/8LiQqj2yNHtKTPz2l2QL1ke57ckFsjzN6w52nYc5g==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/checkbox": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/lit-renderer": "~24.6.5", + "@vaadin/text-field": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/icon": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/icon/-/icon-24.6.5.tgz", + "integrity": "sha512-y6Jy69nySb3tZqEIYAYpyGTiNkKS//ro+w6tuD0a0gu+GrfTv90XDNEY9FvGvnUHsM44OoiQRH3kD15kmISkxQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/icons": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/icons/-/icons-24.6.5.tgz", + "integrity": "sha512-zd8KKkJ18EI70IQGoCz3hcQed+VFPnqECKci8vt+OJi1n5j7qzPW4sbEOLZxr6cWrnN1eNdSHfJCQWXrFfL0bQ==", + "license": "Apache-2.0", + "dependencies": { + "@polymer/polymer": "^3.0.0", 
+ "@vaadin/icon": "~24.6.5" + } + }, + "node_modules/@vaadin/input-container": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/input-container/-/input-container-24.6.7.tgz", + "integrity": "sha512-376ZyD74jrKvjiM+gE0xNScyZPU7REMBbGXpmM4DpoLYgw60m01D3fliZaOTVDyXc3gvxWIai3L1vCY0KYpD6w==", + "license": "Apache-2.0", + "dependencies": { + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/item": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/item/-/item-24.6.7.tgz", + "integrity": "sha512-9xpJEVhgHF3YQGVeet2uakMTH7SyEbQx+uT5Kld/r1CiCYOKUxbERXrFuJ/5/lgakXjDvN1d7rYDcjPb3CUfsQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/list-box": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/list-box/-/list-box-24.6.7.tgz", + "integrity": "sha512-yUBHonI6uD28l2h+CUh2KPzXe+Ptv6UWtNJIIevX/xkQhptquXzE01bVXlh1NcLVppnu21gaxFs/l+/rHlAKpw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/item": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/lit-renderer": { + "version": "24.6.7", + "resolved": 
"https://registry.npmmirror.com/@vaadin/lit-renderer/-/lit-renderer-24.6.7.tgz", + "integrity": "sha512-S9daJnGW/X+HBhOriENRYNf8hCFYABmea756onaLS0QoWLkaU3QVPKrhHjZtzNVf/15UcIeAx4C5JlIas2osFA==", + "license": "Apache-2.0", + "dependencies": { + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/notification": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/notification/-/notification-24.6.5.tgz", + "integrity": "sha512-9OgYmZn3qU3pVMaoIRITNs6gymrnswYO7bk9+8e97o3W4A9TIcAO6F2HTgLO5ieMuuOI1DSlVCpXbrM3xBe8pw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.5", + "@vaadin/lit-renderer": "~24.6.5", + "@vaadin/overlay": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/overlay": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/overlay/-/overlay-24.6.7.tgz", + "integrity": "sha512-3HZ2+Ld/ktOzFt3Ug3EoZeMqX//uKh9rsXd1d3lQl18bwVtSvG81lY7NI6tEQ2dSuniM0yy2tM+mVnV4lZq9Gw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/popover": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/popover/-/popover-24.6.7.tgz", + "integrity": "sha512-GqdDsi+x6+6YNBNPC+BvrshrwXlcmL+nR8v5sY+l1TMPVKNWFb2579Qzc9vvu7jMOr2rQd3F+ZjPoMAqgwuZHw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/lit-renderer": "~24.6.7", + 
"@vaadin/overlay": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/progress-bar": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/progress-bar/-/progress-bar-24.6.5.tgz", + "integrity": "sha512-lJPRV1SAP0Z46pcgQ9RiV8ZVqytDpIDZ7oMJW7WsjS70CAlrqJZF0JoJ3WoqUrHasNhxU7jjx+iXVXw7CzRrDg==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/scroller": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/scroller/-/scroller-24.6.7.tgz", + "integrity": "sha512-JLqrJCVcfo3GELWd8xNLGif+xz4WpiodPn4uW5/kI3lqLKYg7RKhEu9dg1zRpSEUou5SVFQCMB9m+D1AwyoQGQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.7", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-material-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/select": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/select/-/select-24.6.5.tgz", + "integrity": "sha512-dDVv4d4QLs7EZEJuOkBI/wjmR7mZ5TyUacCKmscq+Ke7DQrq46DuCUjj82+OSFC7z2m3+v5wflfVMciQehR1+Q==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.2.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/button": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/field-base": "~24.6.5", + "@vaadin/input-container": "~24.6.5", + "@vaadin/item": "~24.6.5", + "@vaadin/list-box": 
"~24.6.5", + "@vaadin/lit-renderer": "~24.6.5", + "@vaadin/overlay": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/tabs": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/tabs/-/tabs-24.6.5.tgz", + "integrity": "sha512-svUqDjwzlnKsAOYB0szST4Tjhspnb007bMf16fhmkM12u3KK053hEZ2TYX7lNVFLC3RiDvGa8i6nCAK2SVXCDQ==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/item": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/tabsheet": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/tabsheet/-/tabsheet-24.6.5.tgz", + "integrity": "sha512-dn4RFFdK+7Hu6Hhq/V0jb1pwwcLxipgMjAmYGsot4vapqFKSdqea1WpVo6TvVkGXCg3TIrYq5SRbzrIzh9FEzg==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.5", + "@vaadin/scroller": "~24.6.5", + "@vaadin/tabs": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/text-field": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/text-field/-/text-field-24.6.5.tgz", + "integrity": "sha512-zujt5k6i6pkVbfUiQlYWBGa/MUAmWeq0xhDLgHIapzUlEIq6gf67KFwEfhfmwdVzGQImFTTKUBWhO4DERRF0Nw==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/field-base": 
"~24.6.5", + "@vaadin/input-container": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/tooltip": { + "version": "24.6.5", + "resolved": "https://registry.npmmirror.com/@vaadin/tooltip/-/tooltip-24.6.5.tgz", + "integrity": "sha512-IPcMN61PO+u9IgHyM3GCqrzSUQUo13Tysvp58Z7OvtZg/IgQpcEtWkC2m+Qg9rwJAZu/x37Qfd/8on0TQWzlMg==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/a11y-base": "~24.6.5", + "@vaadin/component-base": "~24.6.5", + "@vaadin/overlay": "~24.6.5", + "@vaadin/popover": "~24.6.5", + "@vaadin/vaadin-lumo-styles": "~24.6.5", + "@vaadin/vaadin-material-styles": "~24.6.5", + "@vaadin/vaadin-themable-mixin": "~24.6.5", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/vaadin-development-mode-detector": { + "version": "2.0.7", + "resolved": "https://registry.npmmirror.com/@vaadin/vaadin-development-mode-detector/-/vaadin-development-mode-detector-2.0.7.tgz", + "integrity": "sha512-9FhVhr0ynSR3X2ao+vaIEttcNU5XfzCbxtmYOV8uIRnUCtNgbvMOIcyGBvntsX9I5kvIP2dV3cFAOG9SILJzEA==", + "license": "Apache-2.0" + }, + "node_modules/@vaadin/vaadin-lumo-styles": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/vaadin-lumo-styles/-/vaadin-lumo-styles-24.6.7.tgz", + "integrity": "sha512-DNamU8cVxbaVn3HfRm3pN8ul95xvaem92ByVeEQwdvKaHwLI4m7AdSWKEA+13ST9TdBtCeDW6DjmtGcoEqbqiw==", + "license": "Apache-2.0", + "dependencies": { + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.7", + "@vaadin/icon": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7" + } + }, + "node_modules/@vaadin/vaadin-lumo-styles/node_modules/@vaadin/icon": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/icon/-/icon-24.6.7.tgz", + "integrity": 
"sha512-+Cv3hLyFSXJAhnuGuPQ+hQcv9/ijZpIprJ6rqWeChvFk+bQOoPgUPx/tj67mOiTcrmV5hYt+dYs4QM7JZ//dGg==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-lumo-styles": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/vaadin-material-styles": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/vaadin-material-styles/-/vaadin-material-styles-24.6.7.tgz", + "integrity": "sha512-7ecHOEZrFEbUz5UVSGapOt/uC7lSYV05RADCNhG16c+WsuN+oxkGIIaThMMCdBcclg5ej/BeTxZlZha8JoNO3g==", + "license": "Apache-2.0", + "dependencies": { + "@polymer/polymer": "^3.0.0", + "@vaadin/component-base": "~24.6.7", + "@vaadin/vaadin-themable-mixin": "~24.6.7" + } + }, + "node_modules/@vaadin/vaadin-themable-mixin": { + "version": "24.6.7", + "resolved": "https://registry.npmmirror.com/@vaadin/vaadin-themable-mixin/-/vaadin-themable-mixin-24.6.7.tgz", + "integrity": "sha512-fiVBvJWInNBq/oXeE0UAQmzadQ7UJE3ns768D1taKOwTMOxiio1UMoUXcVGwni9ASzXrd96S7F6c4aIaVqNx6A==", + "license": "Apache-2.0", + "dependencies": { + "@open-wc/dedupe-mixin": "^1.3.0", + "lit": "^3.0.0" + } + }, + "node_modules/@vaadin/vaadin-usage-statistics": { + "version": "2.1.3", + "resolved": "https://registry.npmmirror.com/@vaadin/vaadin-usage-statistics/-/vaadin-usage-statistics-2.1.3.tgz", + "integrity": "sha512-8r4TNknD7OJQADe3VygeofFR7UNAXZ2/jjBFP5dgI8+2uMfnuGYgbuHivasKr9WSQ64sPej6m8rDoM1uSllXjQ==", + "hasInstallScript": true, + "license": "Apache-2.0", + "dependencies": { + "@vaadin/vaadin-development-mode-detector": "^2.0.0" + }, + "engines": { + "node": "^12.20.0 || ^14.13.1 || >=16.0.0" + } + }, + "node_modules/@webassemblyjs/ast": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/ast/-/ast-1.14.1.tgz", + "integrity": 
"sha512-nuBEDgQfm1ccRp/8bCQrx1frohyufl4JlbMMZ4P1wpeOfDhF6FQkxZJ1b/e+PLwr6X1Nhw6OLme5usuBWYBvuQ==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/helper-numbers": "1.13.2", + "@webassemblyjs/helper-wasm-bytecode": "1.13.2" + } + }, + "node_modules/@webassemblyjs/floating-point-hex-parser": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/floating-point-hex-parser/-/floating-point-hex-parser-1.13.2.tgz", + "integrity": "sha512-6oXyTOzbKxGH4steLbLNOu71Oj+C8Lg34n6CqRvqfS2O71BxY6ByfMDRhBytzknj9yGUPVJ1qIKhRlAwO1AovA==", + "license": "MIT" + }, + "node_modules/@webassemblyjs/helper-api-error": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/helper-api-error/-/helper-api-error-1.13.2.tgz", + "integrity": "sha512-U56GMYxy4ZQCbDZd6JuvvNV/WFildOjsaWD3Tzzvmw/mas3cXzRJPMjP83JqEsgSbyrmaGjBfDtV7KDXV9UzFQ==", + "license": "MIT" + }, + "node_modules/@webassemblyjs/helper-buffer": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/helper-buffer/-/helper-buffer-1.14.1.tgz", + "integrity": "sha512-jyH7wtcHiKssDtFPRB+iQdxlDf96m0E39yb0k5uJVhFGleZFoNw1c4aeIcVUPPbXUVJ94wwnMOAqUHyzoEPVMA==", + "license": "MIT" + }, + "node_modules/@webassemblyjs/helper-numbers": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/helper-numbers/-/helper-numbers-1.13.2.tgz", + "integrity": "sha512-FE8aCmS5Q6eQYcV3gI35O4J789wlQA+7JrqTTpJqn5emA4U2hvwJmvFRC0HODS+3Ye6WioDklgd6scJ3+PLnEA==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/floating-point-hex-parser": "1.13.2", + "@webassemblyjs/helper-api-error": "1.13.2", + "@xtuc/long": "4.2.2" + } + }, + "node_modules/@webassemblyjs/helper-wasm-bytecode": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/helper-wasm-bytecode/-/helper-wasm-bytecode-1.13.2.tgz", + "integrity": 
"sha512-3QbLKy93F0EAIXLh0ogEVR6rOubA9AoZ+WRYhNbFyuB70j3dRdwH9g+qXhLAO0kiYGlg3TxDV+I4rQTr/YNXkA==", + "license": "MIT" + }, + "node_modules/@webassemblyjs/helper-wasm-section": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/helper-wasm-section/-/helper-wasm-section-1.14.1.tgz", + "integrity": "sha512-ds5mXEqTJ6oxRoqjhWDU83OgzAYjwsCV8Lo/N+oRsNDmx/ZDpqalmrtgOMkHwxsG0iI//3BwWAErYRHtgn0dZw==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + "@webassemblyjs/helper-buffer": "1.14.1", + "@webassemblyjs/helper-wasm-bytecode": "1.13.2", + "@webassemblyjs/wasm-gen": "1.14.1" + } + }, + "node_modules/@webassemblyjs/ieee754": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/ieee754/-/ieee754-1.13.2.tgz", + "integrity": "sha512-4LtOzh58S/5lX4ITKxnAK2USuNEvpdVV9AlgGQb8rJDHaLeHciwG4zlGr0j/SNWlr7x3vO1lDEsuePvtcDNCkw==", + "license": "MIT", + "dependencies": { + "@xtuc/ieee754": "^1.2.0" + } + }, + "node_modules/@webassemblyjs/leb128": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/leb128/-/leb128-1.13.2.tgz", + "integrity": "sha512-Lde1oNoIdzVzdkNEAWZ1dZ5orIbff80YPdHx20mrHwHrVNNTjNr8E3xz9BdpcGqRQbAEa+fkrCb+fRFTl/6sQw==", + "license": "Apache-2.0", + "dependencies": { + "@xtuc/long": "4.2.2" + } + }, + "node_modules/@webassemblyjs/utf8": { + "version": "1.13.2", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/utf8/-/utf8-1.13.2.tgz", + "integrity": "sha512-3NQWGjKTASY1xV5m7Hr0iPeXD9+RDobLll3T9d2AO+g3my8xy5peVyjSag4I50mR1bBSN/Ct12lo+R9tJk0NZQ==", + "license": "MIT" + }, + "node_modules/@webassemblyjs/wasm-edit": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/wasm-edit/-/wasm-edit-1.14.1.tgz", + "integrity": "sha512-RNJUIQH/J8iA/1NzlE4N7KtyZNHi3w7at7hDjvRNm5rcUXa00z1vRz3glZoULfJ5mpvYhLybmVcwcjGrC1pRrQ==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + 
"@webassemblyjs/helper-buffer": "1.14.1", + "@webassemblyjs/helper-wasm-bytecode": "1.13.2", + "@webassemblyjs/helper-wasm-section": "1.14.1", + "@webassemblyjs/wasm-gen": "1.14.1", + "@webassemblyjs/wasm-opt": "1.14.1", + "@webassemblyjs/wasm-parser": "1.14.1", + "@webassemblyjs/wast-printer": "1.14.1" + } + }, + "node_modules/@webassemblyjs/wasm-gen": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/wasm-gen/-/wasm-gen-1.14.1.tgz", + "integrity": "sha512-AmomSIjP8ZbfGQhumkNvgC33AY7qtMCXnN6bL2u2Js4gVCg8fp735aEiMSBbDR7UQIj90n4wKAFUSEd0QN2Ukg==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + "@webassemblyjs/helper-wasm-bytecode": "1.13.2", + "@webassemblyjs/ieee754": "1.13.2", + "@webassemblyjs/leb128": "1.13.2", + "@webassemblyjs/utf8": "1.13.2" + } + }, + "node_modules/@webassemblyjs/wasm-opt": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/wasm-opt/-/wasm-opt-1.14.1.tgz", + "integrity": "sha512-PTcKLUNvBqnY2U6E5bdOQcSM+oVP/PmrDY9NzowJjislEjwP/C4an2303MCVS2Mg9d3AJpIGdUFIQQWbPds0Sw==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + "@webassemblyjs/helper-buffer": "1.14.1", + "@webassemblyjs/wasm-gen": "1.14.1", + "@webassemblyjs/wasm-parser": "1.14.1" + } + }, + "node_modules/@webassemblyjs/wasm-parser": { + "version": "1.14.1", + "resolved": "https://registry.npmmirror.com/@webassemblyjs/wasm-parser/-/wasm-parser-1.14.1.tgz", + "integrity": "sha512-JLBl+KZ0R5qB7mCnud/yyX08jWFw5MsoalJ1pQ4EdFlgj9VdXKGuENGsiCIjegI1W7p91rUlcB/LB5yRJKNTcQ==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + "@webassemblyjs/helper-api-error": "1.13.2", + "@webassemblyjs/helper-wasm-bytecode": "1.13.2", + "@webassemblyjs/ieee754": "1.13.2", + "@webassemblyjs/leb128": "1.13.2", + "@webassemblyjs/utf8": "1.13.2" + } + }, + "node_modules/@webassemblyjs/wast-printer": { + "version": "1.14.1", + "resolved": 
"https://registry.npmmirror.com/@webassemblyjs/wast-printer/-/wast-printer-1.14.1.tgz", + "integrity": "sha512-kPSSXE6De1XOR820C90RIo2ogvZG+c3KiHzqUoO/F34Y2shGzesfqv7o57xrxovZJH/MetF5UjroJ/R/3isoiw==", + "license": "MIT", + "dependencies": { + "@webassemblyjs/ast": "1.14.1", + "@xtuc/long": "4.2.2" + } + }, + "node_modules/@webcomponents/shadycss": { + "version": "1.11.2", + "resolved": "https://registry.npmmirror.com/@webcomponents/shadycss/-/shadycss-1.11.2.tgz", + "integrity": "sha512-vRq+GniJAYSBmTRnhCYPAPq6THYqovJ/gzGThWbgEZUQaBccndGTi1hdiUP15HzEco0I6t4RCtXyX0rsSmwgPw==", + "license": "BSD-3-Clause" + }, + "node_modules/@webpack-cli/configtest": { + "version": "2.1.1", + "resolved": "https://registry.npmmirror.com/@webpack-cli/configtest/-/configtest-2.1.1.tgz", + "integrity": "sha512-wy0mglZpDSiSS0XHrVR+BAdId2+yxPSoJW8fsna3ZpYSlufjvxnP4YbKTCBZnNIcGN4r6ZPXV55X4mYExOfLmw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.15.0" + }, + "peerDependencies": { + "webpack": "5.x.x", + "webpack-cli": "5.x.x" + } + }, + "node_modules/@webpack-cli/info": { + "version": "2.0.2", + "resolved": "https://registry.npmmirror.com/@webpack-cli/info/-/info-2.0.2.tgz", + "integrity": "sha512-zLHQdI/Qs1UyT5UBdWNqsARasIA+AaF8t+4u2aS2nEpBQh2mWIVb8qAklq0eUENnC5mOItrIB4LiS9xMtph18A==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.15.0" + }, + "peerDependencies": { + "webpack": "5.x.x", + "webpack-cli": "5.x.x" + } + }, + "node_modules/@webpack-cli/serve": { + "version": "2.0.5", + "resolved": "https://registry.npmmirror.com/@webpack-cli/serve/-/serve-2.0.5.tgz", + "integrity": "sha512-lqaoKnRYBdo1UgDX8uF24AfGMifWK19TxPmM5FHc2vAGxrJ/qtyUyFBWoY1tISZdelsQ5fBcOusifo5o5wSJxQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.15.0" + }, + "peerDependencies": { + "webpack": "5.x.x", + "webpack-cli": "5.x.x" + }, + "peerDependenciesMeta": { + "webpack-dev-server": { + "optional": true + } + } + }, + "node_modules/@xtuc/ieee754": { + 
"version": "1.2.0", + "resolved": "https://registry.npmmirror.com/@xtuc/ieee754/-/ieee754-1.2.0.tgz", + "integrity": "sha512-DX8nKgqcGwsc0eJSqYt5lwP4DH5FlHnmuWWBRy7X0NcaGR0ZtuyeESgMwTYVEtxmsNGY+qit4QYT/MIYTOTPeA==", + "license": "BSD-3-Clause" + }, + "node_modules/@xtuc/long": { + "version": "4.2.2", + "resolved": "https://registry.npmmirror.com/@xtuc/long/-/long-4.2.2.tgz", + "integrity": "sha512-NuHqBY1PB/D8xU6s/thBgOAiAP7HOYDQ32+BFZILJ8ivkUkAHQnWfn6WhL79Owj1qmUnoN/YPhktdIoucipkAQ==", + "license": "Apache-2.0" + }, + "node_modules/accepts": { + "version": "1.3.8", + "resolved": "https://registry.npmmirror.com/accepts/-/accepts-1.3.8.tgz", + "integrity": "sha512-PYAthTa2m2VKxuvSD3DPC/Gy+U+sOA1LAuT8mkmRuvw+NACSaeXEQ+NHcVF7rONl6qcaxV3Uuemwawk+7+SJLw==", + "dev": true, + "license": "MIT", + "dependencies": { + "mime-types": "~2.1.34", + "negotiator": "0.6.3" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/accepts/node_modules/negotiator": { + "version": "0.6.3", + "resolved": "https://registry.npmmirror.com/negotiator/-/negotiator-0.6.3.tgz", + "integrity": "sha512-+EUsqGPLsM+j/zdChZjsnX51g4XrHFOIXwfnCVPGlQk/k5giakcKsuxCObBRu6DSm9opw/O6slWbJdghQM4bBg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/acorn": { + "version": "8.14.1", + "resolved": "https://registry.npmmirror.com/acorn/-/acorn-8.14.1.tgz", + "integrity": "sha512-OvQ/2pUDKmgfCg++xsTX1wGxfTaszcHVcTctW4UJB4hibJx2HXxxO5UmVgyjMa+ZDsiaf5wWLXYpRWMmBI0QHg==", + "license": "MIT", + "bin": { + "acorn": "bin/acorn" + }, + "engines": { + "node": ">=0.4.0" + } + }, + "node_modules/ajv": { + "version": "8.17.1", + "resolved": "https://registry.npmmirror.com/ajv/-/ajv-8.17.1.tgz", + "integrity": "sha512-B/gBuNg5SiMTrPkC+A2+cW0RszwxYmn6VYxB/inlBStS5nx6xHIt/ehKRhIMhqusl7a8LjQoZnjCs5vhwxOQ1g==", + "license": "MIT", + "dependencies": { + "fast-deep-equal": "^3.1.3", + "fast-uri": "^3.0.1", + "json-schema-traverse": "^1.0.0", + "require-from-string": "^2.0.2" 
+ }, + "funding": { + "type": "github", + "url": "https://github.com/sponsors/epoberezkin" + } + }, + "node_modules/ajv-formats": { + "version": "2.1.1", + "resolved": "https://registry.npmmirror.com/ajv-formats/-/ajv-formats-2.1.1.tgz", + "integrity": "sha512-Wx0Kx52hxE7C18hkMEggYlEifqWZtYaRgouJor+WMdPnQyEK13vgEWyVNup7SoeeoLMsr4kf5h6dOW11I15MUA==", + "license": "MIT", + "dependencies": { + "ajv": "^8.0.0" + }, + "peerDependencies": { + "ajv": "^8.0.0" + }, + "peerDependenciesMeta": { + "ajv": { + "optional": true + } + } + }, + "node_modules/ajv-keywords": { + "version": "5.1.0", + "resolved": "https://registry.npmmirror.com/ajv-keywords/-/ajv-keywords-5.1.0.tgz", + "integrity": "sha512-YCS/JNFAUyr5vAuhk1DWm1CBxRHW9LbJ2ozWeemrIqpbsqKjHVxYPyi5GC0rjZIT5JxJ3virVTS8wk4i/Z+krw==", + "license": "MIT", + "dependencies": { + "fast-deep-equal": "^3.1.3" + }, + "peerDependencies": { + "ajv": "^8.8.2" + } + }, + "node_modules/ansi-html-community": { + "version": "0.0.8", + "resolved": "https://registry.npmmirror.com/ansi-html-community/-/ansi-html-community-0.0.8.tgz", + "integrity": "sha512-1APHAyr3+PCamwNw3bXCPp4HFLONZt/yIH0sZp0/469KWNTEy+qN5jQ3GVX6DMZ1UXAi34yVwtTeaG/HpBuuzw==", + "dev": true, + "engines": [ + "node >= 0.8.0" + ], + "license": "Apache-2.0", + "bin": { + "ansi-html": "bin/ansi-html" + } + }, + "node_modules/ansi-regex": { + "version": "5.0.1", + "resolved": "https://registry.npmmirror.com/ansi-regex/-/ansi-regex-5.0.1.tgz", + "integrity": "sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/ansi-styles": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/ansi-styles/-/ansi-styles-4.3.0.tgz", + "integrity": "sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg==", + "dev": true, + "license": "MIT", + "dependencies": { + "color-convert": "^2.0.1" + }, + "engines": { 
+ "node": ">=8" + }, + "funding": { + "url": "https://github.com/chalk/ansi-styles?sponsor=1" + } + }, + "node_modules/anymatch": { + "version": "3.1.3", + "resolved": "https://registry.npmmirror.com/anymatch/-/anymatch-3.1.3.tgz", + "integrity": "sha512-KMReFUr0B4t+D+OBkjR3KYqvocp2XaSzO55UcB6mgQMd3KbcE+mWTyvVV7D/zsdEbNnV6acZUutkiHQXvTr1Rw==", + "dev": true, + "license": "ISC", + "dependencies": { + "normalize-path": "^3.0.0", + "picomatch": "^2.0.4" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/array-flatten": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/array-flatten/-/array-flatten-1.1.1.tgz", + "integrity": "sha512-PCVAQswWemu6UdxsDFFX/+gVeYqKAod3D3UVm91jHwynguOwAvYPhx8nNlM++NqRcK6CxxpUafjmhIdKiHibqg==", + "dev": true, + "license": "MIT" + }, + "node_modules/array-union": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/array-union/-/array-union-1.0.2.tgz", + "integrity": "sha512-Dxr6QJj/RdU/hCaBjOfxW+q6lyuVE6JFWIrAUpuOOhoJJoQ99cUn3igRaHVB5P9WrgFVN0FfArM3x0cueOU8ng==", + "license": "MIT", + "dependencies": { + "array-uniq": "^1.0.1" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/array-uniq": { + "version": "1.0.3", + "resolved": "https://registry.npmmirror.com/array-uniq/-/array-uniq-1.0.3.tgz", + "integrity": "sha512-MNha4BWQ6JbwhFhj03YK552f7cb3AzoE8SzeljgChvL1dl3IcvggXVz1DilzySZkCja+CXuZbdW7yATchWn8/Q==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/balanced-match": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/balanced-match/-/balanced-match-1.0.2.tgz", + "integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==", + "license": "MIT" + }, + "node_modules/batch": { + "version": "0.6.1", + "resolved": "https://registry.npmmirror.com/batch/-/batch-0.6.1.tgz", + "integrity": "sha512-x+VAiMRL6UPkx+kudNvxTl6hB2XNNCG2r+7wixVfIYwu/2HKRXimwQyaumLjMveWvT2Hkd/cAJw+QBMfJ/EKVw==", + 
"dev": true, + "license": "MIT" + }, + "node_modules/binary-extensions": { + "version": "2.3.0", + "resolved": "https://registry.npmmirror.com/binary-extensions/-/binary-extensions-2.3.0.tgz", + "integrity": "sha512-Ceh+7ox5qe7LJuLHoY0feh3pHuUDHAcRUeyL2VYghZwfpkNIy/+8Ocg0a3UuSoYzavmylwuLWQOf3hl0jjMMIw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/body-parser": { + "version": "1.20.3", + "resolved": "https://registry.npmmirror.com/body-parser/-/body-parser-1.20.3.tgz", + "integrity": "sha512-7rAxByjUMqQ3/bHJy7D6OGXvx/MMc4IqBn/X0fcM1QUcAItpZrBEYhWGem+tzXH90c+G01ypMcYJBO9Y30203g==", + "dev": true, + "license": "MIT", + "dependencies": { + "bytes": "3.1.2", + "content-type": "~1.0.5", + "debug": "2.6.9", + "depd": "2.0.0", + "destroy": "1.2.0", + "http-errors": "2.0.0", + "iconv-lite": "0.4.24", + "on-finished": "2.4.1", + "qs": "6.13.0", + "raw-body": "2.5.2", + "type-is": "~1.6.18", + "unpipe": "1.0.0" + }, + "engines": { + "node": ">= 0.8", + "npm": "1.2.8000 || >= 1.4.16" + } + }, + "node_modules/bonjour-service": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/bonjour-service/-/bonjour-service-1.3.0.tgz", + "integrity": "sha512-3YuAUiSkWykd+2Azjgyxei8OWf8thdn8AITIog2M4UICzoqfjlqr64WIjEXZllf/W6vK1goqleSR6brGomxQqA==", + "dev": true, + "license": "MIT", + "dependencies": { + "fast-deep-equal": "^3.1.3", + "multicast-dns": "^7.2.5" + } + }, + "node_modules/boolbase": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/boolbase/-/boolbase-1.0.0.tgz", + "integrity": "sha512-JZOSA7Mo9sNGB8+UjSgzdLtokWAky1zbztM3WRLCbZ70/3cTANmQmOdR7y2g+J0e2WXywy1yS468tY+IruqEww==", + "dev": true, + "license": "ISC" + }, + "node_modules/brace-expansion": { + "version": "1.1.11", + "resolved": "https://registry.npmmirror.com/brace-expansion/-/brace-expansion-1.1.11.tgz", + "integrity": 
"sha512-iCuPHDFgrHX7H2vEI/5xpz07zSHB00TpugqhmYtVmMO6518mCuRMoOYFldEBl0g187ufozdaHgWKcYFb61qGiA==", + "license": "MIT", + "dependencies": { + "balanced-match": "^1.0.0", + "concat-map": "0.0.1" + } + }, + "node_modules/braces": { + "version": "3.0.3", + "resolved": "https://registry.npmmirror.com/braces/-/braces-3.0.3.tgz", + "integrity": "sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==", + "dev": true, + "license": "MIT", + "dependencies": { + "fill-range": "^7.1.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/browserslist": { + "version": "4.24.4", + "resolved": "https://registry.npmmirror.com/browserslist/-/browserslist-4.24.4.tgz", + "integrity": "sha512-KDi1Ny1gSePi1vm0q4oxSF8b4DR44GF4BbmS2YdhPLOEqd8pDviZOGH/GsmRwoWJ2+5Lr085X7naowMwKHDG1A==", + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/browserslist" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "caniuse-lite": "^1.0.30001688", + "electron-to-chromium": "^1.5.73", + "node-releases": "^2.0.19", + "update-browserslist-db": "^1.1.1" + }, + "bin": { + "browserslist": "cli.js" + }, + "engines": { + "node": "^6 || ^7 || ^8 || ^9 || ^10 || ^11 || ^12 || >=13.7" + } + }, + "node_modules/buffer-from": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/buffer-from/-/buffer-from-1.1.2.tgz", + "integrity": "sha512-E+XQCRwSbaaiChtv6k6Dwgc+bx+Bs6vuKJHHl5kox/BaKbhiXzqQOwK4cO22yElGp2OCmjwVhT3HmxgyPGnJfQ==", + "license": "MIT" + }, + "node_modules/bytes": { + "version": "3.1.2", + "resolved": "https://registry.npmmirror.com/bytes/-/bytes-3.1.2.tgz", + "integrity": "sha512-/Nf7TyzTx6S3yRJObOAV7956r8cr2+Oj8AC5dt8wSP3BQAoeX58NoHyCU8P8zGkNXStjTSi6fzO6F0pBdcYbEg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + 
"node_modules/call-bind-apply-helpers": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/call-bind-apply-helpers/-/call-bind-apply-helpers-1.0.2.tgz", + "integrity": "sha512-Sp1ablJ0ivDkSzjcaJdxEunN5/XvksFJ2sMBFfq6x0ryhQV/2b/KwFe21cMpmHtPOSij8K99/wSfoEuTObmuMQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "function-bind": "^1.1.2" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/call-bound": { + "version": "1.0.4", + "resolved": "https://registry.npmmirror.com/call-bound/-/call-bound-1.0.4.tgz", + "integrity": "sha512-+ys997U96po4Kx/ABpBCqhA9EuxJaQWDQg7295H4hBphv3IZg0boBKuwYpt4YXp6MZ5AmZQnU/tyMTlRpaSejg==", + "dev": true, + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.2", + "get-intrinsic": "^1.3.0" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/camel-case": { + "version": "4.1.2", + "resolved": "https://registry.npmmirror.com/camel-case/-/camel-case-4.1.2.tgz", + "integrity": "sha512-gxGWBrTT1JuMx6R+o5PTXMmUnhnVzLQ9SNutD4YqKtI6ap897t3tKECYla6gCWEkplXnlNybEkZg9GEGxKFCgw==", + "dev": true, + "license": "MIT", + "dependencies": { + "pascal-case": "^3.1.2", + "tslib": "^2.0.3" + } + }, + "node_modules/caniuse-lite": { + "version": "1.0.30001707", + "resolved": "https://registry.npmmirror.com/caniuse-lite/-/caniuse-lite-1.0.30001707.tgz", + "integrity": "sha512-3qtRjw/HQSMlDWf+X79N206fepf4SOOU6SQLMaq/0KkZLmSjPxAkBOQQ+FxbHKfHmYLZFfdWsO3KA90ceHPSnw==", + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/caniuse-lite" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "CC-BY-4.0" + }, + "node_modules/chalk": { + "version": "4.1.2", + "resolved": "https://registry.npmmirror.com/chalk/-/chalk-4.1.2.tgz", + "integrity": 
"sha512-oKnbhFyRIXpUuez8iBMmyEa4nbj4IOQyuhc/wy9kY7/WVPcwIO9VA668Pu8RkO7+0G76SLROeyw9CpQ061i4mA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-styles": "^4.1.0", + "supports-color": "^7.1.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/chalk?sponsor=1" + } + }, + "node_modules/chokidar": { + "version": "3.6.0", + "resolved": "https://registry.npmmirror.com/chokidar/-/chokidar-3.6.0.tgz", + "integrity": "sha512-7VT13fmjotKpGipCW9JEQAusEPE+Ei8nl6/g4FBAmIm0GOOLMua9NDDo/DWp0ZAxCr3cPq5ZpBqmPAQgDda2Pw==", + "dev": true, + "license": "MIT", + "dependencies": { + "anymatch": "~3.1.2", + "braces": "~3.0.2", + "glob-parent": "~5.1.2", + "is-binary-path": "~2.1.0", + "is-glob": "~4.0.1", + "normalize-path": "~3.0.0", + "readdirp": "~3.6.0" + }, + "engines": { + "node": ">= 8.10.0" + }, + "funding": { + "url": "https://paulmillr.com/funding/" + }, + "optionalDependencies": { + "fsevents": "~2.3.2" + } + }, + "node_modules/chrome-trace-event": { + "version": "1.0.4", + "resolved": "https://registry.npmmirror.com/chrome-trace-event/-/chrome-trace-event-1.0.4.tgz", + "integrity": "sha512-rNjApaLzuwaOTjCiT8lSDdGN1APCiqkChLMJxJPWLunPAt5fy8xgU9/jNOchV84wfIxrA0lRQB7oCT8jrn/wrQ==", + "license": "MIT", + "engines": { + "node": ">=6.0" + } + }, + "node_modules/clean-css": { + "version": "5.3.3", + "resolved": "https://registry.npmmirror.com/clean-css/-/clean-css-5.3.3.tgz", + "integrity": "sha512-D5J+kHaVb/wKSFcyyV75uCn8fiY4sV38XJoe4CUyGQ+mOU/fMVYUdH1hJC+CJQ5uY3EnW27SbJYS4X8BiLrAFg==", + "dev": true, + "license": "MIT", + "dependencies": { + "source-map": "~0.6.0" + }, + "engines": { + "node": ">= 10.0" + } + }, + "node_modules/clean-webpack-plugin": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/clean-webpack-plugin/-/clean-webpack-plugin-4.0.0.tgz", + "integrity": "sha512-WuWE1nyTNAyW5T7oNyys2EN0cfP2fdRxhxnIQWiAp0bMabPdHhoGxM8A6YL2GhqwgrPnnaemVE7nv5XJ2Fhh2w==", + "license": "MIT", + 
"dependencies": { + "del": "^4.1.1" + }, + "engines": { + "node": ">=10.0.0" + }, + "peerDependencies": { + "webpack": ">=4.0.0 <6.0.0" + } + }, + "node_modules/clone-deep": { + "version": "4.0.1", + "resolved": "https://registry.npmmirror.com/clone-deep/-/clone-deep-4.0.1.tgz", + "integrity": "sha512-neHB9xuzh/wk0dIHweyAXv2aPGZIVk3pLMe+/RNzINf17fe0OG96QroktYAUm7SM1PBnzTabaLboqqxDyMU+SQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-plain-object": "^2.0.4", + "kind-of": "^6.0.2", + "shallow-clone": "^3.0.0" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/color-convert": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/color-convert/-/color-convert-2.0.1.tgz", + "integrity": "sha512-RRECPsj7iu/xb5oKYcsFHSppFNnsj/52OVTRKb4zP5onXwVF3zVmmToNcOfGC+CRDpfK/U584fMg38ZHCaElKQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "color-name": "~1.1.4" + }, + "engines": { + "node": ">=7.0.0" + } + }, + "node_modules/color-name": { + "version": "1.1.4", + "resolved": "https://registry.npmmirror.com/color-name/-/color-name-1.1.4.tgz", + "integrity": "sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA==", + "dev": true, + "license": "MIT" + }, + "node_modules/colorette": { + "version": "2.0.20", + "resolved": "https://registry.npmmirror.com/colorette/-/colorette-2.0.20.tgz", + "integrity": "sha512-IfEDxwoWIjkeXL1eXcDiow4UbKjhLdq6/EuSVR9GMN7KVH3r9gQ83e73hsz1Nd1T3ijd5xv1wcWRYO+D6kCI2w==", + "dev": true, + "license": "MIT" + }, + "node_modules/commander": { + "version": "2.20.3", + "resolved": "https://registry.npmmirror.com/commander/-/commander-2.20.3.tgz", + "integrity": "sha512-GpVkmM8vF2vQUkj2LvZmD35JxeJOLCwJ9cUkugyk2nuhbv3+mJvpLYYt+0+USMxE+oj+ey/lJEnhZw75x/OMcQ==", + "license": "MIT" + }, + "node_modules/compressible": { + "version": "2.0.18", + "resolved": "https://registry.npmmirror.com/compressible/-/compressible-2.0.18.tgz", + "integrity": 
"sha512-AF3r7P5dWxL8MxyITRMlORQNaOA2IkAFaTr4k7BUumjPtRpGDTZpl0Pb1XCO6JeDCBdp126Cgs9sMxqSjgYyRg==", + "dev": true, + "license": "MIT", + "dependencies": { + "mime-db": ">= 1.43.0 < 2" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/compression": { + "version": "1.8.0", + "resolved": "https://registry.npmmirror.com/compression/-/compression-1.8.0.tgz", + "integrity": "sha512-k6WLKfunuqCYD3t6AsuPGvQWaKwuLLh2/xHNcX4qE+vIfDNXpSqnrhwA7O53R7WVQUnt8dVAIW+YHr7xTgOgGA==", + "dev": true, + "license": "MIT", + "dependencies": { + "bytes": "3.1.2", + "compressible": "~2.0.18", + "debug": "2.6.9", + "negotiator": "~0.6.4", + "on-headers": "~1.0.2", + "safe-buffer": "5.2.1", + "vary": "~1.1.2" + }, + "engines": { + "node": ">= 0.8.0" + } + }, + "node_modules/concat-map": { + "version": "0.0.1", + "resolved": "https://registry.npmmirror.com/concat-map/-/concat-map-0.0.1.tgz", + "integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg==", + "license": "MIT" + }, + "node_modules/connect-history-api-fallback": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/connect-history-api-fallback/-/connect-history-api-fallback-2.0.0.tgz", + "integrity": "sha512-U73+6lQFmfiNPrYbXqr6kZ1i1wiRqXnp2nhMsINseWXO8lDau0LGEffJ8kQi4EjLZympVgRdvqjAgiZ1tgzDDA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.8" + } + }, + "node_modules/content-disposition": { + "version": "0.5.4", + "resolved": "https://registry.npmmirror.com/content-disposition/-/content-disposition-0.5.4.tgz", + "integrity": "sha512-FveZTNuGw04cxlAiWbzi6zTAL/lhehaWbTtgluJh4/E95DqMwTmha3KZN1aAWA8cFIhHzMZUvLevkw5Rqk+tSQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "safe-buffer": "5.2.1" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/content-type": { + "version": "1.0.5", + "resolved": "https://registry.npmmirror.com/content-type/-/content-type-1.0.5.tgz", + "integrity": 
"sha512-nTjqfcBFEipKdXCv4YDQWCfmcLZKm81ldF0pAopTvyrFGVbcR6P/VAAd5G7N+0tTr8QqiU0tFadD6FK4NtJwOA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/cookie": { + "version": "0.7.1", + "resolved": "https://registry.npmmirror.com/cookie/-/cookie-0.7.1.tgz", + "integrity": "sha512-6DnInpx7SJ2AK3+CTUE/ZM0vWTUboZCegxhC2xiIydHR9jNuTAASBrfEpHhiGOZw/nX51bHt6YQl8jsGo4y/0w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/cookie-signature": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/cookie-signature/-/cookie-signature-1.0.6.tgz", + "integrity": "sha512-QADzlaHc8icV8I7vbaJXJwod9HWYp8uCqf1xa4OfNu1T7JVxQIrUgOWtHdNDtPiywmFbiS12VjotIXLrKM3orQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/core-util-is": { + "version": "1.0.3", + "resolved": "https://registry.npmmirror.com/core-util-is/-/core-util-is-1.0.3.tgz", + "integrity": "sha512-ZQBvi1DcpJ4GDqanjucZ2Hj3wEO5pZDS89BWbkcrvdxksJorwUDDZamX9ldFkp9aw2lmBDLgkObEA4DWNJ9FYQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/cross-env": { + "version": "7.0.3", + "resolved": "https://registry.npmmirror.com/cross-env/-/cross-env-7.0.3.tgz", + "integrity": "sha512-+/HKd6EgcQCJGh2PSjZuUitQBQynKor4wrFbRg4DtAgS1aWO+gU52xpH7M9ScGgXSYmAVS9bIJ8EzuaGw0oNAw==", + "license": "MIT", + "dependencies": { + "cross-spawn": "^7.0.1" + }, + "bin": { + "cross-env": "src/bin/cross-env.js", + "cross-env-shell": "src/bin/cross-env-shell.js" + }, + "engines": { + "node": ">=10.14", + "npm": ">=6", + "yarn": ">=1" + } + }, + "node_modules/cross-spawn": { + "version": "7.0.6", + "resolved": "https://registry.npmmirror.com/cross-spawn/-/cross-spawn-7.0.6.tgz", + "integrity": "sha512-uV2QOWP2nWzsy2aMp8aRibhi9dlzF5Hgh5SHaB9OiTGEyDTiJJyx0uy51QXdyWbtAHNua4XJzUKca3OzKUd3vA==", + "license": "MIT", + "dependencies": { + "path-key": "^3.1.0", + "shebang-command": "^2.0.0", + "which": "^2.0.1" + }, + "engines": { + "node": ">= 8" + } 
+ }, + "node_modules/css-loader": { + "version": "7.1.2", + "resolved": "https://registry.npmmirror.com/css-loader/-/css-loader-7.1.2.tgz", + "integrity": "sha512-6WvYYn7l/XEGN8Xu2vWFt9nVzrCn39vKyTEFf/ExEyoksJjjSZV/0/35XPlMbpnr6VGhZIUg5yJrL8tGfes/FA==", + "license": "MIT", + "dependencies": { + "icss-utils": "^5.1.0", + "postcss": "^8.4.33", + "postcss-modules-extract-imports": "^3.1.0", + "postcss-modules-local-by-default": "^4.0.5", + "postcss-modules-scope": "^3.2.0", + "postcss-modules-values": "^4.0.0", + "postcss-value-parser": "^4.2.0", + "semver": "^7.5.4" + }, + "engines": { + "node": ">= 18.12.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "@rspack/core": "0.x || 1.x", + "webpack": "^5.27.0" + }, + "peerDependenciesMeta": { + "@rspack/core": { + "optional": true + }, + "webpack": { + "optional": true + } + } + }, + "node_modules/css-select": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/css-select/-/css-select-4.3.0.tgz", + "integrity": "sha512-wPpOYtnsVontu2mODhA19JrqWxNsfdatRKd64kmpRbQgh1KtItko5sTnEpPdpSaJszTOhEMlF/RPz28qj4HqhQ==", + "dev": true, + "license": "BSD-2-Clause", + "dependencies": { + "boolbase": "^1.0.0", + "css-what": "^6.0.1", + "domhandler": "^4.3.1", + "domutils": "^2.8.0", + "nth-check": "^2.0.1" + }, + "funding": { + "url": "https://github.com/sponsors/fb55" + } + }, + "node_modules/css-what": { + "version": "6.1.0", + "resolved": "https://registry.npmmirror.com/css-what/-/css-what-6.1.0.tgz", + "integrity": "sha512-HTUrgRJ7r4dsZKU6GjmpfRK1O76h97Z8MfS1G0FozR+oF2kG6Vfe8JE6zwrkbxigziPHinCJ+gCPjA9EaBDtRw==", + "dev": true, + "license": "BSD-2-Clause", + "engines": { + "node": ">= 6" + }, + "funding": { + "url": "https://github.com/sponsors/fb55" + } + }, + "node_modules/cssesc": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/cssesc/-/cssesc-3.0.0.tgz", + "integrity": 
"sha512-/Tb/JcjK111nNScGob5MNtsntNM1aCNUDipB/TkwZFhyDrrE47SOx/18wF2bbjgc3ZzCSKW1T5nt5EbFoAz/Vg==", + "license": "MIT", + "bin": { + "cssesc": "bin/cssesc" + }, + "engines": { + "node": ">=4" + } + }, + "node_modules/d3": { + "version": "5.7.0", + "resolved": "https://registry.npmmirror.com/d3/-/d3-5.7.0.tgz", + "integrity": "sha512-8KEIfx+dFm8PlbJN9PI0suazrZ41QcaAufsKE9PRcqYPWLngHIyWJZX96n6IQKePGgeSu0l7rtlueSSNq8Zc3g==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-array": "1", + "d3-axis": "1", + "d3-brush": "1", + "d3-chord": "1", + "d3-collection": "1", + "d3-color": "1", + "d3-contour": "1", + "d3-dispatch": "1", + "d3-drag": "1", + "d3-dsv": "1", + "d3-ease": "1", + "d3-fetch": "1", + "d3-force": "1", + "d3-format": "1", + "d3-geo": "1", + "d3-hierarchy": "1", + "d3-interpolate": "1", + "d3-path": "1", + "d3-polygon": "1", + "d3-quadtree": "1", + "d3-random": "1", + "d3-scale": "2", + "d3-scale-chromatic": "1", + "d3-selection": "1", + "d3-shape": "1", + "d3-time": "1", + "d3-time-format": "2", + "d3-timer": "1", + "d3-transition": "1", + "d3-voronoi": "1", + "d3-zoom": "1" + } + }, + "node_modules/d3-array": { + "version": "1.2.4", + "resolved": "https://registry.npmmirror.com/d3-array/-/d3-array-1.2.4.tgz", + "integrity": "sha512-KHW6M86R+FUPYGb3R5XiYjXPq7VzwxZ22buHhAEVG5ztoEcZZMLov530mmccaqA1GghZArjQV46fuc8kUqhhHw==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-axis": { + "version": "1.0.12", + "resolved": "https://registry.npmmirror.com/d3-axis/-/d3-axis-1.0.12.tgz", + "integrity": "sha512-ejINPfPSNdGFKEOAtnBtdkpr24c4d4jsei6Lg98mxf424ivoDP2956/5HDpIAtmHo85lqT4pruy+zEgvRUBqaQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-brush": { + "version": "1.1.6", + "resolved": "https://registry.npmmirror.com/d3-brush/-/d3-brush-1.1.6.tgz", + "integrity": "sha512-7RW+w7HfMCPyZLifTz/UnJmI5kdkXtpCbombUSs8xniAyo0vIbrDzDwUJB6eJOgl9u5DQOt2TQlYumxzD1SvYA==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-dispatch": "1", + "d3-drag": "1", + 
"d3-interpolate": "1", + "d3-selection": "1", + "d3-transition": "1" + } + }, + "node_modules/d3-chord": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/d3-chord/-/d3-chord-1.0.6.tgz", + "integrity": "sha512-JXA2Dro1Fxw9rJe33Uv+Ckr5IrAa74TlfDEhE/jfLOaXegMQFQTAgAw9WnZL8+HxVBRXaRGCkrNU7pJeylRIuA==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-array": "1", + "d3-path": "1" + } + }, + "node_modules/d3-collection": { + "version": "1.0.7", + "resolved": "https://registry.npmmirror.com/d3-collection/-/d3-collection-1.0.7.tgz", + "integrity": "sha512-ii0/r5f4sjKNTfh84Di+DpztYwqKhEyUlKoPrzUFfeSkWxjW49xU2QzO9qrPrNkpdI0XJkfzvmTu8V2Zylln6A==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-color": { + "version": "1.4.1", + "resolved": "https://registry.npmmirror.com/d3-color/-/d3-color-1.4.1.tgz", + "integrity": "sha512-p2sTHSLCJI2QKunbGb7ocOh7DgTAn8IrLx21QRc/BSnodXM4sv6aLQlnfpvehFMLZEfBc6g9pH9SWQccFYfJ9Q==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-contour": { + "version": "1.3.2", + "resolved": "https://registry.npmmirror.com/d3-contour/-/d3-contour-1.3.2.tgz", + "integrity": "sha512-hoPp4K/rJCu0ladiH6zmJUEz6+u3lgR+GSm/QdM2BBvDraU39Vr7YdDCicJcxP1z8i9B/2dJLgDC1NcvlF8WCg==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-array": "^1.1.1" + } + }, + "node_modules/d3-dispatch": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/d3-dispatch/-/d3-dispatch-1.0.6.tgz", + "integrity": "sha512-fVjoElzjhCEy+Hbn8KygnmMS7Or0a9sI2UzGwoB7cCtvI1XpVN9GpoYlnb3xt2YV66oXYb1fLJ8GMvP4hdU1RA==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-drag": { + "version": "1.2.5", + "resolved": "https://registry.npmmirror.com/d3-drag/-/d3-drag-1.2.5.tgz", + "integrity": "sha512-rD1ohlkKQwMZYkQlYVCrSFxsWPzI97+W+PaEIBNTMxRuxz9RF0Hi5nJWHGVJ3Om9d2fRTe1yOBINJyy/ahV95w==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-dispatch": "1", + "d3-selection": "1" + } + }, + "node_modules/d3-dsv": { + "version": "1.2.0", + 
"resolved": "https://registry.npmmirror.com/d3-dsv/-/d3-dsv-1.2.0.tgz", + "integrity": "sha512-9yVlqvZcSOMhCYzniHE7EVUws7Fa1zgw+/EAV2BxJoG3ME19V6BQFBwI855XQDsxyOuG7NibqRMTtiF/Qup46g==", + "license": "BSD-3-Clause", + "dependencies": { + "commander": "2", + "iconv-lite": "0.4", + "rw": "1" + }, + "bin": { + "csv2json": "bin/dsv2json", + "csv2tsv": "bin/dsv2dsv", + "dsv2dsv": "bin/dsv2dsv", + "dsv2json": "bin/dsv2json", + "json2csv": "bin/json2dsv", + "json2dsv": "bin/json2dsv", + "json2tsv": "bin/json2dsv", + "tsv2csv": "bin/dsv2dsv", + "tsv2json": "bin/dsv2json" + } + }, + "node_modules/d3-ease": { + "version": "1.0.7", + "resolved": "https://registry.npmmirror.com/d3-ease/-/d3-ease-1.0.7.tgz", + "integrity": "sha512-lx14ZPYkhNx0s/2HX5sLFUI3mbasHjSSpwO/KaaNACweVwxUruKyWVcb293wMv1RqTPZyZ8kSZ2NogUZNcLOFQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-fetch": { + "version": "1.2.0", + "resolved": "https://registry.npmmirror.com/d3-fetch/-/d3-fetch-1.2.0.tgz", + "integrity": "sha512-yC78NBVcd2zFAyR/HnUiBS7Lf6inSCoWcSxFfw8FYL7ydiqe80SazNwoffcqOfs95XaLo7yebsmQqDKSsXUtvA==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-dsv": "1" + } + }, + "node_modules/d3-force": { + "version": "1.2.1", + "resolved": "https://registry.npmmirror.com/d3-force/-/d3-force-1.2.1.tgz", + "integrity": "sha512-HHvehyaiUlVo5CxBJ0yF/xny4xoaxFxDnBXNvNcfW9adORGZfyNF1dj6DGLKyk4Yh3brP/1h3rnDzdIAwL08zg==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-collection": "1", + "d3-dispatch": "1", + "d3-quadtree": "1", + "d3-timer": "1" + } + }, + "node_modules/d3-format": { + "version": "1.4.5", + "resolved": "https://registry.npmmirror.com/d3-format/-/d3-format-1.4.5.tgz", + "integrity": "sha512-J0piedu6Z8iB6TbIGfZgDzfXxUFN3qQRMofy2oPdXzQibYGqPB/9iMcxr/TGalU+2RsyDO+U4f33id8tbnSRMQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-geo": { + "version": "1.12.1", + "resolved": "https://registry.npmmirror.com/d3-geo/-/d3-geo-1.12.1.tgz", + "integrity": 
"sha512-XG4d1c/UJSEX9NfU02KwBL6BYPj8YKHxgBEw5om2ZnTRSbIcego6dhHwcxuSR3clxh0EpE38os1DVPOmnYtTPg==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-array": "1" + } + }, + "node_modules/d3-hierarchy": { + "version": "1.1.9", + "resolved": "https://registry.npmmirror.com/d3-hierarchy/-/d3-hierarchy-1.1.9.tgz", + "integrity": "sha512-j8tPxlqh1srJHAtxfvOUwKNYJkQuBFdM1+JAUfq6xqH5eAqf93L7oG1NVqDa4CpFZNvnNKtCYEUC8KY9yEn9lQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-interpolate": { + "version": "1.4.0", + "resolved": "https://registry.npmmirror.com/d3-interpolate/-/d3-interpolate-1.4.0.tgz", + "integrity": "sha512-V9znK0zc3jOPV4VD2zZn0sDhZU3WAE2bmlxdIwwQPPzPjvyLkd8B3JUVdS1IDUFDkWZ72c9qnv1GK2ZagTZ8EA==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-color": "1" + } + }, + "node_modules/d3-path": { + "version": "1.0.9", + "resolved": "https://registry.npmmirror.com/d3-path/-/d3-path-1.0.9.tgz", + "integrity": "sha512-VLaYcn81dtHVTjEHd8B+pbe9yHWpXKZUC87PzoFmsFrJqgFwDe/qxfp5MlfsfM1V5E/iVt0MmEbWQ7FVIXh/bg==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-polygon": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/d3-polygon/-/d3-polygon-1.0.6.tgz", + "integrity": "sha512-k+RF7WvI08PC8reEoXa/w2nSg5AUMTi+peBD9cmFc+0ixHfbs4QmxxkarVal1IkVkgxVuk9JSHhJURHiyHKAuQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-quadtree": { + "version": "1.0.7", + "resolved": "https://registry.npmmirror.com/d3-quadtree/-/d3-quadtree-1.0.7.tgz", + "integrity": "sha512-RKPAeXnkC59IDGD0Wu5mANy0Q2V28L+fNe65pOCXVdVuTJS3WPKaJlFHer32Rbh9gIo9qMuJXio8ra4+YmIymA==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-random": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/d3-random/-/d3-random-1.1.2.tgz", + "integrity": "sha512-6AK5BNpIFqP+cx/sreKzNjWbwZQCSUatxq+pPRmFIQaWuoD+NrbVWw7YWpHiXpCQ/NanKdtGDuB+VQcZDaEmYQ==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-scale": { + "version": "2.2.2", + "resolved": 
"https://registry.npmmirror.com/d3-scale/-/d3-scale-2.2.2.tgz", + "integrity": "sha512-LbeEvGgIb8UMcAa0EATLNX0lelKWGYDQiPdHj+gLblGVhGLyNbaCn3EvrJf0A3Y/uOOU5aD6MTh5ZFCdEwGiCw==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-array": "^1.2.0", + "d3-collection": "1", + "d3-format": "1", + "d3-interpolate": "1", + "d3-time": "1", + "d3-time-format": "2" + } + }, + "node_modules/d3-scale-chromatic": { + "version": "1.5.0", + "resolved": "https://registry.npmmirror.com/d3-scale-chromatic/-/d3-scale-chromatic-1.5.0.tgz", + "integrity": "sha512-ACcL46DYImpRFMBcpk9HhtIyC7bTBR4fNOPxwVSl0LfulDAwyiHyPOTqcDG1+t5d4P9W7t/2NAuWu59aKko/cg==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-color": "1", + "d3-interpolate": "1" + } + }, + "node_modules/d3-selection": { + "version": "1.4.2", + "resolved": "https://registry.npmmirror.com/d3-selection/-/d3-selection-1.4.2.tgz", + "integrity": "sha512-SJ0BqYihzOjDnnlfyeHT0e30k0K1+5sR3d5fNueCNeuhZTnGw4M4o8mqJchSwgKMXCNFo+e2VTChiSJ0vYtXkg==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-shape": { + "version": "1.3.7", + "resolved": "https://registry.npmmirror.com/d3-shape/-/d3-shape-1.3.7.tgz", + "integrity": "sha512-EUkvKjqPFUAZyOlhY5gzCxCeI0Aep04LwIRpsZ/mLFelJiUfnK56jo5JMDSE7yyP2kLSb6LtF+S5chMk7uqPqw==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-path": "1" + } + }, + "node_modules/d3-time": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/d3-time/-/d3-time-1.1.0.tgz", + "integrity": "sha512-Xh0isrZ5rPYYdqhAVk8VLnMEidhz5aP7htAADH6MfzgmmicPkTo8LhkLxci61/lCB7n7UmE3bN0leRt+qvkLxA==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-time-format": { + "version": "2.3.0", + "resolved": "https://registry.npmmirror.com/d3-time-format/-/d3-time-format-2.3.0.tgz", + "integrity": "sha512-guv6b2H37s2Uq/GefleCDtbe0XZAuy7Wa49VGkPVPMfLL9qObgBST3lEHJBMUp8S7NdLQAGIvr2KXk8Hc98iKQ==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-time": "1" + } + }, + "node_modules/d3-timer": { + 
"version": "1.0.10", + "resolved": "https://registry.npmmirror.com/d3-timer/-/d3-timer-1.0.10.tgz", + "integrity": "sha512-B1JDm0XDaQC+uvo4DT79H0XmBskgS3l6Ve+1SBCfxgmtIb1AVrPIoqd+nPSv+loMX8szQ0sVUhGngL7D5QPiXw==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-transition": { + "version": "1.3.2", + "resolved": "https://registry.npmmirror.com/d3-transition/-/d3-transition-1.3.2.tgz", + "integrity": "sha512-sc0gRU4PFqZ47lPVHloMn9tlPcv8jxgOQg+0zjhfZXMQuvppjG6YuwdMBE0TuqCZjeJkLecku/l9R0JPcRhaDA==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-color": "1", + "d3-dispatch": "1", + "d3-ease": "1", + "d3-interpolate": "1", + "d3-selection": "^1.1.0", + "d3-timer": "1" + } + }, + "node_modules/d3-voronoi": { + "version": "1.1.4", + "resolved": "https://registry.npmmirror.com/d3-voronoi/-/d3-voronoi-1.1.4.tgz", + "integrity": "sha512-dArJ32hchFsrQ8uMiTBLq256MpnZjeuBtdHpaDlYuQyjU0CVzCJl/BVW+SkszaAeH95D/8gxqAhgx0ouAWAfRg==", + "license": "BSD-3-Clause" + }, + "node_modules/d3-zoom": { + "version": "1.8.3", + "resolved": "https://registry.npmmirror.com/d3-zoom/-/d3-zoom-1.8.3.tgz", + "integrity": "sha512-VoLXTK4wvy1a0JpH2Il+F2CiOhVu7VRXWF5M/LroMIh3/zBAC3WAt7QoIvPibOavVo20hN6/37vwAsdBejLyKQ==", + "license": "BSD-3-Clause", + "dependencies": { + "d3-dispatch": "1", + "d3-drag": "1", + "d3-interpolate": "1", + "d3-selection": "1", + "d3-transition": "1" + } + }, + "node_modules/dagre": { + "version": "0.8.5", + "resolved": "https://registry.npmmirror.com/dagre/-/dagre-0.8.5.tgz", + "integrity": "sha512-/aTqmnRta7x7MCCpExk7HQL2O4owCT2h8NT//9I1OQ9vt29Pa0BzSAkR5lwFUcQ7491yVi/3CXU9jQ5o0Mn2Sw==", + "license": "MIT", + "dependencies": { + "graphlib": "^2.1.8", + "lodash": "^4.17.15" + } + }, + "node_modules/debug": { + "version": "2.6.9", + "resolved": "https://registry.npmmirror.com/debug/-/debug-2.6.9.tgz", + "integrity": "sha512-bC7ElrdJaJnPbAP+1EotYvqZsb3ecl5wi6Bfi6BJTUcNowp6cvspg0jXznRTKDjm/E7AdgFBVeAPVMNcKGsHMA==", + "dev": true, + "license": "MIT", + 
"dependencies": { + "ms": "2.0.0" + } + }, + "node_modules/default-gateway": { + "version": "6.0.3", + "resolved": "https://registry.npmmirror.com/default-gateway/-/default-gateway-6.0.3.tgz", + "integrity": "sha512-fwSOJsbbNzZ/CUFpqFBqYfYNLj1NbMPm8MMCIzHjC83iSJRBEGmDUxU+WP661BaBQImeC2yHwXtz+P/O9o+XEg==", + "dev": true, + "license": "BSD-2-Clause", + "dependencies": { + "execa": "^5.0.0" + }, + "engines": { + "node": ">= 10" + } + }, + "node_modules/define-lazy-prop": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/define-lazy-prop/-/define-lazy-prop-2.0.0.tgz", + "integrity": "sha512-Ds09qNh8yw3khSjiJjiUInaGX9xlqZDY7JVryGxdxV7NPeuqQfplOpQ66yJFZut3jLa5zOwkXw1g9EI2uKh4Og==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/del": { + "version": "4.1.1", + "resolved": "https://registry.npmmirror.com/del/-/del-4.1.1.tgz", + "integrity": "sha512-QwGuEUouP2kVwQenAsOof5Fv8K9t3D8Ca8NxcXKrIpEHjTXK5J2nXLdP+ALI1cgv8wj7KuwBhTwBkOZSJKM5XQ==", + "license": "MIT", + "dependencies": { + "@types/glob": "^7.1.1", + "globby": "^6.1.0", + "is-path-cwd": "^2.0.0", + "is-path-in-cwd": "^2.0.0", + "p-map": "^2.0.0", + "pify": "^4.0.1", + "rimraf": "^2.6.3" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/depd": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/depd/-/depd-2.0.0.tgz", + "integrity": "sha512-g7nH6P6dyDioJogAAGprGpCtVImJhpPk/roCzdb3fIh61/s/nPsfR6onyMwkCAR/OlC3yBC0lESvUoQEAssIrw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/destroy": { + "version": "1.2.0", + "resolved": "https://registry.npmmirror.com/destroy/-/destroy-1.2.0.tgz", + "integrity": "sha512-2sJGJTaXIIaR1w4iJSNoN0hnMY7Gpc/n8D4qSCJw8QqFWXf7cuAgnEHxBpweaVcPevC2l3KpjYCx3NypQQgaJg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8", + "npm": "1.2.8000 || >= 1.4.16" + } + }, + "node_modules/detect-node": { + "version": "2.1.0", + "resolved": 
"https://registry.npmmirror.com/detect-node/-/detect-node-2.1.0.tgz", + "integrity": "sha512-T0NIuQpnTvFDATNuHN5roPwSBG83rFsuO+MXXH9/3N1eFbn4wcPjttvjMLEPWJ0RGUYgQE7cGgS3tNxbqCGM7g==", + "dev": true, + "license": "MIT" + }, + "node_modules/dns-packet": { + "version": "5.6.1", + "resolved": "https://registry.npmmirror.com/dns-packet/-/dns-packet-5.6.1.tgz", + "integrity": "sha512-l4gcSouhcgIKRvyy99RNVOgxXiicE+2jZoNmaNmZ6JXiGajBOJAesk1OBlJuM5k2c+eudGdLxDqXuPCKIj6kpw==", + "dev": true, + "license": "MIT", + "dependencies": { + "@leichtgewicht/ip-codec": "^2.0.1" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/dom-converter": { + "version": "0.2.0", + "resolved": "https://registry.npmmirror.com/dom-converter/-/dom-converter-0.2.0.tgz", + "integrity": "sha512-gd3ypIPfOMr9h5jIKq8E3sHOTCjeirnl0WK5ZdS1AW0Odt0b1PaWaHdJ4Qk4klv+YB9aJBS7mESXjFoDQPu6DA==", + "dev": true, + "license": "MIT", + "dependencies": { + "utila": "~0.4" + } + }, + "node_modules/dom-serializer": { + "version": "1.4.1", + "resolved": "https://registry.npmmirror.com/dom-serializer/-/dom-serializer-1.4.1.tgz", + "integrity": "sha512-VHwB3KfrcOOkelEG2ZOfxqLZdfkil8PtJi4P8N2MMXucZq2yLp75ClViUlOVwyoHEDjYU433Aq+5zWP61+RGag==", + "dev": true, + "license": "MIT", + "dependencies": { + "domelementtype": "^2.0.1", + "domhandler": "^4.2.0", + "entities": "^2.0.0" + }, + "funding": { + "url": "https://github.com/cheeriojs/dom-serializer?sponsor=1" + } + }, + "node_modules/dom-serializer/node_modules/entities": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/entities/-/entities-2.2.0.tgz", + "integrity": "sha512-p92if5Nz619I0w+akJrLZH0MX0Pb5DX39XOwQTtXSdQQOaYH03S1uIQp4mhOZtAXrxq4ViO67YTiLBo2638o9A==", + "dev": true, + "license": "BSD-2-Clause", + "funding": { + "url": "https://github.com/fb55/entities?sponsor=1" + } + }, + "node_modules/domelementtype": { + "version": "2.3.0", + "resolved": "https://registry.npmmirror.com/domelementtype/-/domelementtype-2.3.0.tgz", + "integrity": 
"sha512-OLETBj6w0OsagBwdXnPdN0cnMfF9opN69co+7ZrbfPGrdpPVNBUj02spi6B1N7wChLQiPn4CSH/zJvXw56gmHw==", + "dev": true, + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/fb55" + } + ], + "license": "BSD-2-Clause" + }, + "node_modules/domhandler": { + "version": "4.3.1", + "resolved": "https://registry.npmmirror.com/domhandler/-/domhandler-4.3.1.tgz", + "integrity": "sha512-GrwoxYN+uWlzO8uhUXRl0P+kHE4GtVPfYzVLcUxPL7KNdHKj66vvlhiweIHqYYXWlw+T8iLMp42Lm67ghw4WMQ==", + "dev": true, + "license": "BSD-2-Clause", + "dependencies": { + "domelementtype": "^2.2.0" + }, + "engines": { + "node": ">= 4" + }, + "funding": { + "url": "https://github.com/fb55/domhandler?sponsor=1" + } + }, + "node_modules/domutils": { + "version": "2.8.0", + "resolved": "https://registry.npmmirror.com/domutils/-/domutils-2.8.0.tgz", + "integrity": "sha512-w96Cjofp72M5IIhpjgobBimYEfoPjx1Vx0BSX9P30WBdZW2WIKU0T1Bd0kz2eNZ9ikjKgHbEyKx8BB6H1L3h3A==", + "dev": true, + "license": "BSD-2-Clause", + "dependencies": { + "dom-serializer": "^1.0.1", + "domelementtype": "^2.2.0", + "domhandler": "^4.2.0" + }, + "funding": { + "url": "https://github.com/fb55/domutils?sponsor=1" + } + }, + "node_modules/dot-case": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/dot-case/-/dot-case-3.0.4.tgz", + "integrity": "sha512-Kv5nKlh6yRrdrGvxeJ2e5y2eRUpkUosIW4A2AS38zwSz27zu7ufDwQPi5Jhs3XAlGNetl3bmnGhQsMtkKJnj3w==", + "dev": true, + "license": "MIT", + "dependencies": { + "no-case": "^3.0.4", + "tslib": "^2.0.3" + } + }, + "node_modules/dunder-proto": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/dunder-proto/-/dunder-proto-1.0.1.tgz", + "integrity": "sha512-KIN/nDJBQRcXw0MLVhZE9iQHmG68qAVIBg9CqmUYjmQIhgij9U5MFvrqkUL5FbtyyzZuOeOt0zdeRe4UY7ct+A==", + "dev": true, + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.1", + "es-errors": "^1.3.0", + "gopd": "^1.2.0" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/ee-first": { + 
"version": "1.1.1", + "resolved": "https://registry.npmmirror.com/ee-first/-/ee-first-1.1.1.tgz", + "integrity": "sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow==", + "dev": true, + "license": "MIT" + }, + "node_modules/electron-to-chromium": { + "version": "1.5.126", + "resolved": "https://registry.npmmirror.com/electron-to-chromium/-/electron-to-chromium-1.5.126.tgz", + "integrity": "sha512-AtH1uLcTC72LA4vfYcEJJkrMk/MY/X0ub8Hv7QGAePW2JkeUFHEL/QfS4J77R6M87Sss8O0OcqReSaN1bpyA+Q==", + "license": "ISC" + }, + "node_modules/encodeurl": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/encodeurl/-/encodeurl-2.0.0.tgz", + "integrity": "sha512-Q0n9HRi4m6JuGIV1eFlmvJB7ZEVxu93IrMyiMsGC0lrMJMWzRgx6WGquyfQgZVb31vhGgXnfmPNNXmxnOkRBrg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/enhanced-resolve": { + "version": "5.18.1", + "resolved": "https://registry.npmmirror.com/enhanced-resolve/-/enhanced-resolve-5.18.1.tgz", + "integrity": "sha512-ZSW3ma5GkcQBIpwZTSRAI8N71Uuwgs93IezB7mf7R60tC8ZbJideoDNKjHn2O9KIlx6rkGTTEk1xUCK2E1Y2Yg==", + "license": "MIT", + "dependencies": { + "graceful-fs": "^4.2.4", + "tapable": "^2.2.0" + }, + "engines": { + "node": ">=10.13.0" + } + }, + "node_modules/entities": { + "version": "4.5.0", + "resolved": "https://registry.npmmirror.com/entities/-/entities-4.5.0.tgz", + "integrity": "sha512-V0hjH4dGPh9Ao5p0MoRY6BVqtwCjhz6vI5LT8AJ55H+4g9/4vbHx1I54fS0XuclLhDHArPQCiMjDxjaL8fPxhw==", + "dev": true, + "license": "BSD-2-Clause", + "engines": { + "node": ">=0.12" + }, + "funding": { + "url": "https://github.com/fb55/entities?sponsor=1" + } + }, + "node_modules/envinfo": { + "version": "7.14.0", + "resolved": "https://registry.npmmirror.com/envinfo/-/envinfo-7.14.0.tgz", + "integrity": "sha512-CO40UI41xDQzhLB1hWyqUKgFhs250pNcGbyGKe1l/e4FSaI/+YE4IMG76GDt0In67WLPACIITC+sOi08x4wIvg==", + "dev": true, + "license": "MIT", + "bin": { + "envinfo": 
"dist/cli.js" + }, + "engines": { + "node": ">=4" + } + }, + "node_modules/es-define-property": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/es-define-property/-/es-define-property-1.0.1.tgz", + "integrity": "sha512-e3nRfgfUZ4rNGL232gUgX06QNyyez04KdjFrF+LTRoOXmrOgFKDg4BCdsjW8EnT69eqdYGmRpJwiPVYNrCaW3g==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/es-errors": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/es-errors/-/es-errors-1.3.0.tgz", + "integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/es-module-lexer": { + "version": "1.6.0", + "resolved": "https://registry.npmmirror.com/es-module-lexer/-/es-module-lexer-1.6.0.tgz", + "integrity": "sha512-qqnD1yMU6tk/jnaMosogGySTZP8YtUgAffA9nMN+E/rjxcfRQ6IEk7IiozUjgxKoFHBGjTLnrHB/YC45r/59EQ==", + "license": "MIT" + }, + "node_modules/es-object-atoms": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/es-object-atoms/-/es-object-atoms-1.1.1.tgz", + "integrity": "sha512-FGgH2h8zKNim9ljj7dankFPcICIK9Cp5bm+c2gQSYePhpaG5+esrLODihIorn+Pe6FGJzWhXQotPv73jTaldXA==", + "dev": true, + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/escalade": { + "version": "3.2.0", + "resolved": "https://registry.npmmirror.com/escalade/-/escalade-3.2.0.tgz", + "integrity": "sha512-WUj2qlxaQtO4g6Pq5c29GTcWGDyd8itL8zTlipgECz3JesAiiOKotd8JU6otB3PACgG6xkJUyVhboMS+bje/jA==", + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/escape-html": { + "version": "1.0.3", + "resolved": "https://registry.npmmirror.com/escape-html/-/escape-html-1.0.3.tgz", + "integrity": "sha512-NiSupZ4OeuGwr68lGIeym/ksIZMJodUGOSCZ/FSnTxcrekbvqrgdUxlJOMpijaKZVjAJrWrGs/6Jy8OMuyj9ow==", + "dev": true, + "license": 
"MIT" + }, + "node_modules/eslint-scope": { + "version": "5.1.1", + "resolved": "https://registry.npmmirror.com/eslint-scope/-/eslint-scope-5.1.1.tgz", + "integrity": "sha512-2NxwbF/hZ0KpepYN0cNbo+FN6XoK7GaHlQhgx/hIZl6Va0bF45RQOOwhLIy8lQDbuCiadSLCBnH2CFYquit5bw==", + "license": "BSD-2-Clause", + "dependencies": { + "esrecurse": "^4.3.0", + "estraverse": "^4.1.1" + }, + "engines": { + "node": ">=8.0.0" + } + }, + "node_modules/esrecurse": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/esrecurse/-/esrecurse-4.3.0.tgz", + "integrity": "sha512-KmfKL3b6G+RXvP8N1vr3Tq1kL/oCFgn2NYXEtqP8/L3pKapUA4G8cFVaoF3SU323CD4XypR/ffioHmkti6/Tag==", + "license": "BSD-2-Clause", + "dependencies": { + "estraverse": "^5.2.0" + }, + "engines": { + "node": ">=4.0" + } + }, + "node_modules/esrecurse/node_modules/estraverse": { + "version": "5.3.0", + "resolved": "https://registry.npmmirror.com/estraverse/-/estraverse-5.3.0.tgz", + "integrity": "sha512-MMdARuVEQziNTeJD8DgMqmhwR11BRQ/cBP+pLtYdSTnf3MIO8fFeiINEbX36ZdNlfU/7A9f3gUw49B3oQsvwBA==", + "license": "BSD-2-Clause", + "engines": { + "node": ">=4.0" + } + }, + "node_modules/estraverse": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/estraverse/-/estraverse-4.3.0.tgz", + "integrity": "sha512-39nnKffWz8xN1BU/2c79n9nB9HDzo0niYUqx6xyqUnyoAnQyyWpOTdZEeiCch8BBu515t4wp9ZmgVfVhn9EBpw==", + "license": "BSD-2-Clause", + "engines": { + "node": ">=4.0" + } + }, + "node_modules/etag": { + "version": "1.8.1", + "resolved": "https://registry.npmmirror.com/etag/-/etag-1.8.1.tgz", + "integrity": "sha512-aIL5Fx7mawVa300al2BnEE4iNvo1qETxLrPI/o05L7z6go7fCw1J6EQmbK4FmJ2AS7kgVF/KEZWufBfdClMcPg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/eventemitter3": { + "version": "4.0.7", + "resolved": "https://registry.npmmirror.com/eventemitter3/-/eventemitter3-4.0.7.tgz", + "integrity": 
"sha512-8guHBZCwKnFhYdHr2ysuRWErTwhoN2X8XELRlrRwpmfeY2jjuUN4taQMsULKUVo1K4DvZl+0pgfyoysHxvmvEw==", + "dev": true, + "license": "MIT" + }, + "node_modules/events": { + "version": "3.3.0", + "resolved": "https://registry.npmmirror.com/events/-/events-3.3.0.tgz", + "integrity": "sha512-mQw+2fkQbALzQ7V0MY0IqdnXNOeTtP4r0lN9z7AAawCXgqea7bDii20AYrIBrFd/Hx0M2Ocz6S111CaFkUcb0Q==", + "license": "MIT", + "engines": { + "node": ">=0.8.x" + } + }, + "node_modules/execa": { + "version": "5.1.1", + "resolved": "https://registry.npmmirror.com/execa/-/execa-5.1.1.tgz", + "integrity": "sha512-8uSpZZocAZRBAPIEINJj3Lo9HyGitllczc27Eh5YYojjMFMn8yHMDMaUHE2Jqfq05D/wucwI4JGURyXt1vchyg==", + "dev": true, + "license": "MIT", + "dependencies": { + "cross-spawn": "^7.0.3", + "get-stream": "^6.0.0", + "human-signals": "^2.1.0", + "is-stream": "^2.0.0", + "merge-stream": "^2.0.0", + "npm-run-path": "^4.0.1", + "onetime": "^5.1.2", + "signal-exit": "^3.0.3", + "strip-final-newline": "^2.0.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sindresorhus/execa?sponsor=1" + } + }, + "node_modules/express": { + "version": "4.21.2", + "resolved": "https://registry.npmmirror.com/express/-/express-4.21.2.tgz", + "integrity": "sha512-28HqgMZAmih1Czt9ny7qr6ek2qddF4FclbMzwhCREB6OFfH+rXAnuNCwo1/wFvrtbgsQDb4kSbX9de9lFbrXnA==", + "dev": true, + "license": "MIT", + "dependencies": { + "accepts": "~1.3.8", + "array-flatten": "1.1.1", + "body-parser": "1.20.3", + "content-disposition": "0.5.4", + "content-type": "~1.0.4", + "cookie": "0.7.1", + "cookie-signature": "1.0.6", + "debug": "2.6.9", + "depd": "2.0.0", + "encodeurl": "~2.0.0", + "escape-html": "~1.0.3", + "etag": "~1.8.1", + "finalhandler": "1.3.1", + "fresh": "0.5.2", + "http-errors": "2.0.0", + "merge-descriptors": "1.0.3", + "methods": "~1.1.2", + "on-finished": "2.4.1", + "parseurl": "~1.3.3", + "path-to-regexp": "0.1.12", + "proxy-addr": "~2.0.7", + "qs": "6.13.0", + "range-parser": "~1.2.1", + "safe-buffer": 
"5.2.1", + "send": "0.19.0", + "serve-static": "1.16.2", + "setprototypeof": "1.2.0", + "statuses": "2.0.1", + "type-is": "~1.6.18", + "utils-merge": "1.0.1", + "vary": "~1.1.2" + }, + "engines": { + "node": ">= 0.10.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/express" + } + }, + "node_modules/fast-deep-equal": { + "version": "3.1.3", + "resolved": "https://registry.npmmirror.com/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz", + "integrity": "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q==", + "license": "MIT" + }, + "node_modules/fast-uri": { + "version": "3.0.6", + "resolved": "https://registry.npmmirror.com/fast-uri/-/fast-uri-3.0.6.tgz", + "integrity": "sha512-Atfo14OibSv5wAp4VWNsFYE1AchQRTv9cBGWET4pZWHzYshFSS9NQI6I57rdKn9croWVMbYFbLhJ+yJvmZIIHw==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/fastify" + }, + { + "type": "opencollective", + "url": "https://opencollective.com/fastify" + } + ], + "license": "BSD-3-Clause" + }, + "node_modules/fastest-levenshtein": { + "version": "1.0.16", + "resolved": "https://registry.npmmirror.com/fastest-levenshtein/-/fastest-levenshtein-1.0.16.tgz", + "integrity": "sha512-eRnCtTTtGZFpQCwhJiUOuxPQWRXVKYDn0b2PeHfXL6/Zi53SLAzAHfVhVWK2AryC/WH05kGfxhFIPvTF0SXQzg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 4.9.1" + } + }, + "node_modules/faye-websocket": { + "version": "0.11.4", + "resolved": "https://registry.npmmirror.com/faye-websocket/-/faye-websocket-0.11.4.tgz", + "integrity": "sha512-CzbClwlXAuiRQAlUyfqPgvPoNKTckTPGfwZV4ZdAhVcP2lh9KUxJg2b5GkE7XbjKQ3YJnQ9z6D9ntLAlB+tP8g==", + "dev": true, + "license": "Apache-2.0", + "dependencies": { + "websocket-driver": ">=0.5.1" + }, + "engines": { + "node": ">=0.8.0" + } + }, + "node_modules/fill-range": { + "version": "7.1.1", + "resolved": "https://registry.npmmirror.com/fill-range/-/fill-range-7.1.1.tgz", + "integrity": 
"sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==", + "dev": true, + "license": "MIT", + "dependencies": { + "to-regex-range": "^5.0.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/finalhandler": { + "version": "1.3.1", + "resolved": "https://registry.npmmirror.com/finalhandler/-/finalhandler-1.3.1.tgz", + "integrity": "sha512-6BN9trH7bp3qvnrRyzsBz+g3lZxTNZTbVO2EV1CS0WIcDbawYVdYvGflME/9QP0h0pYlCDBCTjYa9nZzMDpyxQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "debug": "2.6.9", + "encodeurl": "~2.0.0", + "escape-html": "~1.0.3", + "on-finished": "2.4.1", + "parseurl": "~1.3.3", + "statuses": "2.0.1", + "unpipe": "~1.0.0" + }, + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/find-up": { + "version": "4.1.0", + "resolved": "https://registry.npmmirror.com/find-up/-/find-up-4.1.0.tgz", + "integrity": "sha512-PpOwAdQ/YlXQ2vj8a3h8IipDuYRi3wceVQQGYWxNINccq40Anw7BlsEXCMbt1Zt+OLA6Fq9suIpIWD0OsnISlw==", + "dev": true, + "license": "MIT", + "dependencies": { + "locate-path": "^5.0.0", + "path-exists": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/flat": { + "version": "5.0.2", + "resolved": "https://registry.npmmirror.com/flat/-/flat-5.0.2.tgz", + "integrity": "sha512-b6suED+5/3rTpUBdG1gupIl8MPFCAMA0QXwmljLhvCUKcUvdE4gWky9zpuGCcXHOsz4J9wPGNWq6OKpmIzz3hQ==", + "dev": true, + "license": "BSD-3-Clause", + "bin": { + "flat": "cli.js" + } + }, + "node_modules/follow-redirects": { + "version": "1.15.9", + "resolved": "https://registry.npmmirror.com/follow-redirects/-/follow-redirects-1.15.9.tgz", + "integrity": "sha512-gew4GsXizNgdoRyqmyfMHyAmXsZDk6mHkSxZFCzW9gwlbtOW44CDtYavM+y+72qD/Vq2l550kMF52DT8fOLJqQ==", + "dev": true, + "funding": [ + { + "type": "individual", + "url": "https://github.com/sponsors/RubenVerborgh" + } + ], + "license": "MIT", + "engines": { + "node": ">=4.0" + }, + "peerDependenciesMeta": { + "debug": { + "optional": true + } + } + }, + 
"node_modules/forwarded": { + "version": "0.2.0", + "resolved": "https://registry.npmmirror.com/forwarded/-/forwarded-0.2.0.tgz", + "integrity": "sha512-buRG0fpBtRHSTCOASe6hD258tEubFoRLb4ZNA6NxMVHNw2gOcwHo9wyablzMzOA5z9xA9L1KNjk/Nt6MT9aYow==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/fresh": { + "version": "0.5.2", + "resolved": "https://registry.npmmirror.com/fresh/-/fresh-0.5.2.tgz", + "integrity": "sha512-zJ2mQYM18rEFOudeV4GShTGIQ7RbzA7ozbU9I/XBpm7kqgMywgmylMwXHxZJmkVoYkna9d2pVXVXPdYTP9ej8Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/fs-monkey": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/fs-monkey/-/fs-monkey-1.0.6.tgz", + "integrity": "sha512-b1FMfwetIKymC0eioW7mTywihSQE4oLzQn1dB6rZB5fx/3NpNEdAWeCSMB+60/AeT0TCXsxzAlcYVEFCTAksWg==", + "dev": true, + "license": "Unlicense" + }, + "node_modules/fs.realpath": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/fs.realpath/-/fs.realpath-1.0.0.tgz", + "integrity": "sha512-OO0pH2lK6a0hZnAdau5ItzHPI6pUlvI7jMVnxUQRtw4owF2wk8lOSabtGDCTP4Ggrg2MbGnWO9X8K1t4+fGMDw==", + "license": "ISC" + }, + "node_modules/fsevents": { + "version": "2.3.3", + "resolved": "https://registry.npmmirror.com/fsevents/-/fsevents-2.3.3.tgz", + "integrity": "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" + } + }, + "node_modules/function-bind": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/function-bind/-/function-bind-1.1.2.tgz", + "integrity": "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA==", + "dev": true, + "license": "MIT", + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + 
"node_modules/get-intrinsic": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/get-intrinsic/-/get-intrinsic-1.3.0.tgz", + "integrity": "sha512-9fSjSaos/fRIVIp+xSJlE6lfwhES7LNtKaCBIamHsjr2na1BiABJPo0mOjjz8GJDURarmCPGqaiVg5mfjb98CQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "call-bind-apply-helpers": "^1.0.2", + "es-define-property": "^1.0.1", + "es-errors": "^1.3.0", + "es-object-atoms": "^1.1.1", + "function-bind": "^1.1.2", + "get-proto": "^1.0.1", + "gopd": "^1.2.0", + "has-symbols": "^1.1.0", + "hasown": "^2.0.2", + "math-intrinsics": "^1.1.0" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/get-proto": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/get-proto/-/get-proto-1.0.1.tgz", + "integrity": "sha512-sTSfBjoXBp89JvIKIefqw7U2CCebsc74kiY6awiGogKtoSGbgjYE/G/+l9sF3MWFPNc9IcoOC4ODfKHfxFmp0g==", + "dev": true, + "license": "MIT", + "dependencies": { + "dunder-proto": "^1.0.1", + "es-object-atoms": "^1.0.0" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/get-stream": { + "version": "6.0.1", + "resolved": "https://registry.npmmirror.com/get-stream/-/get-stream-6.0.1.tgz", + "integrity": "sha512-ts6Wi+2j3jQjqi70w5AlN8DFnkSwC+MqmxEzdEALB2qXZYV3X/b1CTfgPLGJNMeAWxdPfU8FO1ms3NUfaHCPYg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/glob": { + "version": "7.2.3", + "resolved": "https://registry.npmmirror.com/glob/-/glob-7.2.3.tgz", + "integrity": "sha512-nFR0zLpU2YCaRxwoCJvL6UvCH2JFyFVIvwTLsIf21AuHlMskA1hhTdk+LlYJtOlYt9v6dvszD2BGRqBL+iQK9Q==", + "deprecated": "Glob versions prior to v9 are no longer supported", + "license": "ISC", + "dependencies": { + "fs.realpath": "^1.0.0", + "inflight": "^1.0.4", + "inherits": "2", + "minimatch": "^3.1.1", + "once": "^1.3.0", + "path-is-absolute": "^1.0.0" + 
}, + "engines": { + "node": "*" + }, + "funding": { + "url": "https://github.com/sponsors/isaacs" + } + }, + "node_modules/glob-parent": { + "version": "5.1.2", + "resolved": "https://registry.npmmirror.com/glob-parent/-/glob-parent-5.1.2.tgz", + "integrity": "sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow==", + "dev": true, + "license": "ISC", + "dependencies": { + "is-glob": "^4.0.1" + }, + "engines": { + "node": ">= 6" + } + }, + "node_modules/glob-to-regexp": { + "version": "0.4.1", + "resolved": "https://registry.npmmirror.com/glob-to-regexp/-/glob-to-regexp-0.4.1.tgz", + "integrity": "sha512-lkX1HJXwyMcprw/5YUZc2s7DrpAiHB21/V+E1rHUrVNokkvB6bqMzT0VfV6/86ZNabt1k14YOIaT7nDvOX3Iiw==", + "license": "BSD-2-Clause" + }, + "node_modules/globby": { + "version": "6.1.0", + "resolved": "https://registry.npmmirror.com/globby/-/globby-6.1.0.tgz", + "integrity": "sha512-KVbFv2TQtbzCoxAnfD6JcHZTYCzyliEaaeM/gH8qQdkKr5s0OP9scEgvdcngyk7AVdY6YVW/TJHd+lQ/Df3Daw==", + "license": "MIT", + "dependencies": { + "array-union": "^1.0.1", + "glob": "^7.0.3", + "object-assign": "^4.0.1", + "pify": "^2.0.0", + "pinkie-promise": "^2.0.0" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/globby/node_modules/pify": { + "version": "2.3.0", + "resolved": "https://registry.npmmirror.com/pify/-/pify-2.3.0.tgz", + "integrity": "sha512-udgsAY+fTnvv7kI7aaxbqwWNb0AHiB0qBO89PZKPkoTmGOgdbrHDKD+0B2X4uTfJ/FT1R09r9gTsjUjNJotuog==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/gopd": { + "version": "1.2.0", + "resolved": "https://registry.npmmirror.com/gopd/-/gopd-1.2.0.tgz", + "integrity": "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/graceful-fs": { + "version": "4.2.11", + "resolved": 
"https://registry.npmmirror.com/graceful-fs/-/graceful-fs-4.2.11.tgz", + "integrity": "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ==", + "license": "ISC" + }, + "node_modules/graphlib": { + "version": "2.1.8", + "resolved": "https://registry.npmmirror.com/graphlib/-/graphlib-2.1.8.tgz", + "integrity": "sha512-jcLLfkpoVGmH7/InMC/1hIvOPSUh38oJtGhvrOFGzioE1DZ+0YW16RgmOJhHiuWTvGiJQ9Z1Ik43JvkRPRvE+A==", + "license": "MIT", + "dependencies": { + "lodash": "^4.17.15" + } + }, + "node_modules/handle-thing": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/handle-thing/-/handle-thing-2.0.1.tgz", + "integrity": "sha512-9Qn4yBxelxoh2Ow62nP+Ka/kMnOXRi8BXnRaUwezLNhqelnN49xKz4F/dPP8OYLxLxq6JDtZb2i9XznUQbNPTg==", + "dev": true, + "license": "MIT" + }, + "node_modules/has-flag": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/has-flag/-/has-flag-4.0.0.tgz", + "integrity": "sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ==", + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/has-symbols": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/has-symbols/-/has-symbols-1.1.0.tgz", + "integrity": "sha512-1cDNdwJ2Jaohmb3sg4OmKaMBwuC48sYni5HUw2DvsC8LjGTLK9h+eb1X6RyuOHe4hT0ULCW68iomhjUoKUqlPQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/hasown": { + "version": "2.0.2", + "resolved": "https://registry.npmmirror.com/hasown/-/hasown-2.0.2.tgz", + "integrity": "sha512-0hJU9SCPvmMzIBdZFqNPXWa6dqh7WdH0cII9y+CyS8rG3nL48Bclra9HmKhVVUHyPWNH5Y7xDwAB7bfgSjkUMQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "function-bind": "^1.1.2" + }, + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/he": { + "version": "1.2.0", + "resolved": "https://registry.npmmirror.com/he/-/he-1.2.0.tgz", + 
"integrity": "sha512-F/1DnUGPopORZi0ni+CvrCgHQ5FyEAHRLSApuYWMmrbSwoN2Mn/7k+Gl38gJnR7yyDZk6WLXwiGod1JOWNDKGw==", + "dev": true, + "license": "MIT", + "bin": { + "he": "bin/he" + } + }, + "node_modules/hpack.js": { + "version": "2.1.6", + "resolved": "https://registry.npmmirror.com/hpack.js/-/hpack.js-2.1.6.tgz", + "integrity": "sha512-zJxVehUdMGIKsRaNt7apO2Gqp0BdqW5yaiGHXXmbpvxgBYVZnAql+BJb4RO5ad2MgpbZKn5G6nMnegrH1FcNYQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "inherits": "^2.0.1", + "obuf": "^1.0.0", + "readable-stream": "^2.0.1", + "wbuf": "^1.1.0" + } + }, + "node_modules/hpack.js/node_modules/readable-stream": { + "version": "2.3.8", + "resolved": "https://registry.npmmirror.com/readable-stream/-/readable-stream-2.3.8.tgz", + "integrity": "sha512-8p0AUk4XODgIewSi0l8Epjs+EVnWiK7NoDIEGU0HhE7+ZyY8D1IMY7odu5lRrFXGg71L15KG8QrPmum45RTtdA==", + "dev": true, + "license": "MIT", + "dependencies": { + "core-util-is": "~1.0.0", + "inherits": "~2.0.3", + "isarray": "~1.0.0", + "process-nextick-args": "~2.0.0", + "safe-buffer": "~5.1.1", + "string_decoder": "~1.1.1", + "util-deprecate": "~1.0.1" + } + }, + "node_modules/hpack.js/node_modules/safe-buffer": { + "version": "5.1.2", + "resolved": "https://registry.npmmirror.com/safe-buffer/-/safe-buffer-5.1.2.tgz", + "integrity": "sha512-Gd2UZBJDkXlY7GbJxfsE8/nvKkUEU1G38c1siN6QP6a9PT9MmHB8GnpscSmMJSoF8LOIrt8ud/wPtojys4G6+g==", + "dev": true, + "license": "MIT" + }, + "node_modules/hpack.js/node_modules/string_decoder": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/string_decoder/-/string_decoder-1.1.1.tgz", + "integrity": "sha512-n/ShnvDi6FHbbVfviro+WojiFzv+s8MPMHBczVePfUpDJLwoLT0ht1l4YwBCbi8pJAveEEdnkHyPyTP/mzRfwg==", + "dev": true, + "license": "MIT", + "dependencies": { + "safe-buffer": "~5.1.0" + } + }, + "node_modules/html-entities": { + "version": "2.5.3", + "resolved": "https://registry.npmmirror.com/html-entities/-/html-entities-2.5.3.tgz", + "integrity": 
"sha512-D3AfvN7SjhTgBSA8L1BN4FpPzuEd06uy4lHwSoRWr0lndi9BKaNzPLKGOWZ2ocSGguozr08TTb2jhCLHaemruw==", + "dev": true, + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/mdevils" + }, + { + "type": "patreon", + "url": "https://patreon.com/mdevils" + } + ], + "license": "MIT" + }, + "node_modules/html-loader": { + "version": "5.1.0", + "resolved": "https://registry.npmmirror.com/html-loader/-/html-loader-5.1.0.tgz", + "integrity": "sha512-Jb3xwDbsm0W3qlXrCZwcYqYGnYz55hb6aoKQTlzyZPXsPpi6tHXzAfqalecglMQgNvtEfxrCQPaKT90Irt5XDA==", + "dev": true, + "license": "MIT", + "dependencies": { + "html-minifier-terser": "^7.2.0", + "parse5": "^7.1.2" + }, + "engines": { + "node": ">= 18.12.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "^5.0.0" + } + }, + "node_modules/html-minifier-terser": { + "version": "7.2.0", + "resolved": "https://registry.npmmirror.com/html-minifier-terser/-/html-minifier-terser-7.2.0.tgz", + "integrity": "sha512-tXgn3QfqPIpGl9o+K5tpcj3/MN4SfLtsx2GWwBC3SSd0tXQGyF3gsSqad8loJgKZGM3ZxbYDd5yhiBIdWpmvLA==", + "dev": true, + "license": "MIT", + "dependencies": { + "camel-case": "^4.1.2", + "clean-css": "~5.3.2", + "commander": "^10.0.0", + "entities": "^4.4.0", + "param-case": "^3.0.4", + "relateurl": "^0.2.7", + "terser": "^5.15.1" + }, + "bin": { + "html-minifier-terser": "cli.js" + }, + "engines": { + "node": "^14.13.1 || >=16.0.0" + } + }, + "node_modules/html-minifier-terser/node_modules/commander": { + "version": "10.0.1", + "resolved": "https://registry.npmmirror.com/commander/-/commander-10.0.1.tgz", + "integrity": "sha512-y4Mg2tXshplEbSGzx7amzPwKKOCGuoSRP/CjEdwwk0FOGlUbq6lKuoyDZTNZkmxHdJtp54hdfY/JUrdL7Xfdug==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14" + } + }, + "node_modules/html-webpack-plugin": { + "version": "5.6.3", + "resolved": 
"https://registry.npmmirror.com/html-webpack-plugin/-/html-webpack-plugin-5.6.3.tgz", + "integrity": "sha512-QSf1yjtSAsmf7rYBV7XX86uua4W/vkhIt0xNXKbsi2foEeW7vjJQz4bhnpL3xH+l1ryl1680uNv968Z+X6jSYg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/html-minifier-terser": "^6.0.0", + "html-minifier-terser": "^6.0.2", + "lodash": "^4.17.21", + "pretty-error": "^4.0.0", + "tapable": "^2.0.0" + }, + "engines": { + "node": ">=10.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/html-webpack-plugin" + }, + "peerDependencies": { + "@rspack/core": "0.x || 1.x", + "webpack": "^5.20.0" + }, + "peerDependenciesMeta": { + "@rspack/core": { + "optional": true + }, + "webpack": { + "optional": true + } + } + }, + "node_modules/html-webpack-plugin/node_modules/commander": { + "version": "8.3.0", + "resolved": "https://registry.npmmirror.com/commander/-/commander-8.3.0.tgz", + "integrity": "sha512-OkTL9umf+He2DZkUq8f8J9of7yL6RJKI24dVITBmNfZBmri9zYZQrKkuXiKhyfPSu8tUhnVBB1iKXevvnlR4Ww==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 12" + } + }, + "node_modules/html-webpack-plugin/node_modules/html-minifier-terser": { + "version": "6.1.0", + "resolved": "https://registry.npmmirror.com/html-minifier-terser/-/html-minifier-terser-6.1.0.tgz", + "integrity": "sha512-YXxSlJBZTP7RS3tWnQw74ooKa6L9b9i9QYXY21eUEvhZ3u9XLfv6OnFsQq6RxkhHygsaUMvYsZRV5rU/OVNZxw==", + "dev": true, + "license": "MIT", + "dependencies": { + "camel-case": "^4.1.2", + "clean-css": "^5.2.2", + "commander": "^8.3.0", + "he": "^1.2.0", + "param-case": "^3.0.4", + "relateurl": "^0.2.7", + "terser": "^5.10.0" + }, + "bin": { + "html-minifier-terser": "cli.js" + }, + "engines": { + "node": ">=12" + } + }, + "node_modules/htmlparser2": { + "version": "6.1.0", + "resolved": "https://registry.npmmirror.com/htmlparser2/-/htmlparser2-6.1.0.tgz", + "integrity": 
"sha512-gyyPk6rgonLFEDGoeRgQNaEUvdJ4ktTmmUh/h2t7s+M8oPpIPxgNACWa+6ESR57kXstwqPiCut0V8NRpcwgU7A==", + "dev": true, + "funding": [ + "https://github.com/fb55/htmlparser2?sponsor=1", + { + "type": "github", + "url": "https://github.com/sponsors/fb55" + } + ], + "license": "MIT", + "dependencies": { + "domelementtype": "^2.0.1", + "domhandler": "^4.0.0", + "domutils": "^2.5.2", + "entities": "^2.0.0" + } + }, + "node_modules/htmlparser2/node_modules/entities": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/entities/-/entities-2.2.0.tgz", + "integrity": "sha512-p92if5Nz619I0w+akJrLZH0MX0Pb5DX39XOwQTtXSdQQOaYH03S1uIQp4mhOZtAXrxq4ViO67YTiLBo2638o9A==", + "dev": true, + "license": "BSD-2-Clause", + "funding": { + "url": "https://github.com/fb55/entities?sponsor=1" + } + }, + "node_modules/http-deceiver": { + "version": "1.2.7", + "resolved": "https://registry.npmmirror.com/http-deceiver/-/http-deceiver-1.2.7.tgz", + "integrity": "sha512-LmpOGxTfbpgtGVxJrj5k7asXHCgNZp5nLfp+hWc8QQRqtb7fUy6kRY3BO1h9ddF6yIPYUARgxGOwB42DnxIaNw==", + "dev": true, + "license": "MIT" + }, + "node_modules/http-errors": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/http-errors/-/http-errors-2.0.0.tgz", + "integrity": "sha512-FtwrG/euBzaEjYeRqOgly7G0qviiXoJWnvEH2Z1plBdXgbyjv34pHTSb9zoeHMyDy33+DWy5Wt9Wo+TURtOYSQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "depd": "2.0.0", + "inherits": "2.0.4", + "setprototypeof": "1.2.0", + "statuses": "2.0.1", + "toidentifier": "1.0.1" + }, + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/http-parser-js": { + "version": "0.5.9", + "resolved": "https://registry.npmmirror.com/http-parser-js/-/http-parser-js-0.5.9.tgz", + "integrity": "sha512-n1XsPy3rXVxlqxVioEWdC+0+M+SQw0DpJynwtOPo1X+ZlvdzTLtDBIJJlDQTnwZIFJrZSzSGmIOUdP8tu+SgLw==", + "dev": true, + "license": "MIT" + }, + "node_modules/http-proxy": { + "version": "1.18.1", + "resolved": 
"https://registry.npmmirror.com/http-proxy/-/http-proxy-1.18.1.tgz", + "integrity": "sha512-7mz/721AbnJwIVbnaSv1Cz3Am0ZLT/UBwkC92VlxhXv/k/BBQfM2fXElQNC27BVGr0uwUpplYPQM9LnaBMR5NQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "eventemitter3": "^4.0.0", + "follow-redirects": "^1.0.0", + "requires-port": "^1.0.0" + }, + "engines": { + "node": ">=8.0.0" + } + }, + "node_modules/http-proxy-middleware": { + "version": "2.0.7", + "resolved": "https://registry.npmmirror.com/http-proxy-middleware/-/http-proxy-middleware-2.0.7.tgz", + "integrity": "sha512-fgVY8AV7qU7z/MmXJ/rxwbrtQH4jBQ9m7kp3llF0liB7glmFeVZFBepQb32T3y8n8k2+AEYuMPCpinYW+/CuRA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/http-proxy": "^1.17.8", + "http-proxy": "^1.18.1", + "is-glob": "^4.0.1", + "is-plain-obj": "^3.0.0", + "micromatch": "^4.0.2" + }, + "engines": { + "node": ">=12.0.0" + }, + "peerDependencies": { + "@types/express": "^4.17.13" + }, + "peerDependenciesMeta": { + "@types/express": { + "optional": true + } + } + }, + "node_modules/human-signals": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/human-signals/-/human-signals-2.1.0.tgz", + "integrity": "sha512-B4FFZ6q/T2jhhksgkbEW3HBvWIfDW85snkQgawt07S7J5QXTk6BkNV+0yAeZrM5QpMAdYlocGoljn0sJ/WQkFw==", + "dev": true, + "license": "Apache-2.0", + "engines": { + "node": ">=10.17.0" + } + }, + "node_modules/iconv-lite": { + "version": "0.4.24", + "resolved": "https://registry.npmmirror.com/iconv-lite/-/iconv-lite-0.4.24.tgz", + "integrity": "sha512-v3MXnZAcvnywkTUEZomIActle7RXXeedOR31wwl7VlyoXO4Qi9arvSenNQWne1TcRwhCL1HwLI21bEqdpj8/rA==", + "license": "MIT", + "dependencies": { + "safer-buffer": ">= 2.1.2 < 3" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/icss-utils": { + "version": "5.1.0", + "resolved": "https://registry.npmmirror.com/icss-utils/-/icss-utils-5.1.0.tgz", + "integrity": 
"sha512-soFhflCVWLfRNOPU3iv5Z9VUdT44xFRbzjLsEzSr5AQmgqPMTHdU3PMT1Cf1ssx8fLNJDA1juftYl+PUcv3MqA==", + "license": "ISC", + "engines": { + "node": "^10 || ^12 || >= 14" + }, + "peerDependencies": { + "postcss": "^8.1.0" + } + }, + "node_modules/import-local": { + "version": "3.2.0", + "resolved": "https://registry.npmmirror.com/import-local/-/import-local-3.2.0.tgz", + "integrity": "sha512-2SPlun1JUPWoM6t3F0dw0FkCF/jWY8kttcY4f599GLTSjh2OCuuhdTkJQsEcZzBqbXZGKMK2OqW1oZsjtf/gQA==", + "dev": true, + "license": "MIT", + "dependencies": { + "pkg-dir": "^4.2.0", + "resolve-cwd": "^3.0.0" + }, + "bin": { + "import-local-fixture": "fixtures/cli.js" + }, + "engines": { + "node": ">=8" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/inflight": { + "version": "1.0.6", + "resolved": "https://registry.npmmirror.com/inflight/-/inflight-1.0.6.tgz", + "integrity": "sha512-k92I/b08q4wvFscXCLvqfsHCrjrF7yiXsQuIVvVE7N82W3+aqpzuUdBbfhWcy/FZR3/4IgflMgKLOsvPDrGCJA==", + "deprecated": "This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.", + "license": "ISC", + "dependencies": { + "once": "^1.3.0", + "wrappy": "1" + } + }, + "node_modules/inherits": { + "version": "2.0.4", + "resolved": "https://registry.npmmirror.com/inherits/-/inherits-2.0.4.tgz", + "integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ==", + "license": "ISC" + }, + "node_modules/inline-chunk-html-plugin": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/inline-chunk-html-plugin/-/inline-chunk-html-plugin-1.1.1.tgz", + "integrity": "sha512-6W1eGIj8z/Yla6xJx5il6jJfCxMZS3kVkbiLQThbbjdsDLRIWkUVmpnhfW2l6WAwCW+qfy0zoXVGBZM1E5XF3g==", + "deprecated": "Package no longer supported. 
Contact Support at https://www.npmjs.com/support for more info.", + "dev": true + }, + "node_modules/interpret": { + "version": "3.1.1", + "resolved": "https://registry.npmmirror.com/interpret/-/interpret-3.1.1.tgz", + "integrity": "sha512-6xwYfHbajpoF0xLW+iwLkhwgvLoZDfjYfoFNu8ftMoXINzwuymNLd9u/KmwtdT2GbR+/Cz66otEGEVVUHX9QLQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10.13.0" + } + }, + "node_modules/ipaddr.js": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/ipaddr.js/-/ipaddr.js-2.2.0.tgz", + "integrity": "sha512-Ag3wB2o37wslZS19hZqorUnrnzSkpOVy+IiiDEiTqNubEYpYuHWIf6K4psgN2ZWKExS4xhVCrRVfb/wfW8fWJA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 10" + } + }, + "node_modules/is-binary-path": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/is-binary-path/-/is-binary-path-2.1.0.tgz", + "integrity": "sha512-ZMERYes6pDydyuGidse7OsHxtbI7WVeUEozgR/g7rd0xUimYNlvZRE/K2MgZTjWy725IfelLeVcEM97mmtRGXw==", + "dev": true, + "license": "MIT", + "dependencies": { + "binary-extensions": "^2.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/is-core-module": { + "version": "2.16.1", + "resolved": "https://registry.npmmirror.com/is-core-module/-/is-core-module-2.16.1.tgz", + "integrity": "sha512-UfoeMA6fIJ8wTYFEUjelnaGI67v6+N7qXJEvQuIGa99l4xsCruSYOVSQ0uPANn4dAzm8lkYPaKLrrijLq7x23w==", + "dev": true, + "license": "MIT", + "dependencies": { + "hasown": "^2.0.2" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/is-docker": { + "version": "2.2.1", + "resolved": "https://registry.npmmirror.com/is-docker/-/is-docker-2.2.1.tgz", + "integrity": "sha512-F+i2BKsFrH66iaUFc0woD8sLy8getkwTwtOBjvs56Cx4CgJDeKQeqfz8wAYiSb8JOprWhHH5p77PbmYCvvUuXQ==", + "dev": true, + "license": "MIT", + "bin": { + "is-docker": "cli.js" + }, + "engines": { + "node": ">=8" + }, + "funding": { + "url": 
"https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/is-extglob": { + "version": "2.1.1", + "resolved": "https://registry.npmmirror.com/is-extglob/-/is-extglob-2.1.1.tgz", + "integrity": "sha512-SbKbANkN603Vi4jEZv49LeVJMn4yGwsbzZworEoyEiutsN3nJYdbO36zfhGJ6QEDpOZIFkDtnq5JRxmvl3jsoQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/is-glob": { + "version": "4.0.3", + "resolved": "https://registry.npmmirror.com/is-glob/-/is-glob-4.0.3.tgz", + "integrity": "sha512-xelSayHH36ZgE7ZWhli7pW34hNbNl8Ojv5KVmkJD4hBdD3th8Tfk9vYasLM+mXWOZhFkgZfxhLSnrwRr4elSSg==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-extglob": "^2.1.1" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/is-number": { + "version": "7.0.0", + "resolved": "https://registry.npmmirror.com/is-number/-/is-number-7.0.0.tgz", + "integrity": "sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.12.0" + } + }, + "node_modules/is-path-cwd": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/is-path-cwd/-/is-path-cwd-2.2.0.tgz", + "integrity": "sha512-w942bTcih8fdJPJmQHFzkS76NEP8Kzzvmw92cXsazb8intwLqPibPPdXf4ANdKV3rYMuuQYGIWtvz9JilB3NFQ==", + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/is-path-in-cwd": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/is-path-in-cwd/-/is-path-in-cwd-2.1.0.tgz", + "integrity": "sha512-rNocXHgipO+rvnP6dk3zI20RpOtrAM/kzbB258Uw5BWr3TpXi861yzjo16Dn4hUox07iw5AyeMLHWsujkjzvRQ==", + "license": "MIT", + "dependencies": { + "is-path-inside": "^2.1.0" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/is-path-inside": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/is-path-inside/-/is-path-inside-2.1.0.tgz", + "integrity": 
"sha512-wiyhTzfDWsvwAW53OBWF5zuvaOGlZ6PwYxAbPVDhpm+gM09xKQGjBq/8uYN12aDvMxnAnq3dxTyoSoRNmg5YFg==", + "license": "MIT", + "dependencies": { + "path-is-inside": "^1.0.2" + }, + "engines": { + "node": ">=6" + } + }, + "node_modules/is-plain-obj": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/is-plain-obj/-/is-plain-obj-3.0.0.tgz", + "integrity": "sha512-gwsOE28k+23GP1B6vFl1oVh/WOzmawBrKwo5Ev6wMKzPkaXaCDIQKzLnvsA42DRlbVTWorkgTKIviAKCWkfUwA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/is-plain-object": { + "version": "2.0.4", + "resolved": "https://registry.npmmirror.com/is-plain-object/-/is-plain-object-2.0.4.tgz", + "integrity": "sha512-h5PpgXkWitc38BBMYawTYMWJHFZJVnBquFE57xFpjB8pJFiF6gZ+bU+WyI/yqXiFR5mdLsgYNaPe8uao6Uv9Og==", + "dev": true, + "license": "MIT", + "dependencies": { + "isobject": "^3.0.1" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/is-stream": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/is-stream/-/is-stream-2.0.1.tgz", + "integrity": "sha512-hFoiJiTl63nn+kstHGBtewWSKnQLpyb155KHheA1l39uvtO9nWIop1p3udqPcUd/xbF1VLMO4n7OI6p7RbngDg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/is-wsl": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/is-wsl/-/is-wsl-2.2.0.tgz", + "integrity": "sha512-fKzAra0rGJUUBwGBgNkHZuToZcn+TtXHpeCgmkMJMMYx1sQDYaCSyjJBSCa2nH1DGm7s3n1oBnohoVTBaN7Lww==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-docker": "^2.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/isarray": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/isarray/-/isarray-1.0.0.tgz", + "integrity": 
"sha512-VLghIWNM6ELQzo7zwmcg0NmTVyWKYjvIeM83yjp0wRDTmUnrM678fQbcKBo6n2CJEF0szoG//ytg+TKla89ALQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/isexe": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/isexe/-/isexe-2.0.0.tgz", + "integrity": "sha512-RHxMLp9lnKHGHRng9QFhRCMbYAcVpn69smSGcq3f36xjgVVWThj4qqLbTLlq7Ssj8B+fIQ1EuCEGI2lKsyQeIw==", + "license": "ISC" + }, + "node_modules/isobject": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/isobject/-/isobject-3.0.1.tgz", + "integrity": "sha512-WhB9zCku7EGTj/HQQRz5aUQEUeoQZH2bWcltRErOpymJ4boYE6wL9Tbr23krRPSZ+C5zqNSrSw+Cc7sZZ4b7vg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/jest-worker": { + "version": "27.5.1", + "resolved": "https://registry.npmmirror.com/jest-worker/-/jest-worker-27.5.1.tgz", + "integrity": "sha512-7vuh85V5cdDofPyxn58nrPjBktZo0u9x1g8WtjQol+jZDaE+fhN+cIvTj11GndBnMnyfrUOG1sZQxCdjKh+DKg==", + "license": "MIT", + "dependencies": { + "@types/node": "*", + "merge-stream": "^2.0.0", + "supports-color": "^8.0.0" + }, + "engines": { + "node": ">= 10.13.0" + } + }, + "node_modules/jest-worker/node_modules/supports-color": { + "version": "8.1.1", + "resolved": "https://registry.npmmirror.com/supports-color/-/supports-color-8.1.1.tgz", + "integrity": "sha512-MpUEN2OodtUzxvKQl72cUF7RQ5EiHsGvSsVG0ia9c5RbWGL2CI4C7EpPS8UTBIplnlzZiNuV56w+FuNxy3ty2Q==", + "license": "MIT", + "dependencies": { + "has-flag": "^4.0.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/supports-color?sponsor=1" + } + }, + "node_modules/json-parse-even-better-errors": { + "version": "2.3.1", + "resolved": "https://registry.npmmirror.com/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz", + "integrity": "sha512-xyFwyhro/JEof6Ghe2iz2NcXoj2sloNsWr/XsERDK/oiPCfaNhl5ONfp+jQdAZRQQ0IJWNzH9zIZF7li91kh2w==", + "license": "MIT" + }, + "node_modules/json-schema-traverse": { + 
"version": "1.0.0", + "resolved": "https://registry.npmmirror.com/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz", + "integrity": "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug==", + "license": "MIT" + }, + "node_modules/kind-of": { + "version": "6.0.3", + "resolved": "https://registry.npmmirror.com/kind-of/-/kind-of-6.0.3.tgz", + "integrity": "sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/launch-editor": { + "version": "2.10.0", + "resolved": "https://registry.npmmirror.com/launch-editor/-/launch-editor-2.10.0.tgz", + "integrity": "sha512-D7dBRJo/qcGX9xlvt/6wUYzQxjh5G1RvZPgPv8vi4KRU99DVQL/oW7tnVOCCTm2HGeo3C5HvGE5Yrh6UBoZ0vA==", + "dev": true, + "license": "MIT", + "dependencies": { + "picocolors": "^1.0.0", + "shell-quote": "^1.8.1" + } + }, + "node_modules/lit": { + "version": "3.2.1", + "resolved": "https://registry.npmmirror.com/lit/-/lit-3.2.1.tgz", + "integrity": "sha512-1BBa1E/z0O9ye5fZprPtdqnc0BFzxIxTTOO/tQFmyC/hj1O3jL4TfmLBw0WEwjAokdLwpclkvGgDJwTIh0/22w==", + "license": "BSD-3-Clause", + "dependencies": { + "@lit/reactive-element": "^2.0.4", + "lit-element": "^4.1.0", + "lit-html": "^3.2.0" + } + }, + "node_modules/lit-element": { + "version": "4.1.1", + "resolved": "https://registry.npmmirror.com/lit-element/-/lit-element-4.1.1.tgz", + "integrity": "sha512-HO9Tkkh34QkTeUmEdNYhMT8hzLid7YlMlATSi1q4q17HE5d9mrrEHJ/o8O2D0cMi182zK1F3v7x0PWFjrhXFew==", + "license": "BSD-3-Clause", + "dependencies": { + "@lit-labs/ssr-dom-shim": "^1.2.0", + "@lit/reactive-element": "^2.0.4", + "lit-html": "^3.2.0" + } + }, + "node_modules/lit-html": { + "version": "3.2.1", + "resolved": "https://registry.npmmirror.com/lit-html/-/lit-html-3.2.1.tgz", + "integrity": "sha512-qI/3lziaPMSKsrwlxH/xMgikhQ0EGOX2ICU73Bi/YHFvz2j/yMCIrw4+puF2IpQ4+upd3EWbvnHM9+PnJn48YA==", + "license": 
"BSD-3-Clause", + "dependencies": { + "@types/trusted-types": "^2.0.2" + } + }, + "node_modules/loader-runner": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/loader-runner/-/loader-runner-4.3.0.tgz", + "integrity": "sha512-3R/1M+yS3j5ou80Me59j7F9IMs4PXs3VqRrm0TU3AbKPxlmpoY1TNscJV/oGJXo8qCatFGTfDbY6W6ipGOYXfg==", + "license": "MIT", + "engines": { + "node": ">=6.11.5" + } + }, + "node_modules/locate-path": { + "version": "5.0.0", + "resolved": "https://registry.npmmirror.com/locate-path/-/locate-path-5.0.0.tgz", + "integrity": "sha512-t7hw9pI+WvuwNJXwk5zVHpyhIqzg2qTlklJOf0mVxGSbe3Fp2VieZcduNYjaLDoy6p9uGpQEGWG87WpMKlNq8g==", + "dev": true, + "license": "MIT", + "dependencies": { + "p-locate": "^4.1.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/lodash": { + "version": "4.17.21", + "resolved": "https://registry.npmmirror.com/lodash/-/lodash-4.17.21.tgz", + "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==", + "license": "MIT" + }, + "node_modules/lower-case": { + "version": "2.0.2", + "resolved": "https://registry.npmmirror.com/lower-case/-/lower-case-2.0.2.tgz", + "integrity": "sha512-7fm3l3NAF9WfN6W3JOmf5drwpVqX78JtoGJ3A6W0a6ZnldM41w2fV5D490psKFTpMds8TJse/eHLFFsNHHjHgg==", + "dev": true, + "license": "MIT", + "dependencies": { + "tslib": "^2.0.3" + } + }, + "node_modules/math-intrinsics": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/math-intrinsics/-/math-intrinsics-1.1.0.tgz", + "integrity": "sha512-/IXtbwEk5HTPyEwyKX6hGkYXxM9nbj64B+ilVJnC/R6B0pH5G4V3b0pVbL7DBj4tkhBAppbQUlf6F6Xl9LHu1g==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + } + }, + "node_modules/media-typer": { + "version": "0.3.0", + "resolved": "https://registry.npmmirror.com/media-typer/-/media-typer-0.3.0.tgz", + "integrity": "sha512-dq+qelQ9akHpcOl/gUVRTxVIOkAJ1wR3QAvb4RsVjS8oVoFjDGTc679wJYmUmknUF5HwMLOgb5O+a3KxfWapPQ==", + "dev": true, + "license": 
"MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/memfs": { + "version": "3.5.3", + "resolved": "https://registry.npmmirror.com/memfs/-/memfs-3.5.3.tgz", + "integrity": "sha512-UERzLsxzllchadvbPs5aolHh65ISpKpM+ccLbOJ8/vvpBKmAWf+la7dXFy7Mr0ySHbdHrFv5kGFCUHHe6GFEmw==", + "dev": true, + "license": "Unlicense", + "dependencies": { + "fs-monkey": "^1.0.4" + }, + "engines": { + "node": ">= 4.0.0" + } + }, + "node_modules/merge-descriptors": { + "version": "1.0.3", + "resolved": "https://registry.npmmirror.com/merge-descriptors/-/merge-descriptors-1.0.3.tgz", + "integrity": "sha512-gaNvAS7TZ897/rVaZ0nMtAyxNyi/pdbjbAwUpFQpN70GqnVfOiXpeUUMKRBmzXaSQ8DdTX4/0ms62r2K+hE6mQ==", + "dev": true, + "license": "MIT", + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/merge-stream": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/merge-stream/-/merge-stream-2.0.0.tgz", + "integrity": "sha512-abv/qOcuPfk3URPfDzmZU1LKmuw8kT+0nIHvKrKgFrwifol/doWcdA4ZqsWQ8ENrFKkd67Mfpo/LovbIUsbt3w==", + "license": "MIT" + }, + "node_modules/methods": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/methods/-/methods-1.1.2.tgz", + "integrity": "sha512-iclAHeNqNm68zFtnZ0e+1L2yUIdvzNoauKU4WBA3VvH/vPFieF7qfRlwUZU+DA9P9bPXIS90ulxoUoCH23sV2w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/micromatch": { + "version": "4.0.8", + "resolved": "https://registry.npmmirror.com/micromatch/-/micromatch-4.0.8.tgz", + "integrity": "sha512-PXwfBhYu0hBCPw8Dn0E+WDYb7af3dSLVWKi3HGv84IdF4TyFoC0ysxFd0Goxw7nSv4T/PzEJQxsYsEiFCKo2BA==", + "dev": true, + "license": "MIT", + "dependencies": { + "braces": "^3.0.3", + "picomatch": "^2.3.1" + }, + "engines": { + "node": ">=8.6" + } + }, + "node_modules/mime": { + "version": "1.6.0", + "resolved": "https://registry.npmmirror.com/mime/-/mime-1.6.0.tgz", + "integrity": 
"sha512-x0Vn8spI+wuJ1O6S7gnbaQg8Pxh4NNHb7KSINmEWKiPE4RKOplvijn+NkmYmmRgP68mc70j2EbeTFRsrswaQeg==", + "dev": true, + "license": "MIT", + "bin": { + "mime": "cli.js" + }, + "engines": { + "node": ">=4" + } + }, + "node_modules/mime-db": { + "version": "1.52.0", + "resolved": "https://registry.npmmirror.com/mime-db/-/mime-db-1.52.0.tgz", + "integrity": "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==", + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/mime-types": { + "version": "2.1.35", + "resolved": "https://registry.npmmirror.com/mime-types/-/mime-types-2.1.35.tgz", + "integrity": "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==", + "license": "MIT", + "dependencies": { + "mime-db": "1.52.0" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/mimic-fn": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/mimic-fn/-/mimic-fn-2.1.0.tgz", + "integrity": "sha512-OqbOk5oEQeAZ8WXWydlu9HJjz9WVdEIvamMCcXmuqUYjTknH/sqsWvhQ3vgwKFRR1HpjvNBKQ37nbJgYzGqGcg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/minimalistic-assert": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/minimalistic-assert/-/minimalistic-assert-1.0.1.tgz", + "integrity": "sha512-UtJcAD4yEaGtjPezWuO9wC4nwUnVH/8/Im3yEHQP4b67cXlD/Qr9hdITCU1xDbSEXg2XKNaP8jsReV7vQd00/A==", + "dev": true, + "license": "ISC" + }, + "node_modules/minimatch": { + "version": "3.1.2", + "resolved": "https://registry.npmmirror.com/minimatch/-/minimatch-3.1.2.tgz", + "integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==", + "license": "ISC", + "dependencies": { + "brace-expansion": "^1.1.7" + }, + "engines": { + "node": "*" + } + }, + "node_modules/ms": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/ms/-/ms-2.0.0.tgz", + "integrity": 
"sha512-Tpp60P6IUJDTuOq/5Z8cdskzJujfwqfOTkrwIwj7IRISpnkJnT6SyJ4PCPnGMoFjC9ddhal5KVIYtAt97ix05A==", + "dev": true, + "license": "MIT" + }, + "node_modules/multicast-dns": { + "version": "7.2.5", + "resolved": "https://registry.npmmirror.com/multicast-dns/-/multicast-dns-7.2.5.tgz", + "integrity": "sha512-2eznPJP8z2BFLX50tf0LuODrpINqP1RVIm/CObbTcBRITQgmC/TjcREF1NeTBzIcR5XO/ukWo+YHOjBbFwIupg==", + "dev": true, + "license": "MIT", + "dependencies": { + "dns-packet": "^5.2.2", + "thunky": "^1.0.2" + }, + "bin": { + "multicast-dns": "cli.js" + } + }, + "node_modules/nanoid": { + "version": "3.3.11", + "resolved": "https://registry.npmmirror.com/nanoid/-/nanoid-3.3.11.tgz", + "integrity": "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "bin": { + "nanoid": "bin/nanoid.cjs" + }, + "engines": { + "node": "^10 || ^12 || ^13.7 || ^14 || >=15.0.1" + } + }, + "node_modules/negotiator": { + "version": "0.6.4", + "resolved": "https://registry.npmmirror.com/negotiator/-/negotiator-0.6.4.tgz", + "integrity": "sha512-myRT3DiWPHqho5PrJaIRyaMv2kgYf0mUVgBNOYMuCH5Ki1yEiQaf/ZJuQ62nvpc44wL5WDbTX7yGJi1Neevw8w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/neo-async": { + "version": "2.6.2", + "resolved": "https://registry.npmmirror.com/neo-async/-/neo-async-2.6.2.tgz", + "integrity": "sha512-Yd3UES5mWCSqR+qNT93S3UoYUkqAZ9lLg8a7g9rimsWmYGK8cVToA4/sF3RrshdyV3sAGMXVUmpMYOw+dLpOuw==", + "license": "MIT" + }, + "node_modules/no-case": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/no-case/-/no-case-3.0.4.tgz", + "integrity": "sha512-fgAN3jGAh+RoxUGZHTSOLJIqUc2wmoBwGR4tbpNAKmmovFoWq0OdRkb0VkldReO2a2iBT/OEulG9XSUc10r3zg==", + "dev": true, + "license": "MIT", + "dependencies": { + "lower-case": "^2.0.2", + "tslib": "^2.0.3" + } + }, + "node_modules/node-forge": { + 
"version": "1.3.1", + "resolved": "https://registry.npmmirror.com/node-forge/-/node-forge-1.3.1.tgz", + "integrity": "sha512-dPEtOeMvF9VMcYV/1Wb8CPoVAXtp6MKMlcbAt4ddqmGqUJ6fQZFXkNZNkNlfevtNkGtaSoXf/vNNNSvgrdXwtA==", + "dev": true, + "license": "(BSD-3-Clause OR GPL-2.0)", + "engines": { + "node": ">= 6.13.0" + } + }, + "node_modules/node-releases": { + "version": "2.0.19", + "resolved": "https://registry.npmmirror.com/node-releases/-/node-releases-2.0.19.tgz", + "integrity": "sha512-xxOWJsBKtzAq7DY0J+DTzuz58K8e7sJbdgwkbMWQe8UYB6ekmsQ45q0M/tJDsGaZmbC+l7n57UV8Hl5tHxO9uw==", + "license": "MIT" + }, + "node_modules/normalize-path": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/normalize-path/-/normalize-path-3.0.0.tgz", + "integrity": "sha512-6eZs5Ls3WtCisHWp9S2GUy8dqkpGi4BVSz3GaqiE6ezub0512ESztXUwUB6C6IKbQkY2Pnb/mD4WYojCRwcwLA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/npm-run-path": { + "version": "4.0.1", + "resolved": "https://registry.npmmirror.com/npm-run-path/-/npm-run-path-4.0.1.tgz", + "integrity": "sha512-S48WzZW777zhNIrn7gxOlISNAqi9ZC/uQFnRdbeIHhZhCA6UqpkOT8T1G7BvfdgP4Er8gF4sUbaS0i7QvIfCWw==", + "dev": true, + "license": "MIT", + "dependencies": { + "path-key": "^3.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/nth-check": { + "version": "2.1.1", + "resolved": "https://registry.npmmirror.com/nth-check/-/nth-check-2.1.1.tgz", + "integrity": "sha512-lqjrjmaOoAnWfMmBPL+XNnynZh2+swxiX3WUE0s4yEHI6m+AwrK2UZOimIRl3X/4QctVqS8AiZjFqyOGrMXb/w==", + "dev": true, + "license": "BSD-2-Clause", + "dependencies": { + "boolbase": "^1.0.0" + }, + "funding": { + "url": "https://github.com/fb55/nth-check?sponsor=1" + } + }, + "node_modules/object-assign": { + "version": "4.1.1", + "resolved": "https://registry.npmmirror.com/object-assign/-/object-assign-4.1.1.tgz", + "integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==", + 
"license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/object-inspect": { + "version": "1.13.4", + "resolved": "https://registry.npmmirror.com/object-inspect/-/object-inspect-1.13.4.tgz", + "integrity": "sha512-W67iLl4J2EXEGTbfeHCffrjDfitvLANg0UlX3wFUUSTx92KXRFegMHUVgSqE+wvhAbi4WqjGg9czysTV2Epbew==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/obuf": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/obuf/-/obuf-1.1.2.tgz", + "integrity": "sha512-PX1wu0AmAdPqOL1mWhqmlOd8kOIZQwGZw6rh7uby9fTc5lhaOWFLX3I6R1hrF9k3zUY40e6igsLGkDXK92LJNg==", + "dev": true, + "license": "MIT" + }, + "node_modules/on-finished": { + "version": "2.4.1", + "resolved": "https://registry.npmmirror.com/on-finished/-/on-finished-2.4.1.tgz", + "integrity": "sha512-oVlzkg3ENAhCk2zdv7IJwd/QUD4z2RxRwpkcGY8psCVcCYZNq4wYnVWALHM+brtuJjePWiYF/ClmuDr8Ch5+kg==", + "dev": true, + "license": "MIT", + "dependencies": { + "ee-first": "1.1.1" + }, + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/on-headers": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/on-headers/-/on-headers-1.0.2.tgz", + "integrity": "sha512-pZAE+FJLoyITytdqK0U5s+FIpjN0JP3OzFi/u8Rx+EV5/W+JTWGXG8xFzevE7AjBfDqHv/8vL8qQsIhHnqRkrA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/once": { + "version": "1.4.0", + "resolved": "https://registry.npmmirror.com/once/-/once-1.4.0.tgz", + "integrity": "sha512-lNaJgI+2Q5URQBkccEKHTQOPaXdUxnZZElQTZY0MFUAuaEqe1E+Nyvgdz/aIyNi6Z9MzO5dv1H8n58/GELp3+w==", + "license": "ISC", + "dependencies": { + "wrappy": "1" + } + }, + "node_modules/onetime": { + "version": "5.1.2", + "resolved": "https://registry.npmmirror.com/onetime/-/onetime-5.1.2.tgz", + "integrity": "sha512-kbpaSSGJTWdAY5KPVeMOKXSrPtr8C8C7wodJbcsd51jRnmD+GZu8Y0VoU6Dm5Z4vWr0Ig/1NKuWRKf7j5aaYSg==", + "dev": true, + 
"license": "MIT", + "dependencies": { + "mimic-fn": "^2.1.0" + }, + "engines": { + "node": ">=6" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/open": { + "version": "8.4.2", + "resolved": "https://registry.npmmirror.com/open/-/open-8.4.2.tgz", + "integrity": "sha512-7x81NCL719oNbsq/3mh+hVrAWmFuEYUqrq/Iw3kUzH8ReypT9QQ0BLoJS7/G9k6N81XjW4qHWtjWwe/9eLy1EQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "define-lazy-prop": "^2.0.0", + "is-docker": "^2.1.1", + "is-wsl": "^2.2.0" + }, + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/p-limit": { + "version": "2.3.0", + "resolved": "https://registry.npmmirror.com/p-limit/-/p-limit-2.3.0.tgz", + "integrity": "sha512-//88mFWSJx8lxCzwdAABTJL2MyWB12+eIY7MDL2SqLmAkeKU9qxRvWuSyTjm3FUmpBEMuFfckAIqEaVGUDxb6w==", + "dev": true, + "license": "MIT", + "dependencies": { + "p-try": "^2.0.0" + }, + "engines": { + "node": ">=6" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/p-locate": { + "version": "4.1.0", + "resolved": "https://registry.npmmirror.com/p-locate/-/p-locate-4.1.0.tgz", + "integrity": "sha512-R79ZZ/0wAxKGu3oYMlz8jy/kbhsNrS7SKZ7PxEHBgJ5+F2mtFW2fK2cOtBh1cHYkQsbzFV7I+EoRKe6Yt0oK7A==", + "dev": true, + "license": "MIT", + "dependencies": { + "p-limit": "^2.2.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/p-map": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/p-map/-/p-map-2.1.0.tgz", + "integrity": "sha512-y3b8Kpd8OAN444hxfBbFfj1FY/RjtTd8tzYwhUqNYXx0fXx2iX4maP4Qr6qhIKbQXI02wTLAda4fYUbDagTUFw==", + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/p-retry": { + "version": "4.6.2", + "resolved": "https://registry.npmmirror.com/p-retry/-/p-retry-4.6.2.tgz", + "integrity": "sha512-312Id396EbJdvRONlngUx0NydfrIQ5lsYu0znKVUzVvArzEIt08V1qhtyESbGVd1FGX7UKtiFp5uwKZdM8wIuQ==", + "dev": 
true, + "license": "MIT", + "dependencies": { + "@types/retry": "0.12.0", + "retry": "^0.13.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/p-try": { + "version": "2.2.0", + "resolved": "https://registry.npmmirror.com/p-try/-/p-try-2.2.0.tgz", + "integrity": "sha512-R4nPAVTAU0B9D35/Gk3uJf/7XYbQcyohSKdvAxIRSNghFl4e71hVoGnBNQz9cWaXxO2I10KTC+3jMdvvoKw6dQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/param-case": { + "version": "3.0.4", + "resolved": "https://registry.npmmirror.com/param-case/-/param-case-3.0.4.tgz", + "integrity": "sha512-RXlj7zCYokReqWpOPH9oYivUzLYZ5vAPIfEmCTNViosC78F8F0H9y7T7gG2M39ymgutxF5gcFEsyZQSph9Bp3A==", + "dev": true, + "license": "MIT", + "dependencies": { + "dot-case": "^3.0.4", + "tslib": "^2.0.3" + } + }, + "node_modules/parse5": { + "version": "7.2.1", + "resolved": "https://registry.npmmirror.com/parse5/-/parse5-7.2.1.tgz", + "integrity": "sha512-BuBYQYlv1ckiPdQi/ohiivi9Sagc9JG+Ozs0r7b/0iK3sKmrb0b9FdWdBbOdx6hBCM/F9Ir82ofnBhtZOjCRPQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "entities": "^4.5.0" + }, + "funding": { + "url": "https://github.com/inikulin/parse5?sponsor=1" + } + }, + "node_modules/parseurl": { + "version": "1.3.3", + "resolved": "https://registry.npmmirror.com/parseurl/-/parseurl-1.3.3.tgz", + "integrity": "sha512-CiyeOxFT/JZyN5m0z9PfXw4SCBJ6Sygz1Dpl0wqjlhDEGGBP1GnsUVEL0p63hoG1fcj3fHynXi9NYO4nWOL+qQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/pascal-case": { + "version": "3.1.2", + "resolved": "https://registry.npmmirror.com/pascal-case/-/pascal-case-3.1.2.tgz", + "integrity": "sha512-uWlGT3YSnK9x3BQJaOdcZwrnV6hPpd8jFH1/ucpiLRPh/2zCVJKS19E4GvYHvaCcACn3foXZ0cLB9Wrx1KGe5g==", + "dev": true, + "license": "MIT", + "dependencies": { + "no-case": "^3.0.4", + "tslib": "^2.0.3" + } + }, + "node_modules/path-exists": { + "version": "4.0.0", + "resolved": 
"https://registry.npmmirror.com/path-exists/-/path-exists-4.0.0.tgz", + "integrity": "sha512-ak9Qy5Q7jYb2Wwcey5Fpvg2KoAc/ZIhLSLOSBmRmygPsGwkVVt0fZa0qrtMz+m6tJTAHfZQ8FnmB4MG4LWy7/w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/path-is-absolute": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/path-is-absolute/-/path-is-absolute-1.0.1.tgz", + "integrity": "sha512-AVbw3UJ2e9bq64vSaS9Am0fje1Pa8pbGqTTsmXfaIiMpnr5DlDhfJOuLj9Sf95ZPVDAUerDfEk88MPmPe7UCQg==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/path-is-inside": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/path-is-inside/-/path-is-inside-1.0.2.tgz", + "integrity": "sha512-DUWJr3+ULp4zXmol/SZkFf3JGsS9/SIv+Y3Rt93/UjPpDpklB5f1er4O3POIbUuUJ3FXgqte2Q7SrU6zAqwk8w==", + "license": "(WTFPL OR MIT)" + }, + "node_modules/path-key": { + "version": "3.1.1", + "resolved": "https://registry.npmmirror.com/path-key/-/path-key-3.1.1.tgz", + "integrity": "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q==", + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/path-parse": { + "version": "1.0.7", + "resolved": "https://registry.npmmirror.com/path-parse/-/path-parse-1.0.7.tgz", + "integrity": "sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw==", + "dev": true, + "license": "MIT" + }, + "node_modules/path-to-regexp": { + "version": "0.1.12", + "resolved": "https://registry.npmmirror.com/path-to-regexp/-/path-to-regexp-0.1.12.tgz", + "integrity": "sha512-RA1GjUVMnvYFxuqovrEqZoxxW5NUZqbwKtYz/Tt7nXerk0LbLblQmrsgdeOxV5SFHf0UDggjS/bSeOZwt1pmEQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/picocolors": { + "version": "1.1.1", + "resolved": "https://registry.npmmirror.com/picocolors/-/picocolors-1.1.1.tgz", + "integrity": 
"sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==", + "license": "ISC" + }, + "node_modules/picomatch": { + "version": "2.3.1", + "resolved": "https://registry.npmmirror.com/picomatch/-/picomatch-2.3.1.tgz", + "integrity": "sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8.6" + }, + "funding": { + "url": "https://github.com/sponsors/jonschlinkert" + } + }, + "node_modules/pify": { + "version": "4.0.1", + "resolved": "https://registry.npmmirror.com/pify/-/pify-4.0.1.tgz", + "integrity": "sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g==", + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/pinkie": { + "version": "2.0.4", + "resolved": "https://registry.npmmirror.com/pinkie/-/pinkie-2.0.4.tgz", + "integrity": "sha512-MnUuEycAemtSaeFSjXKW/aroV7akBbY+Sv+RkyqFjgAe73F+MR0TBWKBRDkmfWq/HiFmdavfZ1G7h4SPZXaCSg==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/pinkie-promise": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/pinkie-promise/-/pinkie-promise-2.0.1.tgz", + "integrity": "sha512-0Gni6D4UcLTbv9c57DfxDGdr41XfgUjqWZu492f0cIGr16zDU06BWP/RAEvOuo7CQ0CNjHaLlM59YJJFm3NWlw==", + "license": "MIT", + "dependencies": { + "pinkie": "^2.0.0" + }, + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/pkg-dir": { + "version": "4.2.0", + "resolved": "https://registry.npmmirror.com/pkg-dir/-/pkg-dir-4.2.0.tgz", + "integrity": "sha512-HRDzbaKjC+AOWVXxAU/x54COGeIv9eb+6CkDSQoNTt4XyWoIJvuPsXizxu/Fr23EiekbtZwmh1IcIG/l/a10GQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "find-up": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/postcss": { + "version": "8.5.3", + "resolved": "https://registry.npmmirror.com/postcss/-/postcss-8.5.3.tgz", + "integrity": 
"sha512-dle9A3yYxlBSrt8Fu+IpjGT8SY8hN0mlaA6GY8t0P5PjIOZemULz/E2Bnm/2dcUOena75OTNkHI76uZBNUUq3A==", + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/postcss/" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/postcss" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "nanoid": "^3.3.8", + "picocolors": "^1.1.1", + "source-map-js": "^1.2.1" + }, + "engines": { + "node": "^10 || ^12 || >=14" + } + }, + "node_modules/postcss-modules-extract-imports": { + "version": "3.1.0", + "resolved": "https://registry.npmmirror.com/postcss-modules-extract-imports/-/postcss-modules-extract-imports-3.1.0.tgz", + "integrity": "sha512-k3kNe0aNFQDAZGbin48pL2VNidTF0w4/eASDsxlyspobzU3wZQLOGj7L9gfRe0Jo9/4uud09DsjFNH7winGv8Q==", + "license": "ISC", + "engines": { + "node": "^10 || ^12 || >= 14" + }, + "peerDependencies": { + "postcss": "^8.1.0" + } + }, + "node_modules/postcss-modules-local-by-default": { + "version": "4.2.0", + "resolved": "https://registry.npmmirror.com/postcss-modules-local-by-default/-/postcss-modules-local-by-default-4.2.0.tgz", + "integrity": "sha512-5kcJm/zk+GJDSfw+V/42fJ5fhjL5YbFDl8nVdXkJPLLW+Vf9mTD5Xe0wqIaDnLuL2U6cDNpTr+UQ+v2HWIBhzw==", + "license": "MIT", + "dependencies": { + "icss-utils": "^5.0.0", + "postcss-selector-parser": "^7.0.0", + "postcss-value-parser": "^4.1.0" + }, + "engines": { + "node": "^10 || ^12 || >= 14" + }, + "peerDependencies": { + "postcss": "^8.1.0" + } + }, + "node_modules/postcss-modules-scope": { + "version": "3.2.1", + "resolved": "https://registry.npmmirror.com/postcss-modules-scope/-/postcss-modules-scope-3.2.1.tgz", + "integrity": "sha512-m9jZstCVaqGjTAuny8MdgE88scJnCiQSlSrOWcTQgM2t32UBe+MUmFSO5t7VMSfAf/FJKImAxBav8ooCHJXCJA==", + "license": "ISC", + "dependencies": { + "postcss-selector-parser": "^7.0.0" + }, + "engines": { + "node": "^10 || ^12 || >= 14" + }, + "peerDependencies": { + "postcss": 
"^8.1.0" + } + }, + "node_modules/postcss-modules-values": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/postcss-modules-values/-/postcss-modules-values-4.0.0.tgz", + "integrity": "sha512-RDxHkAiEGI78gS2ofyvCsu7iycRv7oqw5xMWn9iMoR0N/7mf9D50ecQqUo5BZ9Zh2vH4bCUR/ktCqbB9m8vJjQ==", + "license": "ISC", + "dependencies": { + "icss-utils": "^5.0.0" + }, + "engines": { + "node": "^10 || ^12 || >= 14" + }, + "peerDependencies": { + "postcss": "^8.1.0" + } + }, + "node_modules/postcss-selector-parser": { + "version": "7.1.0", + "resolved": "https://registry.npmmirror.com/postcss-selector-parser/-/postcss-selector-parser-7.1.0.tgz", + "integrity": "sha512-8sLjZwK0R+JlxlYcTuVnyT2v+htpdrjDOKuMcOVdYjt52Lh8hWRYpxBPoKx/Zg+bcjc3wx6fmQevMmUztS/ccA==", + "license": "MIT", + "dependencies": { + "cssesc": "^3.0.0", + "util-deprecate": "^1.0.2" + }, + "engines": { + "node": ">=4" + } + }, + "node_modules/postcss-value-parser": { + "version": "4.2.0", + "resolved": "https://registry.npmmirror.com/postcss-value-parser/-/postcss-value-parser-4.2.0.tgz", + "integrity": "sha512-1NNCs6uurfkVbeXG4S8JFT9t19m45ICnif8zWLd5oPSZ50QnwMfK+H3jv408d4jw/7Bttv5axS5IiHoLaVNHeQ==", + "license": "MIT" + }, + "node_modules/prettier": { + "version": "3.5.3", + "resolved": "https://registry.npmmirror.com/prettier/-/prettier-3.5.3.tgz", + "integrity": "sha512-QQtaxnoDJeAkDvDKWCLiwIXkTgRhwYDEQCghU9Z6q03iyek/rxRh/2lC3HB7P8sWT2xC/y5JDctPLBIGzHKbhw==", + "license": "MIT", + "bin": { + "prettier": "bin/prettier.cjs" + }, + "engines": { + "node": ">=14" + }, + "funding": { + "url": "https://github.com/prettier/prettier?sponsor=1" + } + }, + "node_modules/pretty-error": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/pretty-error/-/pretty-error-4.0.0.tgz", + "integrity": "sha512-AoJ5YMAcXKYxKhuJGdcvse+Voc6v1RgnsR3nWcYU7q4t6z0Q6T86sv5Zq8VIRbOWWFpvdGE83LtdSMNd+6Y0xw==", + "dev": true, + "license": "MIT", + "dependencies": { + "lodash": "^4.17.20", + "renderkid": "^3.0.0" + } 
+ }, + "node_modules/process-nextick-args": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/process-nextick-args/-/process-nextick-args-2.0.1.tgz", + "integrity": "sha512-3ouUOpQhtgrbOa17J7+uxOTpITYWaGP7/AhoR3+A+/1e9skrzelGi/dXzEYyvbxubEF6Wn2ypscTKiKJFFn1ag==", + "dev": true, + "license": "MIT" + }, + "node_modules/proxy-addr": { + "version": "2.0.7", + "resolved": "https://registry.npmmirror.com/proxy-addr/-/proxy-addr-2.0.7.tgz", + "integrity": "sha512-llQsMLSUDUPT44jdrU/O37qlnifitDP+ZwrmmZcoSKyLKvtZxpyV0n2/bD/N4tBAAZ/gJEdZU7KMraoK1+XYAg==", + "dev": true, + "license": "MIT", + "dependencies": { + "forwarded": "0.2.0", + "ipaddr.js": "1.9.1" + }, + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/proxy-addr/node_modules/ipaddr.js": { + "version": "1.9.1", + "resolved": "https://registry.npmmirror.com/ipaddr.js/-/ipaddr.js-1.9.1.tgz", + "integrity": "sha512-0KI/607xoxSToH7GjN1FfSbLoU0+btTicjsQSWQlh/hZykN8KpmMf7uYwPW3R+akZ6R/w18ZlXSHBYXiYUPO3g==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/qs": { + "version": "6.13.0", + "resolved": "https://registry.npmmirror.com/qs/-/qs-6.13.0.tgz", + "integrity": "sha512-+38qI9SOr8tfZ4QmJNplMUxqjbe7LKvvZgWdExBOmd+egZTtjLB67Gu0HRX3u/XOq7UU2Nx6nsjvS16Z9uwfpg==", + "dev": true, + "license": "BSD-3-Clause", + "dependencies": { + "side-channel": "^1.0.6" + }, + "engines": { + "node": ">=0.6" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/randombytes": { + "version": "2.1.0", + "resolved": "https://registry.npmmirror.com/randombytes/-/randombytes-2.1.0.tgz", + "integrity": "sha512-vYl3iOX+4CKUWuxGi9Ukhie6fsqXqS9FE2Zaic4tNFD2N2QQaXOMFbuKK4QmDHC0JO6B1Zp41J0LpT0oR68amQ==", + "license": "MIT", + "dependencies": { + "safe-buffer": "^5.1.0" + } + }, + "node_modules/range-parser": { + "version": "1.2.1", + "resolved": "https://registry.npmmirror.com/range-parser/-/range-parser-1.2.1.tgz", + "integrity": 
"sha512-Hrgsx+orqoygnmhFbKaHE6c296J+HTAQXoxEF6gNupROmmGJRoyzfG3ccAveqCBrwr/2yxQ5BVd/GTl5agOwSg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/raw-body": { + "version": "2.5.2", + "resolved": "https://registry.npmmirror.com/raw-body/-/raw-body-2.5.2.tgz", + "integrity": "sha512-8zGqypfENjCIqGhgXToC8aB2r7YrBX+AQAfIPs/Mlk+BtPTztOvTS01NRW/3Eh60J+a48lt8qsCzirQ6loCVfA==", + "dev": true, + "license": "MIT", + "dependencies": { + "bytes": "3.1.2", + "http-errors": "2.0.0", + "iconv-lite": "0.4.24", + "unpipe": "1.0.0" + }, + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/readable-stream": { + "version": "3.6.2", + "resolved": "https://registry.npmmirror.com/readable-stream/-/readable-stream-3.6.2.tgz", + "integrity": "sha512-9u/sniCrY3D5WdsERHzHE4G2YCXqoG5FTHUiCC4SIbr6XcLZBY05ya9EKjYek9O5xOAwjGq+1JdGBAS7Q9ScoA==", + "dev": true, + "license": "MIT", + "dependencies": { + "inherits": "^2.0.3", + "string_decoder": "^1.1.1", + "util-deprecate": "^1.0.1" + }, + "engines": { + "node": ">= 6" + } + }, + "node_modules/readdirp": { + "version": "3.6.0", + "resolved": "https://registry.npmmirror.com/readdirp/-/readdirp-3.6.0.tgz", + "integrity": "sha512-hOS089on8RduqdbhvQ5Z37A0ESjsqz6qnRcffsMU3495FuTdqSm+7bhJ29JvIOsBDEEnan5DPu9t3To9VRlMzA==", + "dev": true, + "license": "MIT", + "dependencies": { + "picomatch": "^2.2.1" + }, + "engines": { + "node": ">=8.10.0" + } + }, + "node_modules/rechoir": { + "version": "0.8.0", + "resolved": "https://registry.npmmirror.com/rechoir/-/rechoir-0.8.0.tgz", + "integrity": "sha512-/vxpCXddiX8NGfGO/mTafwjq4aFa/71pvamip0++IQk3zG8cbCj0fifNPrjjF1XMXUne91jL9OoxmdykoEtifQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "resolve": "^1.20.0" + }, + "engines": { + "node": ">= 10.13.0" + } + }, + "node_modules/relateurl": { + "version": "0.2.7", + "resolved": "https://registry.npmmirror.com/relateurl/-/relateurl-0.2.7.tgz", + "integrity": 
"sha512-G08Dxvm4iDN3MLM0EsP62EDV9IuhXPR6blNz6Utcp7zyV3tr4HVNINt6MpaRWbxoOHT3Q7YN2P+jaHX8vUbgog==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.10" + } + }, + "node_modules/renderkid": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/renderkid/-/renderkid-3.0.0.tgz", + "integrity": "sha512-q/7VIQA8lmM1hF+jn+sFSPWGlMkSAeNYcPLmDQx2zzuiDfaLrOmumR8iaUKlenFgh0XRPIUeSPlH3A+AW3Z5pg==", + "dev": true, + "license": "MIT", + "dependencies": { + "css-select": "^4.1.3", + "dom-converter": "^0.2.0", + "htmlparser2": "^6.1.0", + "lodash": "^4.17.21", + "strip-ansi": "^6.0.1" + } + }, + "node_modules/require-from-string": { + "version": "2.0.2", + "resolved": "https://registry.npmmirror.com/require-from-string/-/require-from-string-2.0.2.tgz", + "integrity": "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw==", + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/requires-port": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/requires-port/-/requires-port-1.0.0.tgz", + "integrity": "sha512-KigOCHcocU3XODJxsu8i/j8T9tzT4adHiecwORRQ0ZZFcp7ahwXuRU1m+yuO90C5ZUyGeGfocHDI14M3L3yDAQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/resolve": { + "version": "1.22.10", + "resolved": "https://registry.npmmirror.com/resolve/-/resolve-1.22.10.tgz", + "integrity": "sha512-NPRy+/ncIMeDlTAsuqwKIiferiawhefFJtkNSW0qZJEqMEb+qBt/77B/jGeeek+F0uOeN05CDa6HXbbIgtVX4w==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-core-module": "^2.16.0", + "path-parse": "^1.0.7", + "supports-preserve-symlinks-flag": "^1.0.0" + }, + "bin": { + "resolve": "bin/resolve" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/resolve-cwd": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/resolve-cwd/-/resolve-cwd-3.0.0.tgz", + "integrity": 
"sha512-OrZaX2Mb+rJCpH/6CpSqt9xFVpN++x01XnN2ie9g6P5/3xelLAkXWVADpdz1IHD/KFfEXyE6V0U01OQ3UO2rEg==", + "dev": true, + "license": "MIT", + "dependencies": { + "resolve-from": "^5.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/resolve-from": { + "version": "5.0.0", + "resolved": "https://registry.npmmirror.com/resolve-from/-/resolve-from-5.0.0.tgz", + "integrity": "sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/retry": { + "version": "0.13.1", + "resolved": "https://registry.npmmirror.com/retry/-/retry-0.13.1.tgz", + "integrity": "sha512-XQBQ3I8W1Cge0Seh+6gjj03LbmRFWuoszgK9ooCpwYIrhhoO80pfq4cUkU5DkknwfOfFteRwlZ56PYOGYyFWdg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 4" + } + }, + "node_modules/rimraf": { + "version": "2.7.1", + "resolved": "https://registry.npmmirror.com/rimraf/-/rimraf-2.7.1.tgz", + "integrity": "sha512-uWjbaKIK3T1OSVptzX7Nl6PvQ3qAGtKEtVRjRuazjfL3Bx5eI409VZSqgND+4UNnmzLVdPj9FqFJNPqBZFve4w==", + "deprecated": "Rimraf versions prior to v4 are no longer supported", + "license": "ISC", + "dependencies": { + "glob": "^7.1.3" + }, + "bin": { + "rimraf": "bin.js" + } + }, + "node_modules/rw": { + "version": "1.3.3", + "resolved": "https://registry.npmmirror.com/rw/-/rw-1.3.3.tgz", + "integrity": "sha512-PdhdWy89SiZogBLaw42zdeqtRJ//zFd2PgQavcICDUgJT5oW10QCRKbJ6bg4r0/UY2M6BWd5tkxuGFRvCkgfHQ==", + "license": "BSD-3-Clause" + }, + "node_modules/safe-buffer": { + "version": "5.2.1", + "resolved": "https://registry.npmmirror.com/safe-buffer/-/safe-buffer-5.2.1.tgz", + "integrity": "sha512-rp3So07KcdmmKbGvgaNxQSJr7bGVSVk5S9Eq1F+ppbRo70+YeaDxkw5Dd8NPN+GD6bjnYm2VuPuCXmpuYvmCXQ==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/feross" + }, + { + "type": "patreon", + "url": "https://www.patreon.com/feross" + }, + { + "type": "consulting", + "url": 
"https://feross.org/support" + } + ], + "license": "MIT" + }, + "node_modules/safer-buffer": { + "version": "2.1.2", + "resolved": "https://registry.npmmirror.com/safer-buffer/-/safer-buffer-2.1.2.tgz", + "integrity": "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg==", + "license": "MIT" + }, + "node_modules/schema-utils": { + "version": "4.3.0", + "resolved": "https://registry.npmmirror.com/schema-utils/-/schema-utils-4.3.0.tgz", + "integrity": "sha512-Gf9qqc58SpCA/xdziiHz35F4GNIWYWZrEshUc/G/r5BnLph6xpKuLeoJoQuj5WfBIx/eQLf+hmVPYHaxJu7V2g==", + "license": "MIT", + "dependencies": { + "@types/json-schema": "^7.0.9", + "ajv": "^8.9.0", + "ajv-formats": "^2.1.1", + "ajv-keywords": "^5.1.0" + }, + "engines": { + "node": ">= 10.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + } + }, + "node_modules/select-hose": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/select-hose/-/select-hose-2.0.0.tgz", + "integrity": "sha512-mEugaLK+YfkijB4fx0e6kImuJdCIt2LxCRcbEYPqRGCs4F2ogyfZU5IAZRdjCP8JPq2AtdNoC/Dux63d9Kiryg==", + "dev": true, + "license": "MIT" + }, + "node_modules/selfsigned": { + "version": "2.4.1", + "resolved": "https://registry.npmmirror.com/selfsigned/-/selfsigned-2.4.1.tgz", + "integrity": "sha512-th5B4L2U+eGLq1TVh7zNRGBapioSORUeymIydxgFpwww9d2qyKvtuPU2jJuHvYAwwqi2Y596QBL3eEqcPEYL8Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/node-forge": "^1.3.0", + "node-forge": "^1" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/semver": { + "version": "7.7.1", + "resolved": "https://registry.npmmirror.com/semver/-/semver-7.7.1.tgz", + "integrity": "sha512-hlq8tAfn0m/61p4BVRcPzIGr6LKiMwo4VM6dGi6pt4qcRkmNzTcWq6eCEjEh+qXjkMDvPlOFFSGwQjoEa6gyMA==", + "license": "ISC", + "bin": { + "semver": "bin/semver.js" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/send": { + "version": "0.19.0", + "resolved": 
"https://registry.npmmirror.com/send/-/send-0.19.0.tgz", + "integrity": "sha512-dW41u5VfLXu8SJh5bwRmyYUbAoSB3c9uQh6L8h/KtsFREPWpbX1lrljJo186Jc4nmci/sGUZ9a0a0J2zgfq2hw==", + "dev": true, + "license": "MIT", + "dependencies": { + "debug": "2.6.9", + "depd": "2.0.0", + "destroy": "1.2.0", + "encodeurl": "~1.0.2", + "escape-html": "~1.0.3", + "etag": "~1.8.1", + "fresh": "0.5.2", + "http-errors": "2.0.0", + "mime": "1.6.0", + "ms": "2.1.3", + "on-finished": "2.4.1", + "range-parser": "~1.2.1", + "statuses": "2.0.1" + }, + "engines": { + "node": ">= 0.8.0" + } + }, + "node_modules/send/node_modules/encodeurl": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/encodeurl/-/encodeurl-1.0.2.tgz", + "integrity": "sha512-TPJXq8JqFaVYm2CWmPvnP2Iyo4ZSM7/QKcSmuMLDObfpH5fi7RUGmd/rTDf+rut/saiDiQEeVTNgAmJEdAOx0w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/send/node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + "node_modules/serialize-javascript": { + "version": "6.0.2", + "resolved": "https://registry.npmmirror.com/serialize-javascript/-/serialize-javascript-6.0.2.tgz", + "integrity": "sha512-Saa1xPByTTq2gdeFZYLLo+RFE35NHZkAbqZeWNd3BpzppeVisAqpDjcp8dyf6uIvEqJRd46jemmyA4iFIeVk8g==", + "license": "BSD-3-Clause", + "dependencies": { + "randombytes": "^2.1.0" + } + }, + "node_modules/serve-index": { + "version": "1.9.1", + "resolved": "https://registry.npmmirror.com/serve-index/-/serve-index-1.9.1.tgz", + "integrity": "sha512-pXHfKNP4qujrtteMrSBb0rc8HJ9Ms/GrXwcUtUtD5s4ewDJI8bT3Cz2zTVRMKtri49pLx2e0Ya8ziP5Ya2pZZw==", + "dev": true, + "license": "MIT", + "dependencies": { + "accepts": "~1.3.4", + "batch": "0.6.1", + "debug": "2.6.9", + "escape-html": "~1.0.3", + "http-errors": "~1.6.2", + "mime-types": 
"~2.1.17", + "parseurl": "~1.3.2" + }, + "engines": { + "node": ">= 0.8.0" + } + }, + "node_modules/serve-index/node_modules/depd": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/depd/-/depd-1.1.2.tgz", + "integrity": "sha512-7emPTl6Dpo6JRXOXjLRxck+FlLRX5847cLKEn00PLAgc3g2hTZZgr+e4c2v6QpSmLeFP3n5yUo7ft6avBK/5jQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/serve-index/node_modules/http-errors": { + "version": "1.6.3", + "resolved": "https://registry.npmmirror.com/http-errors/-/http-errors-1.6.3.tgz", + "integrity": "sha512-lks+lVC8dgGyh97jxvxeYTWQFvh4uw4yC12gVl63Cg30sjPX4wuGcdkICVXDAESr6OJGjqGA8Iz5mkeN6zlD7A==", + "dev": true, + "license": "MIT", + "dependencies": { + "depd": "~1.1.2", + "inherits": "2.0.3", + "setprototypeof": "1.1.0", + "statuses": ">= 1.4.0 < 2" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/serve-index/node_modules/inherits": { + "version": "2.0.3", + "resolved": "https://registry.npmmirror.com/inherits/-/inherits-2.0.3.tgz", + "integrity": "sha512-x00IRNXNy63jwGkJmzPigoySHbaqpNuzKbBOmzK+g2OdZpQ9w+sxCN+VSB3ja7IAge2OP2qpfxTjeNcyjmW1uw==", + "dev": true, + "license": "ISC" + }, + "node_modules/serve-index/node_modules/setprototypeof": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/setprototypeof/-/setprototypeof-1.1.0.tgz", + "integrity": "sha512-BvE/TwpZX4FXExxOxZyRGQQv651MSwmWKZGqvmPcRIjDqWub67kTKuIMx43cZZrS/cBBzwBcNDWoFxt2XEFIpQ==", + "dev": true, + "license": "ISC" + }, + "node_modules/serve-index/node_modules/statuses": { + "version": "1.5.0", + "resolved": "https://registry.npmmirror.com/statuses/-/statuses-1.5.0.tgz", + "integrity": "sha512-OpZ3zP+jT1PI7I8nemJX4AKmAX070ZkYPVWV/AaKTJl+tXCTGyVdC1a4SL8RUQYEwk/f34ZX8UTykN68FwrqAA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/serve-static": { + "version": "1.16.2", + "resolved": 
"https://registry.npmmirror.com/serve-static/-/serve-static-1.16.2.tgz", + "integrity": "sha512-VqpjJZKadQB/PEbEwvFdO43Ax5dFBZ2UECszz8bQ7pi7wt//PWe1P6MN7eCnjsatYtBT6EuiClbjSWP2WrIoTw==", + "dev": true, + "license": "MIT", + "dependencies": { + "encodeurl": "~2.0.0", + "escape-html": "~1.0.3", + "parseurl": "~1.3.3", + "send": "0.19.0" + }, + "engines": { + "node": ">= 0.8.0" + } + }, + "node_modules/setprototypeof": { + "version": "1.2.0", + "resolved": "https://registry.npmmirror.com/setprototypeof/-/setprototypeof-1.2.0.tgz", + "integrity": "sha512-E5LDX7Wrp85Kil5bhZv46j8jOeboKq5JMmYM3gVGdGH8xFpPWXUMsNrlODCrkoxMEeNi/XZIwuRvY4XNwYMJpw==", + "dev": true, + "license": "ISC" + }, + "node_modules/shallow-clone": { + "version": "3.0.1", + "resolved": "https://registry.npmmirror.com/shallow-clone/-/shallow-clone-3.0.1.tgz", + "integrity": "sha512-/6KqX+GVUdqPuPPd2LxDDxzX6CAbjJehAAOKlNpqqUpAqPM6HeL8f+o3a+JsyGjn2lv0WY8UsTgUJjU9Ok55NA==", + "dev": true, + "license": "MIT", + "dependencies": { + "kind-of": "^6.0.2" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/shebang-command": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/shebang-command/-/shebang-command-2.0.0.tgz", + "integrity": "sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA==", + "license": "MIT", + "dependencies": { + "shebang-regex": "^3.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/shebang-regex": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/shebang-regex/-/shebang-regex-3.0.0.tgz", + "integrity": "sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A==", + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/shell-quote": { + "version": "1.8.2", + "resolved": "https://registry.npmmirror.com/shell-quote/-/shell-quote-1.8.2.tgz", + "integrity": 
"sha512-AzqKpGKjrj7EM6rKVQEPpB288oCfnrEIuyoT9cyF4nmGa7V8Zk6f7RRqYisX8X9m+Q7bd632aZW4ky7EhbQztA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/side-channel/-/side-channel-1.1.0.tgz", + "integrity": "sha512-ZX99e6tRweoUXqR+VBrslhda51Nh5MTQwou5tnUDgbtyM0dBgmhEDtWGP/xbKn6hqfPRHujUNwz5fy/wbbhnpw==", + "dev": true, + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "object-inspect": "^1.13.3", + "side-channel-list": "^1.0.0", + "side-channel-map": "^1.0.1", + "side-channel-weakmap": "^1.0.2" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-list": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/side-channel-list/-/side-channel-list-1.0.0.tgz", + "integrity": "sha512-FCLHtRD/gnpCiCHEiJLOwdmFP+wzCmDEkc9y7NsYxeF4u7Btsn1ZuwgwJGxImImHicJArLP4R0yX4c2KCrMrTA==", + "dev": true, + "license": "MIT", + "dependencies": { + "es-errors": "^1.3.0", + "object-inspect": "^1.13.3" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-map": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/side-channel-map/-/side-channel-map-1.0.1.tgz", + "integrity": "sha512-VCjCNfgMsby3tTdo02nbjtM/ewra6jPHmpThenkTYh8pG9ucZ/1P8So4u4FGBek/BjpOVsDCMoLA/iuBKIFXRA==", + "dev": true, + "license": "MIT", + "dependencies": { + "call-bound": "^1.0.2", + "es-errors": "^1.3.0", + "get-intrinsic": "^1.2.5", + "object-inspect": "^1.13.3" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/side-channel-weakmap": { + "version": "1.0.2", + "resolved": 
"https://registry.npmmirror.com/side-channel-weakmap/-/side-channel-weakmap-1.0.2.tgz", + "integrity": "sha512-WPS/HvHQTYnHisLo9McqBHOJk2FkHO/tlpvldyrnem4aeQp4hai3gythswg6p01oSoTl58rcpiFAjF2br2Ak2A==", + "dev": true, + "license": "MIT", + "dependencies": { + "call-bound": "^1.0.2", + "es-errors": "^1.3.0", + "get-intrinsic": "^1.2.5", + "object-inspect": "^1.13.3", + "side-channel-map": "^1.0.1" + }, + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/signal-exit": { + "version": "3.0.7", + "resolved": "https://registry.npmmirror.com/signal-exit/-/signal-exit-3.0.7.tgz", + "integrity": "sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ==", + "dev": true, + "license": "ISC" + }, + "node_modules/sockjs": { + "version": "0.3.24", + "resolved": "https://registry.npmmirror.com/sockjs/-/sockjs-0.3.24.tgz", + "integrity": "sha512-GJgLTZ7vYb/JtPSSZ10hsOYIvEYsjbNU+zPdIHcUaWVNUEPivzxku31865sSSud0Da0W4lEeOPlmw93zLQchuQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "faye-websocket": "^0.11.3", + "uuid": "^8.3.2", + "websocket-driver": "^0.7.4" + } + }, + "node_modules/source-map": { + "version": "0.6.1", + "resolved": "https://registry.npmmirror.com/source-map/-/source-map-0.6.1.tgz", + "integrity": "sha512-UjgapumWlbMhkBgzT7Ykc5YXUT46F0iKu8SGXq0bcwP5dz/h0Plj6enJqjz1Zbq2l5WaqYnrVbwWOWMyF3F47g==", + "license": "BSD-3-Clause", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/source-map-js": { + "version": "1.2.1", + "resolved": "https://registry.npmmirror.com/source-map-js/-/source-map-js-1.2.1.tgz", + "integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==", + "license": "BSD-3-Clause", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/source-map-support": { + "version": "0.5.21", + "resolved": 
"https://registry.npmmirror.com/source-map-support/-/source-map-support-0.5.21.tgz", + "integrity": "sha512-uBHU3L3czsIyYXKX88fdrGovxdSCoTGDRZ6SYXtSRxLZUzHg5P/66Ht6uoUlHu9EZod+inXhKo3qQgwXUT/y1w==", + "license": "MIT", + "dependencies": { + "buffer-from": "^1.0.0", + "source-map": "^0.6.0" + } + }, + "node_modules/spdy": { + "version": "4.0.2", + "resolved": "https://registry.npmmirror.com/spdy/-/spdy-4.0.2.tgz", + "integrity": "sha512-r46gZQZQV+Kl9oItvl1JZZqJKGr+oEkB08A6BzkiR7593/7IbtuncXHd2YoYeTsG4157ZssMu9KYvUHLcjcDoA==", + "dev": true, + "license": "MIT", + "dependencies": { + "debug": "^4.1.0", + "handle-thing": "^2.0.0", + "http-deceiver": "^1.2.7", + "select-hose": "^2.0.0", + "spdy-transport": "^3.0.0" + }, + "engines": { + "node": ">=6.0.0" + } + }, + "node_modules/spdy-transport": { + "version": "3.0.0", + "resolved": "https://registry.npmmirror.com/spdy-transport/-/spdy-transport-3.0.0.tgz", + "integrity": "sha512-hsLVFE5SjA6TCisWeJXFKniGGOpBgMLmerfO2aCyCU5s7nJ/rpAepqmFifv/GCbSbueEeAJJnmSQ2rKC/g8Fcw==", + "dev": true, + "license": "MIT", + "dependencies": { + "debug": "^4.1.0", + "detect-node": "^2.0.4", + "hpack.js": "^2.1.6", + "obuf": "^1.1.2", + "readable-stream": "^3.0.6", + "wbuf": "^1.7.3" + } + }, + "node_modules/spdy-transport/node_modules/debug": { + "version": "4.4.0", + "resolved": "https://registry.npmmirror.com/debug/-/debug-4.4.0.tgz", + "integrity": "sha512-6WTZ/IxCY/T6BALoZHaE4ctp9xm+Z5kY/pzYaCHRFeyVhojxlrm+46y68HA6hr0TcwEssoxNiDEUJQjfPZ/RYA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ms": "^2.1.3" + }, + "engines": { + "node": ">=6.0" + }, + "peerDependenciesMeta": { + "supports-color": { + "optional": true + } + } + }, + "node_modules/spdy-transport/node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + 
"node_modules/spdy/node_modules/debug": { + "version": "4.4.0", + "resolved": "https://registry.npmmirror.com/debug/-/debug-4.4.0.tgz", + "integrity": "sha512-6WTZ/IxCY/T6BALoZHaE4ctp9xm+Z5kY/pzYaCHRFeyVhojxlrm+46y68HA6hr0TcwEssoxNiDEUJQjfPZ/RYA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ms": "^2.1.3" + }, + "engines": { + "node": ">=6.0" + }, + "peerDependenciesMeta": { + "supports-color": { + "optional": true + } + } + }, + "node_modules/spdy/node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmmirror.com/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + "node_modules/statuses": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/statuses/-/statuses-2.0.1.tgz", + "integrity": "sha512-RwNA9Z/7PrK06rYLIzFMlaF+l73iwpzsqRIFgbMLbTcLD6cOao82TaWefPXQvB2fOC4AjuYSEndS7N/mTCbkdQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/string_decoder": { + "version": "1.3.0", + "resolved": "https://registry.npmmirror.com/string_decoder/-/string_decoder-1.3.0.tgz", + "integrity": "sha512-hkRX8U1WjJFd8LsDJ2yQ/wWWxaopEsABU1XfkM8A+j0+85JAGppt16cr1Whg6KIbb4okU6Mql6BOj+uup/wKeA==", + "dev": true, + "license": "MIT", + "dependencies": { + "safe-buffer": "~5.2.0" + } + }, + "node_modules/strip-ansi": { + "version": "6.0.1", + "resolved": "https://registry.npmmirror.com/strip-ansi/-/strip-ansi-6.0.1.tgz", + "integrity": "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-regex": "^5.0.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/strip-final-newline": { + "version": "2.0.0", + "resolved": "https://registry.npmmirror.com/strip-final-newline/-/strip-final-newline-2.0.0.tgz", + "integrity": 
"sha512-BrpvfNAE3dcvq7ll3xVumzjKjZQ5tI1sEUIKr3Uoks0XUl45St3FlatVqef9prk4jRDzhW6WZg+3bk93y6pLjA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/style-loader": { + "version": "4.0.0", + "resolved": "https://registry.npmmirror.com/style-loader/-/style-loader-4.0.0.tgz", + "integrity": "sha512-1V4WqhhZZgjVAVJyt7TdDPZoPBPNHbekX4fWnCJL1yQukhCeZhJySUL+gL9y6sNdN95uEOS83Y55SqHcP7MzLA==", + "license": "MIT", + "engines": { + "node": ">= 18.12.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "^5.27.0" + } + }, + "node_modules/supports-color": { + "version": "7.2.0", + "resolved": "https://registry.npmmirror.com/supports-color/-/supports-color-7.2.0.tgz", + "integrity": "sha512-qpCAvRl9stuOHveKsn7HncJRvv501qIacKzQlO/+Lwxc9+0q2wLyv4Dfvt80/DPn2pqOBsJdDiogXGR9+OvwRw==", + "dev": true, + "license": "MIT", + "dependencies": { + "has-flag": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/supports-preserve-symlinks-flag": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/supports-preserve-symlinks-flag/-/supports-preserve-symlinks-flag-1.0.0.tgz", + "integrity": "sha512-ot0WnXS9fgdkgIcePe6RHNk1WA8+muPa6cSjeR3V8K27q9BB1rTE3R1p7Hv0z1ZyAc8s6Vvv8DIyWf681MAt0w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/tapable": { + "version": "2.2.1", + "resolved": "https://registry.npmmirror.com/tapable/-/tapable-2.2.1.tgz", + "integrity": "sha512-GNzQvQTOIP6RyTfE2Qxb8ZVlNmw0n88vp1szwWRimP02mnTsx3Wtn5qRdqY9w2XduFNUgvOwhNnQsjwCp+kqaQ==", + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/terser": { + "version": "5.39.0", + "resolved": "https://registry.npmmirror.com/terser/-/terser-5.39.0.tgz", + "integrity": 
"sha512-LBAhFyLho16harJoWMg/nZsQYgTrg5jXOn2nCYjRUcZZEdE3qa2zb8QEDRUGVZBW4rlazf2fxkg8tztybTaqWw==", + "license": "BSD-2-Clause", + "dependencies": { + "@jridgewell/source-map": "^0.3.3", + "acorn": "^8.8.2", + "commander": "^2.20.0", + "source-map-support": "~0.5.20" + }, + "bin": { + "terser": "bin/terser" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/terser-webpack-plugin": { + "version": "5.3.14", + "resolved": "https://registry.npmmirror.com/terser-webpack-plugin/-/terser-webpack-plugin-5.3.14.tgz", + "integrity": "sha512-vkZjpUjb6OMS7dhV+tILUW6BhpDR7P2L/aQSAv+Uwk+m8KATX9EccViHTJR2qDtACKPIYndLGCyl3FMo+r2LMw==", + "license": "MIT", + "dependencies": { + "@jridgewell/trace-mapping": "^0.3.25", + "jest-worker": "^27.4.5", + "schema-utils": "^4.3.0", + "serialize-javascript": "^6.0.2", + "terser": "^5.31.1" + }, + "engines": { + "node": ">= 10.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "^5.1.0" + }, + "peerDependenciesMeta": { + "@swc/core": { + "optional": true + }, + "esbuild": { + "optional": true + }, + "uglify-js": { + "optional": true + } + } + }, + "node_modules/thunky": { + "version": "1.1.0", + "resolved": "https://registry.npmmirror.com/thunky/-/thunky-1.1.0.tgz", + "integrity": "sha512-eHY7nBftgThBqOyHGVN+l8gF0BucP09fMo0oO/Lb0w1OF80dJv+lDVpXG60WMQvkcxAkNybKsrEIE3ZtKGmPrA==", + "dev": true, + "license": "MIT" + }, + "node_modules/to-regex-range": { + "version": "5.0.1", + "resolved": "https://registry.npmmirror.com/to-regex-range/-/to-regex-range-5.0.1.tgz", + "integrity": "sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-number": "^7.0.0" + }, + "engines": { + "node": ">=8.0" + } + }, + "node_modules/toidentifier": { + "version": "1.0.1", + "resolved": "https://registry.npmmirror.com/toidentifier/-/toidentifier-1.0.1.tgz", + 
"integrity": "sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.6" + } + }, + "node_modules/ts-loader": { + "version": "9.5.2", + "resolved": "https://registry.npmmirror.com/ts-loader/-/ts-loader-9.5.2.tgz", + "integrity": "sha512-Qo4piXvOTWcMGIgRiuFa6nHNm+54HbYaZCKqc9eeZCLRy3XqafQgwX2F7mofrbJG3g7EEb+lkiR+z2Lic2s3Zw==", + "dev": true, + "license": "MIT", + "dependencies": { + "chalk": "^4.1.0", + "enhanced-resolve": "^5.0.0", + "micromatch": "^4.0.0", + "semver": "^7.3.4", + "source-map": "^0.7.4" + }, + "engines": { + "node": ">=12.0.0" + }, + "peerDependencies": { + "typescript": "*", + "webpack": "^5.0.0" + } + }, + "node_modules/ts-loader/node_modules/source-map": { + "version": "0.7.4", + "resolved": "https://registry.npmmirror.com/source-map/-/source-map-0.7.4.tgz", + "integrity": "sha512-l3BikUxvPOcn5E74dZiq5BGsTb5yEwhaTSzccU6t4sDOH8NWJCstKO5QT2CvtFoK6F0saL7p9xHAqHOlCPJygA==", + "dev": true, + "license": "BSD-3-Clause", + "engines": { + "node": ">= 8" + } + }, + "node_modules/tslib": { + "version": "2.8.1", + "resolved": "https://registry.npmmirror.com/tslib/-/tslib-2.8.1.tgz", + "integrity": "sha512-oJFu94HQb+KVduSUQL7wnpmqnfmLsOA/nAh6b6EH0wCEoK0/mPeXU6c3wKDV83MkOuHPRHtSXKKU99IBazS/2w==", + "dev": true, + "license": "0BSD" + }, + "node_modules/type-is": { + "version": "1.6.18", + "resolved": "https://registry.npmmirror.com/type-is/-/type-is-1.6.18.tgz", + "integrity": "sha512-TkRKr9sUTxEH8MdfuCSP7VizJyzRNMjj2J2do2Jr3Kym598JVdEksuzPQCnlFPW4ky9Q+iA+ma9BGm06XQBy8g==", + "dev": true, + "license": "MIT", + "dependencies": { + "media-typer": "0.3.0", + "mime-types": "~2.1.24" + }, + "engines": { + "node": ">= 0.6" + } + }, + "node_modules/typescript": { + "version": "5.8.2", + "resolved": "https://registry.npmmirror.com/typescript/-/typescript-5.8.2.tgz", + "integrity": 
"sha512-aJn6wq13/afZp/jT9QZmwEjDqqvSGp1VT5GVg+f/t6/oVyrgXM6BY1h9BRh/O5p3PlUPAe+WuiEZOmb/49RqoQ==", + "dev": true, + "license": "Apache-2.0", + "bin": { + "tsc": "bin/tsc", + "tsserver": "bin/tsserver" + }, + "engines": { + "node": ">=14.17" + } + }, + "node_modules/unpipe": { + "version": "1.0.0", + "resolved": "https://registry.npmmirror.com/unpipe/-/unpipe-1.0.0.tgz", + "integrity": "sha512-pjy2bYhSsufwWlKwPc+l3cN7+wuJlK6uz0YdJEOlQDbl6jo/YlPi4mb8agUkVC8BF7V8NuzeyPNqRksA3hztKQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/update-browserslist-db": { + "version": "1.1.3", + "resolved": "https://registry.npmmirror.com/update-browserslist-db/-/update-browserslist-db-1.1.3.tgz", + "integrity": "sha512-UxhIZQ+QInVdunkDAaiazvvT/+fXL5Osr0JZlJulepYu6Jd7qJtDZjlur0emRlT71EN3ScPoE7gvsuIKKNavKw==", + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/browserslist" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/browserslist" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "escalade": "^3.2.0", + "picocolors": "^1.1.1" + }, + "bin": { + "update-browserslist-db": "cli.js" + }, + "peerDependencies": { + "browserslist": ">= 4.21.0" + } + }, + "node_modules/util-deprecate": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/util-deprecate/-/util-deprecate-1.0.2.tgz", + "integrity": "sha512-EPD5q1uXyFxJpCrLnCc1nHnq3gOa6DZBocAIiI2TaSCA7VCJ1UJDMagCzIkXNsUYfD1daK//LTEQ8xiIbrHtcw==", + "license": "MIT" + }, + "node_modules/utila": { + "version": "0.4.0", + "resolved": "https://registry.npmmirror.com/utila/-/utila-0.4.0.tgz", + "integrity": "sha512-Z0DbgELS9/L/75wZbro8xAnT50pBVFQZ+hUEueGDU5FN51YSCYM+jdxsfCiHjwNP/4LCDD0i/graKpeBnOXKRA==", + "dev": true, + "license": "MIT" + }, + "node_modules/utils-merge": { + "version": "1.0.1", + "resolved": 
"https://registry.npmmirror.com/utils-merge/-/utils-merge-1.0.1.tgz", + "integrity": "sha512-pMZTvIkT1d+TFGvDOqodOclx0QWkkgi6Tdoa8gC8ffGAAqz9pzPTZWAybbsHHoED/ztMtkv/VoYTYyShUn81hA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4.0" + } + }, + "node_modules/uuid": { + "version": "8.3.2", + "resolved": "https://registry.npmmirror.com/uuid/-/uuid-8.3.2.tgz", + "integrity": "sha512-+NYs2QeMWy+GWFOEm9xnn6HCDp0l7QBD7ml8zLUmJ+93Q5NF0NocErnwkTkXVFNiX3/fpC6afS8Dhb/gz7R7eg==", + "dev": true, + "license": "MIT", + "bin": { + "uuid": "dist/bin/uuid" + } + }, + "node_modules/vary": { + "version": "1.1.2", + "resolved": "https://registry.npmmirror.com/vary/-/vary-1.1.2.tgz", + "integrity": "sha512-BNGbWLfd0eUPabhkXUVm0j8uuvREyTh5ovRa/dyow/BqAbZJyC+5fU+IzQOzmAKzYqYRAISoRhdQr3eIZ/PXqg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.8" + } + }, + "node_modules/watchpack": { + "version": "2.4.2", + "resolved": "https://registry.npmmirror.com/watchpack/-/watchpack-2.4.2.tgz", + "integrity": "sha512-TnbFSbcOCcDgjZ4piURLCbJ3nJhznVh9kw6F6iokjiFPl8ONxe9A6nMDVXDiNbrSfLILs6vB07F7wLBrwPYzJw==", + "license": "MIT", + "dependencies": { + "glob-to-regexp": "^0.4.1", + "graceful-fs": "^4.1.2" + }, + "engines": { + "node": ">=10.13.0" + } + }, + "node_modules/wbuf": { + "version": "1.7.3", + "resolved": "https://registry.npmmirror.com/wbuf/-/wbuf-1.7.3.tgz", + "integrity": "sha512-O84QOnr0icsbFGLS0O3bI5FswxzRr8/gHwWkDlQFskhSPryQXvrTMxjxGP4+iWYoauLoBvfDpkrOauZ+0iZpDA==", + "dev": true, + "license": "MIT", + "dependencies": { + "minimalistic-assert": "^1.0.0" + } + }, + "node_modules/webpack": { + "version": "5.98.0", + "resolved": "https://registry.npmmirror.com/webpack/-/webpack-5.98.0.tgz", + "integrity": "sha512-UFynvx+gM44Gv9qFgj0acCQK2VE1CtdfwFdimkapco3hlPCJ/zeq73n2yVKimVbtm+TnApIugGhLJnkU6gjYXA==", + "license": "MIT", + "dependencies": { + "@types/eslint-scope": "^3.7.7", + "@types/estree": "^1.0.6", + "@webassemblyjs/ast": "^1.14.1", + 
"@webassemblyjs/wasm-edit": "^1.14.1", + "@webassemblyjs/wasm-parser": "^1.14.1", + "acorn": "^8.14.0", + "browserslist": "^4.24.0", + "chrome-trace-event": "^1.0.2", + "enhanced-resolve": "^5.17.1", + "es-module-lexer": "^1.2.1", + "eslint-scope": "5.1.1", + "events": "^3.2.0", + "glob-to-regexp": "^0.4.1", + "graceful-fs": "^4.2.11", + "json-parse-even-better-errors": "^2.3.1", + "loader-runner": "^4.2.0", + "mime-types": "^2.1.27", + "neo-async": "^2.6.2", + "schema-utils": "^4.3.0", + "tapable": "^2.1.1", + "terser-webpack-plugin": "^5.3.11", + "watchpack": "^2.4.1", + "webpack-sources": "^3.2.3" + }, + "bin": { + "webpack": "bin/webpack.js" + }, + "engines": { + "node": ">=10.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependenciesMeta": { + "webpack-cli": { + "optional": true + } + } + }, + "node_modules/webpack-cli": { + "version": "5.1.4", + "resolved": "https://registry.npmmirror.com/webpack-cli/-/webpack-cli-5.1.4.tgz", + "integrity": "sha512-pIDJHIEI9LR0yxHXQ+Qh95k2EvXpWzZ5l+d+jIo+RdSm9MiHfzazIxwwni/p7+x4eJZuvG1AJwgC4TNQ7NRgsg==", + "dev": true, + "license": "MIT", + "dependencies": { + "@discoveryjs/json-ext": "^0.5.0", + "@webpack-cli/configtest": "^2.1.1", + "@webpack-cli/info": "^2.0.2", + "@webpack-cli/serve": "^2.0.5", + "colorette": "^2.0.14", + "commander": "^10.0.1", + "cross-spawn": "^7.0.3", + "envinfo": "^7.7.3", + "fastest-levenshtein": "^1.0.12", + "import-local": "^3.0.2", + "interpret": "^3.1.1", + "rechoir": "^0.8.0", + "webpack-merge": "^5.7.3" + }, + "bin": { + "webpack-cli": "bin/cli.js" + }, + "engines": { + "node": ">=14.15.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "5.x.x" + }, + "peerDependenciesMeta": { + "@webpack-cli/generators": { + "optional": true + }, + "webpack-bundle-analyzer": { + "optional": true + }, + "webpack-dev-server": { + "optional": true + } + } + }, 
+ "node_modules/webpack-cli/node_modules/commander": { + "version": "10.0.1", + "resolved": "https://registry.npmmirror.com/commander/-/commander-10.0.1.tgz", + "integrity": "sha512-y4Mg2tXshplEbSGzx7amzPwKKOCGuoSRP/CjEdwwk0FOGlUbq6lKuoyDZTNZkmxHdJtp54hdfY/JUrdL7Xfdug==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14" + } + }, + "node_modules/webpack-dev-middleware": { + "version": "5.3.4", + "resolved": "https://registry.npmmirror.com/webpack-dev-middleware/-/webpack-dev-middleware-5.3.4.tgz", + "integrity": "sha512-BVdTqhhs+0IfoeAf7EoH5WE+exCmqGerHfDM0IL096Px60Tq2Mn9MAbnaGUe6HiMa41KMCYF19gyzZmBcq/o4Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "colorette": "^2.0.10", + "memfs": "^3.4.3", + "mime-types": "^2.1.31", + "range-parser": "^1.2.1", + "schema-utils": "^4.0.0" + }, + "engines": { + "node": ">= 12.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "^4.0.0 || ^5.0.0" + } + }, + "node_modules/webpack-dev-server": { + "version": "4.15.1", + "resolved": "https://registry.npmmirror.com/webpack-dev-server/-/webpack-dev-server-4.15.1.tgz", + "integrity": "sha512-5hbAst3h3C3L8w6W4P96L5vaV0PxSmJhxZvWKYIdgxOQm8pNZ5dEOmmSLBVpP85ReeyRt6AS1QJNyo/oFFPeVA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/bonjour": "^3.5.9", + "@types/connect-history-api-fallback": "^1.3.5", + "@types/express": "^4.17.13", + "@types/serve-index": "^1.9.1", + "@types/serve-static": "^1.13.10", + "@types/sockjs": "^0.3.33", + "@types/ws": "^8.5.5", + "ansi-html-community": "^0.0.8", + "bonjour-service": "^1.0.11", + "chokidar": "^3.5.3", + "colorette": "^2.0.10", + "compression": "^1.7.4", + "connect-history-api-fallback": "^2.0.0", + "default-gateway": "^6.0.3", + "express": "^4.17.3", + "graceful-fs": "^4.2.6", + "html-entities": "^2.3.2", + "http-proxy-middleware": "^2.0.3", + "ipaddr.js": "^2.0.1", + "launch-editor": "^2.6.0", + "open": 
"^8.0.9", + "p-retry": "^4.5.0", + "rimraf": "^3.0.2", + "schema-utils": "^4.0.0", + "selfsigned": "^2.1.1", + "serve-index": "^1.9.1", + "sockjs": "^0.3.24", + "spdy": "^4.0.2", + "webpack-dev-middleware": "^5.3.1", + "ws": "^8.13.0" + }, + "bin": { + "webpack-dev-server": "bin/webpack-dev-server.js" + }, + "engines": { + "node": ">= 12.13.0" + }, + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/webpack" + }, + "peerDependencies": { + "webpack": "^4.37.0 || ^5.0.0" + }, + "peerDependenciesMeta": { + "webpack": { + "optional": true + }, + "webpack-cli": { + "optional": true + } + } + }, + "node_modules/webpack-dev-server/node_modules/rimraf": { + "version": "3.0.2", + "resolved": "https://registry.npmmirror.com/rimraf/-/rimraf-3.0.2.tgz", + "integrity": "sha512-JZkJMZkAGFFPP2YqXZXPbMlMBgsxzE8ILs4lMIX/2o0L9UBw9O/Y3o6wFw/i9YLapcUJWwqbi3kdxIPdC62TIA==", + "deprecated": "Rimraf versions prior to v4 are no longer supported", + "dev": true, + "license": "ISC", + "dependencies": { + "glob": "^7.1.3" + }, + "bin": { + "rimraf": "bin.js" + }, + "funding": { + "url": "https://github.com/sponsors/isaacs" + } + }, + "node_modules/webpack-merge": { + "version": "5.10.0", + "resolved": "https://registry.npmmirror.com/webpack-merge/-/webpack-merge-5.10.0.tgz", + "integrity": "sha512-+4zXKdx7UnO+1jaN4l2lHVD+mFvnlZQP/6ljaJVb4SZiwIKeUnrT5l0gkT8z+n4hKpC+jpOv6O9R+gLtag7pSA==", + "dev": true, + "license": "MIT", + "dependencies": { + "clone-deep": "^4.0.1", + "flat": "^5.0.2", + "wildcard": "^2.0.0" + }, + "engines": { + "node": ">=10.0.0" + } + }, + "node_modules/webpack-sources": { + "version": "3.2.3", + "resolved": "https://registry.npmmirror.com/webpack-sources/-/webpack-sources-3.2.3.tgz", + "integrity": "sha512-/DyMEOrDgLKKIG0fmvtz+4dUX/3Ghozwgm6iPp8KRhvn+eQf9+Q7GWxVNMk3+uCPWfdXYC4ExGBckIXdFEfH1w==", + "license": "MIT", + "engines": { + "node": ">=10.13.0" + } + }, + "node_modules/websocket-driver": { + "version": "0.7.4", + "resolved": 
"https://registry.npmmirror.com/websocket-driver/-/websocket-driver-0.7.4.tgz", + "integrity": "sha512-b17KeDIQVjvb0ssuSDF2cYXSg2iztliJ4B9WdsuB6J952qCPKmnVq4DyW5motImXHDC1cBT/1UezrJVsKw5zjg==", + "dev": true, + "license": "Apache-2.0", + "dependencies": { + "http-parser-js": ">=0.5.1", + "safe-buffer": ">=5.1.0", + "websocket-extensions": ">=0.1.1" + }, + "engines": { + "node": ">=0.8.0" + } + }, + "node_modules/websocket-extensions": { + "version": "0.1.4", + "resolved": "https://registry.npmmirror.com/websocket-extensions/-/websocket-extensions-0.1.4.tgz", + "integrity": "sha512-OqedPIGOfsDlo31UNwYbCFMSaO9m9G/0faIHj5/dZFDMFqPTcx6UwqyOy3COEaEOg/9VsGIpdqn62W5KhoKSpg==", + "dev": true, + "license": "Apache-2.0", + "engines": { + "node": ">=0.8.0" + } + }, + "node_modules/which": { + "version": "2.0.2", + "resolved": "https://registry.npmmirror.com/which/-/which-2.0.2.tgz", + "integrity": "sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA==", + "license": "ISC", + "dependencies": { + "isexe": "^2.0.0" + }, + "bin": { + "node-which": "bin/node-which" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/wildcard": { + "version": "2.0.1", + "resolved": "https://registry.npmmirror.com/wildcard/-/wildcard-2.0.1.tgz", + "integrity": "sha512-CC1bOL87PIWSBhDcTrdeLo6eGT7mCFtrg0uIJtqJUFyK+eJnzl8A1niH56uu7KMa5XFrtiV+AQuHO3n7DsHnLQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/wrappy": { + "version": "1.0.2", + "resolved": "https://registry.npmmirror.com/wrappy/-/wrappy-1.0.2.tgz", + "integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ==", + "license": "ISC" + }, + "node_modules/ws": { + "version": "8.13.0", + "resolved": "https://registry.npmmirror.com/ws/-/ws-8.13.0.tgz", + "integrity": "sha512-x9vcZYTrFPC7aSIbj7sRCYo7L/Xb8Iy+pW0ng0wt2vCJv7M9HOMy0UoN3rr+IFC7hb7vXoqS+P9ktyLLLhO+LA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10.0.0" + }, 
+ "peerDependencies": { + "bufferutil": "^4.0.1", + "utf-8-validate": ">=5.0.2" + }, + "peerDependenciesMeta": { + "bufferutil": { + "optional": true + }, + "utf-8-validate": { + "optional": true + } + } + } + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json new file mode 100644 index 0000000000000000000000000000000000000000..e7df65fb657b483573e29f8d6b9c8c8468792d50 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/package.json @@ -0,0 +1,74 @@ +{ + "name": "tb-graph-ascend", + "version": "0.1.0", + "private": "true", + "main": "index.js", + "scripts": { + "dev": "webpack serve --config webpack.dev.js", + "buildLinux": "cross-env NODE_ENV=production webpack && cp dist/index.html ../server/static/", + "buildWin": "cross-env NODE_ENV=production webpack && copy dist\\index.html ..\\server\\static\\", + "prettier": "prettier --config ./.prettierrc --write ./src/**/*.ts" + }, + "devDependencies": { + "@types/d3": "5.7.2", + "@types/lodash": "^4.14.172", + "@types/node": "^16.4.13", + "@types/offscreencanvas": "^2019.6.3", + "@types/requirejs": "^2.1.33", + "@types/resize-observer-browser": "^0.1.6", + "@types/three": "^0.131.0", + "html-loader": "^5.1.0", + "html-webpack-plugin": "^5.6.3", + "inline-chunk-html-plugin": "^1.1.1", + "ts-loader": "^9.5.1", + "tslib": "^2.6.2", + "typescript": "^5.4.5", + "webpack": "^5.96.1", + "webpack-cli": "^5.1.4", + "webpack-dev-server": "4.15.1", + "ws": "8.13.0" + }, + "dependencies": { + "@polymer/decorators": "^3.0.0", + "@polymer/iron-behaviors": "^3.0.1", + "@polymer/iron-collapse": "^3.0.1", + "@polymer/iron-icon": "^3.0.1", + "@polymer/iron-icons": "^3.0.1", + "@polymer/iron-iconset-svg": "^3.0.1", + "@polymer/iron-list": "^3.1.0", + "@polymer/iron-resizable-behavior": "^3.0.1", + "@polymer/paper-behaviors": "^3.0.1", + "@polymer/paper-button": "^3.0.1", + "@polymer/paper-checkbox": "^3.1.0", + "@polymer/paper-dialog": 
"^3.0.1", + "@polymer/paper-dropdown-menu": "^3.1.0", + "@polymer/paper-icon-button": "^3.0.2", + "@polymer/paper-item": "^3.0.1", + "@polymer/paper-listbox": "^3.0.1", + "@polymer/paper-progress": "^3.0.1", + "@polymer/paper-tooltip": "^3.0.1", + "@polymer/polymer": "^3.5.1", + "@types/lodash": "^4.17.1", + "@vaadin/button": "24.6.5", + "@vaadin/combo-box": "24.6.5", + "@vaadin/details": "24.6.5", + "@vaadin/icon": "24.6.5", + "@vaadin/icons": "24.6.5", + "@vaadin/notification": "24.6.5", + "@vaadin/progress-bar": "24.6.5", + "@vaadin/tabs": "24.6.5", + "@vaadin/tabsheet": "24.6.5", + "@vaadin/text-field": "24.6.5", + "@vaadin/tooltip": "24.6.5", + "@vaadin/grid": "24.6.5", + "@vaadin/select": "24.6.5", + "clean-webpack-plugin": "^4.0.0", + "cross-env": "^7.0.3", + "css-loader": "^7.1.2", + "d3": "5.7.0", + "dagre": "^0.8.5", + "lodash": "^4.17.21", + "prettier": "^3.4.2", + "style-loader": "^4.0.0" + } +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.css b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.css new file mode 100644 index 0000000000000000000000000000000000000000..ac81d67c4f2d92e40d801031358fc6c14066cdaa --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.css @@ -0,0 +1,17 @@ +html, +body, +graph-app { + height: 100%; + margin: 0; + font-family: Roboto, sans-serif; +} + +vaadin-combo-box-scroller { + overflow: scroll; + font-size: 14px; +} + +vaadin-combo-box-item { + overflow: unset; + font-size: 14px; +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..84ee8ab528c6b2eaaf813e47323bef8e7b5f1916 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/index.ts @@ -0,0 +1,18 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. 
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import './tf_graph_dashboard/index'; +import './index.css'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dark_mode_mixin.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dark_mode_mixin.ts new file mode 100644 index 0000000000000000000000000000000000000000..a562f0b6734cf20274f573cd87b2a61d8d90c926 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dark_mode_mixin.ts @@ -0,0 +1,57 @@ +/* Copyright 2021 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +import { PolymerElement } from '@polymer/polymer'; + +/** + * Polymer mixin replacement for `:host-context(body.dark-mode)`. 
+ * + * Unfortunately, Firefox does not support `:host-context()` and cannot use the + * WebComponent way of styling shadow DOMs with context for ancestor [1][2]. + * To work around the issue, we are creating a WebComponent mixin that adds + * class `dark-mode` to `:host` when body contains the class, `.dark-mode`. + * + * Unfortunately, due to our unfamiliarity with mixins, our types are imperfect. + * + */ +export function DarkModeMixin<T extends PolymerElement>(Base: new () => PolymerElement): new () => T { + return class Foo extends Base { + private observer?: MutationObserver; + + override connectedCallback(): void { + super.connectedCallback(); + this._maybeSetDarkMode(); + + this.observer = new MutationObserver((mutations) => { + const classChanged = mutations.some((mutation) => { + return mutation.attributeName === 'class'; + }); + if (classChanged) { + this._maybeSetDarkMode(); + } + }); + this.observer.observe(document.body, { attributes: true }); + } + + override disconnectedCallback(): void { + super.disconnectedCallback(); + this.observer?.disconnect(); + } + + private _maybeSetDarkMode(): void { + this.classList.toggle('dark-mode', document.body.classList.contains('dark-mode')); + } + } as unknown as new () => T; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom-repeat.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom-repeat.ts new file mode 100644 index 0000000000000000000000000000000000000000..53f21caa684fba44ef4bc35c9094340fd06cf18e --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom-repeat.ts @@ -0,0 +1,16 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +export * from '@polymer/polymer/lib/elements/dom-repeat'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom.ts new file mode 100644 index 0000000000000000000000000000000000000000..7b1ace8b926fd768f63b9aa03c4163c05bacdb8a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/dom.ts @@ -0,0 +1,16 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +export * from '@polymer/polymer/lib/legacy/polymer.dom'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/irons_and_papers.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/irons_and_papers.ts new file mode 100644 index 0000000000000000000000000000000000000000..a0d4c3a1777619222f0a6c045cbc66715a5244db --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/irons_and_papers.ts @@ -0,0 +1,35 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +/** + * @fileoverview Imports all TensorBoard dependencies to paper and iron components. Please + * import this module for dependency on iron and paper components. 
+ */ + +import '@polymer/iron-collapse/iron-collapse'; +import '@polymer/iron-icon'; +import '@polymer/iron-icons/image-icons'; +import '@polymer/iron-icons/iron-icons'; +import '@polymer/iron-iconset-svg'; +import '@polymer/iron-list/iron-list'; +import '@polymer/paper-button'; +import '@polymer/paper-checkbox'; +import '@polymer/paper-dialog'; +import '@polymer/paper-dropdown-menu/paper-dropdown-menu'; +import '@polymer/paper-icon-button/paper-icon-button'; +import '@polymer/paper-item'; +import '@polymer/paper-listbox'; +import '@polymer/paper-progress'; +import '@polymer/paper-tooltip/paper-tooltip'; +export { PaperCheckboxElement } from '@polymer/paper-checkbox'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/legacy_element_mixin.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/legacy_element_mixin.ts new file mode 100644 index 0000000000000000000000000000000000000000..e0768eac40066592e566d145cb6f9d1ed9054f31 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/legacy_element_mixin.ts @@ -0,0 +1,16 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +export * from '@polymer/polymer/lib/legacy/legacy-element-mixin'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/register_style_dom_module.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/register_style_dom_module.ts new file mode 100644 index 0000000000000000000000000000000000000000..fd968cc733b4996f0d6100ca5f6f7508c94b1b3a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/polymer/register_style_dom_module.ts @@ -0,0 +1,52 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import '@polymer/polymer/lib/elements/dom-module'; + +export interface DomModuleOptions { + moduleName: string; + styleDependencies?: string[]; + styleContent: string; +} + +/** + * Interop for Polymer 3 styling + * + * The following process is a workaround. While Polymer 3.0 does not use + * <dom-module> elements for templating, style modules do. The following process + * is a workaround for this fact. This process may be updated as required. 
+ */ +export function registerStyleDomModule(args: DomModuleOptions): void { + const { moduleName, styleContent } = args; + const domModule = document.createElement('dom-module'); + const template = document.createElement('template'); + + const styleIncludes: HTMLStyleElement[] = []; + if (args.styleDependencies) { + args.styleDependencies.forEach((dep) => { + const style = document.createElement('style'); + style.setAttribute('include', dep); + styleIncludes.push(style); + }); + } + const style = document.createElement('style'); + Object.assign(style, { textContent: styleContent }); + + styleIncludes.forEach((styleElement) => { + template.content.appendChild(styleElement); + }); + template.content.appendChild(style); + domModule.appendChild(template); + (domModule as any).register(moduleName); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..f492ad1fe236069f449f4ad8b02b8e5b4b73d650 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/index.ts @@ -0,0 +1,16 @@ +/* Copyright 2021 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +export * from './types'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/types.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/types.ts new file mode 100644 index 0000000000000000000000000000000000000000..2f575df43714fdff32ae70d436149422c981662a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tb_debug/types.ts @@ -0,0 +1,77 @@ +/* Copyright 2021 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +export interface ActionEvent { + eventCategory: string; + eventAction: string; + eventLabel?: string; + eventValue?: number; +} + +export const GRAPH_DEBUG_ACTION_EVENT_CATEGORY = 'Graph dashboard actions'; +export const GRAPH_DEBUG_TIMING_EVENT_CATEGORY = 'Graph dashboard timings'; + +/** + * Timing-based events, part of `GRAPH_DEBUG_TIMING_EVENT_CATEGORY`. + */ +export enum GraphDebugTimingEventId { + // Pre-rendering. + // `FETCH_PBTXT_BYTES` is fired for both filesystem and server sources. 
+ FETCH_PBTXT_BYTES = 'FETCH_PBTXT_BYTES', + FETCH_PBTXT_BYTES_FROM_FILESYSTEM = 'FETCH_PBTXT_BYTES_FROM_FILESYSTEM', + FETCH_PBTXT_BYTES_FROM_SERVER = 'FETCH_PBTXT_BYTES_FROM_SERVER', + PARSE_PBTXT_INTO_OBJECT = 'PARSE_PBTXT_INTO_OBJECT', + FETCH_METADATA_PBTXT_BYTES = 'FETCH_METADATA_PBTXT_BYTES', + PARSE_METADATA_PBTXT_INTO_OBJECT = 'PARSE_METADATA_PBTXT_INTO_OBJECT', + NORMALIZING_NAMES = 'NORMALIZING_NAMES', + BUILD_SLIM_GRAPH = 'BUILD_SLIM_GRAPH', + HIERARCHY_ADD_NODES = 'HIERARCHY_ADD_NODES', + HIERARCHY_DETECT_SERIES = 'HIERARCHY_DETECT_SERIES', + HIERARCHY_ADD_EDGES = 'HIERARCHY_ADD_EDGES', + HIERARCHY_FIND_SIMILAR_SUBGRAPHS = 'HIERARCHY_FIND_SIMILAR_SUBGRAPHS', + // Rendering. + RENDER_BUILD_HIERARCHY = 'RENDER_BUILD_HIERARCHY', + RENDER_SCENE_LAYOUT = 'RENDER_SCENE_LAYOUT', + RENDER_SCENE_BUILD_SCENE = 'RENDER_SCENE_BUILD_SCENE', + // Total graph loading (superset of other phases). Note that after [1], + // this timing no longer includes `HIERARCHY_FIND_SIMILAR_SUBGRAPHS`, + // which is computed lazily. + GRAPH_LOAD_SUCCEEDED = 'GRAPH_LOAD_SUCCEEDED', + GRAPH_LOAD_FAILED = 'GRAPH_LOAD_FAILED', +} + +/** + * Non-timing based actions due to user interaction, part of + * `GRAPH_DEBUG_ACTION_EVENT_CATEGORY`. + */ +export enum GraphDebugActionEventId { + // Labeled by state: expanded or collapsed. + NODE_EXPANSION_TOGGLED = 'NODE_EXPANSION_TOGGLED', + NODE_SEARCH_RESULT_FOCUSED = 'NODE_SEARCH_RESULT_FOCUSED', + // Labeled by direction between auxiliary graph and the main graph. + NODE_AUXILIARY_EXTRACTION_CHANGED = 'NODE_AUXILIARY_EXTRACTION_CHANGED', + // Labeled by graph type: Op, Conceptual, Profile. + GRAPH_TYPE_CHANGED = 'GRAPH_TYPE_CHANGED', + TRACE_INPUT_MODE_TOGGLED = 'TRACE_INPUT_MODE_TOGGLED', + // Labeled by mode: Structure, Device, TPU Compat, etc. + NODE_COLOR_MODE_CHANGED = 'NODE_COLOR_MODE_CHANGED', + UPLOADED_GRAPH_FROM_FILESYSTEM = 'UPLOADED_GRAPH_FROM_FILESYSTEM', +} + +// Merge the string enums. 
+export const GraphDebugEventId = { + ...GraphDebugTimingEventId, + ...GraphDebugActionEventId, +}; +export type GraphDebugEventId = typeof GraphDebugEventId; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_backend/requestManager.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_backend/requestManager.ts new file mode 100644 index 0000000000000000000000000000000000000000..a93256199001fa3d0bfaf539fc57871f0ac432a2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_backend/requestManager.ts @@ -0,0 +1,283 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import { safeJSONParse } from '../utils'; + +const FEATURE_FLAGS_HEADER_NAME = 'X-TensorBoard-Feature-Flags'; + +interface ResolveReject { + resolve: (x: unknown) => void; + reject: (x: unknown) => void; +} + +/** + * Manages many fetch requests. Launches up to nSimultaneousRequests + * simultaneously, and maintains a LIFO queue of requests to process when + * more urls are requested than can be handled at once. The queue can be + * cleared. + * + * When a request is made, a Promise is returned which resolves with the + * parsed JSON result from the request. 
+ */ +export class RequestCancellationError extends Error { + public override name = 'RequestCancellationError'; +} + +export class InvalidRequestOptionsError extends Error { + public override name = 'InvalidRequestOptionsError'; + constructor(msg: string) { + super(msg); + // Needed because of a TypeScript limitation when extending built-ins such as Error. + Object.setPrototypeOf(this, InvalidRequestOptionsError.prototype); + } +} + +export class RequestNetworkError extends Error { + public override name: string; + public req: XMLHttpRequest; + public url: string; + constructor(req: XMLHttpRequest, url: string) { + super(); + this.message = `RequestNetworkError: ${req.status} at ${url}`; + this.name = 'RequestNetworkError'; + this.req = req; + this.url = url; + } +} + +/** The HTTP method-type to use. Currently only 'GET' and 'POST' are + * supported. + */ +export enum HttpMethodType { + GET = 'GET', + POST = 'POST', +} + +/** + * Holds options that can be used to configure the HTTP request. + */ +export class RequestOptions { + public methodType: HttpMethodType; + /** The content-type request header to use. Cannot be set for a GET request.*/ + public contentType?: string; + /** The request body to use. This is the object that is passed to the + * XMLHttpRequest.send() method. If not given the 'send' method is called + * without an argument. + */ + public body?: any; + /** If specified, this will be the value set in the + * XMLHttpRequest.withCredentials property. + */ + public withCredentials?: boolean; + // Validates this object. Throws InvalidRequestOptionsError on error. + public validate(): void { + if (this.methodType === HttpMethodType.GET) { + // We don't allow a body for a GET. + if (this.body) { + throw new InvalidRequestOptionsError('body must be missing for a GET request.'); + } + } + // We allow body-less or contentType-less POSTs even if they don't + // make much sense. 
+ } +} + +// Form data for a POST request as a convenient multidict interface, +// since the built-in `FormData` type doesn't have a value constructor. +// +// A raw string value is equivalent to a singleton array, and thus an +// empty array value is equivalent to omitting the key entirely. +export interface PostData { + [key: string]: string | string[]; +} + +export class RequestManager { + private _queue: ResolveReject[]; + private _maxRetries: number; + private _nActiveRequests: number; + private _nSimultaneousRequests: number; + constructor(nSimultaneousRequests = 1000, maxRetries = 3) { + this._queue = []; + this._nActiveRequests = 0; + this._nSimultaneousRequests = nSimultaneousRequests; + this._maxRetries = maxRetries; + } + + /** + * Gives a promise that loads assets from given url (respects queuing). If + * postData is provided, this request will use POST, not GET. This is an + * object mapping POST keys to string values. + */ + public request(url: string, postData?: PostData): Promise<any> { + const requestOptions = requestOptionsFromPostData(postData); + return this.requestWithOptions(url, requestOptions); + } + + public requestWithOptions(url: string, requestOptions: RequestOptions): Promise<any> { + requestOptions.validate(); + const promise = new Promise((resolve, reject) => { + const resolver = { resolve: resolve, reject: reject }; + this._queue.push(resolver); + this.launchRequests(); + }) + .then(() => { + return this.promiseWithRetries(url, this._maxRetries, requestOptions); + }) + .then( + (response) => { + // Success - Let's free space for another active + // request, and launch it + this._nActiveRequests--; + this.launchRequests(); + return response; + }, + (rejection) => { + if (rejection.name === 'RequestNetworkError') { + // If we failed due to network error, we should + // decrement + // _nActiveRequests because this request was + // active + this._nActiveRequests--; + this.launchRequests(); + } + return Promise.reject(rejection); + }, + ); + return 
promise; + } + + public fetch(url: string, fetchOptions?: RequestInit): Promise<Response> { + return new Promise((resolve, reject) => { + const resolver = { resolve: resolve, reject: reject }; + this._queue.push(resolver); + this.launchRequests(); + }).then(() => { + let numTries = 1; + return new Promise<Response>((resolve) => { + const retryFetch = (): void => { + fetch(url, fetchOptions).then((response) => { + if (!response.ok && this._maxRetries > numTries) { + numTries++; + retryFetch(); + return; + } + resolve(response); + this._nActiveRequests--; + this.launchRequests(); + }); + }; + retryFetch(); + }); + }); + } + + /* Actually get promise from url using XMLHttpRequest */ + protected _promiseFromUrl(url: string, requestOptions: RequestOptions): Promise<any> { + return new Promise((resolve, reject) => { + const req = buildXMLHttpRequest( + requestOptions.methodType, + url, + requestOptions.withCredentials, + requestOptions.contentType, + ); + req.setRequestHeader(FEATURE_FLAGS_HEADER_NAME, JSON.stringify({})); + req.onload = function (): void { + if (req.status === 200) { + resolve(safeJSONParse(req.responseText) as any); + } else { + reject(new RequestNetworkError(req, url)); + } + }; + req.onerror = function (): void { + reject(new RequestNetworkError(req, url)); + }; + if (requestOptions.body) { + req.send(requestOptions.body); + } else { + req.send(); + } + }); + } + + private launchRequests(): void { + while (this._nActiveRequests < this._nSimultaneousRequests && this._queue.length > 0) { + this._nActiveRequests++; + this._queue.pop()?.resolve(undefined); + } + } + + /** + * Try to request a given URL using overwritable _promiseFromUrl method. + * If the request fails for any reason, we will retry up to maxRetries + * times. In practice, this will help us paper over transient network issues + * like '502 Bad Gateway'. + * By default, Chrome displays network errors in console, so + * the user will be able to tell when the requests are failing. 
I think this + * is a feature, if the request failures and retries are causing any + * pain to users, they can see it and file issues. + */ + private promiseWithRetries(url: string, maxRetries: number, requestOptions: RequestOptions): any { + let success = (x): any => x; + let failure = (x): any => { + if (maxRetries > 0) { + return this.promiseWithRetries(url, maxRetries - 1, requestOptions); + } else { + return Promise.reject(x); + } + }; + return this._promiseFromUrl(url, requestOptions).then(success, failure); + } +} + +function buildXMLHttpRequest( + methodType: HttpMethodType, + url: string, + withCredentials?: boolean, + contentType?: string, +): XMLHttpRequest { + const req = new XMLHttpRequest(); + req.open(methodType, url); + if (withCredentials) { + req.withCredentials = withCredentials; + } + if (contentType) { + req.setRequestHeader('Content-Type', contentType); + } + return req; +} + +function requestOptionsFromPostData(postData?: PostData): RequestOptions { + const result = new RequestOptions(); + if (!postData) { + result.methodType = HttpMethodType.GET; + return result; + } + result.methodType = HttpMethodType.POST; + result.body = formDataFromDictionary(postData); + return result; +} + +function formDataFromDictionary(postData: PostData): FormData { + const formData = new FormData(); + for (const [key, maybeValues] of Object.entries(postData)) { + const values = Array.isArray(maybeValues) ? 
maybeValues : [maybeValues]; + for (const value of values) { + formData.append(key, value); + } + } + return formData; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/scrollbar-style.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/scrollbar-style.ts new file mode 100644 index 0000000000000000000000000000000000000000..67113080ae2aab7ef84ca8cf573af3ae2520050d --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/scrollbar-style.ts @@ -0,0 +1,39 @@ +/* Copyright 2016 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +import { registerStyleDomModule } from '../polymer/register_style_dom_module'; + +registerStyleDomModule({ + moduleName: 'scrollbar-style', + styleContent: ` + .scrollbar::-webkit-scrollbar-track { + visibility: hidden; + } + + .scrollbar::-webkit-scrollbar { + width: 10px; + } + + .scrollbar::-webkit-scrollbar-thumb { + border-radius: 10px; + -webkit-box-shadow: inset 0 0 2px rgba(0, 0, 0, 0.3); + background-color: var(--paper-grey-500); + color: var(--paper-grey-900); + } + .scrollbar { + box-sizing: border-box; + } + `, +}); diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tensorboard-color.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tensorboard-color.ts new file mode 100644 index 0000000000000000000000000000000000000000..e76ed139f9d2956cad1cb1c782358b9ee1975c00 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tensorboard-color.ts @@ -0,0 +1,57 @@ +/* Copyright 2016 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +const style = document.createElement('style'); +style.setAttribute('is', 'custom-style'); +style.textContent = ` + :root { + --tb-orange-weak: #ffa726; + --tb-orange-strong: #f57c00; + --tb-orange-dark: #dc7320; + --tb-grey-darker: #e2e2e2; + --tb-grey-lighter: #f3f3f3; + --tb-ui-dark-accent: #757575; + --tb-ui-light-accent: #e0e0e0; + --tb-ui-border: var(--paper-grey-300); + --tb-graph-faded: #e0d4b3; + --tb-secondary-text-color: var(--paper-grey-800); + --tb-raised-button-shadow-color: rgba(0, 0, 0, 0.2); + --primary-background-color: #fff; + --secondary-background-color:rgb(247, 247, 247); + --tb-layout-background-color: #f5f5f5; + --tb-link: #1976d2; /* material blue 700. */ + --tb-link-visited: #7b1fa2; /* material purple 700. */ + } + + :root .dark-mode { + --tb-ui-border: var(--paper-grey-700); + --tb-ui-dark-accent: var(--paper-grey-400); + --tb-ui-light-accent: var(--paper-grey-600); + --tb-secondary-text-color: var(--paper-grey-400); + --tb-raised-button-shadow-color: rgba(255, 255, 255, 0.5); + --primary-text-color: #fff; + --secondary-text-color: var(--paper-grey-400); + --primary-background-color: #303030; /* material grey A400. */ + --secondary-background-color: #3a3a3a; + --tb-layout-background-color: #3a3a3a; + --tb-link: #42a5f5; /* material blue 400. */ + --tb-link-visited: #ba68c8; /* material purple 300. 
*/ + /* Overrides paper-material */ + --shadow-elevation-2dp_-_box-shadow: 0 2px 2px 0 rgba(255, 255, 255, 0.14), + 0 1px 5px 0 rgba(255, 255, 255, 0.12), + 0 3px 1px -2px rgba(255, 255, 255, 0.2); + } +`; +document.head.appendChild(style); diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tf-dashboard-layout.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tf-dashboard-layout.ts new file mode 100644 index 0000000000000000000000000000000000000000..264d5d2c589b9f58d242b305eb6b4161c19f4382 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_dashboard_common/tf-dashboard-layout.ts @@ -0,0 +1,152 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +import { customElement } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import { DarkModeMixin } from '../polymer/dark_mode_mixin'; +import './scrollbar-style'; +import './tensorboard-color'; + +@customElement('tf-dashboard-layout') +class TfDashboardLayout extends DarkModeMixin(PolymerElement) { + static readonly template = html` + + +
+ +
+ + + `; + + _toggleSidebar(): void { + // Look up the sidebar and its toggle button by ID + const sidebar = this.shadowRoot?.querySelector('#sidebar'); + const sidebarToggle = this.shadowRoot?.querySelector('#sidebar-toggle'); + // Check that the sidebar exists, then toggle its display style + if (sidebar) { + sidebar.classList.toggle('sider-hidden'); // show or hide the sidebar + sidebarToggle?.classList.toggle('sidebar-toggle-fold'); // flip the arrow direction + } + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_globals/globals.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_globals/globals.ts new file mode 100644 index 0000000000000000000000000000000000000000..503570bdea58f6d0ab3e610b0d2b7d587cb61754 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_globals/globals.ts @@ -0,0 +1,37 @@ +/* Copyright 2016 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +// If true, TensorBoard stores its hash in the URI state. +// If false, tab switching in TensorBoard will not update location hash, +// because hash updates interfere with wct_tests.
+let _useHash = false; + +export function setUseHash(shouldUseHash: boolean): void { + _useHash = shouldUseHash; +} + +export function useHash(): boolean { + return _useHash; +} + +let _fakeHash = ''; + +export function setFakeHash(h: string): void { + _fakeHash = h; +} + +export function getFakeHash(): string { + return _fakeHash; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/components/legend/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/components/legend/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..67203042380ccfec1c17b4ed56b2c560d6a54a10 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/components/legend/index.ts @@ -0,0 +1,98 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement } from '@polymer/decorators'; +@customElement('scene-legend') +class Legend extends PolymerElement { + static get template(): HTMLTemplateElement { + return html` + +
+
+ + Module or Operators +
+
+ + Unexpanded Module or Operators +
+ +
+ Unexpandable Node: It can be an Api, operator or module. It cannot be expanded because it has no + subnodes +
+
+
+
+
+ + Api List +
+ +
Apis between modules
+
+
+
+
+ + Multi Collection +
+ +
Fusion node Collection
+
+
+
+
+ `; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-minimap.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-minimap.ts new file mode 100644 index 0000000000000000000000000000000000000000..8ea82c1fbabb827535c4668ebb7811f0ac929ded --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-minimap.ts @@ -0,0 +1,89 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +import { customElement } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import * as tf_scene_minimap from '../tf_graph_common/minimap'; + +@customElement('tf-graph-minimap') +export class TfGraphMinimap extends PolymerElement { + static readonly template = html` + + + + + + + + + + + + + + + + `; + + /** + * Initializes the minimap and returns a minimap object to notify when + * things update. + * + * @param svg The main svg element. + * @param zoomG The svg group used for panning and zooming the main svg. + * @param mainZoom The main zoom behavior. + * @param maxWAndH The maximum width/height for the minimap. + * @param labelPadding Padding in pixels due to the main graph labels. 
+ */ + init(svg, zoomG, mainZoom, maxWAndH, labelPadding): tf_scene_minimap.Minimap { + return new tf_scene_minimap.Minimap(svg, zoomG, mainZoom, this, maxWAndH, labelPadding); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.html.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.html.ts new file mode 100644 index 0000000000000000000000000000000000000000..7033d6b90ec811b090101933abccba2e0aa879f9 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.html.ts @@ -0,0 +1,381 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import { html } from '@polymer/polymer'; + +// Please keep node font-size/classnames in sync with tf-graph-common/common.ts +export const template = html` + +
+
+
+
+
+ + + + + +
+`; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.ts new file mode 100644 index 0000000000000000000000000000000000000000..e1829e4140ec2901b00649410081386a2e072674 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph-scene.ts @@ -0,0 +1,710 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ + +import { customElement, observe, property } from '@polymer/decorators'; +import { PolymerElement } from '@polymer/polymer'; +import * as d3 from 'd3'; +import * as _ from 'lodash'; +import { DarkModeMixin } from '../polymer/dark_mode_mixin'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import * as tb_debug from '../tb_debug'; +import '../tf_dashboard_common/tensorboard-color'; +import * as tf_graph from '../tf_graph_common/graph'; +import * as tf_graph_layout from '../tf_graph_common/layout'; +import * as tf_graph_minimap from '../tf_graph_common/minimap'; +import * as tf_graph_scene_node from '../tf_graph_common/node'; +import * as tf_graph_render from '../tf_graph_common/render'; +import * as tf_graph_scene from '../tf_graph_common/scene'; +import { TfGraphScene } from '../tf_graph_common/tf-graph-scene'; +import * as tf_graph_util from '../tf_graph_common/util'; +import './tf-graph-minimap'; +import { template } from './tf-graph-scene.html'; + +// Upper bound on the wheel delta per event, to limit the scroll speed +const MOUSE_MOVE_DELTA = 800; +@customElement('tf-graph-scene') +class TfGraphScene2 extends LegacyElementMixin(DarkModeMixin(PolymerElement)) implements TfGraphScene { + static readonly template = template; + @property({ type: Number }) + _step: number = 20; + + @property({ type: Number }) + mouseX: number = 0; + + @property({ type: Number }) + mouseY: number = 0; + + @property({ type: Number }) + x: number = 0; + + @property({ type: Number }) + y: number = 0; + + @property({ type: Object }) + renderHierarchy: tf_graph_render.RenderGraphInfo; + + @property({ type: String }) + name: string; + + // For each render hierarchy, we only fit it to the viewport once (when the scene is attached to + // the DOM). We do not fit the hierarchy again (unless the user clicks the reset button).
For + // instance, if the user enters a certain view in the graph, switches to another dashboard, and + // returns to the graph dashboard, the user expects the previous view. These properties enable + // that behavior. + + /** Whether the scene has fit the current render hierarchy (to the viewport) at least once. */ + @property({ type: Boolean }) + _hasRenderHierarchyBeenFitOnce: boolean; + + /** Whether this scene element is currently attached to a parent element. */ + @property({ type: Boolean }) + _isAttached: boolean; + + /** This property is a d3_zoom object. */ + @property({ type: Object }) + _zoom: object; + + /** This property is a d3_drag object. */ + @property({ type: Object }) + _drag: object; + + @property({ + type: String, + observer: '_highlightedNodeChanged', + }) + highlightedNode: string; + + @property({ + type: String, + observer: '_selectedNodeChanged', + }) + selectedNode: string; + + @property({ + type: String, + observer: '_linkedNodeChanged', + }) + linkedNode: string; + + /** Keeps track of if the graph has been zoomed/panned since loading */ + @property({ + type: Boolean, + observer: '_onZoomChanged', + }) + _zoomed: boolean = false; + + /** + * Keeps track of the starting coordinates of a graph zoom/pan. + * + * @private {{x: number, y: number}?} + */ + @property({ + type: Object, + }) + _zoomStartCoords: { x: number; y: number } | null = null; + + /** + * Keeps track of the current coordinates of a graph zoom/pan + * + * @private {{x: number, y: number}?} + */ + @property({ + type: Object, + }) + _zoomTransform: { x: number; y: number } | null = null; + + /** Maximum distance of a zoom event for it to be interpreted as a click */ + @property({ + type: Number, + }) + _maxZoomDistanceForClick: number = 20; + + /* + * Dictionary for easily stylizing nodes when state changes. 
+ * _nodeGroupIndex[nodeName] = d3_selection of the nodeGroup + */ + @property({ + type: Object, + }) + _nodeGroupIndex = {}; + + /* + * Dictionary for easily stylizing annotation nodes when state changes. + * _annotationGroupIndex[nodeName][hostNodeName] = + * d3_selection of the annotationGroup + */ + @property({ + type: Object, + }) + _annotationGroupIndex = {}; + + /* + * Dictionary for easily stylizing edges when state changes. + * _edgeGroupIndex[edgeName] = d3_selection of the edgeGroup + */ + @property({ + type: Object, + }) + _edgeGroupIndex = {}; + + /** + * Max font size for metanode label strings. + */ + @property({ + type: Number, + }) + maxMetanodeLabelLengthFontSize: number = 9; + + /** + * Min font size for metanode label strings. + */ + @property({ type: Number }) + minMetanodeLabelLengthFontSize: number = 6; + + /** + * Metanode label strings longer than this are given smaller fonts. + */ + @property({ type: Number }) + maxMetanodeLabelLengthLargeFont: number = 11; + + /** + * Metanode label strings longer than this are truncated with ellipses. + */ + @property({ type: Number }) + maxMetanodeLabelLength: number = 50; + + @property({ type: Object }) + progress: any; + + // An array of ContextMenuItem objects. Items that appear in the context + // menu for a node. + @property({ type: Array }) + nodeContextMenuItems: unknown[]; + + @property({ type: Boolean }) + showMinimap: boolean = true; + + /** + * A minimap object to notify for zoom events. + */ + private minimap: tf_graph_minimap.Minimap; + + private enablePanSignal: Boolean = true; + + @observe('renderHierarchy') + _renderHierarchyChanged(): void { + let renderHierarchy = this.renderHierarchy; + this._hasRenderHierarchyBeenFitOnce = false; + this._resetState(); + this._build(renderHierarchy); + } + + @observe('showMinimap') + _minimapVisChanged(): void { + const minimap = this.$.minimap as HTMLElement; + minimap.style.display = this.showMinimap ? 
'block' : 'none'; + } + + // Animation and fitting must come after the observer for the hierarchy changing because we must + // first build the render hierarchy. + @observe('_isAttached', 'renderHierarchy') + _animateAndFit(): void { + const isAttached = this._isAttached; + if (this._hasRenderHierarchyBeenFitOnce || !isAttached) { + // Do not animate and fit if the scene has already fitted this render hierarchy once. Or if + // the graph dashboard is not attached (in which case the scene lacks DOM info for fitting). + return; + } + // Fit to screen after the graph is done animating. + setTimeout(this.fit.bind(this), tf_graph_layout.PARAMS.animation.duration); + } + + getNode(nodeName): tf_graph_render.RenderNodeInfo { + return this.renderHierarchy.getRenderNodeByName(nodeName); + } + + isNodeExpanded(node): boolean { + return node.expanded; + } + + setNodeExpanded(renderNode): void { + this._build(this.renderHierarchy); + this._updateLabels(!this._zoomed); + } + + /** + * Pans to a node. Assumes that the node exists. + * @param nodeName {string} The name of the node to pan to. + */ + panToNode(nodeName): void { + const zoomed = tf_graph_scene.panToNode(nodeName, this.$.svg, this.$.root, this._zoom); + if (zoomed) { + this._zoomed = true; + } + } + + /** + * Returns the outer-most SVG that renders the graph. + */ + getGraphSvgRoot(): SVGElement { + return this.$.svg as SVGElement; + } + + getContextMenu(): HTMLElement { + return this.$.contextMenu as HTMLElement; + } + + /** + * Resets the state of the component. Called whenever the whole graph + * (dataset) changes. + */ + _resetState(): void { + // Reset the state of the component. + this._nodeGroupIndex = {}; + this._annotationGroupIndex = {}; + this._edgeGroupIndex = {}; + this._updateLabels(false); + // Remove all svg elements under the 'root' svg group. + d3.select(this.$.svg).select('#root').selectAll('*').remove(); + // And the defs. 
+ tf_graph_scene_node.removeGradientDefinitions(this.$.svg as SVGElement); + } + + /** Main method for building the scene */ + _build(renderHierarchy: tf_graph_render.RenderGraphInfo): void { + if (!renderHierarchy) { + return; + } + tf_graph_util.time( + 'tf-graph-scene (layout):', + (): void => { + // layout the scene for this meta / series node + tf_graph_layout.layoutScene(renderHierarchy.root); + }, + tb_debug.GraphDebugEventId.RENDER_SCENE_LAYOUT, + ); + tf_graph_util.time( + 'tf-graph-scene (build scene):', + (): void => { + tf_graph_scene_node.buildGroupForScene(d3.select(this.$.root), renderHierarchy.root, this); + tf_graph_scene.addGraphClickListener(this.$.svg, this); + }, + tb_debug.GraphDebugEventId.RENDER_SCENE_BUILD_SCENE, + ); + // Update the minimap again when the graph is done animating. + setTimeout((): void => { + this.minimap.update(); + }, tf_graph_layout.PARAMS.animation.duration); + } + + ready(): void { + super.ready(); + + this.addEventListener('no-pan-to-node', this._noPanToNode.bind(this)) + this._zoom = d3 + .zoom() + .on('end', () => { + if (this._zoomStartCoords && this._zoomTransform) { + // Calculate the total distance dragged during the zoom event. + // If it is sufficiently small, then fire an event indicating + // that zooming has ended. Otherwise wait to fire the zoom end + // event, so that a mouse click registered as part of this zooming + // is ignored (as this mouse click was part of a zooming, and should + // not be used to indicate an actual click on the graph). 
+ let dragDistance = Math.sqrt( + Math.pow(this._zoomStartCoords.x - this._zoomTransform.x, 2) + + Math.pow(this._zoomStartCoords.y - this._zoomTransform.y, 2), + ); + if (dragDistance < this._maxZoomDistanceForClick) { + this._fireEnableClick(); + } else { + setTimeout(this._fireEnableClick.bind(this), 50); + } + } + this._zoomStartCoords = null; + } + ) + .on('zoom', () => { + this._zoomTransform = d3.event.transform; + if (!this._zoomStartCoords) { + this._zoomStartCoords = this._zoomTransform; + this.fire('disable-click'); + } + this._zoomed = true; + d3.select(this.$.root).attr('transform', d3.event.transform.toString()); + this.x = d3.event.transform.x; + this.y = d3.event.transform.y; + // Notify the minimap. + this.minimap.zoom(d3.event.transform); + }); + + d3.select(this.$.svg).call(this._addEventListener.bind(this)).on('dblclick.zoom', null); + d3.select(window).on('resize', () => { + // Notify the minimap that the user's window was resized. + // The minimap will figure out the new dimensions of the main svg + // and will use the existing translate and scale params. + this.minimap.zoom(); + }); + // Initialize the minimap. 
+ this.minimap = (this.$.minimap as any).init( + this.$.svg, + this.$.root, + this._zoom, + tf_graph_layout.PARAMS.minimap.size, + tf_graph_layout.PARAMS.subscene.meta.labelHeight, + ); + + // Add mouse, wheel, and keyboard event listeners + this._addEventListener(); + } + + _addEventListener(): void { + let isDragging = false; + let startX; + let startY; + let lastTime = 0; + const smoothFactor = 0.2; // smoothing factor applied to wheel scrolling + const maxDelta = MOUSE_MOVE_DELTA; // limits the scroll speed + const svgElement = this.$.svg as SVGSVGElement; + svgElement.setAttribute('tabindex', '0'); + + svgElement.addEventListener('mousedown', (event: MouseEvent) => { + isDragging = true; + startX = event.clientX; + startY = event.clientY; + svgElement.focus(); + }); + window.addEventListener('mouseup', () => { + isDragging = false; + }); + svgElement.addEventListener('mousemove', (event: MouseEvent) => { + [this.mouseX, this.mouseY] = [event.clientX, event.clientY]; + }); + // prettier-ignore + svgElement.addEventListener( + 'mousemove', // do not pan on every mouse move; throttle the handler instead + _.throttle((event: MouseEvent) => { + if (isDragging) { + this.x = this.x + ((event.clientX - startX) / 2); + this.y = this.y + ((event.clientY - startY) / 2); + this._moveView(); + startX = event.clientX; + startY = event.clientY; + } + }, 15), + ); // Throttled to limit the trigger rate: the handler runs at most once every 15 ms + // prettier-ignore + svgElement.addEventListener('wheel', (event: WheelEvent) => { + const currentTime = performance.now(); + const deltaTime = currentTime - lastTime; + if (deltaTime > 16) { + // Run at most once per frame (about 16 ms) + const deltaY = Math.sign(event.deltaY) * Math.min(Math.abs(event.deltaY), maxDelta); + this.y = this.y - (deltaY * smoothFactor * 2); + this._moveView(); + lastTime = currentTime; + } + }); + svgElement.addEventListener('keydown', (event: KeyboardEvent) => { + switch (event.key) { + case 'w': + case 'W': + this._scaleView(1.1); + break; + case 's': + case 'S': + this._scaleView(0.9); + break; + case 'a': + case 'A': + this.x += this._step; + this._moveView(); + break; + case 'd': + case 'D': +
this.x -= this._step; + this._moveView(); + break; + default: + return; // Ignore any key other than W, A, S, or D + } + }); + } + + _scaleView(scaleFactor: number): void { + if (this._zoomTransform) { + const svgElement = this.$.svg as SVGSVGElement; + const currentTransform = d3.zoomTransform(svgElement); + const k = currentTransform.k === 0 ? 1 : currentTransform.k; + const [mouseX, mouseY] = [ + this.mouseX - svgElement.getBoundingClientRect().left, + this.mouseY - svgElement.getBoundingClientRect().top, + ]; + const translateX = (mouseX - currentTransform.x) / k; + const translateY = (mouseY - currentTransform.y) / k; + const newScale = currentTransform.k * scaleFactor; + this.x = mouseX - (translateX * newScale); + this.y = mouseY - (translateY * newScale); + const newTransform = d3.zoomIdentity.translate(this.x, this.y).scale(newScale); + d3.select(this.$.svg).call(d3.zoom().transform, newTransform); + d3.select(this.$.root).attr('transform', newTransform.toString()); + this._zoomTransform = newTransform; + this.minimap.zoom(newTransform); + } + } + + _moveView(): void { + if (this._zoomTransform) { + requestAnimationFrame(() => { + const svgElement = this.$.svg as SVGElement; + const currentTransform = d3.zoomTransform(svgElement); + const newTransform = d3.zoomIdentity.translate(this.x, this.y).scale(currentTransform.k); + const svgSelection = d3.select(this.$.svg); + const rootSelection = d3.select(this.$.root); + svgSelection.call(d3.zoom().transform, newTransform); + rootSelection.attr('transform', newTransform.toString()); + // Update the stored transform + this._zoomTransform = newTransform; + // Notify the minimap; only called when the transform changes + this.minimap.zoom(newTransform); + }); + } + } + + override attached(): void { + this.set('_isAttached', true); + } + + override detached(): void { + this.set('_isAttached', false); + } + + _updateLabels(showLabels): void { + let mainGraphTitleElement = this.$$('.title') as HTMLElement; + let titleStyle = mainGraphTitleElement.style; + let auxTitleElement =
this.$$('.auxTitle') as HTMLElement; + let auxTitleStyle = auxTitleElement.style; + let functionLibraryTitleStyle = (this.$$('.functionLibraryTitle') as HTMLElement).style; + const root = d3.select(this.$.svg); + let core = root.select(`.${tf_graph_scene.Class.Scene.GROUP}>.${tf_graph_scene.Class.Scene.CORE}`).node(); + const isProgressComplete = showLabels && this.progress && this.progress.value === 100 && core; + if (isProgressComplete) { + let aux = + root.select(`.${tf_graph_scene.Class.Scene.GROUP}>.${tf_graph_scene.Class.Scene.INEXTRACT}`).node() || + root.select(`.${tf_graph_scene.Class.Scene.GROUP}>.${tf_graph_scene.Class.Scene.OUTEXTRACT}`).node(); + let coreX = (core as any).getCTM().e; + let auxX = aux ? (aux as any).getCTM().e : null; + titleStyle.display = 'inline'; + titleStyle.left = `${coreX}px`; + if (auxX !== null && auxX !== coreX) { + auxTitleStyle.display = 'inline'; + // Make sure that the aux title is positioned rightwards enough so as to + // prevent overlap with the main graph title. + auxX = Math.max(coreX + mainGraphTitleElement.getBoundingClientRect().width, auxX); + auxTitleStyle.left = `${auxX}px`; + } else { + auxTitleStyle.display = 'none'; + } + let functionLibrary = root + .select(`.${tf_graph_scene.Class.Scene.GROUP}>.${tf_graph_scene.Class.Scene.FUNCTION_LIBRARY}`) + .node(); + let functionLibraryX = functionLibrary ? (functionLibrary as any).getCTM().e : null; + if (functionLibraryX !== null && functionLibraryX !== auxX) { + functionLibraryTitleStyle.display = 'inline'; + // Make sure that the function library title is positioned rightwards + // enough so as to prevent overlap with other content. 
+ functionLibraryX = Math.max(auxX + auxTitleElement.getBoundingClientRect().width, functionLibraryX); + functionLibraryTitleStyle.left = `${functionLibraryX}px`; + } else { + functionLibraryTitleStyle.display = 'none'; + } + } else { + titleStyle.display = 'none'; + auxTitleStyle.display = 'none'; + functionLibraryTitleStyle.display = 'none'; + } + } + + fit(): void { + this._hasRenderHierarchyBeenFitOnce = true; + tf_graph_scene.fit( + this.$.svg, + this.$.root, + this._zoom, + (): void => { + this._zoomed = false; + }, + ); + } + + isNodeSelected(n): boolean { + return n === this.selectedNode; + } + + isNodeHighlighted(n): boolean { + return n === this.highlightedNode; + } + + isNodeLinked(n): boolean { + return n === this.linkedNode; + } + + addAnnotationGroup(a, d, selection): void { + let an = a.node.name; + this._annotationGroupIndex[an] = this._annotationGroupIndex[an] || {}; + this._annotationGroupIndex[an][d.node.name] = selection; + } + + getAnnotationGroupsIndex(a): any { + return this._annotationGroupIndex[a]; + } + + removeAnnotationGroup(a, d): void { + delete this._annotationGroupIndex[a.node.name][d.node.name]; + } + + addNodeGroup(n, selection): void { + this._nodeGroupIndex[n] = selection; + } + + getNodeGroup(n): any { + return this._nodeGroupIndex[n]; + } + + removeNodeGroup(n): void { + delete this._nodeGroupIndex[n]; + } + + /** + * Update node and annotation node of the given name. + * @param {String} n node name + */ + _updateNodeState(n): void { + let node = this.getNode(n); + if (!node) { + return; + } + let nodeGroup = this.getNodeGroup(n); + if (nodeGroup) { + tf_graph_scene_node.stylize(nodeGroup, node, this as any); + } + let annotationGroupIndex = this.getAnnotationGroupsIndex(n); + _.each(annotationGroupIndex, (aGroup, hostName) => { + tf_graph_scene_node.stylize(aGroup, node, this as any, tf_graph_scene.Class.Annotation.NODE); + }); + } + + /** + * Handles new node selection. 
1) Updates the selected-state of each node, + * 2) triggers input tracing. + * @param selectedNode {string} The name of the newly selected node. + * @param oldSelectedNode {string} The name of the previously selected node. + * @private + */ + _selectedNodeChanged(selectedNode, oldSelectedNode): void { + if (selectedNode === oldSelectedNode) { + return; + } + if (oldSelectedNode) { + this._updateNodeState(oldSelectedNode); + } + if (!selectedNode) { + this.set('linkedNode', ''); + return; + } + let node = this.renderHierarchy.hierarchy.node(selectedNode); + if (!node) { + return; + } + // Update the minimap to reflect the highlighted (selected) node. + (this.minimap as any).update(); + let nodeParents: string[] = []; + // Create list of all metanode parents of the selected node. + while (node.parentNode !== null && node.parentNode.name !== tf_graph.ROOT_NAME) { + node = (node as any).parentNode; + nodeParents.push(node.name); + } + // Ensure each parent metanode is built and expanded. + let topParentNodeToBeExpanded; + _.forEachRight(nodeParents, (parentName) => { + this.renderHierarchy.buildSubhierarchy(parentName); + let renderNode = this.renderHierarchy.getRenderNodeByName(parentName); + if (renderNode.node.isGroupNode && !renderNode.expanded) { + renderNode.expanded = true; + if (!topParentNodeToBeExpanded) { + topParentNodeToBeExpanded = renderNode; + } + } + }); + // If any expansion was needed to display this selected node, then + // inform the scene of the top-most expansion. + if (topParentNodeToBeExpanded) { + this.setNodeExpanded(topParentNodeToBeExpanded); + this._zoomed = true; + } + if (selectedNode) { + this._updateNodeState(selectedNode); + } + // Give time for any expanding to finish before panning to a node. + // Otherwise, the pan will be computed from incorrect measurements. 
+ setTimeout(() => { + // Do not auto-pan to center the node when it was selected by a mouse click. + if (this.enablePanSignal) { + this.panToNode(selectedNode); + } + this.enablePanSignal = true; + }, tf_graph_layout.PARAMS.animation.duration); + } + + _highlightedNodeChanged(highlightedNode, oldHighlightedNode): void { + if (highlightedNode === oldHighlightedNode) { + return; + } + if (highlightedNode) { + this._updateNodeState(highlightedNode); + } + if (oldHighlightedNode) { + this._updateNodeState(oldHighlightedNode); + } + } + + _linkedNodeChanged(linkedNode, oldLinkedNode): void { + if (linkedNode === oldLinkedNode) { + return; + } + if (oldLinkedNode) { + this._updateNodeState(oldLinkedNode); + } + if (linkedNode) { + this._updateNodeState(linkedNode); + } + } + + _onZoomChanged(): void { + this._updateLabels(!this._zoomed); + } + + _fireEnableClick(): void { + this.fire('enable-click'); + } + + // Disable auto-centering for the next mouse-click selection. + _noPanToNode(): void { + this.enablePanSignal = false; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph.ts new file mode 100644 index 0000000000000000000000000000000000000000..e16376056d37364161451c01e45a80d8effdf472 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph/tf-graph.ts @@ -0,0 +1,506 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ + +import { customElement, observe, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import * as _ from 'lodash'; +import '../polymer/irons_and_papers'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import * as tb_debug from '../tb_debug'; +import * as tf_graph from '../tf_graph_common/graph'; +import * as tf_graph_hierarchy from '../tf_graph_common/hierarchy'; +import * as tf_graph_render from '../tf_graph_common/render'; +import * as tf_graph_util from '../tf_graph_common/util'; +import * as tf_graph_layout from '../tf_graph_common/layout'; +import './tf-graph-scene'; +import './components/legend/index'; +import type { MinimapVis, Selection } from '../tf_graph_controls/tf-graph-controls'; +import { fetchPbTxt, parseGraphPbTxt } from '../tf_graph_common/parser'; +import * as tf_hierarchy from '../tf_graph_common/hierarchy'; +import * as tf_graph_parser from '../tf_graph_common/parser'; + +import { BENCH_PREFIX } from '../tf_graph_common/common'; +import { safeJSONParse } from '../utils'; + +let _isRankJump = ''; + +export function setRankJump(value: string): void { + _isRankJump = value; +} + +@customElement('tf-graph') +class TfGraph extends LegacyElementMixin(PolymerElement) { + static readonly template = html` + + +
+
+ + +
+ +
+ `; + + @property({ + type: Object, + notify: true, + observer: '_graphChanged', + }) + graphHierarchy: tf_graph_hierarchy.MergedHierarchy; + + @property({ type: Object }) + hierarchyParams: tf_graph_hierarchy.HierarchyParams; + + @property({ type: Object, notify: true }) + progress: object; + + @property({ type: String }) + override title: string; + + @property({ type: String, notify: true }) + selectedNode: string; + + @property({ type: String, notify: true }) + linkedNode: string; + + @property({ type: Object }) + _lastSelectedEdgeGroup: any; + + @property({ type: Object }) + _lastHighlightedEdgeGroup: any; + + @property({ type: String, notify: true }) + highlightedNode: string; + + @property({ type: Object, notify: true }) + highlightedEdge: tf_graph_render.EdgeData; + + @property({ + type: Object, + readOnly: true, + notify: true, + }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo; + + @property({ type: Array }) + nodeContextMenuItems: unknown[]; + + @property({ type: Number }) + _renderDepth: number = 1; + + @property({ type: Boolean }) + _allowGraphSelect: boolean = true; + + @property({ type: Object }) + handleNodeSelected: any = ''; + + @property({ type: Object }) + selection: Selection; + + @property({ type: Object }) + minimapVis: MinimapVis = { npu: true, bench: true }; + + @observe('graphHierarchy', 'handleNodeSelected') + _buildNewRenderHierarchy(): void { + let graphHierarchy = this.graphHierarchy; + if (!graphHierarchy) { + return; + } + this._buildRenderHierarchy(graphHierarchy); + } + + @observe('selectedNode') + // Called when the selected node changes, ie there is a new selected node or + // the current one is unselected. + _selectedNodeChanged(): void { + let selectedNode = this.selectedNode; + if (this.handleNodeSelected) { + // A higher-level component provided a callback. Run it. 
+ this.handleNodeSelected(selectedNode); + } + if (!selectedNode) { + return; + } + const [selectedHierarchy, linkedHierarchy] = ( + selectedNode.startsWith(BENCH_PREFIX) + ? [this.renderHierarchy.bench, this.renderHierarchy.npu] + : [this.renderHierarchy.npu, this.renderHierarchy.bench] + ) as tf_graph_render.RenderGraphInfo[]; + let node = selectedHierarchy.hierarchy.node(selectedNode); + if (!node) { + return; + } + const linkNodes = node.nodeAttributes._linked_node; + if (Array.isArray(linkNodes)) { + let tempNode = ''; + let lastRenderNode: tf_graph_render.RenderNodeInfo | undefined; + let lastExpandStatus = false; + for (let linkNode of linkNodes) { + const renderLinkedNode = linkedHierarchy.getRenderNodeByName(linkNode); + // Expand all ancestors of the linked node. + if (renderLinkedNode) { + lastRenderNode = renderLinkedNode; + lastExpandStatus = renderLinkedNode.expanded; + renderLinkedNode.expanded = true; + tempNode = linkNode; + } else { + break; + } + } + if (lastRenderNode) { + lastRenderNode.expanded = lastExpandStatus; + } + this.set('linkedNode', tempNode); + } else { + this.set('linkedNode', ''); + } + } + + @observe('selectedNode') + async _menuSelectedNodeExpand(): Promise<void> { + function shouldSkip(renderHierarchy: any, selectedNode: any): boolean { + const hasRenderHierarchy = !!renderHierarchy; + const isNodeRendered = + renderHierarchy?.npu?.renderedOpNames?.includes(selectedNode) || + renderHierarchy?.bench?.renderedOpNames?.includes(selectedNode); + const hasSelectedNode = !!selectedNode; + + return !hasRenderHierarchy || isNodeRendered || !hasSelectedNode; + } + if (shouldSkip(this.renderHierarchy, this.selectedNode)) { + return; + } else { + const current = this.selectedNode; + const tempHierarchy = ( + current.startsWith(BENCH_PREFIX) ? 
this.renderHierarchy.bench : this.renderHierarchy.npu + ) as tf_graph_render.RenderGraphInfo; + const params = new URLSearchParams(); + if (this.selection.run) { + params.set('run', this.selection.run); + } + if (this.selection.tag) { + params.set('tag', this.selection.tag); + } + params.set('node', this.selectedNode); + const nodeMap = tempHierarchy.hierarchy.getNodeMap(); + const expandnodesPath = `expandnodes?${String(params)}`; + + let nodeName = ''; + try { + const expandnodeStr = await tf_graph_parser.fetchPbTxt(expandnodesPath); + let expandnodes; + try { + expandnodes = safeJSONParse(new TextDecoder().decode(expandnodeStr).replace(/'/g, '"')) as object; + } catch (e) { + console.error('Get expandnodes failed, please check expanded function and the nodedata in vis file'); + } + if (expandnodes[1].length === 0 && expandnodes[2].length === 0) { + return; + } + for (const i of expandnodes[1]) { + nodeName = expandnodes[0] + i; + const renderNode = tempHierarchy.getRenderNodeByName(nodeName); + if (nodeName in nodeMap && !renderNode.expanded) { + await this._nodeToggleExpand({ detail: { name: nodeName } }); + } + } + this.async(() => { + try { + this.set('selectedNode', ''); // Temporarily clear the selection. + this.set('selectedNode', current); // Restore the original value. + } catch (e) { + console.error('Error during async set operation:', e); + } + }, 175); // Runs after a 175 ms delay, giving the browser enough time to process the rendering and state changes caused by multi-level expansion. + } catch (error) { + console.error('Error fetching expandnodesPath:', error); + } + } + } + + /** + * Pans to a node. Assumes that the node exists. + * @param nodeName {string} The name of the node to pan to. 
+ */ + panToNode(nodeName): void { + (this.$$('tf-graph-scene') as any).panToNode(nodeName); + } + + ready(): void { + super.ready(); + + this.addEventListener('graph-select', this._graphSelected.bind(this)); + this.addEventListener('disable-click', this._disableClick.bind(this)); + this.addEventListener('enable-click', this._enableClick.bind(this)); + // Nodes + this.addEventListener('node-toggle-expand', this._nodeToggleExpand.bind(this)); + document.addEventListener('parent-node-toggle-expand', this._parentNodeToggleExpand.bind(this)); + this.addEventListener('node-select', this._nodeSelected.bind(this)); + this.addEventListener('node-highlight', this._nodeHighlighted.bind(this)); + this.addEventListener('node-unhighlight', this._nodeUnhighlighted.bind(this)); + this.addEventListener('node-toggle-extract', this._nodeToggleExtract.bind(this)); + + // Annotations + + /* Note: currently highlighting/selecting annotation node has the same + * behavior as highlighting/selecting actual node so we point to the same + * set of event listeners. However, we might redesign this to be a bit + * different. + */ + this.addEventListener('annotation-select', this._nodeSelected.bind(this)); + this.addEventListener('annotation-highlight', this._nodeHighlighted.bind(this)); + this.addEventListener('annotation-unhighlight', this._nodeUnhighlighted.bind(this)); + } + + _buildRenderHierarchy(graphHierarchy: tf_graph_hierarchy.MergedHierarchy): void { + if ( + graphHierarchy.npu.root.type !== tf_graph.NodeType.META && + graphHierarchy.bench?.root.type !== tf_graph.NodeType.META + ) { + // root must be metanode but sometimes Polymer's dom-if has not + // remove tf-graph element yet in + // and thus mistakenly pass non-metanode to this module. + return; + } + + // Certain Polymer property setter are dynamically generated and is not properly + // typed. 
+ const anyThis = this as any; + + const renderGraph = tf_graph_util.time( + 'new tf_graph_render.Hierarchy', + () => { + const npuRenderGraph = new tf_graph_render.RenderGraphInfo(graphHierarchy.npu); + const mergedRenderGraph: tf_graph_render.MergedRenderGraphInfo = { npu: npuRenderGraph }; + if (graphHierarchy.bench) { + const benchRenderGraph = new tf_graph_render.RenderGraphInfo(graphHierarchy.bench); + mergedRenderGraph.bench = benchRenderGraph; + } + return mergedRenderGraph; + }, + tb_debug.GraphDebugEventId.RENDER_BUILD_HIERARCHY, + ); + setTimeout(() => { + if (_isRankJump) { + this.fire('node-select', { name: _isRankJump }); + _isRankJump = ''; + } + }, tf_graph_layout.PARAMS.animation.duration); + anyThis._setRenderHierarchy(renderGraph); + } + + _getVisible(name: string): string { + if (!name) { + return name; + } + const tempHierarchy = ( + name.startsWith(BENCH_PREFIX) ? this.renderHierarchy.bench : this.renderHierarchy.npu + ) as tf_graph_render.RenderGraphInfo; + return tempHierarchy.getNearestVisibleAncestor(name); + } + + fit(): void { + (this.$.scene as any).fit(); + (this.$$('#bench') as any).fit(); + } + + _graphChanged(): void { + if (!this.graphHierarchy) { + return; + } + + // When a new graph is loaded, fire this event so that there is no + // info-card being displayed for the previously-loaded graph. + this.fire('graph-select'); + } + + _graphSelected(event): void { + // Reset this variable as a bug in d3 zoom behavior can cause zoomend + // callback not to be called if a right-click happens during a zoom event. + this._allowGraphSelect = true; + } + + _disableClick(event): void { + this._allowGraphSelect = false; + } + + _enableClick(event): void { + this._allowGraphSelect = true; + } + + // Called only when a new (non-null) node is selected. 
+ _nodeSelected(event): void { + this.set('selectedNode', event.detail.name); + } + + _nodeHighlighted(event): void { + this.set('highlightedNode', event.detail.name); + } + + _nodeUnhighlighted(event): void { + this.set('highlightedNode', null); + } + + async _parentNodeToggleExpand(event): Promise<void> { + const nodeName = event.detail.nodeData.node.name; + const matchedNodeLink = event.detail.nodeData.node.matchedNodeLink; + if (matchedNodeLink) { + let matched = matchedNodeLink[matchedNodeLink.length - 1]; + this.set('selectedNode', matched); + } else { + const params = new URLSearchParams(); + params.set('run', this.selection.run); + params.set('node', nodeName); + if (this.selection.tag) { + params.set('tag', this.selection.tag); + } + params.set('batch', String(this.selection.batch === -1 ? -1 : this.selection.batch - 1)); + params.set('step', String(this.selection.step === -1 ? -1 : this.selection.step - 1)); + const parentPath = `parent?${String(params)}`; + const parentStr = await tf_graph_parser.fetchPbTxt(parentPath); + const parentNode = new TextDecoder().decode(parentStr).replace(/'/g, '"'); + this.set('selectedNode', parentNode); + } + } + + async _nodeToggleExpand(event): Promise<void> { + // Immediately select the node that is about to be expanded. + // Compute the sub-hierarchy scene. + const nodeName = event.detail.name; + const isBench = nodeName.startsWith(BENCH_PREFIX); + const tempHierarchy = ( + isBench ? this.renderHierarchy.bench : this.renderHierarchy.npu + ) as tf_graph_render.RenderGraphInfo; + const renderNode = tempHierarchy.getRenderNodeByName(nodeName); + // Op nodes are not expandable. 
+ if (renderNode.node.type === tf_graph.NodeType.OP) { + return; + } + if (!renderNode.expanded && !tempHierarchy.checkSubhierarchy(nodeName)) { + const params = new URLSearchParams(); + params.set('run', this.selection.run); + params.set('node', renderNode.node.name || ''); + if (this.selection.tag) { + params.set('tag', this.selection.tag); + } + params.set('batch', String(this.selection.batch === -1 ? -1 : this.selection.batch - 1)); + params.set('step', String(this.selection.step === -1 ? -1 : this.selection.step - 1)); + const graphPath = `subgraph?${String(params)}`; + const arrayBuffer = await fetchPbTxt(graphPath); // Wait for fetchPbTxt to finish. + const graphDef = await parseGraphPbTxt(arrayBuffer); // Wait for parseGraphPbTxt to finish. + const slimGraph = await tf_graph.build(graphDef, tf_graph.defaultBuildParams, undefined); // Wait for tf_graph.build to finish. + tf_hierarchy.update(tempHierarchy.hierarchy, slimGraph, nodeName); + tempHierarchy.buildSubhierarchy(nodeName, slimGraph); + renderNode.expanded = !renderNode.expanded; + this.async(() => { + (isBench ? this.$$('#bench') : (this.$.scene as any)).setNodeExpanded(renderNode); + }, 75); + } else { + renderNode.expanded = !renderNode.expanded; + this.async(() => { + (isBench ? this.$$('#bench') : (this.$.scene as any)).setNodeExpanded(renderNode); + }, 75); + } + } + + _nodeToggleExtract(event): void { + // Toggle the include setting of the specified node appropriately. + const nodeName = event.detail.name; + this.nodeToggleExtract(nodeName); + } + + nodeToggleExtract(nodeName: string): void { + const tempHierarchy = ( + nodeName.startsWith(BENCH_PREFIX) ? 
this.renderHierarchy.bench : this.renderHierarchy.npu + ) as tf_graph_render.RenderGraphInfo; + const renderNode = tempHierarchy.getRenderNodeByName(nodeName); + if (renderNode.node.include === tf_graph.InclusionType.INCLUDE) { + renderNode.node.include = tf_graph.InclusionType.EXCLUDE; + } else if (renderNode.node.include === tf_graph.InclusionType.EXCLUDE) { + renderNode.node.include = tf_graph.InclusionType.INCLUDE; + } else { + renderNode.node.include = tempHierarchy.isNodeAuxiliary(renderNode) + ? tf_graph.InclusionType.INCLUDE + : tf_graph.InclusionType.EXCLUDE; + } + // Rebuild the render hierarchy. + this._buildRenderHierarchy(this.graphHierarchy); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_board/tf-graph-board.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_board/tf-graph-board.ts new file mode 100644 index 0000000000000000000000000000000000000000..d4482aca583946ee53efc869a27714bb97c0c692 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_board/tf-graph-board.ts @@ -0,0 +1,264 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ +import { customElement, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import '../polymer/irons_and_papers'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import '../tf_graph/tf-graph'; +import * as tf_graph from '../tf_graph_common/graph'; +import * as tf_graph_hierarchy from '../tf_graph_common/hierarchy'; +import * as tf_graph_render from '../tf_graph_common/render'; +import '../tf_graph_node_info/index'; +import * as _ from 'lodash'; +import type { MinimapVis } from '../tf_graph_controls/tf-graph-controls'; + +/** + * Element for putting tf-graph and tf-graph-info side by side. + * + * Example + * + */ +@customElement('tf-graph-board') +class TfGraphBoard extends LegacyElementMixin(PolymerElement) { + static readonly template = html` + + +
+
+ +
+
+ + +
+
+ `; + + @property({ type: Object }) + graphHierarchy: tf_graph_hierarchy.MergedHierarchy; + + @property({ type: Object }) + graph: tf_graph.MergedSlimGraph; + + @property({ type: Object }) + hierarchyParams: tf_graph_hierarchy.HierarchyParams = tf_graph_hierarchy.defaultHierarchyParams; + + /** + * A number between 0 and 100 denoting the % of progress + * for the progress bar and the displayed message. + * @type {{value: number, msg: string}} + */ + @property({ type: Object }) + progress: object; + + @property({ type: Object, notify: true }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo; + + @property({ type: Object }) + menu: any; + + @property({ type: Object }) + colorset: any; + + @property({ type: String, notify: true }) + selectedNode: string; + + @property({ type: String }) + _highlightedNode: string; + + // An optional function that takes a node selected event (whose `detail` + // property is the selected node ... which could be null if a node is + // deselected). Called whenever a node is selected or deselected. + @property({ type: Object }) + handleNodeSelected: object; + + @property({ type: Object }) + selection: object; + + @property({ type: Object }) + tooltips: object; + + @property({ type: String }) + selectNodeCopy: string = ''; + + @property({ type: Object }) + minimapVis: MinimapVis = { npu: true, bench: true }; + + ready(): void { + super.ready(); + } + + fit(): void { + (this.$.graph as any).fit(); + } + + /** True if the progress is not complete yet (< 100 %). 
*/ + _isNotComplete(progress): boolean { + return progress.value < 100; + } + + _getContainerClass(progress): string { + let result = 'container'; + if (progress.error) { + result += ' error'; + } + if (this._isNotComplete(progress)) { + result += ' loading'; + } + return result; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/common.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/common.ts new file mode 100644 index 0000000000000000000000000000000000000000..caeba6bf68260c6ec285c1899a090d03d21d6426 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/common.ts @@ -0,0 +1,228 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +/** + * @fileoverview Common interfaces for the tensorflow graph visualizer. + */ +import * as d3 from 'd3'; + +export interface ProgressTracker { + updateProgress: (incrementValue: number) => void; + setMessage: (msg: string) => void; + reportError: (msg: string, err: Error) => void; +} + +/** Enums element class of objects in the scene */ +export const Class = { + Node: { + // element that contains nodes. + CONTAINER: 'nodes', + // element that contains detail about a node. + GROUP: 'node', + // element that contains visual elements (like rect, ellipse). 
+ SHAPE: 'nodeshape', + OUTER: 'outer', + // <*> element(s) under SHAPE that should receive color updates. + COLOR_TARGET: 'nodecolortarget', + // element showing the node's label. + LABEL: 'nodelabel', + // element that contains all visuals for the expand/collapse + // button for expandable group nodes. + BUTTON_CONTAINER: 'buttoncontainer', + // element that surrounds expand/collapse buttons. + BUTTON_CIRCLE: 'buttoncircle', + // element of the expand button. + EXPAND_BUTTON: 'expandbutton', + // element of the collapse button. + COLLAPSE_BUTTON: 'collapsebutton', + }, + Edge: { + CONTAINER: 'edges', + GROUP: 'edge', + LINE: 'edgeline', + REFERENCE_EDGE: 'referenceedge', + REF_LINE: 'refline', + SELECTABLE: 'selectableedge', + SELECTED: 'selectededge', + STRUCTURAL: 'structural', + HIGHLIGHTED: 'highlighted', + }, + Annotation: { + OUTBOX: 'out-annotations', + INBOX: 'in-annotations', + GROUP: 'annotation', + NODE: 'annotation-node', + EDGE: 'annotation-edge', + CONTROL_EDGE: 'annotation-control-edge', + LABEL: 'annotation-label', + ELLIPSIS: 'annotation-ellipsis', + }, + Scene: { + GROUP: 'scene', + CORE: 'core', + FUNCTION_LIBRARY: 'function-library', + INEXTRACT: 'in-extract', + OUTEXTRACT: 'out-extract', + }, + Subscene: { GROUP: 'subscene' }, + OPNODE: 'op', + METANODE: 'meta', + SERIESNODE: 'series', + BRIDGENODE: 'bridge', + ELLIPSISNODE: 'ellipsis', + API_LIST: 'api_list', + MULTI_COLLECTION: 'multi_collection', +}; + +// Please keep this in sync with tf-graph-scene.html.ts. +export const FontSizeInPx: Record> = { + Edge: { + LABEL: 3.5, + }, + Annotation: { + LABEL: 5, + }, + Node: { + EXPANDED_LABEL: 9, + SERIES_LABEL: 8, + OP_LABEL: 6, + HEALTH_PILL_STAT_LABEL: 4, + }, +}; + +export const SVG_NAMESPACE = 'http://www.w3.org/2000/svg'; + +/** + * Given a container d3 selection, select a child element of a given tag and + * class. If multiple children matches the tag and class name, returns only + * the first one. 
+ * + * @param container + * @param tagName tag name. + * @param className (optional) Class name or list of class names. + * @return selection of the element, or an empty selection + */ +export function selectChild( + container, + tagName: string, + className?: string | string[], +): d3.Selection { + let children = container.node().childNodes; + for (let i = 0; i < children.length; i++) { + let child = children[i]; + if (child.tagName === tagName) { + if (className instanceof Array) { + let hasAllClasses = true; + for (let j = 0; j < className.length; j++) { + hasAllClasses = hasAllClasses && child.classList.contains(className[j]); + } + if (hasAllClasses) { + return d3.select(child); + } + } else if (!className || child.classList.contains(className)) { + return d3.select(child); + } + } + } + return d3.select(null); +} + +/** + * Given a container d3 selection, select a child svg element of a given tag + * and class if exists or append / insert one otherwise. If multiple children + * matches the tag and class name, returns only the first one. + * + * @param container + * @param tagName tag name. + * @param className (optional) Class name or a list of class names. + * @param before (optional) reference DOM node for insertion. + * @return selection of the element + */ +export function selectOrCreateChild( + container, + tagName: string, + className?: string | string[], + before?, +): d3.Selection { + let child = selectChild(container, tagName, className); + if (!child.empty()) { + return child; + } + let newElement = document.createElementNS(SVG_NAMESPACE, tagName); + if (className instanceof Array) { + for (let i = 0; i < className.length; i++) { + newElement.classList.add(className[i]); + } + } else { + newElement.classList.add(className ?? 
''); + } + if (before) { + // if before exists, insert + container.node().insertBefore(newElement, before); + } else { + // otherwise, append + container.node().appendChild(newElement); + } + return ( + d3 + .select(newElement) + // need to bind data to emulate d3_selection.append + .datum(container.datum()) + ); +} + +/** The minimum stroke width of an edge. */ +export const MIN_EDGE_WIDTH = 0.75; +/** The maximum stroke width of an edge. */ +export const MAX_EDGE_WIDTH = 12; +/** The exponent used in the power scale for edge thickness. */ +const EDGE_WIDTH_SCALE_EXPONENT = 0.3; +/** The domain (min and max value) for the edge width. */ +const DOMAIN_EDGE_WIDTH_SCALE = [1, 5000000]; +export const EDGE_WIDTH_SIZE_BASED_SCALE: d3.ScalePower<number, number> = d3 + .scalePow() + .exponent(EDGE_WIDTH_SCALE_EXPONENT) + .domain(DOMAIN_EDGE_WIDTH_SCALE) + .range([MIN_EDGE_WIDTH, MAX_EDGE_WIDTH]) + .clamp(true); + +export const globalTooltips: { [key: string]: string } = {}; +// Node-name prefix for the NPU-side model +export const NPU_PREFIX = 'N___'; +// Node-name prefix for the benchmark-side model +export const BENCH_PREFIX = 'B___'; +// Color of unmatched nodes +export const UNMATCHED_COLOR = '#C7C7C7'; +// Menu label: expand the node on the corresponding side +export const EXPAND_NODE = '展开对应侧节点'; +// Menu label: data send +export const DATA_SEND = '数据发送'; +// Menu label: data receive +export const DATA_RECEIVE = '数据接收'; +// Menu label: data send and receive +export const DATA_SEND_RECEIVE = '数据发送接收'; +// Data load time +export const DATA_LOAD_TIME = 3000; +// Display time of the data-too-large notice +export const DATA_NOTICE_TIME = 600; +// Preset colors +export const defaultColorSetting = [ + { key: '#FFFCF3', values: [0, 0.2] }, + { key: '#FFEDBE', values: [0.2, 0.4] }, + { key: '#FFDC7F', values: [0.4, 0.6] }, + { key: '#FFC62E', values: [0.6, 0.8] }, + { key: '#ff704d', values: [0.8, 1] } +]; +// Preset color selection options +export const defaultColorSelects = [{ key: 'NaN', values: [NaN, NaN] }]; \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/contextmenu.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/contextmenu.ts new file mode 100644 index 
0000000000000000000000000000000000000000..8ebae9a7d53b81802edf71285a83d2544d29d862 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/contextmenu.ts @@ -0,0 +1,264 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import * as d3 from 'd3'; +import { TfGraphScene } from './tf-graph-scene'; +import * as tf_graph_parser from '../tf_graph_common/parser'; +import { getColorByPrecisionIndex } from './node'; +import { setRankJump } from '../tf_graph/tf-graph'; +import { BENCH_PREFIX, NPU_PREFIX, EXPAND_NODE, DATA_SEND, DATA_RECEIVE, DATA_SEND_RECEIVE } from './common'; +import { safeJSONParse } from '../utils'; + +export interface TitleFunction { + (data: any): string; +} +/** Function that takes action based on item clicked in the context menu. */ +export interface ActionFunction { + (data: any): string; +} +/** + * The interface for an item in the context menu + */ +export interface ContextMenuItem { + title: TitleFunction; + action: ActionFunction; +} +/** + * Returns the top and left distance of the scene element from the top left + * corner of the screen. 
+ */ +function getOffset(sceneElement): { left: number; top: number } { + let leftDistance = 0; + let topDistance = 0; + let currentElement = sceneElement; + while (currentElement && currentElement.offsetLeft >= 0 && currentElement.offsetTop >= 0) { + leftDistance += currentElement.offsetLeft - currentElement.scrollLeft; + topDistance += currentElement.offsetTop - currentElement.scrollTop; + currentElement = currentElement.offsetParent; + } + return { + left: leftDistance, + top: topDistance, + }; +} +/** + * Returns the event listener, which can be used as an argument for the d3 + * selection.on function. Renders the context menu that is to be displayed + * in response to the event. + */ +export function getMenu(sceneElement: TfGraphScene, nodeData): () => Promise<void> { + let selectedNode = ''; + const menuNode = sceneElement.getContextMenu(); + const menuSelection = d3.select(sceneElement.getContextMenu()); + // Function called to populate the context menu. + return async function (): Promise<void> { + // Position and display the menu. + let event = d3.event; + const sceneOffset = getOffset(sceneElement); + menuSelection + .style('display', 'block') + .style('left', `${event.clientX - sceneOffset.left + 1}px`) + .style('top', `${event.clientY - sceneOffset.top + 1}px`); + // Stop the event from propagating further. + event.preventDefault(); + event.stopPropagation(); + function maybeCloseMenu(closeEvent?: any): void { + if (closeEvent?.composedPath().includes(menuNode)) { + return; + } + menuSelection.style('display', 'none'); + document.body.removeEventListener('mousedown', maybeCloseMenu, { + capture: true, + }); + } + // Dismiss and remove the click listener as soon as there is a mousedown + // on the document. We use capture listener so no component can stop + // context menu from dismissing due to stopped propagation. + document.body.addEventListener('mousedown', maybeCloseMenu, { + capture: true, + }); + // Add provided items to the context menu. 
+ menuSelection.text(''); + const { name } = nodeData.node; + let nodePrefix = name.substring(0, 4); + let side = ''; + let nodeName = name; + if (nodePrefix === NPU_PREFIX) { + side = 'NPU'; + nodeName = name.substring(4); + } else if (nodePrefix === BENCH_PREFIX) { + side = 'Bench'; + nodeName = name.substring(4); + } else { + nodePrefix = ''; + } + + // Set the URL query parameters. + const params = new URLSearchParams(); + params.set('side', side); + params.set('node', nodeName); + + // Build the request path for this side, then fetch and parse the response. + const sidePath = `rank?${String(params)}`; + const sideStr = await tf_graph_parser.fetchPbTxt(sidePath); + const decodedStr = new TextDecoder().decode(sideStr); + const decodedObj = safeJSONParse(decodedStr); + if (decodedObj === null) { + console.error('Error: loading contextmenu failed please check data or getMenu function.'); + return; + } + // Build the menu options. + let communicationsType = decodedObj.communications_type; + const menuOptions = [{ text: EXPAND_NODE, action: 'expand' }]; + if (communicationsType || decodedObj[0]?.communications_type) { + // Helper that maps a communication type to its menu title text. + const getTitleText = (type: string): string => { + if (type === 'send') { + return DATA_SEND; + } + if (type === 'receive') { + return DATA_RECEIVE; + } + return DATA_SEND_RECEIVE; + }; + + // If communicationsType exists, handle it directly. + if (communicationsType) { + const titleText = getTitleText(communicationsType); + menuOptions.push({ text: titleText, action: 'rank' }); + } else if (decodedObj[0].communications_type) { + // Otherwise, if the entries of decodedObj carry communications_type, iterate over them. + decodedObj.forEach((item) => { + const titleText = getTitleText(item.communications_type); + menuOptions.push({ text: titleText, action: 'rank' }); + }); + } + } + let list = menuSelection.append('ul'); + list + .selectAll('li') + .data(menuOptions) + .enter() + .append('li') + .on('click', (d, i) => { + if (d.action === 'expand') { + sceneElement.fire('parent-node-toggle-expand', { nodeData }); + } + maybeCloseMenu(); + }) + .on('mouseover', (d, i, nodes) => { + if 
(d.action === 'rank') { + const parentLi = d3.select(nodes[i]); + // The first item is expand, so rank items always start at the second item; hence i - 1 + const id = decodedObj[i - 1]?.communications_type || decodedObj.communications_type; + // Check whether a submenu already exists to avoid adding it twice + if (!parentLi.select(`#submenu-${id}`).empty()) { + return; + } + // Generate the submenu dynamically + let nodeInfo: Record; + if (decodedObj[0]?.communications_type) { + nodeInfo = decodedObj[i - 1]?.nodes_info ?? {}; // Fall back to an empty object if undefined + } else { + nodeInfo = decodedObj.nodes_info ?? {}; // Fall back to an empty object if undefined + } + const subMenuOptions: Array<{ text: string; action: string; color: string }> = []; // Declare the type of the subMenuOptions array + for (const [key, value] of Object.entries(nodeInfo)) { + const rank = `rank${key}`; + const screen = getColorByPrecisionIndex(String(value[0])); + const menuName = value[1]; + subMenuOptions.push({ text: rank, action: menuName, color: screen }); + } + // Define a helper that applies styles, to reduce duplicated code + const setStyle = (element, styles): void => { + Object.keys(styles).forEach((key) => { + element.style(key, styles[key]); + }); + }; + + // Common styles + const submenuStyles1 = { + position: 'absolute', + left: '100%', + top: '27px', + background: '#e2e2e2', + color: 'black', + }; + + const submenuStyles2 = { + ...submenuStyles1, + top: '54px', // Second item + }; + + // Create the third style object, changing only the top property + const submenuStyles3 = { + ...submenuStyles1, + top: '81px', // Third item + }; + + // Styles applied on mouse hover + const hoverStyles = { + border: '1px solid #000', + color: 'black', + }; + + // Styles applied when the mouse leaves + const normalStyles = { + border: '1px solid rgb(199, 199, 199)', + }; + + const submenuStyles = [submenuStyles1, submenuStyles2, submenuStyles3]; + + // Create the submenu + const submenu = parentLi + .append('ul') + .attr('class', 'submenu') + .attr('id', `submenu-${id}`) // Dynamically generated unique id + .call(setStyle, submenuStyles[i - 1]); + + // Create the submenu items + submenu + .selectAll('li') + .data(subMenuOptions) + .enter() + .append('li') + .text((dText) => dText.text) + .style('background-color', (dColor) => dColor.color) + .call(setStyle, normalStyles) +
.on('mouseover', function () { + d3.select(this).call(setStyle, hoverStyles); + }) + .on('mouseout', function () { + d3.select(this).call(setStyle, normalStyles); + }) + .on('click', (subD) => { + // Handle clicks on submenu items + selectedNode = `${nodePrefix}${subD.action}`; + setRankJump(selectedNode); + sceneElement.fire('contextMenuTag-changed', parseInt(subD.text.slice(4), 10)); + maybeCloseMenu(); + }); + } + }) + .on('mouseleave', (d, i, nodes) => { + if (d.action === 'rank') { + d3.select(nodes[i]).select('.submenu').remove(); // Remove the submenu to hide it + } + }) + .text((d) => d.text); + }; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/graph.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/graph.ts new file mode 100644 index 0000000000000000000000000000000000000000..e666b486aa652c3e71bcbd5c529f5e3df93d7267 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/graph.ts @@ -0,0 +1,586 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies.
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import { graphlib } from 'dagre'; +import * as _ from 'lodash'; +import * as tb_debug from '../tb_debug'; +import { ProgressTracker } from './common'; +import * as tf_graph_proto from './proto'; +import * as tf_graph_util from './util'; + +export const NAMESPACE_DELIM = '/'; +export const ROOT_NAME = '__root__'; +export const FUNCTION_LIBRARY_NODE_PREFIX = '__function_library__'; +/** Attribute key used for storing attributes that are too large. */ +export const LARGE_ATTRS_KEY = '_too_large_attrs'; +/** Precision attributes are used to represent the color of nodes. */ +export const NODE_TYPE = 'node_type'; +export const PRECISION_INDEX = 'precision_index'; +export const MATCHED_NODE_LINK = 'matched_node_link'; +export const OVERFLOW_LEVEL = 'overflow_level'; +/** + * Maximum allowed size in bytes, before the attribute is considered large + * and filtered out of the graph. + */ +export const LIMIT_ATTR_SIZE = 1024; +// Separator between the source and the destination name of the edge. +export const EDGE_KEY_DELIM = '--'; +export enum GraphType { + FULL = 0, + EMBEDDED = 1, + META = 2, + SERIES = 3, + CORE = 4, + SHADOW = 5, + BRIDGE = 6, + EDGE = 7, +} +export enum NodeType { + META = 0, + OP = 1, + SERIES = 2, + BRIDGE = 3, + ELLIPSIS = 4, + MULTI_COLLECTION = 8, + API_LIST = 9, +} +/** Indicates if a node is to be included in the main graph when rendered. */ +export enum InclusionType { + INCLUDE = 0, + EXCLUDE = 1, + UNSPECIFIED = 2, +} + +// Including both the NPU and benchmark slimgraph. +export interface MergedSlimGraph { + npu: SlimGraph; + bench?: SlimGraph; +} +/** + * A SlimGraph is inspired by graphlib.Graph, but having only the functionality + * that we need. 
+ */ +export class SlimGraph { + nodes: { + [nodeName: string]: OpNode; + }; + metaNodes: { + [nodeName: string]: Metanode; + }; + constructor() { + this.nodes = {}; + this.metaNodes = {}; + } +} +export interface NormalizedInput { + name: string; + /** The index of the output tensor of the source node. */ + outputTensorKey: string; +} +export interface BuildParams { + enableEmbedding: boolean; + inEmbeddingTypes: string[]; + outEmbeddingTypes: string[]; + refEdges: { + [inputEdge: string]: boolean; + }; +} +/** + * The most basic information about a node in the hierarchical graph. + */ +export interface Node { + /** The name of the node, used frequently to look up nodes by name. */ + name: string; + /** Which type of node this is. */ + type: NodeType; + inputData: { + [key: string]: any; + }; + outputData: { + [key: string]: any; + }; + suggestions: { + [key: string]: string; + }; + /** + * Whether this node is a type that may contain other nodes. Those types + * should extend from GroupNode. + * + * For an OpNode, isGroupNode will be false, even though it may have + * embeddings. These embedding Nodes will have their parentNode set to the + * OpNode. However, embeddings are later rendered as annotations, not as + * children to be made visible on expansion (like a Metanode or SeriesNode). + */ + isGroupNode: boolean; + /** + * The number of nodes this node represents. For OpNodes, this will be 1, and + * for GroupNodes it will be a count of the total number of descendents it + * contains. + */ + cardinality: number; + /** + * The Node which is this Node's parent. This is of type Node and not + * GroupNode because of embeddings, which will have a parent OpNode. + */ + parentNode: Node | null; + /** If the node is to be included or excluded from the main graph when + * rendered. Defaults to UNSPECIFIED, which means that the rendering + * algorithm determines if it will be included or not. Then can be set to + * INCLUDE or EXCLUDE manually by the user. 
+ */ + include: InclusionType; + /** + * Node attributes specify customizable visual aspects of a node and + * application-specific metadata associated with a node. The name + * 'nodeAttributes' is meant to avoid naming-conflicts with the 'attr' in + * subclasses of Node. + */ + nodeAttributes: { + [key: string]: any; + }; +} +export type TensorShape = number[]; +export interface OpNode extends Node { + op: string; + attr: Array<{ + key: string; + value: any; + }>; + inputData: { + [key: string]: any; + }; + outputData: { + [key: string]: any; + }; + stackData: []; + matchedNodeLink: []; + suggestions: { + [key: string]: string; + }; +} + +export interface GroupNode extends Node { + metagraph: graphlib.Graph; +} +export interface Metanode extends GroupNode { + depth: number; + attr: Array<{ + key: string; + value: any; + }>; + inputData: { + [key: string]: any; + }; + outputData: { + [key: string]: any; + }; + stackData: []; + matchedNodeLink: []; + suggestions: { + [key: string]: string; + }; + getFirstChild: () => GroupNode | OpNode; + getRootOp: () => OpNode; + /** Return name of all leaves inside a metanode. */ + leaves: () => string[]; +} + +/** + * A label object for nodes in the full graph and leaf nodes in the render + * graph. + */ +export class OpNodeImpl implements OpNode { + name: string; + op: string; + attr: Array<{ + key: string; + value: any; + }>; + type: NodeType; + isGroupNode: boolean; + cardinality: number; + parentNode: Node | null; + include: InclusionType; + inputData: { + [key: string]: any; + }; + outputData: { + [key: string]: any; + }; + stackData: []; + matchedNodeLink: []; + suggestions: { + [key: string]: string; + }; + nodeAttributes: { + [key: string]: any; + }; + + /** + * Constructs a new Op node. + * + * @param rawNode The raw node. 
+ */ + constructor(rawNode: tf_graph_proto.NodeDef) { + this.op = rawNode.op; + this.name = rawNode.name; + this.attr = rawNode.attr; + // additional properties + this.type = NodeType.OP; + this.isGroupNode = false; + this.cardinality = 1; + this.parentNode = null; + this.include = InclusionType.UNSPECIFIED; + this.inputData = rawNode.input_data; + this.outputData = rawNode.output_data; + this.suggestions = rawNode.suggestions; + this.stackData = rawNode.stack_info; + this.matchedNodeLink = rawNode.matched_node_link; + this.nodeAttributes = {}; + } +} + +export function createMetanode(name: string, opt = {}): Metanode { + return new MetanodeImpl(name, opt); +} + +export class MetanodeImpl implements Metanode { + name: string; + type: NodeType; + depth: number; + isGroupNode: boolean; + cardinality: number; + metagraph: graphlib.Graph; + parentNode: Node | null; + include: InclusionType; + inputData: { + [key: string]: any; + }; + outputData: { + [key: string]: any; + }; + stackData: []; + matchedNodeLink: []; + suggestions: { + [key: string]: string; + }; + nodeAttributes: { + [key: string]: any; + }; + attr: Array<{ + key: string; + value: any; + }>; + + /** A label object for meta-nodes in the graph hierarchy */ + constructor(name: string, opt = {}) { + this.name = name; + this.type = NodeType.META; + /** number of levels under this group */ + this.depth = 1; + this.isGroupNode = true; + /** # of leaf nodes (including embedded ones) */ + this.cardinality = 0; + /** graph contains metanodes, nodes, edges + * and metaedges for main items within this metanode + */ + this.metagraph = createGraph(name, GraphType.META, opt); + /** Metanode which contains this node, if any */ + this.parentNode = null; + this.include = InclusionType.UNSPECIFIED; + this.attr = []; + this.inputData = {}; + this.outputData = {}; + this.stackData = []; + this.matchedNodeLink = []; + this.suggestions = {}; + this.nodeAttributes = {}; + } + + getFirstChild(): GroupNode | OpNode { + return 
this.metagraph.node(this.metagraph.nodes()[0]) as any; + } + + /** + * Returns the op node associated with the metanode. + * For example, if the metanode is 'sgd', the associated + * op node is sgd/(sgd). + */ + getRootOp(): OpNode { + let nameSplit = this.name.split('/'); + let rootOpName = `${this.name}/(${nameSplit[nameSplit.length - 1]})`; + return this.metagraph.node(rootOpName) as any; + } + + /** + * Return an array of the names of all the leaves (non-GroupNodes) inside + * this metanode. This performs a breadth-first search of the tree, so + * immediate child leaves will appear earlier in the output array than + * descendant leaves. + */ + leaves(): string[] { + let leaves: string[] = []; + let queue = [this as Node]; + let metagraph; // Defined here due to a limitation of ES6->5 compilation. + while (queue.length) { + let node = queue.shift(); + if (node?.isGroupNode) { + metagraph = (<GroupNode>node).metagraph; + _.each(metagraph.nodes(), (name) => queue.push(metagraph.node(name))); + } else { + leaves.push(node?.name ?? ''); + } + } + return leaves; + } +} + +export const defaultBuildParams: BuildParams = { + enableEmbedding: true, + inEmbeddingTypes: ['Const'], + outEmbeddingTypes: ['^[a-zA-Z]+Summary$'], + // This is the whitelist of inputs on op types that are considered + // reference edges. "Assign 0" indicates that the first input to + // an OpNode with operation type "Assign" is a reference edge.
+ refEdges: { + 'Assign 0': true, + 'AssignAdd 0': true, + 'AssignSub 0': true, + 'assign 0': true, + 'assign_add 0': true, + 'assign_sub 0': true, + 'count_up_to 0': true, + 'ScatterAdd 0': true, + 'ScatterSub 0': true, + 'ScatterUpdate 0': true, + 'scatter_add 0': true, + 'scatter_sub 0': true, + 'scatter_update 0': true, + }, +}; + +export function build( + graphDef: tf_graph_proto.GraphDef, + params: BuildParams, + tracker?: ProgressTracker, +): Promise<SlimGraph> { + let embeddingNodeNames: string[] = []; + let rawNodes = graphDef.node; + /** + * A list of all the non-embedding node names which appear in the processed + * list of raw nodes. Here we pre-allocate enough room for all the rawNodes, + * even though there will be some number of embeddings. The excess array length + * is spliced off later. + * + * Experimentation shows that around 30% of the array will go unused, and + * even for very large networks that amounts to less than 10k spaces. + */ + let nodeNames = new Array(rawNodes.length); + return tf_graph_util + .runAsyncTask( + 'Normalizing names', + 30, + () => { + let opNodes = new Array(rawNodes.length); + let index = 0; + const processRawNode = (rawNode: tf_graph_proto.NodeDef): OpNodeImpl | MetanodeImpl => { + if (!rawNode.isLeaf) { + let metaNode = new MetanodeImpl(rawNode.name); + metaNode.attr = rawNode.attr; + metaNode.nodeAttributes._order = index; + if (rawNode.matched_node_link && rawNode.matched_node_link.length > 0) { + metaNode.nodeAttributes._linked_node = rawNode.matched_node_link; + } + metaNode.inputData = rawNode.input_data; + metaNode.outputData = rawNode.output_data; + metaNode.stackData = rawNode.stack_info; + metaNode.matchedNodeLink = rawNode.matched_node_link; + metaNode.suggestions = rawNode.suggestions; + if (Number(rawNode.node_type) === 1) { + metaNode.type = 0; + } else { + metaNode.type = Number(rawNode.node_type); + } + opNodes[index] = metaNode; + nodeNames[index] = metaNode.name; + index++; + return metaNode; + } else { + let
opNode = new OpNodeImpl(rawNode); + opNode.nodeAttributes._order = index; + if (rawNode.matched_node_link && rawNode.matched_node_link.length > 0) { + opNode.nodeAttributes._linked_node = rawNode.matched_node_link; + } + opNodes[index] = opNode; + nodeNames[index] = opNode.name; + index++; + return opNode; + } + }; + _.each(rawNodes, processRawNode); + opNodes.splice(index); + nodeNames.splice(index); + return opNodes; + }, + tracker, + tb_debug.GraphDebugEventId.NORMALIZING_NAMES, + ) + .then((opNodes) => { + // Create the graph data structure from the graphlib library. + return tf_graph_util.runAsyncTask( + 'Building the data structure', + 70, + () => { + let normalizedNameDict = mapStrictHierarchy(nodeNames, embeddingNodeNames); + let graph = new SlimGraph(); + // Add the nodes to the graph. + _.each(opNodes, (opNode) => { + if (opNode instanceof OpNodeImpl) { + let normalizedName = normalizedNameDict[opNode.name] || opNode.name; + graph.nodes[normalizedName] = opNode; + // Update the name of the node. + opNode.name = normalizedName; + } else { + graph.metaNodes[opNode.name] = opNode as MetanodeImpl; + } + }); + return graph; + }, + tracker, + tb_debug.GraphDebugEventId.BUILD_SLIM_GRAPH, + ); + }); +} + +/** + * Create a new graphlib.Graph() instance with default parameters + */ +export function createGraph(name: string, type, graphOptions: LabeledGraphOptions = {}): graphlib.Graph { + const graph = new graphlib.Graph({ ...graphOptions, multigraph: true }); + graph.setGraph({ + name: name, + rankdir: graphOptions.rankdir || 'TB', + type: type, + } as any); + return graph; +} + +/** + * Returns a strict node name (name => name/(name)) to avoid conflicts + * where the node name is also a namespace. 
+ */ +export function getStrictName(name: string): string { + let parts = name.split(NAMESPACE_DELIM); + return `${name}${NAMESPACE_DELIM}(${parts[parts.length - 1]})`; +} + +/** + * For each op node (embedding or non-embedding), rename it if there is a + * non-embedding node under its namespace. For example, assume node name 'A'. + * If there is a non-embedding node under its namespace (e.g. 'A/B'), 'A' will + * be renamed to 'A/(A)'. Then the namespace 'A' will contain 2 nodes: '(A)' + * and 'B'. If all the nodes under 'A' are embedding nodes (e.g. constant and + * summary), keep 'A' as an Op node and don't create a namespace. + * + * @param nodeNames An array of regular (non-embedding) node names. + * @param embeddingNodeNames An array of embedding node names. + * @return Dictionary object mapping names that need to be renamed to + * new names. + */ +function mapStrictHierarchy( + nodeNames: string[], + embeddingNodeNames: string[], +): { + [oldName: string]: string; +} { + /** Dictionary that maps the old name to the new name */ + let newNameDictionary: { + [oldName: string]: string; + } = {}; + /** Set used to store all namespaces. */ + let namespaceSet: { + [namespace: string]: boolean; + } = {}; + // sort the nodes to make prefix check faster + nodeNames.sort(); + // look for nodes with a prefix a,a/b -> a/(a),a/b + for (let i = 0; i < nodeNames.length - 1; ++i) { + let a = nodeNames[i]; + // Get all the parent namespaces of the current node + // and add them in the namespace set. + _.each(getHierarchicalPath(a).slice(0, -1), (ns) => { + namespaceSet[ns] = true; + }); + for (let j = i + 1; j < nodeNames.length; ++j) { + let b = nodeNames[j]; + if (_.startsWith(b, a)) { + if (b.length > a.length && b.charAt(a.length) === NAMESPACE_DELIM) { + newNameDictionary[a] = getStrictName(a); + break; + } + } else { + break; + } + } + } + // Go through all the embedding node names and rename them in case they + // collide with namespaces.
+ _.each(embeddingNodeNames, (embeddingName) => { + if (embeddingName in namespaceSet) { + // Rename to follow strict hierarchy. + newNameDictionary[embeddingName] = getStrictName(embeddingName); + } + }); + return newNameDictionary; +} + +/** + * Returns the hierarchical path of the current node, based on the node's name. + * For example, if the name is 'a/b/c', the returned path is + * ['a', 'a/b', 'a/b/c']. + */ +export function getHierarchicalPath(name: string): string[] { + let path: string[] = []; + let i = name.indexOf(NAMESPACE_DELIM); + // Push all parent portions of the path. + while (i >= 0) { + path.push(name.substring(0, i)); + i = name.indexOf(NAMESPACE_DELIM, i + 1); + } + // Push the leaf of the path. + path.push(name); + return path; +} + +/** + * An extended variant of the options object for `graphlib.Graph`, used + * to configure a `graphlib.Graph` at its creation. + * + * Dagre's constructor has an `opts` object as a parameter, let's call it + * 'GraphCtorOptions'. The Graph's `setGraph()` has a `label` parameter, + * let's call it `LabelOptions`. + * + * Since both are configured when a `graphlib.Graph` is first initialized, + * TensorBoard's Graph code passes around this hybrid object which includes + * properties from both `GraphCtorOptions` (compound) and `LabelOptions` + * (rankdir). + */ +export interface LabeledGraphOptions { + compound?: boolean; + rankdir?: string; + multigraph?: boolean; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/hierarchy.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/hierarchy.ts new file mode 100644 index 0000000000000000000000000000000000000000..814f468771bd80e25fe937698a1ba7d1bc92c2e7 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/hierarchy.ts @@ -0,0 +1,164 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. 
+ +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +/** + * Package for the Graph Hierarchy for TensorFlow graph. + */ +import * as _ from 'lodash'; +import * as tb_debug from '../tb_debug'; +import { ProgressTracker } from './common'; +import * as tf_graph from './graph'; +import { + createMetanode, + GroupNode, + Metanode, + OpNode, + ROOT_NAME, + SlimGraph, + MetanodeImpl, +} from './graph'; +import * as tf_graph_util from './util'; + +export enum HierarchyEvent { + /** + * Fired when the templates may have been updated. No event payload attached. + */ + TEMPLATES_UPDATED = 0, +} + +// Including both the NPU and the benchmark hierarchy. +export interface MergedHierarchy { + npu: Hierarchy; + bench?: Hierarchy; +} +/** + * Class for the Graph Hierarchy for TensorFlow graph. + */ +export class Hierarchy extends tf_graph_util.Dispatcher { + root: Metanode; + /** + * Options passed to dagre for creating the graph. Note that the + * `compound` argument will be overridden to true. 
+ */ + graphOptions: tf_graph.LabeledGraphOptions = {}; + + private index: { + [nodeName: string]: GroupNode | OpNode; + }; + + constructor(params: HierarchyParams) { + super(); + this.graphOptions.compound = true; + this.graphOptions.rankdir = params.rankDirection; + this.root = createMetanode(ROOT_NAME, this.graphOptions); + this.index = {}; + this.index[ROOT_NAME] = this.root; + } + + getNodeMap(): { + [nodeName: string]: GroupNode | OpNode; + } { + return this.index; + } + + node(name: string): GroupNode | OpNode { + return this.index[name]; + } + + setNode(name: string, node: GroupNode | OpNode): void { + this.index[name] = node; + } +} + +export interface HierarchyParams { + verifyTemplate: boolean; + rankDirection: string; +} + +export const defaultHierarchyParams: HierarchyParams = { + verifyTemplate: true, + rankDirection: 'TB', +}; + +/** + * @param graph The raw graph. + * @param params Parameters used when building a hierarchy. + */ +export function build( + graph: tf_graph.SlimGraph, + params: HierarchyParams, + tracker?: ProgressTracker, +): Promise<Hierarchy> { + const h = new Hierarchy(params); + return tf_graph_util + .runAsyncTask( + 'Adding nodes', + 100, + () => { + addNodesInVis(h, graph, ROOT_NAME); + }, + tracker, + tb_debug.GraphDebugEventId.HIERARCHY_ADD_NODES, + ) + .then(() => { + return h; + }); +} + +/** + * Updates hierarchy when the subgraph of a node is built. + * @param oldGraph + * @param slimGraph + */ +export function update(oldGraph: Hierarchy, slimGraph: tf_graph.SlimGraph, nodeName: string): void { + let node = oldGraph.node(nodeName) as Metanode; + if (node) { + addNodesInVis(oldGraph, slimGraph, nodeName); + } +} + +/** + * Creates the metanodes in the hierarchical graph and assigns parent-child + * relationship between them in vis mode.
+ * @param h + * @param graph + * @param parentName + */ +function addNodesInVis(h: Hierarchy, graph: SlimGraph, parentName: string): void { + const parentNode = h.node(parentName); + if (!(parentNode instanceof MetanodeImpl)) { + return; + } + const orderedNodes: Array<{ idx: number; name: string; node: any }> = []; + _.each([graph.nodes, graph.metaNodes], (nodes) => { + _.each(nodes, (node) => { + node.parentNode = parentNode; + orderedNodes.push({ + idx: node.nodeAttributes._order ?? 0, + name: node.name, + node, + }); + h.setNode(node.name, node); + }); + }); + _.each( + orderedNodes.sort((a, b) => a.idx - b.idx), + (item) => { + parentNode.metagraph.setNode(item.name, item.node); + }, + ); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/layout.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/layout.ts new file mode 100644 index 0000000000000000000000000000000000000000..37de98fc567a4012e5c4e20b8a902904a3ed5d3d --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/layout.ts @@ -0,0 +1,316 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import * as d3 from 'd3'; +import * as dagre from 'dagre'; +import { graphlib } from 'dagre'; +import * as _ from 'lodash'; +import { NodeType } from './graph'; +import * as render from './render'; + +export const PARAMS = { + animation: { + /** Default duration for graph animations in ms. */ + duration: 250, + }, + graph: { + /** Graph parameter for metanode. */ + meta: { + /** + * Dagre's nodesep param - number of pixels that + * separate nodes horizontally in the layout. + * + */ + nodeSep: 5, + /** + * Dagre's edgesep param - number of pixels that separate + * edges horizontally in the layout. + */ + edgeSep: 5, + }, + /** + * Padding is used to correctly position the graph SVG inside of its parent + * element. The padding amounts are applied using an SVG transform of X and + * Y coordinates. + */ + padding: { paddingTop: 40, paddingLeft: 20 }, + }, + subscene: { + meta: { + paddingTop: 5, + paddingBottom: 5, + paddingLeft: 8, + paddingRight: 8, + /** + * Used to leave room for the label on top of the highest node in + * the core graph. + */ + labelHeight: 20, + /** X-space between each extracted node and the core graph. */ + extractXOffset: 15, + /** Y-space between each extracted node. */ + extractYOffset: 20, + }, + }, + nodeSize: { + /** Size of meta nodes. */ + meta: { + radius: 5, + width: 60, + maxLabelWidth: 200, + /** A scale for the node's height based on number of nodes inside */ + // Hack - set this as an any type to avoid issues in exporting a type + // from an external module. + height: (d3 as any).scaleLinear().domain([1, 200]).range([15, 60]).clamp(true), + /** The radius of the circle denoting the expand button. 
*/ + expandButtonRadius: 3, + }, + multiCollection: { + radius: 5, + width: 60, + maxLabelWidth: 200, + /** A scale for the node's height based on number of nodes inside */ + // Hack - set this as an any type to avoid issues in exporting a type + // from an external module. + height: (d3 as any).scaleLinear().domain([1, 200]).range([15, 60]).clamp(true), + /** The radius of the circle denoting the expand button. */ + expandButtonRadius: 3, + }, + apiList: { + radius: 5, + width: 60, + maxLabelWidth: 200, + /** A scale for the node's height based on number of nodes inside */ + // Hack - set this as an any type to avoid issues in exporting a type + // from an external module. + height: (d3 as any).scaleLinear().domain([1, 200]).range([15, 60]).clamp(true), + /** The radius of the circle denoting the expand button. */ + expandButtonRadius: 3, + }, + /** Size of op nodes. */ + op: { + width: 30, + height: 12, + radius: 6, + labelOffset: -12, + maxLabelWidth: 40, + }, + }, + shortcutSize: { + /** Size of shortcuts for op nodes */ + op: { width: 10, height: 4 }, + /** Size of shortcuts for meta nodes */ + meta: { width: 12, height: 4, radius: 1 }, + /** Size of shortcuts for multiCollection nodes */ + multiCollection: { width: 12, height: 4, radius: 1 }, + /** Size of shortcuts for apiList nodes */ + apiList: { width: 12, height: 4, radius: 1 }, + }, + annotations: { + /** Maximum possible width of the bounding box for in annotations */ + inboxWidth: 50, + /** Maximum possible width of the bounding box for out annotations */ + outboxWidth: 50, + /** X-space between the shape and each annotation-node. */ + xOffset: 10, + /** Y-space between each annotation-node. */ + yOffset: 3, + /** X-space between each annotation-node and its label. */ + labelOffset: 2, + /** Defines the max width for annotation label */ + maxLabelWidth: 40, + }, + constant: { size: { width: 4, height: 4 } }, + minimap: { + /** The maximum width/height the minimap can have. 
*/ + size: 150, + }, +}; +/** + * The minimum width we confer upon the auxiliary nodes section if functions + * also appear. Without enforcing this minimum, metanodes in the function + * library section could jut into the auxiliary nodes section because the + * title "Auxiliary Nodes" is longer than the width of the auxiliary nodes + * section itself. + */ +export const MIN_AUX_WIDTH = 140; +/** + * Keep this number as the same as 'maxMetanodeLabelLength' in 'tf-graph-scene' + */ +export const MAX_TEXT_LENGTH = 50; +/** + * 6 pixels per character. + */ +export const CHARACTER_WIDTH = 6; +/** Calculate layout for a scene of a group node. */ +export function layoutScene(renderNodeInfo: render.RenderGroupNodeInfo): void { + // Update layout, size, and annotations of its children nodes and edges. + if (renderNodeInfo.node.isGroupNode) { + layoutChildren(renderNodeInfo); + } + // Update position of its children nodes and edges + if (renderNodeInfo.node.type === NodeType.META) { + layoutMetanode(renderNodeInfo, 10); + } else if (renderNodeInfo.node.type === NodeType.MULTI_COLLECTION) { + layoutMetanode(renderNodeInfo, 10); + } else if (renderNodeInfo.node.type === NodeType.API_LIST) { + layoutMetanode(renderNodeInfo, 32); + } else { + } +} +/** + * Updates the total width of an unexpanded node which includes the size of its + * in and out annotations. + */ +function updateTotalWidthOfNode(renderInfo: render.RenderNodeInfo): void { + // Assign the width of the core box (the main shape of the node). + renderInfo.coreBox.width = renderInfo.width; + renderInfo.coreBox.height = renderInfo.height; + let labelLength = renderInfo.displayName.length; + // Compute the total width of the node. 
+ if (renderInfo.node.type === NodeType.OP) { + renderInfo.width = PARAMS.nodeSize.op.maxLabelWidth; + } else { + renderInfo.width = Math.max( + renderInfo.coreBox.width, + Math.min(labelLength * CHARACTER_WIDTH, PARAMS.nodeSize.meta.maxLabelWidth), + ); + } +} +/** + * Update layout, size, and annotations of its children nodes and edges. + */ +function layoutChildren(renderNodeInfo: render.RenderGroupNodeInfo): void { + let children = renderNodeInfo.coreGraph + .nodes() + .map((n) => { + return renderNodeInfo.coreGraph.node(n); + }) + _.each(children, (childNodeInfo) => { + // Set size of each child + switch (childNodeInfo.node.type) { + case NodeType.OP: + _.extend(childNodeInfo, PARAMS.nodeSize.op); + break; + case NodeType.META: + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: + if (!childNodeInfo.expanded) { + // Set fixed width and scalable height based on cardinality + _.extend(childNodeInfo, PARAMS.nodeSize.meta); + childNodeInfo.height = PARAMS.nodeSize.meta.height(childNodeInfo.node.cardinality); + childNodeInfo.width = Math.max( + childNodeInfo.width, + Math.min(childNodeInfo.displayName.length, MAX_TEXT_LENGTH) * CHARACTER_WIDTH, + ); + } else { + let childGroupNodeInfo = childNodeInfo; + layoutScene(childGroupNodeInfo); // Recursively layout its subscene. + } + break; + default: + throw Error(`Unrecognized node type: ${childNodeInfo.node.type}`); + } + // Compute total width of un-expanded nodes. Width of expanded nodes + // has already been computed. 
+ if (!childNodeInfo.expanded) { + updateTotalWidthOfNode(childNodeInfo); + } + }); +} +/** + * Calculate layout for a graph using dagre + * @param graph the graph to be laid out + * @param params layout parameters + * @return width and height of the core graph + */ +function dagreLayout(graph: graphlib.Graph, params): { height: number; width: number } { + _.extend(graph.graph(), { + nodesep: params.nodeSep, + ranksep: params.rankSep, + edgesep: params.edgeSep, + }); + dagre.layout(graph); + // Calculate the true bounding box of the graph by iterating over nodes and + // edges rather than accepting dagre's word for it. In particular, we should + // ignore the extra-wide bridge nodes and bridge edges, and allow for + // annotation boxes and labels. + let minX = Infinity; + let minY = Infinity; + let maxX = -Infinity; + let maxY = -Infinity; + _.each(graph.nodes(), (nodeName) => { + let nodeInfo = graph.node(nodeName); + let w = 0.5 * nodeInfo.width; + let x1 = nodeInfo.x - w; + let x2 = nodeInfo.x + w; + minX = x1 < minX ? x1 : minX; + maxX = x2 > maxX ? x2 : maxX; + let h = 0.5 * nodeInfo.height; + let y1 = nodeInfo.y - h; + let y2 = nodeInfo.y + h; + minY = y1 < minY ? y1 : minY; + maxY = y2 > maxY ? y2 : maxY; + }); + + _.each(graph.nodes(), (nodeName) => { + let nodeInfo = graph.node(nodeName); + nodeInfo.x -= minX; + nodeInfo.y -= minY; + }); + return { + width: maxX - minX, + height: maxY - minY, + }; +} +/** Layout a metanode. Only called for an expanded node. */ +function layoutMetanode(renderNodeInfo: render.RenderGroupNodeInfo, rankSep: number): void { + // First, copy params specific to meta nodes onto this render info object. + let params = PARAMS.subscene.meta; + _.extend(renderNodeInfo, params); + // Invoke dagre.layout() on the core graph and record the bounding box + // dimensions. 
+ _.extend(renderNodeInfo.coreBox, dagreLayout(renderNodeInfo.coreGraph, { ...PARAMS.graph.meta, rankSep })); + // Compute the total padding between the core graph, in-extract and + // out-extract boxes. + let numParts = 0; + if (renderNodeInfo.coreGraph.nodeCount() > 0) { + numParts++; + } + let offset = PARAMS.subscene.meta.extractXOffset; + let padding = numParts <= 1 ? 0 : numParts * offset; + renderNodeInfo.coreBox.width += padding + padding; + renderNodeInfo.coreBox.height = params.labelHeight + renderNodeInfo.coreBox.height; + // Determine the whole metanode's width (from left to right). + renderNodeInfo.width = + Math.max(renderNodeInfo.displayName.length * CHARACTER_WIDTH, renderNodeInfo.coreBox.width) + + params.paddingLeft + + params.paddingRight; + // Determine the whole metanode's height (from top to bottom). + renderNodeInfo.height = renderNodeInfo.paddingTop + renderNodeInfo.coreBox.height + renderNodeInfo.paddingBottom; +} + +/** + * Determines the center position of the node's shape. The position depends + * on if the node has in and out-annotations. + */ +export function computeCXPositionOfNodeShape(renderInfo: render.RenderNodeInfo): number { + if (renderInfo.expanded) { + return renderInfo.x; + } + return renderInfo.x - (renderInfo.width / 2) + (renderInfo.coreBox.width / 2); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/loader.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/loader.ts new file mode 100644 index 0000000000000000000000000000000000000000..2e7462c78ce0079d3e2779398f4d15ccd29b2646 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/loader.ts @@ -0,0 +1,82 @@ +/* Copyright 2019 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License.
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import { NPU_PREFIX, BENCH_PREFIX, type ProgressTracker } from './common'; +import * as tf_graph from './graph'; +import * as hierarchy from './hierarchy'; +import * as parser from './parser'; +import { GraphDef } from './proto'; +import * as tf_graph_util from './util'; + +export interface GraphAndHierarchy { + graph: tf_graph.MergedSlimGraph; + graphHierarchy: hierarchy.MergedHierarchy; +} +export function fetchAndConstructHierarchicalGraph( + tracker: ProgressTracker, + remotePath: string | null, + pbTxtFile: Blob | null, + hierarchyParams: hierarchy.HierarchyParams = hierarchy.defaultHierarchyParams, +): Promise<GraphAndHierarchy> { + const dataTracker = tf_graph_util.getSubtaskTracker(tracker, 30, 'Data'); + const graphTracker = tf_graph_util.getSubtaskTracker(tracker, 20, 'Graph'); + const hierarchyTracker = tf_graph_util.getSubtaskTracker(tracker, 50, 'Namespace hierarchy'); + return parser + .fetchAndParseGraphData(remotePath as string, pbTxtFile, dataTracker) + .then( + (graph): any => { + if (graph.node.length !== 2) { + return tf_graph.build(graph, tf_graph.defaultBuildParams, graphTracker); + } + const npuGraph: GraphDef = { node: [] }; + const benchGraph: GraphDef = { node: [] }; + if (graph.node[0].name.startsWith(NPU_PREFIX) && graph.node[1].name.startsWith(BENCH_PREFIX)) { + npuGraph.node.push(graph.node[0]); + benchGraph.node.push(graph.node[1]); + } + if (graph.node[0].name.startsWith(BENCH_PREFIX) && graph.node[1].name.startsWith(NPU_PREFIX)) { + npuGraph.node.push(graph.node[1]); +
benchGraph.node.push(graph.node[0]); + } + return Promise.all([ + tf_graph.build(npuGraph, tf_graph.defaultBuildParams, graphTracker), + tf_graph.build(benchGraph, tf_graph.defaultBuildParams, graphTracker), + ]); + }, + () => { + throw new Error( + 'Malformed GraphDef. This can sometimes be caused by ' + + 'a bad network connection or invalid inputting files ', + ); + }, + ) + .then(async (graph) => { + if (Array.isArray(graph)) { + const mergedGraph: tf_graph.MergedSlimGraph = { npu: graph[0], bench: graph[1] }; + const npuHierarchy = await hierarchy.build(graph[0], hierarchyParams, hierarchyTracker); + const benchHierarchy = await hierarchy.build(graph[1], hierarchyParams, hierarchyTracker); + return { graph: mergedGraph, graphHierarchy: { npu: npuHierarchy, bench: benchHierarchy } }; + } + const graphHierarchy = await hierarchy.build(graph, hierarchyParams, hierarchyTracker); + return { graph: { npu: graph }, graphHierarchy: { npu: graphHierarchy } }; + }) + .catch((e) => { + // Generic error catch, for errors that happened outside + // asynchronous tasks. + const msg = `Graph visualization failed.\n\n${e}`; + tracker.reportError(msg, e); + // Don't swallow the error. + throw e; + }); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/minimap.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/minimap.ts new file mode 100644 index 0000000000000000000000000000000000000000..7a200ec651c1408464063a942199f479fff477e5 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/minimap.ts @@ -0,0 +1,292 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import * as d3 from 'd3'; + +const FRAC_VIEWPOINT_AREA: number = 0.8; +export class Minimap { + /** The minimap container. */ + private minimap: HTMLElement; + /** The canvas used for drawing the mini version of the svg. */ + private canvas: HTMLCanvasElement; + /** A buffer canvas used for temporary drawing to avoid flickering. */ + private canvasBuffer: HTMLCanvasElement; + /** The minimap svg used for holding the viewpoint rectangle. */ + private minimapSvg: SVGSVGElement; + /** The rectangle showing the current viewpoint. */ + private viewpoint: SVGRectElement; + /** + * The scale factor for the minimap. The factor is determined automatically + * so that the minimap doesn't violate the maximum width/height specified + * in the constructor. The minimap maintains the same aspect ratio as the + * original svg. + */ + private scaleMinimap: number; + /** The main svg element. */ + private svg: SVGSVGElement; + /** The svg group used for panning and zooming the main svg. */ + private zoomG: SVGGElement; + /** The zoom behavior of the main svg. */ + private mainZoom: d3.ZoomBehavior; + /** The maximum width and height for the minimap. */ + private maxWandH: number; + /** The last translation vector used in the main svg. */ + private translate: [number, number]; + /** The last scaling factor used in the main svg. */ + private scaleMain: number; + /** The coordinates of the viewpoint rectangle. 
*/ + private viewpointCoord: { + x: number; + y: number; + }; + /** The current size of the minimap */ + private minimapSize: { + width: number; + height: number; + }; + /** Padding (px) due to the main labels of the graph. */ + private labelPadding: number; + /** + * Constructs a new minimap. + * + * @param svg The main svg element. + * @param zoomG The svg group used for panning and zooming the main svg. + * @param mainZoom The main zoom behavior. + * @param minimap The minimap container. + * @param maxWandH The maximum width/height for the minimap. + * @param labelPadding Padding in pixels due to the main graph labels. + */ + constructor( + svg: SVGSVGElement, + zoomG: SVGGElement, + mainZoom: d3.ZoomBehavior, + minimap: HTMLElement, + maxWandH: number, + labelPadding: number, + ) { + this.svg = svg; + this.labelPadding = labelPadding; + this.zoomG = zoomG; + this.mainZoom = mainZoom; + this.maxWandH = maxWandH; + let $shadowRoot = d3.select(minimap.shadowRoot as unknown as Element); + // The minimap will have 2 main components: the canvas showing the content + // and an svg showing a rectangle of the currently zoomed/panned viewpoint. + let $minimapSvg = $shadowRoot.select('svg'); + // Make the viewpoint rectangle draggable. + let $viewpoint = $minimapSvg.select('rect'); + let dragmove = (d): void => { + this.viewpointCoord.x = (d3.event).x; + this.viewpointCoord.y = (d3.event).y; + this.updateViewpoint(); + }; + this.viewpointCoord = { x: 0, y: 0 }; + let drag = d3.drag().subject(Object).on('drag', dragmove); + $viewpoint.datum(this.viewpointCoord as any).call(drag); + // Make the minimap clickable. + $minimapSvg.on('click', (): void => { + if ((d3.event).defaultPrevented) { + // This click was part of a drag event, so suppress it. + return; + } + // Update the coordinates of the viewpoint. 
+ let width = Number($viewpoint.attr('width')); + let height = Number($viewpoint.attr('height')); + let clickCoords = d3.mouse($minimapSvg.node() as any); + this.viewpointCoord.x = clickCoords[0] - (width / 2); + this.viewpointCoord.y = clickCoords[1] - (height / 2); + this.updateViewpoint(); + }); + this.viewpoint = $viewpoint.node(); + this.minimapSvg = $minimapSvg.node(); + this.minimap = minimap; + this.canvas = $shadowRoot.select('canvas.first').node(); + this.canvasBuffer = $shadowRoot.select('canvas.second').node(); + this.update(); + } + + /** + * Redraws the minimap. Should be called whenever the main svg + * was updated (e.g. when a node was expanded). + */ + update(): void { + let sceneSize: DOMRect | null = null; + try { + // Get the size of the entire scene. + sceneSize = this.zoomG.getBBox(); + if (sceneSize.width === 0) { + // There is no scene anymore. We have been detached from the dom. + return; + } + } catch (e) { + // Firefox produced NS_ERROR_FAILURE if we have been + // detached from the dom. + return; + } + let $svg = d3.select(this.svg); + // Read all the style rules in the document and embed them into the svg. + // The svg needs to be self contained, i.e. all the style rules need to be + // embedded so the canvas output matches the origin. + let stylesText = ''; + const anySvg = this.svg as any; + // MSEdge does not have `getRootNode`. In that case, manually get the root + // node. This is more brittle than the getRootNode as changing DOM structure + // will break this. + const rootNode = anySvg.getRootNode ? anySvg.getRootNode() : this.svg.parentNode; + const styleSheets = rootNode.styleSheets; + for (let k = 0; k < styleSheets.length; k++) { + try { + let cssRules = (styleSheets[k]).cssRules || (styleSheets[k]).rules; + if (cssRules == null) { + continue; + } + for (let i = 0; i < cssRules.length; i++) { + // Remove tf-* selectors from the styles. 
+ stylesText += `${cssRules[i].cssText.replace(/ ?tf-[\w-]+ ?/g, '')}\n`; + } + } catch (e: any) { + if (e.name !== 'SecurityError') { + throw e; + } + } + } + // Temporarily add the css rules to the main svg. + let svgStyle = $svg.append('style'); + svgStyle.text(stylesText); + // Temporarily remove the zoom/pan transform from the main svg since we + // want the minimap to show a zoomed-out and centered view. + let $zoomG = d3.select(this.zoomG); + let zoomTransform = $zoomG.attr('transform'); + $zoomG.attr('transform', null); + // Account for SVG content shift. SVGGraphicsElement.getBBox().width returns + // width in pixel value of very tight bounding box of non-empty content. + // Since we want to measure the sceneSize from the origin to the right most + // edge of the right most node, we need to account for distance from the + // origin to the left edge of the bounding box. + sceneSize.height += sceneSize.y; + sceneSize.width += sceneSize.x; + // Since we add padding, account for that here. + sceneSize.height += this.labelPadding * 2; + sceneSize.width += this.labelPadding * 2; + // Temporarily assign an explicit width/height to the main svg, since + // it doesn't have one (uses flex-box), but we need it for the canvas + // to work. + $svg.attr('width', sceneSize.width).attr('height', sceneSize.height); + // Since the content inside the svg changed (e.g. a node was expanded), + // the aspect ratio may also have changed. Thus, we need to update the scale + // factor of the minimap. The scale factor is determined such that both + // the width and height of the minimap are <= maximum specified w/h. + this.scaleMinimap = this.maxWandH / Math.max(sceneSize.width, sceneSize.height); + // The canvas width is halved, so multiply by 2 for the image to fill it. + this.minimapSize = { + width: sceneSize.width * this.scaleMinimap * 2, + height: sceneSize.height * this.scaleMinimap, + }; + // Update the size of the minimap's svg, the buffer canvas and the + // viewpoint rect.
+ d3.select(this.minimapSvg).attr(this.minimapSize as any); + d3.select(this.canvasBuffer).attr(this.minimapSize as any); + if (this.translate != null && this.zoom != null) { + // Update the viewpoint rectangle shape since the aspect ratio of the + // map has changed. + requestAnimationFrame(() => this.zoom()); + } + let svgXml = new XMLSerializer().serializeToString(this.svg); + // Now that the svg is serialized for rendering, remove the temporarily + // assigned styles, explicit width and height and bring back the pan/zoom + // transform. + svgStyle.remove(); + $svg.attr('width', null).attr('height', null); + $zoomG.attr('transform', zoomTransform); + let image = new Image(); + image.onload = (): void => { + // Draw the svg content onto the buffer canvas. + let context = this.canvasBuffer.getContext('2d'); + context?.clearRect(0, 0, this.canvasBuffer.width, this.canvasBuffer.height); + context?.drawImage(image, 0, 0, this.minimapSize.width, this.minimapSize.height); + requestAnimationFrame(() => { + // Hide the old canvas and show the new buffer canvas. + d3.select(this.canvasBuffer).style('display', null); + d3.select(this.canvas).style('display', 'none'); + // Swap the two canvases. + [this.canvas, this.canvasBuffer] = [this.canvasBuffer, this.canvas]; + }); + }; + image.onerror = (): void => { + let blob = new Blob([svgXml], { type: 'image/svg+xml;charset=utf-8' }); + image.src = (URL as any).createObjectURL(blob); + }; + image.src = `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svgXml)}`; + } + + /** + * Handles changes in zooming/panning. Should be called from the main svg + * to notify that a zoom/pan was performed and this minimap will update its + * viewpoint rectangle. + * + * @param transform The zoom transform carrying the translate vector and + * scaling factor, or none to use the last used values. + */ + zoom(transform?: d3.ZoomTransform): void { + if (this.scaleMinimap == null) { + // Scene is not ready yet.
+ return; + } + // Update the new translate and scale params, only if specified. + if (transform) { + this.translate = [transform.x, transform.y]; + this.scaleMain = transform.k; + } + // Update the location of the viewpoint rectangle. + let svgRect = this.svg.getBoundingClientRect(); + let $viewpoint = d3.select(this.viewpoint); + this.viewpointCoord.x = (-this.translate[0] * this.scaleMinimap) / this.scaleMain; + this.viewpointCoord.y = (-this.translate[1] * this.scaleMinimap) / this.scaleMain; + let viewpointWidth = (svgRect.width * this.scaleMinimap) / this.scaleMain; + let viewpointHeight = (svgRect.height * this.scaleMinimap) / this.scaleMain; + $viewpoint + .attr('x', this.viewpointCoord.x) + .attr('y', this.viewpointCoord.y) + .attr('width', viewpointWidth) + .attr('height', viewpointHeight); + // Show/hide the minimap depending on the viewpoint area as fraction of the + // whole minimap. + let mapWidth = this.minimapSize.width / 2; // Undo the multiplication by 2 applied in update() + let mapHeight = this.minimapSize.height; + let x = this.viewpointCoord.x; + let y = this.viewpointCoord.y; + let w = Math.min(Math.max(0, x + viewpointWidth), mapWidth) - Math.min(Math.max(0, x), mapWidth); + let h = Math.min(Math.max(0, y + viewpointHeight), mapHeight) - Math.min(Math.max(0, y), mapHeight); + let fracIntersect = (w * h) / (mapWidth * mapHeight); + if (fracIntersect < FRAC_VIEWPOINT_AREA) { + this.minimap.classList.remove('hidden'); + } else { + this.minimap.classList.add('hidden'); + } + } + + /** + * Updates the position and the size of the viewpoint rectangle. + * It also notifies the main svg about the new panned position. + */ + private updateViewpoint(): void { + // Update the coordinates of the viewpoint rectangle. + d3.select(this.viewpoint).attr('x', this.viewpointCoord.x).attr('y', this.viewpointCoord.y); + // Update the translation vector of the main svg to reflect the + // new viewpoint.
+ let mainX = (-this.viewpointCoord.x * this.scaleMain) / this.scaleMinimap; + let mainY = (-this.viewpointCoord.y * this.scaleMain) / this.scaleMinimap; + d3.select(this.svg).call(this.mainZoom.transform, d3.zoomIdentity.translate(mainX, mainY).scale(this.scaleMain)); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/node.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/node.ts new file mode 100644 index 0000000000000000000000000000000000000000..b290227cfc3f50a1a94deff35e34877bfa632a9a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/node.ts @@ -0,0 +1,725 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import { Notification } from '@vaadin/notification'; +import * as d3 from 'd3'; +import * as _ from 'lodash'; +import * as tf_graph_common from './common'; +import { Class, FontSizeInPx, selectChild, selectOrCreateChild } from './common'; +import * as contextmenu from './contextmenu'; +import * as tf_graph from './graph'; +import { MetanodeImpl, NodeType, OpNodeImpl } from './graph'; +import * as layout from './layout'; +import * as render from './render'; +import * as tf_graph_scene from './scene'; +import { TfGraphScene } from './tf-graph-scene'; +import * as tf_graph_util from './util'; + +/** + * Select or Create a 'g.nodes' group to a given sceneGroup + * and builds a number of 'g.node' groups inside the group. + * + * Structure Pattern: + * + * + * + * + * ... + * + * + * ... + * + * + * + * + * node name + * + * + * + * + * ... + * + * + * + * @param sceneGroup selection of the container + * @param nodeData array of render node information to map + * @param sceneElement polymer element + * @return selection of the created nodeGroups + */ +let colorStorage: { [key: string]: string } = {}; +export function buildGroup(sceneGroup, nodeData: render.RenderNodeInfo[], sceneElement): void { + let container = tf_graph_common.selectOrCreateChild(sceneGroup, 'g', Class.Node.CONTAINER); + // Select all children and join with data. 
+ // (Note that all children of g.nodes are g.node) + let nodeGroups = (container as any) + .selectAll(function () { + return this.childNodes; + }) + .data(nodeData, (d) => { + // make sure that we don't have to swap shape type + return `${d.node.name}:${d.node.type}`; + }); + // ENTER + nodeGroups + .enter() + .append('g') + .attr('data-name', (d) => { + return d.node.name; + }) + .each(function (d) { + let nodeGroup = d3.select(this); + // index node group for quick stylizing + sceneElement.addNodeGroup(d.node.name, nodeGroup); + }) + .merge(nodeGroups) + // ENTER + UPDATE + .attr('class', (d: render.RenderNodeInfo) => { + return `${Class.Node.GROUP} ${nodeClass(d)}`; + }) + .each(function (d: render.RenderNodeInfo) { + let nodeGroup = d3.select(this); + // Build .shape first (background of the node). + let shape = buildShape(nodeGroup, d, Class.Node.SHAPE); + let shape1 = buildShape(nodeGroup, d, Class.Node.OUTER); + if (d.node.isGroupNode) { + addButton(shape, d, sceneElement); + } + if (d.node.isGroupNode) { + addButton(shape1, d, sceneElement); + } + addInteraction(shape, d, sceneElement); + addInteraction(shape1, d, sceneElement); + // Build subscene on the top. + subsceneBuild(nodeGroup, d, sceneElement); + // Build label last. Should be on top of everything else. + let label = labelBuild(nodeGroup, d, sceneElement); + // Do not add interaction to metanode labels as they live inside the + // metanode shape which already has the same interactions. + addInteraction(label, d, sceneElement, d.node.type === NodeType.META); + stylize(nodeGroup, d, sceneElement); + position(nodeGroup, d); + }); + // EXIT + nodeGroups + .exit() + .each(function (d) { + // remove all indices on remove + sceneElement.removeNodeGroup(d.node.name); + }) + .remove(); +} +/** + * Update or remove the subscene of a render group node depending on whether it + * is a expanded. If the node is not a group node, this method has no effect. 
+ * + * @param nodeGroup selection of the container + * @param renderNodeInfo the render information for the node. + * @param sceneElement polymer element. + * @return Selection of the subscene group, or null if node group does not have + * a subscene. Op nodes, bridge nodes and unexpanded group nodes will + * not have a subscene. + */ +function subsceneBuild(nodeGroup, renderNodeInfo: render.RenderGroupNodeInfo, sceneElement: TfGraphScene): void { + if (renderNodeInfo.node.isGroupNode) { + if (renderNodeInfo.expanded) { + // Recursively build the subscene. + buildGroupForScene(nodeGroup, renderNodeInfo, sceneElement, Class.Subscene.GROUP); + return; + } + // Clean out existing subscene if the node is not expanded. + tf_graph_scene.selectChild(nodeGroup, 'g', Class.Subscene.GROUP).remove(); + } +} +/** + * Translate the subscene of the given node group + */ +function subscenePosition(nodeGroup, d: render.RenderNodeInfo): void { + // Translate the subscene to the middle of the parent node in vertical direction. + let x0 = d.x - (d.coreBox.width / 2); + let y0 = d.y - (d.height / 2) + d.paddingTop; + let subscene = tf_graph_scene.selectChild(nodeGroup, 'g', Class.Subscene.GROUP); + tf_graph_scene.translate(subscene, x0, y0); +} +/** + * Add an expand/collapse button to a group node + * + * @param selection The group node selection. + * @param d Info about the node being rendered. + * @param sceneElement polymer element. 
+ */ +function addButton(selection, d: render.RenderNodeInfo, sceneElement): void { + let group = tf_graph_common.selectOrCreateChild(selection, 'g', Class.Node.BUTTON_CONTAINER); + tf_graph_common.selectOrCreateChild(group, 'circle', Class.Node.BUTTON_CIRCLE); + tf_graph_common.selectOrCreateChild(group, 'path', Class.Node.EXPAND_BUTTON).attr('d', 'M0,-2.2 V2.2 M-2.2,0 H2.2'); + tf_graph_common.selectOrCreateChild(group, 'path', Class.Node.COLLAPSE_BUTTON).attr('d', 'M-2.2,0 H2.2'); + (group as any).on('click', (dGroup: any) => { + // Stop this event's propagation so that it isn't also considered a + // node-select. + (d3.event).stopPropagation(); + sceneElement.fire('node-toggle-expand', { name: dGroup.node.name }); + }); + tf_graph_scene.positionButton(group, d); +} +/** + * Fire node-* events when the selection is interacted. + * + * @param disableInteraction When true, have the provided selection + * ignore all pointer events. Used for text labels inside of metanodes, which + * don't need interaction as their surrounding shape has interaction, and if + * given interaction would cause conflicts with the expand/collapse button. 
+ */ +function addInteraction( + selection, + d: render.RenderNodeInfo, + sceneElement: TfGraphScene, + disableInteraction?: boolean, +): void { + if (disableInteraction) { + selection.attr('pointer-events', 'none'); + return; + } + let contextMenuFunction = contextmenu.getMenu(sceneElement, d); + let time = 0; + let timeOut; + let mouseMoved = false; + let startX; + let startY; + const movementThreshold = 5; + selection + .on('dblclick', (dDbClick) => { + clearTimeout(timeOut); + sceneElement.fire('node-toggle-expand', { name: dDbClick.node.name }); + }) + .on('mouseover', (dMouseover) => { + // don't send mouseover over expanded group, + // otherwise it causes too many glitches + if (sceneElement.isNodeExpanded(dMouseover)) { + return; + } + sceneElement.fire('node-highlight', { name: dMouseover.node.name }); + }) + .on('mouseout', (dMouseOut) => { + // don't send mouseout over expanded group, + // otherwise it causes too many glitches + if (sceneElement.isNodeExpanded(dMouseOut)) { + return; + } + sceneElement.fire('node-unhighlight', { name: dMouseOut.node.name }); + }) + .on('mousedown', () => { + startX = d3.event.clientX; + startY = d3.event.clientY; + mouseMoved = false; // Reset the flag + }) + + // On mouseup, check whether the pointer moved beyond the movement threshold + .on('mouseup', () => { + const deltaX = Math.abs(d3.event.clientX - startX); + const deltaY = Math.abs(d3.event.clientY - startY); + if (deltaX > movementThreshold || deltaY > movementThreshold) { + mouseMoved = true; + } + }) + + .on('click', (dClick) => { + clearTimeout(timeOut); // Cancel the pending single-click handler + if (mouseMoved) { + mouseMoved = false; // Reset the flag + return; + } + timeOut = setTimeout(() => { + sceneElement.fire('node-select', { name: dClick.node.name }); + sceneElement.fire('no-pan-to-node', {}); + }, time); + }) + + .on('contextmenu', (dCtx, i) => { + sceneElement.fire('node-select', { name: dCtx.node.name }); + contextMenuFunction.call(dCtx, i); + }); +} + +/** + * Append svg text for label and assign data.
+ * @param nodeGroup + * @param renderNodeInfo The render node information for the label. + * @param sceneElement polymer element. + */ +function labelBuild(nodeGroup, renderNodeInfo: render.RenderNodeInfo, sceneElement): d3.Selection { + let text = renderNodeInfo.displayName; + // Truncate long labels for unexpanded Metanodes. + let useFontScale = + [NodeType.META, NodeType.MULTI_COLLECTION, NodeType.API_LIST].includes(renderNodeInfo.node.type) && + !renderNodeInfo.expanded; + let label = tf_graph_common.selectOrCreateChild(nodeGroup, 'text', Class.Node.LABEL); + // Make sure the label is visually on top among its siblings. + let labelNode = label.node(); + labelNode.parentNode?.appendChild(labelNode); + label.attr('dy', '.35em').attr('text-anchor', 'middle'); + + // In tf-graph-scene styles, fontSizes are defined to vary from 6px to 9px. Since we + // do not want to invoke computedStyles or hardcode the fontSize that would be + // duplicated in styles, we are rounding it to 8px which does not cause any visible + // jank. + let fontSize = 8; + + switch (renderNodeInfo.node.type) { + case NodeType.META: + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: + fontSize = renderNodeInfo.expanded ? FontSizeInPx.Node.EXPANDED_LABEL : FontSizeInPx.Node.SERIES_LABEL; + break; + case NodeType.OP: + fontSize = FontSizeInPx.Node.OP_LABEL; + break; + default: + break; + } + + if (useFontScale) { + if (text.length > sceneElement.maxMetanodeLabelLength) { + text = `${text.substr(0, sceneElement.maxMetanodeLabelLength - 2)}…`; + } + let scale = getLabelFontScale(sceneElement); + label.attr('font-size', `${scale(text.length)}px`); + fontSize = scale(text.length); + } + let txtElement = label.text(text); + enforceLabelWidth(txtElement, renderNodeInfo.node.type, fontSize, renderNodeInfo); + return label; +} + +/** + * This function shortens text which would exceed the maximum pixel width of + * a label.
+ * + * @param txtElementSelection The text element containing the label's text as d3 + * selection. + * @param nodeType The type of the node the label belongs to. If the node is + * an annotation, the value is -1. Label widths are defined in + * layout.PARAMS.nodeSize.{meta|op|...}.maxLabelWidth for nodes and + * layout.PARAMS.annotations.maxLabelWidth for annotations. + * @param renderNodeInfo The render information about the node, required to + * determine whether META nodes are collapsed or expanded. + */ +export function enforceLabelWidth( + txtElementSelection: d3.Selection, + nodeType: NodeType | number, + fontSize: number, + renderNodeInfo?: render.RenderNodeInfo, +): void { + // Get the text element itself and its current content. + let txtNode = txtElementSelection.node(); + let labelContent = txtNode.textContent; + + // Get maximum length from settings. + let maxLength: number | null = null; + switch (nodeType) { + case NodeType.META: + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: + if (renderNodeInfo && !renderNodeInfo.expanded) { + // Only trim text if the + // node is not expanded. + maxLength = layout.PARAMS.nodeSize.meta.maxLabelWidth; + } + break; + case NodeType.OP: + maxLength = layout.PARAMS.nodeSize.op.maxLabelWidth; + break; + case -1: + maxLength = layout.PARAMS.annotations.maxLabelWidth; + break; + default: + break; + } + if (maxLength === null) { + return; + } + + txtNode.textContent = tf_graph_util.maybeTruncateString(txtNode.textContent ?? '', fontSize, maxLength); + // Add tooltip with full name and return. + txtElementSelection.append('title').text(labelContent); +} +/** + * d3 scale used for sizing font of labels, used by labelBuild, + * initialized once by getLabelFontScale.
+ */ +let fontScale: any = null; +function getLabelFontScale(sceneElement): any { + if (!fontScale) { + fontScale = d3 + .scaleLinear() + .domain([sceneElement.maxMetanodeLabelLengthLargeFont, sceneElement.maxMetanodeLabelLength]) + .range([sceneElement.maxMetanodeLabelLengthFontSize, sceneElement.minMetanodeLabelLengthFontSize]) + .clamp(true); + } + return fontScale; +} +/** + * Set label position of a given node group + */ +function labelPosition(nodeGroup, cx: number, cy: number, yOffset: number): void { + tf_graph_scene + .selectChild(nodeGroup, 'text', Class.Node.LABEL) + .transition() + .attr('x', cx) + .attr('y', cy + yOffset); +} +/** + * Select or append/insert shape for a node and assign renderNode + * as the shape's data. + * + * @param nodeGroup + * @param d Render node information. + * @param nodeClass class for the element. + * @return Selection of the shape. + */ +export function buildShape(nodeGroup, d, nodeClassName: string): d3.Selection { + // Create a group to house the underlying visual elements. 
+ let shapeGroup = tf_graph_common.selectOrCreateChild(nodeGroup, 'g', nodeClassName); + switch (d.node.type) { + case NodeType.OP: { + tf_graph_common.selectOrCreateChild(shapeGroup, 'ellipse', Class.Node.COLOR_TARGET); + break; + } + case NodeType.META: + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: + tf_graph_common + .selectOrCreateChild(shapeGroup, 'rect', Class.Node.COLOR_TARGET) + .attr('rx', d.radius) + .attr('ry', d.radius); + break; + default: + throw Error(`Unrecognized node type: ${d.node.type}`); + } + return shapeGroup; +} +export function nodeClass(d: render.RenderNodeInfo): string { + switch (d.node.type) { + case NodeType.OP: + return Class.OPNODE; + case NodeType.META: + return Class.METANODE; + case NodeType.MULTI_COLLECTION: + return Class.MULTI_COLLECTION; + case NodeType.API_LIST: + return Class.API_LIST; + default: + return ''; + } +} +/** Modify node and its subscene and its label's positional attributes */ +function position(nodeGroup, d: render.RenderNodeInfo): void { + let shapeGroup = tf_graph_scene.selectChild(nodeGroup, 'g', Class.Node.SHAPE); + let shapeGroupHeader = tf_graph_scene.selectChild(nodeGroup, 'g', Class.Node.OUTER); + let cx = layout.computeCXPositionOfNodeShape(d); + switch (d.node.type) { + case NodeType.OP: { + // position shape + let shape = tf_graph_scene.selectChild(shapeGroup, 'ellipse'); + tf_graph_scene.positionEllipse(shape, cx, d.y, d.coreBox.width, d.coreBox.height); + labelPosition(nodeGroup, cx, d.y, d.labelOffset); + break; + } + case NodeType.META: { + // position shape + let shapes = shapeGroup.selectAll('rect'); + let INSIDE_RECT_OFFSET = 0; + // Vertical positioning offset: 15 (height of the color block) / 2 (y of its center) + 0.6 (downward shift; + // without the shift the block would cover the border and its color), which gives 8.1. + let OFFSET_VALUE = 8.1; + if (d.expanded) { + tf_graph_scene.positionRect(shapes, d.x, d.y, d.width, d.height); + INSIDE_RECT_OFFSET = d.y - (d.height / 2) + OFFSET_VALUE; + subscenePosition(nodeGroup, d); + // Put the label on top.
+ labelPosition(nodeGroup, cx, d.y, (-d.height / 2) + (d.labelHeight / 2)); + } else { + tf_graph_scene.positionRect(shapes, cx, d.y, d.coreBox.width, d.coreBox.height); + // Place the label in the middle. + labelPosition(nodeGroup, cx, d.y, 0); + } + let shapesHeader = shapeGroupHeader.selectAll('rect'); + if (d.expanded) { + tf_graph_scene.positionRect(shapesHeader, d.x, INSIDE_RECT_OFFSET, d.width - 1, 15); + subscenePosition(nodeGroup, d); + // Put the label on top. + labelPosition(nodeGroup, cx, d.y, (-d.height / 2) + (d.labelHeight / 2)); + } else { + tf_graph_scene.positionRect(shapesHeader, cx, d.y, d.coreBox.width, d.coreBox.height); + // Place the label in the middle. + labelPosition(nodeGroup, cx, d.y, 0); + } + break; + } + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: { + // position shape + let shapes = shapeGroup.selectAll('rect'); + if (d.expanded) { + tf_graph_scene.positionRect(shapes, d.x, d.y, d.width, d.height); + subscenePosition(nodeGroup, d); + // Put the label on top. + labelPosition(nodeGroup, cx, d.y, (-d.height / 2) + (d.labelHeight / 2)); + } else { + tf_graph_scene.positionRect(shapes, cx, d.y, d.coreBox.width, d.coreBox.height); + // Place the label in the middle. 
+ labelPosition(nodeGroup, cx, d.y, 0); + } + break; + } + default: { + throw Error(`Unrecognized node type: ${d.node.type}`); + } + } +} + +export function removeGradientDefinitions(svgRoot: SVGElement): void { + d3.select(svgRoot).select('defs#_graph-gradients').remove(); +} + +export function getColorByPrecisionIndex(precisionStr: string): string { + if (['medium', 'high', 'critical'].includes(precisionStr)) { + switch (precisionStr) { + case 'medium': + return '#B6C7FC'; + case 'high': + return '#7E96F0'; + case 'critical': + return '#4668B8'; + default: + break; + } + } + const precision = Number(precisionStr); + if (isNaN(precision)) { + return 'white'; + } + if (Object.entries(colorStorage).length !== 0) { + for (const [color, details] of Object.entries(colorStorage)) { + const detailsArray: any[] = [details]; + const [start, end] = detailsArray[0].value; + const isPrecisionInRange = precision >= start && precision < end; + const isPrecisionAtEnd = precision === end && end === 1; + if (start === end) { + // md5 mode: the range is a single value, so match it exactly. + if (precision === start) { + return color; + } + } else if (isPrecisionInRange || isPrecisionAtEnd) { + // Other range modes: the right bound of the last range is always 1, so precision === 1 is special-cased. + return color; + } + } + return 'white'; + } else { + const colorMap = [ + { precision: 0.2, color: '#ff704d' }, + { precision: 0.4, color: '#FFC62E' }, + { precision: 0.6, color: '#FFDC7F' }, + { precision: 0.8, color: '#FFEDBE' }, + { precision: 1, color: '#FFFCF3' }, + ]; + for (const range of colorMap) { + if (precision <= range.precision) { + return range.color; + } + } + return 'white'; + } +} +/** + * Returns the fill color for the node based on its overflow level and + * precision index attributes. + * Unmatched nodes in a comparison graph are gray, and nodes without + * either attribute are transparent.
+ */ +export function getFillForNode(renderInfo: render.RenderNodeInfo): string { + if (renderInfo.node instanceof OpNodeImpl || renderInfo.node instanceof MetanodeImpl) { + const precisionItem = renderInfo.node.attr.find((item) => item.key === tf_graph.PRECISION_INDEX); + const overflowLevelItem = renderInfo.node.attr.find((item) => item.key === tf_graph.OVERFLOW_LEVEL); + const matchedNodeLink = renderInfo.node.matchedNodeLink; // The benchmark side has no color at all. + // Use the name prefix to tell whether this is a single-graph node or a comparison-graph node. + if ( + renderInfo.node.name.startsWith(tf_graph_common.BENCH_PREFIX) || + renderInfo.node.name.startsWith(tf_graph_common.NPU_PREFIX) + ) { + if (_.isEmpty(matchedNodeLink)) { + return '#C7C7C7'; + } + } + if (overflowLevelItem) { + switch (overflowLevelItem.value) { + case 'medium': + return '#B6C7FC'; + case 'high': + return '#7E96F0'; + case 'critical': + return '#4668B8'; + default: + Notification.show('Unknown overflow level', { + position: 'middle', + duration: 1000, + theme: 'error', + }); + // Handle the unknown case. + return 'transparent'; + } + } else if (precisionItem) { + return getColorByPrecisionIndex(precisionItem.value); + } else { + return 'transparent'; + } + } else { + // Other nodes have no fill. + return 'transparent'; + } +} +/** + * Modify node style by toggling class and assign attributes (only for things + * that can't be done in css).
+ */ +export function stylize( + nodeGroup, + renderInfo: render.RenderNodeInfo, + sceneElement: TfGraphScene, + nodeClassName?, +): void { + const resolvedNodeClassName = nodeClassName || Class.Node.SHAPE || Class.Node.OUTER; + const isHighlighted = sceneElement.isNodeHighlighted(renderInfo.node.name); + const isSelected = sceneElement.isNodeSelected(renderInfo.node.name); + const isExpanded = renderInfo.expanded && resolvedNodeClassName !== Class.Annotation.NODE; + const isLinked = sceneElement.isNodeLinked(renderInfo.node.name); + nodeGroup.classed('highlighted', isHighlighted); + nodeGroup.classed('selected', isSelected); + nodeGroup.classed('expanded', isExpanded); + nodeGroup.classed('linked', isLinked); + // Main node always exists here and it will be reached before subscene, + // so d3 selection is fine here. + const node = nodeGroup.select(`.${resolvedNodeClassName} .${Class.Node.COLOR_TARGET}`); + const outerNode = nodeGroup.select(`.${Class.Node.OUTER} .${Class.Node.COLOR_TARGET}`); + const fillColor = getFillForNode(renderInfo); + if ( + renderInfo.node.type === tf_graph.NodeType.META || + renderInfo.node.type === tf_graph.NodeType.MULTI_COLLECTION || + renderInfo.node.type === tf_graph.NodeType.API_LIST + ) { + node.style('fill', 'white'); + outerNode.style('fill', fillColor); + } else { + node.style('fill', fillColor); + } + // Choose outline to be darker version of node color if the node is a single + // color and is not selected. + node.style('stroke', isSelected ? null : getStrokeForFill(fillColor === 'transparent' ? 'white' : fillColor)); +} +/** + * Given a node's fill color/gradient, determine the stroke for the node. + */ +export function getStrokeForFill(fill: string): string { + // If node is colored by a gradient, then use a dark gray outline.
+ if (fill.substring(0, 3) === 'url') { + return render.MetanodeColors.GRADIENT_OUTLINE; + } else if (fill.startsWith('rgba')) { + return 'rgb(167, 167, 167)'; + } else { + return d3.rgb(fill).darker().toString(); + } +} + +/** + * Scene. + */ + +/** + * Select or create a sceneGroup and build/update its nodes and edges. + * + * Structure Pattern: + * + * <g class='scene'> + * <g class='core'> + * <g class='edges'> + * ... stuff from tf.graph.scene.edges.build ... + * </g> + * <g class='nodes'> + * ... stuff from tf.graph.scene.nodes.build ... + * </g> + * </g> + * <g class='in-extract'> + * <g class='nodes'> + * ... stuff from tf.graph.scene.nodes.build ... + * </g> + * </g> + * <g class='out-extract'> + * <g class='nodes'> + * ... stuff from tf.graph.scene.nodes.build ... + * </g> + * </g> + * </g> + * + * @param container D3 selection of the parent. + * @param renderNode render node of a metanode or series node. + * @param sceneElement polymer element. + * @param sceneClass class attribute of the scene (default='scene'). + */ +export function buildGroupForScene( + container, + renderNode: render.RenderGroupNodeInfo, + sceneElement: TfGraphScene, + sceneClass?: string, +): void { + const newSceneClass = sceneClass ?? Class.Scene.GROUP; + let isNewSceneGroup = selectChild(container, 'g', newSceneClass).empty(); + let sceneGroup = selectOrCreateChild(container, 'g', newSceneClass); + // core + let coreGroup = selectOrCreateChild(sceneGroup, 'g', Class.Scene.CORE); + let coreNodes = _.reduce( + renderNode.coreGraph.nodes(), + (nodes, name) => { + let node = renderNode.coreGraph.node(name); + nodes.push(node); + return nodes; + }, + Array(), + ); + // Create the layer of nodes for this scene (ellipses, rects etc). + buildGroup(coreGroup, coreNodes, sceneElement); + tf_graph_scene.position(sceneGroup, renderNode); + // Fade in the scene group if it didn't already exist.
+ if (isNewSceneGroup) { + sceneGroup.attr('opacity', 0).transition().attr('opacity', 1); + } +} +export function getColors(colors): void { + colorStorage = colors; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/parser.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/parser.ts new file mode 100644 index 0000000000000000000000000000000000000000..ad3047b2572e9eb9f5cc8c5f7c558cb2b6d02ae1 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/parser.ts @@ -0,0 +1,303 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import * as tb_debug from '../tb_debug'; +import { safeJSONParse } from '../utils'; +import { ProgressTracker } from './common'; +import * as tf_graph_proto from './proto'; +import * as tf_graph_util from './util'; + +function parseValue(value: string): string | number | boolean { + if (value === 'true') { + return true; + } + if (value === 'false') { + return false; + } + let firstChar = value[0]; + if (firstChar === '"') { + return value.substring(1, value.length - 1); + } + let num = parseFloat(value); + return isNaN(num) ? value : num; +} +/** + * Fetches a text file and returns a promise of the result. 
+ */ +export function fetchPbTxt(filepath: string): Promise<ArrayBuffer> { + return new Promise<ArrayBuffer>((resolve, reject) => { + fetch(filepath) + .then((res) => { + // Fetch does not reject for 400+. + if (res.ok) { + res.arrayBuffer().then(resolve, reject); + } else { + res.text().then(reject, reject); + } + }) + // Network failures reject the fetch promise itself; forward them to the caller. + .catch(reject); + }); +} + +/** + * Fetches the graph file, parses it and returns a promise of the result. The + * result will be undefined if the graph is empty. + */ +export function fetchAndParseGraphData( + path: string, + pbTxtFile: Blob | null, + tracker: ProgressTracker, +): Promise { + return tf_graph_util + .runAsyncPromiseTask( + 'Reading graph pbtxt', + 40, + async () => { + if (pbTxtFile) { + const result = await new Promise<ArrayBuffer>((resolve, reject) => { + let fileReader = new FileReader(); + fileReader.onload = (): void => resolve(fileReader.result as ArrayBuffer); + fileReader.onerror = (): void => reject(fileReader.error); + fileReader.readAsArrayBuffer(pbTxtFile); + }); + return result; + } + + const result = await fetchPbTxt(path); + return result; + }, + tracker, + tb_debug.GraphDebugEventId.FETCH_PBTXT_BYTES, + ) + .then((arrayBuffer: ArrayBuffer) => { + return tf_graph_util.runAsyncPromiseTask( + 'Parsing graph.pbtxt', + 60, + () => { + return parseGraphPbTxt(arrayBuffer); + }, + tracker, + tb_debug.GraphDebugEventId.PARSE_PBTXT_INTO_OBJECT, + ); + }); +} +/** + * Parse a file object in a streaming fashion line by line (or custom delim). + * Can handle very large files. + * @param arrayBuffer The file object as an array buffer. + * @param callback The callback called on each line. + * @param chunkSize The size of each read chunk. (optional) + * @param delim The delimiter used to split a line. (optional) + * @returns Promise that resolves with true when it is finished.
+ */ +export function streamParse( + arrayBuffer: ArrayBuffer, + callback: (string) => void, + chunkSize: number = 1000000, + delim: string = '\n', +): Promise { + return new Promise((resolve, reject) => { + function readChunk(oldData: string, newData: string, offset: number): void { + const doneReading = offset >= arrayBuffer.byteLength; + const parts = newData.split(delim); + parts[0] = oldData + parts[0]; + // The last part may be part of a longer string that got cut off + // due to the chunking. + const remainder = doneReading ? '' : parts.pop(); + for (let part of parts) { + try { + callback(part); + } catch (e) { + reject(e); + return; + } + } + if (doneReading) { + resolve(true); + return; + } + const nextChunk = new Blob([arrayBuffer.slice(offset, offset + chunkSize)]); + const file = new FileReader(); + file.onload = function (e: any): void { + readChunk(remainder ?? '', e.target.result, offset + chunkSize); + }; + file.readAsText(nextChunk); + } + readChunk('', '', 0); + }); +} +/** + * Since proto-txt doesn't explicitly say whether an attribute is repeated + * (an array) or not, we keep a hard-coded list of attributes that are known + * to be repeated. This list is used in parsing time to convert repeated + * attributes into arrays even when the attribute only shows up once in the + * object. + * Repeated fields have to be in sync with graph.proto and all of its + * dependencies. 
+ */ +const GRAPH_REPEATED_FIELDS: { + [attrPath: string]: boolean; +} = { + 'library.function': true, + 'library.function.node_def': true, + 'library.function.node_def.input': true, + 'library.function.node_def.attr': true, + 'library.function.node_def.attr.value.list.b': true, + 'library.function.node_def.attr.value.list.f': true, + 'library.function.node_def.attr.value.list.func': true, + 'library.function.node_def.attr.value.list.i': true, + 'library.function.node_def.attr.value.list.s': true, + 'library.function.node_def.attr.value.list.shape': true, + 'library.function.node_def.attr.value.list.shape.dim': true, + 'library.function.node_def.attr.value.list.tensor': true, + 'library.function.node_def.attr.value.list.type': true, + 'library.function.node_def.attr.value.shape.dim': true, + 'library.function.node_def.attr.value.tensor.string_val': true, + 'library.function.node_def.attr.value.tensor.tensor_shape.dim': true, + 'library.function.signature.input_arg': true, + 'library.function.signature.output_arg': true, + 'library.versions': true, + node: true, + 'node.input': true, + 'node.attr.value.list.b': true, + 'node.attr.value.list.f': true, + 'node.attr.value.list.func': true, + 'node.attr.value.list.i': true, + 'node.attr.value.list.s': true, + 'node.attr.value.list.shape': true, + 'node.attr.value.list.shape.dim': true, + 'node.attr.value.list.tensor': true, + 'node.attr.value.list.type': true, + 'node.attr.value.shape.dim': true, + 'node.attr.value.tensor.string_val': true, + 'node.attr.value.tensor.tensor_shape.dim': true, +}; +/** + * Parses an ArrayBuffer of a proto txt file into a raw Graph object. + */ +export function parseGraphPbTxt(input: ArrayBuffer): Promise { + return parsePbtxtFile(input, GRAPH_REPEATED_FIELDS); +} +/** + * Parses an ArrayBuffer of a proto txt file into a JavaScript object. + * + * @param input The ArrayBuffer or file object implementing slice.
+ * @param repeatedFields Map (Set) of all the repeated fields, since you can't + * tell directly from the pbtxt if a field is repeated or not. + * @returns The parsed object. + */ +function parsePbtxtFile( + input: ArrayBuffer, + repeatedFields: { + [attrPath: string]: boolean; + }, +): Promise { + let output: { + [name: string]: any; + } = {}; + let stack: Array<{ [name: string]: any }> = []; + let path: string[] = []; + let current: { + [name: string]: any; + } = output; + function splitNameAndValueInAttribute(line: string): { name: string; value: any } { + let colonIndex = line.indexOf(':'); + let name = line.substring(0, colonIndex).trim(); + let value: any = parseValue(line.substring(colonIndex + 2).trim()); + if (name === 'input_data' || name === 'output_data') { + value = safeJSONParse((value as string).replace(/'{/g, '{').replace(/}'/g, '}').replace(/'/g, '"')) as object; + } else if (name === 'matched_node_link') { + value = safeJSONParse((value as string).replace(/'/g, '"')) as string[]; + } else if (name === 'subnodes') { + value = safeJSONParse((value as string).replace(/'/g, '"')) as string[]; + } else if (name === 'suggestions') { + value = safeJSONParse((value as string).replace(/'{/g, '{').replace(/}'/g, '}').replace(/'/g, '"')) as object; + } else { + } + if (name === 'attr') { + const valueObj = safeJSONParse((value as string).replace(/'/g, '"')) as object; + value = Object.keys(valueObj).map((key) => { + return { + key, + value: valueObj[key], + }; + }); + } + return { + name: name, + value: value, + }; + } + /** + * Adds a value, given the attribute name and the host object. If the + * attribute already exists, but is not an array, it will convert it to an + * array of values. + * + * @param obj The host object that holds the attribute. + * @param name The attribute name (key). + * @param value The attribute value. + * @param pathAtt A path that identifies the attribute. Used to check if + * an attribute is an array or not. 
+ */ + function addAttribute( + obj: { [name: string]: any }, + name: string, + value: { [name: string]: any } | string | number | boolean, + pathAtt: string[], + ): void { + // We treat 'node' specially since it is done so often. + let existingValue = obj[name]; + if (existingValue == null) { + obj[name] = pathAtt.join('.') in repeatedFields ? [value] : value; + } else if (Array.isArray(existingValue)) { + existingValue.push(value); + } else { + obj[name] = [existingValue, value]; + } + } + // Run through the file a line at a time. + return streamParse(input, (line: string) => { + let lineNew = line.trim(); + if (!lineNew) { + return; + } + switch (lineNew[lineNew.length - 1]) { + case '{': { + // create new object + let name = lineNew.substring(0, lineNew.length - 2).trim(); + let newValue: { + [name: string]: any; + } = {}; + stack.push(current); + path.push(name); + addAttribute(current, name, newValue, path); + current = newValue; + break; + } + case '}': { + current = stack.pop() ?? {}; + path.pop(); + break; + } + default: { + let x = splitNameAndValueInAttribute(lineNew); + addAttribute(current, x.name, x.value, path.concat(x.name)); + break; + } + } + }).then(() => { + return output; + }); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/proto.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/proto.ts new file mode 100644 index 0000000000000000000000000000000000000000..bc59775863b955f2b074fda1892a656732e0566c --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/proto.ts @@ -0,0 +1,72 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +/** + * @fileoverview Interfaces that parallel proto definitions in + * third_party/tensorflow/core/framework/... + * graph.proto + * step_stats.proto + * These should stay in sync. + * + * When adding a repeated field to this file, make sure to update the + * GRAPH_REPEATED_FIELDS and METADATA_REPEATED_FIELDS lists within parser.ts. + * Otherwise, the parser has no way of differentiating between a field with a + * certain value and a repeated field that has only 1 occurrence, resulting in + * subtle bugs. + */ + +export enum NodeOpType { + MODULE = 0, + DEFAULT = 1, + MULTI_COLLECTION = 8, + API_LIST = 9, +} +/** Name of the node */ +export interface NodeDef { + name: string; + /** List of nodes that are inputs for this node. */ + input: string[]; + /** The name of the operation associated with this node. */ + op: string; + /** The op type of the node. */ + node_type: NodeOpType; + /** The array of inputs data in JSON string format. */ + input_data: { + [key: string]: any; + }; + /** The array of outputs data in JSON string format. */ + output_data: { + [key: string]: any; + }; + stack_info: []; + matched_node_link: []; + suggestions: { + [key: string]: string; + }; + /** The array consists of the paths of the linked nodes in graph comparison. */ + subnodes?: string[]; + isLeaf: boolean; + /** List of attributes that describe/modify the operation. */ + attr: Array<{ + key: string; + value: Record; + }>; +} +/** + * TensorFlow graph definition as defined in the graph.proto file.
+ */ +export interface GraphDef { + // A list of nodes in the graph. + node: NodeDef[]; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/render.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/render.ts new file mode 100644 index 0000000000000000000000000000000000000000..bd1c36643f5375c7d2b376f04bded71a4ab99589 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/render.ts @@ -0,0 +1,494 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +/** + * Package for the Render Hierarchy for TensorFlow graph. + */ +import { graphlib } from 'dagre'; +import * as _ from 'lodash'; +import { NPU_PREFIX, BENCH_PREFIX } from './common'; +import * as tf_graph from './graph'; +import { + createGraph, + getHierarchicalPath, + GraphType, + GroupNode, + Node, + NodeType, +} from './graph'; +import { Hierarchy } from './hierarchy'; + +const NODE_LINE_FEED_NUMBER = 5; + +export interface Point { + x: number; + y: number; +} +/** + * Color parameters for op nodes. + */ +export const OpNodeColors = { + DEFAULT_FILL: '#ffffff', + DEFAULT_STROKE: '#b2b2b2', +}; +/** + * Color parameters for node encoding. 
+ * @type {Object} + */ +export const MetanodeColors = { + /** + * Default fill and stroke to use when no other information is available. + */ + DEFAULT_FILL: '#d9d9d9', + DEFAULT_STROKE: '#a6a6a6', + GRADIENT_OUTLINE: '#888', +}; + +/** + * The regular expression to use when parsing for the string that is + * used to label a function node in the graph. We strip away a prefix + * indicating that the node represents a function definition. We also + * remove an arbitrary hexadecimal suffix and the number following it + * if it is present. To be clear, we extract foo from + * __function_library__foo_deadb00f_42. + */ +const nodeDisplayNameRegex = new RegExp( + `^(?:${tf_graph.FUNCTION_LIBRARY_NODE_PREFIX})?(\\w+)_[a-z0-9]{8}(?:_\\d+)?$`, +); +// Contains the graph render information of both the NPU side and the benchmark side. +export interface MergedRenderGraphInfo { + npu: RenderGraphInfo; + bench?: RenderGraphInfo; +} +/** + * Stores the rendering information, such as x and y coordinates, + * for each node in the graph. + */ +export class RenderGraphInfo { + hierarchy: Hierarchy; + renderedOpNames: string[]; + root: RenderGroupNodeInfo; + private index: { + [nodeName: string]: RenderNodeInfo; + }; + // Since the rendering information for each node is constructed lazily, + // upon node's expansion by the user, we keep a map between the node's name + // and whether the rendering information was already constructed for that + // node. + private hasSubhierarchy: { + [nodeName: string]: boolean; + }; + + constructor(hierarchy: Hierarchy) { + this.hierarchy = hierarchy; + this.index = {}; + this.renderedOpNames = []; + // Maps node name to whether the rendering hierarchy was already + // constructed. + this.hasSubhierarchy = {}; + this.root = new RenderGroupNodeInfo(hierarchy.root, hierarchy.graphOptions); + this.index[hierarchy.root.name] = this.root; + this.renderedOpNames.push(hierarchy.root.name); + this.buildSubhierarchy(hierarchy.root.name); + this.root.expanded = true; + } + + /** + * Get index.
+ */ + getIndex(): { [nodeName: string]: RenderNodeInfo } { + return this.index; + } + + /** + * Get a previously created RenderNodeInfo by its node name. + */ + getRenderNodeByName(nodeName: string): RenderNodeInfo { + return this.index[nodeName]; + } + + /** + * Get the underlying node in the hierarchical graph by its name. + */ + getNodeByName(nodeName: string): Node { + return this.hierarchy.node(nodeName); + } + + /** + * Get a previously created RenderNodeInfo for the specified node name, + * or create one if it hasn't been created yet. + */ + getOrCreateRenderNodeByName(nodeName: string): RenderNodeInfo | null { + // Polymer may invoke this with null. + if (!nodeName) { + return null; + } + if (nodeName in this.index) { + return this.index[nodeName]; + } + let node = this.hierarchy.node(nodeName); + // Exit early if the node does not exist in the hierarchy. This can happen + // when a graph is reloaded while the infocard points to a node not visible + // at the top-level. + if (!node) { + return null; + } + let renderInfo = node.isGroupNode + ? new RenderGroupNodeInfo(node, this.hierarchy.graphOptions) + : new RenderNodeInfo(node); + this.index[nodeName] = renderInfo; + this.renderedOpNames.push(nodeName); + return this.index[nodeName]; + } + + /** + * Return the nearest ancestor node, including itself, that is visible + * in the visualization. This method is used so that we can select + * (highlight) a node that isn't drawn yet, by selecting (highlighting) + * its nearest ancestor that has been drawn. + */ + getNearestVisibleAncestor(name: string): string { + let path = getHierarchicalPath(name); + let i = 0; + let renderNode: RenderNodeInfo | null = null; + // Fallthrough. If everything was expanded return the node. + let nodeName = name; + for (; i < path.length; i++) { + nodeName = path[i]; + renderNode = this.getRenderNodeByName(nodeName); + // Op nodes have expanded set to false by default. 
+ if (renderNode && !renderNode.expanded) { + break; + } + } + if (!renderNode) { + return ''; + } + // Check case where highlighted node is an embedded node whose parent node + // is also its hierarchical parent. In this case, we want to return the + // embedded node name, as it is also displayed if its parent has been + // displayed. + if (i === path.length - 2) { + let nextName = path[i + 1]; + } + return nodeName; + } + + buildSubhierarchy(nodeName: string, subGraph: tf_graph.SlimGraph | undefined = undefined): void { + // Terminate if the rendering hierarchy was already constructed + // for this node. + if (nodeName in this.hasSubhierarchy) { + return; + } + // Record that we constructed the rendering hierarchy for this node, so we + // don't construct it another time. + this.hasSubhierarchy[nodeName] = true; + let renderNodeInfo = this.index[nodeName]; + // If it is not a meta node or a series node, don't do anything. + const excludedTypes = [NodeType.META, NodeType.MULTI_COLLECTION, NodeType.API_LIST]; + if (!excludedTypes.includes(renderNodeInfo.node.type)) { + return; + } + // At this point we know the rendering information is about a group node. + let renderGroupNodeInfo = renderNodeInfo; + let metagraph = renderGroupNodeInfo.node.metagraph; + let coreGraph = renderGroupNodeInfo.coreGraph; + // Create render nodes to represent each child from the metagraph. Although + // these will initially be added to the coreGraph, they may later be + // extracted. Also, due to extraction, the coreGraph may contain disjoint + // groups between which there is no visible path (other than annotations). 
+ _.each(metagraph.nodes(), (childName, index: number) => { + let childRenderInfo = this.getOrCreateRenderNodeByName(childName); + if (!childRenderInfo) { + return; + } + coreGraph.setNode(childName, childRenderInfo); + // Expandable nodes each take their own row; free-standing nodes between modules wrap after every NODE_LINE_FEED_NUMBER nodes. + if (index >= 1 && subGraph && Object.keys(subGraph.metaNodes).length > 0) { + coreGraph.setEdge(metagraph.nodes()[index - 1], childName, {}); + } else if (index >= NODE_LINE_FEED_NUMBER && subGraph && Object.keys(subGraph.metaNodes).length === 0) { + coreGraph.setEdge(metagraph.nodes()[index - NODE_LINE_FEED_NUMBER], childName, {}); + } + }); + // Look up the parent node's render information and short circuit if none. + let parentNode = renderGroupNodeInfo.node.parentNode; + if (!parentNode) { + return; + } + _.each([true, false], (inbound) => { + _.each(coreGraph.nodes(), (childName) => { + let isTerminal = inbound + ? !coreGraph.predecessors(childName)?.length + : !coreGraph.successors(childName)?.length; + if (!isTerminal) { + return; + } + }); + }); + } + + checkSubhierarchy(nodeName: string): boolean { + return nodeName in this.hasSubhierarchy; + } +} +/** + * A class for rendering annotation object which contains label + * about the node embedded as annotation, type of annotation and the location + * of both the annotation's node and edge. + * + * Annotation objects include embedded constants, embedded summary, and + * edge shortcuts. + */ +export class Annotation { + node: Node; + renderNodeInfo: RenderNodeInfo; + annotationType: AnnotationType; + /** + * Center position of annotation relative to the host + * node's center x. + */ + dx: number; + /** + * Center position of annotation relative to the host + * node's center y. + */ + dy: number; + width: number; + height: number; + /** + * A flag whether it is an in-annotation (if true) or + * out-annotation (if false).
+ */ + isIn: boolean; + /** Label horizontal offset from the end of the node shape */ + labelOffset: number; + /** + * Array of points for edges from the annotation to its host + * node. Each point contains the point location, relative to + * the host node's center. + */ + points: Array<{ + dx: number; + dy: number; + }>; + + /** + * Creates a new Annotation. + * + * @param node The underlying node this annotation points to. + * @param renderNodeInfo The render information for the underlying node + * this annotation points to. This can be null if the annotation + * denotes an embedding (constant, summary), in which case we + * use the node property. + * with the annotation. + * @param type The type of the annotation. + * @param isIn True if it is an in-annotation. False if it is an + * out-annotation. + */ + constructor( + node: Node, + renderNodeInfo: RenderNodeInfo, + type: AnnotationType, + isIn: boolean, + ) { + this.node = node; + this.renderNodeInfo = renderNodeInfo; + this.annotationType = type; + // Properties specified by layout + this.dx = 0; + this.dy = 0; + this.width = 0; + this.height = 0; + this.isIn = isIn; + this.points = []; + } +} +export enum AnnotationType { + SHORTCUT = 0, + CONSTANT = 1, + SUMMARY = 2, + ELLIPSIS = 3, +} +/** + * Manages a list of annotations. Two will be used for each + * RenderNodeInfo, one for in annotations and one for out annotations. + */ +export class AnnotationList { + /** + * List of visually drawable annotations, may include an ellipses annotation + * if the number added exceeds the number specified by maxAnnotations. + */ + list: Annotation[]; + /** + * Set of nodes which have been added as annotations to this list, so we can + * prevent duplicates. + */ + nodeNames: { + [nodeName: string]: boolean; + }; + + constructor() { + this.list = []; + this.nodeNames = {}; + } +} +/** + * Contains rendering information about a node in the hierarchical graph. 
+ */ +export class RenderNodeInfo { + /** Reference to the original underlying Node from the hierarchical graph. */ + node: Node; + /** Whether the node is expanded or not. */ + expanded: boolean; + // --- Params specified by layout --- // + /** Center x position */ + x: number; + /** Center y position */ + y: number; + /** + * Total width of the node's shape, including in- and out-annotations. This + * property is used by dagre to layout the graph. + */ + width: number; + /** + * Total height of the node's shape, including in- and out-annotations. This + * property is used by dagre to layout the graph. + */ + height: number; + /** + * Size of the main box of the node, excluding in- and out-annotations. This + * property is used to draw the rectangle/ellipse shape denoting the node. + */ + coreBox: { + width: number; + height: number; + }; + // --- Params for the size of the node box --- // + /** Label vertical offset from the center of node shape */ + labelOffset: number; + /** Rectangle radius (for making rounded rectangle) */ + radius: number; + // --- Params for expanded node --- // + /** Label height for expanded node. */ + labelHeight: number; + // Paddings between inner subscene and the border of the expanded node. + paddingTop: number; + paddingLeft: number; + paddingRight: number; + paddingBottom: number; + /** + * The name string used to label the node in the graph. + */ + displayName: string; + constructor(node: Node) { + this.node = node; + this.expanded = false; + // Params specified by layout + this.x = 0; + this.y = 0; + this.width = 0; + this.height = 0; + // Params for node box. + this.labelOffset = 0; + this.radius = 0; + // Params for expanded node + this.labelHeight = 0; + this.paddingTop = 0; + this.paddingLeft = 0; + this.paddingRight = 0; + this.paddingBottom = 0; + this.coreBox = { width: 0, height: 0 }; + // Only use the portion beyond the prefix as the display name. 
+ if (node.name.startsWith(BENCH_PREFIX) && node.parentNode.name === tf_graph.ROOT_NAME) { + this.displayName = '标杆'; + } else { + const nameList = node.name.split('.'); + if (nameList.length > 3) { + const secondLastItem = nameList[nameList.length - 2]; + nameList.splice(nameList.length - 2, 1); + nameList.splice(2, 0, secondLastItem); + this.displayName = nameList.slice(1, nameList.length - 1).join('.'); + } else if (node.name.startsWith(BENCH_PREFIX) || node.name.startsWith(NPU_PREFIX)) { + this.displayName = node.name.slice(4); + } else { + this.displayName = node.name; + } + } + + if ( + [NodeType.META, NodeType.MULTI_COLLECTION, NodeType.API_LIST].includes(node.type) + ) { + // Function names are suffixed with a length-8 hexadecimal string + // followed by an optional number. We remove that suffix because + // the user did not generate that suffix. That suffix merely + // serves to differentiate between functions with different + // signatures but the same name otherwise. + // Furthermore, we remove the prefix that merely ascertains this + // node as a function definition. There is no reason for the user + // to see that in the graph, as the node would already be within + // the functions scene group. + const match = this.displayName.match(nodeDisplayNameRegex); + if (match) { + // The display name had been successfully extracted. This is the most + // common scenario. + this.displayName = match[1]; + } else if (_.startsWith(this.displayName, tf_graph.FUNCTION_LIBRARY_NODE_PREFIX)) { + // The string does not match the usual pattern for how functions are + // named. Just use the entire second portion of the string as the name + // if we can successfully remove the prefix. 
+ this.displayName = this.displayName.substring(tf_graph.FUNCTION_LIBRARY_NODE_PREFIX.length); + } + } + } +} + +function setGraphDepth(graph: graphlib.Graph, depth: number): void { + _.each(graph.nodes(), (nodeName) => { + let child = graph.node(nodeName); + child.expanded = depth > 1; // set all child of depth 1 to collapsed + if (depth > 0) { + switch (child.node.type) { + case NodeType.META: + case NodeType.MULTI_COLLECTION: + case NodeType.API_LIST: + setGroupNodeDepth(child, depth - 1); + break; + default: + // Do nothing for leaf + } + } + }); +} +export class RenderGroupNodeInfo extends RenderNodeInfo { + override node: GroupNode; + /** + * The core graph is derived from the underlying node's metagraph, minus + * the extracted source-like and sink-like nodes. + */ + coreGraph: graphlib.Graph; + constructor(groupNode: GroupNode, graphOptions: tf_graph.LabeledGraphOptions) { + super(groupNode); + let metagraph = groupNode.metagraph; + let gl = metagraph.graph() as any; + this.coreGraph = createGraph(gl.name, GraphType.CORE, graphOptions); + } +} +function setGroupNodeDepth(renderInfo: RenderGroupNodeInfo, depth: number): void { + if (renderInfo.coreGraph) { + setGraphDepth(renderInfo.coreGraph, depth); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/scene.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/scene.ts new file mode 100644 index 0000000000000000000000000000000000000000..df1820288b698d58b3a10d82bc79e7786e9f1080 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/scene.ts @@ -0,0 +1,212 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. +Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import * as d3 from 'd3'; +import { Class as _Class, selectChild as _selectChild } from './common'; +import * as layout from './layout'; +import * as render from './render'; + +export const selectChild = _selectChild; +export const Class = _Class; + +/** + * The dimensions of the minimap including padding and margin. + */ +const MINIMAP_BOX_WIDTH = 320; +const MINIMAP_BOX_HEIGHT = 150; +/** + * Helper method for fitting the graph in the svg view. + * + * @param svg The main svg. + * @param zoomG The svg group used for panning and zooming. + * @param d3zoom The zoom behavior. + * @param callback Called when the fitting is done. + */ +export function fit(svg, zoomG, d3zoom, callback): void { + let svgRect = svg.getBoundingClientRect(); + let sceneSize: DOMRect | null = null; + try { + sceneSize = zoomG.getBBox(); + if (sceneSize?.width === 0) { + // There is no scene anymore. We have been detached from the dom. + return; + } + } catch (e) { + // Firefox produced NS_ERROR_FAILURE if we have been + // detached from the dom. + return; + } + let scale = 0.9 * Math.min(svgRect.width / (sceneSize?.width ?? 1), svgRect.height / (sceneSize?.height ?? 
1), 2); + let params = layout.PARAMS.graph; + const transform = d3.zoomIdentity.scale(scale).translate(params.padding.paddingLeft, params.padding.paddingTop); + d3.select(svg) + .transition() + .duration(500) + .call(d3zoom.transform, transform) + .on('end.fitted', () => { + // Remove the listener for the zoomend event, + // so we don't get called at the end of regular zoom events, + // just those that fit the graph to screen. + d3zoom.on('end.fitted', null); + callback(); + }); +} +/** + * Helper method for panning the graph to center on the provided node, + * if the node is currently off-screen. + * + * @param nodeName The node to center the graph on + * @param svg The root SVG element for the graph + * @param zoomG The svg group used for panning and zooming. + * @param d3zoom The zoom behavior. + * @return True if the graph had to be panned to display the + * provided node. + */ +export function panToNode(nodeName: string, svg, zoomG, d3zoom): boolean { + const node = d3.select(svg).select(`[data-name="${nodeName}"]`).node(); + if (!node) { + console.warn(`panToNode failed for node name "${nodeName}"`); + return false; + } + // Check if the selected node is off-screen in either + // X or Y dimension in either direction. + let nodeBox = node.getBBox(); + let nodeCtm = node.getScreenCTM(); + let pointTL = svg.createSVGPoint(); + let pointBR = svg.createSVGPoint(); + pointTL.x = nodeBox.x; + pointTL.y = nodeBox.y; + pointBR.x = nodeBox.x + nodeBox.width; + pointBR.y = nodeBox.y + nodeBox.height; + pointTL = pointTL.matrixTransform(nodeCtm); + pointBR = pointBR.matrixTransform(nodeCtm); + let isOutsideOfBounds = (start, end, lowerBound, upperBound): boolean => { + // Return if even a part of the interval is out of bounds. + return !(start > lowerBound && end < upperBound); + }; + let svgRect = svg.getBoundingClientRect(); + // Subtract to make sure that the node is not hidden behind the minimap. 
+ const horizontalBound = svgRect.left + svgRect.width - MINIMAP_BOX_WIDTH; + const verticalBound = svgRect.top + svgRect.height - MINIMAP_BOX_HEIGHT; + if ( + isOutsideOfBounds(pointTL.x, pointBR.x, svgRect.left, horizontalBound) || + isOutsideOfBounds(pointTL.y, pointBR.y, svgRect.top, verticalBound) + ) { + // Determine the amount to translate the graph in both X and Y dimensions in + // order to center the selected node. This takes into account the position + // of the node, the size of the svg scene, the amount the scene has been + // scaled by through zooming, and any previous transforms already performed + // by this logic. + let centerX = (pointTL.x + pointBR.x) / 2; + let centerY = (pointTL.y + pointBR.y) / 2; + let dx = svgRect.left + (svgRect.width / 2) - centerX; + let dy = svgRect.top + (svgRect.height / 2) - centerY; + + // We translate by this amount. We divide the X and Y translations by the + // scale to undo how translateBy scales the translations (in d3 v4). + const svgTransform = d3.zoomTransform(svg); + d3.select(svg) + .transition() + .duration(500) + .call(d3zoom.translateBy, dx / svgTransform.k, dy / svgTransform.k); + return true; + } + return false; +} +/** + * Given a scene's svg group, set g.in-extract, g.coreGraph, g.out-extract svg + * groups' position relative to the scene. + * + * @param sceneGroup + * @param renderNode render node of a metanode or series node. + */ +export function position(sceneGroup, renderNode: render.RenderGroupNodeInfo): void { + // Translate scenes down by the label height so that when showing graphs in + // expanded metanodes, the graphs are below the labels. Do not shift them + // down for series nodes as series nodes don't have labels inside of their + // bounding boxes. 
+ let yTranslate = layout.PARAMS.subscene.meta.labelHeight; + // core + translate(selectChild(sceneGroup, 'g', Class.Scene.CORE), 0, yTranslate); +} + +/** Adds a click listener to a group that fires a graph-select event */ +export function addGraphClickListener(graphGroup, sceneElement): void { + d3.select(graphGroup).on('click', () => { + sceneElement.fire('graph-select'); + }); +} +/** Helper for adding transform: translate(x0, y0) */ +export function translate(selection, x0: number, y0: number): void { + // If it is already placed on the screen, make it a transition. + let selectionTemp = selection; + if (selection.attr('transform') != null) { + selectionTemp = selection.transition('position'); + } + selectionTemp.attr('transform', `translate(${x0},${y0})`); +} +/** + * Helper for setting position of a svg rect + * @param rect A d3 selection of rect(s) to set position of. + * @param cx Center x. + * @param cy Center y. + * @param width Width to set. + * @param height Height to set. + */ +export function positionRect(rect, cx: number, cy: number, width: number, height: number): void { + rect + .transition() + .attr('x', cx - (width / 2)) + .attr('y', cy - (height / 2)) + .attr('width', width) + .attr('height', height); +} + +/** + * Helper for setting position of a svg expand/collapse button + * @param button container group + * @param renderNode the render node of the group node to position + * the button on. + */ +export function positionButton(button, renderNode: render.RenderNodeInfo): void { + let cx = layout.computeCXPositionOfNodeShape(renderNode); + // Position the button in the top-right corner of the group node, + // with space given to draw the button inside of the corner. + let width = renderNode.expanded ? renderNode.width : renderNode.coreBox.width; + let height = renderNode.expanded ? 
renderNode.height : renderNode.coreBox.height; + let x = cx + (width / 2) - 6; + let y = renderNode.y - (height / 2) + 6; + let translateStr = `translate(${x},${y})`; + button.selectAll('path').transition().attr('transform', translateStr); + button.select('circle').transition().attr({ cx: x, cy: y, r: layout.PARAMS.nodeSize.meta.expandButtonRadius }); +} +/** + * Helper for setting position of a svg ellipse + * @param ellipse ellipse to set position of. + * @param cx Center x. + * @param cy Center y. + * @param width Width to set. + * @param height Height to set. + */ +export function positionEllipse(ellipse, cx: number, cy: number, width: number, height: number): void { + ellipse + .transition() + .attr('cx', cx) + .attr('cy', cy) + .attr('rx', width / 2) + .attr('ry', height / 2); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-icon.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-icon.ts new file mode 100644 index 0000000000000000000000000000000000000000..ad3300be29fedc7bb576e5a57988898409fc629f --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-icon.ts @@ -0,0 +1,197 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +import { computed, customElement, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import { DarkModeMixin } from '../polymer/dark_mode_mixin'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import '../tf_dashboard_common/tensorboard-color'; +import { MetanodeColors, OpNodeColors } from './render'; + +export enum GraphIconType { + CONST = 'CONST', + META = 'META', + OP = 'OP', + SUMMARY = 'SUMMARY', + MULTI_COLLECTION = 'MULTI_COLLECTION', + API_LIST = 'API_LIST', +} +@customElement('tf-graph-icon') +class TfGraphIcon extends LegacyElementMixin(DarkModeMixin(PolymerElement)) { + static readonly template = html` + + + + + + + + + + + + `; + + @property({ type: String }) + type: string; + + @property({ type: Boolean }) + vertical: boolean = false; + + @property({ type: String }) + fillOverride: string | null = null; + + @property({ type: String }) + strokeOverride: string | null = null; + + @property({ type: Number }) + height: number = 20; + + @property({ type: Boolean }) + faded: boolean = false; + + getSvgDefinableElement(): HTMLElement { + return this.$.svgDefs as HTMLElement; + } + + @computed('type', 'fillOverride') + get _fill(): string { + let type = this.type; + let fillOverride = this.fillOverride; + if (fillOverride != null) { + return fillOverride; + } + switch (type) { + case GraphIconType.META: + return MetanodeColors.DEFAULT_FILL; + default: + return OpNodeColors.DEFAULT_FILL; + } + } + + @computed('type', 'strokeOverride') + get _stroke(): string { + let type = this.type; + let strokeOverride = this.strokeOverride; + if (strokeOverride != null) { + return strokeOverride; + } + switch (type) { + case GraphIconType.META: + return MetanodeColors.DEFAULT_STROKE; + default: + return OpNodeColors.DEFAULT_STROKE; + } + } + + /** + * Test whether the specified node's type, or the literal type string, + * 
match a particular other type. + */ + _isType(type: GraphIconType, targetType: GraphIconType): boolean { + return type === targetType; + } + + _fadedClass(faded: boolean, shape: string): string { + return `${faded ? `faded-${shape}` : ''}`; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-scene.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-scene.ts new file mode 100644 index 0000000000000000000000000000000000000000..190cc9ed98a99da85c3255410f20ee37483b1c56 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-graph-scene.ts @@ -0,0 +1,36 @@ +/* Copyright 2019 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import * as d3 from 'd3'; +import { Annotation, RenderNodeInfo } from './render'; + +type Selection = d3.Selection; +// This technically extends Polymer.Component whose constructor is not +// accessible. 
+export abstract class TfGraphScene extends HTMLElement { + maxMetanodeLabelLength: number; + maxMetanodeLabelLengthLargeFont: number; + maxMetanodeLabelLengthFontSize: number; + abstract fire(eventName: string, data: any): void; + abstract addNodeGroup(name: string, selection: Selection): void; + abstract removeNodeGroup(name: string): void; + abstract removeAnnotationGroup(annotation: Annotation, renderNode: RenderNodeInfo): void; + abstract isNodeExpanded(node: RenderNodeInfo): boolean; + abstract isNodeHighlighted(nodeName: string): boolean; + abstract isNodeSelected(nodeName: string): boolean; + abstract isNodeLinked(nodeName: string): boolean; + abstract getAnnotationGroupsIndex(name: string): Selection; + abstract getGraphSvgRoot(): SVGElement; + abstract getContextMenu(): HTMLElement; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-node-icon.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-node-icon.ts new file mode 100644 index 0000000000000000000000000000000000000000..41d34974686a4bdba7f94a71b69bb9cb6d8642f1 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/tf-node-icon.ts @@ -0,0 +1,164 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +import { customElement, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import * as tf_graph from '../tf_graph_common/graph'; +import * as tf_graph_scene_node from '../tf_graph_common/node'; +import './tf-graph-icon'; +import * as tf_graph_icon from './tf-graph-icon'; + +@customElement('tf-node-icon') +class TfNodeIcon extends LegacyElementMixin(PolymerElement) { + static readonly template = html` + + + `; + + /** + * Node to represent with an icon. Optional, but if specified, its + * properties override those defined in the type, vertical, const and + * summary properties. + * This property is a tf.graph.Node. + */ + @property({ + type: Object, + }) + node: object | null = null; + + /** + * Render node information associated with this node. Optional. If + * specified, this is only used when computing the fill of the icon + * element. + * This property is a tf.graph.render.RenderNodeInfo. + */ + @property({ + type: Object, + }) + renderInfo: object | null = null; + + /** Type of node to draw (ignored if node is set). */ + @property({ + type: String, + }) + type: string | null = null; + + /** Direction for series (ignored for other types). */ + @property({ + type: Boolean, + }) + vertical: boolean = false; + + /** Whether the op is Const (ignored for non-ops). */ + @property({ + type: Boolean, + }) + const: boolean = false; + + /** Whether the op is a Summary (ignored for non-ops). */ + @property({ + type: Boolean, + }) + summary: boolean = false; + + /** + * Fill for the icon, optional. If fill is specified and node is not + * specified, then this value will override the default for the + * element. However, if node is specified, this value will be ignored. 
+ */ + @property({ + type: String, + }) + fill: string | null = null; + + /** Height of the SVG element in pixels, used for scaling. */ + @property({ + type: Number, + }) + height: number = 20; + + @property({ + type: String, + computed: '_computeFillOverride(node, renderInfo, fill)', + observer: '_onFillOverrideChanged', + }) + _fillOverride: string; + + /** + * Returns fill value based on node and configuration. If any of those + * configurations or node value missing, it returns `fill` property. + * Note that if this evaluates to null, it will be chosen based on + * the type of the node. + */ + _computeFillOverride(inputNode, inputRenderInfo, inputFill): string | null { + if (inputNode && inputRenderInfo) { + return tf_graph_scene_node.getFillForNode(inputRenderInfo); + } + return inputFill; + } + + _getStrokeOverride(fillOverride): string | null { + return fillOverride ? tf_graph_scene_node.getStrokeForFill(fillOverride) : null; + } + + /** + * Returns graph-icon type from input, type, and summary. 
+ */ + _getType(inputNode, isSummary, isConst, inputType): string { + const { GraphIconType } = tf_graph_icon; + if (inputNode) { + switch (inputNode.type) { + case tf_graph.NodeType.OP: { + const opName = inputNode.op; + if (typeof opName !== 'string') { + return GraphIconType.OP; + } + if (opName === 'Const' || isConst) { + return GraphIconType.CONST; + } + if (opName.endsWith('Summary') || isSummary) { + return GraphIconType.SUMMARY; + } + return GraphIconType.OP; + } + case tf_graph.NodeType.META: + return GraphIconType.META; + default: + } + } + return inputType; + } + + _onFillOverrideChanged(newFill, oldFill): void { + const { node, renderInfo } = this; + if (newFill !== oldFill) { + tf_graph_scene_node.removeGradientDefinitions((this.$.icon as any).getSvgDefinableElement()); + } + if (node && renderInfo) { + tf_graph_scene_node.getFillForNode(renderInfo as any); + } + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/util.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/util.ts new file mode 100644 index 0000000000000000000000000000000000000000..2f9831bf264458bf7ed14fba27c69a61ebbee588 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_common/util.ts @@ -0,0 +1,379 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the 'License'); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an 'AS IS' BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +/** + * @fileoverview Utility functions for the tensorflow graph visualizer. + */ +import * as _ from 'lodash'; +import { + GraphDebugActionEventId, + GraphDebugTimingEventId, + GRAPH_DEBUG_ACTION_EVENT_CATEGORY, + GRAPH_DEBUG_TIMING_EVENT_CATEGORY, +} from '../tb_debug/types'; +import { ProgressTracker } from './common'; + +const ASYNC_TASK_DELAY = 20; + +interface DebugTimingEvent { + timingId: GraphDebugTimingEventId; + // An associated duration in milliseconds for a timing event. + eventValue: number; +} + +interface DebugActionEvent { + actionId: GraphDebugActionEventId; + eventLabel?: string; +} + +export type DebugEvent = DebugTimingEvent | DebugActionEvent; + +/** + * Measure and log a synchronous task. + */ +export function time<T>(msg: string, task: () => T, debugEventId?: GraphDebugTimingEventId): T { + let start = Date.now(); + let result = task(); + const durationInMs = Date.now() - start; + /* tslint:disable */ + console.log(msg, ':', durationInMs, 'ms'); + return result; +} +/** + * Creates a tracker that sets the progress property of the + * provided polymer component. The provided component must have + * a property called 'progress' that is not read-only. The progress + * property is an object with a numerical 'value' property and a + * string 'msg' property. 
+ */ +export function getTracker(polymerComponent: any): ProgressTracker { + return { + setMessage: function (msg): void { + polymerComponent.set('progress', { + value: polymerComponent.progress.value, + msg: msg, + }); + }, + updateProgress: function (value): void { + polymerComponent.set('progress', { + value: polymerComponent.progress.value + value, + msg: polymerComponent.progress.msg, + }); + }, + reportError: function (msg: string, err): void { + // Log the stack trace in the console. 
+ console.error(err.stack); + // And send a user-friendly message to the UI. + polymerComponent.set('progress', { + value: polymerComponent.progress.value, + msg: msg, + error: true, + }); + }, + }; +} +/** + * Creates a tracker for a subtask given the parent tracker, the total + * progress + * of the subtask and the subtask message. The parent task should pass a + * subtracker to its subtasks. The subtask reports its own progress which + * becomes relative to the main task. + */ +export function getSubtaskTracker( + parentTracker: ProgressTracker, + impactOnTotalProgress: number, + subtaskMsg: string, +): ProgressTracker { + return { + setMessage: function (progressMsg): void { + // The parent should show a concatenation of its message along with + // its subtask tracker message. + parentTracker.setMessage(`${subtaskMsg}: ${progressMsg}`); + }, + updateProgress: function (incrementValue): void { + // Update the parent progress relative to the child progress. + // For example, if the sub-task progresses by 30%, and the impact on the + // total progress is 50%, then the task progresses by 30% * 50% = 15%. + parentTracker.updateProgress((incrementValue * impactOnTotalProgress) / 100); + }, + reportError: function (msg: string, err: Error): void { + // The parent should show a concatenation of its message along with + // its subtask error message. + parentTracker.reportError(`${subtaskMsg}: ${msg}`, err); + }, + }; +} +/** + * Runs a synchronous expensive task and returns the result. + * Please use runAsyncPromiseTask in case a task returns a Promise. + */ +export function runTask<T>( + msg: string, + incProgressValue: number, + task: () => T, + tracker: ProgressTracker, + debugEventId?: GraphDebugTimingEventId, +): any { + // Update the progress message to say the current running task. + tracker.setMessage(msg); + // Run the expensive task with a delay that gives enough time for the + // UI to update. 
+ try { + let result = time(msg, task, debugEventId); + // Update the progress value. + tracker.updateProgress(incProgressValue); + // Return the result to be used by other tasks. + return result; + } catch (e: any) { + // Errors that happen inside asynchronous tasks are + // reported to the tracker using a user-friendly message. + tracker.reportError(`Failed${msg}`, e); + return null; + } +} +/** + * Runs an expensive task asynchronously and returns a promise of the result. + */ +export function runAsyncTask<T>( + msg: string, + incProgressValue: number, + task: () => T, + tracker?: ProgressTracker, + debugEventId?: GraphDebugTimingEventId, +): Promise<T> { + return new Promise((resolve, reject) => { + // Update the progress message to say the current running task. + if (tracker) { + tracker.setMessage(msg); + } + // Run the expensive task with a delay that gives enough time for the + // UI to update. + setTimeout(() => { + try { + let result = time(msg, task, debugEventId); + // Update the progress value. + if (tracker) { + tracker.updateProgress(incProgressValue); + } + // Return the result to be used by other tasks. + resolve(result); + } catch (e: any) { + // Errors that happen inside asynchronous tasks are + // reported to the tracker using a user-friendly message. + if (tracker) { + tracker.reportError(`Failed${msg}`, e); + } + } + }, ASYNC_TASK_DELAY); + }); +} +/** + * Asynchronously runs an expensive task that returns a promise. Updates the + * tracker's progress after the promise resolves. Returns a new promise that + * resolves after the progress is updated. 
+ */ +export function runAsyncPromiseTask<T>( + msg: string, + incProgressValue: number, + task: () => Promise<T>, + tracker: ProgressTracker, + debugEventId?: GraphDebugTimingEventId, +): Promise<T> { + return new Promise((resolve, reject) => { + let handleError = function (e): void { + // Errors that happen inside asynchronous tasks are + // reported to the tracker using a user-friendly message. 
+ tracker.reportError(`Failed ${msg}`, e); + reject(e); + }; + // Update the progress message to say the current running task. + tracker.setMessage(msg); + // Run the expensive task with a delay that gives enough time for the + // UI to update. + setTimeout(() => { + try { + let start = Date.now(); + task() + .then((value) => { + const durationInMs = Date.now() - start; + // Update the progress value. + tracker.updateProgress(incProgressValue); + // Return the result to be used by other tasks. + resolve(value); + }) + .catch(handleError); + } catch (e) { + handleError(e); + } + }, ASYNC_TASK_DELAY); + }); +} +/** + * Returns a query selector with escaped special characters that are not + * allowed in a query selector. + */ +export function escapeQuerySelector(querySelector: string): string { + return querySelector.replace(/([:.\[\],/\\\(\)])/g, '\\$1'); +} +/** + * Given a list of strings, it returns a new list of strings with the longest + * common prefix removed. If the common prefix is one of the strings in the + * list, it returns the original strings. + */ +export function removeCommonPrefix(strings: string[]): string[] { + if (strings.length < 2) { + return strings; + } + let index = 0; + let largestIndex = 0; + // Find the shortest name across all strings. + let minLength = _.min(_.map(strings, (str) => str.length)) as number; + while (true) { + index++; + let prefixes = _.map(strings, (str) => str.substring(0, index)); + let allTheSame = prefixes.every((prefix, i) => { + return i === 0 ? true : prefix === prefixes[i - 1]; + }); + if (allTheSame) { + if (index >= minLength) { + // There is a string whose whole name is a prefix of another string. + // In this case, we return the original list of strings. + return strings; + } + largestIndex = index; + } else { + break; + } + } + return _.map(strings, (str) => str.substring(largestIndex)); +} +/** + * Given a timestamp in microseconds, return a human-friendly string denoting + * how long ago the timestamp was.
+ */ +export function computeHumanFriendlyTime(timeInMicroseconds: number): string { + let timeDifferenceInMs = Number(new Date()) - Number(new Date(timeInMicroseconds / 1000)); + if (timeDifferenceInMs < 30000) { + return 'just now'; + } else if (timeDifferenceInMs < 60000) { + return `${Math.floor(timeDifferenceInMs / 1000)} seconds ago`; + } else if (timeDifferenceInMs < 120000) { + return 'a minute ago'; + } else if (timeDifferenceInMs < 3600000) { + return `${Math.floor(timeDifferenceInMs / 60000)} minutes ago`; + } else if (Math.floor(timeDifferenceInMs / 3600000) === 1) { + return 'an hour ago'; + } else if (timeDifferenceInMs < 86400000) { + return `${Math.floor(timeDifferenceInMs / 3600000)} hours ago`; + } else if (timeDifferenceInMs < 172800000) { + return 'yesterday'; + } else { + return `${Math.floor(timeDifferenceInMs / 86400000)} days ago`; + } +} + +const canvas = document.createElement('canvas'); +const measurerContext = canvas.getContext('2d'); + +/** + * Returns width of `text` rendered with Roboto at provided fontSize. + */ +export function measureTextWidth(text: string, fontSize: number): number { + if (measurerContext) { + measurerContext.font = `${fontSize}px Roboto, sans-serif`; + } + return measurerContext?.measureText(text).width as number; +} + +/** + * Returns, if rendered `text` does not fit into maxWidth, truncated string with trailing + * ellipsis. + */ +export function maybeTruncateString(text: string, fontSize: number, maxWidth: number): string { + if (!text) { + return ''; + } + if (measureTextWidth(text, fontSize) <= maxWidth) { + return text; + } + + let start = 0; + let end = text.length; + while (start < end) { + const middle = start + Math.round((end - start) / 2); + const substring = `${text.slice(0, middle)}…`; + if (measureTextWidth(substring, fontSize) <= maxWidth) { + start = middle; + } else { + end = middle - 1; + } + } + return start === 0 ? 
text[0] : `${text.slice(0, start)}…`; +} + +/** + * Extend this class to receive event dispatching traits. + * Useful when various locations need to observe changes on + * a common instance that has a limited lifetime. + * + * This is not intended for use with framework-supported elements. + * For example, prefer using `@Output myEmitter` on Angular + * Components, or Polymer's `on-myprop-changed` for Polymer + * elements, instead. + * + * Example usage: + * + * ``` + * export enum ReactorEvent {EXPLODED} + * export class Reactor extends Dispatcher<ReactorEvent> { + * _update() { + * this.dispatchEvent(ReactorEvent.EXPLODED); + * } + * } + * + * // Elsewhere + * const r = new Reactor(); + * r.addListener(ReactorEvent.EXPLODED, this._cleanup); + * ``` + */ +export class Dispatcher<EventType = any> { + private eventTypeToListeners = new Map<EventType, Array<() => void>>(); + + addListener(eventType: EventType, listener: () => void): void { + this.getListeners(eventType)?.push(listener); + } + + removeListener(eventType: EventType, listener: () => void): void { + const newListeners = this.getListeners(eventType)?.filter((x) => { + return x !== listener; + }); + this.eventTypeToListeners.set(eventType, newListeners); + } + + dispatchEvent(eventType: EventType, payload?: any): void { + for (const listener of this.getListeners(eventType)) { + listener(payload); + } + } + + private getListeners(eventType): any { + if (!this.eventTypeToListeners.has(eventType)) { + this.eventTypeToListeners.set(eventType, []); + } + return this.eventTypeToListeners.get(eventType); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_color_select/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_color_select/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..1526d736f8c719d6cf5a579fb089300ec6531f68 --- /dev/null +++ 
b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_color_select/index.ts @@ -0,0 +1,959 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import '@vaadin/combo-box'; +import * as _ from 'lodash'; +import * as d3 from 'd3'; +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property, observe } from '@polymer/decorators'; +import * as tf_graph_parser from '../../../tf_graph_common/parser'; +import { BENCH_PREFIX, NPU_PREFIX, UNMATCHED_COLOR, defaultColorSetting, defaultColorSelects} from '../../../tf_graph_common/common'; +import * as tf_graph_node from '../../../tf_graph_common/node'; +import { getElementBySelectors } from './../../utils'; +import * as tf_graph_render from '../../../tf_graph_common/render'; +import { DarkModeMixin } from '../../../polymer/dark_mode_mixin'; +import { LegacyElementMixin } from '../../../polymer/legacy_element_mixin'; + +const UNMATCHED_NODE_NAME = '无匹配节点'; +@customElement('tf-color-select') +class Legend extends LegacyElementMixin(DarkModeMixin(PolymerElement)) { + // 定义模板 + static readonly template = html` + + + `; + + // 核心part + @property({ type: Array }) + datasets; + + @property({ type: Boolean }) + _colorSetting: boolean = true; // 颜色设置按钮 + + @property({ type: Boolean }) + _filterSetting: boolean = false; + + @property({ type: Boolean }) + _overFlowLevel: boolean = true; // 
溢出筛选图例 + + @property({ type: Array }) + selectColor: any = []; + + @property({ type: String, notify: true }) + selectedPrecisionNode: string = ''; + + @property({ type: String, notify: true }) + selectedOverflowNode: string = ''; + + @property({ type: Object }) + precisionmenu: any = []; + + // 颜色图例 + @property({ type: Object }) + colorset; + + @property({ type: Object }) + colorSetChanged; + + // 溢出图例默认数据 + @property({ type: Object }) + overFlowSet: any = [ + ['#B6C7FC', 'medium'], + ['#7E96F0', 'high'], + ['#4668B8', 'critical'], + ]; + + // 自定义颜色设置 + @property({ type: Array }) + standardColorList = ['#FFFCF3', '#FFEDBE', '#FFDC7F', '#FFC62E', '#FF9B3D', '#FF704D', '#FF4118']; + + @property({ type: Array }) + colorList = _.cloneDeep(this.standardColorList); + + @property({ type: Array }) + colorSelects = defaultColorSelects; + + @property({ type: Number, notify: true }) + dropdownIndex; + + @property({ type: Object }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo = {} as any; + + @property({ type: Object }) + colors: any; + + @property({ type: String, notify: true }) + selectedNode: string | null = null; + + // 目录,全量节点数据,支撑各种节点的搜索 + @property({ type: Object }) + menu: any; + + @property({ type: Object }) + unmatched: any = []; + + @property({ type: Object }) + npu_unmatched: any = []; + + @property({ type: Object }) + bench_unmatched: any = []; + + // 溢出筛选 + @property({ type: Array }) + overflowLevel: any = []; + + @property({ type: Object }) + overflowmenu: any = []; + + @property({ type: Boolean }) + overflowcheck; + + @property({ type: Boolean }) + enableConfig = true; + + @property({ type: Boolean }) + showSwitchIcon = true; + + @property({ type: Object }) + selection: any = {}; + + @observe('unmatched') + _observeUnmatchedNode(): void { + this.set('npu_unmatched', this.unmatched[0]); + this.set('bench_unmatched', this.unmatched[1]); + } + + @observe('colorset') + _observeColorSet(): void { + if (this.colorset.length !== 0) { + const colorsets = 
this.colorset; + for (const item of colorsets) { + if (item[1].value.length === 0) { + item[1].value.push(UNMATCHED_NODE_NAME); + } + } + this.colorSetChanged = colorsets; + } else { + return; + } + } + + @observe('renderHierarchy') + observeColorSetting(): void { + if (_.isEmpty(this.renderHierarchy)) { + return; + } + if (this.renderHierarchy.bench) { + this.set('enableConfig', true); + this.set('showSwitchIcon', !!this.overflowcheck); + } else { + if (this.overflowcheck) { + this._selectedTabChanged(); + this.set('enableConfig', true); + // 隐藏切换按钮 + this.set('showSwitchIcon', false); + // 切换至精度溢出,隐藏精度筛选 + this.set('_filterSetting', true); + } else { + this.set('enableConfig', false); + } + } + } + + // 写一个如果切换数据清除所有checkbox和所有this.selectColor + @observe('selection') + _clearAllToggleCheckboxAndInputField(): void { + this.set('selectedSide', '0'); + const allCheckboxes = this.shadowRoot?.querySelectorAll('paper-checkbox'); + if (allCheckboxes) { + allCheckboxes.forEach((checkbox) => { + checkbox.checked = false; // 清空每个 checkbox 的选中状态 + }); + } + this.selectColor = []; + this.precisionmenu = []; + this.overflowLevel = []; + // 清除精度筛选输入框 + this.set('selectedPrecisionNode', ''); + // 清除精度溢出输入框 + this.set('selectedOverflowNode', ''); + this.set('selectedNode', ''); + } + + toggleVisibility(): void { + this.set('_colorSetting', !this._colorSetting); + } + + _clickSetting(event): void { + event.stopPropagation(); + this.set('_colors', true); + this.toggleVisibility(); + } + + _defaultSetting(): void { + // 配置预设 + this.colorSelects = defaultColorSetting; + this._confirmAction(); + // 清空并且还原至临时配置结构 + this.colorSelects = defaultColorSelects; + } + + _cancelAction(): void { + this.toggleVisibility(); + } + + _confirmAction(): void { + const newColorsList = {}; + const len = this.colorSelects.length; + if (len === 0) { + this.showDynamicDialog('配置失败,请添加配置项。'); + return; + } + + // 遍历每一项,动态生成 newColorsList 对象 + for (let i = 0; i < len; i++) { + const color = 
this.colorSelects[i].key; + const leftValue = this.colorSelects[i].values[0]; + const rightValue = this.colorSelects[i].values[1]; + // 检查每个组中的所有输入框是否都有值 + if (isNaN(leftValue) || isNaN(rightValue) || color === 'NaN') { + this.showDynamicDialog('配置失败,存在未配置项。'); + return; + } + // 将每个 color 和其对应的 leftValue 和 rightValue 作为 value 数组,设置到 colors 对象中 + newColorsList[color] = { + value: [leftValue, rightValue], + description: + '此节点所有输入输出的统计量相对误差,值越大代表测量值与标杆值的偏差越大,相对误差计算方式:|(测量值-标杆值)/标杆值|', + }; + } + // 无匹配节点图例一定存在 + newColorsList[UNMATCHED_COLOR] = { + value: UNMATCHED_NODE_NAME, + description: '对比过程中节点未匹配上', + }; + // 更新颜色列表 + this.set('colors', newColorsList); + let newColorSetChanged: any[] = []; + Object.entries(newColorsList).forEach(([color, details]) => { + let detailsTyped = details as { value: string }; + const colorset: any[] = [color, detailsTyped]; + newColorSetChanged.push(colorset); + }); + this.set('colorSetChanged', newColorSetChanged); + const params = new URLSearchParams(); + params.set('colors', JSON.stringify(newColorsList)); + params.set('run', JSON.stringify(this.selection.run)); + const colorsPath = `updateColors?${String(params)}`; + tf_graph_parser.fetchPbTxt(colorsPath); + // 根据颜色列表重绘 + let nodeDataSet = Object.entries(this.renderHierarchy.npu.getIndex()); + tf_graph_node.getColors(this.colors); + for (let [_key, value] of Object.entries(nodeDataSet)) { + const renderInfo = value[1]; + const svgRoot = getElementBySelectors(['tf-graph-board', 'tf-graph', 'tf-graph-scene', 'svg']); + const sceneElement = getElementBySelectors(['tf-graph-board', 'tf-graph', 'tf-graph-scene']) as any; + const nodeGroup = d3.select(svgRoot).select(`.node[data-name="${renderInfo.node.name}"]`); + tf_graph_node.stylize(nodeGroup, renderInfo, sceneElement); + } + // 清除精度筛选输入框 + this.set('selectedPrecisionNode', ''); + this.toggleVisibility(); + } + + _toggleDropdown(event): void { + const selectBox = event.target.closest('.select-box'); // 获取最近的父元素 .select-box + const 
dropdown = selectBox.nextElementSibling; // 获取下一个兄弟元素,即 .dropdown + dropdown.hidden = !dropdown.hidden; + this.dropdownIndex = event.model.index; + function maybeCloseMenu(eventCloseMenu?: any): void { + if (eventCloseMenu?.composedPath().includes(selectBox)) { + return; + } + dropdown.hidden = true; + document.body.removeEventListener('click', maybeCloseMenu, { + capture: true, + }); + } + if (!dropdown.hidden) { + document.body.addEventListener('click', maybeCloseMenu, { + capture: true, + }); + } + } + + _onOptionHover(event): void { + event.target.style.border = 'solid 1px black'; + } + + _outOptionHover(event): void { + event.target.style.border = ''; + } + + _changeColor(event): void { + const dropdown = event.target.closest('.dropdown'); + const select = dropdown.previousElementSibling; + dropdown.hidden = true; + const selectedColor = event.target.value; + select.style.backgroundColor = selectedColor; + this.set(`colorSelects.${this.dropdownIndex}.key`, selectedColor); + this.notifyPath('colorSelects'); + this._setColorList(); + } + + // 不显示NaN 而显示空 + _formatValue(value): string { + return isNaN(value) ? 
'' : value; + } + + _validateInputs(event: any): void { + const index = event.model.index; + const { values } = this.colorSelects[index]; + + // 显式定义 leftInputSet 和 rightInputSet 的类型为 number[] + const [leftInputSet, rightInputSet] = this.colorSelects.reduce<[number[], number[]]>( + (acc, item) => { + acc[0].push(item.values[0]); + acc[1].push(item.values[1]); + return acc; + }, + [[], []], // 初始值为两个空数组 + ); + + let value = parseFloat(event.target.value); + // 输入值验证 NaN值防护 限制输入范围 + if (isNaN(value) || value < 0 || value > 1) { + this._clearInput(event, index); + return; + } + + const valueStr = value.toString(); + + // 检查是否存在小数点 + const parts = valueStr.split('.'); + + // 如果存在小数点且小数部分长度超过最大限制 + if (parts.length > 1 && parts[1].length > 5) { + // 使用 toFixed 保留最多5位小数 + value = parseFloat(value.toFixed(5)); + } + + const isLeftInput = event.target.id === 'input-left'; + const otherSide = isLeftInput ? values[1] : values[0]; + const [left, right] = isLeftInput ? [value, otherSide] : [otherSide, value]; + + // 检查输入值是否有效 + const isLeftInputGreater = isLeftInput && left > right; + const isRightInputGreater = !isLeftInput && right < left; + + if (isLeftInputGreater || isRightInputGreater) { + this._clearInput(event, index); + return; + } + + // 检查输入值是否与其他区间冲突 + const isConflict = this.colorSelects.some((item, i) => { + // 排除当前输入框 + if (i === index) { + return false; + } + + const [leftInput, rightInput] = item.values; + return ( + (isLeftInput && left !== leftInput && left >= leftInput && left < rightInput) || + (!isLeftInput && right !== rightInput && right > leftInput && right <= rightInput) || + (isLeftInput && leftInputSet.includes(left)) || + (!isLeftInput && rightInputSet.includes(right)) + ); + }); + + if (isConflict) { + this._clearInput(event, index); + return; + } + + // 0!@#¥ 也可以被float转换为0,阻止这种情况发生 + event.target.value = value; + // 更新值 + this.set(`colorSelects.${index}.values.${isLeftInput ? 
0 : 1}`, value); + } + + _clearInput(event: any, index: number): void { + event.target.value = ''; // 清空输入框 + this.set(`colorSelects.${index}.values.${event.target.id === 'input-left' ? 0 : 1}`, NaN); // 更新 colorSelects + } + + _addOption(): void { + if (this.colorSelects.length < 5) { + const obj = { + key: 'NaN', + values: [NaN, NaN], + }; + this.push('colorSelects', obj); + } + // 确保它在当前同步操作this.push()之后才执行. + this.async(() => { + this._setColorList(); + }, 0); + } + + _removeOption(event): void { + const index = event.model.index; + + // 删除项 + this.splice('colorSelects', index, 1); + + // 恢复其他输入框的值 + this.colorSelects.forEach((item, i) => { + if (i >= index) { + this.set(`colorSelects.${i}.values`, item.values); + } + }); + this._setColorList(); + } + + _setColorList(): void { + let colorSelectElements = this.shadowRoot?.querySelectorAll('[id^="color-select"]'); + let backgroundColors: string[] = []; + this.colorSelects.forEach((item) => { + // 获取计算后的背景色 + const backgroundColor = item.key; + backgroundColors.push(backgroundColor); + }); + let newColorList = this.standardColorList.filter((color) => !backgroundColors.includes(color)); + this.set('colorList', newColorList); + // 清除选中,否则再次选中不同列表的同一顺位的值的时候不会触发on-change + this.async(() => { + colorSelectElements?.forEach((element) => { + if (element instanceof HTMLSelectElement) { + element.selectedIndex = -1; + } + }); + }, 0); + } + + _toggleOverflowLevelOpen(): void { + this.set('_overFlowLevel', !this._overFlowLevel); + } + + showDynamicDialog(message): void { + // 检查是否已经有显示的对话框,避免重复添加 + let existingDialog = this.shadowRoot?.querySelector('#dynamicDialog'); + if (existingDialog) { + existingDialog.remove(); // 删除旧的对话框 + } + // 创建新的对话框 + const dialog = document.createElement('paper-dialog'); + dialog.id = 'dynamicDialog'; + // 添加标题 + const title = document.createElement('h2'); + title.textContent = '提示'; + dialog.appendChild(title); + // 添加提示内容 + const content = document.createElement('div'); + content.textContent 
= message; + dialog.appendChild(content); + // 添加按钮 + const buttonContainer = document.createElement('div'); + buttonContainer.classList.add('buttons'); + const closeButton = document.createElement('paper-button'); + closeButton.setAttribute('dialog-dismiss', ''); + closeButton.textContent = '关闭'; + buttonContainer.appendChild(closeButton); + dialog.appendChild(buttonContainer); + // 添加到 shadow DOM + this.shadowRoot?.appendChild(dialog); + // 打开对话框 + dialog.open(); + } + + async _toggleCheckbox(this, event): Promise { + const { batch, step, run, tag } = this.selection; + const item = event.model.item; + let checkbox; + let overflowCheckbox; + if (item[1].value) { + checkbox = this.shadowRoot?.getElementById(`checkbox-${event.model.index}`) as HTMLInputElement; + } else { + overflowCheckbox = this.shadowRoot?.getElementById(`overflowCheckbox-${event.model.index}`) as HTMLInputElement; + } + const params = new URLSearchParams(); + if (run) { + params.set('run', run); + } + if (tag) { + params.set('tag', tag); + } + params.set('batch', String(batch === -1 ? -1 : batch - 1)); + params.set('step', String(step === -1 ? 
-1 : step - 1)); + // 更新 selectColor 数组 + if (checkbox) { + if (checkbox.checked) { + this.selectColor.push(item[1].value); // 添加选中的颜色 + } else { + const index = this.selectColor.findIndex( + (color) => color[0] === item[1].value[0] && color[1] === item[1].value[1], + ); + if (index !== -1) { + this.selectColor.splice(index, 1); // 取消选中的颜色 + } + } + if (this.selectColor.length === 0) { + this.precisionmenu = []; + return; + } + params.set('precision_index', this.selectColor.join(',')); + const screenPath = `screen?${String(params)}`; + try { + const screenStr = tf_graph_parser.fetchPbTxt(screenPath); + const precisionmenu = JSON.parse(new TextDecoder().decode(await screenStr).replace(/'/g, '"')) as object; + this.set('precisionmenu', precisionmenu); + } catch (e) { + console.error('Get precision menu failed, please check the toggleCheckbox and the data in vis file'); + } + // 更新数据绑定 + this.notifyPath(`menu.${event.model.index}.checked`, checkbox.checked); + // 清除精度筛选输入框 + this.set('selectedPrecisionNode', ''); + } else { + if (overflowCheckbox.checked) { + this.overflowLevel.push(item[1]); // 添加选中的颜色 + } else { + const index = this.overflowLevel.findIndex((overflow) => overflow === item[1]); + if (index !== -1) { + this.overflowLevel.splice(index, 1); // 取消选中的颜色 + } + } + if (this.overflowLevel.length === 0) { + this.overflowmenu = []; + return; + } + params.set('overflow_level', this.overflowLevel.join(',')); + const screenPath = `screen?${String(params)}`; + + try { + const screenStr = tf_graph_parser.fetchPbTxt(screenPath); + this.overflowmenu = JSON.parse(new TextDecoder().decode(await screenStr).replace(/'/g, '"')) as object; + } catch (e) { + console.error('Get overflow menu failed, please check the toggleCheckbox and the data in vis file'); + } + // 更新数据绑定 + this.notifyPath(`menu.${event.model.index}.checked`, overflowCheckbox.checked); + // 清除精度溢出输入框 + this.set('selectedOverflowNode', ''); + } + } + + _selectedTabChanged(): void { + 
this.set('_filterSetting', !this._filterSetting); + } + + _observePrecsionNode = () => { + let prefix = NPU_PREFIX; + const node = prefix + this.selectedPrecisionNode; + this.set('selectedNode', node); + } + + _observeOverFlowNode = () => { + const isCompareGraph = this.renderHierarchy.bench?.renderedOpNames.some((name: string) => + name.startsWith(BENCH_PREFIX), + ); + let prefix = ''; + if (isCompareGraph) { + prefix = NPU_PREFIX; + } + const node = prefix + this.selectedOverflowNode; + this.set('selectedNode', node); + } + + _handlePrecisonSearch(event): void { + this._handleNodeSearch(event, 'precision'); + } + + _handleOverflowSearch(event): void { + this._handleNodeSearch(event, 'overflow'); + } + + _handleNodeSearch(event, type: 'precision' | 'overflow'): void { + const action = event.target.getAttribute('data-action'); + const menuFirstRow = this.menu[0]; + const selectedNode = this.selectedNode; + let nodeList; + let colorSet; + if (type === 'overflow') { + nodeList = this.overflowmenu; + colorSet = this.overflowLevel; + } else { + nodeList = this.precisionmenu; + if (type === 'precision') { + colorSet = this.selectColor; + } else { + colorSet = null; + } + } + const prefix = NPU_PREFIX; + const hasBNode = this.renderHierarchy.bench?.renderedOpNames.some((name: string) => name.startsWith(BENCH_PREFIX)); + const showDialog = (message: string): void => { + this.showDynamicDialog(message); + }; + + const setDefaultNode = (): void => { + const defaultNode = hasBNode ? `${prefix}${nodeList[0]}` : `${prefix}${nodeList[0]}`; + this.set('selectedNode', defaultNode); + }; + + if (colorSet.length === 0) { + showDialog('请选择颜色'); + return; + } + if (nodeList.length === 0) { + showDialog('选择的颜色没有节点存在'); + return; + } + + // 如果用户未选中节点,设置默认节点 + if (!selectedNode) { + setDefaultNode(); + return; + } + + // 获取 selectedNode 在 menuFirstRow 中的索引 + const slicedNode = hasBNode ? 
selectedNode.slice(4) : selectedNode; + const startIndex = menuFirstRow.indexOf(slicedNode); + if (startIndex === -1) { + setDefaultNode(); + return; + } + + // 查找下一个节点 + const findNextNode = (): string | null => { + if (nodeList.includes(selectedNode)) { + const currentIndex = nodeList.indexOf(selectedNode); + if (currentIndex + 1 >= nodeList.length) { + showDialog('没有下一个问题节点'); + return null; + } + return nodeList[currentIndex + 1]; + } + for (let i = startIndex + 1; i < menuFirstRow.length; i++) { + if (nodeList.includes(menuFirstRow[i])) { + return menuFirstRow[i]; + } + } + showDialog('没有下一个问题节点'); + return null; + }; + + // 查找上一个节点 + const findPreviousNode = (): string | null => { + if (nodeList.includes(selectedNode)) { + const currentIndex = nodeList.indexOf(selectedNode); + if (currentIndex === 0) { + showDialog('没有上一个问题节点'); + return null; + } + return nodeList[currentIndex - 1]; + } + for (let i = startIndex - 1; i >= 0; i--) { + if (nodeList.includes(menuFirstRow[i])) { + return menuFirstRow[i]; + } + } + showDialog('没有上一个问题节点'); + return null; + }; + + // 执行查找 + const nextNode = action === 'next' ? findNextNode() : findPreviousNode(); + + if (nextNode) { + const nextSelectedNode = hasBNode ? `${prefix}${nextNode}` : nextNode; + this.set('selectedNode', nextSelectedNode); + } + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..e8ffb8527f1192788eb538ef15fc187715a77ce5 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/index.ts @@ -0,0 +1,420 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. 
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import '@vaadin/button'; +import '@vaadin/details'; +import '@vaadin/combo-box'; + +import { isEmpty } from 'lodash'; +import { Notification } from '@vaadin/notification'; +import { PolymerElement, html } from '@polymer/polymer'; +import * as tf_graph_render from '../../../tf_graph_common/render'; +import { customElement, property, observe } from '@polymer/decorators'; +import { NPU_PREFIX, BENCH_PREFIX } from '../../../tf_graph_common/common'; +import useMatched from './useMatched'; +import type { UseMatchedType } from './useMatched'; +import '@vaadin/progress-bar'; +import '../tf_search_combox/index'; +@customElement('tf-manual-match') +class Legend extends PolymerElement { + // 定义模板 + static readonly template = html` + + +
+

Note: after matching, click the Save button to write the changes to the file; otherwise the operations have no effect.

+ Save + +
+
+

Unmatched nodes

+ + +
+ Click to match +
+
+
+

Matched nodes

+ + +
+ Unmatch +
+
+
+ `; + + @property({ type: Object }) + unmatched: any = []; + + @property({ type: Object }) + selection: any; + + @property({ type: Object }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo = {} as any; + + @property({ type: Boolean }) + isCompareGraph: boolean = true; + + @property({ type: String, notify: true }) + selectedNode: string = ''; + + @property({ type: Array }) + npuMatchedNodes: Array = []; + + @property({ type: Array }) + benchMatchedNodes: Array = []; + + @property({ type: Array }) + npuUnMatchedNodes: Array = []; + + @property({ type: Array }) + benchUnMatchedNodes: Array = []; + + @property({ type: String }) + selectedNpuMatchedNode: string = ''; + + @property({ type: String }) + selectedBenchMatchedNode: string = ''; + + @property({ type: String }) + selectedNpuUnMatchedNode: string = ''; + + @property({ type: String }) + selectedBenchUnMatchedNode: string = ''; + + @property({ type: Boolean }) + saveLoading: boolean = false; + + useMatched: UseMatchedType = useMatched(); + + npuMatchedNodeList = {}; + benchMatchedNodeList = {}; + + @observe('unmatched') + _observeUnmatchedNode(): void { + this.set('npuUnMatchedNodes', this.unmatched[0]); + this.set('benchUnMatchedNodes', this.unmatched[1]); + this.set('selectedNpuUnMatchedNode', ''); + this.set('selectedBenchUnMatchedNode', ''); + } + + @observe('renderHierarchy') + _observeRenderHierarchy(): void { + const isCompareGraphTemp = this.renderHierarchy.bench?.renderedOpNames.some((name: string) => + name.startsWith(BENCH_PREFIX), + ); + this.set('isCompareGraph', isCompareGraphTemp); + } + + @observe('selection') + async _observeSelection(): Promise { + if (isEmpty(this.selection)) { + return; + } + const result = await this.useMatched.queryMatchedStateList(this.selection); + if (result.success) { + // 初始化已匹配节点列表 + const { npu_match_nodes, bench_match_nodes } = result.data; + this.npuMatchedNodeList = npu_match_nodes; + this.benchMatchedNodeList = bench_match_nodes; + 
this.set('npuMatchedNodes', Object.keys(npu_match_nodes)); + this.set('benchMatchedNodes', Object.keys(bench_match_nodes)); + this.set('selectedNpuMatchedNode', ''); + this.set('selectedBenchMatchedNode', ''); + } else { + Notification.show(`错误:${result.error}`, { + position: 'middle', + duration: 3000, + theme: 'error', + }); + } + } + + @observe('selectedNode') + _observeSelectedNode(): void { + if (isEmpty(this.selectedNode)) { + return; + } + if (this.selectedNode.startsWith(NPU_PREFIX)) { + this.set('selectedNpuUnMatchedNode', this.selectedNode.replace(NPU_PREFIX, '')); + } else if (this.selectedNode.startsWith(BENCH_PREFIX)) { + this.set('selectedBenchUnMatchedNode', this.selectedNode.replace(BENCH_PREFIX, '')); + } + } + + // 一定要写箭头函数,不然父子组件传值this指向有问题 + _changeNpuUnMatchedNode = (): void => { + if (this.isCompareGraph) { + const node = NPU_PREFIX + this.selectedNpuUnMatchedNode; + this.set('selectedNode', node); + } else { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + } + }; + + _changeBenchUnMatchedNode = (): void => { + if (this.isCompareGraph) { + const node = BENCH_PREFIX + this.selectedBenchUnMatchedNode; + this.set('selectedNode', node); + } else { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + } + }; + + _changeNpuMatchedNode = (): void => { + if (this.isCompareGraph) { + const node = NPU_PREFIX + this.selectedNpuMatchedNode; + this.set('selectedBenchMatchedNode', this.npuMatchedNodeList[this.selectedNpuMatchedNode]); + this.set('selectedNode', node); + // 展开对应侧节点 + this.set('selectedNode', BENCH_PREFIX + this.selectedBenchMatchedNode); + } else { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + } + }; + + _changeBenchMatchedNode = (): void => { + if (this.isCompareGraph) { + const node = BENCH_PREFIX + this.selectedBenchMatchedNode; + this.set('selectedNpuMatchedNode', 
this.benchMatchedNodeList[this.selectedBenchMatchedNode]); + this.set('selectedNode', node); + // 展开对应侧节点 + this.set('selectedNode', NPU_PREFIX + this.selectedNpuMatchedNode); + } else { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + } + }; + + // 取消匹配 + async _deletelMatchedNodesLink(): Promise { + const result = await this.useMatched.deleteMatchedNodesLink( + this.selectedNpuMatchedNode, + this.selectedBenchMatchedNode, + this.selection, + this.renderHierarchy, + ); + if (result.success) { + // 更新匹配关系 + delete this.npuMatchedNodeList[this.selectedNpuMatchedNode]; + delete this.benchMatchedNodeList[this.selectedBenchMatchedNode]; + // 未匹配列表添加取消匹配的节点 + this.set('npuUnMatchedNodes', [...this.npuUnMatchedNodes, this.selectedNpuMatchedNode]); + this.set('benchUnMatchedNodes', [...this.benchUnMatchedNodes, this.selectedBenchMatchedNode]); + // 已匹配列表删除匹配成功的节点 + this.set('npuMatchedNodes', Object.keys(this.npuMatchedNodeList)); + this.set('benchMatchedNodes', Object.keys(this.benchMatchedNodeList)); + // 未匹配列表选择取消匹配的节点 + this.set('selectedNpuUnMatchedNode', this.selectedNpuMatchedNode); + this.set('selectedBenchUnMatchedNode', this.selectedBenchMatchedNode); + // 选中节点 + this.set('selectedNode', ''); + this.set('selectedNode', NPU_PREFIX + this.selectedNpuMatchedNode); + // 已匹配列表清空选中的节点 + this.set('selectedNpuMatchedNode', ''); + this.set('selectedBenchMatchedNode', ''); + Notification.show('取消成功:对应节点状态已更新', { + position: 'middle', + duration: 3000, + theme: 'success', + }); + } else { + Notification.show(`匹配失败:${result.error}`, { + position: 'middle', + duration: 3000, + theme: 'error', + }); + } + } + + // 匹配节点 + async _addMatchedNodesLink(): Promise { + const result = await this.useMatched.addMatchedNodesLink( + this.selectedNpuUnMatchedNode, + this.selectedBenchUnMatchedNode, + this.selection, + this.renderHierarchy, + ); + if (result.success) { + // 更新匹配关系 + 
this.npuMatchedNodeList[this.selectedNpuUnMatchedNode] = this.selectedBenchUnMatchedNode; + this.benchMatchedNodeList[this.selectedBenchUnMatchedNode] = this.selectedNpuUnMatchedNode; + // 未匹配列表删除匹配成功的节点 + this.set( + 'npuUnMatchedNodes', + this.npuUnMatchedNodes.filter((node) => node !== this.selectedNpuUnMatchedNode), + ); + this.set( + 'benchUnMatchedNodes', + this.benchUnMatchedNodes.filter((node) => node !== this.selectedBenchUnMatchedNode), + ); + // 已匹配列表添加匹配成功的节点 + this.set('npuMatchedNodes', Object.keys(this.npuMatchedNodeList)); + this.set('benchMatchedNodes', Object.keys(this.benchMatchedNodeList)); + // 已匹配列表选择匹配成功的节点 + this.set('selectedNpuMatchedNode', this.selectedNpuUnMatchedNode); + this.set('selectedBenchMatchedNode', this.selectedBenchUnMatchedNode); + // 选中节点 + this.set('selectedNode', ''); + this.set('selectedNode', NPU_PREFIX + this.selectedNpuUnMatchedNode); + // 未匹配列表清空选中的节点 + this.set('selectedNpuUnMatchedNode', ''); + this.set('selectedBenchUnMatchedNode', ''); + + Notification.show('匹配成功:对应节点状态已更新', { + position: 'middle', + duration: 3000, + theme: 'success', + }); + } else { + Notification.show(`匹配失败:${result.error}`, { + position: 'middle', + duration: 3000, + theme: 'error', + }); + } + } + + // 保存 + async _saveMatchedNodesLink(): Promise { + this.set('saveLoading', true); + const result = await this.useMatched.saveMatchedNodesLink(this.selection); + this.set('saveLoading', false); + if (result.success) { + Notification.show('保存成功:文件已变更', { + position: 'middle', + duration: 3000, + theme: 'success', + }); + } else { + Notification.show(`保存失败:${result.error}`, { + position: 'middle', + duration: 3000, + theme: 'error', + }); + } + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/useMatched.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/useMatched.ts new file mode 100644 index 
0000000000000000000000000000000000000000..0ff91499eb42758e3c54c7dc723e1b09bba84260 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_manual_match/useMatched.ts @@ -0,0 +1,238 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { isEmpty } from 'lodash'; +import { fetchPbTxt } from '../../../tf_graph_common/parser'; +import { NPU_PREFIX, BENCH_PREFIX } from '../../../tf_graph_common/common'; +import { safeJSONParse } from '../../../utils'; + +export interface UseMatchedType { + saveMatchedNodesLink: (selection: any) => Promise; + addMatchedNodesLink: ( + npuNodeName: string, + benchNodeName: string, + selection: any, + renderHierarchy: any, + ) => Promise; + queryMatchedStateList: (selection: any) => Promise; + deleteMatchedNodesLink: ( + npuNodeName: string, + benchNodeName: string, + selection: any, + renderHierarchy: any, + ) => Promise; +} + +const useMatched = (): UseMatchedType => { + const requestAddMatchNodes = async (npuNodeName: string, benchNodeName: string, metaData: any): Promise => { + const params = new URLSearchParams(); + params.set('npuNodeName', JSON.stringify(npuNodeName)); + params.set('benchNodeName', JSON.stringify(benchNodeName)); + params.set('metaData', JSON.stringify(metaData)); + // 接口请求 + const precisionPath = `addMatchNodes?${String(params)}`; + const precisionStr = await 
fetchPbTxt(precisionPath); // Fetch the response as an ArrayBuffer + const decoder = new TextDecoder(); + const decodedStr = decoder.decode(precisionStr); // Decode the ArrayBuffer into a string + // Parse the response; "None" placeholders become {} + const matchResult = safeJSONParse(decodedStr.replace(/"None"/g, '{}')); + return matchResult; + }; + + const requestDeleteMatchNodes = async (npuNodeName: string, benchNodeName: string, metaData: any): Promise => { + const params = new URLSearchParams(); + params.set('npuNodeName', JSON.stringify(npuNodeName)); + params.set('benchNodeName', JSON.stringify(benchNodeName)); + params.set('metaData', JSON.stringify(metaData)); + // Send the request + const precisionPath = `deleteMatchNodes?${String(params)}`; + const precisionStr = await fetchPbTxt(precisionPath); // Fetch the response as an ArrayBuffer + const decoder = new TextDecoder(); + const decodedStr = decoder.decode(precisionStr); // Decode the ArrayBuffer into a string + // Parse the response; "None" placeholders become {} + const matchResult = safeJSONParse(decodedStr.replace(/"None"/g, '{}')); + return matchResult; + }; + + const requestMatchStateList = async (metaData: any): Promise => { + const params = new URLSearchParams(); + params.set('metaData', JSON.stringify(metaData)); + // Send the request + const precisionPath = `getMatchedStateList?${String(params)}`; + const precisionStr = await fetchPbTxt(precisionPath); // Fetch the response as an ArrayBuffer + const decoder = new TextDecoder(); + const decodedStr = decoder.decode(precisionStr); // Decode the ArrayBuffer into a string + // Parse the response; "None" placeholders become {} + const matchResult = safeJSONParse(decodedStr.replace(/"None"/g, '{}')); + return matchResult; + }; + + const queryMatchedStateList = async (selection: any): Promise => { + const metaData = { + run: selection.run, + tag: selection.tag, + }; + const matchStateList = await requestMatchStateList(metaData); + return matchStateList; + }; + + const saveMatchedNodesLink = async (selection: any): Promise => { + const metaData = { + run: selection.run, + tag: selection.tag, + }; + const params = new URLSearchParams(); + params.set('metaData', JSON.stringify(metaData)); + // Send the request + const
precisionPath = `saveData?${String(params)}`; + const precisionStr = await fetchPbTxt(precisionPath); // 获取异步的 ArrayBuffer + const decoder = new TextDecoder(); + const decodedStr = decoder.decode(precisionStr); // 解码 ArrayBuffer 到字符串 + // 接口返回 + const saveResult = safeJSONParse(decodedStr.replace(/"None"/g, '{}')); + return saveResult; + }; + + const addMatchedNodesLink = async ( + npuNodeName: string, + benchNodeName: string, + selection: any, + renderHierarchy: any, + ): Promise => { + if (isEmpty(npuNodeName) || isEmpty(benchNodeName)) { + return { + success: false, + error: '调试侧节点或标杆节点为空', + }; + } + const metaData = { + run: selection.run, + tag: selection.tag, + }; + const matchResult = await requestAddMatchNodes(npuNodeName, benchNodeName, metaData); + if (matchResult.success) { + const graphNpuNodeName = NPU_PREFIX + npuNodeName; + const graphBenchNodeName = BENCH_PREFIX + benchNodeName; + const graphNpuNodeInputData = renderHierarchy.npu?.index?.[graphNpuNodeName]?.node?.inputData; + const graphNpuNodeOutputData = renderHierarchy.npu?.index?.[graphNpuNodeName]?.node?.outputData; + const intputStatisticalDiff = matchResult.data.intput_statistical_diff; + const outputStatisticalDiff = matchResult.data.output_statistical_diff; + // 更新节点之间的匹配关系 + updateGraphNodeData(graphNpuNodeInputData, intputStatisticalDiff); + updateGraphNodeData(graphNpuNodeOutputData, outputStatisticalDiff); + renderHierarchy.npu.index[graphNpuNodeName].node.matchedNodeLink = [graphBenchNodeName]; + renderHierarchy.bench.index[graphBenchNodeName].node.matchedNodeLink = [graphNpuNodeName]; + renderHierarchy.npu.index[graphNpuNodeName].node.nodeAttributes._linked_node = [graphBenchNodeName]; + renderHierarchy.bench.index[graphBenchNodeName].node.nodeAttributes._linked_node = [graphNpuNodeName]; + // 更新匹配精度,节点重新上色 + const precisionIndex = matchResult.data.precision_error; + const nodeAtts = renderHierarchy.npu.index[graphNpuNodeName].node.attr; + const precisionIndexObj = 
nodeAtts?.find((item) => item.key === 'precision_index'); + if (precisionIndexObj) { + precisionIndexObj.value = precisionIndex; + } else { + nodeAtts?.push({ + key: 'precision_index', + value: precisionIndex, + }); + } + } + return matchResult; + }; + + const deleteMatchedNodesLink = async ( + npuNodeName: string, + benchNodeName: string, + selection: any, + renderHierarchy: any, + ): Promise => { + if (isEmpty(npuNodeName) || isEmpty(benchNodeName)) { + return { + success: false, + error: '调试侧节点或标杆节点为空', + }; + } + const metaData = { + run: selection.run, + tag: selection.tag, + }; + const matchResult = await requestDeleteMatchNodes(npuNodeName, benchNodeName, metaData); + if (matchResult.success) { + const graphNpuNodeName = NPU_PREFIX + npuNodeName; + const graphBenchNodeName = BENCH_PREFIX + benchNodeName; + const graphNpuNodeInputData = renderHierarchy.npu?.index?.[graphNpuNodeName]?.node?.inputData; + const graphNpuNodeOutputData = renderHierarchy.npu?.index?.[graphNpuNodeName]?.node?.outputData; + // Clear the match relationship between the nodes + deleteMatchedNodeData(graphNpuNodeInputData); + deleteMatchedNodeData(graphNpuNodeOutputData); + renderHierarchy.npu.index[graphNpuNodeName].node.matchedNodeLink = []; + renderHierarchy.bench.index[graphBenchNodeName].node.matchedNodeLink = []; + renderHierarchy.npu.index[graphNpuNodeName].node.nodeAttributes._linked_node = []; + renderHierarchy.bench.index[graphBenchNodeName].node.nodeAttributes._linked_node = []; + // Drop the match precision so the node is recolored as unmatched + const nodeAtts = renderHierarchy.npu.index[graphNpuNodeName].node.attr; + const remainingAttrs = nodeAtts?.filter((item) => item.key !== 'precision_index'); + renderHierarchy.npu.index[graphNpuNodeName].node.attr = remainingAttrs; + } + return matchResult; + }; + + const updateGraphNodeData = (graphNpuNodeData, statisticalDiff): void => { + if (isEmpty(statisticalDiff) || isEmpty(graphNpuNodeData)) { + return; + } + for (const key in statisticalDiff) { + if
(Object.prototype.hasOwnProperty.call(statisticalDiff, key)) { + const value = statisticalDiff[key]; + graphNpuNodeData[key] = { + ...graphNpuNodeData[key], // Spreading undefined is a no-op, so a missing entry is safe here + ...value, + }; + } + } + }; + const deleteMatchedNodeData = (graphNpuNodeData): void => { + const keysToRemove = [ + 'MaxAbsErr', + 'MinAbsErr', + 'NormAbsErr', + 'MeanAbsErr', + 'MaxRelativeErr', + 'MinRelativeErr', + 'NormRelativeErr', + 'MeanRelativeErr', + ]; + for (const key in graphNpuNodeData) { + if (Object.prototype.hasOwnProperty.call(graphNpuNodeData, key)) { + const fieldObj = graphNpuNodeData[key]; + keysToRemove.forEach((keyToRemove) => { + // delete is a no-op when the key is absent; ?. guards a null or undefined entry + delete fieldObj?.[keyToRemove]; + }); + } + } + }; + + return { + saveMatchedNodesLink, + addMatchedNodesLink, + queryMatchedStateList, + deleteMatchedNodesLink, + }; +}; + +export default useMatched; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_search_combox/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_search_combox/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..234a43188ce4801c44a6025d94f16f4260b6b447 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/tf_search_combox/index.ts @@ -0,0 +1,181 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import '@vaadin/button'; +import '@vaadin/details'; +import '@vaadin/combo-box'; + +import { isEmpty } from 'lodash'; +import { Notification } from '@vaadin/notification'; +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property } from '@polymer/decorators'; +import '@vaadin/progress-bar'; +@customElement('tf-search-combox') +class Legend extends PolymerElement { + // 定义模板 + static readonly template = html` + +
+ + + +
+ `; + @property({ type: Object }) + onSelectChange!: () => void; + + @property({ type: String, notify: true }) + selectedValue: string = ''; + + @property({ type: Array }) + items: string[] = []; + + @property({ type: Boolean }) + isCompareGraph: boolean = true; + + _onChange(): void { + this.onSelectChange(); + } + + // 选择列表中的下一个节点 + _selectNext(): void { + if (!this.isCompareGraph) { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } + + if (isEmpty(this.items)) { + Notification.show('提示:列表为空', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } + if (isEmpty(this.selectedValue)) { + this.set('selectedValue', this.items[0]); + this.onSelectChange(); + return; + } + const index = this.items.indexOf(this.selectedValue); + if (index + 1 >= this.items.length) { + Notification.show('提示:已到达列表底部', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } else { + this.set('selectedValue', this.items[index + 1]); + } + this.onSelectChange(); + } + + // 选择列表中的上一个节点 + _selectPrevious(): void { + if (!this.isCompareGraph) { + Notification.show('提示:单图节点不支持匹配', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } + + if (isEmpty(this.items)) { + Notification.show('提示:列表为空', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } + if (isEmpty(this.selectedValue)) { + this.set('selectedValue', this.items[0]); + this.onSelectChange(); + return; + } + const index = this.items.indexOf(this.selectedValue); + if (index - 1 < 0) { + Notification.show('提示:已到达列表顶部', { + position: 'middle', + duration: 2000, + theme: 'contrast', + }); + return; + } else { + this.set('selectedValue', this.items[index - 1]); + } + this.onSelectChange(); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/ts_linkage_search_combox/index.ts 
b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/ts_linkage_search_combox/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..3d396d0f80aa3afe0f4906a624d5ba8970b13a3e --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/components/ts_linkage_search_combox/index.ts @@ -0,0 +1,197 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import '@vaadin/button'; +import '@vaadin/details'; +import '@vaadin/combo-box'; +import '@vaadin/select'; +import '@vaadin/text-field'; +import { NPU_PREFIX, BENCH_PREFIX } from '../../../tf_graph_common/common'; +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property, observe } from '@polymer/decorators'; +import '@vaadin/progress-bar'; +import * as tf_graph_render from '../../../tf_graph_common/render'; +import '../tf_search_combox/index'; +@customElement('tf-linkage-search-combox') +class Legend extends PolymerElement { + // 定义模板 + static readonly template = html` + +
+
+ + +
+
+ +
+
+ `; + + @property({ type: Object }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo = {} as any; + + @property({ type: String, notify: true }) + selectedNode = ''; + + @property({ type: Boolean }) + isCompareGraph: boolean = true; + + @property({ type: Array }) + menu = []; + + @property({ type: String }) + selectedMenuNode = ''; + + @property({ type: Array }) + menuItem = []; + + @property({ type: String }) + searchText = ''; + + @property({ type: String }) + selectedSide = '0'; + + @property({ type: Array }) + menuSideItem = [ + { label: '调试侧', value: '0' }, + { label: '标杆侧', value: '1' }, + ]; + + @observe('selectedSide') + _observeSelectSide(): void { + if (this.menu) { + this.set('menuItem', this.menu[Number(this.selectedSide)]); + this.set('searchText', ''); + this.set('selectedMenuNode', ''); + } + } + + @observe('menu') + _observeMenu(): void { + this.set('selectedMenuNode', ''); + this.set('selectedSide', '0'); + this.set('searchText', ''); + this.set('menuItem', this.menu[Number(this.selectedSide)]); + } + + @observe('renderHierarchy') + _observeRenderHierarchy(): void { + const isCompareGraphTemp = this.renderHierarchy.bench?.renderedOpNames.some((name: string) => + name.startsWith(BENCH_PREFIX), + ); + this.updateStyles({ '--select-border-color': isCompareGraphTemp ? 
'#0d0d0d' : 'white' }); + this.set('isCompareGraph', isCompareGraphTemp); + } + + _onSelectedMenuNode = (): void => { + let prefix = ''; + if (this.isCompareGraph) { + if (this.selectedSide === '0') { + prefix = NPU_PREFIX; + } else { + prefix = BENCH_PREFIX; + } + } + const node = prefix + this.selectedMenuNode; + this.set('selectedNode', node); + }; + + _onChangeSearchText(): void { + const allNodeItems = this.menu[Number(this.selectedSide)] as Array; + if (!this.searchText) { + this.set('menuItem', allNodeItems); + return; + } + const searchTextLower = this.searchText.trim().toLowerCase(); // 将搜索文本转换为小写 + const filterItem = allNodeItems?.filter((item: string) => { + return item.toLowerCase().indexOf(searchTextLower) !== -1; // 将目标文本转换为小写 + }); + this.set('selectedMenuNode', ''); + this.set('menuItem', filterItem); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/tf-graph-controls.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/tf-graph-controls.ts new file mode 100644 index 0000000000000000000000000000000000000000..3e1c0e652f08395d2ac98845b91db05d8f69ba45 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/tf-graph-controls.ts @@ -0,0 +1,754 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ +import '@vaadin/icon'; +import '@vaadin/icons'; +import '@vaadin/details'; +import '@vaadin/select'; +import './components/ts_linkage_search_combox/index'; +import { customElement, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import * as _ from 'lodash'; +import { DarkModeMixin } from '../polymer/dark_mode_mixin'; +import '../polymer/irons_and_papers'; +import './components/tf_manual_match/index'; +import './components/tf_color_select/index'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import '../tf_dashboard_common/tensorboard-color'; +import * as tf_graph_render from '../tf_graph_common/render'; +import '../tf_graph_common/tf-graph-icon'; +import '../tf_graph_loader/tf-graph-dashboard-loader'; +import { PaperCheckboxElement } from '../polymer/irons_and_papers'; + +export interface Selection { + run: string; + tag: string | null; + batch: number; + step: number; +} +export interface RunItem { + name: string; + tags: string[]; +} +export interface MinimapVis { + npu: boolean; + bench: boolean; +} +export type Dataset = Array; +@customElement('tf-graph-controls') +class TfGraphControls extends LegacyElementMixin(DarkModeMixin(PolymerElement)) { + static readonly template = html` + + +
+ + +
+
+
+
+ + + 自适应屏幕 + +
+
+ 调试侧缩略图 + +
+
+
目录 ([[datasets.length]])
+ + + + + +
+ +
+ + +
+
+
+ + +
+ +
+ + + `; + + // 核心part + @property({ type: Array, observer: '_datasetsChanged' }) + datasets: any = []; + + /** + * @type {tf_graph_render.MergedRenderGraphInfo} + */ + @property({ type: Object }) + renderHierarchy: tf_graph_render.MergedRenderGraphInfo; + + /** + * @type {!Selection} + */ + @property({ + type: Object, + notify: true, + readOnly: true, + computed: + '_computeSelection(datasets, _selectedRunIndex, _selectedTagIndex, _selectedMicroStep)', + }) + selection: object; + + @property({ type: Object }) + graphDef: any; + + @property({ type: Array, computed: '_getTags(datasets, _selectedRunIndex)' }) + tagList: string[] = []; + + // Run 路径选择 + @property({ type: Number }) + _selectedRunIndex: number = 0; + + // Tag选择 + @property({ type: Number }) + _selectedTagIndex: number = 0; + + @property({ type: Boolean }) + showSessionRunsDropdown: boolean = true; + + // MicroStep 选择 和 Step选择 + @property({ type: Number }) + _selectedMicroStep: number = -1; + + @property({ type: Number }) + _selectedStep: number = -1; + + @property({ type: Object }) + microsteps: any; + + @property({ type: Object }) + steplist: any; + + // 目录,全量节点数据,支撑各种节点的搜索 + @property({ type: Object }) + menu: any; + + // 颜色数据 + @property({ type: Object }) + colors: any; + + // 颜色图例 + @property({ type: Object }) + colorset; + + // 溢出检测标志 + @property({ type: Boolean }) + overflowcheck; + + // 节点匹配,未匹配部分节点 + @property({ type: Object }) + unmatched: any = []; + + // 上传文件 + @property({ type: Object, notify: true }) + selectedFile: object; + + @property({ type: String, notify: true }) + selectedNode: string | null = null; + + @property({ type: Object, notify: true }) + minimapVis: MinimapVis = { npu: true, bench: true }; + + override ready(): void { + super.ready(); + this._showTabContent('设置', 'nodes-content'); + document.addEventListener('contextMenuTag-changed', this._getTagChanged.bind(this)); + } + + _getTagChanged(contextMenuTag): void { + this.set('_selectedTagIndex', contextMenuTag.detail); + } + + 
_showTabContent(buttonText, contentId): void { + // Remove 'active' class from all buttons + this.shadowRoot?.querySelectorAll('.tab-button').forEach((button) => { + button.classList.remove('active'); + }); + + // Add 'active' class to the clicked button + const buttons = this.shadowRoot?.querySelectorAll('.tab-button'); + buttons?.forEach((button) => { + if ((button as HTMLElement).innerHTML === buttonText) { + button?.classList.add('active'); + } + }); + + // Hide all content + this.shadowRoot?.querySelectorAll('.tab-content').forEach((content) => { + content.classList.add('hidden'); + }); + + // Show the selected content + const selectedContent = this.shadowRoot?.getElementById(contentId); + if (selectedContent) { + selectedContent.classList.remove('hidden'); + } + } + + // 使用示例 + _showNodeControls(): void { + this._showTabContent('设置', 'nodes-content'); + } + + _showMatch(): void { + this._showTabContent('匹配', 'match-content'); + } + + _getTags(datasets: Dataset, _selectedRunIndex: number): string[] { + if (!datasets || !datasets[_selectedRunIndex]) { + return []; + } + this.set('_selectedTagIndex', -1); + // 需等待dom-repeat加载完成,因此需要再次异步以刷新下拉选项 + setTimeout(() => { + this.set('_selectedTagIndex', 0); + }, 0); + return datasets[_selectedRunIndex].tags; + } + + _fit(): void { + this.fire('fit-tap'); + } + + _clearMicroStep(): void { + // 也清除一下MicroStep和Step + this.set('_selectedMicroStep', -1); + this.set('_selectedStep', -1); + this.set('selectedNode', null); + } + + computedLength(microsteps): number { + return microsteps.length > 0 ? microsteps.length - 1 : 0; + } + + _datasetsChanged(newDatasets: Dataset, oldDatasets: Dataset): void { + if (oldDatasets !== null) { + // Select the first dataset by default. 
+ this._selectedRunIndex = 0; + } + } + + _computeSelection( + datasets: Dataset, + _selectedRunIndex: number, + _selectedTagIndex: number, + _selectedMicroStep: number, + ): { run: string; tag: string | null; batch: number; step: number } | null { + if (!datasets[_selectedRunIndex] || !datasets[_selectedRunIndex].tags[_selectedTagIndex]) { + return null; + } + return { + run: datasets[_selectedRunIndex].name, + tag: datasets[_selectedRunIndex].tags[_selectedTagIndex], + batch: _selectedMicroStep, + step: this._selectedStep, + }; + } + + _toggleNpuMinimap(event: CustomEvent): void { + const checkbox = event.target as PaperCheckboxElement; + this.set('minimapVis.npu', checkbox.checked); + } + + _toggleBenchMinimap(event: CustomEvent): void { + const checkbox = event.target as PaperCheckboxElement; + this.set('minimapVis.bench', checkbox.checked); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/utils.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/utils.ts new file mode 100644 index 0000000000000000000000000000000000000000..4dd19bf1af0a96539411370def45d0a70b14d571 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_controls/utils.ts @@ -0,0 +1,26 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +export const getElementBySelectors = (selectors): Element | null => { + let currentElement = document.querySelector('graph-app'); + for (const selector of selectors) { + currentElement = currentElement?.shadowRoot?.querySelector(selector); + if (!currentElement) { + return null; + } + } + return currentElement; +}; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_dashboard/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_dashboard/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..e1add6c39a744aa81220cafcaf19e97a5fb3c924 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_dashboard/index.ts @@ -0,0 +1,307 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +Copyright (c) 2025, Huawei Technologies. 
+Adapt to the model hierarchical visualization data collected by the msprobe tool +==============================================================================*/ + +import { customElement, observe, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; +import '../polymer/irons_and_papers'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import { RequestManager } from '../tf_backend/requestManager'; +import '../tf_dashboard_common/tf-dashboard-layout'; +import * as tf_storage from '../tf_storage'; +import * as vz_sorting from '../vz_sorting/sorting'; +import '../tf_graph_board/tf-graph-board'; +import * as tf_graph_render from '../tf_graph_common/render'; +import '../tf_graph_controls/tf-graph-controls'; +import '../tf_graph_loader/tf-graph-dashboard-loader'; + +/** + * The (string) name for the run of the selected dataset in the graph dashboard. + */ +const RUN_STORAGE_KEY = 'run'; + +/** + * tf-graph-dashboard displays a graph from a TensorFlow run. + * + * It has simple behavior: Creates a url-generator and run-generator + * to talk to the backend, and then passes the runsWithGraph (list of runs with + * associated graphs) along with the url generator into tf-graph-board for display. + * + * If there are multiple runs with graphs, the first run's graph is shown + * by default. The user can select a different run from a dropdown menu. + */ +@customElement('graph-app') +class TfGraphDashboard extends LegacyElementMixin(PolymerElement) { + static readonly template = html` + + + +
+ +
+

No graph definition files were found.

+
+
+ +
+
+
+ + `; + + @property({ type: Array }) + _datasets: any[] = []; + + @property({ type: Boolean }) + _datasetsFetched: boolean = false; + + @property({ type: Number }) + _selectedDataset: number = 0; + + @property({ type: Object }) + _renderHierarchy: tf_graph_render.RenderGraphInfo; + + @property({ type: Object }) + _requestManager: RequestManager = new RequestManager(); + + @property({ type: String, notify: true }) + selectedNode: string; + + @property({ type: Boolean }) + _isAttached: boolean; + + // Whether this dashboard is initialized. This dashboard should only be initialized once. + @property({ type: Boolean }) + _initialized: boolean; + + @property({ type: Array }) + runs: unknown[]; + + @property({ + type: String, + notify: true, + observer: '_runObserver', + }) + run: string = tf_storage + .getStringInitializer(RUN_STORAGE_KEY, { + defaultValue: '', + useLocalStorage: false, + }) + .call(this); + + @property({ type: Object }) + _selection: object; + + @property({ type: Object }) + _selectedFile: any; + + _runObserver = tf_storage.getStringObserver(RUN_STORAGE_KEY, { + defaultValue: '', + polymerProperty: 'run', + useLocalStorage: false, + }); + + @observe('_isAttached') + _maybeInitializeDashboard(): void { + let isAttached = this._isAttached; + if (this._initialized || !isAttached) { + // Either this dashboard is already initialized ... or we are not yet ready to initialize. + return; + } + // Set this to true so we only initialize once. + this._initialized = true; + this._fetchDataset().then((dataset) => { + const runNames = Object.keys(dataset); + // Transform raw data into UI friendly data. 
+ this._datasets = runNames.sort(vz_sorting.compareTagNames).map((runName) => { + const tagsData = dataset[runName]; + const tags = tagsData.sort(vz_sorting.compareTagNames); + return { name: runName, tags }; + }); + this._datasetsFetched = true; + }); + } + + @observe('_datasetsFetched', '_datasets', 'run') + _determineSelectedDataset(): void { + let datasetsFetched = this._datasetsFetched; + let datasets = this._datasets; + let run = this.run; + // By default, load the first dataset. + if (!run) { + // By default, load the first dataset. + this.set('_selectedDataset', 0); + return; + } + // If the URL specifies a dataset, load it. + const dataset = datasets.findIndex((d) => d.name === run); + if (dataset === -1) { + if (datasetsFetched) { + // Tell the user if the dataset cannot be found to avoid misleading + // the user. + const dialog = this.$$('#error-dialog') as any; + dialog.textContent = `No dataset named "${run}" could be found.`; + dialog.open(); + } + return; + } + this.set('_selectedDataset', dataset); + } + + @observe('_datasetsFetched', '_datasets', '_selectedDataset') + _updateSelectedDatasetName(): void { + let datasetsFetched = this._datasetsFetched; + let datasets = this._datasets; + let selectedDataset = this._selectedDataset; + if (!datasetsFetched) { + return; + } + // Cannot update `run` to update the hash in case datasets for graph is empty. + if (datasets.length <= selectedDataset) { + return; + } + this.set('run', datasets[selectedDataset].name); + } + + override attached(): void { + this.set('_isAttached', true); + } + + override detached(): void { + this.set('_isAttached', false); + } + + ready(): void { + super.ready(); + } + + _fit(): void { + (this.$$('#graphboard') as any).fit(); + } + + _getGraphDisplayClassName(_selectedFile: any, _datasets: any[]): string { + const isDataValid = _selectedFile || _datasets.length; + return isDataValid ? 
'' : 'no-graph'; + } + + _fetchDataset(): Promise { + return this._requestManager.request('info'); + } + + _datasetsState(datasetsFetched, datasets, state): boolean { + if (!datasetsFetched) { + return state === 'NOT_LOADED'; + } + if (!datasets || !datasets.length) { + return state === 'EMPTY'; + } + return state === 'PRESENT'; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_loader/tf-graph-dashboard-loader.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_loader/tf-graph-dashboard-loader.ts new file mode 100644 index 0000000000000000000000000000000000000000..a32a9f04bbeef769637d12f597e3b30ed6c6f7ba --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_loader/tf-graph-dashboard-loader.ts @@ -0,0 +1,335 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +import { customElement, observe, property } from '@polymer/decorators'; +import { PolymerElement } from '@polymer/polymer'; +import { LegacyElementMixin } from '../polymer/legacy_element_mixin'; +import * as tf_graph_common from '../tf_graph_common/common'; +import * as tf_graph from '../tf_graph_common/graph'; +import * as tf_graph_hierarchy from '../tf_graph_common/hierarchy'; +import * as tf_graph_loader from '../tf_graph_common/loader'; +import * as tf_graph_parser from '../tf_graph_common/parser'; +import * as tf_graph_util from '../tf_graph_common/util'; +import * as tf_graph_node from '../tf_graph_common/node'; +import { DATA_LOAD_TIME, DATA_NOTICE_TIME } from '../tf_graph_common/common'; +import * as tf_graph_controls from '../tf_graph_controls/tf-graph-controls'; +import { safeJSONParse } from '../utils'; + +interface GraphRunTag { + run: string; + tag: string | null; +} + +interface Components { + menu: object; + tooltips: string; + colors: object; + overflowCheck: boolean; + microSteps: number; + stepList: []; + unMatchedNode: []; + match: []; +} +/** + * Data loader for tf-graph-dashboard. + * + * The loader loads op graph, conceptual graphs, and RunMetadata associated with + * an op graph which is the major difference from the tf-graph-loader which is + * only capable of loading an op graph. Another difference is that the loader + * takes `selection` from the tf-graph-controls as an input as opposed to URL + * path of an data endpoint. + */ +@customElement('tf-graph-dashboard-loader') +class TfGraphDashboardLoader extends LegacyElementMixin(PolymerElement) { + static readonly _template = null; + + @property({ type: Array }) + datasets: any[]; + + /** + * @type {{value: number, msg: string}} + * + * A number between 0 and 100 denoting the % of progress + * for the progress bar and the displayed message. 
+ */ + @property({ type: Object, notify: true }) + progress: object; + + @property({ type: Object }) + selection: any; + + /** + * @type {?Event} + */ + @property({ type: Object }) + selectedFile: object; + + @property({ type: Object }) + hierarchyParams = tf_graph_hierarchy.defaultHierarchyParams; + + @property({ + type: Object, + readOnly: true, // readonly so outsider can't change this via binding + notify: true, + }) + outGraphHierarchy: tf_graph_hierarchy.Hierarchy; + + @property({ + type: Object, + readOnly: true, // readonly so outsider can't change this via binding + notify: true, + }) + outGraph: tf_graph.SlimGraph; + + @property({ + type: Object, + readOnly: true, // This property produces data. + notify: true, + }) + outStats: object; + + @property({ type: Object }) + _graphRunTag: GraphRunTag; + + @property({ + type: Object, + notify: true, + }) + menu: object; + + @property({ + type: Object, + notify: true, + }) + colorset: object; + + @property({ + type: Object, + notify: true, + }) + tooltips: object; + + @property({ + type: Object, + notify: true, + }) + colors: any; + + @property({ + type: Array, + notify: true, + }) + overflowcheck; + + @property({ + type: Object, + notify: true, + }) + microsteps: any; + + @property({ + type: Object, + notify: true, + }) + steplist: any; + + @property({ + type: Object, + notify: true, + }) + unmatched: object; + + @property({ + type: Object, + notify: true, + }) + matchedlist: object; + + @observe('selectedFile') + _selectedFileChanged(): void { + let e = this.selectedFile; + if (!e) { + return; + } + const target = (e as any).target as HTMLInputElement; + const file = target.files?.[0]; + if (!file) { + return; + } + // Clear out the value of the file chooser. This ensures that if the user + // selects the same file, we'll re-read it. 
+ target.value = ''; + this._fetchAndConstructHierarchicalGraph(null, file); + } + + @observe('selection') + _selectionChanged(): void { + if (!this.selection) { + return; + } + // selection can change a lot within a microtask. + // Don't fetch too much too fast and introduce race condition. + this.debounce('selectionchange', () => { + this._load(this.selection); + }); + } + + getColors(): any { + return this.colors; + } + _setCompoments(componentsPath): Promise { + return new Promise(async (resolve, reject) => { + this.set('progress', { + value: 0, + msg: '', + }); + + const tracker = tf_graph_util.getTracker(this); + const dataTracker = tf_graph_util.getSubtaskTracker(tracker, 100, 'Data'); + dataTracker.setMessage('Initialization in progress'); + + let timer = 0; + let shouldBreak = false; // Flag that tells the timer loop to exit + + // Timer task that drives the progress bar while the fetch is in flight + const timerTask = async function (): Promise { + let previousProgress = 0; // Progress value reported on the previous tick + + while (timer <= DATA_LOAD_TIME && !shouldBreak) { + if (timer < DATA_NOTICE_TIME) { + const progress = Math.log(timer + 1) / Math.log(DATA_NOTICE_TIME); + const progressIncrement = (progress * 100) - previousProgress; + dataTracker.updateProgress(progressIncrement); + previousProgress = progress * 100; + } else { + dataTracker.setMessage('File data too large, still reading'); + } + await new Promise((resolveTimer) => setTimeout(resolveTimer, 100)); + timer++; + } + }.bind(this); + + const fetchTask = async (): Promise => { + let componentsStr; + try { + componentsStr = await tf_graph_parser.fetchPbTxt(componentsPath); + } catch (e) { + shouldBreak = true; // fetchPbTxt failed: stop the timer + dataTracker.reportError('Fetch error, please check first file in file path', e as Error); + return; + } + + shouldBreak = true; // Fetch succeeded: stop the timer as well + + let components: Components = { + menu: [], + tooltips: '', + colors: {}, + overflowCheck: false, + microSteps: 0, + stepList: [], + unMatchedNode: [], + match: [], + }; + + try { + if (componentsStr) { + components = { + 
...components, + ...(safeJSONParse(new TextDecoder().decode(componentsStr).replace(/'/g, '"')) as Components), + }; + } + } catch (e) { + shouldBreak = true; // Parsing failed: stop the timer + dataTracker.reportError( + 'Parse components failed, please check the format of config data in the input vis file', + e as Error, + ); + return; + } + // Publish the parsed components to the bound properties + const entries = Object.entries(components.tooltips || {}); + const toolTipObject = Object.fromEntries(entries); + + this.set('menu', components.menu); + this.set('tooltips', toolTipObject); + this.set('colors', components.colors); + this.set('overflowcheck', components.overflowCheck); + this.set('colorset', Object.entries(components.colors || {})); + this.set('unmatched', components.unMatchedNode); + this.set('matchedlist', components.match); + + tf_graph_node.getColors(components.colors); + + const microstepsCount = Number(components.microSteps); + if (microstepsCount) { + const microstepsArray = ['ALL', ...Array.from({ length: microstepsCount }, (_, index) => index)]; + this.set('microsteps', microstepsArray); + } else { + this.set('microsteps', []); + } + const steplistCount = Number(components.microSteps); + this.set('steplist', steplistCount ? components.stepList : []); + resolve(); + } + + // Run the timer task and the fetch task concurrently + await Promise.all([timerTask(), fetchTask()]); + }); + } + + _load(selection: tf_graph_controls.Selection): void { + const { run, tag, batch, step } = selection; + this.set('outStats', null); + const params = new URLSearchParams(); + params.set('run', run); + if (tag !== undefined && tag !== null) { + params.set('tag', tag); + } + params.set('batch', String(batch === -1 ? -1 : batch - 1)); + params.set('step', String(step === -1 ? 
-1 : step - 1)); + const componentsPath = `components?${String(params)}`; + params.set('node', 'root'); + const graphPath = `subgraph?${String(params)}`; + this._setCompoments(componentsPath).then(() => { + // Runs once _setCompoments has resolved + this._fetchAndConstructHierarchicalGraph(graphPath).then(() => { + this._graphRunTag = { run, tag }; // Runs once the graph has been constructed + }); + }); + return; + } + + _fetchAndConstructHierarchicalGraph(path: string | null, pbTxtFile?: Blob): Promise { + // Reset the progress bar to 0. + this.set('progress', { + value: 0, + msg: '', + }); + const tracker = tf_graph_util.getTracker(this); + return tf_graph_loader + .fetchAndConstructHierarchicalGraph( + tracker, + path, + pbTxtFile !== undefined ? pbTxtFile : null, + this.hierarchyParams, + ) + .then(({ graph, graphHierarchy }): void => { + this._setOutGraph(graph); + this._setOutGraphHierarchy(graphHierarchy); + }, + ); + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_resize_height/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_resize_height/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..d569317564fc4a16e1a965ccb2cd8386cf51f787 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_resize_height/index.ts @@ -0,0 +1,112 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property, observe } from '@polymer/decorators'; + +@customElement('tf-resize-height') +class ResizableTabsheet extends PolymerElement { + static readonly template = html` + + +
+
+ +
+ `; + + @property({ + type: Number, + notify: true, + }) + height: number = 300; + + _resize: (event: MouseEvent) => void = () => {}; + _stopResize: (this: Document, ev: MouseEvent) => any = () => {}; + + @observe('height') + _updateHeight(newHeight): void { + this.updateStyles({ '--tabsheet-height': `${newHeight}px` }); + } + + override ready(): void { + super.ready(); + this._initResizeHandle(); + } + + _initResizeHandle(): void { + const tabsheet = this.$.tabsheet as HTMLElement; + const resizeHandle = this.$.resizeHandle as HTMLElement; + + let isResizing = false; + let startY = 0; + let startHeight = 0; + + // Start dragging + resizeHandle.addEventListener('mousedown', (event: MouseEvent) => { + isResizing = true; + startY = event.clientY; + startHeight = tabsheet.offsetHeight; + document.body.style.cursor = 'ns-resize'; + document.addEventListener('mousemove', this._resize); + document.addEventListener('mouseup', this._stopResize); + }); + + // While dragging + this._resize = (event): void => { + if (!isResizing) { + return; + } + const deltaY = startY - event.clientY; // Positive when dragging upward + this.set('height', Math.max(10, startHeight + deltaY)); // Update the height, clamped to at least 10px + }; + + // Stop dragging + this._stopResize = (): void => { + isResizing = false; + document.body.style.cursor = ''; + document.removeEventListener('mousemove', this._resize); + document.removeEventListener('mouseup', this._stopResize); + }; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaadin_table/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaadin_table/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..ff03bc2e051be53f7c33d88f0b9d55a7da52dd28 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaadin_table/index.ts @@ -0,0 +1,195 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. 
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property, query } from '@polymer/decorators'; +import '@vaadin/grid'; // Register the Vaadin Grid component +import '@vaadin/tooltip'; +import type { GridEventContext } from '@vaadin/grid'; +@customElement('tf-vaadin-table') +class TfVaadinTable extends PolymerElement { + static readonly template = html` + + + + `; + + @property({ type: Object }) + syncGrid?: HTMLElement; // Sibling grid whose click highlighting must stay in sync + + @property({ type: Boolean }) + isSingleGraphNode = false; // Whether the node belongs to a single (non-comparison) graph + + @property({ type: Object }) + tooltips: any; + + @property({ type: Object }) + handleCellClick!: (e: MouseEvent, syncGrid: HTMLElement) => void; + + @property({ + type: Array, + computed: '_computeHeaders(ioDataset)', + }) + headers!: any[]; + + @property({ + type: Boolean, + computed: '_isEmptyGrid(ioDataset)', + }) + isEmptyGrid!: boolean; + + renderDefaultValue!: (root: HTMLElement, column: any, rowData: any) => void; + + override connectedCallback(): void { + super.connectedCallback(); + this.renderDefaultValue = this._renderDefaultValue.bind(this); + } + + /** + * Compute the table headers (the set of unique keys). + * @param {Array} data Data source + * @return {Array} Header array + */ + _computeHeaders(data): any[] { + if (this._isEmptyGrid(data)) { + return []; + } + const ignoreDataIndex = ['data_name', 'isBench', 'isMatched', 'value']; + const headers = Array.from( + data.reduce((keys, item) => { + // Collect keys from every data item (no row limit is applied here) + 
Object.keys(item).forEach((key) => { + if (!ignoreDataIndex.includes(key)) { + keys.add(String(key)); + } + }); + return keys; + }, new Set()), + ); + return headers; + } + + _isEmptyGrid(data): boolean { + return !Array.isArray(data) || data.length === 0; + } + + _renderDefaultValue(root: HTMLElement, column: any, rowData: any): void { + const selectedColor = this._getCssVariable('--selected-color'); + const matchedColor = this._getCssVariable('--matched-color'); + const isBench = 'isBench'; + const isMatched = 'isMatched'; + root.classList.remove('splitter'); + if (rowData.item[isBench]) { + root.style.backgroundColor = matchedColor; + if (rowData.item[isMatched]) { + root.classList.add('splitter'); + } + } else { + root.style.backgroundColor = selectedColor; + } + if (column.path === 'name' && !this.isSingleGraphNode) { + const className = rowData.item[isMatched] ? 'avater-matched' : 'avater-unmatched'; + root.innerHTML = `${rowData.item[column.path]}`; + return; + } + let tooltip = rowData.item[column.path] ?? '-'; + if (this.tooltips?.[column.path]) { + tooltip = `${this.tooltips[column.path]}:\n${tooltip}`; + } + root.title = tooltip; + root.textContent = rowData.item[column.path] ?? 
'-'; + } + + handleGridClick(e: MouseEvent): void { + this.handleCellClick(e, this.syncGrid as HTMLElement); // Inside the handler `this` refers to the current component and cannot reach the sibling grid, so the grid is passed explicitly + } + + _getCssVariable(variableName): string { + const computedStyle = getComputedStyle(this); + return computedStyle.getPropertyValue(variableName).trim(); // Trim surrounding whitespace + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaddin_text_table/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaddin_text_table/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..fa5001f1be193334b4e908147fdd50918de46cb0 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/components/tf_vaddin_text_table/index.ts @@ -0,0 +1,248 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +import { PolymerElement, html } from '@polymer/polymer'; +import { customElement, property } from '@polymer/decorators'; +import '@vaadin/grid'; // Register the Vaadin Grid component +import '@vaadin/tooltip'; +import type { GridEventContext } from '@vaadin/grid'; +import { Notification } from '@vaadin/notification'; +@customElement('tf-vaadin-text-table') +class TfVaadinTable extends PolymerElement { + static readonly template = html` + + + + `; + + @property({ type: Object }) + syncGrid!: HTMLElement; // Sibling grid whose click highlighting must stay in sync + + @property({ type: Object }) + handleCellClick!: (e: MouseEvent, syncGrid: HTMLElement) => void; + + @property({ + type: Array, + computed: '_computeHeaders(dataset)', + }) + headers: any[] = []; + + @property({ + type: Boolean, + computed: '_isEmptyGrid(dataset)', + }) + isEmptyGrid: boolean = false; + + renderDefaultValue!: (root: HTMLElement, column: any, rowData: any) => void; + tooltipGenerator!: (context: GridEventContext>) => string; + + override connectedCallback(): void { + super.connectedCallback(); + this.renderDefaultValue = this._renderDefaultValue.bind(this); + } + + /** + * Compute the table headers (the set of unique keys). + * @param {Array} data Data source + * @return {Array} Header array + */ + _computeHeaders(data): any[] { + if (this._isEmptyGrid(data)) { + return []; + } + const ignoreDataIndex = ['title']; + const headers = Array.from( + data.slice(0, 5).reduce((keys, item) => { + // Only inspect the first 5 items to avoid a performance hit on large datasets + Object.keys(item).forEach((key) => { + if (!ignoreDataIndex.includes(key)) { + keys.add(key); + } + }); + return keys; + }, new Set()), + ); + return headers; + } + + _isEmptyGrid(data): boolean { + return !Array.isArray(data) || data.length === 0; + } + + _renderDefaultValue(root: HTMLElement, column: any, rowData: any): void { + const propertyName = column.path; + const titleName = rowData.item.title; + if (!root.firstElementChild) { + switch (titleName) { + case 'stackInfo': + case 'suggestions': + this._createCopyableTextarea(root, propertyName, rowData); + break; + default: + 
root.style.fontWeight = 'bold'; + root.textContent = `${titleName}:${rowData.item[propertyName] || 'null'}`; + break; + } + } else { + switch (titleName) { + case 'stackInfo': + case 'suggestions': + this._updateCopyableTextarea(root, propertyName, rowData); + break; + default: + root.textContent = `${titleName}:${rowData.item[propertyName] || 'null'}`; + } + } + } + + _createCopyableTextarea(root: HTMLElement, propertyName: any, rowData: any): void { + const container = document.createElement('div'); + container.className = 'copyable-input'; + + const title = document.createElement('div'); + const textTitle = 'title'; + title.className = 'copyable-input-title'; + title.style.fontWeight = 'bold'; + title.textContent = `${rowData.item[textTitle]}:`; + container.appendChild(title); + + const textarea = document.createElement('textarea'); + textarea.readOnly = true; + textarea.value = rowData.item[propertyName]; + textarea.onmouseenter = () => { + button.style.display = 'unset'; + }; + textarea.onmouseleave = () => { + button.style.display = 'none'; + }; + container.appendChild(textarea); + + const button = document.createElement('button'); + button.className = 'copy-button'; + button.textContent = '复制'; + button.style.display = 'none'; + button.onmousemove = () => { + button.style.display = 'unset'; + }; + button.onclick = () => { + navigator.clipboard + .writeText(textarea.value) + .then(() => { + Notification.show('复制成功', { + position: 'middle', + duration: 1000, + theme: 'success', + }); + }) + .catch((err) => { + Notification.show('复制失败,请重试', { + position: 'middle', + duration: 1000, + theme: 'error', + }); + console.error('Failed to copy text:', err); + }); + }; + container.appendChild(button); + + root.appendChild(container); + } + + _updateCopyableTextarea(root: HTMLElement, propertyName: any, rowData: any): void { + const title = root.querySelector('.copyable-input-title'); + const textTitle = 'title'; + if (title) { + title.textContent = 
`${rowData.item[textTitle]}:`; + } + const textarea = root.querySelector('textarea'); + if (textarea) { + textarea.value = rowData.item[propertyName]; + } + } + + _tooltipGenerator = (context: GridEventContext>): string => { + const { column, item } = context; + return item?.[column?.path || ''] || ''; + }; + + handleGridClick(e: MouseEvent): void { + this.handleCellClick(e, this.syncGrid); // Inside the handler `this` refers to the current component and cannot reach the sibling grid, so the grid is passed explicitly + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/domain/useNodeInfoDomain.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/domain/useNodeInfoDomain.ts new file mode 100644 index 0000000000000000000000000000000000000000..a7f3702e2fe12054aefbaa1fd511cb2ace76d4c3 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/domain/useNodeInfoDomain.ts @@ -0,0 +1,38 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +import { fetchPbTxt } from '../../tf_graph_common/parser'; + +const useNodeInfoDomain = (): { getMatchNodeInfo: (nodeInfo: any, metaData: any) => Promise } => { + const getMatchNodeInfo = async (nodeInfo: any, metaData: any): Promise => { + const params = new URLSearchParams(); + params.set('nodeInfo', JSON.stringify(nodeInfo)); + params.set('metaData', JSON.stringify(metaData)); + // Build the request path + const precisionPath = `getNodeInfo?${String(params)}`; + const precisionStr = await fetchPbTxt(precisionPath); // Await the response as an ArrayBuffer + const decoder = new TextDecoder(); + const decodedStr = decoder.decode(precisionStr); // Decode the ArrayBuffer into a string + // Parse the response, mapping the string "None" to an empty object + const matchResult = JSON.parse(decodedStr.replace(/"None"/g, '{}')); + return matchResult; + }; + + return { + getMatchNodeInfo, + }; +}; + +export default useNodeInfoDomain; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..1a0ffaf77d01ac9f92f6f8f54091bd40db1c5517 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/index.ts @@ -0,0 +1,327 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +import '@vaadin/tabs'; +import '@vaadin/tabsheet'; +import { Notification } from '@vaadin/notification'; +import { isEmpty } from 'lodash'; +import { PolymerElement, html } from '@polymer/polymer'; +import { observe, customElement, property } from '@polymer/decorators'; +import useNodeInfo from './useNodeInfo'; +import './components/tf_vaadin_table/index'; +import './components/tf_vaddin_text_table/index'; +import './components/tf_resize_height/index'; +import * as tf_graph_hierarchy from '../tf_graph_common/hierarchy'; +import type { UseNodeInfoType } from './useNodeInfo'; +import { GroupNode, Metanode, OpNode } from '../tf_graph_common/graph'; +import { BENCH_PREFIX, NPU_PREFIX } from '../tf_graph_common/common'; + +@customElement('tf-graph-vaadin-tab') +class TfGraphNodeInfo extends PolymerElement { + static readonly template = html` + + + + + + + + 节点信息 + + +
+ +
+
+
+ + +
+ +
+ + + +
+
+
+
+ +
+ + +
+
+
+
+ `; + + @property({ + type: Object, + computed: '_getNode(selectedNode)', + }) + _node: GroupNode | OpNode | undefined | null; + + @property({ + type: Object, + }) + _matchedNode: any; + + @property({ type: String }) + selectedNode: string = ''; + + @property({ type: Object }) + selection: any; + + @property({ type: Object }) + graphHierarchy?: tf_graph_hierarchy.MergedHierarchy; + + @property({ type: Array }) + ioDataset: any[] = []; + + @property({ type: Array }) + detailData: any[] = []; + + @property({ type: String }) + npuNodeName?: string; + + @property({ type: String }) + benchNodeName?: string; + + @property({ type: Boolean }) + isSingleGraphNode: boolean = false; + + useNodeInfo: UseNodeInfoType = useNodeInfo(); + + @observe('_node') + async _updateTableData(): Promise { + if (!this._node) { + this.set('ioDataset', []); + this.set('detailData', []); + this.set('npuNodeName', ''); + this.set('benchNodeName', ''); + return; + } + const isNodeStartWithN = this._node.name.startsWith(NPU_PREFIX); + const isSingleGraphNode = !isNodeStartWithN && !this._node?.name?.startsWith(BENCH_PREFIX); + this.set('isSingleGraphNode', isSingleGraphNode); + if (!isSingleGraphNode) { + await this._updateMatchedNodeLink(this.selectedNode); + } + // The selected node may itself be the bench-side (matched) node, so resolve both sides explicitly + const npuNode = isNodeStartWithN || isSingleGraphNode ? this._node : this._matchedNode; + let benchNode; + if (!isSingleGraphNode) { + benchNode = isNodeStartWithN ? 
this._matchedNode : this._node; + } else { + benchNode = undefined; + } + this.set('npuNodeName', npuNode?.name?.replace(NPU_PREFIX, '')); + this.set('benchNodeName', benchNode?.name?.replace(BENCH_PREFIX, '')); + const inputDataset = this.useNodeInfo.getIoDataSet(npuNode, benchNode, 'inputData'); + const outputDataSet = this.useNodeInfo.getIoDataSet(npuNode, benchNode, 'outputData'); + const ioDataset = [ + ...inputDataset.matchedIoDataset, + ...outputDataSet.matchedIoDataset, + ...inputDataset.unMatchedNpuIoDataset, + ...outputDataSet.unMatchedNpuIoDataset, + ...inputDataset.unMatchedBenchIoDataset, + ...outputDataSet.unMatchedBenchIoDataset, + ]; + this.set('ioDataset', ioDataset); + const detailData = this.useNodeInfo.getDetailDataSet(npuNode, benchNode); + this.set('detailData', detailData); + } + + _getNode(selectedNode): GroupNode | OpNode | undefined | null { + if (!selectedNode) { + return null; + } + const isBench = selectedNode.startsWith(BENCH_PREFIX); + return isBench ? this.graphHierarchy?.bench?.node(selectedNode) : this.graphHierarchy?.npu.node(selectedNode); + } + + async _updateMatchedNodeLink(selectedNode: string): Promise { + const isBench = selectedNode.startsWith(BENCH_PREFIX); + const [selectedHierarchy, matchedHierarchy] = isBench + ? [this.graphHierarchy?.bench, this.graphHierarchy?.npu] + : [this.graphHierarchy?.npu, this.graphHierarchy?.bench]; + const matchedNodes: Array = (selectedHierarchy?.node(selectedNode) as Metanode | OpNode)?.matchedNodeLink; + if (isEmpty(matchedNodes)) { + this.set('_matchedNode', null); + return; + } + const matchNodeName = matchedNodes[matchedNodes.length - 1]; + let matchNode = matchNodeName ? matchedHierarchy?.node(matchNodeName) : null; + // If the matched node is not in the loaded hierarchy, fetch its information from the backend + if (isEmpty(matchNode)) { + const nodeInfo = { + nodeName: matchNodeName?.replace(new RegExp(`^(${NPU_PREFIX}|${BENCH_PREFIX})`), ''), // Strip the prefix + nodeType: this._node?.name?.startsWith(NPU_PREFIX) ? 
'Bench' : 'NPU', + }; + const { success, data, error } = await this.useNodeInfo.getNodeInfo(nodeInfo, this.selection); + if (success) { + this.set('_matchedNode', data); + } else { + Notification.show(`获取匹配节点失败:${error}`, { + position: 'middle', + duration: 2000, + theme: 'error', + }); + } + } else { + this.set('_matchedNode', matchNode); + } + } + + // Highlight a grid cell when it is clicked + handleGridCellClick(e: MouseEvent, syncGrid: HTMLElement): void { + const target = e.composedPath()[0] as HTMLElement; // The element that was actually clicked + const slotValue = target.getAttribute('slot'); // Read its slot attribute + if (!slotValue || !slotValue.startsWith('vaadin-grid-cell-content-')) { + return; + } + const cellIndex = parseInt(slotValue.split('-').pop() || '0', 10); + // The first 8 elements are header cells and cannot be selected, so skip them + if (cellIndex <= 8) { + return; + } + const highlightedCells = this.shadowRoot?.querySelectorAll('.highlight-cell'); + highlightedCells?.forEach((cell) => cell.classList.remove('highlight-cell')); // Clear any existing highlight + target.classList.add('highlight-cell'); // Highlight the clicked cell + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/type/index.d.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/type/index.d.ts new file mode 100644 index 0000000000000000000000000000000000000000..fcb10e13f646c629d3cb9e1e93be732a7ae4953a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/type/index.d.ts @@ -0,0 +1,28 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +export interface MatchNodeInfo { + success: boolean; + data?: { + name?: string; + inputData?: Record; + outputData?: Record; + stackData?: string; + suggestions?: Record; + include?: number; + subnodes?: Array; + }; + error?: string; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/useNodeInfo.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/useNodeInfo.ts new file mode 100644 index 0000000000000000000000000000000000000000..f4caf72cf19a72942ceb9818d3dc208e39287fc0 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_graph_node_info/useNodeInfo.ts @@ -0,0 +1,198 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +import { isEmpty, cloneDeep } from 'lodash'; +import useNodeInfoDomain from './domain/useNodeInfoDomain'; +import type { MatchNodeInfo } from './type'; +import { BENCH_PREFIX, NPU_PREFIX } from '../tf_graph_common/common'; +import { safeJSONParse } from '../utils'; + +export interface UseNodeInfoType { + getNodeInfo: ( + nodeInfo: { + nodeName: string; + nodeType: string; + }, + metaData: any, + ) => Promise; + getIoDataSet: ( + npuNode: any, + benchNode: any, + type: 'inputData' | 'outputData', + ) => { + matchedIoDataset: Array>; + unMatchedNpuIoDataset: Array>; + unMatchedBenchIoDataset: Array>; + }; + getDetailDataSet: (npuNode: any, benchNode: any) => Array>; +} + +const useNodeInfo = (): UseNodeInfoType => { + const useNodeInfoService = useNodeInfoDomain(); + /** + * 获取节点信息 + * @param node_name + * @param graph_name + * @param run_name + * @returns + */ + const getNodeInfo = async ( + nodeInfo: { nodeName: string; nodeType: string }, + metaData: any, + ): Promise => { + const mactchResult = await useNodeInfoService.getMatchNodeInfo(nodeInfo, metaData); + mactchResult.data = convertNodeInfo(mactchResult.data); // 提取有效数据,统一命名 + return mactchResult; + }; + + const convertNodeInfo = (nodeInfo: any): MatchNodeInfo['data'] => { + if (isEmpty(nodeInfo)) { + return {}; + } + return { + name: nodeInfo.id, + inputData: nodeInfo.input_data, + outputData: nodeInfo.output_data, + stackData: !isEmpty(nodeInfo.stack_info) ? 
JSON.stringify(nodeInfo.stack_info) : '', + suggestions: nodeInfo.suggestions, + subnodes: nodeInfo.subnodes, + include: nodeInfo.subnodes?.length, + }; + }; + + /** + * 获取匹配的输入输出数据 + * @param npuNode NPU节点信息 + * @param benchNode 匹配的节点信息 + * @param name 'inputData' | 'outputData' + * @returns { matchedDataset: Array<{}>, unMatchedNpuDataset: Array<{}>, unMatchedBenchDataset: Array<{}> } + */ + const getIoDataSet = ( + npuNode: any, + benchNode: any, + type: 'inputData' | 'outputData', + ): { + matchedIoDataset: Array>; + unMatchedNpuIoDataset: Array>; + unMatchedBenchIoDataset: Array>; + } => { + if (isEmpty(npuNode?.[type]) && isEmpty(benchNode?.[type])) { + return { + matchedIoDataset: [], + unMatchedNpuIoDataset: [], + unMatchedBenchIoDataset: [], + }; + } + const npuNodeName = npuNode?.name.replace(new RegExp(`^(${NPU_PREFIX}|${BENCH_PREFIX})`), ''); + const benchNodeName = benchNode?.name?.replace(new RegExp(`^(${NPU_PREFIX}|${BENCH_PREFIX})`), ''); + const npuData = cloneDeep(npuNode?.[type]); // 获取当前节点的输入数据 + const benchData = cloneDeep(benchNode?.[type]); // 获取匹配节点的输入数据 + + const matchedIoDataset: Array> = []; // 初始化输入数据集 + const unMatchedBenchIoDataset: Array> = []; + const unMatchedNpuIoDataset: Array> = []; + const npuKeys = Object.keys(npuData || {}); + const benchKeys = Object.keys(benchData || {}); + const minLength = Math.min(npuKeys.length, benchKeys.length); + for (let i = 0; i < minLength; i++) { + const npuKey = npuKeys[i]; + const benchKey = benchKeys[i]; + matchedIoDataset.push({ + name: npuKey.replace(`${npuNodeName}.`, ''), + isMatched: true, + ...npuData[npuKey], + }); + matchedIoDataset.push({ + name: benchKey.replace(`${benchNodeName}.`, ''), + isBench: true, + isMatched: true, + ...benchData[benchKey], + }); + delete npuData[npuKey]; + delete benchData[benchKey]; + } + Object.keys(npuData || {}).forEach((key) => { + if (npuData[key] !== 'None') { + unMatchedNpuIoDataset.push({ + name: key.replace(`${npuNodeName}.`, ''), + ...npuData[key], 
+ }); + } + }); + Object.keys(benchData || {}).forEach((key) => { + if (benchData[key] !== 'None') { + unMatchedBenchIoDataset.push({ + name: key.replace(`${benchNodeName}.`, ''), + isBench: true, + ...benchData[key], + }); + } + }); + return { matchedIoDataset, unMatchedNpuIoDataset, unMatchedBenchIoDataset }; + }; + + const getDetailDataSet = (npuNode: any, benchNode: any): Array> => { + if (isEmpty(npuNode) && isEmpty(benchNode)) { + return []; + } + const nodeName = `NPU节点:${npuNode?.name?.replace(new RegExp(`^(${NPU_PREFIX}|${BENCH_PREFIX})`), '')}`; + const benchNodeName = `标杆节点:${benchNode?.name?.replace(new RegExp(`^(${NPU_PREFIX}|${BENCH_PREFIX})`), '')}`; + const detailData: Array> = []; + // 获取stackInfo + const stackInfo: Record = {}; + const npustackInfo = npuNode?.stackData; + const benchstackInfo = benchNode?.stackData; + const title = 'title'; + if (!isEmpty(npustackInfo)) { + stackInfo[nodeName] = safeJSONParse(npustackInfo.replace(/'/g, '"'))?.join('\n'); + } + if (!isEmpty(benchstackInfo)) { + stackInfo[benchNodeName] = safeJSONParse(benchstackInfo.replace(/'/g, '"')); + } + if (!isEmpty(stackInfo)) { + stackInfo[title] = 'stackInfo'; + detailData.push(stackInfo); + } + // 获取suggestions + const suggestion: Record = {}; + const npusuggestion = npuNode?.suggestions; + const benchsuggestion = benchNode?.suggestions; + if (!isEmpty(npusuggestion)) { + suggestion[nodeName] = converObjectToString(npusuggestion); + } + if (!isEmpty(benchsuggestion)) { + suggestion[benchNodeName] = converObjectToString(benchsuggestion); + } + if (!isEmpty(suggestion)) { + suggestion[title] = 'suggestions'; + detailData.push(suggestion); + } + return detailData; + }; + + const converObjectToString = (obj: any): string => { + return Object.entries(obj) + .map(([key, value]) => `${key}: ${value}`) + .join('\n'); + }; + + return { + getNodeInfo, + getIoDataSet, + getDetailDataSet, + }; +}; + +export default useNodeInfo; diff --git 
a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..e0d7e905a2b45bb959b35cea9d1e6d32d8da240c --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/index.ts @@ -0,0 +1,16 @@ +/* Copyright 2020 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +export * from './listeners'; +export * from './storage'; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/listeners.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/listeners.ts new file mode 100644 index 0000000000000000000000000000000000000000..8d4785ab84264ee8f1c56b3196b1e03d7b2acd19 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/listeners.ts @@ -0,0 +1,49 @@ +/* Copyright 2018 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +export class ListenKey { + public readonly listener: () => void; + constructor(listener: () => void) { + this.listener = listener; + } +} +const hashListeners = new Set(); +const storageListeners = new Set(); +window.addEventListener('hashchange', () => { + hashListeners.forEach((listenKey) => listenKey.listener()); +}); +// [1]: The event only triggers when another tab edits the storage. Changing a +// value in current browser tab will NOT trigger below event. +window.addEventListener('storage', () => { + storageListeners.forEach((listenKey) => listenKey.listener()); +}); +export function addHashListener(fn: () => void): ListenKey { + const key = new ListenKey(fn); + hashListeners.add(key); + return key; +} +export function addStorageListener(fn: () => void): ListenKey { + const key = new ListenKey(fn); + storageListeners.add(key); + return key; +} +export function fireStorageChanged(): void { + storageListeners.forEach((listenKey) => listenKey.listener()); +} +export function removeHashListenerByKey(key: ListenKey): void { + hashListeners.delete(key); +} +export function removeStorageListenerByKey(key: ListenKey): void { + storageListeners.delete(key); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage.ts new file mode 100644 index 0000000000000000000000000000000000000000..fcff8da99ad778cb1c5ca7b6ff05571eaa4b88a2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage.ts @@ -0,0 +1,217 @@ +/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import * as _ from 'lodash'; +import { + addHashListener, + addStorageListener, + fireStorageChanged, + ListenKey, + removeHashListenerByKey, + removeStorageListenerByKey, +} from './listeners'; +import { + componentToDict, + dictToComponent, + readComponent, + TAB_KEY, + unsetFromURI, + updateUrlDict, + writeComponent, +} from './storage_utils'; + +export { getUrlDict as getUrlHashDict } from './storage_utils'; + +/** + * The name of the property for users to set on a Polymer component + * in order for its stored properties to be stored in the URI unambiguously. + * (No need to set this if you want multiple instances of the component to + * share URI state) + * + * Example: + * + * + * The disambiguator should be set to any unique value so that multiple + * instances of the component can store properties in URI storage. + * + * Because it's hard to dereference this variable in HTML property bindings, + * it is NOT safe to change the disambiguator string without find+replace + * across the codebase. 
+ */ +export const DISAMBIGUATOR = 'disambiguator'; + +export const { + get: getString, + set: setString, + getInitializer: getStringInitializer, + getObserver: getStringObserver, + disposeBinding: disposeStringBinding, +} = makeBindings( + (x) => x, + (x) => x, +); +export const { + get: getBoolean, + set: setBoolean, + getInitializer: getBooleanInitializer, + getObserver: getBooleanObserver, + disposeBinding: disposeBooleanBinding, +} = makeBindings( + (s) => { + if (s === 'true') { + return true; + } else if (s === 'false') { + return false; + } else { + return undefined; + } + }, + (b) => b.toString(), +); +export const { + get: getNumber, + set: setNumber, + getInitializer: getNumberInitializer, + getObserver: getNumberObserver, + disposeBinding: disposeNumberBinding, +} = makeBindings( + (s) => Number(s), + (n) => n.toString(), +); +export const { + get: getObject, + set: setObject, + getInitializer: getObjectInitializer, + getObserver: getObjectObserver, + disposeBinding: disposeObjectBinding, +} = makeBindings( + (s) => JSON.parse(atob(s)) as Record, + (o) => btoa(JSON.stringify(o)), +); +export interface StorageOptions { + defaultValue?: T; + useLocalStorage?: boolean; +} +export interface AutoStorageOptions extends StorageOptions { + polymerProperty?: string; +} +export interface SetterOptions extends StorageOptions { + defaultValue?: T; + useLocalStorage?: boolean; + useLocationReplace?: boolean; +} +export function makeBindings( + fromString: (string) => T, + toString: (T) => string, +): { + get: (key: string, option?: StorageOptions) => T; + set: (key: string, value: T, option?: SetterOptions) => void; + getInitializer: (key: string, options: AutoStorageOptions) => () => T; + getObserver: (key: string, options: AutoStorageOptions) => () => void; + disposeBinding: () => void; +} { + const hashListeners: ListenKey[] = []; + const storageListeners: ListenKey[] = []; + function get(key: string, options: StorageOptions = {}): T { + const { defaultValue, 
useLocalStorage = false } = options; + const value = useLocalStorage ? window.localStorage.getItem(key) : componentToDict(readComponent())[key]; + return (value === undefined ? _.cloneDeep(defaultValue) : fromString(value)) as T; + } + function set(key: string, value: T, options: SetterOptions = {}): void { + const { defaultValue, useLocalStorage = false, useLocationReplace = false } = options; + const stringValue = toString(value); + if (useLocalStorage) { + window.localStorage.setItem(key, stringValue); + // Because of listeners.ts:[1], we need to manually notify all UI elements + // listening to storage within the tab of a change. + fireStorageChanged(); + } else if (!_.isEqual(value, get(key, { useLocalStorage }))) { + if (_.isEqual(value, defaultValue)) { + unsetFromURI(key, useLocationReplace); + } else { + const items = componentToDict(readComponent()); + items[key] = stringValue; + writeComponent(dictToComponent(items), useLocationReplace); + } + } + } + /** + * Returns a function that can be used on a `value` declaration to a Polymer + * property. It updates the `polymerProperty` when storage changes -- i.e., + * when `useLocalStorage`, it listens to storage change from another tab and + * when `useLocalStorage=false`, it listens to hashchange. + */ + function getInitializer(key: string, options: StorageOptions): () => T { + const fullOptions = { + defaultValue: options.defaultValue, + polymerProperty: key, + useLocalStorage: false, + ...options, + }; + return function () { + const uriStorageName = getURIStorageName(this, key); + // setComponentValue will be called every time the underlying storage + // changes and is responsible for ensuring that new state will propagate + // to the component with specified property. It is important that this + // function does not re-assign needlessly, to avoid Polymer observer + // churn. 
+ const setComponentValue = (): void => { + const storedValue = get(uriStorageName, fullOptions); + const currentValue = this[fullOptions.polymerProperty]; + if (!_.isEqual(storedValue, currentValue)) { + this[fullOptions.polymerProperty] = storedValue; + } + }; + const addListener = fullOptions.useLocalStorage ? addStorageListener : addHashListener; + const listenKey = addListener(() => setComponentValue()); + if (fullOptions.useLocalStorage) { + storageListeners.push(listenKey); + } else { + hashListeners.push(listenKey); + } + // Set the value on the property. + setComponentValue(); + return this[fullOptions.polymerProperty]; + }; + } + function disposeBinding(): void { + hashListeners.forEach((key) => removeHashListenerByKey(key)); + storageListeners.forEach((key) => removeStorageListenerByKey(key)); + } + function getObserver(key: string, options: StorageOptions): () => void { + const fullOptions = { + defaultValue: options.defaultValue, + polymerProperty: key, + useLocalStorage: false, + ...options, + }; + return function () { + const uriStorageName = getURIStorageName(this, key); + const newVal = this[fullOptions.polymerProperty]; + set(uriStorageName, newVal, fullOptions); + }; + } + return { get, set, getInitializer, getObserver, disposeBinding }; +} +/** + * Get a unique storage name for a (Polymer component, propertyName) tuple. + * + * DISAMBIGUATOR must be set on the component, if other components use the + * same propertyName. + */ +function getURIStorageName(component: Record, propertyName: string): string { + const d = component[DISAMBIGUATOR]; + const components = d == null ? 
[propertyName] : [d, propertyName]; + return components.join('.'); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage_utils.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage_utils.ts new file mode 100644 index 0000000000000000000000000000000000000000..59619ca08e015c2e9c586dbb3836b9bb30d478c0 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_storage/storage_utils.ts @@ -0,0 +1,121 @@ +/* Copyright 2021 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +import { getFakeHash, setFakeHash, useHash } from '../tf_globals/globals'; +import { addHashListener } from './listeners'; + +/** + * A keyword that users cannot use, since TensorBoard uses this to store info + * about the active tab. + */ +export const TAB_KEY = '__tab__'; + +export interface StringDict { + [key: string]: string; +} + +// Keep an up-to-date store of URL params, which iframed plugins can request. +let urlDict: StringDict = {}; + +export function getUrlDict(): StringDict { + return urlDict; +} + +export function updateUrlDict(dict: StringDict): void { + urlDict = dict; +} + +addHashListener(() => { + urlDict = componentToDict(readComponent()); +}); + +/** + * Read component from URI (e.g. returns "events&runPrefix=train*"). + */ +export function readComponent(): string { + return useHash() ? 
window.location.hash.slice(1) : getFakeHash(); +} + +/** + * Convert a URI Component into a dictionary of strings. + * Component should consist of key-value pairs joined by a delimiter + * with the exception of the tabName. + * Returns dict consisting of all key-value pairs and + * dict[TAB] = tabName + */ +export function componentToDict(component: string): StringDict { + const items = {} as StringDict; + const tokens = component.split('&'); + tokens.forEach((token) => { + const kv = token.split('='); + // Special backwards compatibility for URI components like #scalars. + if (kv.length === 1) { + items[TAB_KEY] = kv[0]; + } else if (kv.length === 2) { + items[decodeURIComponent(kv[0])] = decodeURIComponent(kv[1]); + } + }); + return items; +} + +/** + * Write component to URI. + */ +export function writeComponent(component: string, useLocationReplace = false): void { + if (useHash()) { + if (useLocationReplace) { + const url = new URL(window.location.href); + url.hash = component; + window.history.replaceState(window.history.state, '', url.toString()); + } else { + window.location.hash = component; + } + } else { + setFakeHash(component); + } +} + +/** + * Convert dictionary of strings into a URI Component. + * All key value entries get added as key value pairs in the component, + * with the exception of a key with the TAB value, which if present + * gets prepended to the URI Component string for backwards compatibility + * reasons. + */ +export function dictToComponent(items: StringDict): string { + let component = ''; + // Add the tab name e.g. 'events', 'images', 'histograms' as a prefix + // for backwards compatibility.
+ if (items[TAB_KEY] !== undefined) { + component += items[TAB_KEY]; + } + // Join other strings with &key=value notation + const nonTab = Object.keys(items) + .map((key) => [key, items[key]]) + .filter((pair) => pair[0] !== TAB_KEY) + .map((pair) => { + return `${encodeURIComponent(pair[0])}=${encodeURIComponent(pair[1])}`; + }) + .join('&'); + return nonTab.length > 0 ? `${component}&${nonTab}` : component; +} + +/** + * Delete a key from the URI. + */ +export function unsetFromURI(key, useLocationReplace = false): void { + const items = componentToDict(readComponent()); + delete items[key]; + writeComponent(dictToComponent(items), useLocationReplace); +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_wbr_string/tf-wbr-string.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_wbr_string/tf-wbr-string.ts new file mode 100644 index 0000000000000000000000000000000000000000..85542af4a72c0509134a297e87d617274b4e0fd6 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/tf_wbr_string/tf-wbr-string.ts @@ -0,0 +1,62 @@ +/* Copyright 2019 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+==============================================================================*/ + +import { computed, customElement, property } from '@polymer/decorators'; +import { html, PolymerElement } from '@polymer/polymer'; + +// tf-wbr-string safely renders a string, with word break elements inserted +// after substrings that match a regular expression pattern. +@customElement('tf-wbr-string') +class TfWbrString extends PolymerElement { + static readonly template = html` + + + `; + + @property({ type: String }) + value: string = ''; + + /** + * Regular expression pattern for specifying delimiters. <wbr> elements + * are inserted after all nonoverlapping matches. A match that is + * overlapped by another match further to the left is ignored. Empty + * matches will consume the remainder of the string so it is advised + * to not allow empty matches in your pattern. + */ + @property({ type: String }) + delimiterPattern: string = ''; + + @computed('value', 'delimiterPattern') + get _parts(): unknown[] { + let value = this.value; + let delimiterPattern = this.delimiterPattern; + const result: string[] = []; + while (true) { + const delimiterRegExp = new RegExp(delimiterPattern, 'g'); + delimiterRegExp.test(value); + if (delimiterRegExp.lastIndex === 0) { + result.push(value); + break; + } else { + result.push(value.slice(0, delimiterRegExp.lastIndex)); + value = value.slice(delimiterRegExp.lastIndex); + } + } + return result; + } +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/utils/index.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/utils/index.ts new file mode 100644 index 0000000000000000000000000000000000000000..dc4ec3fc6be1b0d6b2c265ba4cdb1416c6b54d48 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/utils/index.ts @@ -0,0 +1,37 @@ +/* Copyright (c) 2025, Huawei Technologies. + * All rights reserved.
+ * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +const removePrototypePollution = (obj: any): void => { + if (obj && typeof obj === 'object') { + for (let key in obj) { + if (key === '__proto__' || key === 'constructor') { + delete obj[key]; + } else if (typeof obj[key] === 'object') { + removePrototypePollution(obj[key]); + } + } + } +}; + +export const safeJSONParse = (str: any, defaultValue: any = null): any => { + try { + const res = JSON.parse(str); + removePrototypePollution(res); + return res; + } catch (error) { + return defaultValue; + } +}; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/vz_sorting/sorting.ts b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/vz_sorting/sorting.ts new file mode 100644 index 0000000000000000000000000000000000000000..b2589a3d89ae1a04ed2979af08258e604f1219c0 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/src/vz_sorting/sorting.ts @@ -0,0 +1,71 @@ +/* Copyright 2016 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ +/** + * Compares tag names asciinumerically broken into components. + * + *

This is the comparison function used for sorting most string values in + * TensorBoard. Unlike the standard asciibetical comparator, this function + * knows that 'a10b' > 'a2b'. Fixed point and engineering notation are + * supported. This function also splits the input by slash and underscore to + * perform array comparison. Therefore it knows that 'a/a' < 'a+/a' even + * though '+' < '/' in the ASCII table. + */ +export function compareTagNames(a: string, b: string): number { + let ai = 0; + let bi = 0; + while (true) { + // Handle end of strings + if (ai === a.length) { + return bi === b.length ? 0 : -1; + } + if (bi === b.length) { + return 1; + } + + // Check for digits + if (isDigit(a[ai]) && isDigit(b[bi])) { + const ais = ai; + const bis = bi; + + // Consume all consecutive digits (simplified from original) + while (ai < a.length && isDigit(a[ai])) ai++; + while (bi < b.length && isDigit(b[bi])) bi++; + + const an = parseInt(a.slice(ais, ai), 10); + const bn = parseInt(b.slice(bis, bi), 10); + + if (an !== bn) { + return an - bn; + } + continue; + } + + // Regular character comparison + if (a[ai] < b[bi]) { + return -1; + } + if (a[ai] > b[bi]) { + return 1; + } + + ai++; + bi++; + } +} + +// Simplified digit check +function isDigit(c: string): boolean { + return c >= '0' && c <= '9'; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/tsconfig.json b/plugins/tensorboard-plugins/tb_graph_ascend/fe/tsconfig.json new file mode 100644 index 0000000000000000000000000000000000000000..89892c56201d5eebbd4c86233afb1ad5d1717e20 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/tsconfig.json @@ -0,0 +1,24 @@ +{ + "compilerOptions": { + "downlevelIteration": true, + "emitDecoratorMetadata": true, + "exactOptionalPropertyTypes": true, + "experimentalDecorators": true, + "importHelpers": true, + "inlineSourceMap": true, + "lib": [ + "dom", + "ES2022", + "dom.iterable" + ], + "noImplicitAny": false, + "moduleResolution": "node", + "module": 
"ES2022", + "noFallthroughCasesInSwitch": true, + "noImplicitReturns": true, + "noImplicitOverride": true, + "skipLibCheck": true, + "strict": true, + "target": "es2018" + } +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.config.js b/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.config.js new file mode 100644 index 0000000000000000000000000000000000000000..40c723a472539805f8f663c91c6d66e413a024d9 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.config.js @@ -0,0 +1,66 @@ +/* ------------------------------------------------------------------------- + Copyright (c) 2025, Huawei Technologies. + All rights reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--------------------------------------------------------------------------------------------*/ + +const path = require('path'); +const HtmlWebpackPlugin = require('html-webpack-plugin'); +const { CleanWebpackPlugin } = require('clean-webpack-plugin'); +const InlineChunkHtmlPlugin = require('inline-chunk-html-plugin'); + +module.exports = { + entry: { + app: './src/index', + }, + output: { + filename: 'index.js', + path: path.resolve(__dirname, 'dist'), + }, + module: { + rules: [ + { + test: /\.(html)$/, + use: { + loader: 'html-loader', + }, + }, + { + test: /\.ts?$/, + use: { + loader: 'ts-loader', + options: { + transpileOnly: true, + }, + }, + exclude: /node_modules/, + }, + { + test: /\.css$/i, + use: ['style-loader', 'css-loader'], + }, + ], + }, + resolve: { + extensions: ['.ts', '.js'], + }, + plugins: [ + new CleanWebpackPlugin(), + new HtmlWebpackPlugin({ + inject: 'body', + template: './index.html', + }), + new InlineChunkHtmlPlugin(HtmlWebpackPlugin, [/.*/]), + ], +}; diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.dev.js b/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.dev.js new file mode 100644 index 0000000000000000000000000000000000000000..fef5808dfe451c8c1a545fa4e0db7a8ac2381d07 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/fe/webpack.dev.js @@ -0,0 +1,147 @@ +/* ------------------------------------------------------------------------- + Copyright (c) 2025, Huawei Technologies. + All rights reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. +--------------------------------------------------------------------------------------------*/ +const path = require('path'); +const HtmlWebpackPlugin = require('html-webpack-plugin'); + +module.exports = { + mode: 'development', // 明确指定开发模式 + devtool: 'eval-cheap-source-map', // 开发环境推荐使用的 source map 类型 + + entry: { + app: './src/index', // 保持与生产环境一致的入口 + }, + + output: { + filename: '[name].bundle.js', // 使用带 hash 的文件名 + path: path.resolve(__dirname, 'dist'), + publicPath: '/', // 确保 dev server 的静态资源路径正确 + }, + + devServer: { + static: { + directory: path.join(__dirname, 'dist'), // 服务资源目录 + }, + proxy: [ + { + context: (pathname) => { + return !pathname.match(/(\.js|\.css|\.html|\.ico|\.svg)$/); + }, + target: 'http://127.0.0.1:6006', + changeOrigin: true, + secure: false, + pathRewrite: { + '^/(.*)': '/data/plugin/graph_ascend/$1', // 路径转换核心逻辑 + }, + on: { + error: (err, req, res) => { + // 安全处理响应对象 + console.error(`[HPM] 代理错误: ${err.message}`); + if (res && !res.headersSent) { + res.writeHead(500, { 'Content-Type': 'text/plain' }); + res.end('Proxy Error'); + } + }, + proxyReqWs: (proxyReq, req, socket) => { + // WebSocket 错误专属处理 + socket.on('error', (error) => { + console.error('[HPM] WebSocket 错误:', error.message); + }); + }, + }, + }, + ], + webSocketServer: { + type: 'ws', + options: { + path: '/ws', + noInfo: true, + }, + }, + http2: true, // 推荐启用HTTP/2 + https: false, // 根据实际需要配置 + hot: true, // 启用热模块替换 + liveReload: true, // 启用实时重新加载 + port: 8080, // 自定义端口号 + open: true, // 自动打开浏览器 + client: { + overlay: { + errors: true, + warnings: false, + }, // 在浏览器中显示编译错误 + }, + headers: { + 'X-Proxy-Source': 'webpack-dev-server', + }, + }, + + module: { + rules: [ + { + test: /\.html$/, + use: [ + { + loader: 'html-loader', + options: { + sources: false, // 开发环境不需要优化资源路径 + }, + }, + ], + }, + { + test: /\.ts?$/, + use: { + loader: 'ts-loader', + options: { + 
transpileOnly: true, // keep compilation fast + experimentalWatchApi: true, // enable TypeScript's watch API + }, + }, + exclude: /node_modules/, + }, + { + test: /\.css$/i, + use: [ + 'style-loader', + { + loader: 'css-loader', + options: { + sourceMap: true, // enable CSS source maps + }, + }, + ], + }, + ], + }, + + resolve: { + extensions: ['.ts', '.js', '.json'], // also resolve the .json extension + }, + + plugins: [ + new HtmlWebpackPlugin({ + inject: 'body', + template: './index.html', + minify: false, // do not minify HTML in development + }), + ], + + optimization: { + removeAvailableModules: false, + removeEmptyChunks: false, + splitChunks: false, // disable code splitting to speed up compilation + }, +}; \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/__init__.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/__init__.py similarity index 62% rename from profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/__init__.py rename to plugins/tensorboard-plugins/tb_graph_ascend/server/__init__.py index 7101187a2c2619f3b1c20dded14b433950b4c662..ee2432f470b406bca849a0d9362b8e396a7e21b2 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/__init__.py +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/__init__.py @@ -1,14 +1,15 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. # -# Licensed under the Apache License, Version 2.0 (the "License"); +# Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # -# http://www.apache.org/licenses/LICENSE-2.0 +# http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # limitations under the License. +# ============================================================================== diff --git a/profiler/msprof_analyze/precheck/distributed_cluster/distributed_cluster_base.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/__init__.py similarity index 62% rename from profiler/msprof_analyze/precheck/distributed_cluster/distributed_cluster_base.py rename to plugins/tensorboard-plugins/tb_graph_ascend/server/app/__init__.py index 7ccd1e542eee2050542a08df62e1720a9cdf4dcb..ee2432f470b406bca849a0d9362b8e396a7e21b2 100644 --- a/profiler/msprof_analyze/precheck/distributed_cluster/distributed_cluster_base.py +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/__init__.py @@ -1,19 +1,15 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. # -# Licensed under the Apache License, Version 2.0 (the "License"); +# Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # -# http://www.apache.org/licenses/LICENSE-2.0 +# http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
- - -class DistributedClusterBase: - def __init__(self): - pass +# ============================================================================== diff --git a/profiler/msprof_analyze/precheck/distributed_cluster/__init__.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/__init__.py similarity index 57% rename from profiler/msprof_analyze/precheck/distributed_cluster/__init__.py rename to plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/__init__.py index b14094e3f9a77a0970342980ed8de1017f58ce19..ee2432f470b406bca849a0d9362b8e396a7e21b2 100644 --- a/profiler/msprof_analyze/precheck/distributed_cluster/__init__.py +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/__init__.py @@ -1,14 +1,15 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. # -# Licensed under the Apache License, Version 2.0 (the "License"); +# Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # -# http://www.apache.org/licenses/LICENSE-2.0 +# http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and -# limitations under the License. \ No newline at end of file +# limitations under the License. 
+# ============================================================================== diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py new file mode 100644 index 0000000000000000000000000000000000000000..9517d252ca206722497d27ad10c2a4f9bd3c5222 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/controllers/match_nodes_controller.py @@ -0,0 +1,263 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== + +from ..utils.graph_utils import GraphUtils +from ..utils.global_state import ADD_MATCH_KEYS + + +class MatchNodesController: + + @staticmethod + def process_md5_task_add(graph_data, npu_node_name, bench_node_name): + npu_match_nodes_list = graph_data.get('npu_match_nodes', {}) + bench_match_nodes_list = graph_data.get('bench_match_nodes', {}) + npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {}) + bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {}) + # 去除节点名称前缀 + npu_input_data = GraphUtils.remove_prefix(npu_node_data.get('input_data', {}), npu_node_name + '.') + bench_input_data = GraphUtils.remove_prefix(bench_node_data.get('input_data', {}), bench_node_name + '.') + npu_output_data = GraphUtils.remove_prefix(npu_node_data.get('output_data', {}), npu_node_name + '.') + bench_output_data = GraphUtils.remove_prefix(bench_node_data.get('output_data', {}), bench_node_name + '.') + # 计算精度误差 + precision_input_error = MatchNodesController.calculate_md5_diff(npu_input_data, bench_input_data) + precision_output_error = MatchNodesController.calculate_md5_diff(npu_output_data, bench_output_data) + precision_error = precision_input_error and precision_output_error + # 在原始数据上,添加匹配节点,和匹配节点信息 + npu_node_data['matched_node_link'] = [bench_node_name] + bench_node_data['matched_node_link'] = [npu_node_name] + # 防止 KeyError 或 TypeError + npu_node_data.setdefault('data', {})['precision_index'] = precision_error + # 后端维护一个匹配节点列表,前端展示 + npu_match_nodes_list[npu_node_name] = bench_node_name + bench_match_nodes_list[bench_node_name] = npu_node_name + + graph_data['npu_match_nodes'] = npu_match_nodes_list + graph_data['bench_match_nodes'] = bench_match_nodes_list + return { + 'success': True, + 'data': { + 'precision_error': precision_error, + }, + } + + @staticmethod + def process_md5_task_delete(graph_data, npu_node_name, bench_node_name): 
+ npu_match_nodes_list = graph_data.get('npu_match_nodes', {}) + bench_match_nodes_list = graph_data.get('bench_match_nodes', {}) + npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {}) + bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {}) + # 在原始数据上,删除匹配节点,和匹配节点信息 + npu_node_data['matched_node_link'] = [] + bench_node_data['matched_node_link'] = [] + # 后端维护一个匹配节点列表,前端展示 + try: + del npu_node_data['data']['precision_index'] + del npu_match_nodes_list[npu_node_name] + del bench_match_nodes_list[bench_node_name] + except KeyError: + return { + 'success': False, + 'error': "操作失败:删除节点信息失败", + } + graph_data['npu_match_nodes'] = npu_match_nodes_list + graph_data['bench_match_nodes'] = bench_match_nodes_list + return { + 'success': True, + 'data': {}, + } + + @staticmethod + def process_summary_task_add(graph_data, npu_node_name, bench_node_name): + npu_match_nodes_list = graph_data.get('npu_match_nodes', {}) + bench_match_nodes_list = graph_data.get('bench_match_nodes', {}) + npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name) + bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name) + # 计算统计误差 + intput_statistical_diff = MatchNodesController.calculate_statistical_diff( + npu_node_data.get('input_data'), bench_node_data.get('input_data'), npu_node_name, bench_node_name + ) + output_statistical_diff = MatchNodesController.calculate_statistical_diff( + npu_node_data.get('output_data'), bench_node_data.get('output_data'), npu_node_name, bench_node_name + ) + # 计算精度误差 + precision_error = MatchNodesController.calculate_max_relative_error(output_statistical_diff) + # 有一个没有匹配上,则认为匹配失败 + if not intput_statistical_diff or not output_statistical_diff: + return { + 'success': False, + 'error': '输入或输出统计误差值为空(Input and output statistical error calculation failed)', + } + + if precision_error == -1: + return { + 'success': False, + 'error': '输出统计误差值为空,计算精度误差失败(Calculation 
of precision error failed)', + } + # 在原始数据上,添加匹配节点,和匹配节点信息 + npu_node_data['matched_node_link'] = [bench_node_name] + bench_node_data['matched_node_link'] = [npu_node_name] + MatchNodesController.update_graph_node_data(npu_node_data.get('input_data'), intput_statistical_diff) + MatchNodesController.update_graph_node_data(npu_node_data.get('output_data'), output_statistical_diff) + # 防止 KeyError 或 TypeError + npu_node_data.setdefault('data', {})['precision_index'] = precision_error + # 后端维护一个匹配节点列表,前端展示 + npu_match_nodes_list[npu_node_name] = bench_node_name + bench_match_nodes_list[bench_node_name] = npu_node_name + graph_data['npu_match_nodes'] = npu_match_nodes_list + graph_data['bench_match_nodes'] = bench_match_nodes_list + return { + 'success': True, + 'data': { + 'precision_error': precision_error, + 'intput_statistical_diff': intput_statistical_diff, + 'output_statistical_diff': output_statistical_diff, + }, + } + + @staticmethod + def process_summary_task_delete(graph_data, npu_node_name, bench_node_name): + npu_match_nodes_list = graph_data.get('npu_match_nodes', {}) + bench_match_nodes_list = graph_data.get('bench_match_nodes', {}) + npu_node_data = graph_data.get('NPU', {}).get('node', {}).get(npu_node_name, {}) + bench_node_data = graph_data.get('Bench', {}).get('node', {}).get(bench_node_name, {}) + # 在原始数据上,删除匹配节点,和匹配节点信息 + npu_node_data['matched_node_link'] = [] + bench_node_data['matched_node_link'] = [] + + MatchNodesController.delete_matched_node_data(npu_node_data.get('input_data')) + MatchNodesController.delete_matched_node_data(npu_node_data.get('output_data')) + # 后端维护一个匹配节点列表,前端展示 + try: + # 防止 KeyError 或 TypeError + npu_node_data.get('data', {}).pop('precision_index', None) + del npu_match_nodes_list[npu_node_name] + del bench_match_nodes_list[bench_node_name] + except KeyError: + return { + 'success': False, + 'error': "操作失败:删除节点信息失败", + } + graph_data['npu_match_nodes'] = npu_match_nodes_list + graph_data['bench_match_nodes'] = 
bench_match_nodes_list + return { + 'success': True, + 'data': {}, + } + + @staticmethod + def calculate_statistical_diff(npu_data, bench_data, npu_node_name, bench_node_name): + result = {} + # 去除节点名称前缀并转化为列表形式 + npu_data_simple = list(GraphUtils.remove_prefix(npu_data, npu_node_name + '.').values()) + bench_data_simple = list(GraphUtils.remove_prefix(bench_data, bench_node_name + '.').values()) + npu_data_keys = list(GraphUtils.remove_prefix(npu_data, npu_node_name + '.').keys()) + # 使用 zip 只对比最短的列表长度 + for npu_values, bench_values in zip(npu_data_simple, bench_data_simple): + npu_max = GraphUtils.convert_to_float(npu_values.get('Max', float('nan'))) + bench_max = GraphUtils.convert_to_float(bench_values.get('Max', float('nan'))) + npu_min = GraphUtils.convert_to_float(npu_values.get('Min', float('nan'))) + bench_min = GraphUtils.convert_to_float(bench_values.get('Min', float('nan'))) + npu_norm = GraphUtils.convert_to_float(npu_values.get('Norm', float('nan'))) + bench_norm = GraphUtils.convert_to_float(bench_values.get('Norm', float('nan'))) + npu_mean = GraphUtils.convert_to_float(npu_values.get('Mean', float('nan'))) + bench_mean = GraphUtils.convert_to_float(bench_values.get('Mean', float('nan'))) + + # Calculate absolute differences + max_diff_abs = abs(npu_max - bench_max) + min_diff_abs = abs(npu_min - bench_min) + norm_diff_abs = abs(npu_norm - bench_norm) + mean_diff_abs = abs(npu_mean - bench_mean) + + # Calculate relative differences (avoid division by zero) + max_diff_rel = abs(max_diff_abs / (bench_max if bench_max != 0 else 1)) + min_diff_rel = abs(min_diff_abs / (bench_min if bench_min != 0 else 1)) + norm_diff_rel = abs(norm_diff_abs / (bench_norm if bench_norm != 0 else 1)) + mean_diff_rel = abs(mean_diff_abs / (bench_mean if bench_mean != 0 else 1)) + + # 将结果记录到字典中 + result[npu_node_name + '.' 
+ npu_data_keys[len(result)]] = dict( + zip( + ADD_MATCH_KEYS, + [ + max_diff_abs, + min_diff_abs, + mean_diff_abs, + norm_diff_abs, + max_diff_rel, + min_diff_rel, + mean_diff_rel, + norm_diff_rel, + ], + ) + ) + + return result + + # Compute the maximum relative error + @staticmethod + def calculate_max_relative_error(result): + max_rel_error = -1 + for _, diff_values in result.items(): + max_diff_rel = diff_values.get('MaxRelativeErr', float('nan')) + min_diff_rel = diff_values.get('MinRelativeErr', float('nan')) + norm_diff_rel = diff_values.get('NormRelativeErr', float('nan')) + mean_diff_rel = diff_values.get('MeanRelativeErr', float('nan')) + + max_rel_error_for_key = max(max_diff_rel, min_diff_rel, norm_diff_rel, mean_diff_rel) + max_rel_error = max(max_rel_error, max_rel_error_for_key) + + return min(max_rel_error, 1) + + @staticmethod + def calculate_md5_diff(npu_data, bench_data): + if npu_data == {} or bench_data == {}: + return 0 + # Compare the md5 of every NPU entry with its Bench counterpart; return 0 on any mismatch, otherwise 1 + for npu_key, bench_key in zip(npu_data, bench_data): + npu_md5 = npu_data[npu_key].get('md5', '') + bench_md5 = bench_data[bench_key].get('md5', '') + if npu_md5 != bench_md5: + return 0 + return 1 + + @staticmethod + def update_graph_node_data(graph_npu_node_data, statistical_diff): + if not statistical_diff or not graph_npu_node_data: + return + for key, diff_values in statistical_diff.items(): + # Format the relative-error fields + for field in ['MaxRelativeErr', 'MinRelativeErr', 'NormRelativeErr', 'MeanRelativeErr']: + diff_values[field] = GraphUtils.format_relative_err(diff_values.get(field, float('nan'))) + + # Convert NaN absolute errors to the string 'NaN' + for field in ['MaxAbsErr', 'MinAbsErr', 'MeanAbsErr', 'NormAbsErr']: + diff_values[field] = GraphUtils.nan_to_str(diff_values.get(field, float('nan'))) + if key in graph_npu_node_data: + graph_npu_node_data[key].update(diff_values) + else: + graph_npu_node_data[key] = diff_values + + @staticmethod + def delete_matched_node_data(graph_npu_node_data): + keys_to_remove = ADD_MATCH_KEYS + #
Iterate over each top-level key of graph_npu_node_data and its sub-dictionary + for key, field_obj in graph_npu_node_data.items(): + # Build a new sub-dictionary with a dict comprehension, dropping the unwanted keys + graph_npu_node_data[key] = { + sub_key: value + for sub_key, value in field_obj.items() + if sub_key not in keys_to_remove + } diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/__init__.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..ee2432f470b406bca849a0d9362b8e396a7e21b2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/graph_service.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/graph_service.py new file mode 100644 index 0000000000000000000000000000000000000000..a6ed0fc58631bc1070d90333bb1d9dec14be0d84 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/service/graph_service.py @@ -0,0 +1,111 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +from ..utils.graph_utils import GraphUtils +from ..utils.global_state import get_global_value +from ..controllers.match_nodes_controller import MatchNodesController + + +class GraphService: + + @staticmethod + def get_node_info(node_info, meta_data): + graph_data, error_message = GraphUtils.get_graph_data(meta_data) + if error_message: + return {'success': False, 'error': error_message} + + node_type = node_info.get('nodeType') + node_name = node_info.get('nodeName') + node_details = graph_data.get(node_type, {}).get('node', {}).get(node_name) + return {'success': True, 'data': node_details} + + @staticmethod + def add_match_nodes(npu_node_name, bench_node_name, meta_data): + graph_data, error_message = GraphUtils.get_graph_data(meta_data) + if error_message: + return {'success': False, 'error': error_message} + + task = graph_data.get('task') + # 根据任务类型计算误差 + if task == 'md5': + result = MatchNodesController.process_md5_task_add(graph_data, npu_node_name, bench_node_name) + return result + elif task == 'summary': + result = MatchNodesController.process_summary_task_add(graph_data, npu_node_name, bench_node_name) + return result + else: + return {'success': False, 'error': '任务类型不支持(Task type not supported) '} + + @staticmethod + def delete_match_nodes(npu_node_name, bench_node_name, meta_data): + graph_data, error_message = GraphUtils.get_graph_data(meta_data) + if error_message: + return {'success': False, 'error': error_message} + + task = graph_data.get('task') + # 
Dispatch the delete operation by task type + if task == 'md5': + result = MatchNodesController.process_md5_task_delete(graph_data, npu_node_name, bench_node_name) + return result + elif task == 'summary': + result = MatchNodesController.process_summary_task_delete(graph_data, npu_node_name, bench_node_name) + return result + else: + return {'success': False, 'error': '任务类型不支持(Task type not supported) '} + + @staticmethod + def get_matched_state_list(meta_data): + graph_data, error_message = GraphUtils.get_graph_data(meta_data) + if error_message: + return {'success': False, 'error': error_message} + + npu_match_nodes_list = graph_data.get('npu_match_nodes', {}) + bench_match_nodes_list = graph_data.get('bench_match_nodes', {}) + return { + 'success': True, + 'data': { + 'npu_match_nodes': npu_match_nodes_list, + 'bench_match_nodes': bench_match_nodes_list, + }, + } + + @staticmethod + def update_colors(run, colors): + """Set new colors in jsondata.""" + try: + first_run_tag = get_global_value("first_run_tag", {}).get(run) + first_file_data, error = GraphUtils.safe_load_data(run, first_run_tag) + if error: + return {'success': False, 'error': error} + first_file_data['Colors'] = colors + _, error = GraphUtils.safe_save_data(first_file_data, run, first_run_tag) + return {'success': error is None, 'error': error, 'data': {}} + except Exception as e: + return {'success': False, 'error': str(e), 'data': None} + + @staticmethod + def save_data(meta_data): + graph_data, error_message = GraphUtils.get_graph_data(meta_data) + if error_message: + return {'success': False, 'error': error_message} + + run = meta_data.get('run') + tag = meta_data.get('tag') + try: + _, error = GraphUtils.safe_save_data(graph_data, run, tag) + except (ValueError, IOError, PermissionError) as e: + return {'success': False, 'error': f"Error: {e}"} + return {'success': False, 'error': error} if error else {'success': True} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/__init__.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/__init__.py new file mode 100644 index
0000000000000000000000000000000000000000..ee2432f470b406bca849a0d9362b8e396a7e21b2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py new file mode 100644 index 0000000000000000000000000000000000000000..0ae16ef82f2d1e817bb0256db3e96b154286ed2c --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/global_state.py @@ -0,0 +1,70 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== +import re + +# Module-level global state + +ADD_MATCH_KEYS = [ + 'MaxAbsErr', + 'MinAbsErr', + 'MeanAbsErr', + 'NormAbsErr', + 'MaxRelativeErr', + 'MinRelativeErr', + 'MeanRelativeErr', + 'NormRelativeErr', +] +FILE_NAME_REGEX = r'^[a-zA-Z0-9_\-\.]+$' # regular expression for valid file names + +_state = {'logdir': '', 'current_tag': '', 'current_run': '', 'current_file_path': '', 'current_file_data': {}, + 'runs': {}} + + +def init_defaults(): + """ + Initialize the global state with its default values. + """ + global _state + _state = {'logdir': '', 'current_tag': '', 'current_run': '', 'current_file_path': '', 'current_file_data': {}, + 'runs': {}} + + +def set_global_value(key, value, inner_key=None): + """ + Set a value in the global state. + If inner_key is given, set that field inside the dict stored under key, creating the dict if key is missing. + """ + if inner_key is None: + _state[key] = value + else: + if key in _state: + _state[key][inner_key] = value + else: + _state[key] = {inner_key: value} + + +def get_global_value(key, default=None): + """ + Get a value from the global state. + """ + return _state.get(key, default) + + +def reset_global_state(): + """ + Reset all global state to its default values. + """ + init_defaults() diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..1c3e153750a963d0b4dcd42d7942b6ad5bfa16c1 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/utils/graph_utils.py @@ -0,0 +1,190 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +import math +import os +import json +import re +import stat + +from tensorboard.util import tb_logging +from .global_state import get_global_value, set_global_value, FILE_NAME_REGEX + +logger = tb_logging.get_logger() +MAX_FILE_SIZE = 1024 * 1024 * 1024 # 最大文件大小限制为1GB + + +class GraphUtils: + + @staticmethod + def get_graph_data(meta_data): + run = meta_data.get('run') + tag = meta_data.get('tag') + current_tag = get_global_value('current_tag') + current_run = get_global_value('current_run') + if current_run == run and current_tag == tag: + return get_global_value('current_file_data'), None # 直接返回获取结果 + else: + json_data, error_message = GraphUtils.safe_load_data(run, tag) + if error_message: + return None, error_message + set_global_value('current_file_data', json_data) + set_global_value('current_tag', tag) + set_global_value('current_run', run) + return json_data, error_message + + @staticmethod + def remove_prefix(node_data, prefix): + if node_data is None: + return {} + return {k[len(prefix):] if k.startswith(prefix) else k: v for k, v in node_data.items()} + + @staticmethod + def convert_to_float(value): + try: + return float(value) + except ValueError: + return float('nan') + + @staticmethod + def format_relative_err(value): + """格式化相对误差为百分比,保留四位小数""" + if value is None or math.isnan(value): + return "NaN" + else: + return "{:.4%}".format(value) + + @staticmethod + def nan_to_str(value): + """将 NaN 转换为 'NaN' 字符串""" + return "NaN" if math.isnan(value) else value + + 
@staticmethod + def is_relative_to(path, base): + abs_path = os.path.abspath(path) + abs_base = os.path.abspath(base) + return os.path.commonpath([abs_path, abs_base]) == str(abs_base) + + @staticmethod + def safe_save_data(data, run_name, tag): + safe_base_dir = get_global_value('logdir') + run = get_global_value('runs', {}).get(run_name) + if run is None or tag is None: + error_message = 'The query parameters "run" and "tag" are required' + return None, error_message + try: + # 构建文件路径并标准化 + file_path = os.path.join(run, f"{tag}.vis") + file_path = os.path.normpath(file_path) + # 权限校验:检查目录是否有写权限 + if not os.access(run, os.W_OK): + raise PermissionError(f"No write permission for directory: {run}\n") + # 检查 run 目录是否存在,如果不存在则创建 + if not os.path.exists(run): + os.makedirs(run, exist_ok=True) + os.chmod(run, 0o750) + # 检查 tag 是否为合法文件名 + if not re.match(FILE_NAME_REGEX, tag): + raise ValueError(f"Invalid tag: {tag}.") + # 检查文件路径是否合法,防止路径遍历攻击 + if not file_path.startswith(os.path.abspath(run)): + raise ValueError(f"Invalid file path: {file_path}. Potential path traversal attack.\n") + # 基础路径校验 + if not GraphUtils.is_relative_to(file_path, safe_base_dir): + raise ValueError(f"Path out of bounds: {file_path}") + if os.path.islink(file_path): + raise RuntimeError("The target file is a symbolic link") + if os.path.islink(run): + raise RuntimeError(f"Parent directory contains a symbolic link") + if not os.path.isfile(file_path): + raise RuntimeError("The target path is not a regular file") + # # 尝试写入文件 + with open(file_path, "w", encoding="utf-8") as file: + json.dump(data, file, ensure_ascii=False, indent=4) + os.chmod(file_path, 0o640) + # 最终校验(防御TOCTOU攻击) + if os.path.islink(file_path): + raise RuntimeError("The file has been replaced with a symbolic link") + return True, None + except (TypeError, ValueError) as e: + logger.error(f"Invalid data: {e}") + return None, 'Invalid data' + except OSError as e: + logger.error(f"Failed to create directory: {run}. 
Error: {e}\n") + return None, f'failed to create directory {run}' + except Exception as e: + logger.error(f'Error: File "{file_path}" is not accessible. Error: {e}') + return None, 'failed to save file' + + @staticmethod + def safe_load_data(run_name, tag, only_check=False): + """Load a single .vis file from a given directory based on the tag.""" + run_dir = get_global_value('runs', {}).get(run_name) + if run_dir is None or tag is None: + error_message = 'The query parameters "run" and "tag" are required' + return None, error_message + try: + file_path = os.path.join(run_dir, f"{tag}.vis") + file_path = os.path.normpath(file_path) # normalize the path + # Resolve the real path (follows symbolic links) + real_path = os.path.realpath(file_path) + safe_base_dir = get_global_value('logdir') + # Check 1: the path must stay inside the log directory + if not GraphUtils.is_relative_to(file_path, safe_base_dir): + raise RuntimeError(f"Path out of bounds: {file_path}") + # Check 2: reject symbolic links + if os.path.islink(file_path): + raise RuntimeError("Detected symbolic link file") + if os.path.islink(run_dir): + raise RuntimeError("Parent directory contains a symbolic link") + # Check 3: the file must exist + if not os.path.exists(real_path): + raise FileNotFoundError("File does not exist") + # Check 4: the resolved path must be a regular file (guards against TOCTOU) + if not os.path.isfile(real_path): + raise RuntimeError("Path is not a regular file") + # Permission check + if not os.stat(real_path).st_mode & stat.S_IRUSR: + raise PermissionError("File has no read permissions") + # File size check + if os.path.getsize(real_path) > MAX_FILE_SIZE: + raise RuntimeError(f"File size exceeds limit ({os.path.getsize(real_path)} > {MAX_FILE_SIZE} bytes)") + # Reading the file is slow; with only_check set, run the safety checks only + if only_check: + return True, None + # Parse the JSON file to verify that its content is well-formed + with open(file_path, 'r', encoding='utf-8') as f: + return json.load(f), None + except json.JSONDecodeError: + logger.error(f'Error: File "{file_path}" is not a valid JSON file!') + return None, "File is not a valid JSON file!"
+ except Exception as e: + logger.error(f'Error: File "{file_path}" is not accessible. Error: {e}') + return None, 'failed to load file' + + @staticmethod + def process_vis_file(dir_path, file, run_tag_pairs): + file_path = os.path.join(dir_path, file) + if os.path.isfile(file_path) and file.endswith('.vis'): + run = dir_path + run_name = os.path.basename(run) + set_global_value('runs', run, run_name) + tag = file[:-4] # Use the filename without extension as tag + _, error = GraphUtils.safe_load_data(run_name, tag, True) + if error: + logger.error(f'Error: File run:"{run}, tag:{tag}" is not accessible. Error: {error}') + return + run_tag_pairs.setdefault(run_name, []).append(tag) diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/__init__.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..ee2432f470b406bca849a0d9362b8e396a7e21b2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py new file mode 100644 index 0000000000000000000000000000000000000000..107068cc3dae94a4c3da7787c1c03d83b32451f2 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/app/views/graph_views.py @@ -0,0 +1,115 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== + +import json +from werkzeug import wrappers +from tensorboard.backend import http_util +from ..service.graph_service import GraphService + + +class GraphView: + + # Get the info-panel data for the node corresponding to the current node + @staticmethod + @wrappers.Request.application + def get_node_info(request): + try: + node_info = json.loads(request.args.get("nodeInfo")) + meta_data = json.loads(request.args.get("metaData")) + node_detail = GraphService.get_node_info(node_info, meta_data) + except (TypeError, json.JSONDecodeError): + node_detail = {'success': False, + 'error': 'GetNodeInfo failed: The query parameters are not in a legal JSON format.'} + except Exception as e: + node_detail = {'success': False, 'error': str(e)} + return http_util.Respond(request, node_detail, "application/json") + + # Add a node match + @staticmethod + @wrappers.Request.application + def add_match_nodes(request): + try: + npu_node_name = json.loads(request.args.get("npuNodeName")) + bench_node_name = json.loads(request.args.get("benchNodeName")) + meta_data = json.loads(request.args.get("metaData")) + match_result = GraphService.add_match_nodes(npu_node_name, bench_node_name, meta_data) + except (TypeError, json.JSONDecodeError): + match_result = {'success': False, + 'error': 'AddMatchNodes failed: The query parameters are not in a legal JSON format.'} + except Exception as e: + match_result = {'success': False, 'error': str(e)} + return http_util.Respond(request, match_result, "application/json") + + # Remove a node match + @staticmethod + @wrappers.Request.application + def delete_match_nodes(request): + try: + npu_node_name = json.loads(request.args.get("npuNodeName")) + bench_node_name = json.loads(request.args.get("benchNodeName")) + meta_data = json.loads(request.args.get("metaData")) + match_result = GraphService.delete_match_nodes(npu_node_name, bench_node_name, meta_data) + except (TypeError, json.JSONDecodeError): + match_result = {'success': False, + 'error':
'DeleteMatchNodes failed: The query parameters are not in a legal JSON format.'} + except Exception as e: + match_result = {'success': False, 'error': str(e)} + return http_util.Respond(request, match_result, "application/json") + + # Get the list of matched nodes + @staticmethod + @wrappers.Request.application + def get_matched_state_list(request): + try: + meta_data = json.loads(request.args.get("metaData")) + matched_state_list = GraphService.get_matched_state_list(meta_data) + except (TypeError, json.JSONDecodeError): + matched_state_list = { + 'success': False, + 'error': 'GetMatchedStateList failed: The query parameters are not in a legal JSON format.' + } + except Exception as e: + matched_state_list = {'success': False, 'error': str(e)} + return http_util.Respond(request, matched_state_list, "application/json") + + # Save the list of matched nodes + @staticmethod + @wrappers.Request.application + def save_data(request): + try: + meta_data = json.loads(request.args.get("metaData")) + save_result = GraphService.save_data(meta_data) + except (TypeError, json.JSONDecodeError): + save_result = {'success': False, + 'error': 'SaveData failed: The query parameters are not in a legal JSON format.'} + except Exception as e: + save_result = {'success': False, 'error': str(e)} + return http_util.Respond(request, save_result, "application/json") + + # Update the color settings + @staticmethod + @wrappers.Request.application + def update_colors(request): + try: + run = json.loads(request.args.get('run')) + colors = json.loads(request.args.get('colors')) + update_result = GraphService.update_colors(run, colors) + except (TypeError, json.JSONDecodeError): + update_result = {'success': False, + 'error': 'UpdateColors failed: The query parameters are not in a legal JSON format.'} + except Exception as e: + update_result = {'success': False, 'error': str(e)} + return http_util.Respond(request, update_result, "application/json") diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/constants.py
b/plugins/tensorboard-plugins/tb_graph_ascend/server/constants.py new file mode 100644 index 0000000000000000000000000000000000000000..f988dc104d4985681fa37bc423b9103d092fb161 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/constants.py @@ -0,0 +1,53 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +MAX_FILE_SIZE = 1024 * 1024 * 1024 +PLUGIN_NAME = 'graph_ascend' +PLUGIN_NAME_RUN_METADATA_WITH_GRAPH = 'graph_ascend_run_metadata_graph' +SETS = { + 'Bench': ('Bench', 'B___', 'N___'), + 'NPU': ('NPU', 'N___', 'B___'), + 'B___': ('Bench', 'N___'), + 'N___': ('NPU', 'B___'), +} +NA_DATA = [ + ['Max diff', 'N/A'], + ['Min diff', 'N/A'], + ['Mean diff', 'N/A'], + ['L2norm diff', 'N/A'], + ['MaxRelativeErr', 'N/A'], + ['MinRelativeErr', 'N/A'], + ['MeanRelativeErr', 'N/A'], + ['NormRelativeErr', 'N/A'], +] +PREFIX_MAP = {'N___': 'NPU', 'B___': 'Bench'} +UNMATCH_NAME_SET = [ + 'Max diff', + 'Min diff', + 'Mean diff', + 'L2norm diff', + 'MaxRelativeErr', + 'MinRelativeErr', + 'MeanRelativeErr', + 'NormRelativeErr', + 'Cosine', + 'MaxAbsErr', + 'MaxRelativeErr', + 'One Thousandth Err Ratio', + 'Five Thousandth Err Ratio', +] +SCREEN_MAP = {'precision_index': 'precision_index', 'overflow_level': 'overflow_level'} +UNMATCHED_NODE_NAME = '无匹配节点' diff --git 
a/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py b/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py new file mode 100644 index 0000000000000000000000000000000000000000..dbd31134790ef8501bb4deff43fd02dc7a40102b --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/plugin.py @@ -0,0 +1,583 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Copyright (c) 2025, Huawei Technologies. +# Adapt to the model hierarchical visualization data collected by the msprobe tool +# ============================================================================== +"""The TensorBoard Graphs plugin.""" + +import os +from werkzeug import wrappers, Response, exceptions +from tensorboard.backend import http_util +from tensorboard.plugins import base_plugin +from tensorboard.util import tb_logging + +from . import constants +from .app.views.graph_views import GraphView +from .app.utils.graph_utils import GraphUtils +from .app.utils.global_state import set_global_value, get_global_value + +logger = tb_logging.get_logger() + + +class GraphsPlugin(base_plugin.TBPlugin): + """Graphs Plugin for TensorBoard.""" + + plugin_name = constants.PLUGIN_NAME + headers = [('X-Content-Type-Options', 'nosniff')] + + def __init__(self, context): + """Instantiates GraphsPlugin via TensorBoard core. + + Args: + context: A base_plugin.TBContext instance. 
+ """ + super().__init__(context) + self._data_provider = context.data_provider + self.logdir = os.path.abspath(os.path.expanduser(context.logdir.rstrip('/'))) + # 将logdir赋值给global_state中的logdir属性,方便其他模块使用 + set_global_value('logdir', os.path.abspath(os.path.expanduser(context.logdir.rstrip('/')))) + self._current_file_path = None # Store the path of the currently loaded file + self._current_file_data = None # Store the data of the currently loaded file + self._current_tag = None # Store the tag of the currently loaded file + self.batch_id = 0 # 将 batch_id 声明为实例变量 + self.step_id = 0 # 可以同样声明 step_id + self.dfs_node_ids = [] # batch和step没变的话就将所有的nodename存起来,方便快速读取 + self.check_batch_id = -1 # 来配合node_ids监察用的,他不变node_ids就不用重新读取了 + self.check_step_id = 0 # 同上 + self.check_tag = None + + def get_plugin_apps(self): + return { + '/index.js': self.static_file_route, + '/index.html': self.static_file_route, + "/info": self.info_route, + "/components": self.get_all_data, + "/expandnodes": self.get_all_upnodes, + "/screen": self.get_all_screen_nodes, + "/parent": self.get_parent_node, + "/rank": self.get_rank, + "/subgraph": self.subgraph_route, + '/getNodeInfo': GraphView.get_node_info, + '/addMatchNodes': GraphView.add_match_nodes, + '/deleteMatchNodes': GraphView.delete_match_nodes, + '/getMatchedStateList': GraphView.get_matched_state_list, + '/saveData': GraphView.save_data, + '/updateColors': GraphView.update_colors, + } + + def is_active(self): + """The graphs plugin is active if any run has a graph.""" + for content in os.listdir(self.logdir): + content_path = os.path.join(self.logdir, content) + if os.path.isfile(content_path) and content.endswith('.vis'): + return True + if os.path.isdir(content_path): + for file in os.listdir(content_path): + if os.path.isfile(os.path.join(content_path, file)) and file.endswith('.vis'): + return True + return False + + def data_plugin_names(self): + return ( + constants.PLUGIN_NAME, + constants.PLUGIN_NAME_RUN_METADATA_WITH_GRAPH, 
+ ) + + def frontend_metadata(self): + return base_plugin.FrontendMetadata( + es_module_path='/index.js', + disable_reload=True, + ) + + # 拿所有nodename的 + def get_all_node_names(self, json_data, request): + npu_ids, bench_ids = [], [] + batch = request.args.get("batch") + step = request.args.get("step") + if batch is None or step is None: + logger.error('The param "batch" or "step" does not exist or not a valid value') + # 获取 NPU 和 Bench 数据 + npu_data = self.json_get(json_data, 'NPU') + bench_data = self.json_get(json_data, 'Bench') + + def extract_ids(nodes_data, id_list): + for node_name in nodes_data.get("node"): + id_list.append(node_name) + + def traverse_npu(subnodes): + for node in subnodes: + node_data = ( + self.json_get(npu_data, 'node', node) if npu_data else self.json_get(json_data, 'node', node) + ) + micro_step_id = node_data.get('micro_step_id') + if str(micro_step_id) == batch or micro_step_id is None: + npu_ids.append(node) + traverse_npu(node_data.get('subnodes', [])) + + # 提取 NPU 节点 ID + if batch == '-1' and step == '-1': + extract_ids(npu_data or json_data, npu_ids) + else: + root = (npu_data or json_data).get('root') + root_subnodes = self.json_get((npu_data or json_data), 'node', root, 'subnodes') + traverse_npu(root_subnodes) + + # 提取 Bench 节点 ID + if bench_data: + extract_ids(bench_data, bench_ids) + # 返回格式为 [[NPU节点ID列表], [Bench节点ID列表]] + return [npu_ids, bench_ids] + + def dfs_collect_nodes(self, json_data, request): + root_subnodes_set = [] + all_node_names = [] + try: + request_micro_step_id = request.args.get("batch") + except ValueError: + logger.error('The param "batch" or "step" does not exist or not a valid value') + root_name = self.json_get(json_data, 'NPU', 'root') or \ + self.json_get(json_data, 'root') + root_subnodes = self.json_get(json_data, 'NPU', 'node', root_name, 'subnodes') \ + if 'NPU' in json_data else \ + self.json_get(json_data, 'node', root_name, 'subnodes') + if root_subnodes: + for node in root_subnodes: + 
json_path = ['NPU', 'node', node, 'micro_step_id'] if 'NPU' in json_data \ + else ['node', node, 'micro_step_id'] + micro_step_id = self.json_get(json_data, *json_path) + if request_micro_step_id == '-1' or str(micro_step_id) == request_micro_step_id: + root_subnodes_set.append(node) + + def get_leaf_nodes(subnodes_set): + npu_data = self.json_get(json_data, 'NPU') + for node in subnodes_set: + node_data = ( + self.json_get(npu_data, 'node', node) if npu_data else self.json_get(json_data, 'node', node) + ) + if node_data: + if node_data.get('subnodes'): + get_leaf_nodes(node_data.get('subnodes')) + else: + all_node_names.append(node) + + get_leaf_nodes(root_subnodes_set) + + return all_node_names + + # Collect all precision nodes; linked to the precision filter in controls + @wrappers.Request.application + def get_all_screen_nodes(self, request): + grouped_screen_set, inaccuracy_node_ids = [], [] + precision_none = 0 + screen, screen_set = '', '' # defaults, so a request without a filter key cannot hit an unbound name + # Try to read screen_set and screen from the request + for key, value in constants.SCREEN_MAP.items(): + if key in request.args: + screen_set = request.args.get(key) + screen = value + break # Stop at the first matching key + + if screen == 'precision_index': + precision_set_str = screen_set.split(',') + if constants.UNMATCHED_NODE_NAME in precision_set_str: + precision_set_str = [p for p in precision_set_str if p != constants.UNMATCHED_NODE_NAME] + precision_none = 1 + grouped_screen_set = [ + list(map(float, precision_set_str[i: i + 2])) + for i in range(0, len(precision_set_str), 2) + ] + else: + grouped_screen_set = screen_set + tag = request.args.get("tag") + json_data = self.check_jsondata(request) + + def has_conditions_changed(tag, batch): + return ( + self.check_batch_id != batch + or self.check_step_id != self.step_id + or self.check_tag != tag + or self.check_tag is None + ) + + if has_conditions_changed(tag, self.batch_id): + self.dfs_node_ids = self.dfs_collect_nodes(json_data, request) + self.check_batch_id = self.batch_id + self.check_step_id = self.step_id + self.check_tag = tag + node_ids =
self.dfs_node_ids + for node in node_ids: + node_data = self.json_get(json_data, 'NPU', 'node', node, 'data') or self.json_get( + json_data, 'node', node, 'data' + ) + matched = self.json_get(json_data, 'NPU', 'node', node, 'matched_node_link') or self.json_get( + json_data, 'node', node, 'matched_node_link' + ) + inaccuracy = node_data.get(screen) if node_data is not None else None + # 如果 inaccuracy 为 None,直接检查是否符合条件 + if inaccuracy is None and precision_none == 0: + continue # 跳过后续的处理,进入下一个 node + if inaccuracy is None and precision_none == 1: + if (node_data is None or node_data.get('overflow_level', False)) and not matched: + inaccuracy_node_ids.append(node) + continue # 跳过后续的处理,进入下一个 node + + # 对于 inaccuracy 是数字类型,检查是否在某个子范围内,精度误差 + if isinstance(inaccuracy, (int, float)): + for group in grouped_screen_set: + if len(group) > 1 and all(g is not None for g in group) and group[0] <= inaccuracy <= group[1]: + inaccuracy_node_ids.append(node) + break # 找到符合条件的,跳出当前循环 + # 对于非数字的 inaccuracy,检查是否在 grouped_screen_set 中,溢出检测 + elif inaccuracy in grouped_screen_set: + inaccuracy_node_ids.append(node) + else: + logger.error(f'The inaccuracy in {node} is not a valid value') + + return http_util.Respond(request, inaccuracy_node_ids, "application/json") + + def group_precision_set(self, precision_set): + if len(precision_set) % 2 != 0: + raise ValueError('The number of elements in precision_set is not even') + grouped_precision_set = [precision_set[i: i + 2] for i in range(0, len(precision_set), 2)] + return grouped_precision_set + + def get_all_unmatched_nodes(self, all_node_names, json_data): + is_npu_present = 'NPU' in json_data + + def collect_unmatched_nodes(node_list, *path): + return [node for node in node_list if not self.json_get(json_data, *path, node, 'matched_node_link')] + + npu_unmatched = ( + collect_unmatched_nodes(all_node_names[0], 'NPU', 'node') + if is_npu_present + else collect_unmatched_nodes(all_node_names[0], 'node') + ) + bench_unmatched = 
collect_unmatched_nodes(all_node_names[1], 'Bench', 'node') if is_npu_present else [] + return [npu_unmatched, bench_unmatched] + + @wrappers.Request.application + def get_parent_node(self, request): + node = request.args.get("node")[4:] # 获取节点信息 + prefix = request.args.get("node")[:4] # 获取前缀 + json_data = self.check_jsondata(request) # 检查请求中的 JSON 数据 + + def find_upnode(node): + matched_node_link_list = self.json_get( + json_data, constants.PREFIX_MAP[prefix], 'node', node, 'matched_node_link' + ) + + if matched_node_link_list: + result = matched_node_link_list[-1] # 获取匹配的最后一个节点 + return http_util.Respond(request, result, "application/json") # 返回响应 + else: + # 如果没有找到 matched_node_link,继续递归查找上级节点 + upnode = self.json_get(json_data, constants.PREFIX_MAP[prefix], 'node', node, 'upnode') + if upnode: + return find_upnode(upnode) # 递归查找上级节点 + else: + return http_util.Respond(request, {}, "application/json") # 如果没有找到上级节点,返回空响应 + + return find_upnode(node) + + # 多卡跳转后端的方法,负责拿到rank数据 + @wrappers.Request.application + def get_rank(self, request): + node_name = request.args.get('node') + side = request.args.get('side') + + # 构建 JSON 路径 + json_path = [side] if side else [] # 如果 side 存在,路径中包含 side + json_path.extend(['node', node_name, 'matched_distributed']) + + # 获取 matched_distributed + matched_distributed = self.json_get(self._current_file_data, *json_path) + + # 返回结果 + if matched_distributed: + return http_util.Respond(request, matched_distributed, "application/json") + else: + return http_util.Respond(request, {}, "application/json") + + # 拿json_data里面所有配置数据的 + @wrappers.Request.application + def get_all_data(self, request): + """Returns all data in json format.""" + response_data = {} + tag = request.args.get('tag') + run = request.args.get('run') + json_data = self.check_jsondata(request) + if not json_data: + return http_util.Respond( + request, f"Failed to check file '{tag}', view the log for detail.", "text/plain", 400 + ) + self._current_file_data = json_data + if 
json_data.get('StepList', {}) and 'ALL' not in json_data.get('StepList', {}): + json_data['StepList'].insert(0, 'ALL') + all_node_names = self.get_all_node_names(json_data, request) + # 读取第一个文件中的Colors和OverflowCheck + first_run_tag = get_global_value("first_run_tag", {}).get(run) + first_file_data, _ = GraphUtils.safe_load_data(run, first_run_tag) + # 读取全局信息 + response_data['menu'] = all_node_names + response_data['unMatchedNode'] = self.get_all_unmatched_nodes(all_node_names, json_data) + if json_data.get('MicroSteps', {}): + response_data['microSteps'] = json_data.get('MicroSteps') + if json_data.get('StepList', {}): + response_data['stepList'] = json_data.get('StepList') + if json_data.get('Tooltips', {}): + response_data['tooltips'] = json_data.get('Tooltips') + if 'Colors' in first_file_data: + response_data['colors'] = first_file_data.get('Colors') + if 'OverflowCheck' in first_file_data: + response_data['overflowCheck'] = first_file_data.get('OverflowCheck') + else: + # 适配老精度溢出数据 + response_data['overflowCheck'] = True + return http_util.Respond(request, response_data, "application/json") + + @wrappers.Request.application + def static_file_route(self, request): + filename = os.path.basename(request.path) + extension = os.path.splitext(filename)[1] + if extension == '.html': + mimetype = 'text/html' + elif extension == '.js': + mimetype = 'application/javascript' + else: + mimetype = 'application/octet-stream' + filepath = os.path.join(os.path.dirname(__file__), 'static', filename) + try: + with open(filepath, 'rb') as infile: + contents = infile.read() + except IOError as e: + raise exceptions.NotFound('404 Not Found') from e + return Response(contents, content_type=mimetype, headers=GraphsPlugin.headers) + + # 方便多层级展开的upnodes节点集合,与tf-graph的_menuSelectedNodeExpand联动 + @wrappers.Request.application + def get_all_upnodes(self, request): + npu_upnodes_list, matched_upnodes_list, node_list = [], [], [] + node, matched_node, prefix = '', '', '' + node_arg = 
request.args.get('node') + json_data = self.check_jsondata(request) + prefix = str(node_arg)[:4] if str(node_arg)[:4] in constants.PREFIX_MAP else '' + node = node_arg[4:] if prefix in constants.PREFIX_MAP else node_arg + if prefix in constants.PREFIX_MAP and json_data.get(constants.PREFIX_MAP[prefix], {}): + node_list = json_data[constants.PREFIX_MAP[prefix]].get('node', {}) + else: + node_list = json_data.get('node', {}) + matched_node = ( + node_list.get(node, {}).get('matched_node_link', [])[-1] + if node_list.get(node, {}).get('matched_node_link') + else None + ) + + def get_upnodes(node, prefix): + upnodes_list = [] + if prefix == '': + node_list = json_data.get('node', {}) + else: + node_list = json_data.get('NPU' if prefix == 'N___' else 'Bench', {}).get('node', {}) + while node in node_list: + upnode = node_list[node].get('upnode') + if not upnode or upnode == 'None': + break + upnodes_list.insert(0, upnode) + node = upnode + return upnodes_list + + npu_upnodes_list = get_upnodes(node, prefix) + # 如果 matched_node 是 None 的话 + if matched_node is None: + previous_node = None # 用于跟踪上一个 node + for node in reversed(npu_upnodes_list): + if node_list.get(node, {}).get('matched_node_link'): # 判断条件 + matched_node = previous_node # 将 matched_node 设置为上一个 node + break + previous_node = node # 更新 previous_node 为当前 node + if prefix in constants.PREFIX_MAP: + matched_upnodes_list = get_upnodes(matched_node, prefix) + return http_util.Respond(request, [[prefix], npu_upnodes_list, matched_upnodes_list], "application/json") + + # 检查到底是读一般还是用之前存的 + def check_jsondata(self, request): + meta_data = { + "tag": request.args.get("tag"), + "run": request.args.get('run') + } + graph_data, _ = GraphUtils.get_graph_data(meta_data) + return graph_data + + # 处理xx.get + def json_get(self, data, *args): + result = data + for key in args: + if result is None: + return None + result = result.get(key) + return result + + # 获取子图数据,最核心且基本的所在 + @wrappers.Request.application + def 
subgraph_route(self, request): + """Returns a subgraph for a given node id, modified to use run and tag from query parameters.""" + json_data = self.check_jsondata(request) + if not json_data: + return http_util.Respond(request, "Failed to get subgraph, view the log for details.", "text/plain", 400) + node_id = request.args.get("node") + self.batch_id = request.args.get("batch") + self.step_id = request.args.get("step") + if node_id is None: + return http_util.Respond(request, 'The query parameter "node" is required', "text/plain", 400) + if node_id == 'root': + if json_data.get('Bench', {}): + subgraph_pbtxt_set = {} + for node_type in ('Bench', 'NPU'): + subgraph = {'node': {}, 'edge': {}} + node = self.json_get(json_data, constants.SETS[node_type][0], 'root') + node_data = self.json_get(json_data, constants.SETS[node_type][0], 'node', node) + node = constants.SETS[node_type][1] + node + matched_node_link = node_data['matched_node_link'] + if matched_node_link[0][:4] != constants.SETS[node_type][2]: + matched_node_link[0] = constants.SETS[node_type][2] + matched_node_link[0] + subgraph['node'][node] = node_data + subgraph_pbtxt_set[node_type] = self._convert_to_protobuf_format(subgraph) + subgraph_pbtxt = subgraph_pbtxt_set.get('NPU', '') + subgraph_pbtxt_set.get('Bench', '') + else: + subgraph = {'node': {}, 'edge': {}} + node = json_data.get('root') + node_data = self.json_get(json_data, 'node', node) + subgraph['node'][node] = node_data + subgraph_pbtxt = self._convert_to_protobuf_format(subgraph) + else: + subgraph = self._extract_subgraph(json_data, node_id) + subgraph_pbtxt = self._convert_to_protobuf_format(subgraph) + return http_util.Respond(request, subgraph_pbtxt, "text/x-protobuf") + + @wrappers.Request.application + def info_route(self, request): + """ + Returns a dict of all runs and their data availabilities, + including a flag indicating if a .vis file is present. 
+ """ + run_tag_pairs = {} + for content in os.listdir(self.logdir): + content_path = os.path.join(self.logdir, content) + if os.path.isdir(content_path): + for file in os.listdir(content_path): + GraphUtils.process_vis_file(content_path, file, run_tag_pairs) + else: + GraphUtils.process_vis_file(self.logdir, content, run_tag_pairs) + + first_run_tag = {} + for run in run_tag_pairs.keys(): + first_run_tag[run] = run_tag_pairs[run][0] + set_global_value('first_run_tag', first_run_tag) + return http_util.Respond(request, run_tag_pairs, "application/json") + + # 同上二者一体 + def _extract_subgraph(self, json_data, node_id): + """提取子图,支持多种节点前缀逻辑""" + subgraph = {'node': {}, 'edge': []} + + # 检查前缀并获取节点集合 + prefix = node_id[:4] + if prefix in constants.SETS and len(prefix) == 4: + node_id = node_id[4:] + node_set = self.json_get(json_data, constants.SETS[prefix][0], 'node') + else: + prefix = '' + node_set = json_data.get('node', {}) + + # 获取当前节点数据 + node_data = node_set.get(node_id, {}) + subnodes = node_data.get('subnodes', []) + + # 遍历子节点 + for subnode_id in subnodes: + subnode_id_data = node_set.get(subnode_id, {}) + if subnode_id_data.get('micro_step_id') is not None: + self._process_subnode(subgraph, prefix, subnode_id, subnode_id_data, json_data) + else: + self._process_non_root_subnode(subgraph, prefix, subnode_id, subnode_id_data) + + return subgraph + + def _process_non_root_subnode(self, subgraph, prefix, subnode_id, subnode_id_data): + """处理非根子节点""" + # 更新匹配的节点链接 + self._update_matched_node_links(subnode_id_data, prefix) + + # 添加前缀并存入子图 + full_subnode_id = prefix + subnode_id + subgraph['node'][full_subnode_id] = subnode_id_data + + # 针对分micro_step_id和step_id取的部分节点 + def _process_subnode(self, subgraph, prefix, subnode_id, subnode_id_data, json_data): + batchid = subnode_id_data.get('micro_step_id') + stepid = subnode_id_data.get('step_id') + steplist = json_data.get('StepList') + + def should_update_node(): + """判断是否需要更新节点的条件逻辑""" + if self.batch_id == '-1': + if 
self.step_id == '-1': # batch_id 和 step_id 都为 -1 + return True + return stepid == str(steplist[int(self.step_id) + 1]) # 匹配 step_id + else: # batch_id 有效 + if self.step_id != '-1': # step_id 有效 + return batchid == int(self.batch_id) and stepid == str(steplist[int(self.step_id) + 1]) + return batchid == int(self.batch_id) # 仅匹配 batch_id + + if should_update_node(): + self._update_matched_node_links(subnode_id_data, prefix) + subnode_id = prefix + subnode_id + subgraph['node'][subnode_id] = subnode_id_data + + def _update_matched_node_links(self, subnode_id_data, prefix): + if 'matched_node_link' in subnode_id_data: + for index, matched_node_link in enumerate(subnode_id_data['matched_node_link']): + if matched_node_link[:4] != constants.SETS[prefix][1]: + matched_node_link = constants.SETS[prefix][1] + matched_node_link + subnode_id_data['matched_node_link'][index] = matched_node_link + + # 拼接成类json + def _convert_to_protobuf_format(self, subgraph): + """Converts subgraph data to the protobuf text format expected by the frontend.""" + nodes = subgraph.get('node', {}) + protobuf_format = "" + for node_id, node_data in nodes.items(): + protobuf_format += f'node {{\n name: "{node_id}"\n op: "{node_data.get("id")}"\n' + protobuf_format += f' node_type: {node_data.get("node_type", 0)}\n' + if node_data.get("matched_node_link"): + protobuf_format += f' matched_node_link: {node_data.get("matched_node_link")}\n' + protobuf_format += f' attr: "{node_data.get("data", "{}")}"\n'.replace('True', 'true').replace( + 'False', 'false' + ) + protobuf_format += f' precision_index: {(node_data.get("data", {}).get("precision_index"))}\n' + if node_data.get("input_data"): + protobuf_format += f' input_data: "{node_data.get("input_data", "{}")}"\n' + if node_data.get("output_data"): + protobuf_format += f' output_data: "{node_data.get("output_data", "{}")}"\n' + protobuf_format += f' suggestions: "{node_data.get("suggestions", "{}")}"\n' + if not node_data.get("subnodes"): + 
protobuf_format += f' isLeaf: true\n' + else: + protobuf_format += f' isLeaf: false\n' + protobuf_format += f' subnodes: {node_data.get("subnodes")}\n' + if node_data.get("stack_info"): + protobuf_format += f' stack_info: {node_data.get("stack_info")}\n' + protobuf_format += '}\n' + return protobuf_format diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/server/static/index.js b/plugins/tensorboard-plugins/tb_graph_ascend/server/static/index.js new file mode 100644 index 0000000000000000000000000000000000000000..e64f87ec8631da37606fe740e1cdfe9155d34986 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/server/static/index.js @@ -0,0 +1,6 @@ +/* + * Copyright (c) Huawei Technologies Co., Ltd. 2024-2024. All rights reserved. + */ +export async function render() { + document.location.href = 'index.html'; +} diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/setup.py b/plugins/tensorboard-plugins/tb_graph_ascend/setup.py new file mode 100644 index 0000000000000000000000000000000000000000..6550f1440a9761c4018d677d50af4d50da64a0b7 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/setup.py @@ -0,0 +1,57 @@ +# ------------------------------------------------------------------------- +# Copyright (c) 2025, Huawei Technologies. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# --------------------------------------------------------------------------------------------# +import setuptools + +VERSION = '1.0.3' +INSTALL_REQUIRED = ["tensorboard >= 2.11.2"] + +setuptools.setup( + name="tb-graph-ascend", + version=VERSION, + description="Model Hierarchical Visualization TensorBoard Plugin", + long_description="Model Hierarchical Visualization TensorBoard Plugin : \ + https://gitee.com/ascend/mstt/tree/master/plugins/tensorboard-plugins/tb_graph_ascend", + url="https://gitee.com/ascend/mstt/tree/master/plugins/tensorboard-plugins/tb_graph_ascend", + author="Ascend Team", + author_email="pmail_mindstudio@huawei.com", + packages=setuptools.find_packages(), + package_data={ + "server": ["static/**"], + }, + entry_points={ + "tensorboard_plugins": [ + "graph_ascend = server.plugin:GraphsPlugin", + ], + }, + python_requires=">=3.7", + install_requires=INSTALL_REQUIRED, + classifiers=[ + 'Intended Audience :: Developers', + 'Intended Audience :: Education', + 'Intended Audience :: Science/Research', + 'License :: OSI Approved :: BSD License', + 'Programming Language :: Python :: 3', + 'Topic :: Scientific/Engineering', + 'Topic :: Scientific/Engineering :: Mathematics', + 'Topic :: Scientific/Engineering :: Artificial Intelligence', + 'Topic :: Software Development', + 'Topic :: Software Development :: Libraries', + 'Topic :: Software Development :: Libraries :: Python Modules', + ], + license='BSD-3', + keywords='tensorboard graph ascend plugin', +) diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/conftest.py b/plugins/tensorboard-plugins/tb_graph_ascend/test/conftest.py new file mode 100644 index 0000000000000000000000000000000000000000..70475d16bd28218a6e07f899b4b9757e8010a16f --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/conftest.py @@ -0,0 +1,78 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +import os +import json +from pathlib import Path +import pytest + + +def load_st_test_cases(): + meta_path = Path(__file__).parent / "data/metadata_st.json" + with open(meta_path) as f: + return json.load(f)["test_cases"] + + +def load_ut_test_cases(): + meta_path = Path(__file__).parent / "data/metadata_ut.json" + with open(meta_path) as f: + return json.load(f)["test_cases"] + + +# Dynamically generate test cases +def pytest_generate_tests(metafunc): + if "meta_data" in metafunc.fixturenames and "operation" in metafunc.fixturenames: + st_test_cases = load_st_test_cases() + params = [] + for case in st_test_cases: + meta_data = case["meta_data"] + meta_data["run"] = Path(__file__).parent / meta_data["run"] + for op in case["operations"]: + params.append(pytest.param( + meta_data, + op, + id=f"{meta_data['run']}-{op['type']}" + )) + # Ensure the number of parameter names matches the number of parameter values + metafunc.parametrize("meta_data, operation", params) + if "ut_test_case" in metafunc.fixturenames: + ut_test_cases = load_ut_test_cases() + params = [] + for case in ut_test_cases: + params.append(pytest.param( + case, + id=f"{case['type']}-{case['name']}" + )) + # Ensure the number of parameter names matches the number of parameter values + metafunc.parametrize("ut_test_case", params) + + +@pytest.fixture +def meta_data(meta_data): + # Return the metadata for the current test + return meta_data + + +@pytest.fixture +def operation_config(operation): + # Return the operation config for the current test + return operation + +
+@pytest.fixture +def ut_test_case(ut_test_case): + # 返回当前测试的操作配置 + return ut_test_case diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/data/compare_statis.vis b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/compare_statis.vis new file mode 100644 index 0000000000000000000000000000000000000000..5820af6ac01e2a20d36dd6921da9c82b8fd43513 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/compare_statis.vis @@ -0,0 +1,948 @@ +{ + "ToolTip": "{\"shape\": \"\\u6570\"}", + "NPU": { + "edge": [], + "node": { + "AddOne_0": { + "matched_node_link": [ + "AddOne_0" + ], + "data": { + "precision_index": 0.0 + }, + "id": "AddOne_0", + "inputs": [], + "input_data": { + "AddOne_0.input_arg.0": { + "type": "torch.Tensor", + "dtype": "", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.input_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.1": {}, + "AddOne_0.2": {} + }, + "output_data": { + "AddOne_0.output_arg.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": 
"0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.output_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.1": {}, + "AddOne_0.2": {} + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "pair": "None", + "subnodes": [ + "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee", + "add_51111" + ], + "type": "AddOne", + "upnode": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee" + }, + "AddOne_1": { + "matched_node_link": [], + "data": { + "precision_index": 0.8 + }, + "id": "AddOne_1", + "inputs": [], + "input_data": { + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.input_arg.0": { + "data_name": "-10", + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 9, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "add_1", + "add_4" + ], + "type": "AddOne", + "upnode": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee" + }, + "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee": { + "matched_node_link": [], + "data": { + "precision_index": 0 + }, + "id": "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee", + "inputs": 
[], + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "add_2" + ], + "type": "AddOne", + "upnode": "AddThree_0", + "micro_step_id": "3" + }, + "AddThree_0": { + "matched_node_link": [ + "B___AddThree_0" + ], + "data": { + "precision_index": 0.5 + }, + "id": "AddThree_0", + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "arg0_1_0", + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee", + "output_0" + ], + "type": "AddThree", + "upnode": "None" + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee": { + "matched_node_link": [ + "B___Test.maxpoolMaxPool2.maxpoolpo.tt.ee" + ], + "data": { + "precision_status": false, + "Host Self Duration(us)": 56.24, + "Host Total Duration(us)": 56.24, + "Device Self Duration(us)": 0, + "Device Total Duration(us)": 0, + "overflow_level": "medium" + }, + "id": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "inputs": [], + "input_data": { + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.input_arg.0": { + "data_name": "-10", + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": { + "output.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 128, + 512, + 2, + 2 + ], + "Max": "538.2343242", + "Min": "-234.124124234", + "Mean": "-510.23432654", + "Norm": "3555.3406" + }, + "output.1": { + 
"type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 16, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "pair": "None", + "subnodes": [ + "AddOne_0", + "AddOne_1" + ], + "suggestions": { + "text": "test ptdbg工具 with longlong tenm tensldjta te alsejtka gaew jtljae tet jsdklfj.", + "ptdbg工具": "https://gitee.com/ascend/att/tree/master/debug/accuracy_tools/ptdbg_ascend" + }, + "stack_info": [ + "File /home/w3000/xxxx/subnodes/sdd/adad/srit-sda/artar/prased, line 136. om het, \n dada = rtens/sda.ddd(asdw)", + "File /home/w3000/xxxx/subnodes/sdd/adad/srit-sda/artar/prased, line 136. om het, \n dada = rtens/sda.ddd(asdw)", + "File /home/w3000/xxxx/subnodes/sdd/adad/srit-sda/artar/prased, line 136. om het, \n dada = rtens/sda.ddd(asdw)" + ], + "type": "AddTwo", + "upnode": "AddThree_0", + "micro_step_id": "0" + }, + "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee": { + "matched_node_link": [ + "B___AddThree_0", + "B___Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "B___AddOne_0", + "B___Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee" + ], + "data": { + "precision_index": 0.35 + }, + "id": "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee", + "inputs": [], + "input_data": { + "input_arg.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "input_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "input_arg.2": "None" + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": { + "output.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 128, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + 
"Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "output.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 16, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "pair": "None", + "subnodes": [], + "type": "addlongtingmelasidngkonklajelkjsakljgskadtest", + "upnode": "AddOne_0" + }, + "add_51111": { + "matched_node_link": [], + "data": { + "precision_status": true, + "precision_index": 1, + "md5 Compare Result": "Pass" + }, + "id": "add_51111", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "AddOne_0" + }, + "add_1": { + "matched_node_link": [], + "data": { + "precision_status": false, + "precision_index": 0, + "md5 Compare Result": "Pass" + }, + "id": "add_1", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "AddOne_1" + }, + "add_4": { + "matched_node_link": [], + "data": {}, + "id": "add_4", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "AddOne_1" + }, + "add_2": { + "matched_node_link": [], + "data": {}, + "id": "add_2", + "inputs": [], + "input_data": { + "add_2.input_arg.0": { + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "6408.2343242", + "Min": "-3134.124124234", + "Mean": "-501.23432654", + "Norm": "3555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "add_2.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": 
"-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee" + }, + "arg0_1_0": { + "matched_node_link": [], + "data": {}, + "id": "arg0_1_0", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "arg0_1", + "upnode": "AddThree_0", + "micro_step_id": "2", + "suggestions": { + "text": "test ptdbg工" + }, + "stack_info": [ + "File /home/w3000/xxxx/subnodes/sdd/adad/srit-sda/artar/prased, line 136. om het, \n dada = rtens/sda.ddd(asdw)" + ] + }, + "output_0": { + "matched_node_link": [], + "data": {}, + "id": "output_0", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "output", + "upnode": "AddThree_0", + "micro_step_id": "3" + } + }, + "root": "AddThree_0" + }, + "Bench": { + "edge": [], + "node": { + "AddOne_0": { + "matched_node_link": [ + "AddOne_0" + ], + "data": {}, + "id": "AddOne_0", + "inputs": [], + "input_data": { + "AddOne_0.input_arg.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "AddOne_0.input_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "output_data": { + "AddOne_0.output_arg.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "AddOne_0.output_arg.1": { + "type": 
"torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "pair": "None", + "subnodes": [ + "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee" + ], + "type": "AddOne", + "upnode": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee" + }, + "AddOne_1": { + "matched_node_link": [], + "data": {}, + "id": "AddOne_1", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "add_1" + ], + "type": "AddOne", + "upnode": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee" + }, + "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee": { + "matched_node_link": [], + "data": {}, + "id": "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "add_2" + ], + "type": "AddOne", + "upnode": "AddThree_0" + }, + "AddThree_0": { + "matched_node_link": [ + "N___AddThree_0" + ], + "data": {}, + "id": "AddThree_0", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [ + "arg0_1_0", + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee", + "output_0" + ], + "type": "AddThree", + "upnode": "root" + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee": { + "matched_node_link": [], + "data": {}, + "id": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "inputs": [], + "input_data": { + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.input_arg.0": { + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "548.2343242", + "Min": "-234.124124234", + "Mean": 
"-5.23432654", + "Norm": "35555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": { + "output.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 128, + 512, + 2, + 2 + ], + "Max": "5038.2343242", + "Min": "-1234.124124234", + "Mean": "-410.23432654", + "Norm": "3255.3406" + }, + "output.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 16, + 256, + 2, + 2 + ], + "Max": "538.2343242", + "Min": "-234.124124234", + "Mean": "-51.23432654", + "Norm": "35555.3406" + } + }, + "pair": "None", + "subnodes": [ + "AddOne_0", + "AddOne_1" + ], + "type": "AddTwo", + "upnode": "AddThree_0" + }, + "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee": { + "matched_node_link": [ + "N___AddThree_0", + "N___Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "N___AddOne_0", + "N___Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee" + ], + "data": {}, + "id": "Test.add_0_withlonglonglonglonglonglonglonglongname.tt.ee", + "inputs": [], + "input_data": { + "input_arg.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 32, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "input_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 1, + "outputs": [], + "output_data": { + "output.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 128, + 512, + 2, + 2 + ], + "Max": "5348.2343242", + 
"Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + }, + "output.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 16, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "AddOne_0" + }, + "add_1": { + "matched_node_link": [], + "data": {}, + "id": "add_1", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 1, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "AddOne_1" + }, + "add_2": { + "matched_node_link": [], + "data": {}, + "id": "add_2", + "inputs": [], + "input_data": { + "add_2.input_arg.0": { + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "548.2343242", + "Min": "-234.124124234", + "Mean": "-5.23432654", + "Norm": "35555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "add_2.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": "-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 1, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "add", + "upnode": "Test.AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA.tt.ee" + }, + "arg0_1_0": { + "matched_node_link": [], + "data": {}, + "id": "arg0_1_0", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 1, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + "type": "arg0_1", + "upnode": "AddThree_0" + }, + "output_0": { + "matched_node_link": [], + "data": {}, + "id": "output_0", + "inputs": [], + "input_data": {}, + "is_forward": true, + "node_type": 1, + "outputs": [], + "output_data": {}, + "pair": "None", + "subnodes": [], + 
"type": "output", + "upnode": "AddThree_0" + } + }, + "root": "AddThree_0" + }, + "Colors": { + "#FF9B3D": { + "value": [ + 0, + 0.5 + ], + "description": "此节点所有输入输出的统计量相对误差,值越大代表测量值与标杆值的偏差越大,相对误差计算方式:|(测量值-标杆值)/标杆值|" + }, + "#FFDC7F": { + "value": [ + 0.5, + 1 + ], + "description": "此节点所有输入输出的统计量相对误差,值越大代表测量值与标杆值的偏差越大,相对误差计算方式:|(测量值-标杆值)/标杆值|" + }, + "#C7C7C7": { + "value": "无匹配节点", + "description": "对比过程中节点未匹配上" + } + }, + "MicroSteps": 5, + "StepList": [ + "ALL", + 0, + 1, + 2, + 3 + ], + "task": "summary", + "match": [], + "npu_match_nodes": { + "AddOne_0": "AddOne_0" + }, + "bench_match_nodes": { + "AddOne_0": "AddOne_0" + }, + "OverflowCheck": true +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_st.json b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_st.json new file mode 100644 index 0000000000000000000000000000000000000000..820e11cc39b7c4ed4111690970d2714a82bf1899 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_st.json @@ -0,0 +1,237 @@ +{ + "test_cases": [ + { + "meta_data": { + "run": "data", + "tag": "compare_statis" + }, + "operations": [ + { + "type": "get_node_info", + "node_info": { + "nodeName": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "nodeType": "Bench" + }, + "expected": { + "success": true, + "data": { + "matched_node_link": [], + "data": {}, + "id": "Test.maxpoolMaxPool2.maxpoolpo.tt.ee", + "inputs": [], + "input_data": { + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.input_arg.0": { + "longlonglonglonglonglongName": "hah", + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": "[32, 512, 2, 2]", + "Max": "548.2343242", + "Min": "-234.124124234", + "Mean": "-5.23432654", + "Norm": "35555.3406", + "error_key": [ + "type", + "shape" + ] + }, + "Test.maxpoolMaxPool2.maxpoolpo.tt.ee.kwrag_arg.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 64, + 256, + 2, + 2 + ], + "Max": "5348.2343242", + "Min": 
"-2344.124124234", + "Mean": "-51.23432654", + "Norm": "355555.3406" + } + }, + "is_forward": true, + "node_type": 0, + "outputs": [], + "output_data": { + "output.0": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 128, + 512, + 2, + 2 + ], + "Max": "5038.2343242", + "Min": "-1234.124124234", + "Mean": "-410.23432654", + "Norm": "3255.3406" + }, + "output.1": { + "type": "torch.Tensor", + "dtype": "torch.float32", + "shape": [ + 16, + 256, + 2, + 2 + ], + "Max": "538.2343242", + "Min": "-234.124124234", + "Mean": "-51.23432654", + "Norm": "35555.3406" + } + }, + "pair": "None", + "subnodes": [ + "AddOne_0", + "AddOne_1" + ], + "type": "AddTwo", + "upnode": "AddThree_0" + } + } + }, + { + "type": "add_match_nodes", + "npu_node_name": "AddOne_0", + "bench_node_name": "AddOne_0", + "expected": { + "success": true, + "data": { + "precision_error": 0.0, + "intput_statistical_diff": { + "AddOne_0.input_arg.0": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.input_arg.1": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + } + }, + "output_statistical_diff": { + "AddOne_0.output_arg.0": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.output_arg.1": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + } + } + } + } + }, + { + "type": "delete_match_nodes", + "npu_node_name": "AddOne_0", + 
"bench_node_name": "AddOne_0", + "expected": { + "success": true, + "data": {} + } + }, + { + "type": "add_match_nodes", + "npu_node_name": "AddOne_0", + "bench_node_name": "AddOne_0", + "expected": { + "success": true, + "data": { + "precision_error": 0.0, + "intput_statistical_diff": { + "AddOne_0.input_arg.0": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.input_arg.1": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + } + }, + "output_statistical_diff": { + "AddOne_0.output_arg.0": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + }, + "AddOne_0.output_arg.1": { + "MaxAbsErr": 0.0, + "MinAbsErr": 0.0, + "MeanAbsErr": 0.0, + "NormAbsErr": 0.0, + "MaxRelativeErr": "0.0000%", + "MinRelativeErr": "0.0000%", + "MeanRelativeErr": "0.0000%", + "NormRelativeErr": "0.0000%" + } + } + } + } + }, + { + "type": "get_matched_state_list", + "expected": { + "success": true, + "data": { + "npu_match_nodes": { + "AddOne_0": "AddOne_0" + }, + "bench_match_nodes": { + "AddOne_0": "AddOne_0" + } + } + } + }, + { + "type": "save_data", + "expected": { + "success": true + } + } + ] + } + ] +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_ut.json b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_ut.json new file mode 100644 index 0000000000000000000000000000000000000000..025fbd14c5fdf5307928ffe4bf3ebd0f102c36f9 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/data/metadata_ut.json @@ -0,0 +1,174 @@ +{ 
+ "test_cases": [ + { + "name": "md5_full_match_success", + "type": "process_md5_task_add", + "input": { + "graph_data": { + "NPU": { + "node": { + "npu_conv2d": { + "input_data": { + "npu_conv2d.data1": { + "md5": "a1b2c3" + }, + "npu_conv2d.data2": { + "md5": "d4e5f6" + } + }, + "output_data": { + "npu_conv2d.out1": { + "md5": "x7y8z9" + } + } + } + } + }, + "Bench": { + "node": { + "bench_conv2d": { + "input_data": { + "bench_conv2d.data1": { + "md5": "a1b2c3" + }, + "bench_conv2d.data2": { + "md5": "d4e5f6" + } + }, + "output_data": { + "bench_conv2d.out1": { + "md5": "x7y8z9" + } + } + } + } + } + }, + "npu_node_name": "npu_conv2d", + "bench_node_name": "bench_conv2d" + }, + "expected": { + "success": true, + "data": { + "precision_error": 1 + } + } + }, + { + "name": "md5_partial_mismatch_failure", + "type": "process_md5_task_add", + "input": { + "graph_data": { + "NPU": { + "node": { + "npu_conv2d": { + "input_data": { + "npu_conv2d.data1": { + "md5": "a1b2c3" + }, + "npu_conv2d.data2": { + "md5": "xxxxxx" + } + } + } + } + }, + "Bench": { + "node": { + "bench_conv2d": { + "input_data": { + "bench_conv2d.data1": { + "md5": "a1b2c3" + }, + "bench_conv2d.data2": { + "md5": "d4e5f6" + } + } + } + } + } + }, + "npu_node_name": "npu_conv2d", + "bench_node_name": "bench_conv2d" + }, + "expected": { + "success": true, + "data": { + "precision_error": 0 + } + } + }, + { + "name": "delete_existing_node_success", + "type": "process_md5_task_delete", + "input": { + "graph_data": { + "npu_match_nodes": { + "npu_conv2d": "bench_conv2d" + }, + "bench_match_nodes": { + "bench_conv2d": "npu_conv2d" + }, + "NPU": { + "node": { + "npu_conv2d": { + "data": { + "precision_index": 1 + }, + "matched_node_link": [ + "bench_conv2d" + ] + } + } + } + }, + "npu_node_name": "npu_conv2d", + "bench_node_name": "bench_conv2d" + }, + "expected": { + "success": true, + "data": {} + } + }, + { + "name": "delete_non_existent_node_failure", + "type": "process_md5_task_delete", + "input": { + 
"graph_data": { + "NPU": { + "node": { + "npu_conv2d": {} + } + } + }, + "npu_node_name": "npu_conv2d", + "bench_node_name": "bench_conv2d" + }, + "expected": { + "success": false, + "error": "操作失败:删除节点信息失败" + } + }, + { + "name": "missing_input_data_handling", + "type": "process_md5_task_add", + "input": { + "graph_data": { + "NPU": { + "node": { + "npu_conv2d": {} + } + } + }, + "npu_node_name": "npu_conv2d", + "bench_node_name": "bench_conv2d" + }, + "expected": { + "success": true, + "data": { + "precision_error": 0 + } + } + } + ] +} \ No newline at end of file diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/integration/service/test_graph_service.py b/plugins/tensorboard-plugins/tb_graph_ascend/test/integration/service/test_graph_service.py new file mode 100644 index 0000000000000000000000000000000000000000..173b741bdfef29e08d9b86e7c4979d0dcfe42c9d --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/integration/service/test_graph_service.py @@ -0,0 +1,59 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== + +import pytest +from server.app.service.graph_service import GraphService + + +class TestMatchNodesService: + + @staticmethod + def test_service(meta_data, operation_config): + op_type = operation_config["type"] + expected = operation_config["expected"] + # Run the operation + try: + if op_type == "get_node_info": + result = GraphService.get_node_info( + node_info=operation_config["node_info"], + meta_data=meta_data + ) + elif op_type == "add_match_nodes": + result = GraphService.add_match_nodes( + npu_node_name=operation_config["npu_node_name"], + bench_node_name=operation_config["bench_node_name"], + meta_data=meta_data + ) + elif op_type == "delete_match_nodes": + result = GraphService.delete_match_nodes( + npu_node_name=operation_config["npu_node_name"], + bench_node_name=operation_config["bench_node_name"], + meta_data=meta_data + ) + elif op_type == "get_matched_state_list": + result = GraphService.get_matched_state_list( + meta_data=meta_data + ) + elif op_type == "save_data": + result = GraphService.save_data( + meta_data=meta_data + ) + except Exception as e: + result = {"error": type(e).__name__} + + # Verify the result + assert result == expected, \ + f"Operation {op_type} failed on {operation_config}" diff --git a/plugins/tensorboard-plugins/tb_graph_ascend/test/pytest.ini b/plugins/tensorboard-plugins/tb_graph_ascend/test/pytest.ini new file mode 100644 index 0000000000000000000000000000000000000000..ddb991b93410d7fb17d45cfe3d1e74a9a9b92e3a --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/pytest.ini @@ -0,0 +1,9 @@ +[pytest] +testpaths = tests +python_files = test_*.py +norecursedirs = .* venv build dist +addopts = -v --strict-markers +log_cli = true +log_level = DEBUG +markers = + slow: marks tests as slow (deselect with -m 'not slow') \ No newline at end of file diff --git
a/plugins/tensorboard-plugins/tb_graph_ascend/test/units/controllers/test_match_nodes_controller.py b/plugins/tensorboard-plugins/tb_graph_ascend/test/units/controllers/test_match_nodes_controller.py new file mode 100644 index 0000000000000000000000000000000000000000..652f9c3b9198b90428b1c71fa35841317435a6e9 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_graph_ascend/test/units/controllers/test_match_nodes_controller.py @@ -0,0 +1,89 @@ +# Copyright (c) 2025, Huawei Technologies. +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================== + +import pytest +from server.app.controllers.match_nodes_controller import MatchNodesController +from server.app.utils.graph_utils import GraphUtils + + +class TestMatchNodesController: + + @staticmethod + def test_match_node_controller(ut_test_case): + op_type = ut_test_case.get("type") + input_data = ut_test_case.get('input') + expected = ut_test_case.get("expected") + # Perform the operation + try: + if op_type == "process_md5_task_add": + result = MatchNodesController.process_md5_task_add( + graph_data=input_data.get("graph_data"), + npu_node_name=input_data.get("npu_node_name"), + bench_node_name=input_data.get("bench_node_name"), + ) + elif op_type == "process_md5_task_delete": + result = MatchNodesController.process_md5_task_delete( + graph_data=input_data.get("graph_data"), + npu_node_name=input_data.get("npu_node_name"), + bench_node_name=input_data.get("bench_node_name"), + ) + elif op_type == "process_summary_task_add": + + result = MatchNodesController.process_summary_task_add( + graph_data=input_data.get("graph_data"), + npu_node_name=input_data.get("npu_node_name"), + bench_node_name=input_data.get("bench_node_name"), + ) + elif op_type == "process_summary_task_delete": + result = MatchNodesController.process_summary_task_delete( + npu_data=input_data.get("npu_data"), + bench_data=input_data.get("bench_data"), + npu_node_name=input_data.get("npu_node_name"), + bench_node_name=input_data.get("bench_node_name"), + ) + elif op_type == "calculate_statistical_diff": + result = MatchNodesController.calculate_statistical_diff( + npu_data=input_data.get("npu_data"), + bench_data=input_data.get("bench_data"), + npu_node_name=input_data.get("npu_node_name"), + bench_node_name=input_data.get("bench_node_name"), + ) + elif op_type == "calculate_max_relative_error": + result = MatchNodesController.calculate_max_relative_error( + result=input_data.get("result"), + ) + elif op_type == 
"calculate_md5_diff": + result = MatchNodesController.calculate_md5_diff( + npu_data=input_data.get("npu_data"), + bench_data=input_data.get("bench_data"), + ) + elif op_type == "update_graph_node_data": + result = MatchNodesController.update_graph_node_data( + graph_npu_node_data=input_data.get("graph_npu_node_data"), + statistical_diff=input_data.get("statistical_diff"), + ) + elif op_type == "delete_matched_node_data": + result = MatchNodesController.delete_matched_node_data( + graph_npu_node_data=input_data.get("graph_npu_node_data"), + ) + else: + return + except Exception as e: + result = {"error": type(e).__name__} + + # 验证结果 + assert result == expected, \ + f"Operation {op_type} failed on {ut_test_case}" diff --git a/plugins/tensorboard-plugins/tb_plugin/README.md b/plugins/tensorboard-plugins/tb_plugin/README.md index 152a91b9588bd002813e5fb04da2e8a4c909e9ae..b4b417c4d21899c195674be08d196a5d9f11021a 100644 --- a/plugins/tensorboard-plugins/tb_plugin/README.md +++ b/plugins/tensorboard-plugins/tb_plugin/README.md @@ -17,7 +17,7 @@ 2. 从源代码安装 * 从仓库下载源码: - `git clone https://gitee.com/ascend/att.git` + `git clone https://gitee.com/ascend/mstt.git` * 进入目录 `/plugins/tensorboard_plugins/tb_plugin` 下. 
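* Build the front-end code @@ -128,25 +128,37 @@ ##### Kernel View - Kernel View shows detailed information about operators running on the accelerator cores. + Kernel View shows detailed information about operators running on the accelerator cores. This view contains two pie charts and two tables; use Group By to switch the table data between the operator detail table and the statistics table. - ![Alt text](./docs/images/kernel_view.PNG) + * The pie charts at the top show the share of time taken by the most time-consuming operators (left pie chart) and the percentage of operator execution time spent on each type of accelerator core (right pie chart). - * Calls: number of times the operator was scheduled. + ![Alt text](./docs/images/kernel_view.PNG) - * Accelerator Core: compute core. + * When Group By is set to All, the operator detail table is shown. Some of its fields are described below: - * Block Dim: number of partitions the task is split into, corresponding to the number of cores used when the task runs. + | Field | Description | + | ---------------- | -------------------------------------- | + | Step Id | Identifies the step in which the data was collected | + | Name | Name of the operator running on the NPU | + | Type | Operator type | + | Accelerator Core | AI accelerator core type, including AI Core, AI CPU, and so on | + | Start Time(us) | Start time of the operator execution | + | Duration(us) | Execution time of the current operator | + | Wait Time(us) | Time the operator waited before executing | + | Block Dim | Number of partitions the run is split into, corresponding to the number of cores used when the task executes | ![Alt text](./docs/images/kernel_view_group_by_statistic.PNG) - * Accelerator Core Utilization: percentage of operator execution time spent on each type of core. + * When Group By is set to Statistic, the operator statistics table is shown. It lists execution statistics for each operator; the fields are described below: - * Name: name of the operator running on the NPU. - - * Total Duration, Max Duration, Avg Duration, Min Duration: total, maximum, average, and minimum time of the operator calls. - - This view contains two pie charts and two tables; use Group By to switch the table data between the operator detail table and the statistics table. + | Field | Description | + | ---------------- | -------| + | Name | Name of the operator running on the NPU | + | Calls | Number of times the operator was executed | + | Total Duration(us) | Total execution time of the operator | + | Min Duration(us) | Minimum execution time of the operator | + | Max Duration(us) | Maximum execution time of the operator | + | Avg Duration(us) | Average execution time of the operator | ##### Trace View @@ -162,7 +174,7 @@ ![Alt text](./docs/images/trace_view_launch.PNG) - Select async_nup only to view the association between framework-side operators and the operators executed on Ascend hardware. + Select async_npu only to view the dispatch-and-execution relationship between framework-side operators and the operators executed on Ascend hardware. ![Alt text](./docs/images/trace_view_npu_utilization.PNG) @@ -268,7 +280,7 @@ ###### File Import The interface consists of a left sidebar and a right display area. Click Import Files on the left, or, when no file is selected on the left, click the Import Files text in the center of the right area. The system file explorer then opens, and you can upload the model network training log files to compare. - Note: at most 6 files can currently be uploaded, and a single file must not exceed 50 MB. + **Note: at most 6 files can currently be uploaded, and a single file must not exceed 50 MB.** ![Alt text](./docs/images/accuracy.PNG) ###### Operations on Uploaded Files @@ -319,8 +331,8 @@ * There are three comparison modes, set through Comparison Setting. * Comparison Normal: for the same iteration, the loss value of the later-selected file minus the loss value of the first-selected file. - * Comparison 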
Normal: for the same iteration, the absolute value of the difference between the two files' loss values. - * Comparison Normal: for the same iteration, the absolute value of the difference between the two files' loss values / the loss value of the first-selected file. + * Comparison Absolute: for the same iteration, the absolute value of the difference between the two files' loss values. + * Comparison Relative: for the same iteration, the absolute value of the difference between the two files' loss values / the loss value of the first-selected file. ### Public URL Description diff --git "a/plugins/tensorboard-plugins/tb_plugin/docs/\345\205\254\347\275\221URL\350\257\264\346\230\216.xlsx" "b/plugins/tensorboard-plugins/tb_plugin/docs/\345\205\254\347\275\221URL\350\257\264\346\230\216.xlsx" index b7a8bf1fd0e7eec640e46af76e16c6a228f335ba..de0bb25fe155aa188e5670a377311e96168586e8 100644 Binary files "a/plugins/tensorboard-plugins/tb_plugin/docs/\345\205\254\347\275\221URL\350\257\264\346\230\216.xlsx" and "b/plugins/tensorboard-plugins/tb_plugin/docs/\345\205\254\347\275\221URL\350\257\264\346\230\216.xlsx" differ diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/prettier.json b/plugins/tensorboard-plugins/tb_plugin/fe/prettier.json index 6049640793f6907bbd38c7065360df0ac24d64d4..ef5789da9458a66e7dacc1dfdeeb764642331734 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/prettier.json +++ b/plugins/tensorboard-plugins/tb_plugin/fe/prettier.json @@ -1,12 +1,12 @@ { - "parser": "typescript", - "semi": false, - "singleQuote": true, - "jsxSingleQuote": false, - "bracketSpacing": true, - "tabWidth": 2, - "useTabs": false, - "trailingComma": "none", - "proseWrap": "always", - "endOfLine": "lf" + "parser": "typescript", + "semi": true, + "singleQuote": true, + "jsxSingleQuote": false, + "bracketSpacing": true, + "tabWidth": 2, + "useTabs": false, + "trailingComma": "all", + "proseWrap": "always", + "endOfLine": "lf" } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/scripts/add_header.py b/plugins/tensorboard-plugins/tb_plugin/fe/scripts/add_header.py index 03fb7c15aea6bf361b241910fa4529bc0996286c..69bc6c05541cbaff0fc88eb7456f501fb5bd4f71 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/scripts/add_header.py +++ b/plugins/tensorboard-plugins/tb_plugin/fe/scripts/add_header.py @@ -1,4 +1,23 @@ 
-#!/usr/bin/env python +# ------------------------------------------------------------------------- +# Copyright (c) Microsoft Corporation. +# Copyright(c) 2023 Huawei Technologies. +# All rights reserved +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Modifications: Add visualization of PyTorch Ascend profiling. +# -------------------------------------------------------------------------- +# !/usr/bin/env python import glob import os import sys diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/api.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/api.ts index b00601fba8852eeed9be052c6ed8adc106d49215..29cde96ebbde928cde967b3b1b365d12e74ee734 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/api.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/api.ts @@ -15,7 +15,7 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ @@ -33,11 +33,11 @@ * Do not edit the file manually. 
*/ -import * as url from 'url' -import * as portableFetch from 'portable-fetch' -import { Configuration } from './configuration' +import * as url from 'url'; +import * as portableFetch from 'portable-fetch'; +import { Configuration } from './configuration'; -const BASE_PATH = '.'.replace(/\/+$/, '') +const BASE_PATH = '.'.replace(/\/+$/, ''); /** * @@ -47,8 +47,8 @@ export const COLLECTION_FORMATS = { csv: ',', ssv: ' ', tsv: '\t', - pipes: '|' -} + pipes: '|', +}; /** * @@ -56,7 +56,7 @@ export const COLLECTION_FORMATS = { * @interface FetchAPI */ export interface FetchAPI { - (url: string, init?: any): Promise + (url: string, init?: any): Promise; } /** @@ -65,8 +65,8 @@ export interface FetchAPI { * @interface FetchArgs */ export interface FetchArgs { - url: string - options: any + url: string; + options: any; } /** @@ -75,7 +75,7 @@ export interface FetchArgs { * @class BaseAPI */ export class BaseAPI { - protected configuration: Configuration + protected configuration: Configuration; constructor( configuration?: Configuration, @@ -83,8 +83,8 @@ export class BaseAPI { protected fetch: FetchAPI = portableFetch ) { if (configuration) { - this.configuration = configuration - this.basePath = configuration.basePath || this.basePath + this.configuration = configuration; + this.basePath = configuration.basePath || this.basePath; } } } @@ -96,9 +96,9 @@ export class BaseAPI { * @extends {Error} */ export class RequiredError extends Error { - name: 'RequiredError' + name: 'RequiredError'; constructor(public field: string, msg?: string) { - super(msg) + super(msg); } } @@ -107,7 +107,7 @@ export class RequiredError extends Error { * @export * @interface CallStackTableData */ -export interface CallStackTableData extends Array { } +export interface CallStackTableData extends Array {} /** * * @export @@ -119,67 +119,67 @@ export interface CallStackTableDataInner { * @type {string} * @memberof CallStackTableDataInner */ - name: string + name: string; /** * * @type {string} * 
@memberof CallStackTableDataInner */ - input_shape?: string + input_shape?: string; /** * * @type {number} * @memberof CallStackTableDataInner */ - calls: number + calls: number; /** * * @type {number} * @memberof CallStackTableDataInner */ - device_self_duration?: number + device_self_duration?: number; /** * * @type {number} * @memberof CallStackTableDataInner */ - device_total_duration?: number + device_total_duration?: number; /** * * @type {number} * @memberof CallStackTableDataInner */ - host_self_duration: number + host_self_duration: number; /** * * @type {number} * @memberof CallStackTableDataInner */ - host_total_duration: number + host_total_duration: number; /** * * @type {string} * @memberof CallStackTableDataInner */ - call_stack?: string + call_stack?: string; /** * * @type {string} * @memberof CallStackTableDataInner */ - tc_eligible?: string + tc_eligible?: string; /** * * @type {number} * @memberof CallStackTableDataInner */ - tc_self_ratio?: number + tc_self_ratio?: number; /** * * @type {number} * @memberof CallStackTableDataInner */ - tc_total_ratio?: number + tc_total_ratio?: number; } /** * @@ -192,25 +192,25 @@ export interface DiffNode { * @type {OpStats} * @memberof DiffNode */ - left: OpStats + left: OpStats; /** * * @type {OpStats} * @memberof DiffNode */ - right: OpStats + right: OpStats; /** * * @type {string} * @memberof DiffNode */ - path: string + path: string; /** * * @type {Array} * @memberof DiffNode */ - children: Array + children: Array; } /** * @@ -223,13 +223,13 @@ export interface DistributedGraph { * @type {DistributedGraphMetadata} * @memberof DistributedGraph */ - metadata: DistributedGraphMetadata + metadata: DistributedGraphMetadata; /** * * @type {any} * @memberof DistributedGraph */ - data: any + data: any; } /** * @@ -242,19 +242,19 @@ export interface DistributedGraphMetadata { * @type {string} * @memberof DistributedGraphMetadata */ - title: string + title: string; /** * * @type {Array} * @memberof 
DistributedGraphMetadata */ - legends: Array + legends: Array; /** * * @type {string} * @memberof DistributedGraphMetadata */ - units: string + units: string; } /** * @@ -267,13 +267,13 @@ export interface Environment { * @type {string} * @memberof Environment */ - title: string + title: string; /** * * @type {string} * @memberof Environment */ - value: string + value: string; } /** * @@ -286,13 +286,13 @@ export interface GpuInfo { * @type {GpuInfoMetadata} * @memberof GpuInfo */ - metadata: GpuInfoMetadata + metadata: GpuInfoMetadata; /** * * @type {any} * @memberof GpuInfo */ - data: any + data: any; } /** * @@ -305,7 +305,7 @@ export interface GpuInfoMetadata { * @type {string} * @memberof GpuInfoMetadata */ - title: string + title: string; } /** * @@ -318,13 +318,13 @@ export interface GpuMetric { * @type {string} * @memberof GpuMetric */ - title: string + title: string; /** * * @type {string} * @memberof GpuMetric */ - value: string + value: string; } /** * @@ -337,13 +337,13 @@ export interface GpuMetrics { * @type {Array} * @memberof GpuMetrics */ - data: Array + data: Array; /** * * @type {string} * @memberof GpuMetrics */ - tooltip: string + tooltip: string; } /** * @@ -356,19 +356,19 @@ export interface Graph { * @type {string} * @memberof Graph */ - title?: string + title?: string; /** * * @type {Array} * @memberof Graph */ - columns: Array + columns: Array; /** * * @type {Array>} * @memberof Graph */ - rows: Array> + rows: Array>; } /** * @@ -381,13 +381,13 @@ export interface ValueAndTooltip { * @type {string | number} * @memberof ValueAndTooltip */ - value: string | number + value: string | number; /** * * @type {string} * @memberof ValueAndTooltip */ - tooltip?: string + tooltip?: string; } /** * @@ -400,19 +400,19 @@ export interface StepedGraph { * @type {string} * @memberof StepedGraph */ - title?: string + title?: string; /** * * @type {Array} * @memberof StepedGraph */ - columns: Array + columns: Array; /** * * @type {Array>} * @memberof 
StepedGraph */ - rows: Array> + rows: Array>; } /** * @@ -425,19 +425,19 @@ export interface GraphAscend { * @type {string} * @memberof GraphAscend */ - title?: string + title?: string; /** * * @type {Array} * @memberof GraphAscend */ - columns: Array + columns: Array; /** * * @type {any} * @memberof GraphAscend */ - rows: any + rows: any; } /** * @@ -450,25 +450,25 @@ export interface GraphColumn { * @type {string} * @memberof GraphColumn */ - type: string + type: string; /** * * @type {string} * @memberof GraphColumn */ - name: string + name: string; /** * * @type {string} * @memberof GraphColumn */ - role?: string + role?: string; /** * * @type {GraphColumnP} * @memberof GraphColumn */ - p?: GraphColumnP + p?: GraphColumnP; } /** * @@ -481,7 +481,7 @@ export interface GraphColumnP { * @type {boolean} * @memberof GraphColumnP */ - html?: boolean + html?: boolean; } /** * @@ -494,13 +494,13 @@ export interface InlineResponse200 { * @type {TableMetadata} * @memberof InlineResponse200 */ - metadata: TableMetadata + metadata: TableMetadata; /** * * @type {OperationTableData} * @memberof InlineResponse200 */ - data: OperationTableData + data: OperationTableData; } /** * @@ -513,13 +513,13 @@ export interface InlineResponse2001 { * @type {TableMetadata} * @memberof InlineResponse2001 */ - metadata: TableMetadata + metadata: TableMetadata; /** * * @type {CallStackTableData} * @memberof InlineResponse2001 */ - data: CallStackTableData + data: CallStackTableData; } /** * @@ -532,13 +532,13 @@ export interface InlineResponse2002 { * @type {GpuInfoMetadata} * @memberof InlineResponse2002 */ - metadata: GpuInfoMetadata + metadata: GpuInfoMetadata; /** * * @type {any} * @memberof InlineResponse2002 */ - data: any + data: any; } /** * @@ -551,8 +551,8 @@ export interface KernelGraph { * @type {Graph} * @memberof KernelGraph */ - total: Graph, - device_target: string + total: Graph; + device_target: string; } /** * @@ -565,50 +565,50 @@ export interface KeyedColumn { * @type 
{string} * @memberof KeyedColumn */ - type: string + type: string; /** * * @type {string} * @memberof KeyedColumn */ - name: string + name: string; /** * * @type {string} * @memberof KeyedColumn */ - key: string + key: string; } /** - * + * * @export * @interface MemoryCurveDataAll */ export interface MemoryCurveDataAll { /** - * + * * @type {string} * @memberof MemoryCurveDataAll */ - default_device: string + default_device: string; /** - * + * * @type {Array} * @memberof MemoryCurveDataAll */ - devices: Array + devices: Array; /** * * @type {MemoryCurveDataAscend} * @memberof MemoryCurveDataAll */ - total: MemoryCurveDataAscend + total: MemoryCurveDataAscend; /** * * @type {MemoryCurveDataAscend} * @memberof MemoryCurveDataAll */ - ptaGe: MemoryCurveDataAscend + ptaGe: MemoryCurveDataAscend; } /** * @@ -621,19 +621,19 @@ export interface MemoryCurveData { * @type {MemoryCurveDataMetadata} * @memberof MemoryCurveData */ - metadata: MemoryCurveDataMetadata + metadata: MemoryCurveDataMetadata; /** * * @type {Array} * @memberof MemoryCurveData */ - columns: Array + columns: Array; /** * * @type {any} * @memberof MemoryCurveData */ - rows: any + rows: any; } /** * @@ -646,19 +646,19 @@ export interface MemoryCurveDataAscend { * @type {MemoryCurveDataMetadata} * @memberof MemoryCurveDataAscend */ - metadata: MemoryCurveDataMetadata + metadata: MemoryCurveDataMetadata; /** * * @type {any} * @memberof MemoryCurveDataAscend */ - columns: any + columns: any; /** * * @type {any} * @memberof MemoryCurveDataAscend */ - rows: any + rows: any; } /** * @@ -671,55 +671,55 @@ export interface MemoryCurveDataMetadata { * @type {string} * @memberof MemoryCurveDataMetadata */ - default_device: string + default_device: string; /** * * @type {Array} * @memberof MemoryCurveDataMetadata */ - devices: Array + devices: Array; /** * * @type {any} * @memberof MemoryCurveDataMetadata */ - peaks: any + peaks: any; /** * * @type {any} * @memberof MemoryCurveDataMetadata */ - totals: any + 
totals: any; /** * * @type {number} * @memberof MemoryCurveDataMetadata */ - first_ts: number + first_ts: number; /** * * @type {string} * @memberof MemoryCurveDataMetadata */ - time_metric: string + time_metric: string; /** * * @type {string} * @memberof MemoryCurveDataMetadata */ - memory_metric: string + memory_metric: string; /** * * @type {number} * @memberof MemoryCurveDataMetadata */ - time_factor: number + time_factor: number; /** * * @type {number} * @memberof MemoryCurveDataMetadata */ - memory_factor: number + memory_factor: number; } /** * @@ -732,38 +732,38 @@ export interface MemoryEventsData { * @type {MemoryEventsTableMetadata} * @memberof MemoryEventsData */ - metadata: MemoryEventsTableMetadata + metadata: MemoryEventsTableMetadata; /** * * @type {Array} * @memberof MemoryEventsData */ - columns: Array + columns: Array; /** * * @type {any} * @memberof MemoryEventsData */ - rows: any + rows: any; } /** - * + * * @exports * @interface MemoryEventsDataAll */ export interface MemoryEventsDataAll { /** - * + * * @type {MemoryEventsData} * @memberof MemoryEventsDataAll */ - operator: MemoryEventsData + operator: MemoryEventsData; /** - * + * * @type {MemoryEventsData} * @memberof MemoryEventsDataAll */ - component: MemoryEventsData + component: MemoryEventsData; } /** * @@ -776,25 +776,25 @@ export interface MemoryEventsTableMetadata { * @type {string} * @memberof MemoryEventsTableMetadata */ - title: string + title: string; /** * * @type {string} * @memberof MemoryEventsTableMetadata */ - default_device: string + default_device: string; /** * * @type {string} * @memberof MemoryEventsTableMetadata */ - search?: string + search?: string; /** * * @type {string} * @memberof MemoryEventsTableMetadata */ - sort?: string + sort?: string; } /** * @@ -807,19 +807,19 @@ export interface MemoryStatsData { * @type {MemoryStatsTableMetadata} * @memberof MemoryStatsData */ - metadata: MemoryStatsTableMetadata + metadata: MemoryStatsTableMetadata; /** * * @type 
{Array} * @memberof MemoryStatsData */ - columns: Array + columns: Array; /** * * @type {any} * @memberof MemoryStatsData */ - rows: any + rows: any; } /** * @@ -832,25 +832,25 @@ export interface MemoryStatsTableMetadata { * @type {string} * @memberof MemoryStatsTableMetadata */ - title: string + title: string; /** * * @type {string} * @memberof MemoryStatsTableMetadata */ - default_device: string + default_device: string; /** * * @type {string} * @memberof MemoryStatsTableMetadata */ - search: string + search: string; /** * * @type {string} * @memberof MemoryStatsTableMetadata */ - sort: string + sort: string; } /** * @@ -863,61 +863,61 @@ export interface ModuleStats { * @type {string} * @memberof ModuleStats */ - name: string + name: string; /** * * @type {string} * @memberof ModuleStats */ - id: string + id: string; /** * * @type {number} * @memberof ModuleStats */ - occurences: number + occurences: number; /** * * @type {number} * @memberof ModuleStats */ - operators: number + operators: number; /** * * @type {number} * @memberof ModuleStats */ - host_duration: number + host_duration: number; /** * * @type {number} * @memberof ModuleStats */ - self_host_duration: number + self_host_duration: number; /** * * @type {number} * @memberof ModuleStats */ - device_duration: number + device_duration: number; /** * * @type {number} * @memberof ModuleStats */ - self_device_duration: number + self_device_duration: number; /** * * @type {number} * @memberof ModuleStats */ - avg_duration: number + avg_duration: number; /** * * @type {Array} * @memberof ModuleStats */ - children: Array + children: Array; } /** * @@ -930,13 +930,13 @@ export interface ModuleViewData { * @type {Array} * @memberof ModuleViewData */ - columns: Array + columns: Array; /** * * @type {Array} * @memberof ModuleViewData */ - data: Array + data: Array; } /** * @@ -949,37 +949,37 @@ export interface OpAgg { * @type {string} * @memberof OpAgg */ - name: string + name: string; /** * * @type {number} * 
@memberof OpAgg */ - calls: number + calls: number; /** * * @type {number} * @memberof OpAgg */ - host_duration: number + host_duration: number; /** * * @type {number} * @memberof OpAgg */ - device_duration: number + device_duration: number; /** * * @type {number} * @memberof OpAgg */ - self_host_duration: number + self_host_duration: number; /** * * @type {number} * @memberof OpAgg */ - self_device_duration: number + self_device_duration: number; } /** * @@ -992,38 +992,38 @@ export interface OpStats { * @type {string} * @memberof OpStats */ - name: string + name: string; /** * * @type {number} * @memberof OpStats */ - duration: number + duration: number; /** * * @type {number} * @memberof OpStats */ - device_duration: number + device_duration: number; /** * * @type {number} * @memberof OpStats */ - total_duration: number + total_duration: number; /** * * @type {Array} * @memberof OpStats */ - aggs: Array + aggs: Array; } /** * * @export * @interface OperationTableData */ -export interface OperationTableData extends Array { } +export interface OperationTableData extends Array {} /** * * @export @@ -1035,67 +1035,67 @@ export interface OperationTableDataInner { * @type {string} * @memberof OperationTableDataInner */ - name: string + name: string; /** * * @type {string} * @memberof OperationTableDataInner */ - input_shape?: string + input_shape?: string; /** * * @type {number} * @memberof OperationTableDataInner */ - calls: number + calls: number; /** * * @type {number} * @memberof OperationTableDataInner */ - device_self_duration?: number + device_self_duration?: number; /** * * @type {number} * @memberof OperationTableDataInner */ - device_total_duration?: number + device_total_duration?: number; /** * * @type {number} * @memberof OperationTableDataInner */ - host_self_duration: number + host_self_duration: number; /** * * @type {number} * @memberof OperationTableDataInner */ - host_total_duration: number + host_total_duration: number; /** * * @type {boolean} * 
@memberof OperationTableDataInner */ - has_call_stack: boolean + has_call_stack: boolean; /** * * @type {string} * @memberof OperationTableDataInner */ - tc_eligible?: string + tc_eligible?: string; /** * * @type {number} * @memberof OperationTableDataInner */ - tc_self_ratio?: number + tc_self_ratio?: number; /** * * @type {number} * @memberof OperationTableDataInner */ - tc_total_ratio?: number + tc_total_ratio?: number; } /** * @@ -1108,25 +1108,25 @@ export interface OperatorGraph { * @type {Graph} * @memberof OperatorGraph */ - device_total_time: Graph + device_total_time: Graph; /** * * @type {Graph} * @memberof OperatorGraph */ - device_self_time: Graph + device_self_time: Graph; /** * * @type {Graph} * @memberof OperatorGraph */ - host_total_time: Graph + host_total_time: Graph; /** * * @type {Graph} * @memberof OperatorGraph */ - host_self_time: Graph + host_self_time: Graph; } /** * @@ -1139,37 +1139,37 @@ export interface OperatorNode { * @type {string} * @memberof OperatorNode */ - name: string + name: string; /** * * @type {number} * @memberof OperatorNode */ - start_time: number + start_time: number; /** * * @type {number} * @memberof OperatorNode */ - end_time: number + end_time: number; /** * * @type {string} * @memberof OperatorNode */ - type: string + type: string; /** * * @type {number} * @memberof OperatorNode */ - tid: number + tid: number; /** * * @type {Array} * @memberof OperatorNode */ - children: Array + children: Array; } /** * @@ -1182,31 +1182,31 @@ export interface Overview { * @type {Array} * @memberof Overview */ - performance: Array + performance: Array; /** * * @type {Array} * @memberof Overview */ - environments: Array + environments: Array; /** * * @type {StepedGraph} * @memberof Overview */ - steps: StepedGraph + steps: StepedGraph; /** * * @type {string} * @memberof Overview */ - recommendations: string + recommendations: string; /** * * @type {GpuMetrics} * @memberof Overview */ - gpu_metrics?: GpuMetrics + gpu_metrics?: 
GpuMetrics; } /** * @@ -1219,31 +1219,31 @@ export interface Performance { * @type {string} * @memberof Performance */ - name: string + name: string; /** * * @type {string} * @memberof Performance */ - description?: string + description?: string; /** * * @type {string} * @memberof Performance */ - value?: string + value?: string; /** * * @type {string} * @memberof Performance */ - extra?: string + extra?: string; /** * * @type {Array} * @memberof Performance */ - children?: Array + children?: Array; } /** * @@ -1256,13 +1256,13 @@ export interface Runs { * @type {Array} * @memberof Runs */ - runs: Array + runs: Array; /** * * @type {boolean} * @memberof Runs */ - loading: boolean + loading: boolean; } /** * @@ -1275,13 +1275,13 @@ export interface TableData { * @type {Graph} * @memberof TableData */ - data: Graph + data: Graph; /** * * @type {TableMetadata} * @memberof TableData */ - metadata: TableMetadata + metadata: TableMetadata; } /** * @@ -1294,13 +1294,13 @@ export interface TableMetadata { * @type {string} * @memberof TableMetadata */ - sort: string + sort: string; /** * * @type {any} * @memberof TableMetadata */ - tooltips?: any + tooltips?: any; } /** * @@ -1313,7 +1313,7 @@ export interface TensorCoresGraph { * @type {Graph} * @memberof TensorCoresGraph */ - total: Graph + total: Graph; } /** * @@ -1326,32 +1326,32 @@ export interface ValueAndFormat { * @type {string | number | boolean} * @memberof ValueAndFormat */ - v: string | number | boolean + v: string | number | boolean; /** * * @type {string} * @memberof ValueAndFormat */ - f: string + f: string; } /** - * + * * @exports * @interface Views */ export interface Views { /** - * + * * @type {string} * @memberof Views */ - device_target: string + device_target: string; /** - * + * * @type {Array} * @memberof Views */ - views: Array + views: Array; } /** * DefaultApi - fetch parameter creator @@ -1388,75 +1388,75 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 
'Required parameter run was null or undefined when calling diffnodeGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling diffnodeGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling diffnodeGet.' - ) + ); } // verify required parameter 'exp_run' is not null or undefined if (exp_run === null || exp_run === undefined) { throw new RequiredError( 'exp_run', 'Required parameter exp_run was null or undefined when calling diffnodeGet.' - ) + ); } // verify required parameter 'exp_worker' is not null or undefined if (exp_worker === null || exp_worker === undefined) { throw new RequiredError( 'exp_worker', 'Required parameter exp_worker was null or undefined when calling diffnodeGet.' - ) + ); } // verify required parameter 'exp_span' is not null or undefined if (exp_span === null || exp_span === undefined) { throw new RequiredError( 'exp_span', 'Required parameter exp_span was null or undefined when calling diffnodeGet.' 
- ) + ); } - const localVarPath = `/diffnode` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/diffnode`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (exp_run !== undefined) { - localVarQueryParameter['exp_run'] = exp_run + localVarQueryParameter.exp_run = exp_run; } if (exp_worker !== undefined) { - localVarQueryParameter['exp_worker'] = exp_worker + localVarQueryParameter.exp_worker = exp_worker; } if (exp_span !== undefined) { - localVarQueryParameter['exp_span'] = exp_span + localVarQueryParameter.exp_span = exp_span; } if (path !== undefined) { - localVarQueryParameter['path'] = path + localVarQueryParameter.path = path; } localVarUrlObj.query = Object.assign( @@ -1464,19 +1464,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1497,38 +1497,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 
'run', 'Required parameter run was null or undefined when calling distributedCommopsGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling distributedCommopsGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling distributedCommopsGet.' - ) + ); } - const localVarPath = `/distributed/commops` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/distributed/commops`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -1536,19 +1536,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ 
-1569,38 +1569,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling distributedGpuinfoGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling distributedGpuinfoGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling distributedGpuinfoGet.' - ) + ); } - const localVarPath = `/distributed/gpuinfo` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/distributed/gpuinfo`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -1608,19 +1608,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: 
url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1641,38 +1641,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling distributedOverlapGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling distributedOverlapGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling distributedOverlapGet.' - ) + ); } - const localVarPath = `/distributed/overlap` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/distributed/overlap`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -1680,19 +1680,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; 
localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1713,38 +1713,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling distributedWaittimeGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling distributedWaittimeGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling distributedWaittimeGet.' - ) + ); } - const localVarPath = `/distributed/waittime` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/distributed/waittime`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -1752,19 +1752,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: 
https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1787,49 +1787,49 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling kernelGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling kernelGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling kernelGet.' - ) + ); } // verify required parameter 'group_by' is not null or undefined if (group_by === null || group_by === undefined) { throw new RequiredError( 'group_by', 'Required parameter group_by was null or undefined when calling kernelGet.' 
- ) + ); } - const localVarPath = `/kernel` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/kernel`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (group_by !== undefined) { - localVarQueryParameter['group_by'] = group_by + localVarQueryParameter.group_by = group_by; } localVarUrlObj.query = Object.assign( @@ -1837,19 +1837,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1872,42 +1872,42 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling kernelTableGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling kernelTableGet.' 
- ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling kernelTableGet.' - ) + ); } - const localVarPath = `/kernel/table` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/kernel/table`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (group_by !== undefined) { - localVarQueryParameter['group_by'] = group_by + localVarQueryParameter.group_by = group_by; } localVarUrlObj.query = Object.assign( @@ -1915,19 +1915,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -1948,38 +1948,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling kernelTcPieGet.' 
- ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling kernelTcPieGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling kernelTcPieGet.' - ) + ); } - const localVarPath = `/kernel/tc_pie` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/kernel/tc_pie`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -1987,19 +1987,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2020,38 +2020,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required 
parameter run was null or undefined when calling memoryCurveGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling memoryCurveGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling memoryCurveGet.' - ) + ); } - const localVarPath = `/memory_curve` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/memory_curve`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -2059,19 +2059,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2096,46 +2096,46 @@ export const 
DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling memoryEventsGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling memoryEventsGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling memoryEventsGet.' - ) + ); } - const localVarPath = `/memory_events` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/memory_events`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (start_ts !== undefined) { - localVarQueryParameter['start_ts'] = start_ts + localVarQueryParameter.start_ts = start_ts; } if (end_ts !== undefined) { - localVarQueryParameter['end_ts'] = end_ts + localVarQueryParameter.end_ts = end_ts; } localVarUrlObj.query = Object.assign( @@ -2143,19 +2143,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete 
localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2180,46 +2180,46 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling memoryGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling memoryGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling memoryGet.' - ) + ); } - const localVarPath = `/memory` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/memory`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (start_ts !== undefined) { - localVarQueryParameter['start_ts'] = start_ts + localVarQueryParameter.start_ts = start_ts; } if (end_ts !== undefined) { - localVarQueryParameter['end_ts'] = end_ts + localVarQueryParameter.end_ts = end_ts; } 
localVarUrlObj.query = Object.assign( @@ -2227,19 +2227,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2260,38 +2260,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling moduleGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling moduleGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling moduleGet.' 
- ) + ); } - const localVarPath = `/module` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/module`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -2299,19 +2299,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2334,49 +2334,49 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling operationGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling operationGet.' 
- ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling operationGet.' - ) + ); } // verify required parameter 'group_by' is not null or undefined if (group_by === null || group_by === undefined) { throw new RequiredError( 'group_by', 'Required parameter group_by was null or undefined when calling operationGet.' - ) + ); } - const localVarPath = `/operation` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/operation`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (group_by !== undefined) { - localVarQueryParameter['group_by'] = group_by + localVarQueryParameter.group_by = group_by; } localVarUrlObj.query = Object.assign( @@ -2384,19 +2384,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ 
-2423,64 +2423,64 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling operationStackGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling operationStackGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling operationStackGet.' - ) + ); } // verify required parameter 'group_by' is not null or undefined if (group_by === null || group_by === undefined) { throw new RequiredError( 'group_by', 'Required parameter group_by was null or undefined when calling operationStackGet.' - ) + ); } // verify required parameter 'op_name' is not null or undefined if (op_name === null || op_name === undefined) { throw new RequiredError( 'op_name', 'Required parameter op_name was null or undefined when calling operationStackGet.' 
- ) + ); } - const localVarPath = `/operation/stack` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/operation/stack`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (group_by !== undefined) { - localVarQueryParameter['group_by'] = group_by + localVarQueryParameter.group_by = group_by; } if (op_name !== undefined) { - localVarQueryParameter['op_name'] = op_name + localVarQueryParameter.op_name = op_name; } if (input_shape !== undefined) { - localVarQueryParameter['input_shape'] = input_shape + localVarQueryParameter.input_shape = input_shape; } localVarUrlObj.query = Object.assign( @@ -2488,19 +2488,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2523,49 +2523,49 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling operationTableGet.' 
- ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling operationTableGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling operationTableGet.' - ) + ); } // verify required parameter 'group_by' is not null or undefined if (group_by === null || group_by === undefined) { throw new RequiredError( 'group_by', 'Required parameter group_by was null or undefined when calling operationTableGet.' - ) + ); } - const localVarPath = `/operation/table` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/operation/table`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } if (group_by !== undefined) { - localVarQueryParameter['group_by'] = group_by + localVarQueryParameter.group_by = group_by; } localVarUrlObj.query = Object.assign( @@ -2573,19 +2573,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + 
delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2606,38 +2606,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling overviewGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling overviewGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling overviewGet.' - ) + ); } - const localVarPath = `/overview` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/overview`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -2645,19 +2645,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: 
https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2665,30 +2665,30 @@ export const DefaultApiFetchParamCreator = function ( * @throws {RequiredError} */ runsGet(options: any = {}): FetchArgs { - const localVarPath = `/runs` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/runs`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; localVarUrlObj.query = Object.assign( {}, localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2703,27 +2703,27 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling spansGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling spansGet.' 
- ) + ); } - const localVarPath = `/spans` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/spans`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } localVarUrlObj.query = Object.assign( @@ -2731,19 +2731,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2764,38 +2764,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling traceGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling traceGet.' - ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling traceGet.' 
- ) + ); } - const localVarPath = `/trace` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/trace`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -2803,19 +2803,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2836,38 +2836,38 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling treeGet.' - ) + ); } // verify required parameter 'worker' is not null or undefined if (worker === null || worker === undefined) { throw new RequiredError( 'worker', 'Required parameter worker was null or undefined when calling treeGet.' 
- ) + ); } // verify required parameter 'span' is not null or undefined if (span === null || span === undefined) { throw new RequiredError( 'span', 'Required parameter span was null or undefined when calling treeGet.' - ) + ); } - const localVarPath = `/tree` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/tree`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (worker !== undefined) { - localVarQueryParameter['worker'] = worker + localVarQueryParameter.worker = worker; } if (span !== undefined) { - localVarQueryParameter['span'] = span + localVarQueryParameter.span = span; } localVarUrlObj.query = Object.assign( @@ -2875,19 +2875,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2901,16 +2901,16 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling viewsGet.' 
- ) + ); } - const localVarPath = `/views` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/views`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } localVarUrlObj.query = Object.assign( @@ -2918,19 +2918,19 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } + options: localVarRequestOptions, + }; }, /** * @@ -2945,27 +2945,27 @@ export const DefaultApiFetchParamCreator = function ( throw new RequiredError( 'run', 'Required parameter run was null or undefined when calling workersGet.' - ) + ); } // verify required parameter 'view' is not null or undefined if (view === null || view === undefined) { throw new RequiredError( 'view', 'Required parameter view was null or undefined when calling workersGet.' 
- ) + ); } - const localVarPath = `/workers` - const localVarUrlObj = url.parse(localVarPath, true) - const localVarRequestOptions = Object.assign({ method: 'GET' }, options) - const localVarHeaderParameter = {} as any - const localVarQueryParameter = {} as any + const localVarPath = `/workers`; + const localVarUrlObj = url.parse(localVarPath, true); + const localVarRequestOptions = Object.assign({ method: 'GET' }, options); + const localVarHeaderParameter = {} as any; + const localVarQueryParameter = {} as any; if (run !== undefined) { - localVarQueryParameter['run'] = run + localVarQueryParameter.run = run; } if (view !== undefined) { - localVarQueryParameter['view'] = view + localVarQueryParameter.view = view; } localVarUrlObj.query = Object.assign( @@ -2973,22 +2973,22 @@ export const DefaultApiFetchParamCreator = function ( localVarUrlObj.query, localVarQueryParameter, options.query - ) + ); // fix override query string Detail: https://stackoverflow.com/a/7517673/1077943 - delete localVarUrlObj.search + delete localVarUrlObj.search; localVarRequestOptions.headers = Object.assign( {}, localVarHeaderParameter, options.headers - ) + ); return { url: url.format(localVarUrlObj), - options: localVarRequestOptions - } - } - } -} + options: localVarRequestOptions, + }; + }, + }; +}; /** * DefaultApi - functional programming interface @@ -3029,7 +3029,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { exp_span, path, options - ) + ); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3039,12 +3039,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3062,7 +3062,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, 
basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).distributedCommopsGet(run, worker, span, options) + ).distributedCommopsGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3072,12 +3072,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3095,7 +3095,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).distributedGpuinfoGet(run, worker, span, options) + ).distributedGpuinfoGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3105,12 +3105,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3128,7 +3128,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).distributedOverlapGet(run, worker, span, options) + ).distributedOverlapGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3138,12 +3138,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response 
+ throw response; } - }) - } + }); + }; }, /** * @@ -3161,7 +3161,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).distributedWaittimeGet(run, worker, span, options) + ).distributedWaittimeGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3171,12 +3171,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3196,7 +3196,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).kernelGet(run, worker, span, group_by, options) + ).kernelGet(run, worker, span, group_by, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3206,12 +3206,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3231,7 +3231,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).kernelTableGet(run, worker, span, group_by, options) + ).kernelTableGet(run, worker, span, group_by, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3241,12 +3241,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { 
localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3264,7 +3264,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).kernelTcPieGet(run, worker, span, options) + ).kernelTcPieGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3274,12 +3274,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3294,10 +3294,13 @@ export const DefaultApiFp = function (configuration?: Configuration) { worker: string, span: string, options?: any - ): (fetch?: FetchAPI, basePath?: string) => Promise { + ): ( + fetch?: FetchAPI, + basePath?: string + ) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).memoryCurveGet(run, worker, span, options) + ).memoryCurveGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3307,12 +3310,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3331,10 +3334,13 @@ export const DefaultApiFp = function (configuration?: Configuration) { start_ts?: number, end_ts?: number, options?: any - ): (fetch?: FetchAPI, basePath?: string) => Promise { + ): ( + fetch?: FetchAPI, + basePath?: 
string + ) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).memoryEventsGet(run, worker, span, start_ts, end_ts, options) + ).memoryEventsGet(run, worker, span, start_ts, end_ts, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3344,12 +3350,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3371,7 +3377,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).memoryGet(run, worker, span, start_ts, end_ts, options) + ).memoryGet(run, worker, span, start_ts, end_ts, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3381,12 +3387,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3404,7 +3410,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).moduleGet(run, worker, span, options) + ).moduleGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3414,12 +3420,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw 
response + throw response; } - }) - } + }); + }; }, /** * @@ -3439,7 +3445,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).operationGet(run, worker, span, group_by, options) + ).operationGet(run, worker, span, group_by, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3449,12 +3455,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3486,7 +3492,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { op_name, input_shape, options - ) + ); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3496,12 +3502,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3521,7 +3527,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).operationTableGet(run, worker, span, group_by, options) + ).operationTableGet(run, worker, span, group_by, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3531,12 +3537,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + 
throw response; } - }) - } + }); + }; }, /** * @@ -3554,7 +3560,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).overviewGet(run, worker, span, options) + ).overviewGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3564,12 +3570,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3579,9 +3585,8 @@ export const DefaultApiFp = function (configuration?: Configuration) { runsGet( options?: any ): (fetch?: FetchAPI, basePath?: string) => Promise { - const localVarFetchArgs = DefaultApiFetchParamCreator( - configuration - ).runsGet(options) + const localVarFetchArgs = + DefaultApiFetchParamCreator(configuration).runsGet(options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3591,12 +3596,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3612,7 +3617,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise> { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).spansGet(run, worker, options) + ).spansGet(run, worker, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3622,12 +3627,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => 
{ if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3645,7 +3650,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).traceGet(run, worker, span, options) + ).traceGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3655,12 +3660,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3678,7 +3683,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).treeGet(run, worker, span, options) + ).treeGet(run, worker, span, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3688,12 +3693,12 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3707,7 +3712,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).viewsGet(run, options) + ).viewsGet(run, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3717,12 +3722,12 @@ export const DefaultApiFp = function (configuration?: 
Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } + }); + }; }, /** * @@ -3738,7 +3743,7 @@ export const DefaultApiFp = function (configuration?: Configuration) { ): (fetch?: FetchAPI, basePath?: string) => Promise> { const localVarFetchArgs = DefaultApiFetchParamCreator( configuration - ).workersGet(run, view, options) + ).workersGet(run, view, options); return ( fetch: FetchAPI = portableFetch, basePath: string = BASE_PATH @@ -3748,15 +3753,15 @@ export const DefaultApiFp = function (configuration?: Configuration) { localVarFetchArgs.options ).then((response) => { if (response.status >= 200 && response.status < 300) { - return response.json() + return response.json(); } else { - throw response + throw response; } - }) - } - } - } -} + }); + }; + }, + }; +}; /** * DefaultApi - factory interface @@ -3799,7 +3804,7 @@ export const DefaultApiFactory = function ( exp_span, path, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3820,7 +3825,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3841,7 +3846,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3862,7 +3867,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3883,7 +3888,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3907,7 +3912,7 @@ export const DefaultApiFactory = function ( span, group_by, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3931,7 +3936,7 @@ export const DefaultApiFactory = function ( span, group_by, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3947,7 +3952,7 @@ export 
const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3963,7 +3968,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -3990,7 +3995,7 @@ export const DefaultApiFactory = function ( start_ts, end_ts, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4017,7 +4022,7 @@ export const DefaultApiFactory = function ( start_ts, end_ts, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4033,7 +4038,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4057,7 +4062,7 @@ export const DefaultApiFactory = function ( span, group_by, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4087,7 +4092,7 @@ export const DefaultApiFactory = function ( op_name, input_shape, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4111,7 +4116,7 @@ export const DefaultApiFactory = function ( span, group_by, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4127,7 +4132,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4135,7 +4140,7 @@ export const DefaultApiFactory = function ( * @throws {RequiredError} */ runsGet(options?: any) { - return DefaultApiFp(configuration).runsGet(options)(fetch, basePath) + return DefaultApiFp(configuration).runsGet(options)(fetch, basePath); }, /** * @@ -4149,7 +4154,7 @@ export const DefaultApiFactory = function ( run, worker, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4165,7 +4170,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4181,7 +4186,7 @@ export const DefaultApiFactory = function ( worker, span, options - )(fetch, basePath) + )(fetch, basePath); }, /** * @@ -4190,7 +4195,10 @@ export const DefaultApiFactory = 
function ( * @throws {RequiredError} */ viewsGet(run: string, options?: any) { - return DefaultApiFp(configuration).viewsGet(run, options)(fetch, basePath) + return DefaultApiFp(configuration).viewsGet(run, options)( + fetch, + basePath + ); }, /** * @@ -4204,10 +4212,10 @@ export const DefaultApiFactory = function ( run, view, options - )(fetch, basePath) - } - } -} + )(fetch, basePath); + }, + }; +}; /** * DefaultApi - object-oriented interface @@ -4248,7 +4256,7 @@ export class DefaultApi extends BaseAPI { exp_span, path, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4271,7 +4279,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4294,7 +4302,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4317,7 +4325,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4340,7 +4348,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4366,7 +4374,7 @@ export class DefaultApi extends BaseAPI { span, group_by, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4392,7 +4400,7 @@ export class DefaultApi extends BaseAPI { span, group_by, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4415,7 +4423,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4438,7 +4446,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4467,7 +4475,7 @@ export class DefaultApi extends BaseAPI { start_ts, end_ts, options - )(this.fetch, this.basePath) + )(this.fetch, 
this.basePath); } /** @@ -4496,7 +4504,7 @@ export class DefaultApi extends BaseAPI { start_ts, end_ts, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4514,7 +4522,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4540,7 +4548,7 @@ export class DefaultApi extends BaseAPI { span, group_by, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4572,7 +4580,7 @@ export class DefaultApi extends BaseAPI { op_name, input_shape, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4598,7 +4606,7 @@ export class DefaultApi extends BaseAPI { span, group_by, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4616,7 +4624,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4629,7 +4637,7 @@ export class DefaultApi extends BaseAPI { return DefaultApiFp(this.configuration).runsGet(options)( this.fetch, this.basePath - ) + ); } /** @@ -4645,7 +4653,7 @@ export class DefaultApi extends BaseAPI { run, worker, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4663,7 +4671,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4681,7 +4689,7 @@ export class DefaultApi extends BaseAPI { worker, span, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } /** @@ -4695,7 +4703,7 @@ export class DefaultApi extends BaseAPI { return DefaultApiFp(this.configuration).viewsGet(run, options)( this.fetch, this.basePath - ) + ); } /** @@ -4711,6 +4719,6 @@ export class DefaultApi extends BaseAPI { run, view, options - )(this.fetch, this.basePath) + )(this.fetch, this.basePath); } } diff --git 
a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/configuration.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/configuration.ts index edec57eed84498fa3dcaa804ada9787b0202066c..85b77bf651c049ec5a2ec85379414f619904c6dd 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/configuration.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/configuration.ts @@ -14,13 +14,12 @@ * https://github.com/swagger-api/swagger-codegen.git * Do not edit the file manually. */ - export interface ConfigurationParameters { - apiKey?: string | ((name: string) => string) - username?: string - password?: string - accessToken?: string | ((name: string, scopes?: string[]) => string) - basePath?: string + apiKey?: string | ((name: string) => string); + username?: string; + password?: string; + accessToken?: string | ((name: string, scopes?: string[]) => string); + basePath?: string; } export class Configuration { @@ -29,41 +28,41 @@ export class Configuration { * @param name security name * @memberof Configuration */ - apiKey?: string | ((name: string) => string) + apiKey?: string | ((name: string) => string); /** * parameter for basic security * * @type {string} * @memberof Configuration */ - username?: string + username?: string; /** * parameter for basic security * * @type {string} * @memberof Configuration */ - password?: string + password?: string; /** * parameter for oauth2 security * @param name security name * @param scopes oauth2 scope * @memberof Configuration */ - accessToken?: string | ((name: string, scopes?: string[]) => string) + accessToken?: string | ((name: string, scopes?: string[]) => string); /** * override base path * * @type {string} * @memberof Configuration */ - basePath?: string + basePath?: string; constructor(param: ConfigurationParameters = {}) { - this.apiKey = param.apiKey - this.username = param.username - this.password = param.password - this.accessToken = param.accessToken - this.basePath = 
param.basePath + this.apiKey = param.apiKey; + this.username = param.username; + this.password = param.password; + this.accessToken = param.accessToken; + this.basePath = param.basePath; } } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/custom.d.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/custom.d.ts index bfe6a59d9df208845d2fb5a43edb7a2f3d8721ae..992af468898f15bee4f609a8cb752e21f0a9ad48 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/custom.d.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/custom.d.ts @@ -2,5 +2,5 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -declare module 'portable-fetch' -declare module 'url' +declare module 'portable-fetch'; +declare module 'url'; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/index.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/index.ts index 1ab79fb65f34d7c33099bac7e54378c3f54fdb35..7ad784e60de2777174cea9d902ad9cf2550fad68 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/index.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/generated/index.ts @@ -14,6 +14,5 @@ * https://github.com/swagger-api/swagger-codegen.git * Do not edit the file manually. */ - -export * from './api' -export * from './configuration' +export * from './api'; +export * from './configuration'; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/index.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/index.ts index f43336a583b81998422facba8787270d6cee7673..98b35abfbc09785ffa09b1bbaa48c73685ec84f5 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/index.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/index.ts @@ -2,7 +2,7 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
 *--------------------------------------------------------------------------------------------*/ -import * as api from './generated' +import * as api from './generated'; -export const defaultApi = new api.DefaultApi(undefined, undefined, fetch) -export * from './generated/api' +export const defaultApi = new api.DefaultApi(undefined, undefined, fetch); +export * from './generated/api'; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/mock.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/mock.ts index 744c222a0266eed6359bb60fc0f6ba9601ba8edc..4b4b447d97192b7c7c00784dd9176faeed25d64b 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/api/mock.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/api/mock.ts @@ -6,8 +6,8 @@ export class MockAPI { runsGet() { return { runs: ['resnet50_num_workers_0', 'resnet50_num_workers_4'], - loading: false - } + loading: false, + }; } viewsGet(run: string) { @@ -16,16 +16,16 @@ export class MockAPI { 'Operator', 'Kernel', 'Trace', - 'Memory' - ]) + 'Memory', + ]); } - spansGet(run: string, view: String) { - return Promise.resolve(['1', '2']) + spansGet(run: string, view: string): Promise<string[]> { + return Promise.resolve(['1', '2']); } - workersGet(run: string, view: String) { - return Promise.resolve(['worker0']) + workersGet(run: string, view: string): Promise<string[]> { + return Promise.resolve(['worker0']); } overviewGet(run: string, worker: string, span: string) { @@ -46,7 +46,7 @@ export class MockAPI { { type: 'number', name: 'CPU Exec' }, { type: 'string', role: 'tooltip', p: { html: 'true' } }, { type: 'number', name: 'Other' }, - { type: 'string', role: 'tooltip', p: { html: 'true' } } + { type: 'string', role: 'tooltip', p: { html: 'true' } }, ], rows: [ [ @@ -64,7 +64,7 @@ export class MockAPI { 14091, '

Step 5
Total: 187948us
CPU Exec: 14091us
Percentage: 7.5%
', 1115, - '
Step 5
Total: 187948us
Other: 1115us
Percentage: 0.59%
' + '
Step 5
Total: 187948us
Other: 1115us
Percentage: 0.59%
', ], [ '6', @@ -81,7 +81,7 @@ export class MockAPI { 12968, '
Step 6
Total: 175153us
CPU Exec: 12968us
Percentage: 7.4%
', 1148, - '
Step 6
Total: 175153us
Other: 1148us
Percentage: 0.66%
' + '
Step 6
Total: 175153us
Other: 1148us
Percentage: 0.66%
', ], [ '7', @@ -98,7 +98,7 @@ export class MockAPI { 13768, '
Step 7
Total: 179733us
CPU Exec: 13768us
Percentage: 7.66%
', 1213, - '
Step 7
Total: 179733us
Other: 1213us
Percentage: 0.67%
' + '
Step 7
Total: 179733us
Other: 1213us
Percentage: 0.67%
', ], [ '8', @@ -115,7 +115,7 @@ export class MockAPI { 13420, '
Step 8
Total: 174564us
CPU Exec: 13420us
Percentage: 7.69%
', 1200, - '
Step 8
Total: 174564us
Other: 1200us
Percentage: 0.69%
' + '
Step 8
Total: 174564us
Other: 1200us
Percentage: 0.69%
', ], [ '9', @@ -132,7 +132,7 @@ export class MockAPI { 15025, '
Step 9
Total: 182172us
CPU Exec: 15025us
Percentage: 8.25%
', 1141, - '
Step 9
Total: 182172us
Other: 1141us
Percentage: 0.63%
' + '
Step 9
Total: 182172us
Other: 1141us
Percentage: 0.63%
', ], [ '10', @@ -149,9 +149,9 @@ export class MockAPI { 12773, '
Step 10
Total: 165983us
CPU Exec: 12773us
Percentage: 7.7%
', 1117, - '
Step 10
Total: 165983us
Other: 1117us
Percentage: 0.67%
' - ] - ] + '
Step 10
Total: 165983us
Other: 1117us
Percentage: 0.67%
', + ], + ], }, performance: [ { @@ -166,15 +166,15 @@ export class MockAPI { { name: 'Runtime', description: '', value: 2908, extra: 1.64 }, { name: 'DataLoader', description: '', value: 59262, extra: 33.37 }, { name: 'CPU Exec', description: '', value: 13674, extra: 7.7 }, - { name: 'Other', description: '', value: 1156, extra: 0.65 } - ] - } + { name: 'Other', description: '', value: 1156, extra: 0.65 }, + ], + }, ], recommendations: '
  • This run has high time cost on input data loading. 33.4% of the step time is in DataLoader. You could try to set num_workers on DataLoader\'s construction and enable multi-processes on data loading.
  • Kernels with 68% time are launched by Tensor Cores eligible operators. You could enable Automatic Mixed Precision to speedup by using FP16.
', environments: [ { title: 'Number of Worker(s)', value: '1' }, - { title: 'Device Type', value: 'GPU' } + { title: 'Device Type', value: 'GPU' }, ], gpu_metrics: { title: 'GPU Summary', @@ -186,12 +186,12 @@ export class MockAPI { { title: 'GPU Utilization', value: '55.51 %' }, { title: 'Est. SM Efficiency', value: '54.68 %' }, { title: 'Est. Achieved Occupancy', value: '49.13 %' }, - { title: 'Kernel Time using Tensor Cores', value: '0.0 %' } + { title: 'Kernel Time using Tensor Cores', value: '0.0 %' }, ], tooltip: - "The GPU usage metrics:\n\nGPU Utilization:\nGPU busy time / All steps time. The higher, the better. GPU busy time is the time during which there is at least one GPU kernel running on it. All steps time is the total time of all profiler steps(or called as iterations).\n\nEst. SM Efficiency:\nEstimated Stream Multiprocessor Efficiency. The higher, the better. This metric of a kernel, SM_Eff_K = min(blocks of this kernel / SM number of this GPU, 100%). This overall number is the sum of all kernels' SM_Eff_K weighted by kernel's execution duration, divided by all steps time.\n\nEst. Achieved Occupancy:\nFor most cases such as memory bandwidth bounded kernels, the higher the better. Occupancy is the ratio of active warps on an SM to the maximum number of active warps supported by the SM. The theoretical occupancy of a kernel is upper limit occupancy of this kernel, limited by multiple factors such as kernel shape, kernel used resource, and the GPU compute capability.\nEst. Achieved Occupancy of a kernel, OCC_K = min(threads of the kernel / SM number / max threads per SM, theoretical occupancy of the kernel). This overall number is the weighted average of all kernels' OCC_K using kernel's execution duration as weight. 
It shows fine-grained low-level GPU utilization.\n\nKernel using Tensor Cores:\nTotal GPU Time for Tensor Core kernels / Total GPU Time for all kernels.\n" - } - }) + "The GPU usage metrics:\n\nGPU Utilization:\nGPU busy time / All steps time. The higher, the better. GPU busy time is the time during which there is at least one GPU kernel running on it. All steps time is the total time of all profiler steps(or called as iterations).\n\nEst. SM Efficiency:\nEstimated Stream Multiprocessor Efficiency. The higher, the better. This metric of a kernel, SM_Eff_K = min(blocks of this kernel / SM number of this GPU, 100%). This overall number is the sum of all kernels' SM_Eff_K weighted by kernel's execution duration, divided by all steps time.\n\nEst. Achieved Occupancy:\nFor most cases such as memory bandwidth bounded kernels, the higher the better. Occupancy is the ratio of active warps on an SM to the maximum number of active warps supported by the SM. The theoretical occupancy of a kernel is upper limit occupancy of this kernel, limited by multiple factors such as kernel shape, kernel used resource, and the GPU compute capability.\nEst. Achieved Occupancy of a kernel, OCC_K = min(threads of the kernel / SM number / max threads per SM, theoretical occupancy of the kernel). This overall number is the weighted average of all kernels' OCC_K using kernel's execution duration as weight. 
It shows fine-grained low-level GPU utilization.\n\nKernel using Tensor Cores:\nTotal GPU Time for Tensor Core kernels / Total GPU Time for all kernels.\n", + }, + }); } diffnodeGet( @@ -216,7 +216,7 @@ export class MockAPI { host_duration: 186312, device_duration: 0, self_host_duration: 186312, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -224,7 +224,7 @@ export class MockAPI { host_duration: 31902, device_duration: 736, self_host_duration: 17460, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -232,7 +232,7 @@ export class MockAPI { host_duration: 62713, device_duration: 0, self_host_duration: 32640, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -240,7 +240,7 @@ export class MockAPI { host_duration: 1711486, device_duration: 8796, self_host_duration: 37162, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -248,7 +248,7 @@ export class MockAPI { host_duration: 4379, device_duration: 0, self_host_duration: 4379, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -256,7 +256,7 @@ export class MockAPI { host_duration: 10596, device_duration: 0, self_host_duration: 6217, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -264,7 +264,7 @@ export class MockAPI { host_duration: 8470, device_duration: 0, self_host_duration: 8470, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -272,7 +272,7 @@ export class MockAPI { host_duration: 19150, device_duration: 0, self_host_duration: 16142, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_strided', @@ -280,7 +280,7 @@ export class MockAPI { host_duration: 50043, device_duration: 0, self_host_duration: 50043, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -288,7 +288,7 @@ export class MockAPI { host_duration: 1518205, device_duration: 
8796, self_host_duration: 1509009, - self_device_duration: 8796 + self_device_duration: 8796, }, { name: 'aten::_to_copy', @@ -296,7 +296,7 @@ export class MockAPI { host_duration: 1674324, device_duration: 8796, self_host_duration: 104788, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::upsample_bilinear2d', @@ -304,7 +304,7 @@ export class MockAPI { host_duration: 460479, device_duration: 0, self_host_duration: 421547, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -312,7 +312,7 @@ export class MockAPI { host_duration: 9401, device_duration: 0, self_host_duration: 8211, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -320,7 +320,7 @@ export class MockAPI { host_duration: 31311, device_duration: 0, self_host_duration: 31311, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -328,7 +328,7 @@ export class MockAPI { host_duration: 17762, device_duration: 0, self_host_duration: 15082, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -336,7 +336,7 @@ export class MockAPI { host_duration: 4194, device_duration: 0, self_host_duration: 4194, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -344,7 +344,7 @@ export class MockAPI { host_duration: 14514, device_duration: 0, self_host_duration: 10320, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -352,7 +352,7 @@ export class MockAPI { host_duration: 1734, device_duration: 0, self_host_duration: 1734, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -360,7 +360,7 @@ export class MockAPI { host_duration: 86249, device_duration: 0, self_host_duration: 78373, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -368,7 +368,7 @@ export class MockAPI { host_duration: 183533, device_duration: 0, self_host_duration: 75637, - self_device_duration: 0 
+ self_device_duration: 0, }, { name: 'aten::gt', @@ -376,7 +376,7 @@ export class MockAPI { host_duration: 71284, device_duration: 0, self_host_duration: 49575, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -384,7 +384,7 @@ export class MockAPI { host_duration: 4948, device_duration: 0, self_host_duration: 4948, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -392,7 +392,7 @@ export class MockAPI { host_duration: 20922, device_duration: 0, self_host_duration: 15974, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -400,7 +400,7 @@ export class MockAPI { host_duration: 27934, device_duration: 0, self_host_duration: 10747, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -408,7 +408,7 @@ export class MockAPI { host_duration: 168214, device_duration: 75, self_host_duration: 146203, - self_device_duration: 75 + self_device_duration: 75, }, { name: 'aten::resize_', @@ -416,7 +416,7 @@ export class MockAPI { host_duration: 248, device_duration: 0, self_host_duration: 248, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -424,7 +424,7 @@ export class MockAPI { host_duration: 280, device_duration: 0, self_host_duration: 99, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -432,7 +432,7 @@ export class MockAPI { host_duration: 92993, device_duration: 0, self_host_duration: 92405, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -440,7 +440,7 @@ export class MockAPI { host_duration: 93282, device_duration: 0, self_host_duration: 289, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::stack', @@ -448,7 +448,7 @@ export class MockAPI { host_duration: 124757, device_duration: 0, self_host_duration: 22050, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -456,7 +456,7 
@@ export class MockAPI { host_duration: 44043, device_duration: 71832, self_host_duration: 35027, - self_device_duration: 71832 + self_device_duration: 71832, }, { name: 'aten::_convolution', @@ -464,7 +464,7 @@ export class MockAPI { host_duration: 51312, device_duration: 71832, self_host_duration: 7269, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -472,7 +472,7 @@ export class MockAPI { host_duration: 55287, device_duration: 71832, self_host_duration: 3975, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -480,7 +480,7 @@ export class MockAPI { host_duration: 59323, device_duration: 71832, self_host_duration: 4036, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -488,7 +488,7 @@ export class MockAPI { host_duration: 17461, device_duration: 10540, self_host_duration: 15188, - self_device_duration: 10540 + self_device_duration: 10540, }, { name: 'aten::empty_like', @@ -496,7 +496,7 @@ export class MockAPI { host_duration: 11504, device_duration: 0, self_host_duration: 4865, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -504,7 +504,7 @@ export class MockAPI { host_duration: 3589, device_duration: 0, self_host_duration: 3589, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -512,7 +512,7 @@ export class MockAPI { host_duration: 71328, device_duration: 25802, self_host_duration: 40944, - self_device_duration: 25802 + self_device_duration: 25802, }, { name: 'aten::_batch_norm_impl_index', @@ -520,7 +520,7 @@ export class MockAPI { host_duration: 76354, device_duration: 25802, self_host_duration: 5026, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -528,7 +528,7 @@ export class MockAPI { host_duration: 79832, device_duration: 25802, self_host_duration: 3478, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ 
-536,7 +536,7 @@ export class MockAPI { host_duration: 5417, device_duration: 12000, self_host_duration: 3885, - self_device_duration: 12000 + self_device_duration: 12000, }, { name: 'aten::clamp_min_', @@ -544,7 +544,7 @@ export class MockAPI { host_duration: 8537, device_duration: 12000, self_host_duration: 3120, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -552,7 +552,7 @@ export class MockAPI { host_duration: 16708, device_duration: 12000, self_host_duration: 8171, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -560,7 +560,7 @@ export class MockAPI { host_duration: 442, device_duration: 940, self_host_duration: 405, - self_device_duration: 940 + self_device_duration: 940, }, { name: 'aten::max_pool2d', @@ -568,7 +568,7 @@ export class MockAPI { host_duration: 542, device_duration: 940, self_host_duration: 100, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -576,7 +576,7 @@ export class MockAPI { host_duration: 72931, device_duration: 13090, self_host_duration: 57558, - self_device_duration: 13090 + self_device_duration: 13090, }, { name: 'aten::mean', @@ -584,7 +584,7 @@ export class MockAPI { host_duration: 376, device_duration: 133, self_host_duration: 339, - self_device_duration: 133 + self_device_duration: 133, }, { name: 'aten::adaptive_avg_pool2d', @@ -592,7 +592,7 @@ export class MockAPI { host_duration: 465, device_duration: 133, self_host_duration: 89, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -600,7 +600,7 @@ export class MockAPI { host_duration: 170, device_duration: 0, self_host_duration: 170, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -608,7 +608,7 @@ export class MockAPI { host_duration: 207, device_duration: 0, self_host_duration: 103, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -616,7 +616,7 @@ 
export class MockAPI { host_duration: 587, device_duration: 0, self_host_duration: 465, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -624,7 +624,7 @@ export class MockAPI { host_duration: 1068, device_duration: 0, self_host_duration: 481, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -632,7 +632,7 @@ export class MockAPI { host_duration: 277, device_duration: 0, self_host_duration: 227, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::addmm', @@ -640,7 +640,7 @@ export class MockAPI { host_duration: 809, device_duration: 84, self_host_duration: 604, - self_device_duration: 84 + self_device_duration: 84, }, { name: 'aten::linear', @@ -648,7 +648,7 @@ export class MockAPI { host_duration: 1185, device_duration: 84, self_host_duration: 137, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -656,7 +656,7 @@ export class MockAPI { host_duration: 308, device_duration: 14, self_host_duration: 271, - self_device_duration: 14 + self_device_duration: 14, }, { name: 'aten::log_softmax', @@ -664,7 +664,7 @@ export class MockAPI { host_duration: 472, device_duration: 14, self_host_duration: 153, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -672,7 +672,7 @@ export class MockAPI { host_duration: 522, device_duration: 8, self_host_duration: 476, - self_device_duration: 8 + self_device_duration: 8, }, { name: 'aten::nll_loss', @@ -680,7 +680,7 @@ export class MockAPI { host_duration: 590, device_duration: 8, self_host_duration: 68, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -688,7 +688,7 @@ export class MockAPI { host_duration: 641, device_duration: 8, self_host_duration: 51, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -696,7 +696,7 @@ export class MockAPI { host_duration: 1234, device_duration: 22, 
self_host_duration: 121, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -704,7 +704,7 @@ export class MockAPI { host_duration: 14541, device_duration: 738, self_host_duration: 10083, - self_device_duration: 738 + self_device_duration: 738, }, { name: 'aten::ones_like', @@ -712,7 +712,7 @@ export class MockAPI { host_duration: 516, device_duration: 2, self_host_duration: 142, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -720,7 +720,7 @@ export class MockAPI { host_duration: 573, device_duration: 8, self_host_duration: 310, - self_device_duration: 6 + self_device_duration: 6, }, { name: 'NllLossBackward0', @@ -728,7 +728,7 @@ export class MockAPI { host_duration: 774, device_duration: 8, self_host_duration: 201, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: NllLossBackward0', @@ -736,7 +736,7 @@ export class MockAPI { host_duration: 1025, device_duration: 8, self_host_duration: 251, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ -744,7 +744,7 @@ export class MockAPI { host_duration: 236, device_duration: 18, self_host_duration: 196, - self_device_duration: 18 + self_device_duration: 18, }, { name: 'LogSoftmaxBackward0', @@ -752,7 +752,7 @@ export class MockAPI { host_duration: 385, device_duration: 18, self_host_duration: 149, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', @@ -760,7 +760,7 @@ export class MockAPI { host_duration: 632, device_duration: 18, self_host_duration: 247, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -768,7 +768,7 @@ export class MockAPI { host_duration: 668, device_duration: 140, self_host_duration: 547, - self_device_duration: 140 + self_device_duration: 140, }, { name: 'AddmmBackward0', @@ -776,7 +776,7 @@ export class MockAPI { 
host_duration: 1698, device_duration: 140, self_host_duration: 417, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -784,7 +784,7 @@ export class MockAPI { host_duration: 370, device_duration: 15, self_host_duration: 328, - self_device_duration: 15 + self_device_duration: 15, }, { name: 'autograd::engine::evaluate_function: AddmmBackward0', @@ -792,7 +792,7 @@ export class MockAPI { host_duration: 2710, device_duration: 155, self_host_duration: 567, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'torch::autograd::AccumulateGrad', @@ -800,16 +800,15 @@ export class MockAPI { host_duration: 41184, device_duration: 997, self_host_duration: 16159, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 322, host_duration: 70946, device_duration: 997, self_host_duration: 29762, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'TBackward0', @@ -817,7 +816,7 @@ export class MockAPI { host_duration: 280, device_duration: 0, self_host_duration: 64, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -825,7 +824,7 @@ export class MockAPI { host_duration: 428, device_duration: 0, self_host_duration: 148, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::reshape', @@ -833,7 +832,7 @@ export class MockAPI { host_duration: 170, device_duration: 0, self_host_duration: 104, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -841,7 +840,7 @@ export class MockAPI { host_duration: 264, device_duration: 0, self_host_duration: 94, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', @@ -849,7 +848,7 @@ export class MockAPI { host_duration: 402, 
device_duration: 0, self_host_duration: 138, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'MeanBackward1', @@ -857,7 +856,7 @@ export class MockAPI { host_duration: 1036, device_duration: 75, self_host_duration: 231, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -865,7 +864,7 @@ export class MockAPI { host_duration: 1254, device_duration: 75, self_host_duration: 218, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -873,7 +872,7 @@ export class MockAPI { host_duration: 13838, device_duration: 17984, self_host_duration: 12131, - self_device_duration: 17984 + self_device_duration: 17984, }, { name: 'ReluBackward0', @@ -881,7 +880,7 @@ export class MockAPI { host_duration: 21183, device_duration: 17984, self_host_duration: 7345, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -889,7 +888,7 @@ export class MockAPI { host_duration: 33492, device_duration: 17984, self_host_duration: 12309, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -897,7 +896,7 @@ export class MockAPI { host_duration: 251, device_duration: 0, self_host_duration: 251, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -905,7 +904,7 @@ export class MockAPI { host_duration: 2579, device_duration: 0, self_host_duration: 2328, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ -913,7 +912,7 @@ export class MockAPI { host_duration: 62175, device_duration: 44433, self_host_duration: 36053, - self_device_duration: 44433 + self_device_duration: 44433, }, { name: 'CudnnBatchNormBackward0', @@ -921,16 +920,15 @@ export class MockAPI { host_duration: 69160, device_duration: 44433, self_host_duration: 6985, - self_device_duration: 0 + 
self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 106, host_duration: 88613, device_duration: 44433, self_host_duration: 19453, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -938,7 +936,7 @@ export class MockAPI { host_duration: 40820, device_duration: 76620, self_host_duration: 30768, - self_device_duration: 76620 + self_device_duration: 76620, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -946,7 +944,7 @@ export class MockAPI { host_duration: 44875, device_duration: 90108, self_host_duration: 27458, - self_device_duration: 90108 + self_device_duration: 90108, }, { name: 'aten::cudnn_convolution_backward', @@ -954,7 +952,7 @@ export class MockAPI { host_duration: 101020, device_duration: 166728, self_host_duration: 15325, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -962,16 +960,15 @@ export class MockAPI { host_duration: 107964, device_duration: 166728, self_host_duration: 6944, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 106, host_duration: 129129, device_duration: 177161, self_host_duration: 16746, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices_backward', @@ -979,7 +976,7 @@ export class MockAPI { host_duration: 483, device_duration: 3048, self_host_duration: 257, - self_device_duration: 2588 + self_device_duration: 2588, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -987,16 +984,15 @@ export class MockAPI { host_duration: 599, device_duration: 3048, self_host_duration: 116, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: 
MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', calls: 2, host_duration: 836, device_duration: 3048, self_host_duration: 237, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -1004,9 +1000,9 @@ export class MockAPI { host_duration: 23818, device_duration: 797, self_host_duration: 19073, - self_device_duration: 797 - } - ] + self_device_duration: 797, + }, + ], }, right: { name: 'multiple nodes', @@ -1020,7 +1016,7 @@ export class MockAPI { host_duration: 31594, device_duration: 0, self_host_duration: 31594, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -1028,7 +1024,7 @@ export class MockAPI { host_duration: 6010, device_duration: 864, self_host_duration: 1910, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -1036,7 +1032,7 @@ export class MockAPI { host_duration: 10338, device_duration: 0, self_host_duration: 2951, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -1044,7 +1040,7 @@ export class MockAPI { host_duration: 47031, device_duration: 8684, self_host_duration: 4258, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -1052,7 +1048,7 @@ export class MockAPI { host_duration: 701, device_duration: 0, self_host_duration: 698, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -1060,7 +1056,7 @@ export class MockAPI { host_duration: 1374, device_duration: 0, self_host_duration: 676, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -1068,7 +1064,7 @@ export class MockAPI { host_duration: 1013, device_duration: 0, self_host_duration: 1013, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -1076,7 +1072,7 @@ export class MockAPI { host_duration: 2074, device_duration: 0, self_host_duration: 1723, - self_device_duration: 0 + self_device_duration: 
0, }, { name: 'aten::empty_strided', @@ -1084,7 +1080,7 @@ export class MockAPI { host_duration: 6859, device_duration: 0, self_host_duration: 6859, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -1092,7 +1088,7 @@ export class MockAPI { host_duration: 25248, device_duration: 8684, self_host_duration: 16166, - self_device_duration: 8684 + self_device_duration: 8684, }, { name: 'aten::_to_copy', @@ -1100,7 +1096,7 @@ export class MockAPI { host_duration: 42773, device_duration: 8684, self_host_duration: 10227, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::upsample_bilinear2d', @@ -1108,7 +1104,7 @@ export class MockAPI { host_duration: 51788, device_duration: 0, self_host_duration: 46788, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -1116,7 +1112,7 @@ export class MockAPI { host_duration: 1035, device_duration: 0, self_host_duration: 895, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -1124,7 +1120,7 @@ export class MockAPI { host_duration: 11074, device_duration: 0, self_host_duration: 11074, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -1132,7 +1128,7 @@ export class MockAPI { host_duration: 1892, device_duration: 0, self_host_duration: 1600, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -1140,7 +1136,7 @@ export class MockAPI { host_duration: 278, device_duration: 0, self_host_duration: 244, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -1148,7 +1144,7 @@ export class MockAPI { host_duration: 1341, device_duration: 0, self_host_duration: 1097, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -1156,7 +1152,7 @@ export class MockAPI { host_duration: 317, device_duration: 0, self_host_duration: 317, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -1164,7 
+1160,7 @@ export class MockAPI { host_duration: 8857, device_duration: 0, self_host_duration: 7959, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -1172,7 +1168,7 @@ export class MockAPI { host_duration: 17840, device_duration: 0, self_host_duration: 7688, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::gt', @@ -1180,7 +1176,7 @@ export class MockAPI { host_duration: 6903, device_duration: 0, self_host_duration: 4901, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -1188,7 +1184,7 @@ export class MockAPI { host_duration: 395, device_duration: 0, self_host_duration: 395, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -1196,7 +1192,7 @@ export class MockAPI { host_duration: 2532, device_duration: 0, self_host_duration: 2130, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -1204,7 +1200,7 @@ export class MockAPI { host_duration: 3601, device_duration: 0, self_host_duration: 1427, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -1212,7 +1208,7 @@ export class MockAPI { host_duration: 11707, device_duration: 75, self_host_duration: 9531, - self_device_duration: 75 + self_device_duration: 75, }, { name: 'aten::resize_', @@ -1220,7 +1216,7 @@ export class MockAPI { host_duration: 79, device_duration: 0, self_host_duration: 79, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -1228,7 +1224,7 @@ export class MockAPI { host_duration: 37, device_duration: 0, self_host_duration: 16, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -1236,7 +1232,7 @@ export class MockAPI { host_duration: 9241, device_duration: 0, self_host_duration: 9113, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -1244,7 +1240,7 @@ export class MockAPI { host_duration: 9286, device_duration: 0, 
self_host_duration: 45, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::stack', @@ -1252,7 +1248,7 @@ export class MockAPI { host_duration: 16195, device_duration: 0, self_host_duration: 6105, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -1260,7 +1256,7 @@ export class MockAPI { host_duration: 17357, device_duration: 71414, self_host_duration: 13601, - self_device_duration: 71414 + self_device_duration: 71414, }, { name: 'aten::_convolution', @@ -1268,7 +1264,7 @@ export class MockAPI { host_duration: 18514, device_duration: 71414, self_host_duration: 1157, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -1276,7 +1272,7 @@ export class MockAPI { host_duration: 19185, device_duration: 71414, self_host_duration: 671, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -1284,7 +1280,7 @@ export class MockAPI { host_duration: 19750, device_duration: 71414, self_host_duration: 565, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -1292,7 +1288,7 @@ export class MockAPI { host_duration: 4973, device_duration: 10567, self_host_duration: 3157, - self_device_duration: 10567 + self_device_duration: 10567, }, { name: 'aten::empty_like', @@ -1300,7 +1296,7 @@ export class MockAPI { host_duration: 1924, device_duration: 0, self_host_duration: 598, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -1308,7 +1304,7 @@ export class MockAPI { host_duration: 596, device_duration: 0, self_host_duration: 596, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -1316,7 +1312,7 @@ export class MockAPI { host_duration: 11083, device_duration: 25737, self_host_duration: 5031, - self_device_duration: 25737 + self_device_duration: 25737, }, { name: 'aten::_batch_norm_impl_index', @@ -1324,7 +1320,7 @@ export class MockAPI { host_duration: 11856, 
device_duration: 25737, self_host_duration: 773, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -1332,7 +1328,7 @@ export class MockAPI { host_duration: 12386, device_duration: 25737, self_host_duration: 530, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ -1340,7 +1336,7 @@ export class MockAPI { host_duration: 2189, device_duration: 12010, self_host_duration: 1030, - self_device_duration: 12010 + self_device_duration: 12010, }, { name: 'aten::clamp_min_', @@ -1348,7 +1344,7 @@ export class MockAPI { host_duration: 2614, device_duration: 12010, self_host_duration: 425, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -1356,7 +1352,7 @@ export class MockAPI { host_duration: 3880, device_duration: 12010, self_host_duration: 1266, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -1364,7 +1360,7 @@ export class MockAPI { host_duration: 112, device_duration: 938, self_host_duration: 82, - self_device_duration: 938 + self_device_duration: 938, }, { name: 'aten::max_pool2d', @@ -1372,7 +1368,7 @@ export class MockAPI { host_duration: 127, device_duration: 938, self_host_duration: 15, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -1380,7 +1376,7 @@ export class MockAPI { host_duration: 21459, device_duration: 13178, self_host_duration: 11041, - self_device_duration: 13178 + self_device_duration: 13178, }, { name: 'aten::mean', @@ -1388,7 +1384,7 @@ export class MockAPI { host_duration: 104, device_duration: 126, self_host_duration: 76, - self_device_duration: 126 + self_device_duration: 126, }, { name: 'aten::adaptive_avg_pool2d', @@ -1396,7 +1392,7 @@ export class MockAPI { host_duration: 117, device_duration: 126, self_host_duration: 13, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -1404,7 +1400,7 @@ export class MockAPI { 
host_duration: 26, device_duration: 0, self_host_duration: 26, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -1412,7 +1408,7 @@ export class MockAPI { host_duration: 31, device_duration: 0, self_host_duration: 15, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -1420,7 +1416,7 @@ export class MockAPI { host_duration: 85, device_duration: 0, self_host_duration: 68, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -1428,7 +1424,7 @@ export class MockAPI { host_duration: 145, device_duration: 0, self_host_duration: 60, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -1436,7 +1432,7 @@ export class MockAPI { host_duration: 30, device_duration: 0, self_host_duration: 25, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::addmm', @@ -1444,7 +1440,7 @@ export class MockAPI { host_duration: 334, device_duration: 84, self_host_duration: 234, - self_device_duration: 84 + self_device_duration: 84, }, { name: 'aten::linear', @@ -1452,7 +1448,7 @@ export class MockAPI { host_duration: 386, device_duration: 84, self_host_duration: 19, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -1460,7 +1456,7 @@ export class MockAPI { host_duration: 83, device_duration: 14, self_host_duration: 55, - self_device_duration: 14 + self_device_duration: 14, }, { name: 'aten::log_softmax', @@ -1468,7 +1464,7 @@ export class MockAPI { host_duration: 106, device_duration: 14, self_host_duration: 20, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -1476,7 +1472,7 @@ export class MockAPI { host_duration: 96, device_duration: 8, self_host_duration: 68, - self_device_duration: 8 + self_device_duration: 8, }, { name: 'aten::nll_loss', @@ -1484,7 +1480,7 @@ export class MockAPI { host_duration: 105, device_duration: 8, self_host_duration: 9, - 
self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -1492,7 +1488,7 @@ export class MockAPI { host_duration: 113, device_duration: 8, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -1500,7 +1496,7 @@ export class MockAPI { host_duration: 243, device_duration: 22, self_host_duration: 24, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -1508,7 +1504,7 @@ export class MockAPI { host_duration: 4140, device_duration: 866, self_host_duration: 1851, - self_device_duration: 866 + self_device_duration: 866, }, { name: 'aten::ones_like', @@ -1516,7 +1512,7 @@ export class MockAPI { host_duration: 104, device_duration: 2, self_host_duration: 14, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -1524,7 +1520,7 @@ export class MockAPI { host_duration: 192, device_duration: 9, self_host_duration: 84, - self_device_duration: 6 + self_device_duration: 6, }, { name: 'NllLossBackward0', @@ -1532,7 +1528,7 @@ export class MockAPI { host_duration: 297, device_duration: 9, self_host_duration: 105, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: NllLossBackward0', @@ -1540,7 +1536,7 @@ export class MockAPI { host_duration: 352, device_duration: 9, self_host_duration: 55, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ -1548,7 +1544,7 @@ export class MockAPI { host_duration: 71, device_duration: 18, self_host_duration: 43, - self_device_duration: 18 + self_device_duration: 18, }, { name: 'LogSoftmaxBackward0', @@ -1556,7 +1552,7 @@ export class MockAPI { host_duration: 91, device_duration: 18, self_host_duration: 20, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', @@ -1564,7 +1560,7 @@ export class MockAPI { host_duration: 
126, device_duration: 18, self_host_duration: 35, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -1572,7 +1568,7 @@ export class MockAPI { host_duration: 283, device_duration: 134, self_host_duration: 186, - self_device_duration: 134 + self_device_duration: 134, }, { name: 'AddmmBackward0', @@ -1580,7 +1576,7 @@ export class MockAPI { host_duration: 418, device_duration: 134, self_host_duration: 47, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -1588,7 +1584,7 @@ export class MockAPI { host_duration: 92, device_duration: 14, self_host_duration: 62, - self_device_duration: 14 + self_device_duration: 14, }, { name: 'autograd::engine::evaluate_function: AddmmBackward0', @@ -1596,7 +1592,7 @@ export class MockAPI { host_duration: 594, device_duration: 148, self_host_duration: 75, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'torch::autograd::AccumulateGrad', @@ -1604,16 +1600,15 @@ export class MockAPI { host_duration: 10317, device_duration: 1069, self_host_duration: 2127, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 322, host_duration: 15128, device_duration: 1069, self_host_duration: 4811, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'TBackward0', @@ -1621,7 +1616,7 @@ export class MockAPI { host_duration: 30, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -1629,7 +1624,7 @@ export class MockAPI { host_duration: 45, device_duration: 0, self_host_duration: 15, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::reshape', @@ -1637,7 +1632,7 @@ export class MockAPI { host_duration: 20, device_duration: 0, self_host_duration: 10, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -1645,7 +1640,7 @@ export class MockAPI { host_duration: 31, device_duration: 0, self_host_duration: 11, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', @@ -1653,7 +1648,7 @@ export class MockAPI { host_duration: 48, device_duration: 0, self_host_duration: 17, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'MeanBackward1', @@ -1661,7 +1656,7 @@ export class MockAPI { host_duration: 172, device_duration: 75, self_host_duration: 18, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -1669,7 +1664,7 @@ export class MockAPI { host_duration: 201, device_duration: 75, self_host_duration: 29, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -1677,7 +1672,7 @@ export class MockAPI { host_duration: 3652, device_duration: 18018, self_host_duration: 2361, - self_device_duration: 18018 + self_device_duration: 18018, }, { name: 'ReluBackward0', @@ -1685,7 +1680,7 @@ export class MockAPI { host_duration: 4567, device_duration: 18018, self_host_duration: 915, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -1693,7 +1688,7 @@ export class MockAPI { host_duration: 6457, device_duration: 18018, self_host_duration: 1890, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -1701,7 +1696,7 @@ export class MockAPI { host_duration: 26, device_duration: 0, self_host_duration: 26, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -1709,7 +1704,7 @@ export class MockAPI { host_duration: 261, device_duration: 0, self_host_duration: 235, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ -1717,7 
+1712,7 @@ export class MockAPI { host_duration: 9943, device_duration: 44401, self_host_duration: 4355, - self_device_duration: 44401 + self_device_duration: 44401, }, { name: 'CudnnBatchNormBackward0', @@ -1725,16 +1720,15 @@ export class MockAPI { host_duration: 11132, device_duration: 44401, self_host_duration: 1189, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 106, host_duration: 14696, device_duration: 44401, self_host_duration: 3564, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -1742,7 +1736,7 @@ export class MockAPI { host_duration: 18813, device_duration: 75568, self_host_duration: 13997, - self_device_duration: 75568 + self_device_duration: 75568, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -1750,7 +1744,7 @@ export class MockAPI { host_duration: 18792, device_duration: 88992, self_host_duration: 11101, - self_device_duration: 88992 + self_device_duration: 88992, }, { name: 'aten::cudnn_convolution_backward', @@ -1758,7 +1752,7 @@ export class MockAPI { host_duration: 40064, device_duration: 164560, self_host_duration: 2459, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -1766,16 +1760,15 @@ export class MockAPI { host_duration: 41205, device_duration: 164560, self_host_duration: 1141, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 106, host_duration: 45209, device_duration: 175014, self_host_duration: 2826, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices_backward', @@ -1783,7 +1776,7 @@ export class MockAPI { host_duration: 145, device_duration: 3016, 
self_host_duration: 61, - self_device_duration: 2556 + self_device_duration: 2556, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -1791,16 +1784,15 @@ export class MockAPI { host_duration: 165, device_duration: 3016, self_host_duration: 20, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', calls: 2, host_duration: 209, device_duration: 3016, self_host_duration: 44, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -1808,9 +1800,9 @@ export class MockAPI { host_duration: 6835, device_duration: 803, self_host_duration: 3630, - self_device_duration: 803 - } - ] + self_device_duration: 803, + }, + ], }, path: '0', children: [ @@ -1827,7 +1819,7 @@ export class MockAPI { host_duration: 100, device_duration: 0, self_host_duration: 100, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -1835,7 +1827,7 @@ export class MockAPI { host_duration: 4, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -1843,9 +1835,9 @@ export class MockAPI { host_duration: 119, device_duration: 0, self_host_duration: 64, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'multiple nodes', @@ -1859,7 +1851,7 @@ export class MockAPI { host_duration: 17, device_duration: 0, self_host_duration: 17, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -1867,7 +1859,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -1875,11 +1867,11 @@ export class MockAPI { host_duration: 15, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-0' + path: 
'0-0', }, { left: { @@ -1894,7 +1886,7 @@ export class MockAPI { host_duration: 62288, device_duration: 0, self_host_duration: 62288, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -1902,7 +1894,7 @@ export class MockAPI { host_duration: 959, device_duration: 0, self_host_duration: 959, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -1910,7 +1902,7 @@ export class MockAPI { host_duration: 35273, device_duration: 0, self_host_duration: 16154, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -1918,7 +1910,7 @@ export class MockAPI { host_duration: 877101, device_duration: 0, self_host_duration: 18482, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -1926,7 +1918,7 @@ export class MockAPI { host_duration: 2191, device_duration: 0, self_host_duration: 2191, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -1934,7 +1926,7 @@ export class MockAPI { host_duration: 5301, device_duration: 0, self_host_duration: 3110, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -1942,7 +1934,7 @@ export class MockAPI { host_duration: 4175, device_duration: 0, self_host_duration: 4175, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -1950,7 +1942,7 @@ export class MockAPI { host_duration: 9560, device_duration: 0, self_host_duration: 8045, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_strided', @@ -1958,7 +1950,7 @@ export class MockAPI { host_duration: 24689, device_duration: 0, self_host_duration: 24689, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -1966,7 +1958,7 @@ export class MockAPI { host_duration: 780214, device_duration: 0, self_host_duration: 780214, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_to_copy', @@ -1974,7 +1966,7 @@ export class MockAPI { 
host_duration: 858619, device_duration: 0, self_host_duration: 53009, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::upsample_bilinear2d', @@ -1982,7 +1974,7 @@ export class MockAPI { host_duration: 224031, device_duration: 0, self_host_duration: 204660, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -1990,7 +1982,7 @@ export class MockAPI { host_duration: 4719, device_duration: 0, self_host_duration: 4119, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -1998,7 +1990,7 @@ export class MockAPI { host_duration: 16028, device_duration: 0, self_host_duration: 16028, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -2006,7 +1998,7 @@ export class MockAPI { host_duration: 8918, device_duration: 0, self_host_duration: 7569, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -2014,7 +2006,7 @@ export class MockAPI { host_duration: 2092, device_duration: 0, self_host_duration: 2092, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -2022,7 +2014,7 @@ export class MockAPI { host_duration: 7228, device_duration: 0, self_host_duration: 5136, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -2030,7 +2022,7 @@ export class MockAPI { host_duration: 884, device_duration: 0, self_host_duration: 884, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -2038,7 +2030,7 @@ export class MockAPI { host_duration: 43030, device_duration: 0, self_host_duration: 39068, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -2046,7 +2038,7 @@ export class MockAPI { host_duration: 91440, device_duration: 0, self_host_duration: 37676, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::gt', @@ -2054,7 +2046,7 @@ export class MockAPI { host_duration: 35514, device_duration: 0, self_host_duration: 
24706, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -2062,7 +2054,7 @@ export class MockAPI { host_duration: 2467, device_duration: 0, self_host_duration: 2467, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -2070,7 +2062,7 @@ export class MockAPI { host_duration: 10375, device_duration: 0, self_host_duration: 7908, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -2078,7 +2070,7 @@ export class MockAPI { host_duration: 13905, device_duration: 0, self_host_duration: 5383, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -2086,7 +2078,7 @@ export class MockAPI { host_duration: 87841, device_duration: 0, self_host_duration: 76794, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -2094,7 +2086,7 @@ export class MockAPI { host_duration: 117, device_duration: 0, self_host_duration: 117, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -2102,7 +2094,7 @@ export class MockAPI { host_duration: 142, device_duration: 0, self_host_duration: 51, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -2110,7 +2102,7 @@ export class MockAPI { host_duration: 51526, device_duration: 0, self_host_duration: 51229, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -2118,7 +2110,7 @@ export class MockAPI { host_duration: 51674, device_duration: 0, self_host_duration: 148, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::stack', @@ -2126,9 +2118,9 @@ export class MockAPI { host_duration: 75677, device_duration: 0, self_host_duration: 19330, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'enumerate(DataLoader)#_SingleProcessDataLoaderIter.__next__', @@ -2142,7 +2134,7 @@ export class MockAPI { host_duration: 12399, device_duration: 0, 
self_host_duration: 12399, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -2150,7 +2142,7 @@ export class MockAPI { host_duration: 98, device_duration: 0, self_host_duration: 98, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -2158,7 +2150,7 @@ export class MockAPI { host_duration: 7665, device_duration: 0, self_host_duration: 1689, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -2166,7 +2158,7 @@ export class MockAPI { host_duration: 21137, device_duration: 0, self_host_duration: 2377, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -2174,7 +2166,7 @@ export class MockAPI { host_duration: 364, device_duration: 0, self_host_duration: 361, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -2182,7 +2174,7 @@ export class MockAPI { host_duration: 745, device_duration: 0, self_host_duration: 384, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -2190,7 +2182,7 @@ export class MockAPI { host_duration: 527, device_duration: 0, self_host_duration: 527, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -2198,7 +2190,7 @@ export class MockAPI { host_duration: 1050, device_duration: 0, self_host_duration: 869, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_strided', @@ -2206,7 +2198,7 @@ export class MockAPI { host_duration: 3689, device_duration: 0, self_host_duration: 3689, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -2214,7 +2206,7 @@ export class MockAPI { host_duration: 8695, device_duration: 0, self_host_duration: 8695, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_to_copy', @@ -2222,7 +2214,7 @@ export class MockAPI { host_duration: 18760, device_duration: 0, self_host_duration: 6122, - self_device_duration: 0 + self_device_duration: 0, }, { 
name: 'aten::upsample_bilinear2d', @@ -2230,7 +2222,7 @@ export class MockAPI { host_duration: 20349, device_duration: 0, self_host_duration: 17634, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -2238,7 +2230,7 @@ export class MockAPI { host_duration: 562, device_duration: 0, self_host_duration: 487, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -2246,7 +2238,7 @@ export class MockAPI { host_duration: 6658, device_duration: 0, self_host_duration: 6658, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -2254,7 +2246,7 @@ export class MockAPI { host_duration: 1028, device_duration: 0, self_host_duration: 870, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -2262,7 +2254,7 @@ export class MockAPI { host_duration: 142, device_duration: 0, self_host_duration: 129, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -2270,7 +2262,7 @@ export class MockAPI { host_duration: 755, device_duration: 0, self_host_duration: 626, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -2278,7 +2270,7 @@ export class MockAPI { host_duration: 168, device_duration: 0, self_host_duration: 168, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -2286,7 +2278,7 @@ export class MockAPI { host_duration: 4922, device_duration: 0, self_host_duration: 4440, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -2294,7 +2286,7 @@ export class MockAPI { host_duration: 9959, device_duration: 0, self_host_duration: 4339, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::gt', @@ -2302,7 +2294,7 @@ export class MockAPI { host_duration: 3848, device_duration: 0, self_host_duration: 2737, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -2310,7 +2302,7 @@ export class MockAPI { 
host_duration: 209, device_duration: 0, self_host_duration: 209, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -2318,7 +2310,7 @@ export class MockAPI { host_duration: 1398, device_duration: 0, self_host_duration: 1187, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -2326,7 +2318,7 @@ export class MockAPI { host_duration: 2013, device_duration: 0, self_host_duration: 812, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -2334,7 +2326,7 @@ export class MockAPI { host_duration: 7421, device_duration: 0, self_host_duration: 6234, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -2342,7 +2334,7 @@ export class MockAPI { host_duration: 36, device_duration: 0, self_host_duration: 36, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -2350,7 +2342,7 @@ export class MockAPI { host_duration: 19, device_duration: 0, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -2358,7 +2350,7 @@ export class MockAPI { host_duration: 4628, device_duration: 0, self_host_duration: 4566, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -2366,7 +2358,7 @@ export class MockAPI { host_duration: 4649, device_duration: 0, self_host_duration: 21, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::stack', @@ -2374,11 +2366,11 @@ export class MockAPI { host_duration: 10884, device_duration: 0, self_host_duration: 5859, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-1' + path: '0-1', }, { left: { @@ -2393,7 +2385,7 @@ export class MockAPI { host_duration: 209, device_duration: 0, self_host_duration: 209, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -2401,7 +2393,7 @@ export class MockAPI { host_duration: 4696, device_duration: 4402, self_host_duration: 
93, - self_device_duration: 4402 + self_device_duration: 4402, }, { name: 'aten::_to_copy', @@ -2409,7 +2401,7 @@ export class MockAPI { host_duration: 5111, device_duration: 4402, self_host_duration: 206, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -2417,9 +2409,9 @@ export class MockAPI { host_duration: 5170, device_duration: 4402, self_host_duration: 59, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'multiple nodes', @@ -2433,7 +2425,7 @@ export class MockAPI { host_duration: 65, device_duration: 0, self_host_duration: 65, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -2441,7 +2433,7 @@ export class MockAPI { host_duration: 4575, device_duration: 4350, self_host_duration: 26, - self_device_duration: 4350 + self_device_duration: 4350, }, { name: 'aten::_to_copy', @@ -2449,7 +2441,7 @@ export class MockAPI { host_duration: 4670, device_duration: 4350, self_host_duration: 30, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -2457,11 +2449,11 @@ export class MockAPI { host_duration: 4681, device_duration: 4350, self_host_duration: 11, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-2' + path: '0-2', }, { left: { @@ -2476,7 +2468,7 @@ export class MockAPI { host_duration: 14161, device_duration: 0, self_host_duration: 14161, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -2484,7 +2476,7 @@ export class MockAPI { host_duration: 22091, device_duration: 36599, self_host_duration: 17567, - self_device_duration: 36599 + self_device_duration: 36599, }, { name: 'aten::_convolution', @@ -2492,7 +2484,7 @@ export class MockAPI { host_duration: 25744, device_duration: 36599, self_host_duration: 3653, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -2500,7 +2492,7 @@ export class MockAPI { host_duration: 27753, 
device_duration: 36599, self_host_duration: 2009, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -2508,7 +2500,7 @@ export class MockAPI { host_duration: 29777, device_duration: 36599, self_host_duration: 2024, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -2516,7 +2508,7 @@ export class MockAPI { host_duration: 6519, device_duration: 54, self_host_duration: 5666, - self_device_duration: 54 + self_device_duration: 54, }, { name: 'aten::empty_like', @@ -2524,7 +2516,7 @@ export class MockAPI { host_duration: 5624, device_duration: 0, self_host_duration: 2390, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -2532,7 +2524,7 @@ export class MockAPI { host_duration: 826, device_duration: 0, self_host_duration: 826, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -2540,7 +2532,7 @@ export class MockAPI { host_duration: 35818, device_duration: 12974, self_host_duration: 20557, - self_device_duration: 12974 + self_device_duration: 12974, }, { name: 'aten::_batch_norm_impl_index', @@ -2548,7 +2540,7 @@ export class MockAPI { host_duration: 38324, device_duration: 12974, self_host_duration: 2506, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -2556,7 +2548,7 @@ export class MockAPI { host_duration: 40105, device_duration: 12974, self_host_duration: 1781, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ -2564,7 +2556,7 @@ export class MockAPI { host_duration: 2702, device_duration: 6002, self_host_duration: 1935, - self_device_duration: 6002 + self_device_duration: 6002, }, { name: 'aten::clamp_min_', @@ -2572,7 +2564,7 @@ export class MockAPI { host_duration: 4273, device_duration: 6002, self_host_duration: 1571, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -2580,7 +2572,7 @@ export class MockAPI { host_duration: 
8371, device_duration: 6002, self_host_duration: 4098, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -2588,7 +2580,7 @@ export class MockAPI { host_duration: 230, device_duration: 474, self_host_duration: 212, - self_device_duration: 474 + self_device_duration: 474, }, { name: 'aten::max_pool2d', @@ -2596,7 +2588,7 @@ export class MockAPI { host_duration: 280, device_duration: 474, self_host_duration: 50, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -2604,7 +2596,7 @@ export class MockAPI { host_duration: 1546, device_duration: 5141, self_host_duration: 1290, - self_device_duration: 5141 + self_device_duration: 5141, }, { name: 'aten::mean', @@ -2612,7 +2604,7 @@ export class MockAPI { host_duration: 189, device_duration: 69, self_host_duration: 170, - self_device_duration: 69 + self_device_duration: 69, }, { name: 'aten::adaptive_avg_pool2d', @@ -2620,7 +2612,7 @@ export class MockAPI { host_duration: 234, device_duration: 69, self_host_duration: 45, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -2628,7 +2620,7 @@ export class MockAPI { host_duration: 52, device_duration: 0, self_host_duration: 52, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -2636,7 +2628,7 @@ export class MockAPI { host_duration: 106, device_duration: 0, self_host_duration: 54, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -2644,7 +2636,7 @@ export class MockAPI { host_duration: 23, device_duration: 0, self_host_duration: 23, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -2652,7 +2644,7 @@ export class MockAPI { host_duration: 55, device_duration: 0, self_host_duration: 41, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -2660,7 +2652,7 @@ export class MockAPI { host_duration: 119, device_duration: 0, 
self_host_duration: 64, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -2668,7 +2660,7 @@ export class MockAPI { host_duration: 49, device_duration: 0, self_host_duration: 40, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::addmm', @@ -2676,7 +2668,7 @@ export class MockAPI { host_duration: 404, device_duration: 43, self_host_duration: 302, - self_device_duration: 43 + self_device_duration: 43, }, { name: 'aten::linear', @@ -2684,9 +2676,9 @@ export class MockAPI { host_duration: 591, device_duration: 43, self_host_duration: 68, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: ResNet', @@ -2700,7 +2692,7 @@ export class MockAPI { host_duration: 2292, device_duration: 0, self_host_duration: 2292, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -2708,7 +2700,7 @@ export class MockAPI { host_duration: 8713, device_duration: 36205, self_host_duration: 6819, - self_device_duration: 36205 + self_device_duration: 36205, }, { name: 'aten::_convolution', @@ -2716,7 +2708,7 @@ export class MockAPI { host_duration: 9298, device_duration: 36205, self_host_duration: 585, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -2724,7 +2716,7 @@ export class MockAPI { host_duration: 9653, device_duration: 36205, self_host_duration: 355, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -2732,7 +2724,7 @@ export class MockAPI { host_duration: 9932, device_duration: 36205, self_host_duration: 279, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -2740,7 +2732,7 @@ export class MockAPI { host_duration: 1897, device_duration: 58, self_host_duration: 1201, - self_device_duration: 58 + self_device_duration: 58, }, { name: 'aten::empty_like', @@ -2748,7 +2740,7 @@ export class MockAPI { host_duration: 933, device_duration: 0, 
self_host_duration: 284, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -2756,7 +2748,7 @@ export class MockAPI { host_duration: 130, device_duration: 0, self_host_duration: 130, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -2764,7 +2756,7 @@ export class MockAPI { host_duration: 5540, device_duration: 12913, self_host_duration: 2504, - self_device_duration: 12913 + self_device_duration: 12913, }, { name: 'aten::_batch_norm_impl_index', @@ -2772,7 +2764,7 @@ export class MockAPI { host_duration: 5942, device_duration: 12913, self_host_duration: 402, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -2780,7 +2772,7 @@ export class MockAPI { host_duration: 6219, device_duration: 12913, self_host_duration: 277, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ -2788,7 +2780,7 @@ export class MockAPI { host_duration: 1108, device_duration: 6006, self_host_duration: 523, - self_device_duration: 6006 + self_device_duration: 6006, }, { name: 'aten::clamp_min_', @@ -2796,7 +2788,7 @@ export class MockAPI { host_duration: 1315, device_duration: 6006, self_host_duration: 207, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -2804,7 +2796,7 @@ export class MockAPI { host_duration: 1939, device_duration: 6006, self_host_duration: 624, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -2812,7 +2804,7 @@ export class MockAPI { host_duration: 53, device_duration: 472, self_host_duration: 38, - self_device_duration: 472 + self_device_duration: 472, }, { name: 'aten::max_pool2d', @@ -2820,7 +2812,7 @@ export class MockAPI { host_duration: 61, device_duration: 472, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -2828,7 +2820,7 @@ export class MockAPI { host_duration: 448, device_duration: 
5140, self_host_duration: 268, - self_device_duration: 5140 + self_device_duration: 5140, }, { name: 'aten::mean', @@ -2836,7 +2828,7 @@ export class MockAPI { host_duration: 53, device_duration: 63, self_host_duration: 39, - self_device_duration: 63 + self_device_duration: 63, }, { name: 'aten::adaptive_avg_pool2d', @@ -2844,7 +2836,7 @@ export class MockAPI { host_duration: 59, device_duration: 63, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -2852,7 +2844,7 @@ export class MockAPI { host_duration: 8, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -2860,7 +2852,7 @@ export class MockAPI { host_duration: 15, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -2868,7 +2860,7 @@ export class MockAPI { host_duration: 3, device_duration: 0, self_host_duration: 3, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -2876,7 +2868,7 @@ export class MockAPI { host_duration: 8, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -2884,7 +2876,7 @@ export class MockAPI { host_duration: 15, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -2892,7 +2884,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::addmm', @@ -2900,7 +2892,7 @@ export class MockAPI { host_duration: 173, device_duration: 42, self_host_duration: 123, - self_device_duration: 42 + self_device_duration: 42, }, { name: 'aten::linear', @@ -2908,11 +2900,11 @@ export class MockAPI { host_duration: 198, device_duration: 42, self_host_duration: 10, - self_device_duration: 0 - } - ] + self_device_duration: 
0, + }, + ], }, - path: '0-3' + path: '0-3', }, { left: { @@ -2927,7 +2919,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -2935,7 +2927,7 @@ export class MockAPI { host_duration: 158, device_duration: 7, self_host_duration: 139, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::log_softmax', @@ -2943,7 +2935,7 @@ export class MockAPI { host_duration: 241, device_duration: 7, self_host_duration: 78, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -2951,7 +2943,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -2959,7 +2951,7 @@ export class MockAPI { host_duration: 256, device_duration: 4, self_host_duration: 233, - self_device_duration: 4 + self_device_duration: 4, }, { name: 'aten::nll_loss', @@ -2967,7 +2959,7 @@ export class MockAPI { host_duration: 290, device_duration: 4, self_host_duration: 34, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -2975,7 +2967,7 @@ export class MockAPI { host_duration: 313, device_duration: 4, self_host_duration: 23, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -2983,9 +2975,9 @@ export class MockAPI { host_duration: 614, device_duration: 11, self_host_duration: 60, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: CrossEntropyLoss', @@ -2999,7 +2991,7 @@ export class MockAPI { host_duration: 2, device_duration: 0, self_host_duration: 2, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -3007,7 +2999,7 @@ export class MockAPI { host_duration: 42, device_duration: 7, self_host_duration: 28, - self_device_duration: 7 + self_device_duration: 
7, }, { name: 'aten::log_softmax', @@ -3015,7 +3007,7 @@ export class MockAPI { host_duration: 54, device_duration: 7, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -3023,7 +3015,7 @@ export class MockAPI { host_duration: 0, device_duration: 0, self_host_duration: 0, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -3031,7 +3023,7 @@ export class MockAPI { host_duration: 47, device_duration: 4, self_host_duration: 34, - self_device_duration: 4 + self_device_duration: 4, }, { name: 'aten::nll_loss', @@ -3039,7 +3031,7 @@ export class MockAPI { host_duration: 52, device_duration: 4, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -3047,7 +3039,7 @@ export class MockAPI { host_duration: 56, device_duration: 4, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -3055,11 +3047,11 @@ export class MockAPI { host_duration: 119, device_duration: 11, self_host_duration: 9, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-4' + path: '0-4', }, { left: { @@ -3074,7 +3066,7 @@ export class MockAPI { host_duration: 47, device_duration: 0, self_host_duration: 47, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -3082,7 +3074,7 @@ export class MockAPI { host_duration: 4, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -3090,9 +3082,9 @@ export class MockAPI { host_duration: 119, device_duration: 0, self_host_duration: 68, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::zeros', @@ -3106,7 +3098,7 @@ export class MockAPI { host_duration: 8, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', 
@@ -3114,7 +3106,7 @@ export class MockAPI { host_duration: 2, device_duration: 0, self_host_duration: 2, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -3122,11 +3114,11 @@ export class MockAPI { host_duration: 17, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-5' + path: '0-5', }, { left: { @@ -3141,7 +3133,7 @@ export class MockAPI { host_duration: 38, device_duration: 0, self_host_duration: 38, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -3149,7 +3141,7 @@ export class MockAPI { host_duration: 7097, device_duration: 142, self_host_duration: 4914, - self_device_duration: 142 + self_device_duration: 142, }, { name: 'aten::zero_', @@ -3157,9 +3149,9 @@ export class MockAPI { host_duration: 14725, device_duration: 142, self_host_duration: 7628, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'Optimizer.zero_grad#SGD.zero_grad', @@ -3173,7 +3165,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -3181,7 +3173,7 @@ export class MockAPI { host_duration: 2036, device_duration: 264, self_host_duration: 909, - self_device_duration: 264 + self_device_duration: 264, }, { name: 'aten::zero_', @@ -3189,11 +3181,11 @@ export class MockAPI { host_duration: 2855, device_duration: 264, self_host_duration: 819, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-6' + path: '0-6', }, { left: { @@ -3208,7 +3200,7 @@ export class MockAPI { host_duration: 79, device_duration: 0, self_host_duration: 79, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_like', @@ -3216,7 +3208,7 @@ export class MockAPI { host_duration: 126, device_duration: 0, self_host_duration: 47, - self_device_duration: 0 + self_device_duration: 
0, }, { name: 'aten::fill_', @@ -3224,7 +3216,7 @@ export class MockAPI { host_duration: 50, device_duration: 1, self_host_duration: 35, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::ones_like', @@ -3232,9 +3224,9 @@ export class MockAPI { host_duration: 253, device_duration: 1, self_host_duration: 77, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::ones_like', @@ -3248,7 +3240,7 @@ export class MockAPI { host_duration: 18, device_duration: 0, self_host_duration: 18, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_like', @@ -3256,7 +3248,7 @@ export class MockAPI { host_duration: 26, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -3264,7 +3256,7 @@ export class MockAPI { host_duration: 20, device_duration: 1, self_host_duration: 8, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::ones_like', @@ -3272,11 +3264,11 @@ export class MockAPI { host_duration: 53, device_duration: 1, self_host_duration: 7, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-7' + path: '0-7', }, { left: { @@ -3291,7 +3283,7 @@ export class MockAPI { host_duration: 69, device_duration: 1, self_host_duration: 43, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::zero_', @@ -3299,7 +3291,7 @@ export class MockAPI { host_duration: 120, device_duration: 1, self_host_duration: 51, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -3307,7 +3299,7 @@ export class MockAPI { host_duration: 304, device_duration: 4, self_host_duration: 168, - self_device_duration: 3 + self_device_duration: 3, }, { name: 'NllLossBackward0', @@ -3315,7 +3307,7 @@ export class MockAPI { host_duration: 368, device_duration: 4, self_host_duration: 64, - self_device_duration: 0 + self_device_duration: 0, }, { name: 
'autograd::engine::evaluate_function: NllLossBackward0', @@ -3323,7 +3315,7 @@ export class MockAPI { host_duration: 503, device_duration: 4, self_host_duration: 135, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ -3331,7 +3323,7 @@ export class MockAPI { host_duration: 127, device_duration: 9, self_host_duration: 105, - self_device_duration: 9 + self_device_duration: 9, }, { name: 'LogSoftmaxBackward0', @@ -3339,18 +3331,17 @@ export class MockAPI { host_duration: 207, device_duration: 9, self_host_duration: 80, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: LogSoftmaxBackward0', + name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', calls: 1, host_duration: 349, device_duration: 9, self_host_duration: 142, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: CrossEntropyLoss.backward', @@ -3364,7 +3355,7 @@ export class MockAPI { host_duration: 36, device_duration: 2, self_host_duration: 13, - self_device_duration: 2 + self_device_duration: 2, }, { name: 'aten::zero_', @@ -3372,7 +3363,7 @@ export class MockAPI { host_duration: 45, device_duration: 2, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -3380,7 +3371,7 @@ export class MockAPI { host_duration: 99, device_duration: 5, self_host_duration: 43, - self_device_duration: 3 + self_device_duration: 3, }, { name: 'NllLossBackward0', @@ -3388,7 +3379,7 @@ export class MockAPI { host_duration: 112, device_duration: 5, self_host_duration: 13, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: NllLossBackward0', @@ -3396,7 +3387,7 @@ export class MockAPI { host_duration: 141, device_duration: 5, self_host_duration: 29, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ 
-3404,7 +3395,7 @@ export class MockAPI { host_duration: 35, device_duration: 9, self_host_duration: 21, - self_device_duration: 9 + self_device_duration: 9, }, { name: 'LogSoftmaxBackward0', @@ -3412,20 +3403,19 @@ export class MockAPI { host_duration: 46, device_duration: 9, self_host_duration: 11, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: LogSoftmaxBackward0', + name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', calls: 1, host_duration: 64, device_duration: 9, self_host_duration: 18, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-8' + path: '0-8', }, { left: { @@ -3440,7 +3430,7 @@ export class MockAPI { host_duration: 61, device_duration: 0, self_host_duration: 61, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -3448,7 +3438,7 @@ export class MockAPI { host_duration: 226, device_duration: 0, self_host_duration: 180, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -3456,7 +3446,7 @@ export class MockAPI { host_duration: 399, device_duration: 0, self_host_duration: 173, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -3464,7 +3454,7 @@ export class MockAPI { host_duration: 345, device_duration: 72, self_host_duration: 282, - self_device_duration: 72 + self_device_duration: 72, }, { name: 'AddmmBackward0', @@ -3472,7 +3462,7 @@ export class MockAPI { host_duration: 854, device_duration: 72, self_host_duration: 208, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -3480,7 +3470,7 @@ export class MockAPI { host_duration: 173, device_duration: 8, self_host_duration: 153, - self_device_duration: 8 + self_device_duration: 8, }, { name: 'aten::view', @@ -3488,7 +3478,7 @@ export class MockAPI { host_duration: 971, device_duration: 0, self_host_duration: 971, - self_device_duration: 0 + self_device_duration: 0, }, { name: 
'autograd::engine::evaluate_function: AddmmBackward0', @@ -3496,7 +3486,7 @@ export class MockAPI { host_duration: 1333, device_duration: 80, self_host_duration: 271, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -3504,7 +3494,7 @@ export class MockAPI { host_duration: 12621, device_duration: 501, self_host_duration: 9839, - self_device_duration: 501 + self_device_duration: 501, }, { name: 'torch::autograd::AccumulateGrad', @@ -3512,16 +3502,15 @@ export class MockAPI { host_duration: 20767, device_duration: 501, self_host_duration: 8146, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 161, host_duration: 35735, device_duration: 501, self_host_duration: 14968, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'TBackward0', @@ -3529,7 +3518,7 @@ export class MockAPI { host_duration: 128, device_duration: 0, self_host_duration: 30, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -3537,7 +3526,7 @@ export class MockAPI { host_duration: 197, device_duration: 0, self_host_duration: 69, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -3545,7 +3534,7 @@ export class MockAPI { host_duration: 31, device_duration: 0, self_host_duration: 31, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::reshape', @@ -3553,7 +3542,7 @@ export class MockAPI { host_duration: 79, device_duration: 0, self_host_duration: 48, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -3561,16 +3550,15 @@ export class MockAPI { host_duration: 131, device_duration: 0, self_host_duration: 52, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: 
ReshapeAliasBackward0', + name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', calls: 1, host_duration: 197, device_duration: 0, self_host_duration: 66, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -3578,7 +3566,7 @@ export class MockAPI { host_duration: 84, device_duration: 0, self_host_duration: 69, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -3586,7 +3574,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -3594,7 +3582,7 @@ export class MockAPI { host_duration: 289, device_duration: 38, self_host_duration: 267, - self_device_duration: 38 + self_device_duration: 38, }, { name: 'MeanBackward1', @@ -3602,7 +3590,7 @@ export class MockAPI { host_duration: 489, device_duration: 38, self_host_duration: 110, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -3610,7 +3598,7 @@ export class MockAPI { host_duration: 592, device_duration: 38, self_host_duration: 103, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -3618,7 +3606,7 @@ export class MockAPI { host_duration: 6958, device_duration: 8972, self_host_duration: 6094, - self_device_duration: 8972 + self_device_duration: 8972, }, { name: 'ReluBackward0', @@ -3626,7 +3614,7 @@ export class MockAPI { host_duration: 10647, device_duration: 8972, self_host_duration: 3689, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -3634,7 +3622,7 @@ export class MockAPI { host_duration: 16826, device_duration: 8972, self_host_duration: 6179, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -3642,7 +3630,7 @@ export class MockAPI { host_duration: 129, device_duration: 0, self_host_duration: 129, - 
self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -3650,7 +3638,7 @@ export class MockAPI { host_duration: 1301, device_duration: 0, self_host_duration: 1172, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty', @@ -3658,7 +3646,7 @@ export class MockAPI { host_duration: 20319, device_duration: 0, self_host_duration: 20319, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ -3666,7 +3654,7 @@ export class MockAPI { host_duration: 31300, device_duration: 22267, self_host_duration: 18144, - self_device_duration: 22267 + self_device_duration: 22267, }, { name: 'CudnnBatchNormBackward0', @@ -3674,16 +3662,15 @@ export class MockAPI { host_duration: 34805, device_duration: 22267, self_host_duration: 3505, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 53, host_duration: 44607, device_duration: 22267, self_host_duration: 9802, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -3691,7 +3678,7 @@ export class MockAPI { host_duration: 20324, device_duration: 38733, self_host_duration: 15252, - self_device_duration: 38733 + self_device_duration: 38733, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -3699,7 +3686,7 @@ export class MockAPI { host_duration: 21997, device_duration: 45837, self_host_duration: 13786, - self_device_duration: 45837 + self_device_duration: 45837, }, { name: 'aten::cudnn_convolution_backward', @@ -3707,7 +3694,7 @@ export class MockAPI { host_duration: 50059, device_duration: 84570, self_host_duration: 7738, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -3715,16 +3702,15 @@ export class MockAPI { host_duration: 53558, 
device_duration: 84570, self_host_duration: 3499, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 53, host_duration: 64252, device_duration: 89775, self_host_duration: 8462, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -3732,7 +3718,7 @@ export class MockAPI { host_duration: 2232, device_duration: 5205, self_host_duration: 1944, - self_device_duration: 5205 + self_device_duration: 5205, }, { name: 'aten::fill_', @@ -3740,7 +3726,7 @@ export class MockAPI { host_duration: 61, device_duration: 230, self_host_duration: 44, - self_device_duration: 230 + self_device_duration: 230, }, { name: 'aten::zero_', @@ -3748,7 +3734,7 @@ export class MockAPI { host_duration: 104, device_duration: 230, self_host_duration: 43, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices_backward', @@ -3756,7 +3742,7 @@ export class MockAPI { host_duration: 246, device_duration: 1544, self_host_duration: 128, - self_device_duration: 1314 + self_device_duration: 1314, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -3764,18 +3750,17 @@ export class MockAPI { host_duration: 304, device_duration: 1544, self_host_duration: 58, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', calls: 1, host_duration: 425, device_duration: 1544, self_host_duration: 121, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: ResNet.backward', @@ -3789,7 +3774,7 @@ export class MockAPI { host_duration: 9, device_duration: 0, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -3797,7 +3782,7 @@ export class 
MockAPI { host_duration: 38, device_duration: 0, self_host_duration: 31, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -3805,7 +3790,7 @@ export class MockAPI { host_duration: 59, device_duration: 0, self_host_duration: 21, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -3813,7 +3798,7 @@ export class MockAPI { host_duration: 139, device_duration: 67, self_host_duration: 90, - self_device_duration: 67 + self_device_duration: 67, }, { name: 'AddmmBackward0', @@ -3821,7 +3806,7 @@ export class MockAPI { host_duration: 210, device_duration: 67, self_host_duration: 23, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -3829,7 +3814,7 @@ export class MockAPI { host_duration: 47, device_duration: 7, self_host_duration: 32, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::view', @@ -3837,7 +3822,7 @@ export class MockAPI { host_duration: 166, device_duration: 0, self_host_duration: 166, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddmmBackward0', @@ -3845,7 +3830,7 @@ export class MockAPI { host_duration: 299, device_duration: 74, self_host_duration: 37, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -3853,7 +3838,7 @@ export class MockAPI { host_duration: 4087, device_duration: 534, self_host_duration: 2037, - self_device_duration: 534 + self_device_duration: 534, }, { name: 'torch::autograd::AccumulateGrad', @@ -3861,16 +3846,15 @@ export class MockAPI { host_duration: 5134, device_duration: 534, self_host_duration: 1047, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 161, host_duration: 7473, device_duration: 534, self_host_duration: 2339, - self_device_duration: 0 + self_device_duration: 
0, }, { name: 'TBackward0', @@ -3878,7 +3862,7 @@ export class MockAPI { host_duration: 14, device_duration: 0, self_host_duration: 3, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -3886,7 +3870,7 @@ export class MockAPI { host_duration: 21, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -3894,7 +3878,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::reshape', @@ -3902,7 +3886,7 @@ export class MockAPI { host_duration: 10, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -3910,16 +3894,15 @@ export class MockAPI { host_duration: 14, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: ReshapeAliasBackward0', + name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', calls: 1, host_duration: 21, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -3927,7 +3910,7 @@ export class MockAPI { host_duration: 9, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -3935,7 +3918,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -3943,7 +3926,7 @@ export class MockAPI { host_duration: 70, device_duration: 38, self_host_duration: 49, - self_device_duration: 38 + self_device_duration: 38, }, { name: 'MeanBackward1', @@ -3951,7 +3934,7 @@ export class MockAPI { host_duration: 89, device_duration: 38, self_host_duration: 9, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -3959,7 +3942,7 @@ export class MockAPI { host_duration: 102, device_duration: 38, self_host_duration: 13, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -3967,7 +3950,7 @@ export class MockAPI { host_duration: 1789, device_duration: 9015, self_host_duration: 1158, - self_device_duration: 9015 + self_device_duration: 9015, }, { name: 'ReluBackward0', @@ -3975,7 +3958,7 @@ export class MockAPI { host_duration: 2237, device_duration: 9015, self_host_duration: 448, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -3983,7 +3966,7 @@ export class MockAPI { host_duration: 3144, device_duration: 9015, self_host_duration: 907, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -3991,7 +3974,7 @@ export class MockAPI { host_duration: 12, device_duration: 0, self_host_duration: 12, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -3999,7 +3982,7 @@ export class MockAPI { host_duration: 126, device_duration: 0, self_host_duration: 114, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty', @@ -4007,7 +3990,7 @@ export class MockAPI { host_duration: 3292, device_duration: 0, self_host_duration: 3292, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ -4015,7 +3998,7 @@ export class MockAPI { host_duration: 4896, device_duration: 22157, self_host_duration: 2136, - self_device_duration: 22157 + self_device_duration: 22157, }, { name: 'CudnnBatchNormBackward0', @@ -4023,16 +4006,15 @@ export class MockAPI { host_duration: 5495, device_duration: 22157, self_host_duration: 599, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: 
CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 53, host_duration: 7289, device_duration: 22157, self_host_duration: 1794, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -4040,7 +4022,7 @@ export class MockAPI { host_duration: 9468, device_duration: 37714, self_host_duration: 7052, - self_device_duration: 37714 + self_device_duration: 37714, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -4048,7 +4030,7 @@ export class MockAPI { host_duration: 8906, device_duration: 44342, self_host_duration: 5723, - self_device_duration: 44342 + self_device_duration: 44342, }, { name: 'aten::cudnn_convolution_backward', @@ -4056,7 +4038,7 @@ export class MockAPI { host_duration: 19611, device_duration: 82056, self_host_duration: 1237, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -4064,16 +4046,15 @@ export class MockAPI { host_duration: 20205, device_duration: 82056, self_host_duration: 594, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 53, host_duration: 22185, device_duration: 87283, self_host_duration: 1386, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -4081,7 +4062,7 @@ export class MockAPI { host_duration: 594, device_duration: 5227, self_host_duration: 380, - self_device_duration: 5227 + self_device_duration: 5227, }, { name: 'aten::fill_', @@ -4089,7 +4070,7 @@ export class MockAPI { host_duration: 24, device_duration: 230, self_host_duration: 11, - self_device_duration: 230 + self_device_duration: 230, }, { name: 'aten::zero_', @@ -4097,7 +4078,7 @@ export class MockAPI { host_duration: 32, device_duration: 230, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { 
name: 'aten::max_pool2d_with_indices_backward', @@ -4105,7 +4086,7 @@ export class MockAPI { host_duration: 72, device_duration: 1503, self_host_duration: 31, - self_device_duration: 1273 + self_device_duration: 1273, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -4113,20 +4094,19 @@ export class MockAPI { host_duration: 82, device_duration: 1503, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', calls: 1, host_duration: 103, device_duration: 1503, self_host_duration: 21, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-9' + path: '0-9', }, { left: { @@ -4141,7 +4121,7 @@ export class MockAPI { host_duration: 75, device_duration: 0, self_host_duration: 75, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4149,7 +4129,7 @@ export class MockAPI { host_duration: 4, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4157,9 +4137,9 @@ export class MockAPI { host_duration: 154, device_duration: 0, self_host_duration: 75, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::zeros', @@ -4173,7 +4153,7 @@ export class MockAPI { host_duration: 32, device_duration: 0, self_host_duration: 32, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4181,7 +4161,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4189,11 +4169,11 @@ export class MockAPI { host_duration: 42, device_duration: 0, self_host_duration: 9, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-10' + path: '0-10', }, { left: { @@ -4208,7 +4188,7 @@ export 
class MockAPI { host_duration: 40, device_duration: 0, self_host_duration: 40, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -4216,7 +4196,7 @@ export class MockAPI { host_duration: 11873, device_duration: 396, self_host_duration: 9505, - self_device_duration: 396 + self_device_duration: 396, }, { name: 'aten::add_', @@ -4224,9 +4204,9 @@ export class MockAPI { host_duration: 22327, device_duration: 893, self_host_duration: 17668, - self_device_duration: 893 - } - ] + self_device_duration: 893, + }, + ], }, right: { name: 'Optimizer.step#SGD.step', @@ -4240,7 +4220,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -4248,7 +4228,7 @@ export class MockAPI { host_duration: 3395, device_duration: 399, self_host_duration: 1806, - self_device_duration: 399 + self_device_duration: 399, }, { name: 'aten::add_', @@ -4256,11 +4236,11 @@ export class MockAPI { host_duration: 6217, device_duration: 906, self_host_duration: 3246, - self_device_duration: 906 - } - ] + self_device_duration: 906, + }, + ], }, - path: '0-11' + path: '0-11', }, { left: { @@ -4275,7 +4255,7 @@ export class MockAPI { host_duration: 79, device_duration: 0, self_host_duration: 79, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4283,7 +4263,7 @@ export class MockAPI { host_duration: 4, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4291,9 +4271,9 @@ export class MockAPI { host_duration: 106, device_duration: 0, self_host_duration: 62, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'multiple nodes', @@ -4307,7 +4287,7 @@ export class MockAPI { host_duration: 10, device_duration: 0, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4315,7 
+4295,7 @@ export class MockAPI { host_duration: 0, device_duration: 0, self_host_duration: 0, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4323,11 +4303,11 @@ export class MockAPI { host_duration: 9, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-12' + path: '0-12', }, { left: { @@ -4342,7 +4322,7 @@ export class MockAPI { host_duration: 53837, device_duration: 0, self_host_duration: 53837, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4350,7 +4330,7 @@ export class MockAPI { host_duration: 955, device_duration: 0, self_host_duration: 955, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4358,7 +4338,7 @@ export class MockAPI { host_duration: 26673, device_duration: 0, self_host_duration: 16083, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -4366,7 +4346,7 @@ export class MockAPI { host_duration: 824006, device_duration: 0, self_host_duration: 18525, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -4374,7 +4354,7 @@ export class MockAPI { host_duration: 2188, device_duration: 0, self_host_duration: 2188, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -4382,7 +4362,7 @@ export class MockAPI { host_duration: 5295, device_duration: 0, self_host_duration: 3107, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -4390,7 +4370,7 @@ export class MockAPI { host_duration: 4123, device_duration: 0, self_host_duration: 4123, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -4398,7 +4378,7 @@ export class MockAPI { host_duration: 9590, device_duration: 0, self_host_duration: 8097, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_strided', @@ -4406,7 +4386,7 @@ export class MockAPI { 
host_duration: 24764, device_duration: 0, self_host_duration: 24764, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -4414,7 +4394,7 @@ export class MockAPI { host_duration: 728608, device_duration: 0, self_host_duration: 728608, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_to_copy', @@ -4422,7 +4402,7 @@ export class MockAPI { host_duration: 805481, device_duration: 0, self_host_duration: 51350, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::upsample_bilinear2d', @@ -4430,7 +4410,7 @@ export class MockAPI { host_duration: 236448, device_duration: 0, self_host_duration: 216887, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -4438,7 +4418,7 @@ export class MockAPI { host_duration: 4682, device_duration: 0, self_host_duration: 4092, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -4446,7 +4426,7 @@ export class MockAPI { host_duration: 15283, device_duration: 0, self_host_duration: 15283, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -4454,7 +4434,7 @@ export class MockAPI { host_duration: 8844, device_duration: 0, self_host_duration: 7513, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -4462,7 +4442,7 @@ export class MockAPI { host_duration: 2102, device_duration: 0, self_host_duration: 2102, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -4470,7 +4450,7 @@ export class MockAPI { host_duration: 7286, device_duration: 0, self_host_duration: 5184, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -4478,7 +4458,7 @@ export class MockAPI { host_duration: 850, device_duration: 0, self_host_duration: 850, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -4486,7 +4466,7 @@ export class MockAPI { host_duration: 43219, device_duration: 0, 
self_host_duration: 39305, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -4494,7 +4474,7 @@ export class MockAPI { host_duration: 92093, device_duration: 0, self_host_duration: 37961, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::gt', @@ -4502,7 +4482,7 @@ export class MockAPI { host_duration: 35770, device_duration: 0, self_host_duration: 24869, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -4510,7 +4490,7 @@ export class MockAPI { host_duration: 2481, device_duration: 0, self_host_duration: 2481, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -4518,7 +4498,7 @@ export class MockAPI { host_duration: 10547, device_duration: 0, self_host_duration: 8066, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -4526,7 +4506,7 @@ export class MockAPI { host_duration: 14029, device_duration: 0, self_host_duration: 5364, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -4534,7 +4514,7 @@ export class MockAPI { host_duration: 79760, device_duration: 0, self_host_duration: 68841, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -4542,7 +4522,7 @@ export class MockAPI { host_duration: 121, device_duration: 0, self_host_duration: 121, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -4550,7 +4530,7 @@ export class MockAPI { host_duration: 138, device_duration: 0, self_host_duration: 48, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -4558,7 +4538,7 @@ export class MockAPI { host_duration: 41467, device_duration: 0, self_host_duration: 41176, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -4566,7 +4546,7 @@ export class MockAPI { host_duration: 41608, device_duration: 0, self_host_duration: 141, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'aten::stack', @@ -4574,9 +4554,9 @@ export class MockAPI { host_duration: 49080, device_duration: 0, self_host_duration: 2720, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'enumerate(DataLoader)#_SingleProcessDataLoaderIter.__next__', @@ -4590,7 +4570,7 @@ export class MockAPI { host_duration: 6528, device_duration: 0, self_host_duration: 6528, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -4598,7 +4578,7 @@ export class MockAPI { host_duration: 94, device_duration: 0, self_host_duration: 94, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -4606,7 +4586,7 @@ export class MockAPI { host_duration: 2448, device_duration: 0, self_host_duration: 1214, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -4614,7 +4594,7 @@ export class MockAPI { host_duration: 16544, device_duration: 0, self_host_duration: 1856, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach', @@ -4622,7 +4602,7 @@ export class MockAPI { host_duration: 337, device_duration: 0, self_host_duration: 337, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach', @@ -4630,7 +4610,7 @@ export class MockAPI { host_duration: 629, device_duration: 0, self_host_duration: 292, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -4638,7 +4618,7 @@ export class MockAPI { host_duration: 464, device_duration: 0, self_host_duration: 464, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::unsqueeze', @@ -4646,7 +4626,7 @@ export class MockAPI { host_duration: 1024, device_duration: 0, self_host_duration: 854, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_strided', @@ -4654,7 +4634,7 @@ export class MockAPI { host_duration: 3009, device_duration: 0, self_host_duration: 3009, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'aten::copy_', @@ -4662,7 +4642,7 @@ export class MockAPI { host_duration: 7419, device_duration: 0, self_host_duration: 7419, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_to_copy', @@ -4670,7 +4650,7 @@ export class MockAPI { host_duration: 14688, device_duration: 0, self_host_duration: 4039, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::upsample_bilinear2d', @@ -4678,7 +4658,7 @@ export class MockAPI { host_duration: 31439, device_duration: 0, self_host_duration: 29154, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::squeeze', @@ -4686,7 +4666,7 @@ export class MockAPI { host_duration: 473, device_duration: 0, self_host_duration: 408, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::round', @@ -4694,7 +4674,7 @@ export class MockAPI { host_duration: 4416, device_duration: 0, self_host_duration: 4416, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::slice', @@ -4702,7 +4682,7 @@ export class MockAPI { host_duration: 864, device_duration: 0, self_host_duration: 730, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'detach_', @@ -4710,7 +4690,7 @@ export class MockAPI { host_duration: 136, device_duration: 0, self_host_duration: 115, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::detach_', @@ -4718,7 +4698,7 @@ export class MockAPI { host_duration: 586, device_duration: 0, self_host_duration: 471, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::result_type', @@ -4726,7 +4706,7 @@ export class MockAPI { host_duration: 149, device_duration: 0, self_host_duration: 149, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::pow', @@ -4734,7 +4714,7 @@ export class MockAPI { host_duration: 3935, device_duration: 0, self_host_duration: 3519, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sub', @@ -4742,7 +4722,7 @@ 
export class MockAPI { host_duration: 7881, device_duration: 0, self_host_duration: 3349, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::gt', @@ -4750,7 +4730,7 @@ export class MockAPI { host_duration: 3055, device_duration: 0, self_host_duration: 2164, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_local_scalar_dense', @@ -4758,7 +4738,7 @@ export class MockAPI { host_duration: 186, device_duration: 0, self_host_duration: 186, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::item', @@ -4766,7 +4746,7 @@ export class MockAPI { host_duration: 1134, device_duration: 0, self_host_duration: 943, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::is_nonzero', @@ -4774,7 +4754,7 @@ export class MockAPI { host_duration: 1588, device_duration: 0, self_host_duration: 615, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -4782,7 +4762,7 @@ export class MockAPI { host_duration: 4153, device_duration: 0, self_host_duration: 3203, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -4790,7 +4770,7 @@ export class MockAPI { host_duration: 42, device_duration: 0, self_host_duration: 42, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::narrow', @@ -4798,7 +4778,7 @@ export class MockAPI { host_duration: 18, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_cat', @@ -4806,7 +4786,7 @@ export class MockAPI { host_duration: 4613, device_duration: 0, self_host_duration: 4547, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cat', @@ -4814,7 +4794,7 @@ export class MockAPI { host_duration: 4637, device_duration: 0, self_host_duration: 24, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::stack', @@ -4822,11 +4802,11 @@ export class MockAPI { host_duration: 5311, device_duration: 0, self_host_duration: 246, 
- self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-13' + path: '0-13', }, { left: { @@ -4841,7 +4821,7 @@ export class MockAPI { host_duration: 203, device_duration: 0, self_host_duration: 203, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -4849,7 +4829,7 @@ export class MockAPI { host_duration: 4687, device_duration: 4394, self_host_duration: 94, - self_device_duration: 4394 + self_device_duration: 4394, }, { name: 'aten::_to_copy', @@ -4857,7 +4837,7 @@ export class MockAPI { host_duration: 5113, device_duration: 4394, self_host_duration: 223, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -4865,9 +4845,9 @@ export class MockAPI { host_duration: 5185, device_duration: 4394, self_host_duration: 72, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'multiple nodes', @@ -4881,7 +4861,7 @@ export class MockAPI { host_duration: 60, device_duration: 0, self_host_duration: 60, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::copy_', @@ -4889,7 +4869,7 @@ export class MockAPI { host_duration: 4559, device_duration: 4334, self_host_duration: 26, - self_device_duration: 4334 + self_device_duration: 4334, }, { name: 'aten::_to_copy', @@ -4897,7 +4877,7 @@ export class MockAPI { host_duration: 4655, device_duration: 4334, self_host_duration: 36, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -4905,11 +4885,11 @@ export class MockAPI { host_duration: 4664, device_duration: 4334, self_host_duration: 9, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-14' + path: '0-14', }, { left: { @@ -4924,7 +4904,7 @@ export class MockAPI { host_duration: 13992, device_duration: 0, self_host_duration: 13992, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -4932,7 +4912,7 @@ export class MockAPI { host_duration: 21952, 
device_duration: 35233, self_host_duration: 17460, - self_device_duration: 35233 + self_device_duration: 35233, }, { name: 'aten::_convolution', @@ -4940,7 +4920,7 @@ export class MockAPI { host_duration: 25568, device_duration: 35233, self_host_duration: 3616, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -4948,7 +4928,7 @@ export class MockAPI { host_duration: 27534, device_duration: 35233, self_host_duration: 1966, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -4956,7 +4936,7 @@ export class MockAPI { host_duration: 29546, device_duration: 35233, self_host_duration: 2012, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -4964,7 +4944,7 @@ export class MockAPI { host_duration: 6523, device_duration: 53, self_host_duration: 5669, - self_device_duration: 53 + self_device_duration: 53, }, { name: 'aten::empty_like', @@ -4972,7 +4952,7 @@ export class MockAPI { host_duration: 5605, device_duration: 0, self_host_duration: 2378, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -4980,7 +4960,7 @@ export class MockAPI { host_duration: 829, device_duration: 0, self_host_duration: 829, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -4988,7 +4968,7 @@ export class MockAPI { host_duration: 35510, device_duration: 12828, self_host_duration: 20387, - self_device_duration: 12828 + self_device_duration: 12828, }, { name: 'aten::_batch_norm_impl_index', @@ -4996,7 +4976,7 @@ export class MockAPI { host_duration: 38030, device_duration: 12828, self_host_duration: 2520, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -5004,7 +4984,7 @@ export class MockAPI { host_duration: 39727, device_duration: 12828, self_host_duration: 1697, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ -5012,7 +4992,7 @@ export class MockAPI { 
host_duration: 2715, device_duration: 5998, self_host_duration: 1950, - self_device_duration: 5998 + self_device_duration: 5998, }, { name: 'aten::clamp_min_', @@ -5020,7 +5000,7 @@ export class MockAPI { host_duration: 4264, device_duration: 5998, self_host_duration: 1549, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -5028,7 +5008,7 @@ export class MockAPI { host_duration: 8337, device_duration: 5998, self_host_duration: 4073, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -5036,7 +5016,7 @@ export class MockAPI { host_duration: 212, device_duration: 466, self_host_duration: 193, - self_device_duration: 466 + self_device_duration: 466, }, { name: 'aten::max_pool2d', @@ -5044,7 +5024,7 @@ export class MockAPI { host_duration: 262, device_duration: 466, self_host_duration: 50, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -5052,7 +5032,7 @@ export class MockAPI { host_duration: 1553, device_duration: 5165, self_host_duration: 1297, - self_device_duration: 5165 + self_device_duration: 5165, }, { name: 'aten::mean', @@ -5060,7 +5040,7 @@ export class MockAPI { host_duration: 187, device_duration: 64, self_host_duration: 169, - self_device_duration: 64 + self_device_duration: 64, }, { name: 'aten::adaptive_avg_pool2d', @@ -5068,7 +5048,7 @@ export class MockAPI { host_duration: 231, device_duration: 64, self_host_duration: 44, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -5076,7 +5056,7 @@ export class MockAPI { host_duration: 52, device_duration: 0, self_host_duration: 52, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -5084,7 +5064,7 @@ export class MockAPI { host_duration: 101, device_duration: 0, self_host_duration: 49, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -5092,7 +5072,7 @@ export class MockAPI { host_duration: 
21, device_duration: 0, self_host_duration: 21, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -5100,7 +5080,7 @@ export class MockAPI { host_duration: 51, device_duration: 0, self_host_duration: 40, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -5108,7 +5088,7 @@ export class MockAPI { host_duration: 120, device_duration: 0, self_host_duration: 69, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -5116,7 +5096,7 @@ export class MockAPI { host_duration: 49, device_duration: 0, self_host_duration: 39, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::addmm', @@ -5124,7 +5104,7 @@ export class MockAPI { host_duration: 405, device_duration: 41, self_host_duration: 302, - self_device_duration: 41 + self_device_duration: 41, }, { name: 'aten::linear', @@ -5132,9 +5112,9 @@ export class MockAPI { host_duration: 594, device_duration: 41, self_host_duration: 69, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: ResNet', @@ -5148,7 +5128,7 @@ export class MockAPI { host_duration: 2234, device_duration: 0, self_host_duration: 2234, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution', @@ -5156,7 +5136,7 @@ export class MockAPI { host_duration: 8644, device_duration: 35209, self_host_duration: 6782, - self_device_duration: 35209 + self_device_duration: 35209, }, { name: 'aten::_convolution', @@ -5164,7 +5144,7 @@ export class MockAPI { host_duration: 9216, device_duration: 35209, self_host_duration: 572, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::convolution', @@ -5172,7 +5152,7 @@ export class MockAPI { host_duration: 9532, device_duration: 35209, self_host_duration: 316, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::conv2d', @@ -5180,7 +5160,7 @@ export class MockAPI { host_duration: 9818, device_duration: 
35209, self_host_duration: 286, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -5188,7 +5168,7 @@ export class MockAPI { host_duration: 1898, device_duration: 55, self_host_duration: 1202, - self_device_duration: 55 + self_device_duration: 55, }, { name: 'aten::empty_like', @@ -5196,7 +5176,7 @@ export class MockAPI { host_duration: 941, device_duration: 0, self_host_duration: 300, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::view', @@ -5204,7 +5184,7 @@ export class MockAPI { host_duration: 137, device_duration: 0, self_host_duration: 137, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm', @@ -5212,7 +5192,7 @@ export class MockAPI { host_duration: 5543, device_duration: 12824, self_host_duration: 2527, - self_device_duration: 12824 + self_device_duration: 12824, }, { name: 'aten::_batch_norm_impl_index', @@ -5220,7 +5200,7 @@ export class MockAPI { host_duration: 5914, device_duration: 12824, self_host_duration: 371, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::batch_norm', @@ -5228,7 +5208,7 @@ export class MockAPI { host_duration: 6167, device_duration: 12824, self_host_duration: 253, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::clamp_min', @@ -5236,7 +5216,7 @@ export class MockAPI { host_duration: 1081, device_duration: 6004, self_host_duration: 507, - self_device_duration: 6004 + self_device_duration: 6004, }, { name: 'aten::clamp_min_', @@ -5244,7 +5224,7 @@ export class MockAPI { host_duration: 1299, device_duration: 6004, self_host_duration: 218, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::relu_', @@ -5252,7 +5232,7 @@ export class MockAPI { host_duration: 1941, device_duration: 6004, self_host_duration: 642, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices', @@ -5260,7 +5240,7 @@ export class MockAPI { host_duration: 59, 
device_duration: 466, self_host_duration: 44, - self_device_duration: 466 + self_device_duration: 466, }, { name: 'aten::max_pool2d', @@ -5268,7 +5248,7 @@ export class MockAPI { host_duration: 66, device_duration: 466, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -5276,7 +5256,7 @@ export class MockAPI { host_duration: 443, device_duration: 5169, self_host_duration: 267, - self_device_duration: 5169 + self_device_duration: 5169, }, { name: 'aten::mean', @@ -5284,7 +5264,7 @@ export class MockAPI { host_duration: 51, device_duration: 63, self_host_duration: 37, - self_device_duration: 63 + self_device_duration: 63, }, { name: 'aten::adaptive_avg_pool2d', @@ -5292,7 +5272,7 @@ export class MockAPI { host_duration: 58, device_duration: 63, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -5300,7 +5280,7 @@ export class MockAPI { host_duration: 8, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::flatten', @@ -5308,7 +5288,7 @@ export class MockAPI { host_duration: 16, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::as_strided', @@ -5316,7 +5296,7 @@ export class MockAPI { host_duration: 3, device_duration: 0, self_host_duration: 3, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -5324,7 +5304,7 @@ export class MockAPI { host_duration: 10, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -5332,7 +5312,7 @@ export class MockAPI { host_duration: 18, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -5340,7 +5320,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'aten::addmm', @@ -5348,7 +5328,7 @@ export class MockAPI { host_duration: 161, device_duration: 42, self_host_duration: 111, - self_device_duration: 42 + self_device_duration: 42, }, { name: 'aten::linear', @@ -5356,11 +5336,11 @@ export class MockAPI { host_duration: 188, device_duration: 42, self_host_duration: 9, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-15' + path: '0-15', }, { left: { @@ -5375,7 +5355,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -5383,7 +5363,7 @@ export class MockAPI { host_duration: 150, device_duration: 7, self_host_duration: 132, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::log_softmax', @@ -5391,7 +5371,7 @@ export class MockAPI { host_duration: 231, device_duration: 7, self_host_duration: 75, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -5399,7 +5379,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -5407,7 +5387,7 @@ export class MockAPI { host_duration: 266, device_duration: 4, self_host_duration: 243, - self_device_duration: 4 + self_device_duration: 4, }, { name: 'aten::nll_loss', @@ -5415,7 +5395,7 @@ export class MockAPI { host_duration: 300, device_duration: 4, self_host_duration: 34, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -5423,7 +5403,7 @@ export class MockAPI { host_duration: 328, device_duration: 4, self_host_duration: 28, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -5431,9 +5411,9 @@ export class MockAPI { host_duration: 620, device_duration: 11, self_host_duration: 61, - self_device_duration: 0 - } - ] + 
self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: CrossEntropyLoss', @@ -5447,7 +5427,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax', @@ -5455,7 +5435,7 @@ export class MockAPI { host_duration: 41, device_duration: 7, self_host_duration: 27, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::log_softmax', @@ -5463,7 +5443,7 @@ export class MockAPI { host_duration: 52, device_duration: 7, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::resize_', @@ -5471,7 +5451,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_forward', @@ -5479,7 +5459,7 @@ export class MockAPI { host_duration: 49, device_duration: 4, self_host_duration: 34, - self_device_duration: 4 + self_device_duration: 4, }, { name: 'aten::nll_loss', @@ -5487,7 +5467,7 @@ export class MockAPI { host_duration: 53, device_duration: 4, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_nd', @@ -5495,7 +5475,7 @@ export class MockAPI { host_duration: 57, device_duration: 4, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cross_entropy_loss', @@ -5503,11 +5483,11 @@ export class MockAPI { host_duration: 124, device_duration: 11, self_host_duration: 15, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-16' + path: '0-16', }, { left: { @@ -5522,7 +5502,7 @@ export class MockAPI { host_duration: 39, device_duration: 0, self_host_duration: 39, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -5530,7 +5510,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'aten::zeros', @@ -5538,9 +5518,9 @@ export class MockAPI { host_duration: 109, device_duration: 0, self_host_duration: 65, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::zeros', @@ -5554,7 +5534,7 @@ export class MockAPI { host_duration: 13, device_duration: 0, self_host_duration: 13, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -5562,7 +5542,7 @@ export class MockAPI { host_duration: 1, device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -5570,11 +5550,11 @@ export class MockAPI { host_duration: 23, device_duration: 0, self_host_duration: 9, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-17' + path: '0-17', }, { left: { @@ -5589,7 +5569,7 @@ export class MockAPI { host_duration: 44, device_duration: 0, self_host_duration: 44, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -5597,7 +5577,7 @@ export class MockAPI { host_duration: 7104, device_duration: 132, self_host_duration: 4941, - self_device_duration: 132 + self_device_duration: 132, }, { name: 'aten::zero_', @@ -5605,9 +5585,9 @@ export class MockAPI { host_duration: 14806, device_duration: 132, self_host_duration: 7702, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'Optimizer.zero_grad#SGD.zero_grad', @@ -5621,7 +5601,7 @@ export class MockAPI { host_duration: 6, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -5629,7 +5609,7 @@ export class MockAPI { host_duration: 1945, device_duration: 137, self_host_duration: 878, - self_device_duration: 137 + self_device_duration: 137, }, { name: 'aten::zero_', @@ -5637,11 +5617,11 @@ export class MockAPI { host_duration: 2805, device_duration: 137, self_host_duration: 860, - 
self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-18' + path: '0-18', }, { left: { @@ -5656,7 +5636,7 @@ export class MockAPI { host_duration: 99, device_duration: 0, self_host_duration: 99, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_like', @@ -5664,7 +5644,7 @@ export class MockAPI { host_duration: 149, device_duration: 0, self_host_duration: 50, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -5672,7 +5652,7 @@ export class MockAPI { host_duration: 49, device_duration: 1, self_host_duration: 34, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::ones_like', @@ -5680,9 +5660,9 @@ export class MockAPI { host_duration: 263, device_duration: 1, self_host_duration: 65, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::ones_like', @@ -5696,7 +5676,7 @@ export class MockAPI { host_duration: 18, device_duration: 0, self_host_duration: 18, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty_like', @@ -5704,7 +5684,7 @@ export class MockAPI { host_duration: 24, device_duration: 0, self_host_duration: 6, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::fill_', @@ -5712,7 +5692,7 @@ export class MockAPI { host_duration: 20, device_duration: 1, self_host_duration: 8, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::ones_like', @@ -5720,11 +5700,11 @@ export class MockAPI { host_duration: 51, device_duration: 1, self_host_duration: 7, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-19' + path: '0-19', }, { left: { @@ -5739,7 +5719,7 @@ export class MockAPI { host_duration: 58, device_duration: 1, self_host_duration: 36, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::zero_', @@ -5747,7 +5727,7 @@ export class MockAPI { host_duration: 112, device_duration: 1, self_host_duration: 54, - 
self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -5755,7 +5735,7 @@ export class MockAPI { host_duration: 269, device_duration: 4, self_host_duration: 142, - self_device_duration: 3 + self_device_duration: 3, }, { name: 'NllLossBackward0', @@ -5763,7 +5743,7 @@ export class MockAPI { host_duration: 406, device_duration: 4, self_host_duration: 137, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: NllLossBackward0', @@ -5771,7 +5751,7 @@ export class MockAPI { host_duration: 522, device_duration: 4, self_host_duration: 116, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ -5779,7 +5759,7 @@ export class MockAPI { host_duration: 109, device_duration: 9, self_host_duration: 91, - self_device_duration: 9 + self_device_duration: 9, }, { name: 'LogSoftmaxBackward0', @@ -5787,18 +5767,17 @@ export class MockAPI { host_duration: 178, device_duration: 9, self_host_duration: 69, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: LogSoftmaxBackward0', + name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', calls: 1, host_duration: 283, device_duration: 9, self_host_duration: 105, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: CrossEntropyLoss.backward', @@ -5812,7 +5791,7 @@ export class MockAPI { host_duration: 33, device_duration: 1, self_host_duration: 12, - self_device_duration: 1 + self_device_duration: 1, }, { name: 'aten::zero_', @@ -5820,7 +5799,7 @@ export class MockAPI { host_duration: 41, device_duration: 1, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::nll_loss_backward', @@ -5828,7 +5807,7 @@ export class MockAPI { host_duration: 93, device_duration: 4, self_host_duration: 41, - self_device_duration: 3 + self_device_duration: 3, }, { name: 
'NllLossBackward0', @@ -5836,7 +5815,7 @@ export class MockAPI { host_duration: 185, device_duration: 4, self_host_duration: 92, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: NllLossBackward0', @@ -5844,7 +5823,7 @@ export class MockAPI { host_duration: 211, device_duration: 4, self_host_duration: 26, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_log_softmax_backward_data', @@ -5852,7 +5831,7 @@ export class MockAPI { host_duration: 36, device_duration: 9, self_host_duration: 22, - self_device_duration: 9 + self_device_duration: 9, }, { name: 'LogSoftmaxBackward0', @@ -5860,20 +5839,19 @@ export class MockAPI { host_duration: 45, device_duration: 9, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: LogSoftmaxBackward0', + name: 'autograd::engine::evaluate_function: LogSoftmaxBackward0', calls: 1, host_duration: 62, device_duration: 9, self_host_duration: 17, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-20' + path: '0-20', }, { left: { @@ -5888,7 +5866,7 @@ export class MockAPI { host_duration: 67, device_duration: 0, self_host_duration: 67, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -5896,7 +5874,7 @@ export class MockAPI { host_duration: 255, device_duration: 0, self_host_duration: 204, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -5904,7 +5882,7 @@ export class MockAPI { host_duration: 430, device_duration: 0, self_host_duration: 175, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -5912,7 +5890,7 @@ export class MockAPI { host_duration: 323, device_duration: 68, self_host_duration: 265, - self_device_duration: 68 + self_device_duration: 68, }, { name: 'AddmmBackward0', @@ -5920,7 +5898,7 @@ export class MockAPI { host_duration: 844, device_duration: 68, 
self_host_duration: 209, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -5928,7 +5906,7 @@ export class MockAPI { host_duration: 197, device_duration: 7, self_host_duration: 175, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::view', @@ -5936,7 +5914,7 @@ export class MockAPI { host_duration: 963, device_duration: 0, self_host_duration: 963, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddmmBackward0', @@ -5944,7 +5922,7 @@ export class MockAPI { host_duration: 1377, device_duration: 75, self_host_duration: 296, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -5952,7 +5930,7 @@ export class MockAPI { host_duration: 12404, device_duration: 496, self_host_duration: 9659, - self_device_duration: 496 + self_device_duration: 496, }, { name: 'torch::autograd::AccumulateGrad', @@ -5960,16 +5938,15 @@ export class MockAPI { host_duration: 20417, device_duration: 496, self_host_duration: 8013, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 161, host_duration: 35211, device_duration: 496, self_host_duration: 14794, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'TBackward0', @@ -5977,7 +5954,7 @@ export class MockAPI { host_duration: 152, device_duration: 0, self_host_duration: 34, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -5985,7 +5962,7 @@ export class MockAPI { host_duration: 231, device_duration: 0, self_host_duration: 79, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -5993,7 +5970,7 @@ export class MockAPI { host_duration: 35, device_duration: 0, self_host_duration: 35, - self_device_duration: 0 + 
self_device_duration: 0, }, { name: 'aten::reshape', @@ -6001,7 +5978,7 @@ export class MockAPI { host_duration: 91, device_duration: 0, self_host_duration: 56, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -6009,16 +5986,15 @@ export class MockAPI { host_duration: 133, device_duration: 0, self_host_duration: 42, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: ReshapeAliasBackward0', + name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', calls: 1, host_duration: 205, device_duration: 0, self_host_duration: 72, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -6026,7 +6002,7 @@ export class MockAPI { host_duration: 95, device_duration: 0, self_host_duration: 79, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -6034,7 +6010,7 @@ export class MockAPI { host_duration: 7, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -6042,7 +6018,7 @@ export class MockAPI { host_duration: 324, device_duration: 37, self_host_duration: 301, - self_device_duration: 37 + self_device_duration: 37, }, { name: 'MeanBackward1', @@ -6050,7 +6026,7 @@ export class MockAPI { host_duration: 547, device_duration: 37, self_host_duration: 121, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -6058,7 +6034,7 @@ export class MockAPI { host_duration: 662, device_duration: 37, self_host_duration: 115, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -6066,7 +6042,7 @@ export class MockAPI { host_duration: 6880, device_duration: 9012, self_host_duration: 6037, - self_device_duration: 9012 + self_device_duration: 9012, }, { name: 'ReluBackward0', @@ -6074,7 +6050,7 @@ export class MockAPI { host_duration: 10536, device_duration: 9012, 
self_host_duration: 3656, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -6082,7 +6058,7 @@ export class MockAPI { host_duration: 16666, device_duration: 9012, self_host_duration: 6130, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -6090,7 +6066,7 @@ export class MockAPI { host_duration: 122, device_duration: 0, self_host_duration: 122, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -6098,7 +6074,7 @@ export class MockAPI { host_duration: 1278, device_duration: 0, self_host_duration: 1156, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty', @@ -6106,7 +6082,7 @@ export class MockAPI { host_duration: 21126, device_duration: 0, self_host_duration: 21126, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ -6114,7 +6090,7 @@ export class MockAPI { host_duration: 30875, device_duration: 22166, self_host_duration: 17909, - self_device_duration: 22166 + self_device_duration: 22166, }, { name: 'CudnnBatchNormBackward0', @@ -6122,16 +6098,15 @@ export class MockAPI { host_duration: 34355, device_duration: 22166, self_host_duration: 3480, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 53, host_duration: 44006, device_duration: 22166, self_host_duration: 9651, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -6139,7 +6114,7 @@ export class MockAPI { host_duration: 20496, device_duration: 37887, self_host_duration: 15516, - self_device_duration: 37887 + self_device_duration: 37887, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -6147,7 +6122,7 @@ export class MockAPI { host_duration: 22878, 
device_duration: 44271, self_host_duration: 13672, - self_device_duration: 44271 + self_device_duration: 44271, }, { name: 'aten::cudnn_convolution_backward', @@ -6155,7 +6130,7 @@ export class MockAPI { host_duration: 50961, device_duration: 82158, self_host_duration: 7587, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -6163,16 +6138,15 @@ export class MockAPI { host_duration: 54406, device_duration: 82158, self_host_duration: 3445, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 53, host_duration: 64877, device_duration: 87386, self_host_duration: 8284, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -6180,7 +6154,7 @@ export class MockAPI { host_duration: 2187, device_duration: 5228, self_host_duration: 1909, - self_device_duration: 5228 + self_device_duration: 5228, }, { name: 'aten::fill_', @@ -6188,7 +6162,7 @@ export class MockAPI { host_duration: 53, device_duration: 230, self_host_duration: 36, - self_device_duration: 230 + self_device_duration: 230, }, { name: 'aten::zero_', @@ -6196,7 +6170,7 @@ export class MockAPI { host_duration: 96, device_duration: 230, self_host_duration: 43, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices_backward', @@ -6204,7 +6178,7 @@ export class MockAPI { host_duration: 237, device_duration: 1504, self_host_duration: 129, - self_device_duration: 1274 + self_device_duration: 1274, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -6212,18 +6186,17 @@ export class MockAPI { host_duration: 295, device_duration: 1504, self_host_duration: 58, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: 
MaxPool2DWithIndicesBackward0', calls: 1, host_duration: 411, device_duration: 1504, self_host_duration: 116, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'nn.Module: ResNet.backward', @@ -6237,7 +6210,7 @@ export class MockAPI { host_duration: 7, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::transpose', @@ -6245,7 +6218,7 @@ export class MockAPI { host_duration: 29, device_duration: 0, self_host_duration: 23, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::t', @@ -6253,7 +6226,7 @@ export class MockAPI { host_duration: 53, device_duration: 0, self_host_duration: 24, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mm', @@ -6261,7 +6234,7 @@ export class MockAPI { host_duration: 144, device_duration: 67, self_host_duration: 96, - self_device_duration: 67 + self_device_duration: 67, }, { name: 'AddmmBackward0', @@ -6269,7 +6242,7 @@ export class MockAPI { host_duration: 208, device_duration: 67, self_host_duration: 24, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::sum', @@ -6277,7 +6250,7 @@ export class MockAPI { host_duration: 45, device_duration: 7, self_host_duration: 30, - self_device_duration: 7 + self_device_duration: 7, }, { name: 'aten::view', @@ -6285,7 +6258,7 @@ export class MockAPI { host_duration: 163, device_duration: 0, self_host_duration: 163, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddmmBackward0', @@ -6293,7 +6266,7 @@ export class MockAPI { host_duration: 295, device_duration: 74, self_host_duration: 38, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add_', @@ -6301,7 +6274,7 @@ export class MockAPI { host_duration: 4103, device_duration: 535, self_host_duration: 2037, - self_device_duration: 535 + self_device_duration: 535, }, { name: 'torch::autograd::AccumulateGrad', @@ -6309,16 
+6282,15 @@ export class MockAPI { host_duration: 5183, device_duration: 535, self_host_duration: 1080, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', + name: 'autograd::engine::evaluate_function: torch::autograd::AccumulateGrad', calls: 161, host_duration: 7655, device_duration: 535, self_host_duration: 2472, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'TBackward0', @@ -6326,7 +6298,7 @@ export class MockAPI { host_duration: 16, device_duration: 0, self_host_duration: 3, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: TBackward0', @@ -6334,7 +6306,7 @@ export class MockAPI { host_duration: 24, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::_reshape_alias', @@ -6342,7 +6314,7 @@ export class MockAPI { host_duration: 5, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::reshape', @@ -6350,7 +6322,7 @@ export class MockAPI { host_duration: 10, device_duration: 0, self_host_duration: 5, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'ReshapeAliasBackward0', @@ -6358,16 +6330,15 @@ export class MockAPI { host_duration: 17, device_duration: 0, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: ReshapeAliasBackward0', + name: 'autograd::engine::evaluate_function: ReshapeAliasBackward0', calls: 1, host_duration: 27, device_duration: 0, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::expand', @@ -6375,7 +6346,7 @@ export class MockAPI { host_duration: 10, device_duration: 0, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::to', @@ -6383,7 +6354,7 @@ export class MockAPI { host_duration: 1, 
device_duration: 0, self_host_duration: 1, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::div', @@ -6391,7 +6362,7 @@ export class MockAPI { host_duration: 63, device_duration: 37, self_host_duration: 45, - self_device_duration: 37 + self_device_duration: 37, }, { name: 'MeanBackward1', @@ -6399,7 +6370,7 @@ export class MockAPI { host_duration: 83, device_duration: 37, self_host_duration: 9, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: MeanBackward1', @@ -6407,7 +6378,7 @@ export class MockAPI { host_duration: 99, device_duration: 37, self_host_duration: 16, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::threshold_backward', @@ -6415,7 +6386,7 @@ export class MockAPI { host_duration: 1863, device_duration: 9003, self_host_duration: 1203, - self_device_duration: 9003 + self_device_duration: 9003, }, { name: 'ReluBackward0', @@ -6423,7 +6394,7 @@ export class MockAPI { host_duration: 2330, device_duration: 9003, self_host_duration: 467, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: ReluBackward0', @@ -6431,7 +6402,7 @@ export class MockAPI { host_duration: 3313, device_duration: 9003, self_host_duration: 983, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'AddBackward0', @@ -6439,7 +6410,7 @@ export class MockAPI { host_duration: 14, device_duration: 0, self_host_duration: 14, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'autograd::engine::evaluate_function: AddBackward0', @@ -6447,7 +6418,7 @@ export class MockAPI { host_duration: 135, device_duration: 0, self_host_duration: 121, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::empty', @@ -6455,7 +6426,7 @@ export class MockAPI { host_duration: 4638, device_duration: 0, self_host_duration: 4638, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_batch_norm_backward', @@ 
-6463,7 +6434,7 @@ export class MockAPI { host_duration: 5047, device_duration: 22244, self_host_duration: 2219, - self_device_duration: 22244 + self_device_duration: 22244, }, { name: 'CudnnBatchNormBackward0', @@ -6471,16 +6442,15 @@ export class MockAPI { host_duration: 5637, device_duration: 22244, self_host_duration: 590, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', + name: 'autograd::engine::evaluate_function: CudnnBatchNormBackward0', calls: 53, host_duration: 7407, device_duration: 22244, self_host_duration: 1770, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::cudnn_convolution_backward_input', @@ -6488,7 +6458,7 @@ export class MockAPI { host_duration: 9345, device_duration: 37854, self_host_duration: 6945, - self_device_duration: 37854 + self_device_duration: 37854, }, { name: 'aten::cudnn_convolution_backward_weight', @@ -6496,7 +6466,7 @@ export class MockAPI { host_duration: 9886, device_duration: 44650, self_host_duration: 5378, - self_device_duration: 44650 + self_device_duration: 44650, }, { name: 'aten::cudnn_convolution_backward', @@ -6504,7 +6474,7 @@ export class MockAPI { host_duration: 20453, device_duration: 82504, self_host_duration: 1222, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'CudnnConvolutionBackward0', @@ -6512,16 +6482,15 @@ export class MockAPI { host_duration: 21000, device_duration: 82504, self_host_duration: 547, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', + name: 'autograd::engine::evaluate_function: CudnnConvolutionBackward0', calls: 53, host_duration: 23024, device_duration: 87731, self_host_duration: 1440, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::add', @@ -6529,7 +6498,7 @@ export class MockAPI { host_duration: 584, device_duration: 5227, self_host_duration: 374, - 
self_device_duration: 5227 + self_device_duration: 5227, }, { name: 'aten::fill_', @@ -6537,7 +6506,7 @@ export class MockAPI { host_duration: 26, device_duration: 230, self_host_duration: 12, - self_device_duration: 230 + self_device_duration: 230, }, { name: 'aten::zero_', @@ -6545,7 +6514,7 @@ export class MockAPI { host_duration: 33, device_duration: 230, self_host_duration: 7, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::max_pool2d_with_indices_backward', @@ -6553,7 +6522,7 @@ export class MockAPI { host_duration: 73, device_duration: 1513, self_host_duration: 30, - self_device_duration: 1283 + self_device_duration: 1283, }, { name: 'MaxPool2DWithIndicesBackward0', @@ -6561,20 +6530,19 @@ export class MockAPI { host_duration: 83, device_duration: 1513, self_host_duration: 10, - self_device_duration: 0 + self_device_duration: 0, }, { - name: - 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', + name: 'autograd::engine::evaluate_function: MaxPool2DWithIndicesBackward0', calls: 1, host_duration: 106, device_duration: 1513, self_host_duration: 23, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-21' + path: '0-21', }, { left: { @@ -6589,7 +6557,7 @@ export class MockAPI { host_duration: 87, device_duration: 0, self_host_duration: 87, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zero_', @@ -6597,7 +6565,7 @@ export class MockAPI { host_duration: 4, device_duration: 0, self_host_duration: 4, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -6605,9 +6573,9 @@ export class MockAPI { host_duration: 160, device_duration: 0, self_host_duration: 69, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, right: { name: 'aten::zeros', @@ -6621,7 +6589,7 @@ export class MockAPI { host_duration: 105, device_duration: 0, self_host_duration: 105, - self_device_duration: 0 + self_device_duration: 0, }, { name: 
'aten::zero_', @@ -6629,7 +6597,7 @@ export class MockAPI { host_duration: 2, device_duration: 0, self_host_duration: 2, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::zeros', @@ -6637,11 +6605,11 @@ export class MockAPI { host_duration: 119, device_duration: 0, self_host_duration: 12, - self_device_duration: 0 - } - ] + self_device_duration: 0, + }, + ], }, - path: '0-22' + path: '0-22', }, { left: { @@ -6656,7 +6624,7 @@ export class MockAPI { host_duration: 40, device_duration: 0, self_host_duration: 40, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -6664,7 +6632,7 @@ export class MockAPI { host_duration: 11945, device_duration: 401, self_host_duration: 9568, - self_device_duration: 401 + self_device_duration: 401, }, { name: 'aten::add_', @@ -6672,9 +6640,9 @@ export class MockAPI { host_duration: 22480, device_duration: 894, self_host_duration: 17805, - self_device_duration: 894 - } - ] + self_device_duration: 894, + }, + ], }, right: { name: 'Optimizer.step#SGD.step', @@ -6688,7 +6656,7 @@ export class MockAPI { host_duration: 8, device_duration: 0, self_host_duration: 8, - self_device_duration: 0 + self_device_duration: 0, }, { name: 'aten::mul_', @@ -6696,7 +6664,7 @@ export class MockAPI { host_duration: 3440, device_duration: 404, self_host_duration: 1824, - self_device_duration: 404 + self_device_duration: 404, }, { name: 'aten::add_', @@ -6704,13 +6672,13 @@ export class MockAPI { host_duration: 6161, device_duration: 894, self_host_duration: 3186, - self_device_duration: 894 - } - ] - }, - path: '0-23' - } - ] - }) + self_device_duration: 894, + }, + ], + }, + path: '0-23', + }, + ], + }); } } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/app.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/app.tsx index c8cd2ddec26fee10f0a6d448a2051e749ae20696..19eb4b112529073c6b8db9a86b8d68a7633598db 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/app.tsx +++ 
b/plugins/tensorboard-plugins/tb_plugin/fe/src/app.tsx @@ -15,51 +15,52 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import Box from '@material-ui/core/Box' -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import ClickAwayListener from '@material-ui/core/ClickAwayListener' -import CssBaseline from '@material-ui/core/CssBaseline' -import Divider from '@material-ui/core/Divider' -import Drawer from '@material-ui/core/Drawer' -import Fab from '@material-ui/core/Fab' -import FormControl from '@material-ui/core/FormControl' -import IconButton from '@material-ui/core/IconButton' -import ListSubheader from '@material-ui/core/ListSubheader' -import MenuItem from '@material-ui/core/MenuItem' -import Select, { SelectProps } from '@material-ui/core/Select' -import { makeStyles } from '@material-ui/core/styles' -import Tab from '@material-ui/core/Tab' -import Tabs from '@material-ui/core/Tabs' -import Typography from '@material-ui/core/Typography' -import ChevronLeftIcon from '@material-ui/icons/ChevronLeft' -import ChevronRightIcon from '@material-ui/icons/ChevronRight' -import 'antd/es/button/style/css' -import 'antd/es/list/style/css' -import 'antd/es/table/style/css' -import clsx from 'clsx' -import * as React from 'react' -import * as api from './api' -import { AccuracyLeftPanel } from './components/Accuracy/AccuracyLeftPanel' -import { FileInfo } from './components/Accuracy/entity' -import { LossComparison } from './components/Accuracy/LossComparison' -import { DiffOverview } from './components/DiffOverview' -import { DistributedView } from 
'./components/DistributedView' -import { FullCircularProgress } from './components/FullCircularProgress' -import { Kernel } from './components/Kernel' -import { MemoryView } from './components/MemoryView' -import { ModuleView } from './components/ModuleView' -import { Operator } from './components/Operator' -import { Overview } from './components/Overview' -import { TraceView } from './components/TraceView' -import { setup } from './setup' -import './styles.css' -import { firstOrUndefined, sleep } from './utils' +import Box from '@material-ui/core/Box'; +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import ClickAwayListener from '@material-ui/core/ClickAwayListener'; +import CssBaseline from '@material-ui/core/CssBaseline'; +import Divider from '@material-ui/core/Divider'; +import Drawer from '@material-ui/core/Drawer'; +import Fab from '@material-ui/core/Fab'; +import FormControl from '@material-ui/core/FormControl'; +import IconButton from '@material-ui/core/IconButton'; +import ListSubheader from '@material-ui/core/ListSubheader'; +import MenuItem from '@material-ui/core/MenuItem'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import { makeStyles } from '@material-ui/core/styles'; +import Tab from '@material-ui/core/Tab'; +import Tabs from '@material-ui/core/Tabs'; +import Typography from '@material-ui/core/Typography'; +import ChevronLeftIcon from '@material-ui/icons/ChevronLeft'; +import ChevronRightIcon from '@material-ui/icons/ChevronRight'; +import { message } from 'antd'; +import 'antd/es/button/style/css'; +import 'antd/es/list/style/css'; +import 'antd/es/table/style/css'; +import clsx from 'clsx'; +import * as React from 'react'; +import * as api from './api'; +import { AccuracyLeftPanel } from './components/Accuracy/AccuracyLeftPanel'; +import { FileInfo } from './components/Accuracy/entity'; +import { LossComparison } 
from './components/Accuracy/LossComparison'; +import { DiffOverview } from './components/DiffOverview'; +import { DistributedView } from './components/DistributedView'; +import { FullCircularProgress } from './components/FullCircularProgress'; +import { Kernel as KernelView } from './components/Kernel'; +import { MemoryView } from './components/MemoryView'; +import { ModuleView } from './components/ModuleView'; +import { Operator as OperatorView } from './components/Operator'; +import { Overview as OverviewPage } from './components/Overview'; +import { TraceView } from './components/TraceView'; +import { setup } from './setup'; +import './styles.css'; +import { firstOrUndefined, sleep } from './utils'; export enum Views { Overview = 'Overview', @@ -69,10 +70,10 @@ export enum Views { Distributed = 'Distributed', Memory = 'Memory', Module = 'Module', - Lightning = 'Lightning' + Lightning = 'Lightning', } -const ViewNames = { +const viewNames = { [Views.Overview]: Views.Overview, [Views.Operator]: Views.Operator, [Views.Kernel]: 'Kernel', @@ -80,61 +81,59 @@ const ViewNames = { [Views.Distributed]: Views.Distributed, [Views.Memory]: Views.Memory, [Views.Module]: Views.Module, - [Views.Lightning]: Views.Lightning -} - -const accViews = ['Loss Comparison'] + [Views.Lightning]: Views.Lightning, +}; -const drawerWidth = 340 +const drawerWidth = 340; const useStyles = makeStyles((theme) => ({ root: { display: 'flex', - height: '100%' + height: '100%', }, appBar: { zIndex: theme.zIndex.drawer + 1, transition: theme.transitions.create(['width', 'margin'], { easing: theme.transitions.easing.sharp, - duration: theme.transitions.duration.leavingScreen - }) + duration: theme.transitions.duration.leavingScreen, + }), }, appBarShift: { marginLeft: drawerWidth, width: `calc(100% - ${drawerWidth}px)`, transition: theme.transitions.create(['width', 'margin'], { easing: theme.transitions.easing.sharp, - duration: theme.transitions.duration.enteringScreen - }) + duration: 
theme.transitions.duration.enteringScreen, + }), }, menuButton: { - marginRight: 36 + marginRight: 36, }, hide: { - display: 'none' + display: 'none', }, drawer: { width: drawerWidth, flexShrink: 0, - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, drawerOpen: { width: drawerWidth, zIndex: 999, transition: theme.transitions.create('width', { easing: theme.transitions.easing.sharp, - duration: theme.transitions.duration.enteringScreen - }) + duration: theme.transitions.duration.enteringScreen, + }), }, drawerClose: { transition: theme.transitions.create('width', { easing: theme.transitions.easing.sharp, - duration: theme.transitions.duration.leavingScreen + duration: theme.transitions.duration.leavingScreen, }), overflowX: 'hidden', width: 0, [theme.breakpoints.up('sm')]: { - width: 0 - } + width: 0, + }, }, toolbar: { display: 'flex', @@ -142,322 +141,304 @@ const useStyles = makeStyles((theme) => ({ justifyContent: 'flex-end', padding: theme.spacing(0, 1), // necessary for content to be below app bar - ...theme.mixins.toolbar + ...theme.mixins.toolbar, }, content: { flexGrow: 1, padding: theme.spacing(3), - overflowX: 'hidden' + overflowX: 'hidden', }, formControl: { margin: theme.spacing(1), - minWidth: 120 + minWidth: 120, }, fab: { marginLeft: theme.spacing(1), marginTop: theme.spacing(1), - position: 'absolute' + position: 'absolute', }, iconButton: { - padding: '8px' - } -})) + padding: '8px', + }, +})); -export const App = () => { - const classes = useStyles() +export const App = (): JSX.Element => { + const classes = useStyles(); // #region - State + const [selectedTab, setSelectedTab] = React.useState(0); - const [selectedTab, setSelectedTab] = React.useState(0) - - const [run, setRun] = React.useState('') - const [runs, setRuns] = React.useState([]) - const [runsLoading, setRunsLoading] = React.useState(true) - - const [workers, setWorkers] = React.useState([]) - const [worker, setWorker] = React.useState('') - - const [spans, setSpans] = 
React.useState([]) - const [span, setSpan] = React.useState('') - - const [views, setViews] = React.useState([]) - const [view, setView] = React.useState('') - const [loaded, setLoaded] = React.useState(false) - const iframeRef = React.useRef(null) - const [deviceTarget, setDeviceTarget] = React.useState('GPU') - - const [diffLeftWorkerOptions, setDiffLeftWorkerOptions] = React.useState< - string[] - >([]) - const [diffLeftSpansOptions, setDiffLeftSpansOptions] = React.useState< - string[] - >([]) - const [diffLeftRun, setDiffLeftRun] = React.useState('') - const [diffLeftWorker, setDiffLeftWorker] = React.useState('') - const [diffLeftSpan, setDiffLeftSpan] = React.useState('') - - const [diffRightWorkerOptions, setDiffRightWorkerOptions] = React.useState< - string[] - >([]) - const [diffRightSpansOptions, setDiffRightSpansOptions] = React.useState< - string[] - >([]) - const [diffRightRun, setDiffRightRun] = React.useState('') - const [diffRightWorker, setDiffRightWorker] = React.useState('') - const [diffRightSpan, setDiffRightSpan] = React.useState('') - - const [open, setOpen] = React.useState(true) - - const [topTab, setTopTab] = React.useState(0) - const [fileList, setFileList] = React.useState([]) - const [uploadedCount, setUploadedCount] = React.useState(0) + const [run, setRun] = React.useState(''); + const [runs, setRuns] = React.useState([]); + const [runsLoading, setRunsLoading] = React.useState(true); - // #endregion + const [workers, setWorkers] = React.useState([]); + const [worker, setWorker] = React.useState(''); + + const [spans, setSpans] = React.useState([]); + const [span, setSpan] = React.useState(''); + + const [views, setViews] = React.useState([]); + const [view, setView] = React.useState(''); + const [loaded, setLoaded] = React.useState(false); + const iframeRef = React.useRef(null); + const [deviceTarget, setDeviceTarget] = React.useState('GPU'); + + const [diffLeftWorkerOptions, setDiffLeftWorkerOptions] = React.useState([]); + const 
[diffLeftSpansOptions, setDiffLeftSpansOptions] = React.useState([]); + const [diffLeftRun, setDiffLeftRun] = React.useState(''); + const [diffLeftWorker, setDiffLeftWorker] = React.useState(''); + const [diffLeftSpan, setDiffLeftSpan] = React.useState(''); + + const [diffRightWorkerOptions, setDiffRightWorkerOptions] = React.useState([]); + const [diffRightSpansOptions, setDiffRightSpansOptions] = React.useState([]); + const [diffRightRun, setDiffRightRun] = React.useState(''); + const [diffRightWorker, setDiffRightWorker] = React.useState(''); + const [diffRightSpan, setDiffRightSpan] = React.useState(''); + + const [open, setOpen] = React.useState(true); + + const [topTab, setTopTab] = React.useState(0); + const [fileList, setFileList] = React.useState([]); + const [uploadedCount, setUploadedCount] = React.useState(0); // #endregion React.useEffect(() => { - setup().catch(() => { - console.log('google chart is not supported offline') - }).finally(() => { - setLoaded(true) - }) - }, []) - - const continuouslyFetchRuns = async () => { + setup() + .catch(() => { + message.warning('google chart is not supported offline'); + }) + .finally(() => { + setLoaded(true); + }); + }, []); + + const continuouslyFetchRuns = async (): Promise => { while (true) { try { - const runs = await api.defaultApi.runsGet() - setRuns(runs.runs) - setRunsLoading(runs.loading) + const result = await api.defaultApi.runsGet(); + setRuns(result.runs); + setRunsLoading(result.loading); } catch (e) { - console.info('Cannot fetch runs: ', e) + message.warning(`Cannot fetch runs: ${e}`); } - await sleep(5000) + await sleep(5000); } - } + }; React.useEffect(() => { - continuouslyFetchRuns() - }, []) + continuouslyFetchRuns(); + }, []); React.useEffect(() => { if (!run || !runs.includes(run)) { - setRun(firstOrUndefined(runs) ?? '') + setRun(firstOrUndefined(runs) ?? 
''); } - }, [runs]) - - // #region - Diff Left + }, [runs]); // #region - Diff Left React.useEffect(() => { if (diffLeftRun) { - api.defaultApi.workersGet(diffLeftRun, Views.Overview).then((workers) => { - setDiffLeftWorkerOptions(workers) - }) + api.defaultApi.workersGet(diffLeftRun, Views.Overview).then((data) => { + setDiffLeftWorkerOptions(data); + }); } - }, [diffLeftRun]) + }, [diffLeftRun]); React.useEffect(() => { if (diffLeftRun && diffLeftWorker) { - api.defaultApi.spansGet(diffLeftRun, diffLeftWorker).then((spans) => { - setDiffLeftSpansOptions(spans) - }) + api.defaultApi.spansGet(diffLeftRun, diffLeftWorker).then((data) => { + setDiffLeftSpansOptions(data); + }); } - }, [diffLeftRun, diffLeftWorker]) + }, [diffLeftRun, diffLeftWorker]); // #endregion - // #region - Diff Right - React.useEffect(() => { if (diffRightRun) { - api.defaultApi - .workersGet(diffRightRun, Views.Overview) - .then((workers) => { - setDiffRightWorkerOptions(workers) - }) + api.defaultApi.workersGet(diffRightRun, Views.Overview).then((data) => { + setDiffRightWorkerOptions(data); + }); } - }, [diffRightRun]) + }, [diffRightRun]); React.useEffect(() => { if (diffRightRun && diffRightWorker) { - api.defaultApi.spansGet(diffRightRun, diffRightWorker).then((spans) => { - setDiffRightSpansOptions(spans) - }) + api.defaultApi.spansGet(diffRightRun, diffRightWorker).then((data) => { + setDiffRightSpansOptions(data); + }); } - }, [diffRightRun, diffRightWorker]) + }, [diffRightRun, diffRightWorker]); // #endregion - // #region - normal - React.useEffect(() => { if (run) { api.defaultApi.viewsGet(run).then((rawViews) => { - const views = rawViews.views - .map((v) => Views[Views[v as Views]]) - .filter(Boolean) - setDeviceTarget(rawViews.device_target) - setViews(views) - }) + const result = rawViews.views.map((v) => Views[Views[v as Views]]).filter(Boolean); + setDeviceTarget(rawViews.device_target); + setViews(result); + }); } - }, [run]) + }, [run]); React.useEffect(() => { - 
setView(firstOrUndefined(views) ?? '') - }, [views]) + setView(firstOrUndefined(views) ?? ''); + }, [views]); React.useEffect(() => { if (run && view) { - api.defaultApi.workersGet(run, view).then((workers) => { - setWorkers(workers) - }) + api.defaultApi.workersGet(run, view).then((data) => { + setWorkers(data); + }); } - }, [run, view]) + }, [run, view]); React.useEffect(() => { - setWorker(firstOrUndefined(workers) ?? '') - }, [workers]) + setWorker(firstOrUndefined(workers) ?? ''); + }, [workers]); React.useEffect(() => { if (run && worker) { - api.defaultApi.spansGet(run, worker).then((spans) => { - setSpans(spans) - }) + api.defaultApi.spansGet(run, worker).then((data) => { + setSpans(data); + }); } - }, [run, worker]) + }, [run, worker]); React.useEffect(() => { - setSpan(firstOrUndefined(spans) ?? '') - }, [spans]) + setSpan(firstOrUndefined(spans) ?? ''); + }, [spans]); // #endregion // #region - Event Handler - const handleTabChange = (event: React.ChangeEvent<{}>, value: any) => { - setSelectedTab(value as number) - } + const handleTabChange = (event: React.ChangeEvent>, value: any): void => { + setSelectedTab(value as number); + }; - const handleTopTabChange = (event: React.ChangeEvent<{}>, value: any) => { - setTopTab(value as number) - } + const handleTopTabChange = (event: React.ChangeEvent>, value: any): void => { + setTopTab(value as number); + }; const handleRunChange: SelectProps['onChange'] = (event) => { - setRun(event.target.value as string) - setView('') - setWorker('') - setSpan('') - } + setRun(event.target.value as string); + setView(''); + setWorker(''); + setSpan(''); + }; const handleViewChange: SelectProps['onChange'] = (event) => { - setView(event.target.value as Views) - setWorker('') - setSpan('') - } + setView(event.target.value as Views); + setWorker(''); + setSpan(''); + }; const handleWorkerChange: SelectProps['onChange'] = (event) => { - setWorker(event.target.value as string) - setSpan('') - } + setWorker(event.target.value as 
string); + setSpan(''); + }; const handleSpanChange: SelectProps['onChange'] = (event) => { - setSpan(event.target.value as string) - } + setSpan(event.target.value as string); + }; const handleDiffLeftRunChange: SelectProps['onChange'] = (event) => { - setDiffLeftRun(event.target.value as string) - setDiffLeftWorker('') - setDiffLeftSpan('') - } + setDiffLeftRun(event.target.value as string); + setDiffLeftWorker(''); + setDiffLeftSpan(''); + }; const handleDiffLeftWorkerChange: SelectProps['onChange'] = (event) => { - setDiffLeftWorker(event.target.value as string) - setDiffLeftSpan('') - } + setDiffLeftWorker(event.target.value as string); + setDiffLeftSpan(''); + }; const handleDiffLeftSpanChange: SelectProps['onChange'] = (event) => { - setDiffLeftSpan(event.target.value as string) - } + setDiffLeftSpan(event.target.value as string); + }; const handleDiffRightRunChange: SelectProps['onChange'] = (event) => { - setDiffRightRun(event.target.value as string) - setDiffRightWorker('') - setDiffRightSpan('') - } + setDiffRightRun(event.target.value as string); + setDiffRightWorker(''); + setDiffRightSpan(''); + }; const handleDiffRightWorkerChange: SelectProps['onChange'] = (event) => { - setDiffRightWorker(event.target.value as string) - setDiffRightSpan('') - } + setDiffRightWorker(event.target.value as string); + setDiffRightSpan(''); + }; const handleDiffRightSpanChange: SelectProps['onChange'] = (event) => { - setDiffRightSpan(event.target.value as string) - } + setDiffRightSpan(event.target.value as string); + }; - const handleDrawerOpen = () => { - setOpen(true) - SetIframeActive() - } + const handleDrawerOpen = (): void => { + setOpen(true); + setIframeActive(); + }; - const handleDrawerClose = () => { - setOpen(false) - SetIframeActive() - } + const handleDrawerClose = (): void => { + setOpen(false); + setIframeActive(); + }; - const SetIframeActive = () => { - iframeRef.current?.focus() - } + const setIframeActive = (): void => { + 
iframeRef.current?.focus(); + }; - const _changeFileList = (files: FileInfo[]) => { + const _changeFileList = (files: FileInfo[]): void => { if (JSON.stringify(files) !== JSON.stringify(fileList)) { - setFileList(files) + setFileList(files); } - } + }; - const _changeUploadCount = (count: number) => { - setUploadedCount(count) - } + const _getViews = (viewName: Views): string => { + if (viewName === Views.Kernel) { + return deviceTarget === 'Ascend' ? `NPU ${viewNames[viewName]}` : `GPU ${viewNames[viewName]}`; + } else { + return viewNames[viewName]; + } + }; - // #endregion + const _changeUploadCount = (count: number): void => { + setUploadedCount(count); + }; // #endregion - const renderContent = () => { - if (!runsLoading && runs.length == 0) { + const renderContent = (): JSX.Element => { + if (!runsLoading && runs.length === 0) { return ( - - + + There are not any runs in the log folder. - ) + ); } - - if (!loaded || !run || !worker || !view || !span) { - return + const notReady = !loaded || !run || !worker || !view || !span; + if (notReady) { + return ; } if (selectedTab === 0) { switch (view) { case Views.Overview: - return + return ; case Views.Operator: - return + return ; case Views.Kernel: - return + return ; case Views.Trace: - return ( - - ) + return ; case Views.Distributed: - return + return ; case Views.Memory: - return + return ; case Views.Module: case Views.Lightning: - return + return ; + default: + return <>; } } else { return ( @@ -469,112 +450,99 @@ export const App = () => { expWorker={diffRightWorker} expSpan={diffRightSpan} /> - ) + ); } - } + }; - const spanComponent = () => { + const spanComponent = (): JSX.Element => { const spanFragment = ( Spans - - + + - ) + ); if (!spans || spans.length <= 1) { - return
{spanFragment}
+ return
{spanFragment}
; } else { - return spanFragment + return spanFragment; } - } + }; return (
- +
- - - + + + {topTab === 0 ? ( <> - - - + + + - {selectedTab == 0 ? ( + {selectedTab === 0 ? ( <> Runs - - + + Views - - + + Workers - - + + @@ -583,93 +551,75 @@ export const App = () => { ) : ( <> -   Baseline +   Baseline Runs - + Workers - - - - Spans - - - + + + + Spans + + + - + -   Experimental +   Experimental Runs - + Workers - - + {diffRightWorkerOptions.map((worker3) => ( + {worker3} ))} Spans - - + {diffRightSpansOptions.map((span2) => ( + {span2} ))} )} - ) : - - } + ) : ( + + )}
{!open && ( - + )}
{topTab === 0 ? renderContent() : }
-
- ) -} + + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/AccuracyLeftPanel.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/AccuracyLeftPanel.tsx index ef9b170ec7a3de46039e5345ddf574f6fd620077..c7b7d7cf0841e7dc3686138b584e101e5052f4a6 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/AccuracyLeftPanel.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/AccuracyLeftPanel.tsx @@ -17,38 +17,32 @@ * limitations under the License. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { useState, useEffect, useCallback, useRef } from 'react' -import { makeStyles } from '@material-ui/core/styles' -import { Button, Checkbox, Spin, Modal, message } from 'antd' -import { CheckboxChangeEvent } from 'antd/es/checkbox' -import { - DeleteOutlined, - DownloadOutlined, - ImportOutlined, - SettingOutlined, - WarningTwoTone, -} from '@ant-design/icons' -import { RegexConfigModal } from './RegexConfigModal' -import { FileInfo } from './entity' +import * as React from 'react'; +import { useState, useEffect, useCallback, useRef } from 'react'; +import { makeStyles } from '@material-ui/core/styles'; +import { Button, Checkbox, Spin, Modal, message } from 'antd'; +import { CheckboxChangeEvent } from 'antd/es/checkbox'; +import { DeleteOutlined, DownloadOutlined, ImportOutlined, SettingOutlined, WarningTwoTone } from '@ant-design/icons'; +import { RegexConfigModal } from './RegexConfigModal'; +import { FileInfo } from './entity'; interface IProps { - onChangeCheckedFileList: (files: FileInfo[]) => void - onChangeUploadedCount: (count: number) => void + onChangeCheckedFileList: (files: FileInfo[]) => void; + onChangeUploadedCount: (count: number) => void; } // Matches numbers, including scientific notation -const LOSS_REG_EXP = /[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/ +const LOSS_REG_EXP = /[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/; // Matches natural numbers
-const ITER_REG_EXP = /\d+/ +const ITER_REG_EXP = /\d+/; // Maximum size of a single file -const FILE_MAX_SIZE = 50 * 1024 * 1024 +const FILE_MAX_SIZE = 50 * 1024 * 1024; // Maximum number of uploaded files -export const MAX_FILE_COUNT = 6 +export const MAX_FILE_COUNT = 6; const useStyles = makeStyles(() => ({ root: { - height: '100%' + height: '100%', }, btnPanel: { height: 50, @@ -56,8 +50,8 @@ const useStyles = makeStyles(() => ({ borderBottom: '1px solid #DFE5EF', display: 'flex', '& .ant-btn': { - margin: 'auto' - } + margin: 'auto', + }, }, fileContainer: { height: 54, @@ -71,7 +65,7 @@ const useStyles = makeStyles(() => ({ fontSize: 14, overflow: 'hidden', textOverflow: 'ellipsis', - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, '& .btns': { display: 'inline-block', @@ -79,17 +73,17 @@ const useStyles = makeStyles(() => ({ '& .icon': { cursor: 'pointer', '&:hover': { - color: '#1890ff' - } + color: '#1890ff', + }, }, '& .iconLeft': { - marginRight: 8 - } + marginRight: 8, + }, }, }, deleteModal: { '& .ant-modal-title': { - fontWeight: 'bold' + fontWeight: 'bold', }, '& .deleteModalBody': { display: 'flex', @@ -97,203 +91,210 @@ const useStyles = makeStyles(() => ({ height: 80, '& .warningIcon': { display: 'inline-block', - fontSize: 50 + fontSize: 50, }, '& .warningText': { display: 'inline-block', marginLeft: 16, overflow: 'hidden', wordBreak: 'break-all', - flex: 1 - } - } - } -})) + flex: 1, + }, + }, + }, +})); export const AccuracyLeftPanel: React.FC = (props) => { - const { onChangeCheckedFileList, onChangeUploadedCount } = props - const classes = useStyles() - const [configModalVis, setConfigModalVis] = useState(false) - const [deleteModalVis, setDeleteModalVis] = useState(false) - const [fileList, setFileList] = useState([]) - const [importSpin, setImportSpin] = useState(false) - const [selectedFile, setSelectedFile] = useState(undefined) - const downLoadRef = useRef(null) + const { onChangeCheckedFileList, onChangeUploadedCount } = props; + const classes = useStyles(); + const
[configModalVis, setConfigModalVis] = useState(false); + const [deleteModalVis, setDeleteModalVis] = useState(false); + const [fileList, setFileList] = useState([]); + const [importSpin, setImportSpin] = useState(false); + const [selectedFile, setSelectedFile] = useState(undefined); + const downLoadRef = useRef(null); const parseFile = (file: FileInfo): FileInfo => { - file.losses = [] - file.iterLosses = {} - file.iters = [] - const lines = file.fileContent.split(/\r\n|\n|\r/) + file.losses = []; + file.iterLosses = {}; + file.iters = []; + const lines = file.fileContent.split(/\r\n|\n|\r/); for (let i = 0; i < lines.length; i++) { - const iter = parseByTag(lines[i], file.iterTag, false) - const loss = parseByTag(lines[i], file.lossTag, true) + const iter = parseByTag(lines[i], file.iterTag, false); + const loss = parseByTag(lines[i], file.lossTag, true); if (iter !== null && loss !== null) { - file.iters.push(iter) - file.losses.push([iter, loss]) - file.iterLosses[iter] = loss + file.iters.push(iter); + file.losses.push([iter, loss]); + file.iterLosses[iter] = loss; } } - return file - } + return file; + }; const parseByTag = (line: string, tag: string, isLoss: boolean): number | null => { - let pos = line.indexOf(tag) - let result: number | null = null + let pos = line.indexOf(tag); + let result: number | null = null; if (pos !== -1) { - const res = (isLoss ? LOSS_REG_EXP : ITER_REG_EXP) - .exec(line.substring(pos + tag.length).trim().split(/\s+/)[0]) + const res = (isLoss ? LOSS_REG_EXP : ITER_REG_EXP).exec( + line + .substring(pos + tag.length) + .trim() + .split(/\s+/)[0] + ); if (res !== null) { if (isLoss) { - result = parseFloat(res[0]) + result = parseFloat(res[0]); } else { - result = parseInt(res[0]) + result = parseInt(res[0]); } } else { - console.log(`Found ${isLoss ? 'loss' : 'iteration'} text, but parse value with error: [${line}]`) + console.warn(`Found ${isLoss ? 
'loss' : 'iteration'} text, but parse value with error: [${line}]`); } } - return result - } + return result; + }; - const importFile = () => { - document.getElementById('accComparisonSelectFile')?.click() - } + const importFile = (): void => { + document.getElementById('accComparisonSelectFile')?.click(); + }; - const uploadFile = (e: React.ChangeEvent) => { - setImportSpin(true) - const file = e.target.files?.[0] + const uploadFile = (e: React.ChangeEvent): void => { + setImportSpin(true); + const file = e.target.files?.[0]; if (file) { if (file.size > FILE_MAX_SIZE) { - message.warn('Sorry, the file size cannot be greater than 50MB.') - setImportSpin(false) + message.warn('Sorry, the file size cannot be greater than 50MB.'); + setImportSpin(false); // Reset the input so that selecting a file with the same name still triggers the event - e.target.value = '' - return + e.target.value = ''; + return; } - const reader = new FileReader() - reader.onload = ((selectedFile) => { - return (e) => { - addFile(selectedFile.name.trim(), e.target?.result as string) - setImportSpin(false) - } + const reader = new FileReader(); + reader.onload = ((loadedFile) => { + return (event) => { + addFile(loadedFile.name.trim(), event.target?.result as string); + setImportSpin(false); + }; })(file); - reader.readAsText(file) + reader.readAsText(file); } // Reset the input so that selecting a file with the same name still triggers the event - e.target.value = '' - } + e.target.value = ''; + }; - const addFile = (fileName: string, fileContent: string) => { - const fileLength = fileName.length - const tempList: FileInfo[] = JSON.parse(JSON.stringify(fileList)) + const addFile = (fileName: string, fileContent: string): void => { + const fileLength = fileName.length; + const tempList: FileInfo[] = JSON.parse(JSON.stringify(fileList)); + let updatedFileName = fileName; // New variable that stores the updated file name // When a file with the same name is uploaded, append a suffix (1 to max file count minus 1) - if (!!tempList.find(item => item.fileName === fileName)) { + if (!!tempList.find((item) => item.fileName === fileName)) { for (let i = 1; i < MAX_FILE_COUNT; i++) { - let temp = `${fileName.slice(0, fileLength -
4)}(${i})${fileName.slice(fileLength - 4)}` - if (tempList.find(item => item.fileName === temp) === undefined) { - fileName = temp - break + let temp = `${fileName.slice(0, fileLength - 4)}(${i})${fileName.slice(fileLength - 4)}`; + if (tempList.find((item) => item.fileName === temp) === undefined) { + updatedFileName = temp; + break; } } } const file: FileInfo = { id: fileList.length, - fileName: fileName, + fileName: updatedFileName, fileContent, checked: true, lossTag: 'loss:', iterTag: 'iteration', iters: [], losses: [], - iterLosses: {} - } - tempList.push(parseFile(file)) - setFileList(tempList) - } + iterLosses: {}, + }; + tempList.push(parseFile(file)); + setFileList(tempList); + }; - const exportCsv = (data: FileInfo) => { - let csvContent = `data:text/csv;charset=utf-8,${data.iterTag},${data.lossTag}\n` - data.losses.forEach(item => { - csvContent += `${item[0]},${item[1]}\n` - }) - downLoadRef.current?.setAttribute('href', encodeURI(csvContent)) - downLoadRef.current?.setAttribute('download', `${data.fileName}.csv`) - downLoadRef.current?.click() - } + const exportCsv = (data: FileInfo): void => { + let csvContent = `data:text/csv;charset=utf-8,${data.iterTag},${data.lossTag}\n`; + data.losses.forEach((item) => { + csvContent += `${item[0]},${item[1]}\n`; + }); + downLoadRef.current?.setAttribute('href', encodeURI(csvContent)); + downLoadRef.current?.setAttribute('download', `${data.fileName}.csv`); + downLoadRef.current?.click(); + }; - const onCheckChange = (e: CheckboxChangeEvent, index: number) => { - const tempList: FileInfo[] = JSON.parse(JSON.stringify(fileList)) - tempList[index].checked = e.target.checked - setFileList(tempList) - } + const onCheckChange = (e: CheckboxChangeEvent, index: number): void => { + const tempList: FileInfo[] = JSON.parse(JSON.stringify(fileList)); + tempList[index].checked = e.target.checked; + setFileList(tempList); + }; - const onConfigIconClick = (data: FileInfo) => { - setSelectedFile(data) - 
setConfigModalVis(true) - } + const onConfigIconClick = (data: FileInfo): void => { + setSelectedFile(data); + setConfigModalVis(true); + }; - const onDeleteIconClick = (data: FileInfo) => { - setSelectedFile(data) - setDeleteModalVis(true) - } + const onDeleteIconClick = (data: FileInfo): void => { + setSelectedFile(data); + setDeleteModalVis(true); + }; - const configModalOk = (data: FileInfo) => { - const tempList = fileList.map(item => { - return item.id === data.id ? parseFile(data) : item - }) - setFileList(tempList) - setConfigModalVis(false) - } + const configModalOk = (data: FileInfo): void => { + const tempList = fileList.map((item) => { + return item.id === data.id ? parseFile(data) : item; + }); + setFileList(tempList); + setConfigModalVis(false); + }; - const configModalCancel = () => { - setConfigModalVis(false) - } + const configModalCancel = (): void => { + setConfigModalVis(false); + }; - const deleteModalOk = () => { - const tempList = JSON.parse(JSON.stringify(fileList)) - let founded = false - let index = 0 + const deleteModalOk = (): void => { + const tempList = JSON.parse(JSON.stringify(fileList)); + let founded = false; + let index = 0; for (let i = 0; i < tempList.length; i++) { if (founded) { - tempList[i].id -= 1 - continue + tempList[i].id -= 1; + continue; } if (tempList[i].id === selectedFile?.id) { - founded = true - index = i + founded = true; + index = i; } } - tempList.splice(index, 1) - setFileList(tempList) - setSelectedFile(undefined) - setDeleteModalVis(false) - } + tempList.splice(index, 1); + setFileList(tempList); + setSelectedFile(undefined); + setDeleteModalVis(false); + }; const renderFileItems = useCallback(() => { return fileList.map((item) => { return (
- onCheckChange(e, item.id)} /> - {item.fileName} -
- onConfigIconClick(item)} /> - exportCsv(item)} /> - onDeleteIconClick(item)} /> + onCheckChange(e, item.id)} /> + + {item.fileName} + +
+ onConfigIconClick(item)} /> + exportCsv(item)} /> + onDeleteIconClick(item)} />
- ) - }) - }, [JSON.stringify(fileList)]) + ); + }); + }, [JSON.stringify(fileList)]); useEffect(() => { - onChangeCheckedFileList(fileList.filter(item => item.checked)) - onChangeUploadedCount(fileList.length) - }, [JSON.stringify(fileList)]) + onChangeCheckedFileList(fileList.filter((item) => item.checked)); + onChangeUploadedCount(fileList.length); + }, [JSON.stringify(fileList)]); return (
- +
- +
{renderFileItems()}
- {configModalVis && - - } + {configModalVis && ( + + )} setDeleteModalVis(false)} + onCancel={(): void => setDeleteModalVis(false)} onOk={deleteModalOk} width={500} className={classes.deleteModal} > -
- - +
+ + Are you sure to delete "{selectedFile?.fileName}"?
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/ComparisonPanel.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/ComparisonPanel.tsx index a9c9d34feb585cac7c6aa26f9e962c0ed9d11d88..500d29764c5209958ba19630ac1d4e08c10f24a5 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/ComparisonPanel.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/ComparisonPanel.tsx @@ -17,23 +17,23 @@ * limitations under the License. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { useState, useLayoutEffect, useRef, useEffect } from 'react' -import { makeStyles } from '@material-ui/core/styles' -import { FileInfo } from './entity' -import { Empty, Popover, Radio, RadioChangeEvent, Select, Table } from 'antd' -import { ColumnsType } from 'antd/es/table' -import * as echarts from 'echarts' -import { InfoCircleOutlined } from '@ant-design/icons' +import * as React from 'react'; +import { useState, useLayoutEffect, useRef, useEffect } from 'react'; +import { makeStyles } from '@material-ui/core/styles'; +import { FileInfo } from './entity'; +import { Empty, Popover, Radio, RadioChangeEvent, Select, Table } from 'antd'; +import { ColumnsType } from 'antd/es/table'; +import * as echarts from 'echarts'; +import { InfoCircleOutlined } from '@ant-design/icons'; interface IProps { - fileList: FileInfo[] + fileList: FileInfo[]; } interface ILineDataList { - normal: number[][] - absolute: number[][] - relative: number[][] + normal: number[][]; + absolute: number[][]; + relative: number[][]; } const useStyles = makeStyles(() => ({ @@ -49,26 +49,26 @@ const useStyles = makeStyles(() => ({ lineHeight: '24px', fontFamily: 'sans-serif', fontSize: 16, - fontWeight: 700 + fontWeight: 700, }, filter: { height: 40, lineHeight: '40px', '& .comparisonSelect': { - margin: '0 8px' + margin: '0 8px', }, '& 
.comparisonLabel': { - marginRight: 8 + marginRight: 8, }, '& .comparisonBtn': { - marginLeft: 20 + marginLeft: 20, }, '& .infoLabel': { - fontSize: 20 - } + fontSize: 20, + }, }, empty: { - marginTop: 60 + marginTop: 60, }, content: { flex: 1, @@ -76,11 +76,11 @@ const useStyles = makeStyles(() => ({ }, lossChart: { height: '100%', - flex: 1 + flex: 1, }, lossTable: { height: '100%', - width: '32%' + width: '32%', }, tableHeader: { display: 'inline-block', @@ -90,149 +90,163 @@ const useStyles = makeStyles(() => ({ transform: 'translateY(-50%)', overflow: 'hidden', textOverflow: 'ellipsis', - whiteSpace: 'nowrap' - } -})) + whiteSpace: 'nowrap', + }, +})); export const ComparisonPanel: React.FC = (props) => { - const { fileList } = props - const classes = useStyles() - const [selectedFiles, setSelectedFiles] = useState([]) - const [compareWay, setCompareWay] = useState(0) - const [pageSize, setPageSize] = useState(20) - const [lineData, setLineData] = useState(undefined) - const [tableData, setTableData] = useState([]) - const chartRef = useRef(null) + const { fileList } = props; + const classes = useStyles(); + const [selectedFiles, setSelectedFiles] = useState([]); + const [compareWay, setCompareWay] = useState(0); + const [pageSize, setPageSize] = useState(20); + const [lineData, setLineData] = useState(undefined); + const [tableData, setTableData] = useState([]); + const chartRef = useRef(null); const getColumns = (): ColumnsType => { - const columns: ColumnsType = [{ - title: 'Iteration', - key: 'iter', - dataIndex: 'iter', - }] + const columns: ColumnsType = [ + { + title: 'Iteration', + key: 'iter', + dataIndex: 'iter', + }, + ]; selectedFiles.forEach((item, index) => { columns.push({ title: () => ( -
{item}
+
+ {item} +
), key: index, dataIndex: item, - width: '40%' - }) - }) - return columns - } + width: '40%', + }); + }); + return columns; + }; - const compareFile = (fileNames: string[]) => { + const compareFile = (fileNames: string[]): void => { if (fileNames.length < 2) { - return + return; } - const baseFile = fileList.find(item => item.fileName === fileNames[0]) - const expFile = fileList.find(item => item.fileName === fileNames[1]) + const baseFile = fileList.find((item) => item.fileName === fileNames[0]); + const expFile = fileList.find((item) => item.fileName === fileNames[1]); if (!!baseFile && !!expFile) { - const commonIters: number[] = [] - const lessIters = baseFile.iters.length <= expFile.iters.length ? baseFile.iters : expFile.iters - const moreIters = baseFile.iters.length > expFile.iters.length ? baseFile.iters : expFile.iters - lessIters.forEach(iter => { + const commonIters: number[] = []; + const lessIters = baseFile.iters.length <= expFile.iters.length ? baseFile.iters : expFile.iters; + const moreIters = baseFile.iters.length > expFile.iters.length ? baseFile.iters : expFile.iters; + lessIters.forEach((iter) => { if (moreIters.includes(iter)) { - commonIters.push(iter) + commonIters.push(iter); } - }) - commonIters.sort((a, b) => a - b) - const tempTableData: any[] = [] + }); + commonIters.sort((a, b) => a - b); + const tempTableData: any[] = []; const tempChartData: ILineDataList = { normal: [], absolute: [], - relative: [] - } + relative: [], + }; commonIters.forEach((iter, index) => { - const baseLoss = baseFile.iterLosses[iter] - const expLoss = expFile.iterLosses[iter] + const baseLoss = baseFile.iterLosses[iter]; + const expLoss = expFile.iterLosses[iter]; tempTableData.push({ key: `${iter}_${index}`, iter, [baseFile.fileName]: baseLoss, - [expFile.fileName]: expLoss - }) - tempChartData.normal.push([iter, expLoss - baseLoss]) - tempChartData.absolute.push([iter, Math.abs(expLoss - baseLoss)]) - tempChartData.relative.push([iter, baseLoss === 0 ? 
0 : Math.abs(expLoss - baseLoss) / baseLoss]) - }) - setTableData(tempTableData) - setLineData(tempChartData) + [expFile.fileName]: expLoss, + }); + tempChartData.normal.push([iter, expLoss - baseLoss]); + tempChartData.absolute.push([iter, Math.abs(expLoss - baseLoss)]); + tempChartData.relative.push([iter, baseLoss === 0 ? 0 : Math.abs(expLoss - baseLoss) / baseLoss]); + }); + setTableData(tempTableData); + setLineData(tempChartData); } - } + }; - const onSelectChange = (value: string[]) => { - setSelectedFiles(value) - compareFile(value) - } + const onSelectChange = (value: string[]): void => { + setSelectedFiles(value); + compareFile(value); + }; - const onRadioChange = (e: RadioChangeEvent) => { - setCompareWay(e.target.value) - } + const onRadioChange = (e: RadioChangeEvent): void => { + setCompareWay(e.target.value); + }; - const onShowSizeChange = (current: number, size: number) => { - setPageSize(size) - } + const onShowSizeChange = (current: number, size: number): void => { + setPageSize(size); + }; useLayoutEffect(() => { - const element = chartRef.current + const element = chartRef.current; if (!element || !lineData) { - return + return undefined; + } + const echart = echarts.init(element); + let dataSource: number[][] = []; + if (compareWay === 0) { + dataSource = lineData.normal; + } else if (compareWay === 1) { + dataSource = lineData.absolute; + } else { + dataSource = lineData.relative; } - const echart = echarts.init(element) const option: echarts.EChartsOption = { title: { text: 'Comparison Chart', textStyle: { fontSize: 12, - color: '#000' - } + color: '#000', + }, }, legend: { bottom: 0 }, xAxis: { type: 'category', boundaryGap: false, - name: 'Iteration' + name: 'Iteration', }, yAxis: { type: 'value', name: 'Difference', - scale: true + scale: true, }, tooltip: { trigger: 'axis', - valueFormatter: (value) => (value as number).toFixed(6) + valueFormatter: (value) => (value as number).toFixed(6), }, dataZoom: { - type: 'inside' + type: 'inside', 
}, dataset: { - source: compareWay === 0 ? lineData.normal : (compareWay === 1 ? lineData.absolute : lineData.relative) + source: dataSource, }, series: { type: 'line', name: 'Difference', - symbol: 'none' - } - } + symbol: 'none', + }, + }; - option && echart.setOption(option, true) - return () => { - echart.dispose() + if (option) { + echart.setOption(option, true); } - }, [compareWay, lineData]) + return () => { + echart.dispose(); + }; + }, [compareWay, lineData]); useEffect(() => { - const tempValue = selectedFiles.filter(item => { - return !!fileList.find(file => file.fileName === item) - }) + const tempValue = selectedFiles.filter((item) => { + return !!fileList.find((file) => file.fileName === item); + }); if (JSON.stringify(tempValue) === JSON.stringify(selectedFiles)) { - compareFile(tempValue) + compareFile(tempValue); } - setSelectedFiles(tempValue) - }, [fileList]) + setSelectedFiles(tempValue); + }, [fileList]); return (
@@ -240,25 +254,23 @@ export const ComparisonPanel: React.FC = (props) => {
Comparison objects:
- Iteration Tag + Iteration Tag
- ) -} \ No newline at end of file + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/entity.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/entity.ts index 0a0a1ee4b28661799aea5a9233c4f3a90f4a251e..270c4cb6535633f9a03e5b9fe02dca6121cd3ba7 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/entity.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Accuracy/entity.ts @@ -18,13 +18,13 @@ *--------------------------------------------------------------------------------------------*/ export interface FileInfo { - id: number - fileName: string - fileContent: string - checked: boolean - lossTag: string - iterTag: string - iters: number[] - losses: number[][] - iterLosses: { [iter: number]: number } + id: number; + fileName: string; + fileContent: string; + checked: boolean; + lossTag: string; + iterTag: string; + iters: number[]; + losses: number[][]; + iterLosses: { [iter: number]: number }; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DataLoading.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DataLoading.tsx index e2967bdf74196ad74a13f2d2f8b1799911d3b553..3c5d353ce641c409b51a7aaef8c00ff2f57df6e8 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DataLoading.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DataLoading.tsx @@ -2,18 +2,18 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { FullCircularProgress } from './FullCircularProgress' +import * as React from 'react'; +import { FullCircularProgress } from './FullCircularProgress'; interface IProps { - value: T | undefined | null - children: (t: T) => JSX.Element + value?: T | null; + children: (t: T) => JSX.Element; } -export function DataLoading(props: IProps) { +export function DataLoading(props: IProps): JSX.Element { if (props.value === undefined || props.value === null) { - return + return ; } - return props.children(props.value) + return props.children(props.value); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DiffOverview.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DiffOverview.tsx index e8071b2c5966d944804b4d8abd780d8389042d38..ed029d5020ed1eaf8caea159b25d33c7a5ad03e3 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DiffOverview.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DiffOverview.tsx @@ -2,130 +2,101 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import Button from '@material-ui/core/Button' -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import Grid from '@material-ui/core/Grid' -import { makeStyles } from '@material-ui/core/styles' -import Typography from '@material-ui/core/Typography' -import ChevronLeftIcon from '@material-ui/icons/ChevronLeft' -import { Select, Table } from 'antd' -import * as React from 'react' -import * as api from '../api' -import { useResizeEventDependency } from '../utils/resize' -import { FullCircularProgress } from './FullCircularProgress' -import * as echarts from 'echarts' - -const { Option } = Select - -const topGraphHeight = 230 +import Button from '@material-ui/core/Button'; +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import Grid from '@material-ui/core/Grid'; +import { makeStyles } from '@material-ui/core/styles'; +import Typography from '@material-ui/core/Typography'; +import ChevronLeftIcon from '@material-ui/icons/ChevronLeft'; +import { Select, Table } from 'antd'; +import * as React from 'react'; +import * as api from '../api'; +import { useResizeEventDependency } from '../utils/resize'; +import { FullCircularProgress } from './FullCircularProgress'; +import * as echarts from 'echarts'; + +const { Option } = Select; + +const topGraphHeight = 230; const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, pre: { '& ul': { margin: 0, paddingLeft: theme.spacing(3), - ...theme.typography.body1 + ...theme.typography.body1, }, '& li': {}, '& a': { - color: '#ffa726' + color: '#ffa726', }, '& a:active': { - color: '#ffa726' + color: '#ffa726', }, '& p': { margin: 0, ...theme.typography.subtitle1, - fontWeight: 
theme.typography.fontWeightBold - } + fontWeight: theme.typography.fontWeightBold, + }, }, topGraph: { - height: topGraphHeight + 40 + height: topGraphHeight + 40, }, iconButton: { - padding: '8px' - } -})) + padding: '8px', + }, +})); -const getAngleByDataLength = (data: number) => { +const getAngleByDataLength = (data: number): number => { if (data < 10) { - return 0 + return 0; } else { // The larger the count, the closer the rotation gets to 90 degrees - return 90 * (1 - 10 / data) + return 90 * (1 - (10 / data)); } -} +}; export interface DiffColumnChartIProps { - rawData: any[] - selectCallback: (row: number, column: number) => void + rawData: any[]; + selectCallback: (row: number, column: number) => void; } export interface DiffStepChartIProps { - rawData: any[] + rawData: any[]; } -const DiffColumnChart: React.FC = ( - props: DiffColumnChartIProps -) => { - const { rawData, selectCallback } = props - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() +const DiffColumnChart: React.FC = (props: DiffColumnChartIProps) => { + const { rawData, selectCallback } = props; + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return - - let left_duration_data: number[] = [] - let left_accumulated_duration_data: number[] = [] - - let right_duration_data: number[] = [] - let right_accumulated_duration_data: number[] = [] - - for (let i = 0; i < rawData.length; i++) { - let curr = rawData[i] - left_duration_data.push(curr[1]) - right_duration_data.push(curr[2]) - left_accumulated_duration_data.push(curr[3]) - right_accumulated_duration_data.push(curr[4]) + const element = graphRef.current; + if (!element) { + return undefined; } - let left_duration_max = Math.max(...left_duration_data) - let right_duration_max = Math.max(...right_duration_data) - let duration_max = Math.max(left_duration_max, right_duration_max) - - let 
left_accumulated_duration_max = Math.max( - ...left_accumulated_duration_data - ) - let right_accumulated_duration_max = Math.max( - ...right_accumulated_duration_data - ) - let accumulated_max = Math.max( - left_accumulated_duration_max, - right_accumulated_duration_max - ) - - const chart = echarts.init(element) + const chart = echarts.init(element); const options: echarts.EChartsOption = { title: { - text: 'Execution Comparsion' + text: 'Execution Comparsion', }, legend: { top: 10, - right: 10 + right: 10, }, tooltip: { trigger: 'axis', formatter: function (params: any) { - const index = params[0].name.indexOf('@') - const safeName = params[0].name.replace(//g, '>') - var res = `${index > -1 ? safeName.slice(index + 1) : safeName}
` + const index = params[0].name.indexOf('@'); + const safeName = params[0].name.replace(//g, '>'); + let res = `${index > -1 ? safeName.slice(index + 1) : safeName}
`; for (const item of params) { if (typeof item.value[item.encode.y[0]] === 'number') { res += ` - ${item.seriesName}: ${item.value[item.encode.y[0]]}
` + ${item.seriesName}: ${item.value[item.encode.y[0]]}
`; } } - return res - } + return res; + }, }, series: [ { type: 'bar', itemStyle: { - color: '#3366cc' + color: '#3366cc', }, yAxisIndex: 0, - }, { type: 'bar', itemStyle: { - color: '#dc3912' + color: '#dc3912', }, - yAxisIndex: 0 + yAxisIndex: 0, }, { type: 'line', itemStyle: { - color: '#ff9900' + color: '#ff9900', }, - yAxisIndex: 1 + yAxisIndex: 1, }, { type: 'line', itemStyle: { - color: '#109618' + color: '#109618', }, - yAxisIndex: 1 - } + yAxisIndex: 1, + }, ], xAxis: { type: 'category', @@ -178,78 +148,81 @@ const DiffColumnChart: React.FC = ( interval: 0, rotate: getAngleByDataLength(rawData.length), formatter: (name: string) => { - const index = name.indexOf('@') - if (index > -1) { - name = name.slice(index + 1) - } - return name.length > 16 ? name.slice(0, 14) + "..." : name; - } - } + const index = name.indexOf('@'); + const displayName = index > -1 ? name.slice(index + 1) : name; // Create a new variable instead of reassigning the parameter + return displayName.length > 16 ? `${displayName.slice(0, 14)}...` : displayName; + }, + }, }, - yAxis: [{ - type: 'value', - name: 'Time Difference(us)', - scale: true - }, { - type: 'value', - name: 'Accumulated Difference(us)', - scale: true - }], + yAxis: [ + { + type: 'value', + name: 'Time Difference(us)', + scale: true, + }, + { + type: 'value', + name: 'Accumulated Difference(us)', + scale: true, + }, + ], dataset: { source: rawData.map((item, idx) => { // Prefix the index so that x-axis tick labels are unique - let param: any[] = [...item] - param[0] = `${idx}@${param[0]}` - return param - }) - } - } + let param: any[] = [...item]; + param[0] = `${idx}@${param[0]}`; + return param; + }), + }, + }; - options && chart.setOption(options, true) + if (options) { + chart.setOption(options, true); + } chart.on('click', (param) => { if (param.seriesIndex !== undefined) { - selectCallback(param.dataIndex, param.seriesIndex + 1) + selectCallback(param.dataIndex, param.seriesIndex + 1); } - }) + }); return () => { - chart.dispose() - } - }, [rawData, resizeEventDependency]) + chart.dispose(); + }; + }, 
[rawData, resizeEventDependency]); return (
- ) -} + ); +}; -const DiffStepChart: React.FC = ( - props: DiffStepChartIProps -) => { - const { rawData } = props - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() +const DiffStepChart: React.FC = (props: DiffStepChartIProps) => { + const { rawData } = props; + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return - const chart = echarts.init(element) + const element = graphRef.current; + if (!element) { + return undefined; + } + const chart = echarts.init(element); const options: echarts.EChartsOption = { title: { - text: 'Execution Diff' + text: 'Execution Diff', }, legend: { top: 10, - right: 10 + right: 10, }, dataset: { source: rawData.map((item, idx) => { // Prefix the index so that x-axis tick labels are unique - let param: any[] = [...item] - param[0] = `${idx}@${param[0]}` - return param - }) + let param: any[] = [...item]; + param[0] = `${idx}@${param[0]}`; + return param; + }), }, xAxis: { type: 'category', @@ -257,24 +230,22 @@ const DiffStepChart: React.FC = ( interval: 0, rotate: getAngleByDataLength(rawData.length), formatter: (name: string) => { - const index = name.indexOf('@') - if (index > -1) { - name = name.slice(index + 1) - } - return name.length > 16 ? name.slice(0, 14) + "..." : name; - } - } + const index = name.indexOf('@'); + const displayName = index > -1 ? name.slice(index + 1) : name; // Create a new variable instead of reassigning the parameter + return displayName.length > 16 ? `${displayName.slice(0, 14)}...` : displayName; + }, + }, }, yAxis: { type: 'value', - scale: true + scale: true, }, tooltip: { trigger: 'axis', formatter: function (params: any) { - const index = params[0].name.indexOf('@') - const safeName = params[0].name.replace(//g, '>') - var res = `${index > -1 ? safeName.slice(index + 1) : safeName}
` + const index = params[0].name.indexOf('@'); + const safeName = params[0].name.replace(//g, '>'); + let res = `${index > -1 ? safeName.slice(index + 1) : safeName}
`; for (const item of params) { if (typeof item.value[item.encode.y[0]] === 'number') { res += ` - ${item.seriesName}: ${item.value[item.encode.y[0]]}
` + ${item.seriesName}: ${item.value[item.encode.y[0]]}
`; } } - return res - } + return res; + }, }, series: [ { @@ -298,413 +269,411 @@ const DiffStepChart: React.FC = ( step: 'middle', areaStyle: { color: '#c1d1ef', - opacity: 1 - } - }, { + opacity: 1, + }, + }, + { type: 'line', color: '#dc3912', symbolSize: 0, step: 'middle', areaStyle: { color: '#f4c3b7', - opacity: 1 - } - } - ] - } + opacity: 1, + }, + }, + ], + }; - options && chart.setOption(options, true) - return () => { - chart.dispose() + if (options) { + chart.setOption(options, true); } - }, [rawData, resizeEventDependency]) + return () => { + chart.dispose(); + }; + }, [rawData, resizeEventDependency]); return (
- ) -} + ); +}; export interface IProps { - run: string - worker: string - span: string - expRun: string - expWorker: string - expSpan: string + run: string; + worker: string; + span: string; + expRun: string; + expWorker: string; + expSpan: string; } export interface ColumnUnderlyingData { - name: string - path: string - leftAggs: any[] - rightAggs: any[] + name: string; + path: string; + leftAggs: any[]; + rightAggs: any[]; } export interface TableRow { - key: number - - operator: string - baselineCalls?: number - expCalls?: number - deltaCalls?: number - deltaCallsPercentNumber?: number - deltaCallsPercent?: string - - baselineHostDuration: number - expHostDuration: number - deltaHostDuration: number - deltaHostDurationPercentNumber: number - deltaHostDurationPercent: string - - baselineSelfHostDuration: number - expSelfHostDuration: number - deltaSelfHostDuration: number - deltaSelfHostDurationPercentNumber: number - deltaSelfHostDurationPercent: string - - baselineDeviceDuration: number - expDeviceDuration: number - deltaDeviceDuration: number - deltaDeviceDurationPercentNumber: number - deltaDeviceDurationPercent: string - - baselineSelfDeviceDuration: number - expSelfDeviceDuration: number - deltaSelfDeviceDuration: number - deltaSelfDeviceDurationPercentNumber: number - deltaSelfDeviceDurationPercent: string + key: number; + + operator: string; + baselineCalls?: number; + expCalls?: number; + deltaCalls?: number; + deltaCallsPercentNumber?: number; + deltaCallsPercent?: string; + + baselineHostDuration: number; + expHostDuration: number; + deltaHostDuration: number; + deltaHostDurationPercentNumber: number; + deltaHostDurationPercent: string; + + baselineSelfHostDuration: number; + expSelfHostDuration: number; + deltaSelfHostDuration: number; + deltaSelfHostDurationPercentNumber: number; + deltaSelfHostDurationPercent: string; + + baselineDeviceDuration: number; + expDeviceDuration: number; + deltaDeviceDuration: number; + deltaDeviceDurationPercentNumber: 
number; + deltaDeviceDurationPercent: string; + + baselineSelfDeviceDuration: number; + expSelfDeviceDuration: number; + deltaSelfDeviceDuration: number; + deltaSelfDeviceDurationPercentNumber: number; + deltaSelfDeviceDurationPercent: string; } -let columnChartDataStack: any[][] = [] -let stepChartDataStack: any[][] = [] -let columnUnderlyingDataStack: ColumnUnderlyingData[][] = [] -let columnTableDataSourceStack: TableRow[][] = [] +let columnChartDataStack: any[][] = []; +let stepChartDataStack: any[][] = []; +let columnUnderlyingDataStack: ColumnUnderlyingData[][] = []; +let columnTableDataSourceStack: TableRow[][] = []; export const DiffOverview: React.FC = (props: IProps) => { // #region - Constant - - const COMPOSITE_NODES_NAME = 'CompositeNodes' + const COMPOSITE_NODES_NAME = 'CompositeNodes'; const hostDurationColumns = [ { title: 'Baseline Host Duration (us)', dataIndex: 'baselineHostDuration', key: 'baselineHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.baselineHostDuration - b.baselineHostDuration + sorter: (a: TableRow, b: TableRow): number => { + const aBaselineHost = a.baselineHostDuration ?? 0; + const bBaselineHost = b.baselineHostDuration ?? 0; + return aBaselineHost - bBaselineHost; + }, }, { title: 'Exp Host Duration (us)', dataIndex: 'expHostDuration', key: 'expHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.expHostDuration - b.expHostDuration + sorter: (a: TableRow, b: TableRow): number => { + const aExpHost = a.expHostDuration ?? 0; + const bExpHost = b.expHostDuration ?? 0; + return aExpHost - bExpHost; + }, }, { title: 'Delta Host Duration (us)', dataIndex: 'deltaHostDuration', key: 'deltaHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.deltaHostDuration! - b.deltaHostDuration! + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaHost = a.deltaHostDuration ?? 0; + const bDeltaHost = b.deltaHostDuration ?? 
0; + return aDeltaHost - bDeltaHost; + }, }, { title: 'Delta Host Duration%', dataIndex: 'deltaHostDurationPercent', key: 'deltaHostDurationPercent', - sorter: (a: TableRow, b: TableRow) => - a.deltaHostDurationPercentNumber! - b.deltaHostDurationPercentNumber! - } - ] + sorter: (a: TableRow, b: TableRow): number => { + const aPercent = a.deltaHostDurationPercentNumber ?? 0; + const bPercent = b.deltaHostDurationPercentNumber ?? 0; + return aPercent - bPercent; + }, + }, + ]; const selfHostDurationColumns = [ { title: 'Baseline Self Host Duration (us)', dataIndex: 'baselineSelfHostDuration', key: 'baselineSelfHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.baselineSelfHostDuration - b.baselineSelfHostDuration + sorter: (a: TableRow, b: TableRow): number => a.baselineSelfHostDuration - b.baselineSelfHostDuration, }, { title: 'Exp Self Host Duration (us)', dataIndex: 'expSelfHostDuration', key: 'expSelfHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.expSelfHostDuration - b.expSelfHostDuration + sorter: (a: TableRow, b: TableRow): number => a.expSelfHostDuration - b.expSelfHostDuration, }, { title: 'Delta Self Host Duration (us)', dataIndex: 'deltaSelfHostDuration', key: 'deltaSelfHostDuration', - sorter: (a: TableRow, b: TableRow) => - a.deltaSelfHostDuration! - b.deltaSelfHostDuration! + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaSelfHost = a.deltaSelfHostDuration ?? 0; + const bDeltaSelfHost = b.deltaSelfHostDuration ?? 0; + return aDeltaSelfHost - bDeltaSelfHost; + }, }, { title: 'Delta Self Host Duration%', dataIndex: 'deltaSelfHostDurationPercent', key: 'deltaSelfHostDurationPercent', - sorter: (a: TableRow, b: TableRow) => - a.deltaSelfHostDurationPercentNumber! - - b.deltaSelfHostDurationPercentNumber! - } - ] + sorter: (a: TableRow, b: TableRow): number => { + const aSelfPercent = a.deltaSelfHostDurationPercentNumber ?? 0; + const bSelfPercent = b.deltaSelfHostDurationPercentNumber ?? 
0; + return aSelfPercent - bSelfPercent; + }, + }, + ]; const deviceDurationColumns = [ { title: 'Baseline Device Duration (us)', dataIndex: 'baselineDeviceDuration', key: 'baselineDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.baselineDeviceDuration - b.baselineDeviceDuration + sorter: (a: TableRow, b: TableRow): number => a.baselineDeviceDuration - b.baselineDeviceDuration, }, { title: 'Exp Device Duration (us)', dataIndex: 'expDeviceDuration', key: 'expDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.expDeviceDuration - b.expDeviceDuration + sorter: (a: TableRow, b: TableRow): number => a.expDeviceDuration - b.expDeviceDuration, }, { title: 'Delta Device Duration (us)', dataIndex: 'deltaDeviceDuration', key: 'deltaDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.deltaDeviceDuration! - b.deltaDeviceDuration! + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaDeviceDuration = a.deltaDeviceDuration ?? 0; + const bdeltaDeviceDuration = b.deltaDeviceDuration ?? 0; + return aDeltaDeviceDuration - bdeltaDeviceDuration; + }, }, { title: 'Delta Device Duration%', dataIndex: 'deltaDeviceDurationPercent', key: 'deltaDeviceDurationPercent', - sorter: (a: TableRow, b: TableRow) => - a.deltaDeviceDurationPercentNumber! - - b.deltaDeviceDurationPercentNumber! - } - ] + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaDeviceDurationPercentNumber = a.deltaDeviceDurationPercentNumber ?? 0; + const bDeltaDeviceDurationPercentNumber = b.deltaDeviceDurationPercentNumber ?? 
0; + return aDeltaDeviceDurationPercentNumber - bDeltaDeviceDurationPercentNumber; + }, + }, + ]; const selfDeviceDurationColumns = [ { title: 'Baseline Self Device Duration (us)', dataIndex: 'baselineSelfDeviceDuration', key: 'baselineSelfDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.baselineSelfDeviceDuration - b.baselineSelfDeviceDuration + sorter: (a: TableRow, b: TableRow): number => a.baselineSelfDeviceDuration - b.baselineSelfDeviceDuration, }, { title: 'Exp Self Device Duration (us)', dataIndex: 'expSelfDeviceDuration', key: 'expSelfDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.expSelfDeviceDuration - b.expSelfDeviceDuration + sorter: (a: TableRow, b: TableRow): number => a.expSelfDeviceDuration - b.expSelfDeviceDuration, }, { title: 'Delta Self Device Duration (us)', dataIndex: 'deltaSelfDeviceDuration', key: 'deltaSelfDeviceDuration', - sorter: (a: TableRow, b: TableRow) => - a.deltaSelfDeviceDuration! - b.deltaSelfDeviceDuration! + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaSelfDeviceDuration = a.deltaSelfDeviceDuration ?? 0; + const bDeltaSelfDeviceDuration = b.deltaSelfDeviceDuration ?? 0; + return aDeltaSelfDeviceDuration - bDeltaSelfDeviceDuration; + }, }, { title: 'Delta Self Device Duration%', dataIndex: 'deltaSelfDeviceDurationPercent', key: 'deltaSelfDeviceDurationPercent', - sorter: (a: TableRow, b: TableRow) => - a.deltaSelfDeviceDurationPercentNumber! - - b.deltaSelfDeviceDurationPercentNumber! - } - ] + sorter: (a: TableRow, b: TableRow): number => { + const aDeltaSelfDeviceDurationPercentNumber = a.deltaSelfDeviceDurationPercentNumber ?? 0; + const bDeltaSelfDeviceDurationPercentNumber = b.deltaSelfDeviceDurationPercentNumber ?? 
0; + return aDeltaSelfDeviceDurationPercentNumber - bDeltaSelfDeviceDurationPercentNumber; + }, + }, + ]; - type IColumnMapType = { [key: string]: any } + interface IColumnMap { + [key: string]: any; + } + type IColumnMapType = IColumnMap; const tableSourceColumnMap: IColumnMapType = { selfHostDuration: selfHostDurationColumns, hostDuration: hostDurationColumns, deviceDuration: deviceDurationColumns, - selfDeviceDuration: selfDeviceDurationColumns - } + selfDeviceDuration: selfDeviceDurationColumns, + }; const baseTableColumns = [ { title: 'Operator', dataIndex: 'operator', key: 'operator', - sorter: (a: TableRow, b: TableRow) => a.operator.localeCompare(b.operator) + sorter: (a: TableRow, b: TableRow) => a.operator.localeCompare(b.operator), }, { title: 'Baseline Calls', dataIndex: 'baselineCalls', key: 'baselineCalls', - sorter: (a: TableRow, b: TableRow) => a.baselineCalls! - b.baselineCalls! + sorter: (a: TableRow, b: TableRow) => (a.baselineCalls ?? 0) - (b.baselineCalls ?? 0), }, { title: 'Exp Calls', dataIndex: 'expCalls', key: 'expCalls', - sorter: (a: TableRow, b: TableRow) => a.expCalls! - b.expCalls! + sorter: (a: TableRow, b: TableRow) => (a.expCalls ?? 0) - (b.expCalls ?? 0), }, { title: 'Delta Calls', dataIndex: 'deltaCalls', key: 'deltaCalls', - sorter: (a: TableRow, b: TableRow) => a.deltaCalls! - b.deltaCalls! + sorter: (a: TableRow, b: TableRow) => (a.deltaCalls ?? 0) - (b.deltaCalls ?? 0), }, { title: 'Delta Calls%', dataIndex: 'deltaCallsPercent', key: 'deltaCallsPercent', - sorter: (a: TableRow, b: TableRow) => - a.deltaCallsPercentNumber! - b.deltaCallsPercentNumber! - } - ] + sorter: (a: TableRow, b: TableRow) => (a.deltaCallsPercentNumber ?? 0) - (b.deltaCallsPercentNumber ?? 
0), + }, + ]; // #endregion // #region - State - const [tableDataSource, setTableDataSource] = React.useState([]) - const { run, worker, span, expRun, expWorker, expSpan } = props + const [tableDataSource, setTableDataSource] = React.useState([]); + const { run, worker, span, expRun, expWorker, expSpan } = props; - const [columnUnderlyingData, setColumnUnderlyingData] = React.useState< - ColumnUnderlyingData[] - >([]) + const [columnUnderlyingData, setColumnUnderlyingData] = React.useState([]); - const [ - rootUnderlyingData, - setRootUnderlyingData - ] = React.useState() + const [rootUnderlyingData, setRootUnderlyingData] = React.useState(); - const [columnChartData, setColumnChartData] = React.useState([]) - const [stepChartData, setStepChartData] = React.useState([]) + const [columnChartData, setColumnChartData] = React.useState([]); + const [stepChartData, setStepChartData] = React.useState([]); - const [ - selectedTableColumnsOptions, - setSelectedTableColumnsOptions - ] = React.useState<[key: string]>(['hostDuration']) - const [selectedTableColumns, setSelectedTableColumns] = React.useState( - [...baseTableColumns, ...hostDurationColumns] - ) + const [selectedTableColumnsOptions, setSelectedTableColumnsOptions] = React.useState<[key: string]>(['hostDuration']); + const [selectedTableColumns, setSelectedTableColumns] = React.useState([ + ...baseTableColumns, + ...hostDurationColumns, + ]); - const [dataStackLevel, setDataStackLevel] = React.useState(0) - const [loading, setLoading] = React.useState(false) + const [dataStackLevel, setDataStackLevel] = React.useState(0); + const [loading, setLoading] = React.useState(false); // #endregion - const classes = useStyles() + const classes = useStyles(); // #region - Event Handler - const handleChartColumnSelect = (row: number, column: number) => { + const handleChartColumnSelect = (row: number, column: number): void => { if (columnUnderlyingData.length === 0) { - return + return; } - let selectedUnderlyingData = 
columnUnderlyingData[row] + let selectedUnderlyingData = columnUnderlyingData[row]; if (!selectedUnderlyingData) { - return + return; } - let tableDataSource = generateDataSourceFromUnderlyingData( - selectedUnderlyingData - ) - setTableDataSource(tableDataSource) - columnTableDataSourceStack.push(tableDataSource) + let tableDataSource1 = generateDataSourceFromUnderlyingData(selectedUnderlyingData); + setTableDataSource(tableDataSource1); + columnTableDataSourceStack.push(tableDataSource1); - setLoading(true) + setLoading(true); api.defaultApi - .diffnodeGet( - run, - worker, - span, - expRun, - expWorker, - expSpan, - selectedUnderlyingData.path - ) + .diffnodeGet(run, worker, span, expRun, expWorker, expSpan, selectedUnderlyingData.path) .then((resp) => handleDiffNodeResp(resp)) - .finally(() => setLoading(false)) - } + .finally(() => setLoading(false)); + }; - const handleGoBack = () => { + const handleGoBack = (): void => { if (columnChartDataStack.length > 1) { - columnChartDataStack.pop() - let top = columnChartDataStack[columnChartDataStack.length - 1] - setColumnChartData(top) + columnChartDataStack.pop(); + let top = columnChartDataStack[columnChartDataStack.length - 1]; + setColumnChartData(top); } if (stepChartDataStack.length > 1) { - stepChartDataStack.pop() - let top = stepChartDataStack[stepChartDataStack.length - 1] - setStepChartData(top) + stepChartDataStack.pop(); + let top = stepChartDataStack[stepChartDataStack.length - 1]; + setStepChartData(top); } if (columnUnderlyingDataStack.length > 0) { - columnUnderlyingDataStack.pop() - let top = columnUnderlyingDataStack[columnUnderlyingDataStack.length - 1] - setColumnUnderlyingData(top) + columnUnderlyingDataStack.pop(); + let top = columnUnderlyingDataStack[columnUnderlyingDataStack.length - 1]; + setColumnUnderlyingData(top); } if (columnTableDataSourceStack.length > 0) { - columnTableDataSourceStack.pop() - let top = - columnTableDataSourceStack[columnTableDataSourceStack.length - 1] + 
columnTableDataSourceStack.pop(); + let top = columnTableDataSourceStack[columnTableDataSourceStack.length - 1]; if (top) { - setTableDataSource(top) + setTableDataSource(top); } else { - let tableDataSource = generateDataSourceFromUnderlyingData( - rootUnderlyingData! - ) - setTableDataSource(tableDataSource) + let tableDataSource2 = generateDataSourceFromUnderlyingData(rootUnderlyingData); + setTableDataSource(tableDataSource2); } } - setDataStackLevel(dataStackLevel - 1) - } + setDataStackLevel(dataStackLevel - 1); + }; - const toPercentString = (percentNumber: number) => { + const toPercentString = (percentNumber: number): string => { if (isNaN(percentNumber)) { - return 'N/A' + return 'N/A'; } - return `${percentNumber.toFixed(2)}%` - } + return `${percentNumber.toFixed(2)}%`; + }; - const handleColumnSelectionChange = (value: [key: string]) => { - let columns = value.map((x) => tableSourceColumnMap[x]).flat() - let r = [...baseTableColumns, ...columns] - setSelectedTableColumnsOptions(value) - setSelectedTableColumns(r) - } + const handleColumnSelectionChange = (value: [key: string]): void => { + let columns = value.map((x) => tableSourceColumnMap[x]).flat(); + let r = [...baseTableColumns, ...columns]; + setSelectedTableColumnsOptions(value); + setSelectedTableColumns(r); + }; - const generateDataSourceFromUnderlyingData = ( - selectedUnderlyingData: ColumnUnderlyingData - ) => { - let tableDataSource: TableRow[] = [] + const generateDataSourceFromUnderlyingData = (selectedUnderlyingData?: ColumnUnderlyingData): TableRow[] => { + if (!selectedUnderlyingData) { + return []; + } + let newTableDataSource: TableRow[] = []; for (let i = 0; i < selectedUnderlyingData.leftAggs.length; i++) { - let left = selectedUnderlyingData.leftAggs[i] - let right = selectedUnderlyingData.rightAggs[i] + let left = selectedUnderlyingData.leftAggs[i]; + let right = selectedUnderlyingData.rightAggs[i]; - let deltaCallsPercentNumber = - ((right.calls - left.calls) / left.calls) * 
100 + let deltaCallsPercentNumber = ((right.calls - left.calls) / left.calls) * 100; - let deltaHostDurationPercentNumber = - ((right.host_duration - left.host_duration) / left.host_duration) * 100 + let deltaHostDurationPercentNumber = ((right.host_duration - left.host_duration) / left.host_duration) * 100; let deltaSelfHostDurationPercentNumber = - ((right.self_host_duration - left.self_host_duration) / - left.self_host_duration) * - 100 + ((right.self_host_duration - left.self_host_duration) / left.self_host_duration) * 100; let deltaDeviceDurationPercentNumber = - ((right.device_duration - left.device_duration) / - left.device_duration) * - 100 + ((right.device_duration - left.device_duration) / left.device_duration) * 100; let deltaSelfDeviceDurationPercentNumber = - ((right.self_device_duration - left.self_device_duration) / - left.self_device_duration) * - 100 + ((right.self_device_duration - left.self_device_duration) / left.self_device_duration) * 100; - tableDataSource.push({ + newTableDataSource.push({ key: i, operator: left.name, baselineCalls: left.calls, @@ -717,214 +686,194 @@ export const DiffOverview: React.FC = (props: IProps) => { expHostDuration: right.host_duration, deltaHostDuration: parseFloat((right.host_duration - left.host_duration).toFixed(3)), deltaHostDurationPercentNumber: deltaHostDurationPercentNumber, - deltaHostDurationPercent: toPercentString( - deltaHostDurationPercentNumber - ), + deltaHostDurationPercent: toPercentString(deltaHostDurationPercentNumber), baselineSelfHostDuration: left.self_host_duration, expSelfHostDuration: right.self_host_duration, - deltaSelfHostDuration: - parseFloat((right.self_host_duration - left.self_host_duration).toFixed(3)), + deltaSelfHostDuration: parseFloat((right.self_host_duration - left.self_host_duration).toFixed(3)), deltaSelfHostDurationPercentNumber: deltaSelfHostDurationPercentNumber, - deltaSelfHostDurationPercent: toPercentString( - deltaSelfHostDurationPercentNumber - ), + 
deltaSelfHostDurationPercent: toPercentString(deltaSelfHostDurationPercentNumber), baselineDeviceDuration: left.device_duration, expDeviceDuration: right.device_duration, deltaDeviceDuration: parseFloat((right.device_duration - left.device_duration).toFixed(3)), deltaDeviceDurationPercentNumber: deltaDeviceDurationPercentNumber, - deltaDeviceDurationPercent: toPercentString( - deltaDeviceDurationPercentNumber - ), + deltaDeviceDurationPercent: toPercentString(deltaDeviceDurationPercentNumber), baselineSelfDeviceDuration: left.self_device_duration, expSelfDeviceDuration: right.self_device_duration, - deltaSelfDeviceDuration: - parseFloat((right.self_device_duration - left.self_device_duration).toFixed(3)), + deltaSelfDeviceDuration: parseFloat((right.self_device_duration - left.self_device_duration).toFixed(3)), deltaSelfDeviceDurationPercentNumber: deltaSelfDeviceDurationPercentNumber, - deltaSelfDeviceDurationPercent: toPercentString( - deltaSelfDeviceDurationPercentNumber - ) - }) + deltaSelfDeviceDurationPercent: toPercentString(deltaSelfDeviceDurationPercentNumber), + }); } - return tableDataSource - } + return newTableDataSource; + }; React.useEffect(() => { - if ( + const hasData = run.length > 0 && worker.length > 0 && span.length > 0 && expRun.length > 0 && expWorker.length > 0 && - expSpan.length > 0 - ) { - setLoading(true) + expSpan.length > 0; + if (hasData) { + setLoading(true); - columnChartDataStack = [] - stepChartDataStack = [] - columnUnderlyingDataStack = [] - columnTableDataSourceStack = [] + columnChartDataStack = []; + stepChartDataStack = []; + columnUnderlyingDataStack = []; + columnTableDataSourceStack = []; api.defaultApi .diffnodeGet(run, worker, span, expRun, expWorker, expSpan) .then((resp) => { - handleDiffNodeResp(resp) - let rootUnderlyingData = { + handleDiffNodeResp(resp); + let newRootUnderlyingData = { name: 'rootNode', path: resp.path, leftAggs: resp.left.aggs, - rightAggs: resp.right.aggs - } + rightAggs: resp.right.aggs, + }; 
- setRootUnderlyingData(rootUnderlyingData) - let tableDataSource = generateDataSourceFromUnderlyingData( - rootUnderlyingData! - ) - setTableDataSource(tableDataSource) + setRootUnderlyingData(newRootUnderlyingData); + let tableDataSource3 = generateDataSourceFromUnderlyingData(newRootUnderlyingData); + setTableDataSource(tableDataSource3); }) - .finally(() => setLoading(false)) + .finally(() => setLoading(false)); - setSelectedTableColumns([...baseTableColumns, ...hostDurationColumns]) + setSelectedTableColumns([...baseTableColumns, ...hostDurationColumns]); } - }, [run, worker, span, expRun, expWorker, expSpan]) - - const handleDiffNodeResp = (resp: any) => { - let columnChartData: any[] = [] - let stepChartData: any[] = [] - let underlyingData: ColumnUnderlyingData[] = [] - - columnChartData.push([ - 'Call', - 'Baseline', - 'Experiment', - 'Baseline Trend', - 'Exp Trend' - ]) - stepChartData.push(['Call', 'Diff', 'Accumulated Diff']) + }, [run, worker, span, expRun, expWorker, expSpan]); + + const handleDiffNodeResp = (resp: any): void => { + let newColumnChartData: any[] = []; + let newStepChartData: any[] = []; + let underlyingData: ColumnUnderlyingData[] = []; + + newColumnChartData.push(['Call', 'Baseline', 'Experiment', 'Baseline Trend', 'Exp Trend']); + newStepChartData.push(['Call', 'Diff', 'Accumulated Diff']); if (resp.children.length > 0) { - let accumulated_left_duration = 0 - let accumulated_right_duration = 0 - let accumulated_step_diff = 0 + let accumulatedLeftDuration = 0; + let accumulatedRightDuration = 0; + let accumulatedStepDiff = 0; for (let i = 0; i < resp.children.length; i++) { - let left = resp.children[i].left - let right = resp.children[i].right - let currColumn: any[] = [] - let currStep: any[] = [] + let left = resp.children[i].left; + let right = resp.children[i].right; + let currColumn: any[] = []; + let currStep: any[] = []; - let name = left.name + let name = left.name; if (name === COMPOSITE_NODES_NAME) { - continue + continue; 
} if (name.startsWith('aten::')) { // Ignore aten operators - continue + continue; } if (name.startsWith('enumerate(DataLoader)')) { - name = name.substring(21) + name = name.substring(21); } if (name.startsWith('enumerate(DataPipe)')) { - name = name.substring(19) + name = name.substring(19); } if (name.startsWith('nn.Module: ')) { - name = name.substring(11) + name = name.substring(11); } if (name.startsWith('Optimizer.zero_grad')) { - name = 'Optimizer.zero_grad' + name = 'Optimizer.zero_grad'; } if (name.startsWith('Optimizer.step')) { - name = 'Optimizer.step' + name = 'Optimizer.step'; } - currColumn.push(name) - currColumn.push(left.total_duration) - currColumn.push(right.total_duration) + currColumn.push(name); + currColumn.push(left.total_duration); + currColumn.push(right.total_duration); - accumulated_left_duration += left.total_duration - currColumn.push(accumulated_left_duration) + accumulatedLeftDuration += left.total_duration; + currColumn.push(accumulatedLeftDuration); - accumulated_right_duration += right.total_duration - currColumn.push(accumulated_right_duration) - columnChartData.push(currColumn) + accumulatedRightDuration += right.total_duration; + currColumn.push(accumulatedRightDuration); + newColumnChartData.push(currColumn); underlyingData.push({ name: name, path: resp.children[i].path, leftAggs: left.aggs, - rightAggs: right.aggs - }) + rightAggs: right.aggs, + }); - currStep.push(name) - let stepDiff = right.total_duration - left.total_duration - currStep.push(stepDiff) + currStep.push(name); + let stepDiff = right.total_duration - left.total_duration; + currStep.push(stepDiff); - accumulated_step_diff += stepDiff - currStep.push(accumulated_step_diff) + accumulatedStepDiff += stepDiff; + currStep.push(accumulatedStepDiff); - stepChartData.push(currStep) + newStepChartData.push(currStep); } } else { - let left = resp.left - let right = resp.right - let currColumn: any[] = [] - let currStep: any[] = [] - let name = left.name + let left = 
resp.left; + let right = resp.right; + let currColumn: any[] = []; + let currStep: any[] = []; + let name = left.name; if (name.startsWith('nn.Module: ')) { - name = name.substring(11) + name = name.substring(11); } - currColumn.push(name) - currColumn.push(left.total_duration) - currColumn.push(right.total_duration) - currColumn.push(left.total_duration) - currColumn.push(right.total_duration) + currColumn.push(name); + currColumn.push(left.total_duration); + currColumn.push(right.total_duration); + currColumn.push(left.total_duration); + currColumn.push(right.total_duration); - columnChartData.push(currColumn) + newColumnChartData.push(currColumn); - currStep.push(name) - let stepDiff = right.total_duration - left.total_duration - currStep.push(stepDiff) - currStep.push(stepDiff) - stepChartData.push(currStep) + currStep.push(name); + let stepDiff = right.total_duration - left.total_duration; + currStep.push(stepDiff); + currStep.push(stepDiff); + newStepChartData.push(currStep); } - setColumnChartData(columnChartData) - columnChartDataStack.push(columnChartData) - - setStepChartData(stepChartData) - stepChartDataStack.push(stepChartData) + setColumnChartData(newColumnChartData); + columnChartDataStack.push(newColumnChartData); - setColumnUnderlyingData(underlyingData) - columnUnderlyingDataStack.push(underlyingData) + setStepChartData(newStepChartData); + stepChartDataStack.push(newStepChartData); - setDataStackLevel(columnChartDataStack.length) - } + setColumnUnderlyingData(underlyingData); + columnUnderlyingDataStack.push(underlyingData); - // #endregion + setDataStackLevel(columnChartDataStack.length); + }; // #endregion if (!loading && columnUnderlyingDataStack.length === 0) { return ( - - + + There is no run selected for diff. 
- ) + ); } if (loading) { - return + return ; } return ( @@ -932,73 +881,62 @@ export const DiffOverview: React.FC = (props: IProps) => { - - + + {columnChartData.length > 1 && ( <> - + )} - {columnChartData.length === 1 && ( - No more level to show. - )} + {columnChartData.length === 1 && No more level to show.} - - + +   - +
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DistributedView.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DistributedView.tsx index aad14aa29828fa1a8886ab3f68c54dd62cd396f9..096501b61bc9ce41978c65dc24f6b3640ab960f3 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DistributedView.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/DistributedView.tsx @@ -2,54 +2,54 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import Grid from '@material-ui/core/Grid' -import InputLabel from '@material-ui/core/InputLabel' -import MenuItem from '@material-ui/core/MenuItem' -import Select, { SelectProps } from '@material-ui/core/Select' -import { makeStyles } from '@material-ui/core/styles' -import { Table } from 'antd' -import { ColumnsType } from 'antd/es/table' -import * as React from 'react' -import * as api from '../api' -import { DistributedGraph, GpuInfo, Graph } from '../api' -import { firstOrUndefined } from '../utils' -import { ColumnChart } from './charts/ColumnChart' -import { DataLoading } from './DataLoading' -import { GpuInfoTable } from './GpuInfoTable' -import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers' +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import Grid from '@material-ui/core/Grid'; +import InputLabel from '@material-ui/core/InputLabel'; +import MenuItem from '@material-ui/core/MenuItem'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import { makeStyles } from '@material-ui/core/styles'; +import { Table } from 'antd'; +import { ColumnsType } from 
'antd/es/table'; +import * as React from 'react'; +import * as api from '../api'; +import { DistributedGraph, GpuInfo, Graph } from '../api'; +import { firstOrUndefined } from '../utils'; +import { ColumnChart } from './charts/ColumnChart'; +import { DataLoading } from './DataLoading'; +import { GpuInfoTable } from './GpuInfoTable'; +import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers'; import { - DistributedCommopsTableTooltip, - DistributedGpuInfoTableTooltip, - DistributedOverlapGraphTooltip, - DistributedWaittimeGraphTooltip -} from './TooltipDescriptions' + distributedCommopsTableTooltip, + distributedGpuInfoTableTooltip, + distributedOverlapGraphTooltip, + distributedWaittimeGraphTooltip, +} from './TooltipDescriptions'; export interface IProps { - run: string - worker: string - span: string + run: string; + worker: string; + span: string; } const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, verticalInput: { display: 'flex', - alignItems: 'center' + alignItems: 'center', }, inputWidth: { - width: '4em' + width: '4em', }, inputWidthOverflow: { minWidth: '15em', - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, description: { - marginLeft: theme.spacing(1) + marginLeft: theme.spacing(1), }, table: { height: '100%', @@ -58,165 +58,152 @@ const useStyles = makeStyles((theme) => ({ height: 20, fontSize: '10pt', '& > td': { - padding: '0 8px!important' - } - } - } -})) + padding: '0 8px!important', + }, + }, + }, +})); export const DistributedView: React.FC = (props) => { - const tooltipCommonClasses = useTooltipCommonStyles() + const tooltipCommonClasses = useTooltipCommonStyles(); const chartHeaderRenderer = React.useMemo( () => makeChartHeaderRenderer(tooltipCommonClasses), [tooltipCommonClasses] - ) + ); - let { run, worker, span } = props - const classes = useStyles() + let { run, worker, span } = props; + const classes = useStyles(); - const [overlapGraph, setOverlapGraph] = React.useState< - 
DistributedGraph | undefined - >(undefined) - const [waittimeGraph, setWaittimeGraph] = React.useState< - DistributedGraph | undefined - >(undefined) - const [commopsTableData, setCommopsTableData] = React.useState< - any | undefined - >(undefined) - const [gpuInfo, setGpuInfo] = React.useState(undefined) - const [commopsTableTitle, setCommopsTableTitle] = React.useState('') - const [commopsWorkers, setCommopsWorkers] = React.useState([]) - const [overlapSteps, setOverlapSteps] = React.useState([]) - const [waittimeSteps, setWaittimeSteps] = React.useState([]) - const [overlapStep, setOverlapStep] = React.useState('') - const [waittimeStep, setWaittimeStep] = React.useState('') - const [commopsWorker, setCommopsWorker] = React.useState('') - const [columns, setColumns] = React.useState>([]) - const [pageSize, setPageSize] = React.useState(30) + const [overlapGraph, setOverlapGraph] = React.useState(undefined); + const [waittimeGraph, setWaittimeGraph] = React.useState(undefined); + const [commopsTableData, setCommopsTableData] = React.useState(undefined); + const [gpuInfo, setGpuInfo] = React.useState(undefined); + const [commopsTableTitle, setCommopsTableTitle] = React.useState(''); + const [commopsWorkers, setCommopsWorkers] = React.useState([]); + const [overlapSteps, setOverlapSteps] = React.useState([]); + const [waittimeSteps, setWaittimeSteps] = React.useState([]); + const [overlapStep, setOverlapStep] = React.useState(''); + const [waittimeStep, setWaittimeStep] = React.useState(''); + const [commopsWorker, setCommopsWorker] = React.useState(''); + const [columns, setColumns] = React.useState>([]); + const [pageSize, setPageSize] = React.useState(30); React.useEffect(() => { if (waittimeSteps.includes('all')) { - setWaittimeStep('all') + setWaittimeStep('all'); } else { - setWaittimeStep(firstOrUndefined(waittimeSteps) ?? '') + setWaittimeStep(firstOrUndefined(waittimeSteps) ?? 
''); } - }, [waittimeSteps]) + }, [waittimeSteps]); React.useEffect(() => { if (overlapSteps.includes('all')) { - setOverlapStep('all') + setOverlapStep('all'); } else { - setOverlapStep(firstOrUndefined(overlapSteps) ?? '') + setOverlapStep(firstOrUndefined(overlapSteps) ?? ''); } - }, [overlapSteps]) + }, [overlapSteps]); React.useEffect(() => { - setCommopsWorker(firstOrUndefined(commopsWorkers) ?? '') - }, [commopsWorkers]) + setCommopsWorker(firstOrUndefined(commopsWorkers) ?? ''); + }, [commopsWorkers]); React.useEffect(() => { api.defaultApi.distributedOverlapGet(run, 'All', span).then((resp) => { - setOverlapGraph(resp) - setOverlapSteps(Object.keys(resp.data)) - }) + setOverlapGraph(resp); + setOverlapSteps(Object.keys(resp.data)); + }); api.defaultApi.distributedWaittimeGet(run, 'All', span).then((resp) => { - setWaittimeGraph(resp) - setWaittimeSteps(Object.keys(resp.data)) - }) + setWaittimeGraph(resp); + setWaittimeSteps(Object.keys(resp.data)); + }); api.defaultApi.distributedCommopsGet(run, 'All', span).then((resp) => { - setCommopsTableData(resp.data) - setCommopsWorkers(Object.keys(resp.data)) - setCommopsTableTitle(resp.metadata.title) - }) + setCommopsTableData(resp.data); + setCommopsWorkers(Object.keys(resp.data)); + setCommopsTableTitle(resp.metadata.title); + }); api.defaultApi.distributedGpuinfoGet(run, 'All', span).then((resp) => { - setGpuInfo(resp) - }) - }, [run, worker, span]) + setGpuInfo(resp); + }); + }, [run, worker, span]); const onCommopsWorkerChanged: SelectProps['onChange'] = (event) => { - setCommopsWorker(event.target.value as string) - } + setCommopsWorker(event.target.value as string); + }; const onOverlapStepChanged: SelectProps['onChange'] = (event) => { - setOverlapStep(event.target.value as string) - } + setOverlapStep(event.target.value as string); + }; const onWaittimeStepChanged: SelectProps['onChange'] = (event) => { - setWaittimeStep(event.target.value as string) - } + setWaittimeStep(event.target.value as string); 
+ }; - const getColumnChartData = ( - distributedGraph?: DistributedGraph, - step?: string - ) => { - if (!distributedGraph || !step) return undefined - const barLabels = Object.keys(distributedGraph.data[step]) + const getColumnChartData = (distributedGraph?: DistributedGraph, step?: string): any => { + if (!distributedGraph || !step) { + return undefined; + } + const barLabels = Object.keys(distributedGraph.data[step]); return { legends: distributedGraph.metadata.legends, barLabels, - barHeights: barLabels.map((label) => distributedGraph.data[step][label]) - } - } - const overlapData = React.useMemo( - () => getColumnChartData(overlapGraph, overlapStep), - [overlapGraph, overlapStep] - ) + barHeights: barLabels.map((label) => distributedGraph.data[step][label]), + }; + }; + const overlapData = React.useMemo(() => getColumnChartData(overlapGraph, overlapStep), [overlapGraph, overlapStep]); const waittimeData = React.useMemo( () => getColumnChartData(waittimeGraph, waittimeStep), [waittimeGraph, waittimeStep] - ) + ); - const getTableData = (tableData?: any, worker?: string) => { - if (!tableData || !worker) { - return [] + const getTableData = (tableData?: any, opsWorker?: string): any[] => { + if (!tableData || !opsWorker) { + return []; } - let dataInfo: api.Graph = tableData[worker] - const stringCompare = (a: string, b: string) => a.localeCompare(b) - const numberCompare = (a: number, b: number) => a - b - let column: any[] = dataInfo.columns.map(item => { + let dataInfo: api.Graph = tableData[opsWorker]; + const stringCompare = (a: string, b: string): number => a.localeCompare(b); + const numberCompare = (a: number, b: number): number => a - b; + let column: any[] = dataInfo.columns.map((item) => { return { title: item.name, key: item.name, dataIndex: item.name, - sorter: item.type == 'string' ? 
(a: any, b: any) => stringCompare(a[item.name], b[item.name]) - : (a: any, b: any) => numberCompare(a[item.name], b[item.name]) - } - }) - setColumns(column) + sorter: + item.type === 'string' + ? (a: any, b: any): number => stringCompare(a[item.name], b[item.name]) + : (a: any, b: any): number => numberCompare(a[item.name], b[item.name]), + }; + }); + setColumns(column); return dataInfo.rows.map((row, index) => { if (row.length !== dataInfo.columns.length) { - return null + return null; } - const dataRow: { [column: string]: number | string } = { key: index } - dataInfo.columns.forEach((column, index) => { - dataRow[column.name] = row[index] as string | number - }) - return dataRow - }) - } + const dataRow: { [column: string]: number | string } = { key: index }; + dataInfo.columns.forEach((item, idx) => { + dataRow[item.name] = row[idx] as string | number; + }); + return dataRow; + }); + }; const commopsTable: any[] = React.useMemo(() => { - return getTableData(commopsTableData, commopsWorker) - }, [commopsTableData, commopsWorker]) + return getTableData(commopsTableData, commopsWorker); + }, [commopsTableData, commopsWorker]); - const onShowSizeChange = (current: number, size: number) => { - setPageSize(size) - } + const onShowSizeChange = (current: number, size: number): void => { + setPageSize(size); + }; return (
- - + + {gpuInfo && ( - + @@ -225,19 +212,15 @@ export const DistributedView: React.FC = (props) => { )} - {(chartData) => ( + {(chartData): JSX.Element => ( - + - Step + Step - {overlapSteps.map((step) => ( {step} ))} @@ -247,35 +230,25 @@ export const DistributedView: React.FC = (props) => { {overlapGraph?.metadata?.title && ( )} - + )} - {(chartData) => ( + {(chartData): JSX.Element => ( - + - Step + Step - {waittimeSteps.map((step) => ( {step} ))} @@ -285,10 +258,7 @@ export const DistributedView: React.FC = (props) => { {waittimeGraph?.metadata?.title && ( )} = (props) => { - - + + - + - Worker + Worker - + {commopsWorkers.map((item) => ( + {item} ))} @@ -338,7 +299,7 @@ export const DistributedView: React.FC = (props) => { pageSize, pageSizeOptions: ['20', '30', '50', '100'], hideOnSinglePage: true, - onShowSizeChange + onShowSizeChange, }} /> @@ -348,5 +309,5 @@ export const DistributedView: React.FC = (props) => {
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/FullCircularProgress.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/FullCircularProgress.tsx index 5212bd74bf9739cc171d369e6591a0c26f058f6a..3f4c0fbaf15a15d402aa205574a28df045d24aec 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/FullCircularProgress.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/FullCircularProgress.tsx @@ -1,23 +1,23 @@ /*--------------------------------------------------------------------------------------------- * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import CircularProgress from '@material-ui/core/CircularProgress' -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' +import CircularProgress from '@material-ui/core/CircularProgress'; +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; const useStyles = makeStyles(() => ({ root: { width: '100%', display: 'flex', - justifyContent: 'center' - } -})) + justifyContent: 'center', + }, +})); export const FullCircularProgress: React.FC = () => { - const classes = useStyles() + const classes = useStyles(); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/GpuInfoTable.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/GpuInfoTable.tsx index 4c624db0580caa466271e56505f2838637705884..07f6f1d78c88abab5f62f844356b47ca517a2561 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/GpuInfoTable.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/GpuInfoTable.tsx @@ -2,127 +2,123 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; export interface IProps { - gpuInfo: any + gpuInfo: any; } const useStyles = makeStyles((theme) => ({ root: { border: '1px solid #E0E0E0', borderCollapse: 'collapse', - width: '100%' + width: '100%', }, td: { borderTop: '1px solid #E0E0E0', borderBottom: '1px solid #E0E0E0', borderCollapse: 'collapse', paddingLeft: 10, - paddingRight: 10 + paddingRight: 10, }, nodeTd: { - fontWeight: 'bold' + fontWeight: 'bold', }, pidTd: { - fontWeight: 'normal' + fontWeight: 'normal', }, gpuTd: { - fontWeight: 'normal' + fontWeight: 'normal', }, keyTd: { fontWeight: 'normal', - textAlign: 'right' + textAlign: 'right', }, valueTd: { - fontWeight: 'bold' - } -})) + fontWeight: 'bold', + }, +})); interface TableCellInfo { - content: string - rowspan: number - cellType: 'node' | 'pid' | 'gpu' | 'key' | 'value' - last?: boolean + content: string; + rowspan: number; + cellType: 'node' | 'pid' | 'gpu' | 'key' | 'value'; + last?: boolean; } function makeTableCellInfo(gpuInfo: any): TableCellInfo[][] { - const rows: TableCellInfo[][] = [] - let curr_row: TableCellInfo[] = [] - rows.push(curr_row) - Object.keys(gpuInfo.data).forEach(function (node_name) { - const node_cell = { - content: node_name, + const rows: 
TableCellInfo[][] = []; + let currRow: TableCellInfo[] = []; + rows.push(currRow); + Object.keys(gpuInfo.data).forEach((nodeName) => { + const nodeCell = { + content: nodeName, rowspan: 0, - cellType: 'node' as const - } - const i = rows.length - curr_row.push(node_cell) - Object.keys(gpuInfo.data[node_name]).forEach(function (pid) { - const pid_cell = { content: pid, rowspan: 0, cellType: 'pid' as const } - const i = rows.length - curr_row.push(pid_cell) - Object.keys(gpuInfo.data[node_name][pid]).forEach(function (gpu) { - const gpu_cell = { content: gpu, rowspan: 0, cellType: 'gpu' as const } - const i = rows.length - curr_row.push(gpu_cell) - Object.keys(gpuInfo.data[node_name][pid][gpu]).forEach(function ( - key_name - ) { - curr_row.push({ - content: key_name, + cellType: 'node' as const, + }; + const i = rows.length; + currRow.push(nodeCell); + Object.keys(gpuInfo.data[nodeName]).forEach((pid) => { + const pidCell = { content: pid, rowspan: 0, cellType: 'pid' as const }; + const j = rows.length; + currRow.push(pidCell); + Object.keys(gpuInfo.data[nodeName][pid]).forEach((gpu) => { + const gpuCell = { content: gpu, rowspan: 0, cellType: 'gpu' as const }; + const k = rows.length; + currRow.push(gpuCell); + Object.keys(gpuInfo.data[nodeName][pid][gpu]).forEach((keyName) => { + currRow.push({ + content: keyName, rowspan: 1, - cellType: 'key' as const - }) - const value: string = gpuInfo.data[node_name][pid][gpu][key_name] - curr_row.push({ + cellType: 'key' as const, + }); + const value: string = gpuInfo.data[nodeName][pid][gpu][keyName]; + currRow.push({ content: value, rowspan: 1, - cellType: 'value' as const - }) - curr_row = [] - rows.push(curr_row) - }) - gpu_cell.rowspan = rows.length - i - }) - pid_cell.rowspan = rows.length - i - }) - node_cell.rowspan = rows.length - i - }) - rows.pop() - return rows + cellType: 'value' as const, + }); + currRow = []; + rows.push(currRow); + }); + gpuCell.rowspan = rows.length - k; + }); + pidCell.rowspan = rows.length 
- j; + }); + nodeCell.rowspan = rows.length - i; + }); + rows.pop(); + return rows; } export const GpuInfoTable: React.FC = (props) => { - const classes = useStyles() - interface TableCellInfo { - content: string - rowspan: number - cellType: 'node' | 'pid' | 'gpu' | 'key' | 'value' + const classes = useStyles(); + interface TableCellInfoNoLast { + content: string; + rowspan: number; + cellType: 'node' | 'pid' | 'gpu' | 'key' | 'value'; } - const rows = React.useMemo(() => makeTableCellInfo(props.gpuInfo), [ - props.gpuInfo - ]) + const rows = React.useMemo(() => makeTableCellInfo(props.gpuInfo), [props.gpuInfo]); const cellToClass = { node: classes.nodeTd, pid: classes.pidTd, gpu: classes.gpuTd, key: classes.keyTd, - value: classes.valueTd - } + value: classes.valueTd, + }; - const renderCell = function (info: TableCellInfo) { - let cellClass = cellToClass[info.cellType] - let content = info.cellType == 'key' ? info.content + ':' : info.content + const renderCell = function (info: TableCellInfoNoLast): JSX.Element { + let cellClass = cellToClass[info.cellType]; + let content = info.cellType === 'key' ? `${info.content}:` : info.content; return ( -
- ) - } + ); + }; return (
+ {content}
@@ -130,5 +126,5 @@ export const GpuInfoTable: React.FC = (props) => { {row.map(renderCell)} ))}
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Kernel.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Kernel.tsx index 62ec350b8b400a03bd64c032ee2a61a4ca9a1852..66e05695153a853f68d382a2f3b6a68931861abf 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Kernel.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Kernel.tsx @@ -15,208 +15,183 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import FormControlLabel from '@material-ui/core/FormControlLabel' -import Grid from '@material-ui/core/Grid' -import InputLabel from '@material-ui/core/InputLabel' -import MenuItem from '@material-ui/core/MenuItem' -import Radio from '@material-ui/core/Radio' -import RadioGroup, { RadioGroupProps } from '@material-ui/core/RadioGroup' -import Select, { SelectProps } from '@material-ui/core/Select' -import { makeStyles } from '@material-ui/core/styles' -import TextField, { - StandardTextFieldProps, - TextFieldProps -} from '@material-ui/core/TextField' -import * as React from 'react' -import * as api from '../api' -import { Graph } from '../api' -import { KernelGroupBy } from '../constants/groupBy' -import { useSearch } from '../utils/search' -import { topIsValid, UseTop, useTopN } from '../utils/top' -import { AntTableChart } from './charts/AntTableChart' -import { PieChart } from './charts/PieChart' -import { DataLoading } from './DataLoading' -import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers' +import Card from 
'@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import FormControlLabel from '@material-ui/core/FormControlLabel'; +import Grid from '@material-ui/core/Grid'; +import InputLabel from '@material-ui/core/InputLabel'; +import MenuItem from '@material-ui/core/MenuItem'; +import Radio from '@material-ui/core/Radio'; +import RadioGroup, { RadioGroupProps } from '@material-ui/core/RadioGroup'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import { makeStyles } from '@material-ui/core/styles'; +import TextField, { StandardTextFieldProps, TextFieldProps } from '@material-ui/core/TextField'; +import * as React from 'react'; +import * as api from '../api'; +import { Graph } from '../api'; +import { KernelGroupBy } from '../constants/groupBy'; +import { useSearch } from '../utils/search'; +import { topIsValid, UseTop, useTopN } from '../utils/top'; +import { AntTableChart } from './charts/AntTableChart'; +import { PieChart } from './charts/PieChart'; +import { DataLoading } from './DataLoading'; +import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers'; import { - GPUKernelTotalTimeTooltip, - TensorCoresPieChartTooltip, - TensorCoresPieChartTooltipAscend -} from './TooltipDescriptions' + gpuKernelTotalTimeTooltip, + tensorCoresPieChartTooltip, + tensorCoresPieChartTooltipAscend, +} from './TooltipDescriptions'; export interface IProps { - run: string - worker: string - span: string - deviceTarget: string + run: string; + worker: string; + span: string; + deviceTarget: string; } const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, verticalInput: { display: 'flex', - alignItems: 'center' + alignItems: 'center', }, inputWidth: { - width: '4em' + width: '4em', }, inputWidthOverflow: { minWidth: '15em', - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, description: { - marginLeft: theme.spacing(1) - } -})) + marginLeft: 
theme.spacing(1), + }, +})); export const Kernel: React.FC = (props) => { - const { run, worker, span, deviceTarget } = props - const classes = useStyles() - const tooltipCommonClasses = useTooltipCommonStyles() + const { run, worker, span, deviceTarget } = props; + const classes = useStyles(); + const tooltipCommonClasses = useTooltipCommonStyles(); const chartHeaderRenderer = React.useMemo( () => makeChartHeaderRenderer(tooltipCommonClasses), [tooltipCommonClasses] - ) + ); - const [kernelGraph, setKernelGraph] = React.useState( - undefined - ) - const [tcGraph, setTcGraph] = React.useState(undefined) - const [kernelTable, setKernelTable] = React.useState( - undefined - ) - const [groupBy, setGroupBy] = React.useState(KernelGroupBy.Kernel) - const [searchKernelName, setSearchKernelName] = React.useState('') - const [searchOpName, setSearchOpName] = React.useState('') - const [sortColumn, setSortColumn] = React.useState('') - const [hasStep, setHasStep] = React.useState(false) + const [kernelGraph, setKernelGraph] = React.useState(undefined); + const [tcGraph, setTcGraph] = React.useState(undefined); + const [kernelTable, setKernelTable] = React.useState(undefined); + const [groupBy, setGroupBy] = React.useState(KernelGroupBy.KERNEL); + const [searchKernelName, setSearchKernelName] = React.useState(''); + const [searchOpName, setSearchOpName] = React.useState(''); + const [sortColumn, setSortColumn] = React.useState(''); + const [hasStep, setHasStep] = React.useState(false); const [topText, actualTop, useTop, setTopText, setUseTop] = useTopN({ - defaultUseTop: UseTop.Use, - defaultTop: 10 - }) + defaultUseTop: UseTop.USE, + defaultTop: 10, + }); React.useEffect(() => { - setSearchOpName('') - }, [groupBy]) + setSearchOpName(''); + }, [groupBy]); React.useEffect(() => { if (kernelGraph) { - setTopText(String(Math.min(kernelGraph.rows?.length, 10))) + setTopText(String(Math.min(kernelGraph.rows?.length, 10))); } - }, [kernelGraph]) + }, [kernelGraph]); 
React.useEffect(() => { api.defaultApi.kernelTableGet(run, worker, span, groupBy).then((resp) => { - setSortColumn(resp.metadata.sort) - setKernelTable(resp.data) - const nameColumnIdx = resp.data.columns.findIndex( - (c) => c.name.toLowerCase() === 'step id' - ) - setHasStep(nameColumnIdx > -1) - }) - }, [run, worker, span, groupBy]) + setSortColumn(resp.metadata.sort); + setKernelTable(resp.data); + const nameColumnIdx = resp.data.columns.findIndex((c) => c.name.toLowerCase() === 'step id'); + setHasStep(nameColumnIdx > -1); + }); + }, [run, worker, span, groupBy]); React.useEffect(() => { - api.defaultApi - .kernelGet(run, worker, span, KernelGroupBy.Kernel) - .then((resp) => { - setKernelGraph(resp.total) - setGroupBy(resp.device_target === 'Ascend' ? KernelGroupBy.KernelNameAndOpName : KernelGroupBy.Kernel) - }) - }, [run, worker, span]) + api.defaultApi.kernelGet(run, worker, span, KernelGroupBy.KERNEL).then((resp) => { + setKernelGraph(resp.total); + setGroupBy(resp.device_target === 'Ascend' ? KernelGroupBy.KERNEL_NAME_AND_OP_NAME : KernelGroupBy.KERNEL); + }); + }, [run, worker, span]); React.useEffect(() => { api.defaultApi.kernelTcPieGet(run, worker, span).then((resp) => { - setTcGraph(resp.total) - }) - }, [run, worker, span]) + setTcGraph(resp.total); + }); + }, [run, worker, span]); - const [searchedKernelTable] = useSearch(searchKernelName, 'name', kernelTable) + const [searchedKernelTable] = useSearch(searchKernelName, 'name', kernelTable); const [searchedOpTable] = useSearch( searchOpName, deviceTarget === 'Ascend' ? 
'step id' : 'operator', searchedKernelTable - ) + ); const onGroupByChanged: SelectProps['onChange'] = (event) => { - setGroupBy(event.target.value as KernelGroupBy) - } + setGroupBy(event.target.value as KernelGroupBy); + }; const onSearchKernelChanged: TextFieldProps['onChange'] = (event) => { - setSearchKernelName(event.target.value as string) - } + setSearchKernelName(event.target.value as string); + }; const onSearchOpChanged: TextFieldProps['onChange'] = (event) => { - setSearchOpName(event.target.value as string) - } + setSearchOpName(event.target.value as string); + }; const onUseTopChanged: RadioGroupProps['onChange'] = (event) => { - setUseTop(event.target.value as UseTop) - } + setUseTop(event.target.value as UseTop); + }; - const onTopChanged = (event: React.ChangeEvent) => { - setTopText(event.target.value) - } + const onTopChanged = (event: React.ChangeEvent): void => { + setTopText(event.target.value); + }; const inputProps: StandardTextFieldProps['inputProps'] = { - min: 1 - } + min: 1, + }; const GPUKernelTotalTimeTitle = React.useMemo( - () => chartHeaderRenderer('Total Time (us)', GPUKernelTotalTimeTooltip), + () => chartHeaderRenderer('Total Time (us)', gpuKernelTotalTimeTooltip), [chartHeaderRenderer] - ) + ); const TensorCoresTitle = React.useMemo( - () => deviceTarget === 'Ascend' ? - chartHeaderRenderer( - 'Accelerator Core Utilization', - TensorCoresPieChartTooltipAscend - ) - : - chartHeaderRenderer( - 'Tensor Cores Utilization', - TensorCoresPieChartTooltip - ), + () => + deviceTarget === 'Ascend' + ? chartHeaderRenderer('Accelerator Core Utilization', tensorCoresPieChartTooltipAscend) + : chartHeaderRenderer('Tensor Cores Utilization', tensorCoresPieChartTooltip), [chartHeaderRenderer, deviceTarget] - ) + ); return (
- - + + - } - label="All kernels" - /> - } - label="Top kernels to show" - /> + } label='All kernels' /> + } label='Top kernels to show' /> - {useTop === UseTop.Use && ( + {useTop === UseTop.USE && ( = (props) => { - {(graph) => ( + {(graph): JSX.Element => ( - + )} - {(graph) => ( + {(graph): JSX.Element => ( = (props) => { graph={graph} colors={['#0099C6', '#DD4477', '#66AA00', '#B82E2E']} top={actualTop} - tooltip_mode="percentage" + tooltipMode='percentage' /> )} - + - + - Group By - + {deviceTarget === 'Ascend' ? 'Statistic' : 'Kernel Properties + Op Name'} - + {deviceTarget === 'Ascend' ? 'All' : 'Kernel Name'} @@ -279,50 +246,49 @@ export const Kernel: React.FC = (props) => { classes={{ root: classes.inputWidthOverflow }} value={searchKernelName} onChange={onSearchKernelChanged} - type="search" - label="Search by Name" + type='search' + label='Search by Name' inputProps={{ - maxLength: 200 + maxLength: 200, }} /> - {deviceTarget === 'Ascend' ? - (groupBy === KernelGroupBy.Kernel && hasStep && - - - ) - : - (groupBy === KernelGroupBy.KernelNameAndOpName && - - - ) - } + {deviceTarget === 'Ascend' + ? groupBy === KernelGroupBy.KERNEL && + hasStep && ( + + + + ) + : groupBy === KernelGroupBy.KERNEL_NAME_AND_OP_NAME && ( + + + + )} - {(graph) => ( - - )} + {(graph): JSX.Element => } @@ -331,5 +297,5 @@ export const Kernel: React.FC = (props) => {
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/MemoryView.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/MemoryView.tsx index a8f6c458eae79adf09371fcb73ecb29d1a62d067..225f28a931e969d7cfd40d3f490e7cb45c64a305 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/MemoryView.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/MemoryView.tsx @@ -15,22 +15,22 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import Grid from '@material-ui/core/Grid' -import InputLabel from '@material-ui/core/InputLabel' -import MenuItem from '@material-ui/core/MenuItem' -import Select, { SelectProps } from '@material-ui/core/Select' -import Slider from '@material-ui/core/Slider' -import { makeStyles } from '@material-ui/core/styles' -import TextField, { TextFieldProps } from '@material-ui/core/TextField' -import * as React from 'react' -import * as api from '../api' +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import Grid from '@material-ui/core/Grid'; +import InputLabel from '@material-ui/core/InputLabel'; +import MenuItem from '@material-ui/core/MenuItem'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import Slider from '@material-ui/core/Slider'; +import { makeStyles } from '@material-ui/core/styles'; +import TextField, { TextFieldProps } from '@material-ui/core/TextField'; +import * as React from 'react'; +import * as api 
from '../api'; import { Graph, GraphAscend, @@ -39,288 +39,237 @@ import { MemoryCurveDataAscend, MemoryEventsData, MemoryEventsDataAll, - MemoryStatsData -} from '../api' -import { useSearchDirectly } from '../utils/search' -import { AntTableChart } from './charts/AntTableChart' -import { LineChart } from './charts/NewLineChart' -import { DataLoading } from './DataLoading' -import { MemoryStatsTable } from './tables/MemoryStatsTable' + MemoryStatsData, +} from '../api'; +import { useSearchDirectly } from '../utils/search'; +import { AntTableChart } from './charts/AntTableChart'; +import { LineChart } from './charts/NewLineChart'; +import { DataLoading } from './DataLoading'; +import { MemoryStatsTable } from './tables/MemoryStatsTable'; const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, curve: { - marginBottom: 20 + marginBottom: 20, }, verticalInput: { display: 'flex', - alignItems: 'center' + alignItems: 'center', }, inputWidth: { - width: '4em' + width: '4em', }, inputWidthOverflow: { minWidth: '15em', - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, full: { - width: '100%' + width: '100%', }, description: { - marginLeft: theme.spacing(1) + marginLeft: theme.spacing(1), }, filterSlider: { marginTop: 15, marginRight: 6, - width: 250 + width: 250, }, filterInput: { - width: 100 - } -})) + width: 100, + }, +})); export interface IProps { - run: string - worker: string - span: string - deviceTarget: string + run: string; + worker: string; + span: string; + deviceTarget: string; } -const tags = ['Operator', 'Component'] +const tags = ['Operator', 'Component']; export const MemoryView: React.FC = React.memo((props) => { interface EventSizeFilter { - [deviceName: string]: Array + [deviceName: string]: Array; } interface MaxEventSize { - [deviceName: string]: number + [deviceName: string]: number; } - const { run, worker, span, deviceTarget } = props - const classes = useStyles() + const { run, worker, span, deviceTarget } = props; + 
const classes = useStyles(); - const [memoryStatsData, setMemoryStatsData] = React.useState< - MemoryStatsData | undefined - >(undefined) + const [memoryStatsData, setMemoryStatsData] = React.useState(undefined); // for backward compatability, old profile do not have events to show - const showEvents = () => { - return memoryEventsData && Object.keys(memoryEventsData.rows).length != 0 - } - const [memoryEventsData, setMemoryEventsData] = React.useState< - MemoryEventsData | undefined - >(undefined) + const showEvents = (): boolean | undefined => { + return memoryEventsData && Object.keys(memoryEventsData.rows).length !== 0; + }; + const [memoryEventsData, setMemoryEventsData] = React.useState(undefined); // for backward compatability, old profile do not have curve to show - const showCurve = () => { - return memoryCurveData && Object.keys(memoryCurveData.rows).length != 0 - } - const [memoryCurveData, setMemoryCurveData] = React.useState< - MemoryCurveData | MemoryCurveDataAscend | undefined - >(undefined) - - const [lineChartData, setLineChartData] = React.useState( + const showCurve = (): boolean | undefined => { + return memoryCurveData && Object.keys(memoryCurveData.rows).length !== 0; + }; + const [memoryCurveData, setMemoryCurveData] = React.useState( undefined - ) + ); + + const [lineChartData, setLineChartData] = React.useState(undefined); - const [devices, setDevices] = React.useState([]) - const [device, setDevice] = React.useState('') - const [tag, setTag] = React.useState('Operator') - const memoryCurveDataAllRef = React.useRef(undefined) - const memoryEventDataAllRef = React.useRef(undefined) + const [devices, setDevices] = React.useState([]); + const [device, setDevice] = React.useState(''); + const [tag, setTag] = React.useState('Operator'); + const memoryCurveDataAllRef = React.useRef(undefined); + const memoryEventDataAllRef = React.useRef(undefined); interface SelectedRange { - start: number - end: number - startTs: number - endTs: number + start: 
number; + end: number; + startTs: number; + endTs: number; } - const [selectedRange, setSelectedRange] = React.useState< - SelectedRange | undefined - >() - const [searchOperatorName, setSearchOperatorName] = React.useState('') - const [searchEventOperatorName, setSearchEventOperatorName] = React.useState( - '' - ) - const [filterEventSize, setFilterEventSize] = React.useState( - {} - ) - const [maxSize, setMaxSize] = React.useState({}) - - const getSearchIndex = function () { + const [selectedRange, setSelectedRange] = React.useState(); + const [searchOperatorName, setSearchOperatorName] = React.useState(''); + const [searchEventOperatorName, setSearchEventOperatorName] = React.useState(''); + const [filterEventSize, setFilterEventSize] = React.useState({}); + const [maxSize, setMaxSize] = React.useState({}); + + const getSearchIndex = function (): number { if (!memoryStatsData) { - return -1 + return -1; } for (let i = 0; i < memoryStatsData.columns.length; i++) { - if (memoryStatsData.columns[i].name == memoryStatsData.metadata.search) { - return i + if (memoryStatsData.columns[i].name === memoryStatsData.metadata.search) { + return i; } } - return -1 - } + return -1; + }; - const getStep = (size: number, indexBias: number) => { - return 10 ** (Math.floor(Math.log10(size != 0 ? size : 1)) - indexBias) - } + const getStep = (size: number, indexBias: number): number => { + return 10 ** (Math.floor(Math.log10(size !== 0 ? 
size : 1)) - indexBias); + }; - const filterByEventSize = ( - rows: T[] | undefined, - size: Array - ) => { + const filterByEventSize = (rows: T[] | undefined, size: Array): T[] | undefined => { const result = React.useMemo(() => { if (!rows) { - return undefined + return undefined; } // workaround type system const field = (row: any): number => { - const sizeColIndex = 1 - return row[sizeColIndex] - } + const sizeColIndex = 1; + return row[sizeColIndex]; + }; return rows.filter((row) => { - return field(row) >= size[0] && field(row) <= size[1] - }) - }, [rows, size]) + return field(row) >= size[0] && field(row) <= size[1]; + }); + }, [rows, size]); - return result - } + return result; + }; - const searchIndex = getSearchIndex() - const getName = React.useCallback((row: any) => row[searchIndex], [ - searchIndex - ]) - const getNameAscend = (row: any) => row[0] - const [searchedTableDataRows] = useSearchDirectly( - searchOperatorName, - getName, - memoryStatsData?.rows[device] ?? [] - ) + const searchIndex = getSearchIndex(); + const getName = React.useCallback((row: any) => row[searchIndex], [searchIndex]); + const getNameAscend = (row: any): any => row[0]; + const [searchedTableDataRows] = useSearchDirectly(searchOperatorName, getName, memoryStatsData?.rows[device] ?? []); const [searchedEventsTableDataRows] = useSearchDirectly( searchEventOperatorName, deviceTarget === 'Ascend' ? getNameAscend : getName, - filterByEventSize( - memoryEventsData?.rows[device], - filterEventSize[device] ?? [0, Infinity] - ) ?? [] - ) + filterByEventSize(memoryEventsData?.rows[device], filterEventSize[device] ?? [0, Infinity]) ?? 
[] + ); const onSearchOperatorChanged: TextFieldProps['onChange'] = (event) => { - setSearchOperatorName(event.target.value as string) - } + setSearchOperatorName(event.target.value as string); + }; const onSearchEventOperatorChanged: TextFieldProps['onChange'] = (event) => { - setSearchEventOperatorName(event.target.value as string) - } + setSearchEventOperatorName(event.target.value as string); + }; - const [selectedRecord, setSelectedRecord] = React.useState() - const onRowSelected = (record?: object, rowIndex?: number) => { - setSelectedRecord(record) - } + const [selectedRecord, setSelectedRecord] = React.useState(); + const onRowSelected = (record?: object, rowIndex?: number): void => { + setSelectedRecord(record); + }; - const onFilterEventSizeChanged = ( - event: any, - newValue: number | number[] - ) => { + const onFilterEventSizeChanged = (event: any, newValue: number | number[]): void => { setFilterEventSize({ ...filterEventSize, - [device]: newValue as number[] - }) - } + [device]: newValue as number[], + }); + }; - const onFilterEventMinSizeInputChanged = ( - event: React.ChangeEvent - ) => { + const onFilterEventMinSizeInputChanged = (event: React.ChangeEvent): void => { setFilterEventSize({ ...filterEventSize, - [device]: [Number(event.target.value), filterEventSize[device][1]] - }) - } + [device]: [Number(event.target.value), filterEventSize[device][1]], + }); + }; - const onFilterEventMaxSizeInputChanged = ( - event: React.ChangeEvent - ) => { + const onFilterEventMaxSizeInputChanged = (event: React.ChangeEvent): void => { setFilterEventSize({ ...filterEventSize, - [device]: [filterEventSize[device][0], Number(event.target.value)] - }) - } + [device]: [filterEventSize[device][0], Number(event.target.value)], + }); + }; React.useEffect(() => { - deviceTarget !== 'Ascend' && api.defaultApi - .memoryGet( - run, - worker, - span, - selectedRange?.startTs, - selectedRange?.endTs - ) - .then((resp) => { - setMemoryStatsData(resp) - if (!devices || 
devices.length == 0) { + if (deviceTarget !== 'Ascend') { + api.defaultApi.memoryGet(run, worker, span, selectedRange?.startTs, selectedRange?.endTs).then((resp) => { + setMemoryStatsData(resp); + if (!devices || devices.length === 0) { // setDevices only execute on view load. Since selection on curve // might filter all events later, some devices might is missing. - setDevices(Object.keys(resp.rows)) - setDevice(resp.metadata.default_device) + setDevices(Object.keys(resp.rows)); + setDevice(resp.metadata.default_device); } - }) - }, [run, worker, span, selectedRange]) + }); + } + }, [run, worker, span, selectedRange]); React.useEffect(() => { - api.defaultApi - .memoryEventsGet( - run, - worker, - span, - selectedRange?.startTs, - selectedRange?.endTs - ) - .then((resp) => { - const tempRes = deviceTarget === 'Ascend' ? (resp as MemoryEventsDataAll).operator : resp as MemoryEventsData - if (deviceTarget === 'Ascend') { - memoryEventDataAllRef.current = resp as MemoryEventsDataAll - } - let curMaxSize: MaxEventSize = {} - let curFilterEventSize: EventSizeFilter = {} - for (let deviceName in tempRes.rows) { - curMaxSize[deviceName] = 0 - for (let i = 0; i < tempRes.rows[deviceName].length; i++) { - curMaxSize[deviceName] = Math.max( - curMaxSize[deviceName], - tempRes.rows[deviceName][i][1] - ) - } - curFilterEventSize[deviceName] = [ - curMaxSize[deviceName] / 4, - curMaxSize[deviceName] - ] - curMaxSize[deviceName] = curMaxSize[deviceName] + api.defaultApi.memoryEventsGet(run, worker, span, selectedRange?.startTs, selectedRange?.endTs).then((resp) => { + const tempRes = deviceTarget === 'Ascend' ? 
(resp as MemoryEventsDataAll).operator : (resp as MemoryEventsData); + if (deviceTarget === 'Ascend') { + memoryEventDataAllRef.current = resp as MemoryEventsDataAll; + } + let curMaxSize: MaxEventSize = {}; + let curFilterEventSize: EventSizeFilter = {}; + Object.keys(tempRes.rows).forEach((deviceName) => { + curMaxSize[deviceName] = 0; + for (let i = 0; i < tempRes.rows[deviceName].length; i++) { + curMaxSize[deviceName] = Math.max(curMaxSize[deviceName], tempRes.rows[deviceName][i][1]); } - setMaxSize(curMaxSize) - setFilterEventSize(curFilterEventSize) - setMemoryEventsData(tempRes) - }) - }, [run, worker, span, selectedRange]) + curFilterEventSize[deviceName] = [curMaxSize[deviceName] / 4, curMaxSize[deviceName]]; + curMaxSize[deviceName] = curMaxSize[deviceName]; + }); + setMaxSize(curMaxSize); + setFilterEventSize(curFilterEventSize); + setMemoryEventsData(tempRes); + }); + }, [run, worker, span, selectedRange]); React.useEffect(() => { api.defaultApi.memoryCurveGet(run, worker, span).then((resp) => { // Reset the select range to null whenever run/worker/span changes - setSelectedRange(undefined) + setSelectedRange(undefined); if (deviceTarget === 'Ascend') { - const allCurveData = resp as MemoryCurveDataAll - memoryCurveDataAllRef.current = allCurveData - setDevice(allCurveData.default_device) - setDevices(allCurveData.devices) - setMemoryCurveData(allCurveData.total) - setTag('Operator') + const allCurveData = resp as MemoryCurveDataAll; + memoryCurveDataAllRef.current = allCurveData; + setDevice(allCurveData.default_device); + setDevices(allCurveData.devices); + setMemoryCurveData(allCurveData.total); + setTag('Operator'); } else { - setMemoryCurveData(resp as MemoryCurveData) + setMemoryCurveData(resp as MemoryCurveData); } - }) - }, [run, worker, span]) + }); + }, [run, worker, span]); React.useEffect(() => { if (memoryCurveData !== undefined) { @@ -328,127 +277,118 @@ export const MemoryView: React.FC = React.memo((props) => { setLineChartData({ title: 
memoryCurveData.metadata.peaks[device] ?? '', columns: memoryCurveData.columns[device] ?? [], - rows: memoryCurveData.rows[device] ?? {} - }) + rows: memoryCurveData.rows[device] ?? {}, + }); } else { setLineChartData({ title: memoryCurveData.metadata.peaks[device], columns: memoryCurveData.columns, - rows: memoryCurveData.rows[device] ?? [] - }) + rows: memoryCurveData.rows[device] ?? [], + }); } } - }, [memoryCurveData, device]) + }, [memoryCurveData, device]); const onDeviceChanged: SelectProps['onChange'] = (event) => { - setDevice(event.target.value as string) - setSelectedRange(undefined) - } + setDevice(event.target.value as string); + setSelectedRange(undefined); + }; const onTagChanged: SelectProps['onChange'] = (event) => { - setTag(event.target.value as string) + setTag(event.target.value as string); if (event.target.value === 'Operator') { - setMemoryCurveData(memoryCurveDataAllRef.current?.total) - setMemoryEventsData(memoryEventDataAllRef.current?.operator) - setSelectedRange(undefined) + setMemoryCurveData(memoryCurveDataAllRef.current?.total); + setMemoryEventsData(memoryEventDataAllRef.current?.operator); + setSelectedRange(undefined); } else { - setMemoryCurveData(memoryCurveDataAllRef.current?.ptaGe) - setMemoryEventsData(memoryEventDataAllRef.current?.component) + setMemoryCurveData(memoryCurveDataAllRef.current?.ptaGe); + setMemoryEventsData(memoryEventDataAllRef.current?.component); } - } + }; - const onSelectedRangeChanged = (start: number, end: number) => { + const onSelectedRangeChanged = (start: number, end: number): void => { if (start > end) { - setSelectedRange(undefined) - return + setSelectedRange(undefined); + return; } - let allDatas = deviceTarget === 'Ascend' ? - memoryCurveData?.rows[device]?.Allocated : memoryCurveData?.rows[device] + let allDatas = deviceTarget === 'Ascend' ? 
memoryCurveData?.rows[device]?.Allocated : memoryCurveData?.rows[device]; if (allDatas.length <= 1) { - setSelectedRange(undefined) - return + setSelectedRange(undefined); + return; } - let startTs = 0 - let endTs = 0 - let realStart = 0 - let realEnd = 0 - let startId = 1 - let endId = 0 - let needLoopStart = true + let startTs = 0; + let endTs = 0; + let realStart = 0; + let realEnd = 0; + let startId = 1; + let endId = 0; + let needLoopStart = true; for (let i = 1; i < allDatas.length; i++) { if (startId > start && needLoopStart) { - needLoopStart = false - realStart = i - 1 + needLoopStart = false; + realStart = i - 1; } if (allDatas[i][0] !== allDatas[i - 1][0]) { if (startId <= start) { - startId += 1 + startId += 1; } - endId += 1 + endId += 1; } if (endId > end) { - realEnd = i - 1 - break + realEnd = i - 1; + break; } else { - realEnd = i + realEnd = i; if (needLoopStart) { - realStart = i + realStart = i; } } } if (deviceTarget === 'Ascend') { - startTs = allDatas[realStart][0] - endTs = allDatas[realEnd][0] + startTs = allDatas[realStart][0]; + endTs = allDatas[realEnd][0]; } else { - let bias = memoryCurveData?.metadata.first_ts ?? 0 - let scale = 1 / (memoryCurveData?.metadata.time_factor ?? 1) - startTs = Math.round(allDatas[realStart][0] * scale + bias) - endTs = Math.round(allDatas[realEnd][0] * scale + bias) + let bias = memoryCurveData?.metadata.first_ts ?? 0; + let scale = 1 / (memoryCurveData?.metadata.time_factor ?? 1); + startTs = Math.round((allDatas[realStart][0] * scale) + bias); + endTs = Math.round((allDatas[realEnd][0] * scale) + bias); } - setSelectedRange({ start, end, startTs, endTs }) - } + setSelectedRange({ start, end, startTs, endTs }); + }; return (
- - + + - + - {(graph) => ( - + {(graph): JSX.Element => ( + - Device - + {devices.map((item) => ( + {item} ))} - {deviceTarget === 'Ascend' && + {deviceTarget === 'Ascend' && ( - Group By - + {tags.map((item) => ( + {item} ))} - } + )} {showCurve() && lineChartData && lineChartData.columns.length > 0 && ( @@ -471,28 +411,28 @@ export const MemoryView: React.FC = React.memo((props) => { {showEvents() && ( <> - {(deviceTarget !== 'Ascend' || tag === 'Operator') && + {(deviceTarget !== 'Ascend' || tag === 'Operator') && ( - + - + = React.memo((props) => { min: 0, max: filterEventSize[device]?.[1] ?? 0, type: 'number', - 'aria-labelledby': 'input-slider' + 'aria-labelledby': 'input-slider', }} /> @@ -509,7 +449,7 @@ export const MemoryView: React.FC = React.memo((props) => { className={classes.filterSlider} value={filterEventSize[device] ?? [0, 0]} onChange={onFilterEventSizeChanged} - aria-labelledby="input-slider" + aria-labelledby='input-slider' min={0} max={maxSize[device] ?? 0} step={getStep(maxSize[device] ?? 0, 5)} @@ -518,7 +458,7 @@ export const MemoryView: React.FC = React.memo((props) => { = React.memo((props) => { min: filterEventSize[device]?.[0] ?? 0, max: maxSize[device] ?? 0, type: 'number', - 'aria-labelledby': 'input-slider' + 'aria-labelledby': 'input-slider', }} /> - } - + )} + - {(data) => { + {(data): JSX.Element => { return ( - ) + ); }} @@ -555,29 +494,29 @@ export const MemoryView: React.FC = React.memo((props) => { )} {deviceTarget !== 'Ascend' && ( <> - - + + - + - {(data) => ( + {(data): JSX.Element => ( )} @@ -588,5 +527,5 @@ export const MemoryView: React.FC = React.memo((props) => {
- ) -}) + ); +}); diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/ModuleView.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/ModuleView.tsx index 396188aba4e69cced5208ff4af86631bf02e172c..a66a825365fd3c813e58865c609643ab547b4c49 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/ModuleView.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/ModuleView.tsx @@ -1,241 +1,227 @@ /*--------------------------------------------------------------------------------------------- * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardHeader from '@material-ui/core/CardHeader' -import InputLabel from '@material-ui/core/InputLabel' -import MenuItem from '@material-ui/core/MenuItem' -import Select, { SelectProps } from '@material-ui/core/Select' -import { makeStyles } from '@material-ui/core/styles' -import { Table } from 'antd' -import * as React from 'react' -import { FlameGraph } from 'react-flame-graph' -import { - defaultApi, - KeyedColumn, - ModuleStats, - ModuleViewData, - OperatorNode -} from '../api' +import Card from '@material-ui/core/Card'; +import CardHeader from '@material-ui/core/CardHeader'; +import InputLabel from '@material-ui/core/InputLabel'; +import MenuItem from '@material-ui/core/MenuItem'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import { makeStyles } from '@material-ui/core/styles'; +import { message, Table } from 'antd'; +import * as React from 'react'; +import { FlameGraph } from 'react-flame-graph'; +import { defaultApi, KeyedColumn, ModuleStats, ModuleViewData, OperatorNode } from '../api'; const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, hide: { - display: 'none' - } -})) + display: 'none', + }, +})); export interface IProps { - run: string - worker: string - span: string 
+ run: string; + worker: string; + span: string; } -const getKeyedTableColumns = (columns: KeyedColumn[]) => { +const getKeyedTableColumns = (columns: KeyedColumn[]): any[] => { return columns.map((col) => { return { dataIndex: col.key, key: col.key, - title: col.name - } - }) -} + title: col.name, + }; + }); +}; -const getTableRows = (key: number, rows: ModuleStats[]) => { +const getTableRows = (key: number, rows: ModuleStats[]): any[] => { + let initialKey = key; return rows.map((row) => { + const currentKey = initialKey++; const data: any = { - key: key++, + key: currentKey, name: row.name, occurences: row.occurences, operators: row.operators, host_duration: row.host_duration, self_host_duration: row.self_host_duration, device_duration: row.device_duration, - self_device_duration: row.self_device_duration - } + self_device_duration: row.self_device_duration, + }; if (row.children.length) { - data.children = getTableRows(key, row.children) + data.children = getTableRows(key, row.children); } - return data - }) -} + return data; + }); +}; -const getFlameGraphData = (rows: ModuleStats[]) => { +const getFlameGraphData = (rows: ModuleStats[]): any[] => { return rows.map((row) => { const data: any = { name: row.name, value: row.avg_duration, - tooltip: `${row.name} (module id: ${row.id}): ${row.avg_duration} us` - } + tooltip: `${row.name} (module id: ${row.id}): ${row.avg_duration} us`, + }; if (row.children.length) { - data.children = getFlameGraphData(row.children) + data.children = getFlameGraphData(row.children); } - return data - }) -} + return data; + }); +}; const getTreeHeight = (row: ModuleStats): number => { - if (row.children && row.children.length) { - return 1 + Math.max(...row.children.map((child) => getTreeHeight(child))) + if (row.children?.length) { + return 1 + Math.max(...row.children.map((child) => getTreeHeight(child))); } else { - return 1 + return 1; } -} +}; -const getOperatorTree = ( - level: number, - row: OperatorNode, - result: object[] -) 
=> { +const getOperatorTree = (level: number, row: OperatorNode, result: object[]): void => { result.push({ level: level, name: row.name, start: row.start_time, - end: row.end_time - }) + end: row.end_time, + }); if (row.children.length) { - row.children.forEach((child) => getOperatorTree(level + 1, child, result)) + row.children.forEach((child) => getOperatorTree(level + 1, child, result)); } -} +}; export const ModuleView: React.FC = (props) => { - const { run, worker, span } = props - const classes = useStyles() + const { run, worker, span } = props; + const classes = useStyles(); - const [moduleView, setModuleView] = React.useState< - ModuleViewData | undefined - >(undefined) - const [flameData, setFlameData] = React.useState([]) - const [flameHeight, setFlameHeight] = React.useState(0) - const [modules, setModules] = React.useState([]) - const [module, setModule] = React.useState(0) + const [moduleView, setModuleView] = React.useState(undefined); + const [flameData, setFlameData] = React.useState([]); + const [flameHeight, setFlameHeight] = React.useState(0); + const [modules, setModules] = React.useState([]); + const [module, setModule] = React.useState(0); - const [columns, setColumns] = React.useState([]) - const [rows, setRows] = React.useState([]) + const [columns, setColumns] = React.useState([]); + const [rows, setRows] = React.useState([]); - const cardRef = React.useRef(null) - const [cardWidth, setCardWidth] = React.useState( - undefined - ) - const timelineRef = React.useRef(null) + const cardRef = React.useRef(null); + const [cardWidth, setCardWidth] = React.useState(undefined); + const timelineRef = React.useRef(null); React.useEffect(() => { defaultApi .moduleGet(run, worker, span) .then((resp) => { - setModuleView(resp) + setModuleView(resp); if (resp) { // set the flamegraph data - const flameData: any[] = getFlameGraphData(resp.data) - setFlameData(flameData) - const flameHeight = Math.max( - ...flameData.map((x) => getTreeHeight(x)) - ) - 
setFlameHeight(flameHeight * 25) - setModules(Array.from(Array(flameData.length).keys())) - setModule(0) + const flameGraphData: any[] = getFlameGraphData(resp.data); + setFlameData(flameGraphData); + const flameGraphHeight = Math.max(...flameGraphData.map((x) => getTreeHeight(x))); + setFlameHeight(flameGraphHeight * 25); + setModules(Array.from(Array(flameGraphData.length).keys())); + setModule(0); // set the tree table data - setColumns(getKeyedTableColumns(resp.columns)) - setRows(getTableRows(1, resp.data)) + setColumns(getKeyedTableColumns(resp.columns)); + setRows(getTableRows(1, resp.data)); } }) .catch((e) => { - if (e.status == 404) { - setModules([]) - setFlameData([]) - setRows([]) + if (e.status === 404) { + setModules([]); + setFlameData([]); + setRows([]); } - }) + }); if (cardRef.current) { - setCardWidth(cardRef.current.offsetWidth - 10) + setCardWidth(cardRef.current.offsetWidth - 10); } try { if (timelineRef.current) { defaultApi.treeGet(run, worker, span).then((resp) => { if (resp) { - const data = new google.visualization.DataTable() - data.addColumn({ type: 'string', id: 'Layer' }) - data.addColumn({ type: 'string', id: 'Name' }) - data.addColumn({ type: 'string', role: 'tooltip' }) - data.addColumn({ type: 'number', id: 'Start' }) - data.addColumn({ type: 'number', id: 'End' }) - - let timeline_data: any[] = [] - getOperatorTree(0, resp, timeline_data) - timeline_data.sort((a, b) => a.level - b.level) - const max_level = timeline_data[timeline_data.length - 1].level - timeline_data.forEach((d) => { + const data = new google.visualization.DataTable(); + data.addColumn({ type: 'string', id: 'Layer' }); + data.addColumn({ type: 'string', id: 'Name' }); + data.addColumn({ type: 'string', role: 'tooltip' }); + data.addColumn({ type: 'number', id: 'Start' }); + data.addColumn({ type: 'number', id: 'End' }); + + let timelineData: any[] = []; + getOperatorTree(0, resp, timelineData); + timelineData.sort((a, b) => a.level - b.level); + const maxLevel 
= timelineData[timelineData.length - 1].level; + timelineData.forEach((d) => { data.addRow([ d.level.toString(), d.name, `${d.name} Duration: ${d.end - d.start} us`, d.start / 1000.0, // the time unit is us returned from server, but the google charts only accept milliseconds here - d.end / 1000.0 - ]) - }) + d.end / 1000.0, + ]); + }); - const chart = new google.visualization.Timeline(timelineRef.current) + const chart = new google.visualization.Timeline(timelineRef.current); const options = { - height: (max_level + 1) * 50, + height: (maxLevel + 1) * 50, tooltip: { - isHtml: true + isHtml: true, }, timeline: { - showRowLabels: false - } - } - chart.draw(data, options) + showRowLabels: false, + }, + }; + chart.draw(data, options); } - }) + }); } } catch (e) { - console.warn('Timeline in module view is not supported offline.') + message.warning('Timeline in module view is not supported offline.'); } - }, [run, worker, span]) + }, [run, worker, span]); const handleModuleChange: SelectProps['onChange'] = (event) => { - setModule(event.target.value as number) - } + setModule(event.target.value as number); + }; - const moduleComponent = () => { + const moduleComponent = (): JSX.Element => { const moduleFragment = ( - Module + Module - ) + ); if (!modules || modules.length <= 1) { - return
{moduleFragment}
+ return
{moduleFragment}
; } else { - return moduleFragment + return moduleFragment; } - } + }; return (
- - + + {rows && rows.length > 0 && ( )} @@ -247,13 +233,12 @@ export const ModuleView: React.FC = (props) => { data={flameData[module]} height={flameHeight} width={cardWidth} - onChange={(node: any) => { - }} + onChange={(node: any): void => {}} /> )}
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Operator.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Operator.tsx index 7278ca59c938874b85b2a52abbb36c59f924373b..b19bef1967a31915c3c1d660b699b11c83ebb226 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Operator.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Operator.tsx @@ -15,119 +15,99 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import FormControlLabel from '@material-ui/core/FormControlLabel' -import Grid from '@material-ui/core/Grid' -import GridList from '@material-ui/core/GridList' -import GridListTile from '@material-ui/core/GridListTile' -import InputLabel from '@material-ui/core/InputLabel' -import MenuItem from '@material-ui/core/MenuItem' -import Radio from '@material-ui/core/Radio' -import RadioGroup, { RadioGroupProps } from '@material-ui/core/RadioGroup' -import Select, { SelectProps } from '@material-ui/core/Select' -import { makeStyles } from '@material-ui/core/styles' -import TextField, { - StandardTextFieldProps, - TextFieldProps -} from '@material-ui/core/TextField' -import * as React from 'react' -import * as api from '../api' -import { - OperationTableData, - OperationTableDataInner, - OperatorGraph -} from '../api' -import { OperationGroupBy } from '../constants/groupBy' -import { useSearchDirectly } from '../utils/search' -import { topIsValid, UseTop, useTopN } from '../utils/top' -import { PieChart } from './charts/PieChart' -import { DataLoading } 
from './DataLoading' -import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers' -import { OperationTable } from './tables/OperationTable' +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import FormControlLabel from '@material-ui/core/FormControlLabel'; +import Grid from '@material-ui/core/Grid'; +import GridList from '@material-ui/core/GridList'; +import GridListTile from '@material-ui/core/GridListTile'; +import InputLabel from '@material-ui/core/InputLabel'; +import MenuItem from '@material-ui/core/MenuItem'; +import Radio from '@material-ui/core/Radio'; +import RadioGroup, { RadioGroupProps } from '@material-ui/core/RadioGroup'; +import Select, { SelectProps } from '@material-ui/core/Select'; +import { makeStyles } from '@material-ui/core/styles'; +import TextField, { StandardTextFieldProps, TextFieldProps } from '@material-ui/core/TextField'; +import * as React from 'react'; +import * as api from '../api'; +import { OperationTableData, OperationTableDataInner, OperatorGraph } from '../api'; +import { OperationGroupBy } from '../constants/groupBy'; +import { useSearchDirectly } from '../utils/search'; +import { topIsValid, UseTop, useTopN } from '../utils/top'; +import { PieChart } from './charts/PieChart'; +import { DataLoading } from './DataLoading'; +import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers'; +import { OperationTable } from './tables/OperationTable'; import { - DeviceSelfTimeTooltip, - DeviceSelfTimeTooltipAscend, - DeviceTotalTimeTooltip, - DeviceTotalTimeTooltipAscend, - HostSelfTimeTooltip, - HostTotalTimeTooltip -} from './TooltipDescriptions' + deviceSelfTimeTooltip, + deviceSelfTimeTooltipAscend, + deviceTotalTimeTooltip, + deviceTotalTimeTooltipAscend, + hostSelfTimeTooltip, + hostTotalTimeTooltip, +} from './TooltipDescriptions'; const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 
1 + flexGrow: 1, }, verticalInput: { display: 'flex', - alignItems: 'center' + alignItems: 'center', }, inputWidth: { - width: '4em' + width: '4em', }, inputWidthOverflow: { minWidth: '15em', - whiteSpace: 'nowrap' + whiteSpace: 'nowrap', }, full: { - width: '100%' + width: '100%', }, description: { - marginLeft: theme.spacing(1) - } -})) + marginLeft: theme.spacing(1), + }, +})); export interface IProps { - run: string - worker: string - span: string - deviceTarget: string + run: string; + worker: string; + span: string; + deviceTarget: string; } export const Operator: React.FC = (props) => { - const { run, worker, span, deviceTarget } = props - const classes = useStyles() - const tooltipCommonClasses = useTooltipCommonStyles() + const { run, worker, span, deviceTarget } = props; + const classes = useStyles(); + const tooltipCommonClasses = useTooltipCommonStyles(); const chartHeaderRenderer = React.useMemo( () => makeChartHeaderRenderer(tooltipCommonClasses), [tooltipCommonClasses] - ) + ); - const [operatorGraph, setOperatorGraph] = React.useState< - OperatorGraph | undefined - >(undefined) - const [operatorTable, setOperatorTable] = React.useState< - OperationTableData | undefined - >(undefined) - const [sortColumn, setSortColumn] = React.useState('') - const [tableTooltips, setTableTooltips] = React.useState( - undefined - ) - const [groupBy, setGroupBy] = React.useState(OperationGroupBy.Operation) - const [searchOperatorName, setSearchOperatorName] = React.useState('') + const [operatorGraph, setOperatorGraph] = React.useState(undefined); + const [operatorTable, setOperatorTable] = React.useState(undefined); + const [sortColumn, setSortColumn] = React.useState(''); + const [tableTooltips, setTableTooltips] = React.useState(undefined); + const [groupBy, setGroupBy] = React.useState(OperationGroupBy.OPERATION); + const [searchOperatorName, setSearchOperatorName] = React.useState(''); const [topText, actualTop, useTop, setTopText, setUseTop] = useTopN({ - 
defaultUseTop: UseTop.Use, - defaultTop: 10 - }) + defaultUseTop: UseTop.USE, + defaultTop: 10, + }); - const getName = React.useCallback( - (row: OperationTableDataInner) => row.name, - [] - ) - const [searchedOperatorTable] = useSearchDirectly( - searchOperatorName, - getName, - operatorTable - ) + const getName = React.useCallback((row: OperationTableDataInner) => row.name, []); + const [searchedOperatorTable] = useSearchDirectly(searchOperatorName, getName, operatorTable); const onSearchOperatorChanged: TextFieldProps['onChange'] = (event) => { - setSearchOperatorName(event.target.value as string) - } + setSearchOperatorName(event.target.value as string); + }; React.useEffect(() => { if (operatorGraph) { @@ -135,49 +115,45 @@ export const Operator: React.FC = (props) => { operatorGraph.device_self_time?.rows.length ?? 0, operatorGraph.device_total_time?.rows.length ?? 0, operatorGraph.host_self_time.rows?.length ?? 0, - operatorGraph.host_total_time.rows?.length ?? 0 - ] - setTopText(String(Math.min(Math.max(...counts), 10))) + operatorGraph.host_total_time.rows?.length ?? 
0, + ]; + setTopText(String(Math.min(Math.max(...counts), 10))); } - }, [operatorGraph]) + }, [operatorGraph]); React.useEffect(() => { - api.defaultApi - .operationTableGet(run, worker, span, groupBy) - .then((resp) => { - setSortColumn(resp.metadata.sort) - setTableTooltips(resp.metadata.tooltips) - setOperatorTable(resp.data) - }) - }, [run, worker, span, groupBy]) + api.defaultApi.operationTableGet(run, worker, span, groupBy).then((resp) => { + setSortColumn(resp.metadata.sort); + setTableTooltips(resp.metadata.tooltips); + setOperatorTable(resp.data); + }); + }, [run, worker, span, groupBy]); React.useEffect(() => { - api.defaultApi - .operationGet(run, worker, span, groupBy) - .then((resp) => { - setOperatorGraph(resp) - }) - }, [run, worker, span, groupBy]) + api.defaultApi.operationGet(run, worker, span, groupBy).then((resp) => { + setOperatorGraph(resp); + }); + }, [run, worker, span, groupBy]); const onGroupByChanged: SelectProps['onChange'] = (event) => { - setGroupBy(event.target.value as OperationGroupBy) - } + setGroupBy(event.target.value as OperationGroupBy); + }; const onUseTopChanged: RadioGroupProps['onChange'] = (event) => { - setUseTop(event.target.value as UseTop) - } + setUseTop(event.target.value as UseTop); + }; - const onTopChanged = (event: React.ChangeEvent) => { - setTopText(event.target.value) - } + const onTopChanged = (event: React.ChangeEvent): void => { + setTopText(event.target.value); + }; const inputProps: StandardTextFieldProps['inputProps'] = { - min: 1 - } + min: 1, + }; - const renderCharts = (graph: api.OperatorGraph) => { + const renderCharts = (graph: api.OperatorGraph): JSX.Element => { return ( - + {graph.device_self_time && ( @@ -185,7 +161,7 @@ export const Operator: React.FC = (props) => { )} @@ -200,7 +176,7 @@ export const Operator: React.FC = (props) => { )} @@ -211,12 +187,7 @@ export const Operator: React.FC = (props) => { {graph.host_self_time.title && ( - + )} @@ -224,47 +195,34 @@ export const Operator: 
React.FC = (props) => { {graph.host_total_time.title && ( - + )} - ) - } + ); + }; return (
- - + + - + - } - label="All operators" - /> - } - label="Top operators to show" - /> + } label='All operators' /> + } label='Top operators to show' /> - {useTop === UseTop.Use && ( + {useTop === UseTop.USE && ( = (props) => { {renderCharts} - + - + - Group By - + Operator + Input Shape + Operator @@ -298,10 +248,10 @@ export const Operator: React.FC = (props) => { classes={{ root: classes.inputWidthOverflow }} value={searchOperatorName} onChange={onSearchOperatorChanged} - type="search" - label="Search by Name" + type='search' + label='Search by Name' inputProps={{ - maxLength: 200 + maxLength: 200, }} /> @@ -309,7 +259,7 @@ export const Operator: React.FC = (props) => { - {(table) => ( + {(table): JSX.Element => ( = (props) => {
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Overview.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Overview.tsx index e5f6f17bdaae3d276f24ed24f3566fc994fec0ad..6a81c567bc5e44b1dd6eb4746135d61268cadb81 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Overview.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/Overview.tsx @@ -2,53 +2,50 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import Card from '@material-ui/core/Card' -import CardContent from '@material-ui/core/CardContent' -import CardHeader from '@material-ui/core/CardHeader' -import Grid from '@material-ui/core/Grid' -import { makeStyles } from '@material-ui/core/styles' -import { Table } from 'antd' -import { ColumnsType } from 'antd/es/table' -import * as React from 'react' -import * as api from '../api' -import { PieChart } from './charts/PieChart' -import { SteppedAreaChart } from './charts/SteppedAreaChart' -import { DataLoading } from './DataLoading' -import { makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers' -import { TextListItem } from './TextListItem' -import { StepTimeBreakDownTooltip } from './TooltipDescriptions' -import { - transformPerformanceIntoPie, - transformPerformanceIntoTable -} from './transform' +import Card from '@material-ui/core/Card'; +import CardContent from '@material-ui/core/CardContent'; +import CardHeader from '@material-ui/core/CardHeader'; +import Grid from '@material-ui/core/Grid'; +import { makeStyles } from '@material-ui/core/styles'; +import { Table } from 'antd'; +import { ColumnsType } from 'antd/es/table'; +import * as React from 'react'; +import * as api from '../api'; +import { PieChart } from './charts/PieChart'; +import { SteppedAreaChart } from './charts/SteppedAreaChart'; +import { DataLoading } from './DataLoading'; +import { 
makeChartHeaderRenderer, useTooltipCommonStyles } from './helpers'; +import { TextListItem } from './TextListItem'; +import { stepTimeBreakDownTooltip } from './TooltipDescriptions'; +import { transformPerformanceIntoPie, transformPerformanceIntoTable } from './transform'; -const topGraphHeight = 230 +const topGraphHeight = 230; const useStyles = makeStyles((theme) => ({ root: { - flexGrow: 1 + flexGrow: 1, }, pre: { '& ul': { margin: 0, paddingLeft: theme.spacing(3), - ...theme.typography.body1 + ...theme.typography.body1, }, '& li': {}, '& a': { - color: '#ffa726' + color: '#ffa726', }, '& a:active': { - color: '#ffa726' + color: '#ffa726', }, '& p': { margin: 0, ...theme.typography.subtitle1, - fontWeight: theme.typography.fontWeightBold - } + fontWeight: theme.typography.fontWeightBold, + }, }, topGraph: { - height: topGraphHeight + 40 + height: topGraphHeight + 40, }, table: { height: '100%', @@ -57,89 +54,87 @@ const useStyles = makeStyles((theme) => ({ height: 20, fontSize: '10pt', '& > td': { - padding: '0 8px!important' - } - } - } -})) + padding: '0 8px!important', + }, + }, + }, +})); export interface IProps { - run: string - worker: string - span: string + run: string; + worker: string; + span: string; } export const Overview: React.FC = (props) => { - const { run, worker, span } = props + const { run, worker, span } = props; - const [steps, setSteps] = React.useState(undefined) - const [performances, setPerformances] = React.useState([]) - const [environments, setEnvironments] = React.useState([]) - const [gpuMetrics, setGpuMetrics] = React.useState< - api.GpuMetrics | undefined - >(undefined) - const [recommendations, setRecommendations] = React.useState('') - const [columns, setColumns] = React.useState>([]) + const [steps, setSteps] = React.useState(undefined); + const [performances, setPerformances] = React.useState([]); + const [environments, setEnvironments] = React.useState([]); + const [gpuMetrics, setGpuMetrics] = React.useState(undefined); + 
const [recommendations, setRecommendations] = React.useState(''); + const [columns, setColumns] = React.useState>([]); const tableRows = React.useMemo(() => { - let dataInfo: api.Graph = transformPerformanceIntoTable(performances) + let dataInfo: api.Graph = transformPerformanceIntoTable(performances); if (dataInfo.columns.length < 3) { - return [] + return []; } - const stringCompare = (a: string, b: string) => a.localeCompare(b) - const numberCompare = (a: number, b: number) => a - b - let column: any[] = dataInfo.columns.map(item => { + const stringCompare = (a: string, b: string): number => a.localeCompare(b); + const numberCompare = (a: number, b: number): number => a - b; + let column: any[] = dataInfo.columns.map((item) => { return { title: item.name, key: item.name, dataIndex: item.name, - sorter: item.type == 'string' ? (a: any, b: any) => stringCompare(a[item.name], b[item.name]) - : (a: any, b: any) => numberCompare(a[item.name], b[item.name]) - } - }) - setColumns(column) + sorter: + item.type === 'string' + ? 
(a: any, b: any): number => stringCompare(a[item.name], b[item.name]) + : (a: any, b: any): number => numberCompare(a[item.name], b[item.name]), + }; + }); + setColumns(column); return dataInfo.rows.map((row, index) => { if (row.length < 3) { - return null + return null; } return { key: index, [dataInfo.columns[0].name]: row[0], [dataInfo.columns[1].name]: row[1], - [dataInfo.columns[2].name]: row[2] - } - }) - }, [performances]) + [dataInfo.columns[2].name]: row[2], + }; + }); + }, [performances]); const synthesizedPieGraph = React.useMemo(() => { - return transformPerformanceIntoPie(performances) - }, [performances]) + return transformPerformanceIntoPie(performances); + }, [performances]); React.useEffect(() => { api.defaultApi.overviewGet(run, worker, span).then((resp) => { - setPerformances(resp.performance) - setEnvironments(resp.environments) - setSteps(resp.steps) - setRecommendations(resp.recommendations) - setGpuMetrics(resp.gpu_metrics) - }) - }, [run, worker, span]) + setPerformances(resp.performance); + setEnvironments(resp.environments); + setSteps(resp.steps); + setRecommendations(resp.recommendations); + setGpuMetrics(resp.gpu_metrics); + }); + }, [run, worker, span]); - const classes = useStyles() - const tooltipCommonClasses = useTooltipCommonStyles() + const classes = useStyles(); + const tooltipCommonClasses = useTooltipCommonStyles(); const chartHeaderRenderer = React.useMemo( () => makeChartHeaderRenderer(tooltipCommonClasses, false), [tooltipCommonClasses] - ) + ); const stepTimeBreakDownTitle = React.useMemo( - () => chartHeaderRenderer('Step Time Breakdown', StepTimeBreakDownTooltip), + () => chartHeaderRenderer('Step Time Breakdown', stepTimeBreakDownTooltip), [tooltipCommonClasses, chartHeaderRenderer] - ) + ); - const cardSizes = gpuMetrics - ? ([2, 3, 7] as const) - : ([4, undefined, 8] as const) + const cardSizes = gpuMetrics ? ([2, 3, 7] as const) : ([4, undefined, 8] as const); return (
@@ -148,14 +143,11 @@ export const Overview: React.FC = (props) => { {React.useMemo( () => ( - - + + {environments.map((environment) => ( - + ))} @@ -165,28 +157,19 @@ export const Overview: React.FC = (props) => { {gpuMetrics && ( - - - + + + {gpuMetrics.data.map((metric) => ( - + ))} )} - - + + @@ -199,10 +182,7 @@ export const Overview: React.FC = (props) => { /> - + @@ -211,16 +191,12 @@ export const Overview: React.FC = (props) => { - + - {(graph) => ( - + {(graph): JSX.Element => ( + )} @@ -229,13 +205,13 @@ export const Overview: React.FC = (props) => { - - + +
@@ -245,5 +221,5 @@ export const Overview: React.FC = (props) => {
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TextListItem.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TextListItem.tsx index c5e4eee5251f7ab8afedf58f305a5cb30ad92a19..59eb79c2a8f05cc750d264880bb66ab646c4bbb4 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TextListItem.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TextListItem.tsx @@ -2,76 +2,69 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import Grid from '@material-ui/core/Grid' -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' +import Grid from '@material-ui/core/Grid'; +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; export interface IStylesProps { - root?: string - name?: string + root?: string; + name?: string; } export interface IProps { - name: string - value?: string - description?: string - extra?: string - classes?: IStylesProps - dangerouslyAllowHtml?: boolean + name: string; + value?: string; + description?: string; + extra?: string; + classes?: IStylesProps; + dangerouslyAllowHtml?: boolean; } const useStyles = makeStyles((theme) => ({ label: { ...theme.typography.subtitle2, - fontWeight: 'bolder' + fontWeight: 'bolder', }, value: { textAlign: 'right', ...theme.typography.subtitle2, - fontWeight: 'bolder' - } -})) + fontWeight: 'bolder', + }, +})); export const TextListItem: React.FC = (props) => { - const classes = useStyles() + const classes = useStyles(); - const getSizes = function () { + const getSizes = function (): readonly any[] { if (props.value && props.extra) { - return [4, 4, 4] as const + return [4, 4, 4] as const; } if (props.value) { if (props.value.length > props.name.length) { - return [4, 8, undefined] as const + return [4, 8, undefined] as const; } - return [8, 4, undefined] as const + return [8, 
4, undefined] as const; } - return [12, undefined, undefined] as const - } + return [12, undefined, undefined] as const; + }; - const sizes = getSizes() + const sizes = getSizes(); - const renderSpan = function (content: string, className?: string) { + const renderSpan = function (content: string, className?: string): React.JSX.Element { if (props.dangerouslyAllowHtml) { - return ( - - ) + return ; } - return {content} - } + return {content}; + }; return ( - + {renderSpan(props.name, props.classes?.name)} - {props.description && ( - {renderSpan(props.description)} - )} + {props.description && {renderSpan(props.description)}} {props.value && ( @@ -85,5 +78,5 @@ export const TextListItem: React.FC = (props) => { )} - ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TooltipDescriptions.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TooltipDescriptions.ts index 8f434221ddbdbd48a7a41ab6c73b2901519007c5..6d3631fee97a4dd8da5ebde1550573d8c6e501fa 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TooltipDescriptions.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TooltipDescriptions.ts @@ -2,37 +2,37 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -export const StepTimeBreakDownTooltip = `The time spent on each step is broken down into multiple categories as follows: +export const stepTimeBreakDownTooltip = `The time spent on each step is broken down into multiple categories as follows: Kernel: Kernels execution time on GPU device; Memcpy: GPU involved memory copy time (either D2D, D2H or H2D); Memset: GPU involved memory set time; Runtime: CUDA runtime execution time on host side; Such as cudaLaunchKernel, cudaMemcpyAsync, cudaStreamSynchronize, ... 
DataLoader: The data loading time spent in PyTorch DataLoader object; CPU Exec: Host compute time, including every PyTorch operator running time; -Other: The time not included in any of the above.` +Other: The time not included in any of the above.`; -export const DeviceSelfTimeTooltip = `The accumulated time spent on GPU, not including this operator’s child operators.` +export const deviceSelfTimeTooltip = `The accumulated time spent on GPU, not including this operator’s child operators.`; -export const DeviceSelfTimeTooltipAscend = `The accumulated time spent on NPU, not including this operator’s child operators.` +export const deviceSelfTimeTooltipAscend = `The accumulated time spent on NPU, not including this operator’s child operators.`; -export const DeviceTotalTimeTooltip = `The accumulated time spent on GPU, including this operator’s child operators.` +export const deviceTotalTimeTooltip = `The accumulated time spent on GPU, including this operator’s child operators.`; -export const DeviceTotalTimeTooltipAscend = `The accumulated time spent on NPU, including this operator’s child operators.` +export const deviceTotalTimeTooltipAscend = `The accumulated time spent on NPU, including this operator’s child operators.`; -export const HostSelfTimeTooltip = `The accumulated time spent on Host, not including this operator’s child operators.` +export const hostSelfTimeTooltip = `The accumulated time spent on Host, not including this operator’s child operators.`; -export const HostTotalTimeTooltip = `The accumulated time spent on Host, including this operator’s child operators.` +export const hostTotalTimeTooltip = `The accumulated time spent on Host, including this operator’s child operators.`; -export const GPUKernelTotalTimeTooltip = `The accumulated time of all calls of this kernel.` +export const gpuKernelTotalTimeTooltip = `The accumulated time of all calls of this kernel.`; -export const TensorCoresPieChartTooltip = `The accumulated time of all kernels using 
or not using Tensor Cores.` +export const tensorCoresPieChartTooltip = `The accumulated time of all kernels using or not using Tensor Cores.`; -export const TensorCoresPieChartTooltipAscend = `The accumulated time of all kernels group by Accelerator Core.` +export const tensorCoresPieChartTooltipAscend = `The accumulated time of all kernels grouped by Accelerator Core.`; -export const DistributedGpuInfoTableTooltip = `Information about GPU hardware used during the run.` +export const distributedGpuInfoTableTooltip = `Information about GPU hardware used during the run.`; -export const DistributedOverlapGraphTooltip = `The time spent on computation vs communication.` +export const distributedOverlapGraphTooltip = `The time spent on computation vs communication.`; -export const DistributedWaittimeGraphTooltip = `The time spent waiting vs communicating between devices.` +export const distributedWaittimeGraphTooltip = `The time spent waiting vs communicating between devices.`; -export const DistributedCommopsTableTooltip = `Statistics for operations managing communications between nodes.` +export const distributedCommopsTableTooltip = `Statistics for operations managing communications between nodes.`; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TraceView.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TraceView.tsx index 8f1f3684305cabfe6f35d341557386c1d8f71cf1..be499794936a085ed72740eea8bac5f33df37171 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TraceView.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/TraceView.tsx @@ -2,85 +2,78 @@ * Copyright (c) Microsoft Corporation. All rights reserved.
*--------------------------------------------------------------------------------------------*/ -import ClickAwayListener from '@material-ui/core/ClickAwayListener' -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import * as api from '../api' +import ClickAwayListener from '@material-ui/core/ClickAwayListener'; +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; +import * as api from '../api'; export interface IProps { - run: string - worker: string - span: string - iframeRef: React.RefObject + run: string; + worker: string; + span: string; + iframeRef: React.RefObject; } const useStyles = makeStyles(() => ({ root: { - flexGrow: 1 + flexGrow: 1, }, frame: { width: '100%', height: 'calc(100vh - 48px)', - border: 'none' - } -})) + border: 'none', + }, +})); export const TraceView: React.FC = (props) => { - const { run, worker, span, iframeRef } = props - const classes = useStyles() + const { run, worker, span, iframeRef } = props; + const classes = useStyles(); - const [traceData, setTraceData] = React.useState | null>(null) - const [traceViewReady, setTraceViewReady] = React.useState(false) + const [traceData, setTraceData] = React.useState | null>(null); + const [traceViewReady, setTraceViewReady] = React.useState(false); React.useEffect(() => { setTraceData( api.defaultApi.traceGet(run, worker, span).then((resp) => { - return JSON.stringify(resp) + return JSON.stringify(resp); }) - ) - }, [run, worker, span]) + ); + }, [run, worker, span]); React.useEffect(() => { - function callback(event: MessageEvent) { - const data = event.data || {} + function callback(event: MessageEvent): void { + const data = event.data || {}; if (data.msg === 'ready') { - setTraceViewReady(true) + setTraceViewReady(true); } } - window.addEventListener('message', callback) + window.addEventListener('message', callback); return () => { - window.removeEventListener('message', callback) - } - }, []) + 
window.removeEventListener('message', callback); + }; + }, []); React.useEffect(() => { if (traceData && traceViewReady) { traceData.then((data) => { - iframeRef.current?.contentWindow?.postMessage( - { msg: 'data', data }, - '*' - ) - }) + iframeRef.current?.contentWindow?.postMessage({ msg: 'data', data }, window.origin); + }); } - }, [traceData, traceViewReady]) - const SetIframeActive = () => { - iframeRef.current?.focus() - } + }, [traceData, traceViewReady]); + const setIframeActive = (): void => { + iframeRef.current?.focus(); + }; return (
{React.useMemo( () => ( - - + + ), [] )}
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AntTableChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AntTableChart.tsx index 064167fc64b4e00ec79b648a85d12dff23ecfcd0..83618064b55223ab06d4d1fec8b8b5eeab8d3268 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AntTableChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AntTableChart.tsx @@ -2,110 +2,110 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import { Table } from 'antd' -import * as React from 'react' -import { Graph } from '../../api' +import { makeStyles } from '@material-ui/core/styles'; +import { Table } from 'antd'; +import * as React from 'react'; +import { Graph } from '../../api'; interface IProps { - graph: Graph - sortColumn?: string - initialPageSize?: number - onRowSelected?: (record?: object, rowIndex?: number) => void + graph: Graph; + sortColumn?: string; + initialPageSize?: number; + onRowSelected?: (record?: object, rowIndex?: number) => void; } const useStyles = makeStyles((theme) => ({ tooltip: { - whiteSpace: 'pre-wrap' + whiteSpace: 'pre-wrap', }, row: { - wordBreak: 'break-word' - } -})) + wordBreak: 'break-word', + }, +})); -const getTableColumns = function ( - columns: any, - sort: string | undefined, - tooltipClass: string -) { - let i = 0 - return columns.map(function (col: any) { - const key = 'col' + i++ - const stringCompare = (a: any, b: any) => a[key].localeCompare(b[key]) - const numberCompare = (a: any, b: any) => (a[key] || 0) - (b[key] || 0) +const getTableColumns = function (columns: any, sort: string | undefined, tooltipClass: string): any { + let i = 0; + return columns.map((col: any) => { + const key = `col${i++}`; + const stringCompare = (a: any, b: any): number => 
a[key].localeCompare(b[key]); + const numberCompare = (a: any, b: any): number => (a[key] || 0) - (b[key] || 0); return { dataIndex: key, key: key, title: col.name, - sorter: col.type == 'string' ? stringCompare : numberCompare, - defaultSortOrder: sort == col.name ? ('descend' as const) : undefined, - showSorterTooltip: col.tooltip - ? { title: col.tooltip, overlayClassName: tooltipClass } - : true - } - }) -} + sorter: col.type === 'string' ? stringCompare : numberCompare, + defaultSortOrder: sort === col.name ? ('descend' as const) : undefined, + showSorterTooltip: col.tooltip ? { title: col.tooltip, overlayClassName: tooltipClass } : true, + }; + }); +}; -const getTableRows = function (rows: any) { - return rows.map(function (row: any) { - let i = 0 - const res: any = {} - row.forEach(function (entry: any) { - res['col' + i++] = entry - }) - return res - }) -} +const getTableRows = function (rows: any): any { + return rows.map((row: any) => { + let i = 0; + const res: any = {}; + row.forEach((entry: any) => { + res[`col${i++}`] = entry; + }); + return res; + }); +}; export const AntTableChart: React.FC = (props) => { - const { graph, sortColumn, initialPageSize, onRowSelected } = props - const classes = useStyles(props) + const { graph, sortColumn, initialPageSize, onRowSelected } = props; + const classes = useStyles(props); - const rows = React.useMemo(() => getTableRows(graph.rows), [graph.rows]) + const rows = React.useMemo(() => getTableRows(graph.rows), [graph.rows]); const columns = React.useMemo( () => getTableColumns(graph.columns, sortColumn, classes.tooltip), [graph.columns, sortColumn, classes.tooltip] - ) + ); // key is used to reset the Table state (page and sort) if the columns change - const key = React.useMemo(() => Math.random() + '', [graph.columns]) + const key: string = React.useMemo(() => `${Math.random()}`, [graph.columns]); - const [pageSize, setPageSize] = React.useState(initialPageSize ?? 
30) - const onShowSizeChange = (current: number, size: number) => { - setPageSize(size) - } + const [pageSize, setPageSize] = React.useState(initialPageSize ?? 30); + const onShowSizeChange = (current: number, size: number): void => { + setPageSize(size); + }; - const onRow = (record: object, rowIndex?: number) => { + const onRow = ( + record: object, + rowIndex?: number + ): { + onMouseEnter: (event: any) => void; + onMouseLeave: (event: any) => void; + } => { return { - onMouseEnter: (event: any) => { + onMouseEnter: (event: any): void => { if (onRowSelected) { - onRowSelected(record, rowIndex) + onRowSelected(record, rowIndex); } }, - onMouseLeave: (event: any) => { + onMouseLeave: (event: any): void => { if (onRowSelected) { - onRowSelected(undefined, undefined) + onRowSelected(undefined, undefined); } - } - } - } + }, + }; + }; return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AreaChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AreaChart.tsx index 6a0f5b484d9c156927edfeae64a729bec821c164..cda12860c2fba41f5a15c5d9e73fb92093c0371b 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AreaChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/AreaChart.tsx @@ -2,44 +2,46 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { Graph } from '../../api' -import { useResizeEventDependency } from '../../utils/resize' +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; +import { Graph } from '../../api'; +import { useResizeEventDependency } from '../../utils/resize'; interface IProps { - graph: Graph - height?: number - hAxisTitle?: string + graph: Graph; + height?: number; + hAxisTitle?: string; } const useStyles = makeStyles(() => ({ root: { - height: (props: Pick) => props.height - } -})) + height: (props: Pick): number | undefined => props.height, + }, +})); export const AreaChart: React.FC = (props) => { - const { graph, height = 400, hAxisTitle } = props - const classes = useStyles({ height }) - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() + const { graph, height = 400, hAxisTitle } = props; + const classes = useStyles({ height }); + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return + const element = graphRef.current; + if (!element) { + return undefined; + } - const data = new google.visualization.DataTable() - data.addColumn('string', 'step') + const data = 
new google.visualization.DataTable(); + data.addColumn('string', 'step'); graph.columns.forEach((column) => { data.addColumn({ type: column.type, label: column.name, role: column.role, - p: column.p - }) - }) - data.addRows(graph.rows.map((x, i) => [(i + 1).toString(), ...x])) + p: column.p, + }); + }); + data.addRows(graph.rows.map((x, i) => [(i + 1).toString(), ...x])); const options = { title: graph.title, @@ -49,22 +51,22 @@ export const AreaChart: React.FC = (props) => { tooltip: { isHtml: true }, chartArea: { left: '15%', width: '80%', top: '10%' }, hAxis: { - title: hAxisTitle - } - } + title: hAxisTitle, + }, + }; - const chart = new google.visualization.AreaChart(element) + const chart = new google.visualization.AreaChart(element); - chart.draw(data, options) + chart.draw(data, options); return () => { - chart.clearChart() - } - }, [graph, height, resizeEventDependency]) + chart.clearChart(); + }; + }, [graph, height, resizeEventDependency]); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/ColumnChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/ColumnChart.tsx index 1c83eea95998222903a161d6ddbb678189a03775..ae51dc1a34e94b1c91eab2fe502ffe2cbc20f618 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/ColumnChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/ColumnChart.tsx @@ -15,58 +15,62 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Offer offline supporting. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { useResizeEventDependency } from '../../utils/resize' -import * as echarts from 'echarts' +import * as React from 'react'; +import { useResizeEventDependency } from '../../utils/resize'; +import * as echarts from 'echarts'; interface IProps { - title?: string - units?: string - colors?: Array - chartData: ColumnChartData + title?: string; + units?: string; + colors?: Array; + chartData: ColumnChartData; } export interface ColumnChartData { - legends: Array - barLabels: Array - barHeights: Array> + legends: Array; + barLabels: Array; + barHeights: Array>; } export const ColumnChart: React.FC = (props) => { - const { title, units, colors, chartData } = props - const { legends, barLabels, barHeights } = chartData - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() + const { title, units, colors, chartData } = props; + const { legends, barLabels, barHeights } = chartData; + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); - const getAngleByDataLength = (data: number) => { + const getAngleByDataLength = (data: number): number => { if (data < 10) { - 
return 0 + return 0; } else { // The larger the count, the closer the rotation gets to 90 degrees - return 90 * (1 - 10 / data) + return 90 * (1 - (10 / data)); } - } + }; React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return + const element = graphRef.current; + if (!element) { + return undefined; + } - const chart = echarts.init(element) - const dataSource: Array> = [] - dataSource.push(['worker', ...legends]) + const chart = echarts.init(element); + const dataSource: Array> = []; + dataSource.push(['worker', ...legends]); barHeights.forEach((item, index) => { - barLabels[index] !== undefined && dataSource.push([barLabels[index], ...item]) - }) + if (barLabels[index] !== undefined) { + dataSource.push([barLabels[index], ...item]); + } + }); const options: echarts.EChartsOption = { title: { - text: title + text: title, }, legend: { - bottom: 0 + bottom: 0, }, xAxis: { type: 'category', @@ -74,43 +78,41 @@ export const ColumnChart: React.FC = (props) => { interval: 0, rotate: getAngleByDataLength(barLabels.length), formatter: (name: string) => { - const index = name.indexOf('@') - if (index > -1) { - name = name.slice(index + 1) - } - return name.length > 16 ? name.slice(0, 14) + "..." : name; - } - } + const index = name.indexOf('@'); + const processedName = index > -1 ? name.slice(index + 1) : name; // process using a new variable + return processedName.length > 16 ? 
`${processedName.slice(0, 14)}...` : processedName; + }, + }, }, yAxis: { type: 'value', name: units, nameTextStyle: { - fontSize: 16 - } + fontSize: 16, + }, }, tooltip: { - trigger: 'item' + trigger: 'item', }, dataset: { - source: dataSource + source: dataSource, }, series: Array(legends.length).fill({ type: 'bar', - stack: 'samesign' + stack: 'samesign', }), - } + }; if (colors) { - options.color = colors.slice(0, barLabels.length) + options.color = colors.slice(0, barLabels.length); } - options && chart.setOption(options, true) - return () => { - chart.dispose() + if (options) { + chart.setOption(options, true); } - }, [title, chartData, resizeEventDependency]) + return () => { + chart.dispose(); + }; + }, [title, chartData, resizeEventDependency]); - return ( -
- ) -} + return
; +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/LineChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/LineChart.tsx deleted file mode 100644 index b9a031d3a44336e568f30524abc8837590b3f603..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/LineChart.tsx +++ /dev/null @@ -1,224 +0,0 @@ -/*--------------------------------------------------------------------------------------------- - * Copyright (c) Microsoft Corporation. All rights reserved. - *--------------------------------------------------------------------------------------------*/ - -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { Graph, GraphAscend } from '../../api' -import { useResizeEventDependency } from '../../utils/resize' -import { binarySearch } from '../../utils/binarysearch' - -interface IProps { - graph: Graph | GraphAscend - height?: number - deviceTarget: string - tag: string - hAxisTitle?: string - vAxisTitle?: string - explorerOptions?: object - onSelectionChanged?: (start: number, end: number) => void - record?: any -} - -const useStyles = makeStyles(() => ({ - root: { - height: (props: Pick) => props.height - } -})) - -export const LineChart: React.FC = (props) => { - const { - graph, - height = 400, - deviceTarget, - tag, - hAxisTitle, - vAxisTitle, - onSelectionChanged, - explorerOptions, - record - } = props - const classes = useStyles({ height }) - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() - const [chartObj, setChartObj] = React.useState() - - React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return - - const options = { - title: graph.title, - isStacked: true, - height, - legend: { position: 'bottom' }, - tooltip: { isHtml: true }, - hAxis: { - title: hAxisTitle - }, - vAxis: { - title: vAxisTitle - }, - explorer: explorerOptions - } - - 
const chart = new google.visualization.LineChart(element) - - // Disable selection of single point - google.visualization.events.addListener(chart, 'select', function () { - chart.setSelection() - }) - - google.visualization.events.addListener(chart, 'ready', function () { - var zoomLast = getCoords() - var observer = new MutationObserver(function () { - var zoomCurrent = getCoords() - if (JSON.stringify(zoomLast) !== JSON.stringify(zoomCurrent)) { - zoomLast = getCoords() - if (onSelectionChanged) { - onSelectionChanged(zoomLast.x_min, zoomLast.x_max) - } - } - }) - if (graphRef.current) { - observer.observe(graphRef.current, { - childList: true, - subtree: true - }) - } - }) - - function getCoords() { - var chartLayout = chart.getChartLayoutInterface() - var chartBounds = chartLayout.getChartAreaBoundingBox() - - return { - x_min: chartLayout.getHAxisValue(chartBounds.left), - x_max: chartLayout.getHAxisValue(chartBounds.width + chartBounds.left) - } - } - - if (deviceTarget === 'Ascend') { - let data = new google.visualization.DataTable() - if (tag === 'Component') { - if (graph.columns.length === 3) { - graph.columns.forEach((column) => { - data.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - }) - data.addRows(graph.rows['PTA'] ?? 
graph.rows['GE']) - } else if (graph.columns.length === 5) { - const data2 = new google.visualization.DataTable() - graph.columns.forEach((column, index) => { - if (index === 0 || index < 3) { - data.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - } - if (index === 0 || index >= 3) { - data2.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - } - }) - data.addRows(graph.rows['PTA']) - data2.addRows(graph.rows['GE']) - data = google.visualization.data.join(data, data2, 'full', [[0, 0]], [1, 2], [1, 2]) - } - } else { - if (graph.columns.length === 2) { - graph.columns.forEach((column) => { - data.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - }) - data.addRows(graph.rows['Allocated'] ?? graph.rows['Reserved']) - } else if (graph.columns.length === 3) { - const data2 = new google.visualization.DataTable() - graph.columns.forEach((column, index) => { - if (index === 0 || index < 2) { - data.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - } - if (index === 0 || index >= 2) { - data2.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - } - }) - data.addRows(graph.rows['Allocated']) - data2.addRows(graph.rows['Reserved']) - data = google.visualization.data.join(data, data2, 'full', [[0, 0]], [1], [1]) - } - } - - chart.draw(data, options) - } else { - const data = new google.visualization.DataTable() - graph.columns.forEach((column) => { - data.addColumn({ - type: column.type, - label: column.name, - role: column.role, - p: column.p - }) - }) - data.addRows(graph.rows) - chart.draw(data, options) - } - - setChartObj(chart) - }, [graph, height, resizeEventDependency]) - - React.useEffect(() => { - const compare_fn = (key: number, mid: Array) => - key - parseFloat(mid[0].toFixed(2)) - if (chartObj && tag === 'Operator') { - if (record) { - 
if (deviceTarget === 'Ascend') { - let startId = binarySearch(graph.rows['Allocated'], record.col2, compare_fn) - let endId = binarySearch(graph.rows['Allocated'], record.col3, compare_fn) - let selection = [] - if (startId >= 0) selection.push({ row: startId, column: 1 }) - if (endId >= 0) selection.push({ row: endId, column: 1 }) - chartObj.setSelection(selection) - } else { - let startId = binarySearch(graph.rows, record.col2, compare_fn) - let endId = binarySearch(graph.rows, record.col3, compare_fn) - let selection = [] - if (startId >= 0) selection.push({ row: startId, column: 1 }) - if (endId >= 0) selection.push({ row: endId, column: 1 }) - chartObj.setSelection(selection) - } - } else { - chartObj.setSelection() - } - } - }, [graph, record, chartObj]) - - return ( -
-
-
- ) -} diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/NewLineChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/NewLineChart.tsx index af350e93d96c364d9baf4952bd59458a7bbd0801..a6e222a6cc9d04b3b0c9031be60b91b75fe9ab37 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/NewLineChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/NewLineChart.tsx @@ -15,85 +15,79 @@ * limitations under the License. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { Graph, GraphAscend } from '../../api' -import { useResizeEventDependency } from '../../utils/resize' -import { binarySearch } from '../../utils/binarysearch' -import * as echarts from 'echarts' +import * as React from 'react'; +import { Graph, GraphAscend } from '../../api'; +import { useResizeEventDependency } from '../../utils/resize'; +import { binarySearch } from '../../utils/binarysearch'; +import * as echarts from 'echarts'; interface IProps { - graph: Graph | GraphAscend - height?: number - deviceTarget: string - tag: string - hAxisTitle?: string - vAxisTitle?: string - onSelectionChanged?: (start: number, end: number) => void - record?: any + graph: Graph | GraphAscend; + height?: number; + deviceTarget: string; + tag: string; + hAxisTitle?: string; + vAxisTitle?: string; + onSelectionChanged?: (start: number, end: number) => void; + record?: any; } export const LineChart: React.FC = (props) => { - const { - graph, - height = 400, - deviceTarget, - tag, - hAxisTitle, - vAxisTitle, - onSelectionChanged, - record - } = props - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() - const [chartObj, setChartObj] = React.useState() - const selectedPoints = React.useRef>([]) + const { graph, height = 400, deviceTarget, tag, hAxisTitle, 
vAxisTitle, onSelectionChanged, record } = props; + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); + const [chartObj, setChartObj] = React.useState(); + const selectedPoints = React.useRef>([]); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return - element.oncontextmenu = () => { return false } + const element = graphRef.current; + if (!element) { + return undefined; + } + element.oncontextmenu = (): boolean => { + return false; + }; - let myChart = echarts.init(element) + let myChart = echarts.init(element); let option: echarts.EChartsOption = { title: { text: graph.title, textStyle: { - fontSize: 16 - } + fontSize: 16, + }, }, tooltip: { trigger: 'axis' }, legend: { type: 'scroll', - bottom: 0 + bottom: 0, }, xAxis: { type: 'category', boundaryGap: false, - name: hAxisTitle + name: hAxisTitle, }, yAxis: { type: 'value', name: vAxisTitle, - scale: true + scale: true, }, toolbox: { feature: { dataZoom: { - yAxisIndex: 'none' + yAxisIndex: 'none', }, - restore: {} - } - } - } + restore: {}, + }, + }, + }; if (deviceTarget === 'Ascend') { if (tag === 'Component') { const mixedTooltip: echarts.TooltipComponentOption = { trigger: 'axis', formatter: function (params: any) { - var res = `${params[0].name}
` + let res = `${params[0].name}
`; for (const item of params) { if (typeof item.value[item.encode.y[0]] === 'number') { res += ` - ${item.seriesName}: ${item.value[item.encode.y[0]]}
` + ${item.seriesName}: ${item.value[item.encode.y[0]]}
`; } } - return res - } - } + return res; + }, + }; if (graph.columns.length <= 4) { - let finalRows = graph.rows['PTA'] ?? graph.rows['GE'] + let finalRows = graph.rows.PTA ?? graph.rows.GE; if (graph.columns.length === 4) { - const mergedAPPRows = graph.rows['APP'].map((item: Array) => { - return [item[0], null, null, item[1]] - }) + const mergedAPPRows = graph.rows.APP.map((item: Array) => { + return [item[0], null, null, item[1]]; + }); finalRows = finalRows.concat(mergedAPPRows).sort((a: any, b: any) => { - return a[0] - b[0] - }) + return a[0] - b[0]; + }); } option = { ...option, tooltip: mixedTooltip, dataset: { - source: [ - graph.columns.map(column => column.name), - ...finalRows - ] + source: [graph.columns.map((column) => column.name), ...finalRows], }, - series: Array(graph.columns.length - 1).fill( - { - type: 'line', - select: { - itemStyle: { - borderWidth: 5, - shadowBlur: 5 - } + series: Array(graph.columns.length - 1).fill({ + type: 'line', + select: { + itemStyle: { + borderWidth: 5, + shadowBlur: 5, }, - emphasis: { - itemStyle: { - borderWidth: 5, - shadowBlur: 5 - } + }, + emphasis: { + itemStyle: { + borderWidth: 5, + shadowBlur: 5, }, - selectedMode: 'single', - } - ) - } + }, + selectedMode: 'single', + }), + }; } else if (graph.columns.length <= 6) { - const datasetTitle = graph.columns.map(item => item.name) - let mergedGERows = graph.rows['GE'].map((item: Array) => { - return [item[0], null, null, item[1], item[2]] - }) + const datasetTitle = graph.columns.map((item) => item.name); + let mergedGERows = graph.rows.GE.map((item: Array) => { + return [item[0], null, null, item[1], item[2]]; + }); if (graph.columns.length === 6) { - const mergedAPPRows = graph.rows['APP'].map((item: Array) => { - return [item[0], null, null, null, null, item[2]] - }) - mergedGERows = mergedGERows.concat(mergedAPPRows) + const mergedAPPRows = graph.rows.APP.map((item: Array) => { + return [item[0], null, null, null, null, item[2]]; + }); + mergedGERows = 
mergedGERows.concat(mergedAPPRows); } - const finalRows = graph.rows['PTA'].concat(mergedGERows).sort((a: any, b: any) => { - return a[0] - b[0] - }) + const finalRows = graph.rows.PTA.concat(mergedGERows).sort((a: any, b: any) => { + return a[0] - b[0]; + }); option = { ...option, tooltip: mixedTooltip, - dataset: - { - source: [ - datasetTitle, - ...finalRows - ] + dataset: { + source: [datasetTitle, ...finalRows], }, - series: Array(graph.columns.length - 1).fill( - { - type: 'line', - connectNulls: true, - select: { - itemStyle: { - borderWidth: 5, - shadowBlur: 5 - } + series: Array(graph.columns.length - 1).fill({ + type: 'line', + connectNulls: true, + select: { + itemStyle: { + borderWidth: 5, + shadowBlur: 5, }, - emphasis: { - itemStyle: { - borderWidth: 5, - shadowBlur: 5 - } + }, + emphasis: { + itemStyle: { + borderWidth: 5, + shadowBlur: 5, }, - selectedMode: 'single', - datasetIndex: 0 - }) - } + }, + selectedMode: 'single', + datasetIndex: 0, + }), + }; } } else { if (graph.columns.length === 3) { - const datasetTitle1: Array = [] - const datasetTitle2: Array = [] + const datasetTitle1: Array = []; + const datasetTitle2: Array = []; graph.columns.forEach((column, index) => { if (index === 0 || index < 2) { - datasetTitle1.push(column.name) + datasetTitle1.push(column.name); } if (index === 0 || index >= 2) { - datasetTitle2.push(column.name) + datasetTitle2.push(column.name); } - }) + }); option = { ...option, dataset: [ { - source: [ - datasetTitle1, - ...graph.rows['Allocated'] - ] + source: [datasetTitle1, ...graph.rows.Allocated], }, { - source: [ - datasetTitle2, - ...graph.rows['Reserved'] - ] - } + source: [datasetTitle2, ...graph.rows.Reserved], + }, ], series: [ { @@ -226,20 +204,20 @@ export const LineChart: React.FC = (props) => { name: 'Allocated', emphasis: { label: { - show: true + show: true, }, itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, select: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + 
shadowBlur: 5, + }, }, - datasetIndex: 0 + datasetIndex: 0, }, { type: 'line', @@ -247,30 +225,27 @@ export const LineChart: React.FC = (props) => { select: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, emphasis: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, selectedMode: 'single', - datasetIndex: 1 - } - ] - } + datasetIndex: 1, + }, + ], + }; } } } else { option = { ...option, dataset: { - source: [ - graph.columns.map(column => column.name), - ...graph.rows - ] + source: [graph.columns.map((column) => column.name), ...graph.rows], }, series: [ { @@ -279,16 +254,16 @@ export const LineChart: React.FC = (props) => { select: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, emphasis: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, - selectedMode: 'single' + selectedMode: 'single', }, { type: 'line', @@ -296,112 +271,116 @@ export const LineChart: React.FC = (props) => { select: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, emphasis: { itemStyle: { borderWidth: 5, - shadowBlur: 5 - } + shadowBlur: 5, + }, }, - selectedMode: 'single' - } - ] - } + selectedMode: 'single', + }, + ], + }; } - option && myChart.setOption(option, true) + if (option) { + myChart.setOption(option, true); + } myChart.dispatchAction({ type: 'takeGlobalCursor', key: 'dataZoomSelect', - dataZoomSelectActive: true - }) + dataZoomSelectActive: true, + }); myChart.on('dataZoom', (param: any) => { if (onSelectionChanged) { - onSelectionChanged(param.batch[0].startValue, param.batch[0].endValue) + onSelectionChanged(param.batch[0].startValue, param.batch[0].endValue); } - }) + }); myChart.on('restore', () => { if (onSelectionChanged) { // Set startId greater than endId to query all memory events. 
- onSelectionChanged(0, -1) + onSelectionChanged(0, -1); } - }) + }); myChart.on('click', (param) => { myChart.dispatchAction({ type: 'unselect', seriesId: param.seriesId, - dataIndex: selectedPoints.current - }) + dataIndex: selectedPoints.current, + }); myChart.dispatchAction({ type: 'select', seriesId: param.seriesId, - dataIndex: param.dataIndex - }) + dataIndex: param.dataIndex, + }); - selectedPoints.current = [param.dataIndex] - }) + selectedPoints.current = [param.dataIndex]; + }); myChart.getZr().on('contextmenu', () => { myChart.dispatchAction({ - type: 'restore' - }) + type: 'restore', + }); myChart.dispatchAction({ type: 'takeGlobalCursor', key: 'dataZoomSelect', - dataZoomSelectActive: true - }) - }) + dataZoomSelectActive: true, + }); + }); - setChartObj(myChart) + setChartObj(myChart); return () => { - myChart.dispose() - } - }, [graph, height, resizeEventDependency]) + myChart.dispose(); + }; + }, [graph, height, resizeEventDependency]); React.useEffect(() => { - const compare_fn = (key: number, mid: Array) => key - mid[0] + const compareFn = (key: number, mid: Array): number => key - mid[0]; if (chartObj && tag === 'Operator') { if (record) { - let startId = -1 - let endId = -1 + let startId = -1; + let endId = -1; if (deviceTarget === 'Ascend') { - startId = binarySearch(graph.rows['Allocated'], record.col2, compare_fn) - endId = binarySearch(graph.rows['Allocated'], record.col3, compare_fn) + startId = binarySearch(graph.rows.Allocated, record.col2, compareFn); + endId = binarySearch(graph.rows.Allocated, record.col3, compareFn); } else { - startId = binarySearch(graph.rows, record.col2, compare_fn) - endId = binarySearch(graph.rows, record.col3, compare_fn) + startId = binarySearch(graph.rows, record.col2, compareFn); + endId = binarySearch(graph.rows, record.col3, compareFn); + } + let selection = []; + if (startId >= 0) { + selection.push(startId); + } + if (endId >= 0) { + selection.push(endId); } - let selection = [] - startId >= 0 && 
selection.push(startId) - endId >= 0 && selection.push(endId) chartObj.dispatchAction({ type: 'downplay', seriesName: 'Allocated', - dataIndex: selectedPoints.current - }) + dataIndex: selectedPoints.current, + }); chartObj.dispatchAction({ type: 'highlight', seriesName: 'Allocated', - dataIndex: selection - }) - selectedPoints.current = selection + dataIndex: selection, + }); + selectedPoints.current = selection; } else { chartObj.dispatchAction({ type: 'downplay', seriesName: 'Allocated', - dataIndex: selectedPoints.current - }) - selectedPoints.current = [] + dataIndex: selectedPoints.current, + }); + selectedPoints.current = []; } } - }, [graph, record, chartObj]) + }, [graph, record, chartObj]); - return ( -
- ) -} + return
; +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/PieChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/PieChart.tsx index 2c7ea1c1413ab932c226d1a919362a611a88d4ae..49c59ff02e91f7b7fe0d90ddff4239478ca19a0a 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/PieChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/PieChart.tsx @@ -19,83 +19,104 @@ * Modifications: Offer offline supporting. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { Graph } from '../../api' -import { value } from '../../utils' -import { useResizeEventDependency } from '../../utils/resize' -import * as echarts from 'echarts' +import * as React from 'react'; +import { Graph } from '../../api'; +import { value } from '../../utils'; +import { useResizeEventDependency } from '../../utils/resize'; +import * as echarts from 'echarts'; interface IProps { - graph: Graph - height?: number - top?: number - noLegend?: boolean - title?: string - colors?: Array - tooltip_mode?: string + graph: Graph; + height?: number; + top?: number; + noLegend?: boolean; + title?: string; + colors?: Array; + tooltipMode?: string; } -const noLegendArea = { left: '5%', width: '90%', top: '5%', height: '90%' } -const normalArea = { left: '5%', width: '95%' } -const noTitleArea = { left: '5%', width: '95%', top: '10%', height: '80%' } +interface IAreaPosition { + left: string; + width: string; + top?: string; + height?: string; +} + +const noLegendArea: IAreaPosition = { + left: '5%', + width: '90%', + top: '5%', + height: '90%', +}; +const normalArea: IAreaPosition = { left: '5%', width: '95%' }; +const noTitleArea: IAreaPosition = { + left: '5%', + width: '95%', + top: '10%', + height: '80%', +}; export const PieChart: React.FC = (props) => { - const { - graph, - height = 300, - 
top, - noLegend, - title, - colors, - tooltip_mode = 'both' - } = props - const graphRef = React.useRef(null) + const { graph, height = 300, top, noLegend, title, colors, tooltipMode = 'both' } = props; + const graphRef = React.useRef(null); - const [resizeEventDependency] = useResizeEventDependency() + const [resizeEventDependency] = useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return + const element = graphRef.current; + if (!element) { + return undefined; + } - const chart = echarts.init(element) + const chart = echarts.init(element); - let totalValue = 0 - const rowsWithUniqueName: Array<{ name: string, value: number }> = + let totalValue = 0; + const rowsWithUniqueName: Array<{ name: string; value: number }> = top === undefined ? graph.rows.map((item, index) => { - totalValue += item[1] as number - return { name: `${index}_${item[0]}`, value: item[1] as number } - }) - : graph.rows - .sort((a, b) => (value(b[1]) as number) - (value(a[1]) as number)) - .slice(0, top).map((item, index) => { - totalValue += item[1] as number - return { name: `${index}_${item[0]}`, value: item[1] as number } + totalValue += item[1] as number; + return { name: `${index}_${item[0]}`, value: item[1] as number }; }) + : graph.rows + .sort((a, b) => (value(b[1]) as number) - (value(a[1]) as number)) + .slice(0, top) + .map((item, index) => { + totalValue += item[1] as number; + return { name: `${index}_${item[0]}`, value: item[1] as number }; + }); const option: echarts.EChartsOption = { height, width: '100%', title: { - text: title + text: title, }, tooltip: { trigger: 'item', formatter: (data) => { - const typedData = data as echarts.DefaultLabelFormatterCallbackParams - const index = typedData.name.indexOf('_') - const safeName = typedData.name.replace(//g, '>') - return `${index > -1 ? safeName.slice(index + 1) : safeName}
${tooltip_mode === 'both' ? - typedData.value : ''}(${typedData.percent}%)` + const typedData = data as echarts.DefaultLabelFormatterCallbackParams; + const index = typedData.name.indexOf('_'); + const safeName = typedData.name.replace(//g, '>'); + return `${index > -1 ? safeName.slice(index + 1) : safeName}
${ + tooltipMode === 'both' ? typedData.value : '' + }(${typedData.percent}%)`; }, confine: true, extraCssText: `max-width: 300px; word-wrap:break-word; white-space:pre-wrap; - padding-right: 10px` + padding-right: 10px`, }, - chartArea: noLegend ? noLegendArea : !title ? noTitleArea : normalArea, + chartArea: ((): IAreaPosition => { + if (noLegend) { + return noLegendArea; + } + if (!title) { + return noTitleArea; + } else { + return normalArea; + } + })(), legend: { type: noLegend ? 'plain' : 'scroll', orient: 'vertical', @@ -104,24 +125,23 @@ export const PieChart: React.FC = (props) => { // Display at most 36 characters. formatter: (name) => { // Show legends for datas with the same name. - const index = name.indexOf('_') - if (index > -1) { - name = name.slice(index + 1) - } - return name.length > 36 ? name.slice(0, 34) + "..." : name; + const index = name.indexOf('_'); + const processedName = index > -1 ? name.slice(index + 1) : name; // Use a new variable for the processed name + return processedName.length > 36 ? `${processedName.slice(0, 34)}...` : processedName; }, tooltip: { show: true, triggerOn: 'mousemove', formatter: (data) => { - const currentItem = rowsWithUniqueName.find(item => item.name === data.name) - const index = data.name.indexOf('_') - const percent = ((currentItem?.value || 0) * 100 / totalValue).toFixed(2) - const safeName = data.name.replace(//g, '>') - return `${index > -1 ? safeName.slice(index + 1) : - safeName}
${tooltip_mode === 'both' ? (currentItem?.value || 0) : ''}(${percent}%)` - } - } + const currentItem = rowsWithUniqueName.find((item) => item.name === data.name); + const index = data.name.indexOf('_'); + const percent = (((currentItem?.value || 0) * 100) / totalValue).toFixed(2); + const safeName = data.name.replace(//g, '>'); + return `${index > -1 ? safeName.slice(index + 1) : safeName}
${ + tooltipMode === 'both' ? currentItem?.value || 0 : '' + }(${percent}%)`; + }, + }, }, sliceVisibilityThreshold: 0, colors, @@ -133,21 +153,21 @@ export const PieChart: React.FC = (props) => { label: { position: 'inside', formatter: `{d}%`, - color: '#ffffff' + color: '#ffffff', }, - data: rowsWithUniqueName - } - ] - } + data: rowsWithUniqueName, + }, + ], + }; - option && chart.setOption(option, true) + if (option) { + chart.setOption(option, true); + } return () => { - chart.dispose() - } - }, [graph, height, top, resizeEventDependency]) + chart.dispose(); + }; + }, [graph, height, top, resizeEventDependency]); - return ( -
- ) -} + return
; +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/SteppedAreaChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/SteppedAreaChart.tsx index bc38cc31747cd69e8fee7af4d55476f49bef9914..3e3b01ccb112aeb80795246bd6f3e2ad83aa2a66 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/SteppedAreaChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/SteppedAreaChart.tsx @@ -19,84 +19,88 @@ * Modifications: Offer offline supporting. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { StepedGraph } from '../../api' -import { useResizeEventDependency } from '../../utils/resize' -import * as echarts from 'echarts' +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; +import { StepedGraph } from '../../api'; +import { useResizeEventDependency } from '../../utils/resize'; +import * as echarts from 'echarts'; interface IProps { - graph: StepedGraph - height?: number - hAxisTitle?: string - vAxisTitle?: string + graph: StepedGraph; + height?: number; + hAxisTitle?: string; + vAxisTitle?: string; } const useStyles = makeStyles(() => ({ root: { - height: (props: Pick) => props.height - } -})) + height: (props: Pick): number | undefined => props.height, + }, +})); export const SteppedAreaChart: React.FC = (props) => { - const { graph, height = 400, hAxisTitle, vAxisTitle } = props - const classes = useStyles({ height }) - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() + const { graph, height = 400, hAxisTitle, vAxisTitle } = props; + const classes = useStyles({ height }); + const graphRef = React.useRef(null); + const [resizeEventDependency] = useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return + 
const element = graphRef.current; + if (!element) { + return undefined; + } - const chart = echarts.init(element) - const dataSource: Array> = [] - dataSource.push(graph.columns) + const chart = echarts.init(element); + const dataSource: Array> = []; + dataSource.push(graph.columns); graph.rows.forEach((row) => { - dataSource.push(row.map(item => item.value)) - }) + dataSource.push(row.map((item) => item.value)); + }); const options: echarts.EChartsOption = { title: { - text: graph.title + text: graph.title, }, legend: { - bottom: 0 + bottom: 0, }, xAxis: { type: 'category', name: hAxisTitle, axisLabel: { interval: 0, - } + }, }, yAxis: { type: 'value', - name: vAxisTitle + name: vAxisTitle, }, tooltip: { trigger: 'item', formatter: (params: any) => { - return graph.rows[params.dataIndex][params.seriesIndex + 1]?.tooltip || '' - } + return graph.rows[params.dataIndex][params.seriesIndex + 1]?.tooltip || ''; + }, }, dataset: { - source: dataSource + source: dataSource, }, series: Array(graph.columns.length - 1).fill({ type: 'bar', - stack: 'samesign' - }) - } + stack: 'samesign', + }), + }; - options && chart.setOption(options, true) + if (options) { + chart.setOption(options, true); + } return () => { - chart.dispose() - } - }, [graph, height, resizeEventDependency]) + chart.dispose(); + }; + }, [graph, height, resizeEventDependency]); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/TableChart.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/TableChart.tsx index 267624c85e02e30e047ff50e7d126259b765c83e..444b41b196c162340b846ac488d70eb908c7b717 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/TableChart.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/charts/TableChart.tsx @@ -2,56 +2,54 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import * as React from 'react' -import { Graph } from '../../api' -import { useResizeEventDependency } from '../../utils/resize' +import { makeStyles } from '@material-ui/core/styles'; +import * as React from 'react'; +import { Graph } from '../../api'; +import { useResizeEventDependency } from '../../utils/resize'; interface IProps { - graph: Graph - sortColumn?: number - height?: number - allowHtml?: boolean - setCellProperty?: ( - row: number, - column: number, - cb: (key: string, value: any) => void - ) => void + graph: Graph; + sortColumn?: number; + height?: number; + allowHtml?: boolean; + setCellProperty?: (row: number, column: number, cb: (key: string, value: any) => void) => void; } const useStyles = makeStyles(() => ({ root: { - height: (props: IProps) => props.height - } -})) + height: (props: IProps): number | undefined => props.height, + }, +})); export const TableChart: React.FC = (props) => { - const { graph, sortColumn, setCellProperty, allowHtml } = props - const classes = useStyles(props) - const graphRef = React.useRef(null) - const [resizeEventDependency] = useResizeEventDependency() + const { graph, sortColumn, setCellProperty, allowHtml } = props; + const classes = useStyles(props); + const graphRef = React.useRef(null); + const [resizeEventDependency] = 
useResizeEventDependency(); React.useLayoutEffect(() => { - const element = graphRef.current - if (!element) return + const element = graphRef.current; + if (!element || !element.parentElement) { + return; + } - const data = new google.visualization.DataTable() + const data = new google.visualization.DataTable(); graph.columns.forEach((column) => { data.addColumn({ type: column.type, label: column.name, role: column.role, - p: column.p - }) - }) - data.addRows(graph.rows) + p: column.p, + }); + }); + data.addRows(graph.rows); if (setCellProperty) { for (let row = 0; row < graph.rows.length; ++row) { for (let column = 0; column < graph.columns.length; ++column) { setCellProperty(row, column, (key: string, value: any) => { - data.setProperty(row, column, key, value) - }) + data.setProperty(row, column, key, value); + }); } } } @@ -64,24 +62,24 @@ export const TableChart: React.FC = (props) => { pageSize: 30, tooltip: { isHtml: true }, sortColumn: sortColumn, - sortAscending: false - } + sortAscending: false, + }; - const chart = new google.visualization.Table(element) + const chart = new google.visualization.Table(element); /* `chart.draw()` removes the contents of `element` and rebuilds it. This can cause a jump in the scroll position * if the height/width change to 0. Since we can't change the code of Google Charts, we temporarily lock the dims * of the parent container. */ if (element.offsetHeight > 0) { - element.parentElement!.style.height = element.offsetHeight + 'px' + element.parentElement.style.height = `${element.offsetHeight}px`; } - chart.draw(data, options) - element.parentElement!.style.height = '' - }, [graph, resizeEventDependency]) + chart.draw(data, options); + element.parentElement.style.height = ''; + }, [graph, resizeEventDependency]); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/helpers.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/helpers.tsx index b787a5e91976a7f8f5839978276b35cf2a900cab..bfbb346e4b3daf65247e6e954346ed7245993f31 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/helpers.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/helpers.tsx @@ -2,48 +2,40 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import { makeStyles } from '@material-ui/core/styles' -import Tooltip from '@material-ui/core/Tooltip' -import HelpOutline from '@material-ui/icons/HelpOutline' -import clsx from 'clsx' -import * as React from 'react' +import { makeStyles } from '@material-ui/core/styles'; +import Tooltip from '@material-ui/core/Tooltip'; +import HelpOutline from '@material-ui/icons/HelpOutline'; +import clsx from 'clsx'; +import * as React from 'react'; export const useTooltipCommonStyles = makeStyles((theme) => ({ tooltip: { maxWidth: '600px', whiteSpace: 'pre-wrap', - fontSize: '14px' + fontSize: '14px', }, cardTitle: { display: 'flex', - alignItems: 'center' + alignItems: 'center', }, titleText: { - marginRight: theme.spacing(0.5) + marginRight: theme.spacing(0.5), }, smallTitleText: { fontSize: '.8rem', - fontWeight: 'bold' - } -})) + fontWeight: 'bold', + }, +})); -export const makeChartHeaderRenderer = ( - classes: ReturnType, - smallTitleText = true -) => (title: string, tooltip: string) => { - return ( - - - {title} +export const makeChartHeaderRenderer = + (classes: ReturnType, smallTitleText = true) => + (title: string, tooltip: string): JSX.Element => { + return ( + + {title} + + + - - - - - ) -} + ); + }; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallFrameList.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallFrameList.tsx index 
1e2a385bb634b3988142ada0d947adbb46c99715..0334d29e511399664d5204224e47cf1b88d50655 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallFrameList.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallFrameList.tsx @@ -2,25 +2,25 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { CallStackFrame } from './transform' -import { List } from 'antd' -import { NavToCodeButton } from './NavToCodeButton' -import { makeStyles } from '@material-ui/core/styles' +import * as React from 'react'; +import { CallStackFrame } from './transform'; +import { List } from 'antd'; +import { NavToCodeButton } from './NavToCodeButton'; +import { makeStyles } from '@material-ui/core/styles'; interface IProps { - callFrames: CallStackFrame[] + callFrames: CallStackFrame[]; } const useStyles = makeStyles(() => ({ item: { paddingTop: '1px !important', - paddingBottom: '1px !important' - } -})) + paddingBottom: '1px !important', + }, +})); -export const CallFrameList = (props: IProps) => { - const classes = useStyles() +export const CallFrameList = (props: IProps): React.JSX.Element => { + const classes = useStyles(); const renderItem = React.useCallback( (item: CallStackFrame) => ( @@ -29,14 +29,7 @@ export const CallFrameList = (props: IProps) => { ), [classes.item] - ) + ); - return ( - - ) -} + return ; +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallStackTable.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallStackTable.tsx index 359d7c9028aaeb7497e0a8aa1baba8fa6d8768c1..c3176428d11b8b40c691947b2f0da8fc15674c16 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallStackTable.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/CallStackTable.tsx @@ -15,99 +15,89 @@ * WITHOUT WARRANTIES OR 
CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { makeStyles } from '@material-ui/core/styles' -import { CallStackTableData, OperationTableDataInner } from '../../api' -import { Table, TableProps } from 'antd' +import * as React from 'react'; +import { makeStyles } from '@material-ui/core/styles'; +import { CallStackTableData, OperationTableDataInner } from '../../api'; +import { Table, TableProps } from 'antd'; -import * as api from '../../api' -import { transformTableData, TransformedCallStackDataInner } from './transform' -import { attachId, getCommonOperationColumns } from './common' -import { OperationGroupBy } from '../../constants/groupBy' -import { makeExpandIcon } from './ExpandIcon' -import { CallFrameList } from './CallFrameList' +import * as api from '../../api'; +import { transformTableData, TransformedCallStackDataInner } from './transform'; +import { attachId, getCommonOperationColumns } from './common'; +import { OperationGroupBy } from '../../constants/groupBy'; +import { makeExpandIcon } from './ExpandIcon'; +import { CallFrameList } from './CallFrameList'; export interface IProps { - data: OperationTableDataInner - run: string - worker: string - span: string - groupBy: OperationGroupBy - deviceTarget: string + data: OperationTableDataInner; + run: string; + worker: string; + span: string; + groupBy: OperationGroupBy; + deviceTarget: string; } const useStyles = makeStyles((theme) => ({ tooltip: { - whiteSpace: 'pre-wrap' - } -})) + whiteSpace: 'pre-wrap', + }, +})); const expandIcon = makeExpandIcon( 'View call frames', (record) => !record.callStackFrames.length -) +); -const rowExpandable = (record: TransformedCallStackDataInner) => - 
!!record.callStackFrames.length -const expandedRowRender = (record: TransformedCallStackDataInner) => ( +const rowExpandable = (record: TransformedCallStackDataInner): boolean => !!record.callStackFrames.length; +const expandedRowRender = (record: TransformedCallStackDataInner): React.JSX.Element => ( -) +); -export const CallStackTable = (props: IProps) => { - const { data, run, worker, span, groupBy, deviceTarget } = props - const { name, input_shape } = data - const classes = useStyles(props) +export const CallStackTable = (props: IProps): React.JSX.Element => { + const { data, run, worker, span, groupBy, deviceTarget } = props; + const { name, input_shape } = data; + const classes = useStyles(props); - const [stackData, setStackData] = React.useState< - CallStackTableData | undefined - >(undefined) - const [tooltips, setTooltips] = React.useState() + const [stackData, setStackData] = React.useState(undefined); + const [tooltips, setTooltips] = React.useState(); React.useEffect(() => { - api.defaultApi - .operationStackGet(run, worker, span, groupBy, name, input_shape) - .then((resp) => { - setTooltips(resp.metadata.tooltips) - setStackData(resp.data) - }) - }, [name, input_shape, run, worker, span, groupBy]) + api.defaultApi.operationStackGet(run, worker, span, groupBy, name, input_shape).then((resp) => { + setTooltips(resp.metadata.tooltips); + setStackData(resp.data); + }); + }, [name, input_shape, run, worker, span, groupBy]); - const transformedData = React.useMemo( - () => stackData && transformTableData(attachId(stackData)), - [stackData] - ) + const transformedData = React.useMemo(() => stackData && transformTableData(attachId(stackData)), [stackData]); const columns = React.useMemo( - () => - transformedData && - getCommonOperationColumns(transformedData, deviceTarget, undefined, tooltips, classes), + () => transformedData && getCommonOperationColumns(transformedData, deviceTarget, undefined, tooltips, classes), [transformedData] - ) + ); - const 
expandIconColumnIndex = columns?.length + const expandIconColumnIndex = columns?.length; const expandable: TableProps['expandable'] = React.useMemo( () => ({ expandIconColumnIndex, expandIcon, expandedRowRender, - rowExpandable + rowExpandable, }), [expandIconColumnIndex] - ) + ); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/ExpandIcon.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/ExpandIcon.tsx index 68ff482827679d9c51c1ca0178b256dc5ae39581..422bb781630c24c6dc4915c3aed8c1f341dba363 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/ExpandIcon.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/ExpandIcon.tsx @@ -2,33 +2,34 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { Button, TableProps } from 'antd' -import { OperationTableDataInner, CallStackTableDataInner } from '../../api' -import { Arguments } from '../../utils/type' +import * as React from 'react'; +import { Button, TableProps } from 'antd'; +import { OperationTableDataInner, CallStackTableDataInner } from '../../api'; +import { Arguments } from '../../utils/type'; -type Types = NonNullable['expandable']>['expandIcon'] -type BasePropType = Arguments>>[0] -type PropType = BasePropType & { text: string; disabled?: boolean } +type Types = NonNullable['expandable']>['expandIcon']; +type BasePropType = Arguments>>[0]; +type PropType = BasePropType & { text: string; disabled?: boolean }; -export function ExpandIcon< - T extends OperationTableDataInner | CallStackTableDataInner ->(props: PropType) { - const onClick = (e: React.MouseEvent) => { - props.onExpand(props.record, e) - } +export function ExpandIcon( + props: PropType +): React.JSX.Element { + const onClick = (e: React.MouseEvent): void => { + props.onExpand(props.record, e); + }; return ( - - ) + ); } -export function makeExpandIcon< - T extends OperationTableDataInner | CallStackTableDataInner ->(text: string, disabled?: (v: T) => boolean) { - return (props: BasePropType) => ( +export function makeExpandIcon( + text: string, + disabled?: (v: T) => boolean +) { + 
return (props: BasePropType): React.JSX.Element => ( - ) + ); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/MemoryStatsTable.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/MemoryStatsTable.tsx index 0b33ab4167ba11e9bb610d7ebc0717def2addda2..c7e1809a3c0b58297ca99066243cf7d65fbe4c8c 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/MemoryStatsTable.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/MemoryStatsTable.tsx @@ -2,84 +2,76 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { Table } from 'antd' -import { makeStyles } from '@material-ui/core' +import * as React from 'react'; +import { Table } from 'antd'; +import { makeStyles } from '@material-ui/core'; export interface IProps { - data: any - sort: string + data: any; + sort: string; } const useStyles = makeStyles((theme) => ({ tooltip: { - whiteSpace: 'pre-wrap' - } -})) + whiteSpace: 'pre-wrap', + }, +})); -const getMemoryStatsTableColumns = function ( - columns: any, - sort: string, - tooltipClass: string -) { - let i = 0 - return columns.map(function (col: any) { - const key = 'col' + i++ - const stringCompare = (a: any, b: any) => a[key].localeCompare(b[key]) - const numberCompare = (a: any, b: any) => (a[key] || 0) - (b[key] || 0) +const getMemoryStatsTableColumns = function (columns: any, sort: string, tooltipClass: string): any { + let i = 0; + return columns.map((col: any) => { + const key = `col${i++}`; + const stringCompare = (a: any, b: any): number => a[key].localeCompare(b[key]); + const numberCompare = (a: any, b: any): number => (a[key] || 0) - (b[key] || 0); return { dataIndex: key, key: key, title: col.name, - sorter: col.type == 'string' ? stringCompare : numberCompare, - defaultSortOrder: sort == col.name ? 
('descend' as const) : undefined, - showSorterTooltip: col.tooltip - ? { title: col.tooltip, overlayClassName: tooltipClass } - : true - } - }) -} + sorter: col.type === 'string' ? stringCompare : numberCompare, + defaultSortOrder: sort === col.name ? ('descend' as const) : undefined, + showSorterTooltip: col.tooltip ? { title: col.tooltip, overlayClassName: tooltipClass } : true, + }; + }); +}; -const getMemoryStatsTableRows = function (rows: any) { - return rows.map(function (row: any) { - let i = 0 - const res: any = {} - row.forEach(function (entry: any) { - res['col' + i++] = entry - }) - return res - }) -} +const getMemoryStatsTableRows = function (rows: any): any { + return rows.map((row: any) => { + let i = 0; + const res: any = {}; + row.forEach((entry: any) => { + res[`col${i++}`] = entry; + }); + return res; + }); +}; -export const MemoryStatsTable = (props: IProps) => { - const { data, sort } = props - const classes = useStyles() +export const MemoryStatsTable = (props: IProps): React.JSX.Element => { + const { data, sort } = props; + const classes = useStyles(); - const rows = React.useMemo(() => getMemoryStatsTableRows(data.rows), [ - data.rows - ]) + const rows = React.useMemo(() => getMemoryStatsTableRows(data.rows), [data.rows]); const columns = React.useMemo( () => getMemoryStatsTableColumns(data.columns, sort, classes.tooltip), [data.columns, sort, classes.tooltip] - ) + ); - const [pageSize, setPageSize] = React.useState(30) - const onShowSizeChange = (current: number, size: number) => { - setPageSize(size) - } + const [pageSize, setPageSize] = React.useState(30); + const onShowSizeChange = (current: number, size: number): void => { + setPageSize(size); + }; return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/NavToCodeButton.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/NavToCodeButton.tsx index fb40e7f38bf5ccbe89851b5fe2d0b684af71239a..2c999aa12a49726aad12321f260b31b6f331eda2 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/NavToCodeButton.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/NavToCodeButton.tsx @@ -2,28 +2,28 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { CallStackFrame } from './transform' -import { Button } from 'antd' -import { navToCode } from '../../utils/vscode' +import * as React from 'react'; +import { CallStackFrame } from './transform'; +import { Button } from 'antd'; +import { navToCode } from '../../utils/vscode'; interface IProps { - frame: CallStackFrame + frame: CallStackFrame; } -export const NavToCodeButton = (props: IProps) => { - const { raw, line, file } = props.frame - const couldNavToFile = line && file +export const NavToCodeButton = (props: IProps): React.JSX.Element => { + const { raw, line, file } = props.frame; + const couldNavToFile = line && file; - const onClick = () => { + const onClick = (): void => { if (line && file) { - navToCode(file, line - 1) + navToCode(file, line - 1); } - } + }; return ( - - ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/OperationTable.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/OperationTable.tsx index 799b8497a04cce30dfc248b380bf477eab85909a..1ce77ee817967ee69961ccd8c91dbc3b0357bed7 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/OperationTable.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/OperationTable.tsx @@ -15,62 +15,55 @@ * WITHOUT WARRANTIES OR CONDITIONS 
OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { makeStyles } from '@material-ui/core/styles' -import { - OperationTableData, - OperationTableDataInner, - TableMetadata -} from '../../api' -import { OperationGroupBy } from '../../constants/groupBy' -import { attachId, getCommonOperationColumns } from './common' -import { Table, TablePaginationConfig, TableProps } from 'antd' -import { makeExpandIcon } from './ExpandIcon' -import { CallStackTable } from './CallStackTable' +import * as React from 'react'; +import { makeStyles } from '@material-ui/core/styles'; +import { OperationTableData, OperationTableDataInner, TableMetadata } from '../../api'; +import { OperationGroupBy } from '../../constants/groupBy'; +import { attachId, getCommonOperationColumns } from './common'; +import { Table, TableProps } from 'antd'; +import { makeExpandIcon } from './ExpandIcon'; +import { CallStackTable } from './CallStackTable'; export interface IProps { - data: OperationTableData - run: string - worker: string - span: string - groupBy: OperationGroupBy - sortColumn: string - tooltips?: any - deviceTarget: string + data: OperationTableData; + run: string; + worker: string; + span: string; + groupBy: OperationGroupBy; + sortColumn: string; + tooltips?: any; + deviceTarget: string; } const useStyles = makeStyles((theme) => ({ tooltip: { - whiteSpace: 'pre-wrap' - } -})) + whiteSpace: 'pre-wrap', + }, +})); -const rowExpandable = (record: OperationTableDataInner) => record.has_call_stack -const expandIcon = makeExpandIcon( - 'View CallStack', - (record) => !record.has_call_stack -) -export const OperationTable = (props: IProps) => { - const { data, run, worker, span, groupBy, sortColumn, 
tooltips, deviceTarget } = props - const classes = useStyles(props) +const rowExpandable = (record: OperationTableDataInner): boolean => record.has_call_stack; +const expandIcon = makeExpandIcon('View CallStack', (record) => !record.has_call_stack); +export const OperationTable = (props: IProps): React.JSX.Element => { + const { data, run, worker, span, groupBy, sortColumn, tooltips, deviceTarget } = props; + const classes = useStyles(props); - const rows = React.useMemo(() => attachId(data), [data]) + const rows = React.useMemo(() => attachId(data), [data]); const columns = React.useMemo( () => getCommonOperationColumns(rows, deviceTarget, sortColumn, tooltips, classes), [rows] - ) + ); - const [pageSize, setPageSize] = React.useState(30) - const onShowSizeChange = (current: number, size: number) => { - setPageSize(size) - } + const [pageSize, setPageSize] = React.useState(30); + const onShowSizeChange = (current: number, size: number): void => { + setPageSize(size); + }; - const expandIconColumnIndex = columns.length + const expandIconColumnIndex = columns.length; const expandedRowRender = React.useCallback( (record: OperationTableDataInner) => ( { /> ), [run, worker, span, groupBy] - ) + ); const expandable: TableProps['expandable'] = React.useMemo( () => ({ expandIconColumnIndex, expandIcon, expandedRowRender, - rowExpandable + rowExpandable, }), [expandIconColumnIndex, expandedRowRender] - ) + ); return (
- ) -} + ); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/common.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/common.tsx index a6f1770e7424539d916c01abef122808291d86a6..a84a1a3bb3ff96fd5df257af51bdcd302dc318e2 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/common.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/common.tsx @@ -15,147 +15,136 @@ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. - * + * * Modifications: Add visualization of PyTorch Ascend profiling. *--------------------------------------------------------------------------------------------*/ -import { firstOrUndefined, isDef } from '../../utils/def' -import { CallStackTableDataInner, OperationTableDataInner } from '../../api' -import type { ColumnsType } from 'antd/es/table' -import { ClassNameMap } from '@material-ui/styles' +import { firstOrUndefined, isDef } from '../../utils/def'; +import { CallStackTableDataInner, OperationTableDataInner } from '../../api'; +import type { ColumnsType } from 'antd/es/table'; +import { ClassNameMap } from '@material-ui/styles'; -export function getCommonOperationColumns< - T extends OperationTableDataInner | CallStackTableDataInner ->( - data: T[] | undefined, +export function getCommonOperationColumns( + data?: T[], deviceTarget?: string, defaultSort?: string, tooltips?: any, classes?: ClassNameMap<'tooltip'> ): ColumnsType { - const firstData = firstOrUndefined(data) + const firstData = firstOrUndefined(data); - const hasInputShape = !firstData || isDef(firstData.input_shape) - const hasDeviceSelfDuration = - !firstData || isDef(firstData.device_self_duration) - const hasDeviceTotalDuration = - !firstData || isDef(firstData.device_total_duration) - const hasTcEligible = !firstData || isDef(firstData.tc_eligible) - const 
hasTcSelfRatio = !firstData || isDef(firstData.tc_self_ratio) - const hasTcTotalRatio = !firstData || isDef(firstData.tc_total_ratio) + const hasInputShape = !firstData || isDef(firstData.input_shape); + const hasDeviceSelfDuration = !firstData || isDef(firstData.device_self_duration); + const hasDeviceTotalDuration = !firstData || isDef(firstData.device_total_duration); + const hasTcEligible = !firstData || isDef(firstData.tc_eligible); + const hasTcSelfRatio = !firstData || isDef(firstData.tc_self_ratio); + const hasTcTotalRatio = !firstData || isDef(firstData.tc_total_ratio); - const nameCompare = (a: T, b: T) => a.name.localeCompare(b.name) - const callsCompare = (a: T, b: T) => a.calls - b.calls - const deviceSelfDurationCompare = (a: T, b: T) => - (a.device_self_duration || 0) - (b.device_self_duration || 0) - const deviceTotalDurationCompare = (a: T, b: T) => - (a.device_total_duration || 0) - (b.device_total_duration || 0) - const hostSelfDurationCompare = (a: T, b: T) => - (a.host_self_duration || 0) - (b.host_self_duration || 0) - const hostTotalDurationCompare = (a: T, b: T) => - (a.host_total_duration || 0) - (b.host_total_duration || 0) - const tcEligibleCompare = (a: T, b: T) => - a.tc_eligible!.localeCompare(b.tc_eligible!) 
- const tcSelfRatioCompare = (a: T, b: T) => - (a.tc_self_ratio || 0) - (b.tc_self_ratio || 0) - const tcTotalRatioCompare = (a: T, b: T) => - (a.tc_total_ratio || 0) - (b.tc_total_ratio || 0) + const nameCompare = (a: T, b: T): number => a.name.localeCompare(b.name); + const callsCompare = (a: T, b: T): number => a.calls - b.calls; + const deviceSelfDurationCompare = (a: T, b: T): number => + (a.device_self_duration || 0) - (b.device_self_duration || 0); + const deviceTotalDurationCompare = (a: T, b: T): number => + (a.device_total_duration || 0) - (b.device_total_duration || 0); + const hostSelfDurationCompare = (a: T, b: T): number => (a.host_self_duration || 0) - (b.host_self_duration || 0); + const hostTotalDurationCompare = (a: T, b: T): number => (a.host_total_duration || 0) - (b.host_total_duration || 0); + const tcEligibleCompare = (a: T, b: T): number => (a.tc_eligible ?? '').localeCompare(b.tc_eligible ?? ''); + const tcSelfRatioCompare = (a: T, b: T): number => (a.tc_self_ratio || 0) - (b.tc_self_ratio || 0); + const tcTotalRatioCompare = (a: T, b: T): number => (a.tc_total_ratio || 0) - (b.tc_total_ratio || 0); const columns: ColumnsType = [ { dataIndex: 'name', key: 'name', title: 'Name', - sorter: nameCompare + sorter: nameCompare, }, hasInputShape ? { - dataIndex: 'input_shape', - key: 'input_shape', - title: 'Input Shape' - } + dataIndex: 'input_shape', + key: 'input_shape', + title: 'Input Shape', + } : undefined, { dataIndex: 'calls', sorter: callsCompare, key: 'calls', - title: 'Calls' + title: 'Calls', }, hasDeviceSelfDuration ? { - dataIndex: 'device_self_duration', - key: 'device_self_duration', - title: 'Device Self Duration (us)', - sorter: deviceSelfDurationCompare, - // Use device_self_duration as default sort if defaultSort is unspecified - defaultSortOrder: defaultSort ? 
undefined : ('descend' as const) - } + dataIndex: 'device_self_duration', + key: 'device_self_duration', + title: 'Device Self Duration (us)', + sorter: deviceSelfDurationCompare, + // Use device_self_duration as default sort if defaultSort is unspecified + defaultSortOrder: defaultSort ? undefined : ('descend' as const), + } : undefined, hasDeviceTotalDuration ? { - dataIndex: 'device_total_duration', - key: 'device_total_duration', - title: 'Device Total Duration (us)', - sorter: deviceTotalDurationCompare - } + dataIndex: 'device_total_duration', + key: 'device_total_duration', + title: 'Device Total Duration (us)', + sorter: deviceTotalDurationCompare, + } : undefined, { dataIndex: 'host_self_duration', key: 'host_self_duration', title: 'Host Self Duration (us)', - sorter: hostSelfDurationCompare + sorter: hostSelfDurationCompare, }, { dataIndex: 'host_total_duration', key: 'host_total_duration', title: 'Host Total Duration (us)', - sorter: hostTotalDurationCompare + sorter: hostTotalDurationCompare, }, hasTcEligible ? { - dataIndex: 'tc_eligible', - key: 'tc_eligible', - title: deviceTarget === 'Ascend' ? 'AI Cores Eligible' : 'Tensor Cores Eligible', - sorter: tcEligibleCompare - } + dataIndex: 'tc_eligible', + key: 'tc_eligible', + title: deviceTarget === 'Ascend' ? 'AI Cores Eligible' : 'Tensor Cores Eligible', + sorter: tcEligibleCompare, + } : undefined, hasTcSelfRatio ? { - dataIndex: 'tc_self_ratio', - key: 'tc_self_ratio', - title: deviceTarget === 'Ascend' ? 'AI Cores Self(%)' : 'Tensor Cores Self(%)', - sorter: tcSelfRatioCompare - } + dataIndex: 'tc_self_ratio', + key: 'tc_self_ratio', + title: deviceTarget === 'Ascend' ? 'AI Cores Self(%)' : 'Tensor Cores Self(%)', + sorter: tcSelfRatioCompare, + } : undefined, hasTcTotalRatio ? { - dataIndex: 'tc_total_ratio', - key: 'tc_total_ratio', - title: deviceTarget === 'Ascend' ? 
'AI Cores Total(%)' : 'Tensor Cores Total(%)', - sorter: tcTotalRatioCompare - } - : undefined - ].filter(isDef) + dataIndex: 'tc_total_ratio', + key: 'tc_total_ratio', + title: deviceTarget === 'Ascend' ? 'AI Cores Total(%)' : 'Tensor Cores Total(%)', + sorter: tcTotalRatioCompare, + } + : undefined, + ].filter(isDef); columns.forEach((column) => { - if (column.key == defaultSort) { - column.defaultSortOrder = 'descend' as const + if (column.key === defaultSort) { + column.defaultSortOrder = 'descend' as const; } if (tooltips[column.key as string]) { column.showSorterTooltip = { title: tooltips[column.key as string], - overlayClassName: classes?.tooltip - } + overlayClassName: classes?.tooltip, + }; } - }) - return columns + }); + return columns; } -let uid = 1 -export function attachId< - T extends CallStackTableDataInner | OperationTableDataInner ->(data: T[]): T[] { +let uid = 1; +export function attachId(data: T[]): T[] { return data.map((d) => ({ ...d, - key: uid++ - })) + key: uid++, + })); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/transform.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/transform.ts index bd051fd429d5cb26a44a59b60f776b207a861d64..5f59728feb30ef6d3230c3eec9803b08cdd72779 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/transform.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/tables/transform.ts @@ -2,49 +2,49 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import { CallStackTableData, CallStackTableDataInner } from '../../api' +import { CallStackTableData, CallStackTableDataInner } from '../../api'; export interface CallStackFrame { - file?: string - line?: number - raw: string + file?: string; + line?: number; + raw: string; } export interface TransformedCallStackDataInner extends CallStackTableDataInner { - callStackFrames: CallStackFrame[] + callStackFrames: CallStackFrame[]; } -const lineRegex = /\([0-9]+\)$/ +const lineRegex = /\([0-9]+\)$/; function parseCallStackLine(raw: string): CallStackFrame { - raw = raw.trim() - const results = raw.split(':') - const location = results.slice(0, results.length - 1).join(':') + let rawResult = raw.trim(); + const results = rawResult.split(':'); + const location = results.slice(0, results.length - 1).join(':'); - const result = lineRegex.exec(location) + const result = lineRegex.exec(location); if (!result) { - return { raw } + return { raw: rawResult }; } - const lineWithParens = result[0].trim() - const file = raw.slice(0, result.index).trim() + const lineWithParens = result[0].trim(); + const file = rawResult.slice(0, result.index).trim(); const line = Number( lineWithParens.substr(1, lineWithParens.length - 2).trim() - ) + ); return { - raw, + raw: rawResult, file, - line - } + line, + }; } -function parseCallStack(callStack: string | undefined): CallStackFrame[] { +function parseCallStack(callStack?: string): CallStackFrame[] { const lines = (callStack ?? 
'') .trim() .split(';') - .map((x) => x.trim()) - return lines.map(parseCallStackLine) + .map((x) => x.trim()); + return lines.map(parseCallStackLine); } function transformCallStackData( @@ -52,12 +52,12 @@ function transformCallStackData( ): TransformedCallStackDataInner { return { ...data, - callStackFrames: parseCallStack(data.call_stack) - } + callStackFrames: parseCallStack(data.call_stack), + }; } export function transformTableData( data: CallStackTableData ): TransformedCallStackDataInner[] { - return data.map(transformCallStackData) + return data.map(transformCallStackData); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/transform.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/transform.ts index 08dcb25a20daf1868cc4ff2ea6245f444330b93f..94ee9f384ebde3a3ddb057c88fc42beb69b0c908 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/components/transform.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/components/transform.ts @@ -2,81 +2,82 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import * as api from '../api' -import { assertDef, isDef } from '../utils/def' +import * as api from '../api'; +import { assertDef, isDef } from '../utils/def'; -export function transformPerformanceIntoTable( - performances: api.Performance[] -): api.Graph { +export function transformPerformanceIntoTable(performances: api.Performance[]): api.Graph { const columns: api.GraphColumn[] = [ { type: 'string', name: 'Category' }, { type: 'number', name: 'Time Duration (us)' }, - { type: 'number', name: 'Percentage (%)' } - ] + { type: 'number', name: 'Percentage (%)' }, + ]; - const rows: api.Graph['rows'] = [] - const queue = [...performances] + const rows: api.Graph['rows'] = []; + const queue = [...performances]; while (queue.length) { - const first = queue.shift() - assertDef(first) + const first = queue.shift(); + assertDef(first); - const row: api.Graph['rows'][number] = [] - const { name, value, extra, children } = first - assertDef(value) - assertDef(extra) + const row: api.Graph['rows'][number] = []; + const { name, value, extra, children } = first; + assertDef(value); + assertDef(extra); - row.push(name) - row.push(value) - row.push(extra) + row.push(name); + row.push(value); + row.push(extra); if (isDef(children) && children.length) { - queue.push(...children) + queue.push(...children); } - rows.push(row) + rows.push(row); } return { columns, - rows - } + rows, + }; } -export function transformPerformanceIntoPie(performances: api.Performance[]) { +export function transformPerformanceIntoPie(performances: api.Performance[]): { + columns: api.GraphColumn[]; + rows: Array>; +} { const columns: api.GraphColumn[] = [ { type: 'string', name: 'Name' }, - { type: 'number', name: 'Value' } - ] + { type: 'number', name: 'Value' }, + ]; - const rows: api.Graph['rows'] = [] - const queue: api.Performance[] = [] + const rows: api.Graph['rows'] = []; + const queue: 
api.Performance[] = []; performances.forEach((topLevel) => { if (topLevel.children) { - queue.push(...topLevel.children) + queue.push(...topLevel.children); } - }) + }); while (queue.length) { - const first = queue.shift() - assertDef(first) + const first = queue.shift(); + assertDef(first); - const row: api.Graph['rows'][number] = [] - const { name, value, children } = first - assertDef(value) + const row: api.Graph['rows'][number] = []; + const { name, value, children } = first; + assertDef(value); - row.push(name) - row.push(Number.parseInt(value, 10)) + row.push(name); + row.push(Number.parseInt(value, 10)); if (isDef(children) && children.length) { - queue.push(...children) + queue.push(...children); } - rows.push(row) + rows.push(row); } return { columns, - rows - } + rows, + }; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/constants/groupBy.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/constants/groupBy.ts index 2b96c6b8dd3a0f1127f2617b72934d65c89f01f0..88ea9e3f42adfecd2a829384cc78b7ddc88d11aa 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/constants/groupBy.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/constants/groupBy.ts @@ -3,11 +3,11 @@ *--------------------------------------------------------------------------------------------*/ export enum OperationGroupBy { - Operation = 'Operation', - OperationAndInputShape = 'OperationAndInputShape' + OPERATION = 'Operation', + OPERATION_AND_INPUT_SHAPE = 'OperationAndInputShape', } export enum KernelGroupBy { - Kernel = 'Kernel', - KernelNameAndOpName = 'KernelNameAndOpName' + KERNEL = 'Kernel', + KERNEL_NAME_AND_OP_NAME = 'KernelNameAndOpName', } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/gstatic.d.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/gstatic.d.ts index 646255c2cdc20595fc0166b8cd5ce4743549bd2c..521c5fbb8d985136529d8233f8a65dffb8acca95 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/gstatic.d.ts +++ 
b/plugins/tensorboard-plugins/tb_plugin/fe/src/gstatic.d.ts @@ -2,5 +2,5 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -declare const google: any -declare module 'react-flame-graph' +declare const google: any; +declare module 'react-flame-graph'; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/index.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/index.tsx index 224f37a5fd066414815caf9e83b15298364fd2bd..851474766de5d9adee682e66ed752c85ffd6d4bf 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/index.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/index.tsx @@ -2,9 +2,9 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { render } from 'react-dom' -import { App } from './app' -import 'antd/dist/antd.css' +import * as React from 'react'; +import { render } from 'react-dom'; +import { App } from './app'; +import 'antd/dist/antd.css'; -render(<App />, document.getElementById('app')) +render(<App />, document.getElementById('app')); diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/setup.tsx b/plugins/tensorboard-plugins/tb_plugin/fe/src/setup.tsx index 5db44e8243119c7988ef33007e2eb3134fe6e857..c811ae1524ec7cc6f82410e8aeb999f2ea22476b 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/setup.tsx +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/setup.tsx @@ -2,8 +2,8 @@ * Copyright (c) Microsoft Corporation. All rights reserved.
*--------------------------------------------------------------------------------------------*/ -export async function setup() { +export async function setup(): Promise<void> { await google.charts.load('current', { - packages: ['corechart', 'table', 'timeline'] - }) + packages: ['corechart', 'table', 'timeline'], + }); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/binarysearch.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/binarysearch.ts index 0477cac74d0b0d6836b53f18689891feb2f10cea..41382dcdb7acc8cb9e2b1b4f856e1855fb7ed88f 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/binarysearch.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/binarysearch.ts @@ -1,20 +1,20 @@ export function binarySearch( arr: Array, key: any, - compare_fn: Function + compareFn: (key: number, mid: Array) => number ): number { - let low = 0, - high = arr.length - 1 + let low = 0; + let high = arr.length - 1; while (low <= high) { - let mid = Math.round((high + low) / 2) - let cmp = compare_fn(key, arr[mid]) + let mid = Math.round((high + low) / 2); + let cmp = compareFn(key, arr[mid]); if (cmp > 0) { - low = mid + 1 + low = mid + 1; } else if (cmp < 0) { - high = mid - 1 + high = mid - 1; } else { - return mid + return mid; } } - return -1 + return -1; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/debounce.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/debounce.ts index fcd6368e6ac9e971c85267fe5e6ccc9781235c9e..82c7f04a98b788ab2c7c7647c292f163b8a92783 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/debounce.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/debounce.ts @@ -2,20 +2,20 @@ * Copyright (c) Microsoft Corporation. All rights reserved.
*--------------------------------------------------------------------------------------------*/ -import * as React from 'react' +import * as React from 'react'; export function useDebounce(value: T, delay: number): T { - const [debouncedValue, setDebouncedValue] = React.useState(value) + const [debouncedValue, setDebouncedValue] = React.useState(value); React.useEffect(() => { const handler = setTimeout(() => { - setDebouncedValue(value) - }, delay) + setDebouncedValue(value); + }, delay); return () => { - clearTimeout(handler) - } - }, [value, delay]) + clearTimeout(handler); + }; + }, [value, delay]); - return debouncedValue + return debouncedValue; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/def.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/def.ts index c024293a54e18e543c331226c317713f829c5c10..df6bef8eab076d13c0785902127f46a472ff9fa6 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/def.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/def.ts @@ -2,17 +2,19 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -export function isDef<T>(v: T | undefined | null): v is T { - return v !== null && v !== undefined +export function isDef<T>(v?: T | null): v is T { + return v !== null && v !== undefined; } -export function assertDef<T>(v: T | undefined | null): asserts v is T { +export function assertDef<T>(v?: T | null): asserts v is T { if (!isDef(v)) { - throw new Error('Must be defined') + throw new Error('Must be defined'); } } -export function firstOrUndefined<T>(v: T[] | undefined): T | undefined { - if (!v || !v.length) return undefined - return v[0] +export function firstOrUndefined<T>(v?: T[]): T | undefined { + if (!v || !v.length) { + return undefined; + } + return v[0]; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/hooks.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/hooks.ts index d8dd3eff536eb5e22683debe4338e785fe630616..473b393d9fa270438be85a7b528d78107c5f87f5 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/hooks.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/hooks.ts @@ -2,26 +2,26 @@ * Copyright (c) Microsoft Corporation. All rights reserved.
*--------------------------------------------------------------------------------------------*/ -import * as React from 'react' +import * as React from 'react'; -const cbs: (() => void)[] = [] -export const useOnResize = (cb: () => void) => { +const cbs: Array<() => void> = []; +export const useOnResize = (cb: () => void): void => { React.useEffect(() => { if (cbs.length === 0) { window.addEventListener('resize', () => { - cbs.forEach((cb) => cb()) - }) + cbs.forEach((callback) => callback()); + }); } - cbs.push(cb) + cbs.push(cb); - return () => { - const idx = cbs.findIndex(cb) + return (): void => { + const idx = cbs.findIndex(cb); if (idx > -1) { - cbs.splice(idx, 1) + cbs.splice(idx, 1); } if (cbs.length === 0) { - window.removeEventListener('reset', cb) + window.removeEventListener('reset', cb); } - } - }, [cb]) -} + }; + }, [cb]); +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/index.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/index.ts index 1c7074b4c2002c40dc0b3f2f3da88d9a2b783a5f..5da446721e9d1cac3729d8aea03bca2615031f41 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/index.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/index.ts @@ -2,23 +2,23 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import { ValueAndFormat } from '../api' +import { ValueAndFormat } from '../api'; -export function firstOrUndefined(v: T[] | undefined | null): T | undefined { - if (!v || !v.length) return undefined - return v[0] +export function firstOrUndefined(v?: T[] | null): T | undefined { + if (!v || !v.length) { + return undefined; + } + return v[0]; } -export function sleep(delay: number) { - return new Promise((resolve) => setTimeout(resolve, delay)) +export function sleep(delay: number): Promise { + return new Promise((resolve) => setTimeout(resolve, delay)); } export function isValueAndFormat(v: any): v is ValueAndFormat { - return 'f' in v && 'v' in v + return 'f' in v && 'v' in v; } -export function value( - v: boolean | number | string | ValueAndFormat -): boolean | number | string { - return typeof v === 'object' && isValueAndFormat(v) ? v.v : v +export function value(v: boolean | number | string | ValueAndFormat): boolean | number | string { + return typeof v === 'object' && isValueAndFormat(v) ? v.v : v; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/resize.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/resize.ts index 57ab394042651fcddb7a48cfa158647d2e6b9faa..766a10d54143fecd637b1d0dff33db17f22bee0d 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/resize.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/resize.ts @@ -2,26 +2,26 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import debounce from '@material-ui/core/utils/debounce' +import * as React from 'react'; +import debounce from '@material-ui/core/utils/debounce'; -export function useResizeEventDependency() { - const [version, setVersion] = React.useState(0) +export function useResizeEventDependency(): readonly [number] { + const [version, setVersion] = React.useState(0); const increaseVersion = React.useCallback( debounce(() => { - setVersion((prev) => prev + 1) + setVersion((prev) => prev + 1); }, 100), [] - ) + ); React.useEffect(() => { - window.addEventListener('resize', increaseVersion) + window.addEventListener('resize', increaseVersion); - return () => { - window.removeEventListener('resize', increaseVersion) - } - }, []) + return (): void => { + window.removeEventListener('resize', increaseVersion); + }; + }, []); - return [version] as const + return [version] as const; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/search.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/search.ts index 36689758752625b6c249c5fd532d93c9e5fbafb4..8a2efc36ddf505aee50171affd722bd5ef0a5b86 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/search.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/search.ts @@ -2,65 +2,67 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import * as React from 'react' -import { value } from '.' 
-import * as api from '../api' -import { useDebounce } from './debounce' +import * as React from 'react'; +import { value } from '.'; +import * as api from '../api'; +import { useDebounce } from './debounce'; export function useSearch( searchName: string, columnName: string, - table: api.Graph | undefined + table?: api.Graph ): [api.Graph | undefined] { - const searchNameDebounce = useDebounce(searchName.trim(), 500) + const searchNameDebounce = useDebounce(searchName.trim(), 500); const searchedTable: api.Graph | undefined = React.useMemo(() => { if (!searchNameDebounce) { - return table + return table; } if (!table) { - return undefined + return undefined; } - const columnNameToFind = columnName.toLowerCase() + const columnNameToFind = columnName.toLowerCase(); const nameColumnIdx = table.columns.findIndex( (c) => c.name.toLowerCase() === columnNameToFind - ) + ); if (nameColumnIdx < 0) { - return table + return table; } return { ...table, rows: table.rows.filter((x) => { - const cell = value(x[nameColumnIdx]) - return typeof cell === 'string' && cell.includes(searchNameDebounce) - }) - } - }, [table, searchNameDebounce]) - return [searchedTable] + const cell = value(x[nameColumnIdx]); + return typeof cell === 'string' && cell.includes(searchNameDebounce); + }), + }; + }, [table, searchNameDebounce]); + return [searchedTable]; } export function useSearchDirectly( searchName: string, field: (v: T) => string, - table: T[] | undefined + table?: T[] ): [T[] | undefined] { - const searchNameDebounce = useDebounce(searchName.trim(), 500) + const searchNameDebounce = useDebounce(searchName.trim(), 500); const result = React.useMemo(() => { if (!searchNameDebounce) { - return table + return table; } if (!table) { - return undefined + return undefined; } return table.filter((row) => { - return field(row).toLowerCase().includes(searchNameDebounce.toLowerCase()) - }) - }, [table, field, searchNameDebounce]) - return [result] + return field(row) + .toLowerCase() + 
.includes(searchNameDebounce.toLowerCase()); + }); + }, [table, field, searchNameDebounce]); + return [result]; } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/top.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/top.ts index 87bd3c1b86f763a63dbf195ee5feaf649d56e006..4af19968d637d6c13bf64caa94f09fff104f6091 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/top.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/top.ts @@ -2,49 +2,53 @@ * Copyright (c) Microsoft Corporation. All rights reserved. *--------------------------------------------------------------------------------------------*/ -import debounce from '@material-ui/core/utils/debounce' -import * as React from 'react' +import debounce from '@material-ui/core/utils/debounce'; +import * as React from 'react'; export enum UseTop { - NotUse = 'NotUse', - Use = 'Use' + NOT_USE = 'NotUse', + USE = 'Use', } interface IOptions { - defaultTop?: number - defaultUseTop?: UseTop - noDebounce?: boolean - wait?: number + defaultTop?: number; + defaultUseTop?: UseTop; + noDebounce?: boolean; + wait?: number; } -export function useTopN(options?: IOptions) { - options ??= {} - - const [topText, setTopText] = React.useState(String(options.defaultTop ?? 15)) - const [actualTop, setActualTop] = React.useState( - Number(topText) - ) - const [useTop, setUseTop] = React.useState( - options.defaultUseTop ?? UseTop.NotUse - ) - - const setActualDebounce = !options.noDebounce - ? React.useCallback(debounce(setActualTop, options.wait ?? 500), []) - : setActualTop +export function useTopN( + options?: IOptions +): readonly [ + string, + number | undefined, + UseTop, + React.Dispatch>, + React.Dispatch> +] { + let realOptions = options ?? {}; + + const [topText, setTopText] = React.useState(String(realOptions.defaultTop ?? 15)); + const [actualTop, setActualTop] = React.useState(Number(topText)); + const [useTop, setUseTop] = React.useState(realOptions.defaultUseTop ?? 
UseTop.NOT_USE); + + const setActualDebounce = !realOptions.noDebounce + ? React.useCallback(debounce(setActualTop, realOptions.wait ?? 500), []) + : setActualTop; React.useEffect(() => { - if (useTop !== UseTop.Use) { - setActualDebounce(undefined) + if (useTop !== UseTop.USE) { + setActualDebounce(undefined); } else if (topIsValid(topText)) { - setActualDebounce(Number(topText)) + setActualDebounce(Number(topText)); } else { - setActualDebounce(actualTop) + setActualDebounce(actualTop); } - }, [topText, useTop]) + }, [topText, useTop]); - return [topText, actualTop, useTop, setTopText, setUseTop] as const + return [topText, actualTop, useTop, setTopText, setUseTop] as const; } -export function topIsValid(topText: string) { - const top = Number(topText) - return !Number.isNaN(top) && top > 0 && Number.isInteger(top) +export function topIsValid(topText: string): boolean { + const top = Number(topText); + return !Number.isNaN(top) && top > 0 && Number.isInteger(top); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/type.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/type.ts index fde74bc598b930f26dd8a83157c91953da2c045c..ccd45fd16e11043abe40a4235a7b39a5d18afcdd 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/type.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/type.ts @@ -6,4 +6,4 @@ export type Arguments void> = T extends ( ...args: infer A ) => void ? A - : never + : never; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/vscode.ts b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/vscode.ts index 62f1a90809548691f3b7b7a89d71ac65e4bf622b..2a763adca54ef3eba96837aa111df627e3f8b116 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/vscode.ts +++ b/plugins/tensorboard-plugins/tb_plugin/fe/src/utils/vscode.ts @@ -2,12 +2,12 @@ * Copyright (c) Microsoft Corporation. All rights reserved. 
*--------------------------------------------------------------------------------------------*/ -export function navToCode(filename: string, line: number) { +export function navToCode(filename: string, line: number): void { window.parent.parent.postMessage( { filename, - line + line, }, - '*' - ) + window.origin + ); } diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/update-static.js b/plugins/tensorboard-plugins/tb_plugin/fe/update-static.js index 9923c216781c4cfd3505bdc4cb99a736b1bc61a1..67c9be6ccc266ca2470705ad7bb990e550769e96 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/update-static.js +++ b/plugins/tensorboard-plugins/tb_plugin/fe/update-static.js @@ -1,7 +1,7 @@ -const fs = require('fs') -const path = require('path') +const fs = require('fs'); +const path = require('path'); fs.copyFileSync( path.resolve(__dirname, 'dist/index.html'), path.resolve(__dirname, '../torch_tb_profiler/static/index.html') -) +); diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/webpack.config.js b/plugins/tensorboard-plugins/tb_plugin/fe/webpack.config.js index 70541ae9cff81eccfd33a8edd2b2a8424edf5a4b..a47f8b319e83a9c96c80c11afe5adf09e308fbfa 100644 --- a/plugins/tensorboard-plugins/tb_plugin/fe/webpack.config.js +++ b/plugins/tensorboard-plugins/tb_plugin/fe/webpack.config.js @@ -1,8 +1,8 @@ -const path = require('path') -const HtmlWebpackPlugin = require('html-webpack-plugin') -const InlineChunkHtmlPlugin = require('inline-chunk-html-plugin') +const path = require('path'); +const HtmlWebpackPlugin = require('html-webpack-plugin'); +const InlineChunkHtmlPlugin = require('inline-chunk-html-plugin'); -const isDev = process.env.NODE_ENV !== 'production' +const isDev = process.env.NODE_ENV !== 'production'; /** * @type {import('webpack').Configuration & import('webpack-dev-server').Configuration} @@ -12,25 +12,25 @@ module.exports = { entry: './src/index.tsx', output: { path: path.resolve(__dirname, 'dist'), - filename: 'index.js' + filename: 'index.js', }, resolve: 
{ // Add `.ts` and `.tsx` as a resolvable extension. - extensions: ['.ts', '.tsx', '.js'] + extensions: ['.ts', '.tsx', '.js'], }, module: { rules: [ { test: /\.tsx?$/i, use: 'ts-loader' }, - { test: /\.css$/i, use: ['style-loader', 'css-loader'] } - ] + { test: /\.css$/i, use: ['style-loader', 'css-loader'] }, + ], }, plugins: [ new HtmlWebpackPlugin({ inject: true, scriptLoading: 'blocking', - template: 'index.html' + template: 'index.html', }), - !isDev ? new InlineChunkHtmlPlugin(HtmlWebpackPlugin, [/.*/]) : undefined + !isDev ? new InlineChunkHtmlPlugin(HtmlWebpackPlugin, [/.*/]) : undefined, ].filter(Boolean), - devServer: {} -} + devServer: {}, +}; diff --git a/plugins/tensorboard-plugins/tb_plugin/fe/yarn.lock b/plugins/tensorboard-plugins/tb_plugin/fe/yarn.lock deleted file mode 100644 index 3e914db864c7654443e9041cfc1899ea2ac30bb1..0000000000000000000000000000000000000000 --- a/plugins/tensorboard-plugins/tb_plugin/fe/yarn.lock +++ /dev/null @@ -1,3672 +0,0 @@ -# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY. 
-# yarn lockfile v1 - - -"@ant-design/colors@^6.0.0": - version "6.0.0" - resolved "https://registry.yarnpkg.com/@ant-design/colors/-/colors-6.0.0.tgz#9b9366257cffcc47db42b9d0203bb592c13c0298" - integrity sha512-qAZRvPzfdWHtfameEGP2Qvuf838NhergR35o+EuVyB5XvSA98xod5r4utvi4TJ3ywmevm290g9nsCG5MryrdWQ== - dependencies: - "@ctrl/tinycolor" "^3.4.0" - -"@ant-design/icons-svg@^4.2.1": - version "4.2.1" - resolved "https://registry.yarnpkg.com/@ant-design/icons-svg/-/icons-svg-4.2.1.tgz#8630da8eb4471a4aabdaed7d1ff6a97dcb2cf05a" - integrity sha512-EB0iwlKDGpG93hW8f85CTJTs4SvMX7tt5ceupvhALp1IF44SeUFOMhKUOYqpsoYWQKAOuTRDMqn75rEaKDp0Xw== - -"@ant-design/icons@^4.7.0": - version "4.7.0" - resolved "https://registry.yarnpkg.com/@ant-design/icons/-/icons-4.7.0.tgz#8c3cbe0a556ba92af5dc7d1e70c0b25b5179af0f" - integrity sha512-aoB4Z7JA431rt6d4u+8xcNPPCrdufSRMUOpxa1ab6mz1JCQZOEVolj2WVs/tDFmN62zzK30mNelEsprLYsSF3g== - dependencies: - "@ant-design/colors" "^6.0.0" - "@ant-design/icons-svg" "^4.2.1" - "@babel/runtime" "^7.11.2" - classnames "^2.2.6" - rc-util "^5.9.4" - -"@ant-design/react-slick@~0.28.1": - version "0.28.4" - resolved "https://registry.yarnpkg.com/@ant-design/react-slick/-/react-slick-0.28.4.tgz#8b296b87ad7c7ae877f2a527b81b7eebd9dd29a9" - integrity sha512-j9eAHTn7GxbXUFNknJoHS2ceAsqrQi2j8XykjZE1IXCD8kJF+t28EvhBLniDpbOsBk/3kjalnhriTfZcjBHNqg== - dependencies: - "@babel/runtime" "^7.10.4" - classnames "^2.2.5" - json2mq "^0.2.0" - lodash "^4.17.21" - resize-observer-polyfill "^1.5.0" - -"@babel/runtime@^7.0.0", "@babel/runtime@^7.10.1", "@babel/runtime@^7.10.2", "@babel/runtime@^7.10.4", "@babel/runtime@^7.11.1", "@babel/runtime@^7.11.2", "@babel/runtime@^7.12.5", "@babel/runtime@^7.13.10", "@babel/runtime@^7.3.1", "@babel/runtime@^7.4.4", "@babel/runtime@^7.5.5", "@babel/runtime@^7.8.3", "@babel/runtime@^7.8.4", "@babel/runtime@^7.8.7": - version "7.17.2" - resolved 
"https://registry.yarnpkg.com/@babel/runtime/-/runtime-7.17.2.tgz#66f68591605e59da47523c631416b18508779941" - integrity sha512-hzeyJyMA1YGdJTuWU0e/j4wKXrU4OMFvY2MSlaI9B7VQb0r5cxTE3EAIS2Q7Tn2RIcDkRvTA/v2JsAEhxe99uw== - dependencies: - regenerator-runtime "^0.13.4" - -"@ctrl/tinycolor@^3.4.0": - version "3.4.0" - resolved "https://registry.yarnpkg.com/@ctrl/tinycolor/-/tinycolor-3.4.0.tgz#c3c5ae543c897caa9c2a68630bed355be5f9990f" - integrity sha512-JZButFdZ1+/xAfpguQHoabIXkcqRRKpMrWKBkpEZZyxfY9C1DpADFB8PEqGSTeFr135SaTRfKqGKx5xSCLI7ZQ== - -"@discoveryjs/json-ext@^0.5.0": - version "0.5.6" - resolved "https://registry.yarnpkg.com/@discoveryjs/json-ext/-/json-ext-0.5.6.tgz#d5e0706cf8c6acd8c6032f8d54070af261bbbb2f" - integrity sha512-ws57AidsDvREKrZKYffXddNkyaF14iHNHm8VQnZH6t99E8gczjNN0GpvcGny0imC80yQ0tHz1xVUKk/KFQSUyA== - -"@emotion/hash@^0.8.0": - version "0.8.0" - resolved "https://registry.yarnpkg.com/@emotion/hash/-/hash-0.8.0.tgz#bbbff68978fefdbe68ccb533bc8cbe1d1afb5413" - integrity sha512-kBJtf7PH6aWwZ6fka3zQ0p6SBYzx4fl1LoZXE2RrnYST9Xljm7WfKJrU4g/Xr3Beg72MLrp1AWNUmuYJTL7Cow== - -"@material-ui/core@^4.11.3": - version "4.12.3" - resolved "https://registry.yarnpkg.com/@material-ui/core/-/core-4.12.3.tgz#80d665caf0f1f034e52355c5450c0e38b099d3ca" - integrity sha512-sdpgI/PL56QVsEJldwEe4FFaFTLUqN+rd7sSZiRCdx2E/C7z5yK0y/khAWVBH24tXwto7I1hCzNWfJGZIYJKnw== - dependencies: - "@babel/runtime" "^7.4.4" - "@material-ui/styles" "^4.11.4" - "@material-ui/system" "^4.12.1" - "@material-ui/types" "5.1.0" - "@material-ui/utils" "^4.11.2" - "@types/react-transition-group" "^4.2.0" - clsx "^1.0.4" - hoist-non-react-statics "^3.3.2" - popper.js "1.16.1-lts" - prop-types "^15.7.2" - react-is "^16.8.0 || ^17.0.0" - react-transition-group "^4.4.0" - -"@material-ui/icons@^4.11.2": - version "4.11.2" - resolved "https://registry.yarnpkg.com/@material-ui/icons/-/icons-4.11.2.tgz#b3a7353266519cd743b6461ae9fdfcb1b25eb4c5" - integrity 
sha512-fQNsKX2TxBmqIGJCSi3tGTO/gZ+eJgWmMJkgDiOfyNaunNaxcklJQFaFogYcFl0qFuaEz1qaXYXboa/bUXVSOQ== - dependencies: - "@babel/runtime" "^7.4.4" - -"@material-ui/styles@^4.11.4": - version "4.11.4" - resolved "https://registry.yarnpkg.com/@material-ui/styles/-/styles-4.11.4.tgz#eb9dfccfcc2d208243d986457dff025497afa00d" - integrity sha512-KNTIZcnj/zprG5LW0Sao7zw+yG3O35pviHzejMdcSGCdWbiO8qzRgOYL8JAxAsWBKOKYwVZxXtHWaB5T2Kvxew== - dependencies: - "@babel/runtime" "^7.4.4" - "@emotion/hash" "^0.8.0" - "@material-ui/types" "5.1.0" - "@material-ui/utils" "^4.11.2" - clsx "^1.0.4" - csstype "^2.5.2" - hoist-non-react-statics "^3.3.2" - jss "^10.5.1" - jss-plugin-camel-case "^10.5.1" - jss-plugin-default-unit "^10.5.1" - jss-plugin-global "^10.5.1" - jss-plugin-nested "^10.5.1" - jss-plugin-props-sort "^10.5.1" - jss-plugin-rule-value-function "^10.5.1" - jss-plugin-vendor-prefixer "^10.5.1" - prop-types "^15.7.2" - -"@material-ui/system@^4.12.1": - version "4.12.1" - resolved "https://registry.yarnpkg.com/@material-ui/system/-/system-4.12.1.tgz#2dd96c243f8c0a331b2bb6d46efd7771a399707c" - integrity sha512-lUdzs4q9kEXZGhbN7BptyiS1rLNHe6kG9o8Y307HCvF4sQxbCgpL2qi+gUk+yI8a2DNk48gISEQxoxpgph0xIw== - dependencies: - "@babel/runtime" "^7.4.4" - "@material-ui/utils" "^4.11.2" - csstype "^2.5.2" - prop-types "^15.7.2" - -"@material-ui/types@5.1.0": - version "5.1.0" - resolved "https://registry.yarnpkg.com/@material-ui/types/-/types-5.1.0.tgz#efa1c7a0b0eaa4c7c87ac0390445f0f88b0d88f2" - integrity sha512-7cqRjrY50b8QzRSYyhSpx4WRw2YuO0KKIGQEVk5J8uoz2BanawykgZGoWEqKm7pVIbzFDN0SpPcVV4IhOFkl8A== - -"@material-ui/utils@^4.11.2": - version "4.11.2" - resolved "https://registry.yarnpkg.com/@material-ui/utils/-/utils-4.11.2.tgz#f1aefa7e7dff2ebcb97d31de51aecab1bb57540a" - integrity sha512-Uul8w38u+PICe2Fg2pDKCaIG7kOyhowZ9vjiC1FsVwPABTW8vPPKfF6OvxRq3IiBaI1faOJmgdvMG7rMJARBhA== - dependencies: - "@babel/runtime" "^7.4.4" - prop-types "^15.7.2" - react-is "^16.8.0 || ^17.0.0" - 
-"@nodelib/fs.scandir@2.1.5": - version "2.1.5" - resolved "https://registry.yarnpkg.com/@nodelib/fs.scandir/-/fs.scandir-2.1.5.tgz#7619c2eb21b25483f6d167548b4cfd5a7488c3d5" - integrity sha512-vq24Bq3ym5HEQm2NKCr3yXDwjc7vTsEThRDnkp2DK9p1uqLR+DHurm/NOTo0KG7HYHU7eppKZj3MyqYuMBf62g== - dependencies: - "@nodelib/fs.stat" "2.0.5" - run-parallel "^1.1.9" - -"@nodelib/fs.stat@2.0.5", "@nodelib/fs.stat@^2.0.2": - version "2.0.5" - resolved "https://registry.yarnpkg.com/@nodelib/fs.stat/-/fs.stat-2.0.5.tgz#5bd262af94e9d25bd1e71b05deed44876a222e8b" - integrity sha512-RkhPPp2zrqDAQA/2jNhnztcPAlv64XdhIp7a7454A5ovI7Bukxgt7MX7udwAu3zg1DcpPU0rz3VV1SeaqvY4+A== - -"@nodelib/fs.walk@^1.2.3": - version "1.2.8" - resolved "https://registry.yarnpkg.com/@nodelib/fs.walk/-/fs.walk-1.2.8.tgz#e95737e8bb6746ddedf69c556953494f196fe69a" - integrity sha512-oGB+UxlgWcgQkgwo8GcEGwemoTFt3FIO9ababBmaGwXIoBKZ+GTy0pP185beGg7Llih/NSHSV2XAs1lnznocSg== - dependencies: - "@nodelib/fs.scandir" "2.1.5" - fastq "^1.6.0" - -"@types/body-parser@*": - version "1.19.2" - resolved "https://registry.yarnpkg.com/@types/body-parser/-/body-parser-1.19.2.tgz#aea2059e28b7658639081347ac4fab3de166e6f0" - integrity sha512-ALYone6pm6QmwZoAgeyNksccT9Q4AWZQ6PvfwR37GT6r6FWUPguq6sUmNGSMV2Wr761oQoBxwGGa6DR5o1DC9g== - dependencies: - "@types/connect" "*" - "@types/node" "*" - -"@types/bonjour@^3.5.9": - version "3.5.10" - resolved "https://registry.yarnpkg.com/@types/bonjour/-/bonjour-3.5.10.tgz#0f6aadfe00ea414edc86f5d106357cda9701e275" - integrity sha512-p7ienRMiS41Nu2/igbJxxLDWrSZ0WxM8UQgCeO9KhoVF7cOVFkrKsiDr1EsJIla8vV3oEEjGcz11jc5yimhzZw== - dependencies: - "@types/node" "*" - -"@types/connect-history-api-fallback@^1.3.5": - version "1.3.5" - resolved "https://registry.yarnpkg.com/@types/connect-history-api-fallback/-/connect-history-api-fallback-1.3.5.tgz#d1f7a8a09d0ed5a57aee5ae9c18ab9b803205dae" - integrity sha512-h8QJa8xSb1WD4fpKBDcATDNGXghFj6/3GRWG6dhmRcu0RX1Ubasur2Uvx5aeEwlf0MwblEC2bMzzMQntxnw/Cw== - dependencies: - 
"@types/express-serve-static-core" "*" - "@types/node" "*" - -"@types/connect@*": - version "3.4.35" - resolved "https://registry.yarnpkg.com/@types/connect/-/connect-3.4.35.tgz#5fcf6ae445e4021d1fc2219a4873cc73a3bb2ad1" - integrity sha512-cdeYyv4KWoEgpBISTxWvqYsVy444DOqehiF3fM3ne10AmJ62RSyNkUnxMJXHQWRQQX2eR94m5y1IZyDwBjV9FQ== - dependencies: - "@types/node" "*" - -"@types/eslint-scope@^3.7.3": - version "3.7.3" - resolved "https://registry.yarnpkg.com/@types/eslint-scope/-/eslint-scope-3.7.3.tgz#125b88504b61e3c8bc6f870882003253005c3224" - integrity sha512-PB3ldyrcnAicT35TWPs5IcwKD8S333HMaa2VVv4+wdvebJkjWuW/xESoB8IwRcog8HYVYamb1g/R31Qv5Bx03g== - dependencies: - "@types/eslint" "*" - "@types/estree" "*" - -"@types/eslint@*": - version "8.4.1" - resolved "https://registry.yarnpkg.com/@types/eslint/-/eslint-8.4.1.tgz#c48251553e8759db9e656de3efc846954ac32304" - integrity sha512-GE44+DNEyxxh2Kc6ro/VkIj+9ma0pO0bwv9+uHSyBrikYOHr8zYcdPvnBOp1aw8s+CjRvuSx7CyWqRrNFQ59mA== - dependencies: - "@types/estree" "*" - "@types/json-schema" "*" - -"@types/estree@*", "@types/estree@^0.0.51": - version "0.0.51" - resolved "https://registry.yarnpkg.com/@types/estree/-/estree-0.0.51.tgz#cfd70924a25a3fd32b218e5e420e6897e1ac4f40" - integrity sha512-CuPgU6f3eT/XgKKPqKd/gLZV1Xmvf1a2R5POBOGQa6uv82xpls89HU5zKeVoyR8XzHd1RGNOlQlvUe3CFkjWNQ== - -"@types/express-serve-static-core@*", "@types/express-serve-static-core@^4.17.18": - version "4.17.28" - resolved "https://registry.yarnpkg.com/@types/express-serve-static-core/-/express-serve-static-core-4.17.28.tgz#c47def9f34ec81dc6328d0b1b5303d1ec98d86b8" - integrity sha512-P1BJAEAW3E2DJUlkgq4tOL3RyMunoWXqbSCygWo5ZIWTjUgN1YnaXWW4VWl/oc8vs/XoYibEGBKP0uZyF4AHig== - dependencies: - "@types/node" "*" - "@types/qs" "*" - "@types/range-parser" "*" - -"@types/express@*", "@types/express@^4.17.13": - version "4.17.13" - resolved "https://registry.yarnpkg.com/@types/express/-/express-4.17.13.tgz#a76e2995728999bab51a33fabce1d705a3709034" - integrity 
sha512-6bSZTPaTIACxn48l50SR+axgrqm6qXFIxrdAKaG6PaJk3+zuUr35hBlgT7vOmJcum+OEaIBLtHV/qloEAFITeA== - dependencies: - "@types/body-parser" "*" - "@types/express-serve-static-core" "^4.17.18" - "@types/qs" "*" - "@types/serve-static" "*" - -"@types/html-minifier-terser@^6.0.0": - version "6.1.0" - resolved "https://registry.yarnpkg.com/@types/html-minifier-terser/-/html-minifier-terser-6.1.0.tgz#4fc33a00c1d0c16987b1a20cf92d20614c55ac35" - integrity sha512-oh/6byDPnL1zeNXFrDXFLyZjkr1MsBG667IM792caf1L2UPOOMf65NFzjUH/ltyfwjAGfs1rsX1eftK0jC/KIg== - -"@types/http-proxy@^1.17.8": - version "1.17.8" - resolved "https://registry.yarnpkg.com/@types/http-proxy/-/http-proxy-1.17.8.tgz#968c66903e7e42b483608030ee85800f22d03f55" - integrity sha512-5kPLG5BKpWYkw/LVOGWpiq3nEVqxiN32rTgI53Sk12/xHFQ2rG3ehI9IO+O3W2QoKeyB92dJkoka8SUm6BX1pA== - dependencies: - "@types/node" "*" - -"@types/json-schema@*", "@types/json-schema@^7.0.8", "@types/json-schema@^7.0.9": - version "7.0.9" - resolved "https://registry.yarnpkg.com/@types/json-schema/-/json-schema-7.0.9.tgz#97edc9037ea0c38585320b28964dde3b39e4660d" - integrity sha512-qcUXuemtEu+E5wZSJHNxUXeCZhAfXKQ41D+duX+VYPde7xyEVZci+/oXKJL13tnRs9lR2pr4fod59GT6/X1/yQ== - -"@types/mime@^1": - version "1.3.2" - resolved "https://registry.yarnpkg.com/@types/mime/-/mime-1.3.2.tgz#93e25bf9ee75fe0fd80b594bc4feb0e862111b5a" - integrity sha512-YATxVxgRqNH6nHEIsvg6k2Boc1JHI9ZbH5iWFFv/MTkchz3b1ieGDa5T0a9RznNdI0KhVbdbWSN+KWWrQZRxTw== - -"@types/node@*": - version "17.0.21" - resolved "https://registry.yarnpkg.com/@types/node/-/node-17.0.21.tgz#864b987c0c68d07b4345845c3e63b75edd143644" - integrity sha512-DBZCJbhII3r90XbQxI8Y9IjjiiOGlZ0Hr32omXIZvwwZ7p4DMMXGrKXVyPfuoBOri9XNtL0UK69jYIBIsRX3QQ== - -"@types/prop-types@*": - version "15.7.4" - resolved "https://registry.yarnpkg.com/@types/prop-types/-/prop-types-15.7.4.tgz#fcf7205c25dff795ee79af1e30da2c9790808f11" - integrity sha512-rZ5drC/jWjrArrS8BR6SIr4cWpW09RNTYt9AMZo3Jwwif+iacXAqgVjm0B0Bv/S1jhDXKHqRVNCbACkJ89RAnQ== 
- -"@types/qs@*": - version "6.9.7" - resolved "https://registry.yarnpkg.com/@types/qs/-/qs-6.9.7.tgz#63bb7d067db107cc1e457c303bc25d511febf6cb" - integrity sha512-FGa1F62FT09qcrueBA6qYTrJPVDzah9a+493+o2PCXsesWHIn27G98TsSMs3WPNbZIEj4+VJf6saSFpvD+3Zsw== - -"@types/range-parser@*": - version "1.2.4" - resolved "https://registry.yarnpkg.com/@types/range-parser/-/range-parser-1.2.4.tgz#cd667bcfdd025213aafb7ca5915a932590acdcdc" - integrity sha512-EEhsLsD6UsDM1yFhAvy0Cjr6VwmpMWqFBCb9w07wVugF7w9nfajxLuVmngTIpgS6svCnm6Vaw+MZhoDCKnOfsw== - -"@types/react-dom@^16.9.8": - version "16.9.14" - resolved "https://registry.yarnpkg.com/@types/react-dom/-/react-dom-16.9.14.tgz#674b8f116645fe5266b40b525777fc6bb8eb3bcd" - integrity sha512-FIX2AVmPTGP30OUJ+0vadeIFJJ07Mh1m+U0rxfgyW34p3rTlXI+nlenvAxNn4BP36YyI9IJ/+UJ7Wu22N1pI7A== - dependencies: - "@types/react" "^16" - -"@types/react-transition-group@^4.2.0": - version "4.4.4" - resolved "https://registry.yarnpkg.com/@types/react-transition-group/-/react-transition-group-4.4.4.tgz#acd4cceaa2be6b757db61ed7b432e103242d163e" - integrity sha512-7gAPz7anVK5xzbeQW9wFBDg7G++aPLAFY0QaSMOou9rJZpbuI58WAuJrgu+qR92l61grlnCUe7AFX8KGahAgug== - dependencies: - "@types/react" "*" - -"@types/react@*": - version "17.0.39" - resolved "https://registry.yarnpkg.com/@types/react/-/react-17.0.39.tgz#d0f4cde092502a6db00a1cded6e6bf2abb7633ce" - integrity sha512-UVavlfAxDd/AgAacMa60Azl7ygyQNRwC/DsHZmKgNvPmRR5p70AJ5Q9EAmL2NWOJmeV+vVUI4IAP7GZrN8h8Ug== - dependencies: - "@types/prop-types" "*" - "@types/scheduler" "*" - csstype "^3.0.2" - -"@types/react@^16", "@types/react@^16.9.51": - version "16.14.23" - resolved "https://registry.yarnpkg.com/@types/react/-/react-16.14.23.tgz#37201b9f2324c5ff8fa4600dbf19079dfdffc880" - integrity sha512-WngBZLuSkP4IAgPi0HOsGCHo6dn3CcuLQnCfC17VbA7YBgipZiZoTOhObwl/93DsFW0Y2a/ZXeonpW4DxirEJg== - dependencies: - "@types/prop-types" "*" - "@types/scheduler" "*" - csstype "^3.0.2" - -"@types/retry@^0.12.0": - version "0.12.1" - resolved 
"https://registry.yarnpkg.com/@types/retry/-/retry-0.12.1.tgz#d8f1c0d0dc23afad6dc16a9e993a0865774b4065" - integrity sha512-xoDlM2S4ortawSWORYqsdU+2rxdh4LRW9ytc3zmT37RIKQh6IHyKwwtKhKis9ah8ol07DCkZxPt8BBvPjC6v4g== - -"@types/scheduler@*": - version "0.16.2" - resolved "https://registry.yarnpkg.com/@types/scheduler/-/scheduler-0.16.2.tgz#1a62f89525723dde24ba1b01b092bf5df8ad4d39" - integrity sha512-hppQEBDmlwhFAXKJX2KnWLYu5yMfi91yazPb2l+lbJiwW+wdo1gNeRA+3RgNSO39WYX2euey41KEwnqesU2Jew== - -"@types/serve-index@^1.9.1": - version "1.9.1" - resolved "https://registry.yarnpkg.com/@types/serve-index/-/serve-index-1.9.1.tgz#1b5e85370a192c01ec6cec4735cf2917337a6278" - integrity sha512-d/Hs3nWDxNL2xAczmOVZNj92YZCS6RGxfBPjKzuu/XirCgXdpKEb88dYNbrYGint6IVWLNP+yonwVAuRC0T2Dg== - dependencies: - "@types/express" "*" - -"@types/serve-static@*": - version "1.13.10" - resolved "https://registry.yarnpkg.com/@types/serve-static/-/serve-static-1.13.10.tgz#f5e0ce8797d2d7cc5ebeda48a52c96c4fa47a8d9" - integrity sha512-nCkHGI4w7ZgAdNkrEu0bv+4xNV/XDqW+DydknebMOQwkpDGx8G+HTlj7R7ABI8i8nKxVw0wtKPi1D+lPOkh4YQ== - dependencies: - "@types/mime" "^1" - "@types/node" "*" - -"@types/sockjs@^0.3.33": - version "0.3.33" - resolved "https://registry.yarnpkg.com/@types/sockjs/-/sockjs-0.3.33.tgz#570d3a0b99ac995360e3136fd6045113b1bd236f" - integrity sha512-f0KEEe05NvUnat+boPTZ0dgaLZ4SfSouXUgv5noUiefG2ajgKjmETo9ZJyuqsl7dfl2aHlLJUiki6B4ZYldiiw== - dependencies: - "@types/node" "*" - -"@types/ws@^8.2.2": - version "8.5.2" - resolved "https://registry.yarnpkg.com/@types/ws/-/ws-8.5.2.tgz#77e0c2e360e9579da930ffcfa53c5975ea3bdd26" - integrity sha512-VXI82ykONr5tacHEojnErTQk+KQSoYbW1NB6iz6wUwrNd+BqfkfggQNoNdCqhJSzbNumShPERbM+Pc5zpfhlbw== - dependencies: - "@types/node" "*" - -"@webassemblyjs/ast@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/ast/-/ast-1.11.1.tgz#2bfd767eae1a6996f432ff7e8d7fc75679c0b6a7" - integrity 
sha512-ukBh14qFLjxTQNTXocdyksN5QdM28S1CxHt2rdskFyL+xFV7VremuBLVbmCePj+URalXBENx/9Lm7lnhihtCSw== - dependencies: - "@webassemblyjs/helper-numbers" "1.11.1" - "@webassemblyjs/helper-wasm-bytecode" "1.11.1" - -"@webassemblyjs/floating-point-hex-parser@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/floating-point-hex-parser/-/floating-point-hex-parser-1.11.1.tgz#f6c61a705f0fd7a6aecaa4e8198f23d9dc179e4f" - integrity sha512-iGRfyc5Bq+NnNuX8b5hwBrRjzf0ocrJPI6GWFodBFzmFnyvrQ83SHKhmilCU/8Jv67i4GJZBMhEzltxzcNagtQ== - -"@webassemblyjs/helper-api-error@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/helper-api-error/-/helper-api-error-1.11.1.tgz#1a63192d8788e5c012800ba6a7a46c705288fd16" - integrity sha512-RlhS8CBCXfRUR/cwo2ho9bkheSXG0+NwooXcc3PAILALf2QLdFyj7KGsKRbVc95hZnhnERon4kW/D3SZpp6Tcg== - -"@webassemblyjs/helper-buffer@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/helper-buffer/-/helper-buffer-1.11.1.tgz#832a900eb444884cde9a7cad467f81500f5e5ab5" - integrity sha512-gwikF65aDNeeXa8JxXa2BAk+REjSyhrNC9ZwdT0f8jc4dQQeDQ7G4m0f2QCLPJiMTTO6wfDmRmj/pW0PsUvIcA== - -"@webassemblyjs/helper-numbers@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/helper-numbers/-/helper-numbers-1.11.1.tgz#64d81da219fbbba1e3bd1bfc74f6e8c4e10a62ae" - integrity sha512-vDkbxiB8zfnPdNK9Rajcey5C0w+QJugEglN0of+kmO8l7lDb77AnlKYQF7aarZuCrv+l0UvqL+68gSDr3k9LPQ== - dependencies: - "@webassemblyjs/floating-point-hex-parser" "1.11.1" - "@webassemblyjs/helper-api-error" "1.11.1" - "@xtuc/long" "4.2.2" - -"@webassemblyjs/helper-wasm-bytecode@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/helper-wasm-bytecode/-/helper-wasm-bytecode-1.11.1.tgz#f328241e41e7b199d0b20c18e88429c4433295e1" - integrity sha512-PvpoOGiJwXeTrSf/qfudJhwlvDQxFgelbMqtq52WWiXC6Xgg1IREdngmPN3bs4RoO83PnL/nFrxucXj1+BX62Q== - -"@webassemblyjs/helper-wasm-section@1.11.1": 
- version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/helper-wasm-section/-/helper-wasm-section-1.11.1.tgz#21ee065a7b635f319e738f0dd73bfbda281c097a" - integrity sha512-10P9No29rYX1j7F3EVPX3JvGPQPae+AomuSTPiF9eBQeChHI6iqjMIwR9JmOJXwpnn/oVGDk7I5IlskuMwU/pg== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/helper-buffer" "1.11.1" - "@webassemblyjs/helper-wasm-bytecode" "1.11.1" - "@webassemblyjs/wasm-gen" "1.11.1" - -"@webassemblyjs/ieee754@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/ieee754/-/ieee754-1.11.1.tgz#963929e9bbd05709e7e12243a099180812992614" - integrity sha512-hJ87QIPtAMKbFq6CGTkZYJivEwZDbQUgYd3qKSadTNOhVY7p+gfP6Sr0lLRVTaG1JjFj+r3YchoqRYxNH3M0GQ== - dependencies: - "@xtuc/ieee754" "^1.2.0" - -"@webassemblyjs/leb128@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/leb128/-/leb128-1.11.1.tgz#ce814b45574e93d76bae1fb2644ab9cdd9527aa5" - integrity sha512-BJ2P0hNZ0u+Th1YZXJpzW6miwqQUGcIHT1G/sf72gLVD9DZ5AdYTqPNbHZh6K1M5VmKvFXwGSWZADz+qBWxeRw== - dependencies: - "@xtuc/long" "4.2.2" - -"@webassemblyjs/utf8@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/utf8/-/utf8-1.11.1.tgz#d1f8b764369e7c6e6bae350e854dec9a59f0a3ff" - integrity sha512-9kqcxAEdMhiwQkHpkNiorZzqpGrodQQ2IGrHHxCy+Ozng0ofyMA0lTqiLkVs1uzTRejX+/O0EOT7KxqVPuXosQ== - -"@webassemblyjs/wasm-edit@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/wasm-edit/-/wasm-edit-1.11.1.tgz#ad206ebf4bf95a058ce9880a8c092c5dec8193d6" - integrity sha512-g+RsupUC1aTHfR8CDgnsVRVZFJqdkFHpsHMfJuWQzWU3tvnLC07UqHICfP+4XyL2tnr1amvl1Sdp06TnYCmVkA== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/helper-buffer" "1.11.1" - "@webassemblyjs/helper-wasm-bytecode" "1.11.1" - "@webassemblyjs/helper-wasm-section" "1.11.1" - "@webassemblyjs/wasm-gen" "1.11.1" - "@webassemblyjs/wasm-opt" "1.11.1" - "@webassemblyjs/wasm-parser" "1.11.1" - 
"@webassemblyjs/wast-printer" "1.11.1" - -"@webassemblyjs/wasm-gen@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/wasm-gen/-/wasm-gen-1.11.1.tgz#86c5ea304849759b7d88c47a32f4f039ae3c8f76" - integrity sha512-F7QqKXwwNlMmsulj6+O7r4mmtAlCWfO/0HdgOxSklZfQcDu0TpLiD1mRt/zF25Bk59FIjEuGAIyn5ei4yMfLhA== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/helper-wasm-bytecode" "1.11.1" - "@webassemblyjs/ieee754" "1.11.1" - "@webassemblyjs/leb128" "1.11.1" - "@webassemblyjs/utf8" "1.11.1" - -"@webassemblyjs/wasm-opt@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/wasm-opt/-/wasm-opt-1.11.1.tgz#657b4c2202f4cf3b345f8a4c6461c8c2418985f2" - integrity sha512-VqnkNqnZlU5EB64pp1l7hdm3hmQw7Vgqa0KF/KCNO9sIpI6Fk6brDEiX+iCOYrvMuBWDws0NkTOxYEb85XQHHw== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/helper-buffer" "1.11.1" - "@webassemblyjs/wasm-gen" "1.11.1" - "@webassemblyjs/wasm-parser" "1.11.1" - -"@webassemblyjs/wasm-parser@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/wasm-parser/-/wasm-parser-1.11.1.tgz#86ca734534f417e9bd3c67c7a1c75d8be41fb199" - integrity sha512-rrBujw+dJu32gYB7/Lup6UhdkPx9S9SnobZzRVL7VcBH9Bt9bCBLEuX/YXOOtBsOZ4NQrRykKhffRWHvigQvOA== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/helper-api-error" "1.11.1" - "@webassemblyjs/helper-wasm-bytecode" "1.11.1" - "@webassemblyjs/ieee754" "1.11.1" - "@webassemblyjs/leb128" "1.11.1" - "@webassemblyjs/utf8" "1.11.1" - -"@webassemblyjs/wast-printer@1.11.1": - version "1.11.1" - resolved "https://registry.yarnpkg.com/@webassemblyjs/wast-printer/-/wast-printer-1.11.1.tgz#d0c73beda8eec5426f10ae8ef55cee5e7084c2f0" - integrity sha512-IQboUWM4eKzWW+N/jij2sRatKMh99QEelo3Eb2q0qXkvPRISAj8Qxtmw5itwqK+TTkBuUIE45AxYPToqPtL5gg== - dependencies: - "@webassemblyjs/ast" "1.11.1" - "@xtuc/long" "4.2.2" - -"@webpack-cli/configtest@^1.1.1": - version "1.1.1" - resolved 
"https://registry.yarnpkg.com/@webpack-cli/configtest/-/configtest-1.1.1.tgz#9f53b1b7946a6efc2a749095a4f450e2932e8356" - integrity sha512-1FBc1f9G4P/AxMqIgfZgeOTuRnwZMten8E7zap5zgpPInnCrP8D4Q81+4CWIch8i/Nf7nXjP0v6CjjbHOrXhKg== - -"@webpack-cli/info@^1.4.1": - version "1.4.1" - resolved "https://registry.yarnpkg.com/@webpack-cli/info/-/info-1.4.1.tgz#2360ea1710cbbb97ff156a3f0f24556e0fc1ebea" - integrity sha512-PKVGmazEq3oAo46Q63tpMr4HipI3OPfP7LiNOEJg963RMgT0rqheag28NCML0o3GIzA3DmxP1ZIAv9oTX1CUIA== - dependencies: - envinfo "^7.7.3" - -"@webpack-cli/serve@^1.6.1": - version "1.6.1" - resolved "https://registry.yarnpkg.com/@webpack-cli/serve/-/serve-1.6.1.tgz#0de2875ac31b46b6c5bb1ae0a7d7f0ba5678dffe" - integrity sha512-gNGTiTrjEVQ0OcVnzsRSqTxaBSr+dmTfm+qJsCDluky8uhdLWep7Gcr62QsAKHTMxjCS/8nEITsmFAhfIx+QSw== - -"@xtuc/ieee754@^1.2.0": - version "1.2.0" - resolved "https://registry.yarnpkg.com/@xtuc/ieee754/-/ieee754-1.2.0.tgz#eef014a3145ae477a1cbc00cd1e552336dceb790" - integrity sha512-DX8nKgqcGwsc0eJSqYt5lwP4DH5FlHnmuWWBRy7X0NcaGR0ZtuyeESgMwTYVEtxmsNGY+qit4QYT/MIYTOTPeA== - -"@xtuc/long@4.2.2": - version "4.2.2" - resolved "https://registry.yarnpkg.com/@xtuc/long/-/long-4.2.2.tgz#d291c6a4e97989b5c61d9acf396ae4fe133a718d" - integrity sha512-NuHqBY1PB/D8xU6s/thBgOAiAP7HOYDQ32+BFZILJ8ivkUkAHQnWfn6WhL79Owj1qmUnoN/YPhktdIoucipkAQ== - -accepts@~1.3.4, accepts@~1.3.5, accepts@~1.3.8: - version "1.3.8" - resolved "https://registry.yarnpkg.com/accepts/-/accepts-1.3.8.tgz#0bf0be125b67014adcb0b0921e62db7bffe16b2e" - integrity sha512-PYAthTa2m2VKxuvSD3DPC/Gy+U+sOA1LAuT8mkmRuvw+NACSaeXEQ+NHcVF7rONl6qcaxV3Uuemwawk+7+SJLw== - dependencies: - mime-types "~2.1.34" - negotiator "0.6.3" - -acorn-import-assertions@^1.7.6: - version "1.8.0" - resolved "https://registry.yarnpkg.com/acorn-import-assertions/-/acorn-import-assertions-1.8.0.tgz#ba2b5939ce62c238db6d93d81c9b111b29b855e9" - integrity sha512-m7VZ3jwz4eK6A4Vtt8Ew1/mNbP24u0FhdyfA7fSvnJR6LMdfOYnmuIrrJAgrYfYJ10F/otaHTtrtrtmHdMNzEw== - 
-acorn@^8.4.1, acorn@^8.5.0: - version "8.7.0" - resolved "https://registry.yarnpkg.com/acorn/-/acorn-8.7.0.tgz#90951fde0f8f09df93549481e5fc141445b791cf" - integrity sha512-V/LGr1APy+PXIwKebEWrkZPwoeoF+w1jiOBUmuxuiUIaOHtob8Qc9BTrYo7VuI5fR8tqsy+buA2WFooR5olqvQ== - -aggregate-error@^3.0.0: - version "3.1.0" - resolved "https://registry.yarnpkg.com/aggregate-error/-/aggregate-error-3.1.0.tgz#92670ff50f5359bdb7a3e0d40d0ec30c5737687a" - integrity sha512-4I7Td01quW/RpocfNayFdFVk1qSuoh0E7JrbRJ16nH01HhKFQ88INq9Sd+nd72zqRySlr9BmDA8xlEJ6vJMrYA== - dependencies: - clean-stack "^2.0.0" - indent-string "^4.0.0" - -ajv-formats@^2.1.1: - version "2.1.1" - resolved "https://registry.yarnpkg.com/ajv-formats/-/ajv-formats-2.1.1.tgz#6e669400659eb74973bbf2e33327180a0996b520" - integrity sha512-Wx0Kx52hxE7C18hkMEggYlEifqWZtYaRgouJor+WMdPnQyEK13vgEWyVNup7SoeeoLMsr4kf5h6dOW11I15MUA== - dependencies: - ajv "^8.0.0" - -ajv-keywords@^3.5.2: - version "3.5.2" - resolved "https://registry.yarnpkg.com/ajv-keywords/-/ajv-keywords-3.5.2.tgz#31f29da5ab6e00d1c2d329acf7b5929614d5014d" - integrity sha512-5p6WTN0DdTGVQk6VjcEju19IgaHudalcfabD7yhDGeA6bcQnmL+CpveLJq/3hvfwd1aof6L386Ougkx6RfyMIQ== - -ajv-keywords@^5.0.0: - version "5.1.0" - resolved "https://registry.yarnpkg.com/ajv-keywords/-/ajv-keywords-5.1.0.tgz#69d4d385a4733cdbeab44964a1170a88f87f0e16" - integrity sha512-YCS/JNFAUyr5vAuhk1DWm1CBxRHW9LbJ2ozWeemrIqpbsqKjHVxYPyi5GC0rjZIT5JxJ3virVTS8wk4i/Z+krw== - dependencies: - fast-deep-equal "^3.1.3" - -ajv@^6.12.5: - version "6.12.6" - resolved "https://registry.yarnpkg.com/ajv/-/ajv-6.12.6.tgz#baf5a62e802b07d977034586f8c3baf5adf26df4" - integrity sha512-j3fVLgvTo527anyYyJOGTYJbG+vnnQYvE0m5mmkc1TK+nxAppkCLMIL0aZ4dblVCNoGShhm+kzE4ZUykBoMg4g== - dependencies: - fast-deep-equal "^3.1.1" - fast-json-stable-stringify "^2.0.0" - json-schema-traverse "^0.4.1" - uri-js "^4.2.2" - -ajv@^8.0.0, ajv@^8.8.0: - version "8.10.0" - resolved 
"https://registry.yarnpkg.com/ajv/-/ajv-8.10.0.tgz#e573f719bd3af069017e3b66538ab968d040e54d" - integrity sha512-bzqAEZOjkrUMl2afH8dknrq5KEk2SrwdBROR+vH1EKVQTqaUbJVPdc/gEdggTMM0Se+s+Ja4ju4TlNcStKl2Hw== - dependencies: - fast-deep-equal "^3.1.1" - json-schema-traverse "^1.0.0" - require-from-string "^2.0.2" - uri-js "^4.2.2" - -ansi-html-community@^0.0.8: - version "0.0.8" - resolved "https://registry.yarnpkg.com/ansi-html-community/-/ansi-html-community-0.0.8.tgz#69fbc4d6ccbe383f9736934ae34c3f8290f1bf41" - integrity sha512-1APHAyr3+PCamwNw3bXCPp4HFLONZt/yIH0sZp0/469KWNTEy+qN5jQ3GVX6DMZ1UXAi34yVwtTeaG/HpBuuzw== - -ansi-regex@^5.0.1: - version "5.0.1" - resolved "https://registry.yarnpkg.com/ansi-regex/-/ansi-regex-5.0.1.tgz#082cb2c89c9fe8659a311a53bd6a4dc5301db304" - integrity sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ== - -ansi-regex@^6.0.1: - version "6.0.1" - resolved "https://registry.yarnpkg.com/ansi-regex/-/ansi-regex-6.0.1.tgz#3183e38fae9a65d7cb5e53945cd5897d0260a06a" - integrity sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA== - -ansi-styles@^4.1.0: - version "4.3.0" - resolved "https://registry.yarnpkg.com/ansi-styles/-/ansi-styles-4.3.0.tgz#edd803628ae71c04c85ae7a0906edad34b648937" - integrity sha512-zbB9rCJAT1rbjiVDb2hqKFHNYLxgtk8NURxZ3IZwD3F6NtxbXZQCnnSi1Lkx+IDohdPlFp222wVALIheZJQSEg== - dependencies: - color-convert "^2.0.1" - -antd@^4.17.0: - version "4.19.0" - resolved "https://registry.yarnpkg.com/antd/-/antd-4.19.0.tgz#1c637a4d7dde091a2299260ca89f05c29fb21f80" - integrity sha512-4Kp47+zg3j1g1lWmzFstGrmlGdHzUIvxAVXxYJKJqX+iQs++QYgcK2HF+9PBpwEwP6H6VPZCsL0LqKEflke5qg== - dependencies: - "@ant-design/colors" "^6.0.0" - "@ant-design/icons" "^4.7.0" - "@ant-design/react-slick" "~0.28.1" - "@babel/runtime" "^7.12.5" - "@ctrl/tinycolor" "^3.4.0" - classnames "^2.2.6" - copy-to-clipboard "^3.2.0" - lodash "^4.17.21" - memoize-one "^6.0.0" - moment "^2.25.3" - 
rc-cascader "~3.2.1" - rc-checkbox "~2.3.0" - rc-collapse "~3.1.0" - rc-dialog "~8.6.0" - rc-drawer "~4.4.2" - rc-dropdown "~3.3.2" - rc-field-form "~1.23.0" - rc-image "~5.2.5" - rc-input "^0.0.1-alpha.5" - rc-input-number "~7.3.0" - rc-mentions "~1.6.1" - rc-menu "~9.2.1" - rc-motion "^2.4.4" - rc-notification "~4.5.7" - rc-pagination "~3.1.9" - rc-picker "~2.6.4" - rc-progress "~3.2.1" - rc-rate "~2.9.0" - rc-resize-observer "^1.2.0" - rc-select "~14.0.0-alpha.15" - rc-slider "~10.0.0-alpha.4" - rc-steps "~4.1.0" - rc-switch "~3.2.0" - rc-table "~7.23.0" - rc-tabs "~11.10.0" - rc-textarea "~0.3.0" - rc-tooltip "~5.1.1" - rc-tree "~5.4.3" - rc-tree-select "~5.1.1" - rc-trigger "^5.2.10" - rc-upload "~4.3.0" - rc-util "^5.14.0" - scroll-into-view-if-needed "^2.2.25" - -anymatch@~3.1.2: - version "3.1.2" - resolved "https://registry.yarnpkg.com/anymatch/-/anymatch-3.1.2.tgz#c0557c096af32f106198f4f4e2a383537e378716" - integrity sha512-P43ePfOAIupkguHUycrc4qJ9kz8ZiuOUijaETwX7THt0Y/GNK7v0aa8rY816xWjZ7rJdA5XdMcpVFTKMq+RvWg== - dependencies: - normalize-path "^3.0.0" - picomatch "^2.0.4" - -array-flatten@1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/array-flatten/-/array-flatten-1.1.1.tgz#9a5f699051b1e7073328f2a008968b64ea2955d2" - integrity sha1-ml9pkFGx5wczKPKgCJaLZOopVdI= - -array-flatten@^2.1.0: - version "2.1.2" - resolved "https://registry.yarnpkg.com/array-flatten/-/array-flatten-2.1.2.tgz#24ef80a28c1a893617e2149b0c6d0d788293b099" - integrity sha512-hNfzcOV8W4NdualtqBFPyVO+54DSJuZGY9qT4pRroB6S9e3iiido2ISIC5h9R2sPJ8H3FHCIiEnsv1lPXO3KtQ== - -array-tree-filter@^2.1.0: - version "2.1.0" - resolved "https://registry.yarnpkg.com/array-tree-filter/-/array-tree-filter-2.1.0.tgz#873ac00fec83749f255ac8dd083814b4f6329190" - integrity sha512-4ROwICNlNw/Hqa9v+rk5h22KjmzB1JGTMVKP2AKJBOCgb0yL0ASf0+YvCcLNNwquOHNX48jkeZIJ3a+oOQqKcw== - -array-union@^2.1.0: - version "2.1.0" - resolved 
"https://registry.yarnpkg.com/array-union/-/array-union-2.1.0.tgz#b798420adbeb1de828d84acd8a2e23d3efe85e8d" - integrity sha512-HGyxoOTYUyCM6stUe6EJgnd4EoewAI7zMdfqO+kGjnlZmBDz/cR5pf8r/cR4Wq60sL/p0IkcjUEEPwS3GFrIyw== - -async-validator@^4.0.2: - version "4.0.7" - resolved "https://registry.yarnpkg.com/async-validator/-/async-validator-4.0.7.tgz#034a0fd2103a6b2ebf010da75183bec299247afe" - integrity sha512-Pj2IR7u8hmUEDOwB++su6baaRi+QvsgajuFB9j95foM1N2gy5HM4z60hfusIO0fBPG5uLAEl6yCJr1jNSVugEQ== - -async@^2.6.2: - version "2.6.3" - resolved "https://registry.yarnpkg.com/async/-/async-2.6.3.tgz#d72625e2344a3656e3a3ad4fa749fa83299d82ff" - integrity sha512-zflvls11DCy+dQWzTW2dzuilv8Z5X/pjfmZOWba6TNIVDm+2UDaJmXSOXlasHKfNBs8oo3M0aT50fDEWfKZjXg== - dependencies: - lodash "^4.17.14" - -balanced-match@^1.0.0: - version "1.0.2" - resolved "https://registry.yarnpkg.com/balanced-match/-/balanced-match-1.0.2.tgz#e83e3a7e3f300b34cb9d87f615fa0cbf357690ee" - integrity sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw== - -batch@0.6.1: - version "0.6.1" - resolved "https://registry.yarnpkg.com/batch/-/batch-0.6.1.tgz#dc34314f4e679318093fc760272525f94bf25c16" - integrity sha1-3DQxT05nkxgJP8dgJyUl+UvyXBY= - -big.js@^5.2.2: - version "5.2.2" - resolved "https://registry.yarnpkg.com/big.js/-/big.js-5.2.2.tgz#65f0af382f578bcdc742bd9c281e9cb2d7768328" - integrity sha512-vyL2OymJxmarO8gxMr0mhChsO9QGwhynfuu4+MHTAW6czfq9humCB7rKpUjDd9YUiDPU4mzpyupFSvOClAwbmQ== - -binary-extensions@^2.0.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/binary-extensions/-/binary-extensions-2.2.0.tgz#75f502eeaf9ffde42fc98829645be4ea76bd9e2d" - integrity sha512-jDctJ/IVQbZoJykoeHbhXpOlNBqGNcwXJKJog42E5HDPUwQTSdjCHdihjj0DlnheQ7blbT6dHOafNAiS8ooQKA== - -body-parser@1.19.2: - version "1.19.2" - resolved "https://registry.yarnpkg.com/body-parser/-/body-parser-1.19.2.tgz#4714ccd9c157d44797b8b5607d72c0b89952f26e" - integrity 
sha512-SAAwOxgoCKMGs9uUAUFHygfLAyaniaoun6I8mFY9pRAJL9+Kec34aU+oIjDhTycub1jozEfEwx1W1IuOYxVSFw== - dependencies: - bytes "3.1.2" - content-type "~1.0.4" - debug "2.6.9" - depd "~1.1.2" - http-errors "1.8.1" - iconv-lite "0.4.24" - on-finished "~2.3.0" - qs "6.9.7" - raw-body "2.4.3" - type-is "~1.6.18" - -bonjour@^3.5.0: - version "3.5.0" - resolved "https://registry.yarnpkg.com/bonjour/-/bonjour-3.5.0.tgz#8e890a183d8ee9a2393b3844c691a42bcf7bc9f5" - integrity sha1-jokKGD2O6aI5OzhExpGkK897yfU= - dependencies: - array-flatten "^2.1.0" - deep-equal "^1.0.1" - dns-equal "^1.0.0" - dns-txt "^2.0.2" - multicast-dns "^6.0.1" - multicast-dns-service-types "^1.1.0" - -boolbase@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/boolbase/-/boolbase-1.0.0.tgz#68dff5fbe60c51eb37725ea9e3ed310dcc1e776e" - integrity sha1-aN/1++YMUes3cl6p4+0xDcwed24= - -brace-expansion@^1.1.7: - version "1.1.11" - resolved "https://registry.yarnpkg.com/brace-expansion/-/brace-expansion-1.1.11.tgz#3c7fcbf529d87226f3d2f52b966ff5271eb441dd" - integrity sha512-iCuPHDFgrHX7H2vEI/5xpz07zSHB00TpugqhmYtVmMO6518mCuRMoOYFldEBl0g187ufozdaHgWKcYFb61qGiA== - dependencies: - balanced-match "^1.0.0" - concat-map "0.0.1" - -braces@^3.0.1, braces@~3.0.2: - version "3.0.2" - resolved "https://registry.yarnpkg.com/braces/-/braces-3.0.2.tgz#3454e1a462ee8d599e236df336cd9ea4f8afe107" - integrity sha512-b8um+L1RzM3WDSzvhm6gIz1yfTbBt6YTlcEKAvsmqCZZFw46z626lVj9j1yEPW33H5H+lBQpZMP1k8l+78Ha0A== - dependencies: - fill-range "^7.0.1" - -browserslist@^4.14.5, browserslist@^4.16.5: - version "4.20.0" - resolved "https://registry.yarnpkg.com/browserslist/-/browserslist-4.20.0.tgz#35951e3541078c125d36df76056e94738a52ebe9" - integrity sha512-bnpOoa+DownbciXj0jVGENf8VYQnE2LNWomhYuCsMmmx9Jd9lwq0WXODuwpSsp8AVdKM2/HorrzxAfbKvWTByQ== - dependencies: - caniuse-lite "^1.0.30001313" - electron-to-chromium "^1.4.76" - escalade "^3.1.1" - node-releases "^2.0.2" - picocolors "^1.0.0" - -buffer-from@^1.0.0: - version "1.1.2" - 
resolved "https://registry.yarnpkg.com/buffer-from/-/buffer-from-1.1.2.tgz#2b146a6fd72e80b4f55d255f35ed59a3a9a41bd5" - integrity sha512-E+XQCRwSbaaiChtv6k6Dwgc+bx+Bs6vuKJHHl5kox/BaKbhiXzqQOwK4cO22yElGp2OCmjwVhT3HmxgyPGnJfQ== - -buffer-indexof@^1.0.0: - version "1.1.1" - resolved "https://registry.yarnpkg.com/buffer-indexof/-/buffer-indexof-1.1.1.tgz#52fabcc6a606d1a00302802648ef68f639da268c" - integrity sha512-4/rOEg86jivtPTeOUUT61jJO1Ya1TrR/OkqCSZDyq84WJh3LuuiphBYJN+fm5xufIk4XAFcEwte/8WzC8If/1g== - -bytes@3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/bytes/-/bytes-3.0.0.tgz#d32815404d689699f85a4ea4fa8755dd13a96048" - integrity sha1-0ygVQE1olpn4Wk6k+odV3ROpYEg= - -bytes@3.1.2: - version "3.1.2" - resolved "https://registry.yarnpkg.com/bytes/-/bytes-3.1.2.tgz#8b0beeb98605adf1b128fa4386403c009e0221a5" - integrity sha512-/Nf7TyzTx6S3yRJObOAV7956r8cr2+Oj8AC5dt8wSP3BQAoeX58NoHyCU8P8zGkNXStjTSi6fzO6F0pBdcYbEg== - -call-bind@^1.0.2: - version "1.0.2" - resolved "https://registry.yarnpkg.com/call-bind/-/call-bind-1.0.2.tgz#b1d4e89e688119c3c9a903ad30abb2f6a919be3c" - integrity sha512-7O+FbCihrB5WGbFYesctwmTKae6rOiIzmz1icreWJ+0aA7LJfuqhEso2T9ncpcFtzMQtzXf2QGGueWJGTYsqrA== - dependencies: - function-bind "^1.1.1" - get-intrinsic "^1.0.2" - -camel-case@^4.1.2: - version "4.1.2" - resolved "https://registry.yarnpkg.com/camel-case/-/camel-case-4.1.2.tgz#9728072a954f805228225a6deea6b38461e1bd5a" - integrity sha512-gxGWBrTT1JuMx6R+o5PTXMmUnhnVzLQ9SNutD4YqKtI6ap897t3tKECYla6gCWEkplXnlNybEkZg9GEGxKFCgw== - dependencies: - pascal-case "^3.1.2" - tslib "^2.0.3" - -caniuse-lite@^1.0.30001313: - version "1.0.30001313" - resolved "https://registry.yarnpkg.com/caniuse-lite/-/caniuse-lite-1.0.30001313.tgz#a380b079db91621e1b7120895874e2fd62ed2e2f" - integrity sha512-rI1UN0koZUiKINjysQDuRi2VeSCce3bYJNmDcj3PIKREiAmjakugBul1QSkg/fPrlULYl6oWfGg3PbgOSY9X4Q== - -chalk@^4.1.0: - version "4.1.2" - resolved 
"https://registry.yarnpkg.com/chalk/-/chalk-4.1.2.tgz#aac4e2b7734a740867aeb16bf02aad556a1e7a01" - integrity sha512-oKnbhFyRIXpUuez8iBMmyEa4nbj4IOQyuhc/wy9kY7/WVPcwIO9VA668Pu8RkO7+0G76SLROeyw9CpQ061i4mA== - dependencies: - ansi-styles "^4.1.0" - supports-color "^7.1.0" - -chokidar@^3.5.3: - version "3.5.3" - resolved "https://registry.yarnpkg.com/chokidar/-/chokidar-3.5.3.tgz#1cf37c8707b932bd1af1ae22c0432e2acd1903bd" - integrity sha512-Dr3sfKRP6oTcjf2JmUmFJfeVMvXBdegxB0iVQ5eb2V10uFJUCAS8OByZdVAyVb8xXNz3GjjTgj9kLWsZTqE6kw== - dependencies: - anymatch "~3.1.2" - braces "~3.0.2" - glob-parent "~5.1.2" - is-binary-path "~2.1.0" - is-glob "~4.0.1" - normalize-path "~3.0.0" - readdirp "~3.6.0" - optionalDependencies: - fsevents "~2.3.2" - -chrome-trace-event@^1.0.2: - version "1.0.3" - resolved "https://registry.yarnpkg.com/chrome-trace-event/-/chrome-trace-event-1.0.3.tgz#1015eced4741e15d06664a957dbbf50d041e26ac" - integrity sha512-p3KULyQg4S7NIHixdwbGX+nFHkoBiA4YQmyWtjb8XngSKV124nJmRysgAeujbUVb15vh+RvFUfCPqU7rXk+hZg== - -classnames@2.x, classnames@^2.2.1, classnames@^2.2.3, classnames@^2.2.5, classnames@^2.2.6, classnames@^2.3.1: - version "2.3.1" - resolved "https://registry.yarnpkg.com/classnames/-/classnames-2.3.1.tgz#dfcfa3891e306ec1dad105d0e88f4417b8535e8e" - integrity sha512-OlQdbZ7gLfGarSqxesMesDa5uz7KFbID8Kpq/SxIoNGDqY8lSYs0D+hhtBXhcdB3rcbXArFr7vlHheLk1voeNA== - -clean-css@^5.2.2: - version "5.2.4" - resolved "https://registry.yarnpkg.com/clean-css/-/clean-css-5.2.4.tgz#982b058f8581adb2ae062520808fb2429bd487a4" - integrity sha512-nKseG8wCzEuji/4yrgM/5cthL9oTDc5UOQyFMvW/Q53oP6gLH690o1NbuTh6Y18nujr7BxlsFuS7gXLnLzKJGg== - dependencies: - source-map "~0.6.0" - -clean-stack@^2.0.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/clean-stack/-/clean-stack-2.2.0.tgz#ee8472dbb129e727b31e8a10a427dee9dfe4008b" - integrity sha512-4diC9HaTE+KRAMWhDhrGOECgWZxoevMc5TlkObMqNSsVU62PYzXZ/SMTjzyGAFF1YusgxGcSWTEXBhp0CPwQ1A== - -clone-deep@^4.0.1: - version "4.0.1" - 
resolved "https://registry.yarnpkg.com/clone-deep/-/clone-deep-4.0.1.tgz#c19fd9bdbbf85942b4fd979c84dcf7d5f07c2387" - integrity sha512-neHB9xuzh/wk0dIHweyAXv2aPGZIVk3pLMe+/RNzINf17fe0OG96QroktYAUm7SM1PBnzTabaLboqqxDyMU+SQ== - dependencies: - is-plain-object "^2.0.4" - kind-of "^6.0.2" - shallow-clone "^3.0.0" - -clsx@^1.0.4, clsx@^1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/clsx/-/clsx-1.1.1.tgz#98b3134f9abbdf23b2663491ace13c5c03a73188" - integrity sha512-6/bPho624p3S2pMyvP5kKBPXnI3ufHLObBFCfgx+LkeR5lg2XYy2hqZqUf45ypD8COn2bhgGJSUE+l5dhNBieA== - -color-convert@^2.0.1: - version "2.0.1" - resolved "https://registry.yarnpkg.com/color-convert/-/color-convert-2.0.1.tgz#72d3a68d598c9bdb3af2ad1e84f21d896abd4de3" - integrity sha512-RRECPsj7iu/xb5oKYcsFHSppFNnsj/52OVTRKb4zP5onXwVF3zVmmToNcOfGC+CRDpfK/U584fMg38ZHCaElKQ== - dependencies: - color-name "~1.1.4" - -color-name@~1.1.4: - version "1.1.4" - resolved "https://registry.yarnpkg.com/color-name/-/color-name-1.1.4.tgz#c2a09a87acbde69543de6f63fa3995c826c536a2" - integrity sha512-dOy+3AuW3a2wNbZHIuMZpTcgjGuLU/uBL/ubcZF9OXbDo8ff4O8yVp5Bf0efS8uEoYo5q4Fx7dY9OgQGXgAsQA== - -colorette@^2.0.10, colorette@^2.0.14: - version "2.0.16" - resolved "https://registry.yarnpkg.com/colorette/-/colorette-2.0.16.tgz#713b9af84fdb000139f04546bd4a93f62a5085da" - integrity sha512-hUewv7oMjCp+wkBv5Rm0v87eJhq4woh5rSR+42YSQJKecCqgIqNkZ6lAlQms/BwHPJA5NKMRlpxPRv0n8HQW6g== - -commander@^2.20.0: - version "2.20.3" - resolved "https://registry.yarnpkg.com/commander/-/commander-2.20.3.tgz#fd485e84c03eb4881c20722ba48035e8531aeb33" - integrity sha512-GpVkmM8vF2vQUkj2LvZmD35JxeJOLCwJ9cUkugyk2nuhbv3+mJvpLYYt+0+USMxE+oj+ey/lJEnhZw75x/OMcQ== - -commander@^7.0.0: - version "7.2.0" - resolved "https://registry.yarnpkg.com/commander/-/commander-7.2.0.tgz#a36cb57d0b501ce108e4d20559a150a391d97ab7" - integrity sha512-QrWXB+ZQSVPmIWIhtEO9H+gwHaMGYiF5ChvoJ+K9ZGHG/sVsa6yiesAD1GC/x46sET00Xlwo1u49RVVVzvcSkw== - -commander@^8.3.0: - version "8.3.0" - 
resolved "https://registry.yarnpkg.com/commander/-/commander-8.3.0.tgz#4837ea1b2da67b9c616a67afbb0fafee567bca66" - integrity sha512-OkTL9umf+He2DZkUq8f8J9of7yL6RJKI24dVITBmNfZBmri9zYZQrKkuXiKhyfPSu8tUhnVBB1iKXevvnlR4Ww== - -compressible@~2.0.16: - version "2.0.18" - resolved "https://registry.yarnpkg.com/compressible/-/compressible-2.0.18.tgz#af53cca6b070d4c3c0750fbd77286a6d7cc46fba" - integrity sha512-AF3r7P5dWxL8MxyITRMlORQNaOA2IkAFaTr4k7BUumjPtRpGDTZpl0Pb1XCO6JeDCBdp126Cgs9sMxqSjgYyRg== - dependencies: - mime-db ">= 1.43.0 < 2" - -compression@^1.7.4: - version "1.7.4" - resolved "https://registry.yarnpkg.com/compression/-/compression-1.7.4.tgz#95523eff170ca57c29a0ca41e6fe131f41e5bb8f" - integrity sha512-jaSIDzP9pZVS4ZfQ+TzvtiWhdpFhE2RDHz8QJkpX9SIpLq88VueF5jJw6t+6CUQcAoA6t+x89MLrWAqpfDE8iQ== - dependencies: - accepts "~1.3.5" - bytes "3.0.0" - compressible "~2.0.16" - debug "2.6.9" - on-headers "~1.0.2" - safe-buffer "5.1.2" - vary "~1.1.2" - -compute-scroll-into-view@^1.0.17: - version "1.0.17" - resolved "https://registry.yarnpkg.com/compute-scroll-into-view/-/compute-scroll-into-view-1.0.17.tgz#6a88f18acd9d42e9cf4baa6bec7e0522607ab7ab" - integrity sha512-j4dx+Fb0URmzbwwMUrhqWM2BEWHdFGx+qZ9qqASHRPqvTYdqvWnHg0H1hIbcyLnvgnoNAVMlwkepyqM3DaIFUg== - -concat-map@0.0.1: - version "0.0.1" - resolved "https://registry.yarnpkg.com/concat-map/-/concat-map-0.0.1.tgz#d8a96bd77fd68df7793a73036a3ba0d5405d477b" - integrity sha1-2Klr13/Wjfd5OnMDajug1UBdR3s= - -connect-history-api-fallback@^1.6.0: - version "1.6.0" - resolved "https://registry.yarnpkg.com/connect-history-api-fallback/-/connect-history-api-fallback-1.6.0.tgz#8b32089359308d111115d81cad3fceab888f97bc" - integrity sha512-e54B99q/OUoH64zYYRf3HBP5z24G38h5D3qXu23JGRoigpX5Ss4r9ZnDk3g0Z8uQC2x2lPaJ+UlWBc1ZWBWdLg== - -content-disposition@0.5.4: - version "0.5.4" - resolved "https://registry.yarnpkg.com/content-disposition/-/content-disposition-0.5.4.tgz#8b82b4efac82512a02bb0b1dcec9d2c5e8eb5bfe" - integrity 
sha512-FveZTNuGw04cxlAiWbzi6zTAL/lhehaWbTtgluJh4/E95DqMwTmha3KZN1aAWA8cFIhHzMZUvLevkw5Rqk+tSQ== - dependencies: - safe-buffer "5.2.1" - -content-type@~1.0.4: - version "1.0.4" - resolved "https://registry.yarnpkg.com/content-type/-/content-type-1.0.4.tgz#e138cc75e040c727b1966fe5e5f8c9aee256fe3b" - integrity sha512-hIP3EEPs8tB9AT1L+NUqtwOAps4mk2Zob89MWXMHjHWg9milF/j4osnnQLXBCBFBk/tvIG/tUc9mOUJiPBhPXA== - -cookie-signature@1.0.6: - version "1.0.6" - resolved "https://registry.yarnpkg.com/cookie-signature/-/cookie-signature-1.0.6.tgz#e303a882b342cc3ee8ca513a79999734dab3ae2c" - integrity sha1-4wOogrNCzD7oylE6eZmXNNqzriw= - -cookie@0.4.2: - version "0.4.2" - resolved "https://registry.yarnpkg.com/cookie/-/cookie-0.4.2.tgz#0e41f24de5ecf317947c82fc789e06a884824432" - integrity sha512-aSWTXFzaKWkvHO1Ny/s+ePFpvKsPnjc551iI41v3ny/ow6tBG5Vd+FuqGNhh1LxOmVzOlGUriIlOaokOvhaStA== - -copy-to-clipboard@^3.2.0: - version "3.3.1" - resolved "https://registry.yarnpkg.com/copy-to-clipboard/-/copy-to-clipboard-3.3.1.tgz#115aa1a9998ffab6196f93076ad6da3b913662ae" - integrity sha512-i13qo6kIHTTpCm8/Wup+0b1mVWETvu2kIMzKoK8FpkLkFxlt0znUAHcMzox+T8sPlqtZXq3CulEjQHsYiGFJUw== - dependencies: - toggle-selection "^1.0.6" - -core-util-is@~1.0.0: - version "1.0.3" - resolved "https://registry.yarnpkg.com/core-util-is/-/core-util-is-1.0.3.tgz#a6042d3634c2b27e9328f837b965fac83808db85" - integrity sha512-ZQBvi1DcpJ4GDqanjucZ2Hj3wEO5pZDS89BWbkcrvdxksJorwUDDZamX9ldFkp9aw2lmBDLgkObEA4DWNJ9FYQ== - -cross-env@^7.0.2: - version "7.0.3" - resolved "https://registry.yarnpkg.com/cross-env/-/cross-env-7.0.3.tgz#865264b29677dc015ba8418918965dd232fc54cf" - integrity sha512-+/HKd6EgcQCJGh2PSjZuUitQBQynKor4wrFbRg4DtAgS1aWO+gU52xpH7M9ScGgXSYmAVS9bIJ8EzuaGw0oNAw== - dependencies: - cross-spawn "^7.0.1" - -cross-spawn@^7.0.1, cross-spawn@^7.0.3: - version "7.0.3" - resolved "https://registry.yarnpkg.com/cross-spawn/-/cross-spawn-7.0.3.tgz#f73a85b9d5d41d045551c177e2882d4ac85728a6" - integrity 
sha512-iRDPJKUPVEND7dHPO8rkbOnPpyDygcDFtWjpeWNCgy8WP2rXcxXL8TskReQl6OrB2G7+UJrags1q15Fudc7G6w== - dependencies: - path-key "^3.1.0" - shebang-command "^2.0.0" - which "^2.0.1" - -css-loader@^5.2.4: - version "5.2.7" - resolved "https://registry.yarnpkg.com/css-loader/-/css-loader-5.2.7.tgz#9b9f111edf6fb2be5dc62525644cbc9c232064ae" - integrity sha512-Q7mOvpBNBG7YrVGMxRxcBJZFL75o+cH2abNASdibkj/fffYD8qWbInZrD0S9ccI6vZclF3DsHE7njGlLtaHbhg== - dependencies: - icss-utils "^5.1.0" - loader-utils "^2.0.0" - postcss "^8.2.15" - postcss-modules-extract-imports "^3.0.0" - postcss-modules-local-by-default "^4.0.0" - postcss-modules-scope "^3.0.0" - postcss-modules-values "^4.0.0" - postcss-value-parser "^4.1.0" - schema-utils "^3.0.0" - semver "^7.3.5" - -css-select@^4.1.3: - version "4.2.1" - resolved "https://registry.yarnpkg.com/css-select/-/css-select-4.2.1.tgz#9e665d6ae4c7f9d65dbe69d0316e3221fb274cdd" - integrity sha512-/aUslKhzkTNCQUB2qTX84lVmfia9NyjP3WpDGtj/WxhwBzWBYUV3DgUpurHTme8UTPcPlAD1DJ+b0nN/t50zDQ== - dependencies: - boolbase "^1.0.0" - css-what "^5.1.0" - domhandler "^4.3.0" - domutils "^2.8.0" - nth-check "^2.0.1" - -css-vendor@^2.0.8: - version "2.0.8" - resolved "https://registry.yarnpkg.com/css-vendor/-/css-vendor-2.0.8.tgz#e47f91d3bd3117d49180a3c935e62e3d9f7f449d" - integrity sha512-x9Aq0XTInxrkuFeHKbYC7zWY8ai7qJ04Kxd9MnvbC1uO5DagxoHQjm4JvG+vCdXOoFtCjbL2XSZfxmoYa9uQVQ== - dependencies: - "@babel/runtime" "^7.8.3" - is-in-browser "^1.0.2" - -css-what@^5.1.0: - version "5.1.0" - resolved "https://registry.yarnpkg.com/css-what/-/css-what-5.1.0.tgz#3f7b707aadf633baf62c2ceb8579b545bb40f7fe" - integrity sha512-arSMRWIIFY0hV8pIxZMEfmMI47Wj3R/aWpZDDxWYCPEiOMv6tfOrnpDtgxBYPEQD4V0Y/958+1TdC3iWTFcUPw== - -cssesc@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/cssesc/-/cssesc-3.0.0.tgz#37741919903b868565e1c09ea747445cd18983ee" - integrity sha512-/Tb/JcjK111nNScGob5MNtsntNM1aCNUDipB/TkwZFhyDrrE47SOx/18wF2bbjgc3ZzCSKW1T5nt5EbFoAz/Vg== - -csstype@^2.5.2: 
- version "2.6.20" - resolved "https://registry.yarnpkg.com/csstype/-/csstype-2.6.20.tgz#9229c65ea0b260cf4d3d997cb06288e36a8d6dda" - integrity sha512-/WwNkdXfckNgw6S5R125rrW8ez139lBHWouiBvX8dfMFtcn6V81REDqnH7+CRpRipfYlyU1CmOnOxrmGcFOjeA== - -csstype@^3.0.2: - version "3.0.11" - resolved "https://registry.yarnpkg.com/csstype/-/csstype-3.0.11.tgz#d66700c5eacfac1940deb4e3ee5642792d85cd33" - integrity sha512-sa6P2wJ+CAbgyy4KFssIb/JNMLxFvKF1pCYCSXS8ZMuqZnMsrxqI2E5sPyoTpxoPU/gVZMzr2zjOfg8GIZOMsw== - -date-fns@2.x: - version "2.28.0" - resolved "https://registry.yarnpkg.com/date-fns/-/date-fns-2.28.0.tgz#9570d656f5fc13143e50c975a3b6bbeb46cd08b2" - integrity sha512-8d35hViGYx/QH0icHYCeLmsLmMUheMmTyV9Fcm6gvNwdw31yXXH+O85sOBJ+OLnLQMKZowvpKb6FgMIQjcpvQw== - -dayjs@1.x: - version "1.10.8" - resolved "https://registry.yarnpkg.com/dayjs/-/dayjs-1.10.8.tgz#267df4bc6276fcb33c04a6735287e3f429abec41" - integrity sha512-wbNwDfBHHur9UOzNUjeKUOJ0fCb0a52Wx0xInmQ7Y8FstyajiV1NmK1e00cxsr9YrE9r7yAChE0VvpuY5Rnlow== - -debug@2.6.9: - version "2.6.9" - resolved "https://registry.yarnpkg.com/debug/-/debug-2.6.9.tgz#5d128515df134ff327e90a4c93f4e077a536341f" - integrity sha512-bC7ElrdJaJnPbAP+1EotYvqZsb3ecl5wi6Bfi6BJTUcNowp6cvspg0jXznRTKDjm/E7AdgFBVeAPVMNcKGsHMA== - dependencies: - ms "2.0.0" - -debug@^3.1.1: - version "3.2.7" - resolved "https://registry.yarnpkg.com/debug/-/debug-3.2.7.tgz#72580b7e9145fb39b6676f9c5e5fb100b934179a" - integrity sha512-CFjzYYAi4ThfiQvizrFQevTTXHtnCqWfe7x1AhgEscTz6ZbLbfoLRLPugTQyBth6f8ZERVUSyWHFD/7Wu4t1XQ== - dependencies: - ms "^2.1.1" - -debug@^4.1.0: - version "4.3.3" - resolved "https://registry.yarnpkg.com/debug/-/debug-4.3.3.tgz#04266e0b70a98d4462e6e288e38259213332b664" - integrity sha512-/zxw5+vh1Tfv+4Qn7a5nsbcJKPaSvCDhojn6FEl9vupwK2VCSDtEiEtqr8DFtzYFOdz63LBkxec7DYuc2jon6Q== - dependencies: - ms "2.1.2" - -deep-equal@^1.0.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/deep-equal/-/deep-equal-1.1.1.tgz#b5c98c942ceffaf7cb051e24e1434a25a2e6076a" - 
integrity sha512-yd9c5AdiqVcR+JjcwUQb9DkhJc8ngNr0MahEBGvDiJw8puWab2yZlh+nkasOnZP+EGTAP6rRp2JzJhJZzvNF8g== - dependencies: - is-arguments "^1.0.4" - is-date-object "^1.0.1" - is-regex "^1.0.4" - object-is "^1.0.1" - object-keys "^1.1.1" - regexp.prototype.flags "^1.2.0" - -default-gateway@^6.0.3: - version "6.0.3" - resolved "https://registry.yarnpkg.com/default-gateway/-/default-gateway-6.0.3.tgz#819494c888053bdb743edbf343d6cdf7f2943a71" - integrity sha512-fwSOJsbbNzZ/CUFpqFBqYfYNLj1NbMPm8MMCIzHjC83iSJRBEGmDUxU+WP661BaBQImeC2yHwXtz+P/O9o+XEg== - dependencies: - execa "^5.0.0" - -define-lazy-prop@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/define-lazy-prop/-/define-lazy-prop-2.0.0.tgz#3f7ae421129bcaaac9bc74905c98a0009ec9ee7f" - integrity sha512-Ds09qNh8yw3khSjiJjiUInaGX9xlqZDY7JVryGxdxV7NPeuqQfplOpQ66yJFZut3jLa5zOwkXw1g9EI2uKh4Og== - -define-properties@^1.1.3: - version "1.1.3" - resolved "https://registry.yarnpkg.com/define-properties/-/define-properties-1.1.3.tgz#cf88da6cbee26fe6db7094f61d870cbd84cee9f1" - integrity sha512-3MqfYKj2lLzdMSf8ZIZE/V+Zuy+BgD6f164e8K2w7dgnpKArBDerGYpM46IYYcjnkdPNMjPk9A6VFB8+3SKlXQ== - dependencies: - object-keys "^1.0.12" - -del@^6.0.0: - version "6.0.0" - resolved "https://registry.yarnpkg.com/del/-/del-6.0.0.tgz#0b40d0332cea743f1614f818be4feb717714c952" - integrity sha512-1shh9DQ23L16oXSZKB2JxpL7iMy2E0S9d517ptA1P8iw0alkPtQcrKH7ru31rYtKwF499HkTu+DRzq3TCKDFRQ== - dependencies: - globby "^11.0.1" - graceful-fs "^4.2.4" - is-glob "^4.0.1" - is-path-cwd "^2.2.0" - is-path-inside "^3.0.2" - p-map "^4.0.0" - rimraf "^3.0.2" - slash "^3.0.0" - -depd@~1.1.2: - version "1.1.2" - resolved "https://registry.yarnpkg.com/depd/-/depd-1.1.2.tgz#9bcd52e14c097763e749b274c4346ed2e560b5a9" - integrity sha1-m81S4UwJd2PnSbJ0xDRu0uVgtak= - -destroy@~1.0.4: - version "1.0.4" - resolved "https://registry.yarnpkg.com/destroy/-/destroy-1.0.4.tgz#978857442c44749e4206613e37946205826abd80" - integrity sha1-l4hXRCxEdJ5CBmE+N5RiBYJqvYA= - 
-detect-node@^2.0.4: - version "2.1.0" - resolved "https://registry.yarnpkg.com/detect-node/-/detect-node-2.1.0.tgz#c9c70775a49c3d03bc2c06d9a73be550f978f8b1" - integrity sha512-T0NIuQpnTvFDATNuHN5roPwSBG83rFsuO+MXXH9/3N1eFbn4wcPjttvjMLEPWJ0RGUYgQE7cGgS3tNxbqCGM7g== - -dir-glob@^3.0.1: - version "3.0.1" - resolved "https://registry.yarnpkg.com/dir-glob/-/dir-glob-3.0.1.tgz#56dbf73d992a4a93ba1584f4534063fd2e41717f" - integrity sha512-WkrWp9GR4KXfKGYzOLmTuGVi1UWFfws377n9cc55/tb6DuqyF6pcQ5AbiHEshaDpY9v6oaSr2XCDidGmMwdzIA== - dependencies: - path-type "^4.0.0" - -dns-equal@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/dns-equal/-/dns-equal-1.0.0.tgz#b39e7f1da6eb0a75ba9c17324b34753c47e0654d" - integrity sha1-s55/HabrCnW6nBcySzR1PEfgZU0= - -dns-packet@^1.3.1: - version "1.3.4" - resolved "https://registry.yarnpkg.com/dns-packet/-/dns-packet-1.3.4.tgz#e3455065824a2507ba886c55a89963bb107dec6f" - integrity sha512-BQ6F4vycLXBvdrJZ6S3gZewt6rcrks9KBgM9vrhW+knGRqc8uEdT7fuCwloc7nny5xNoMJ17HGH0R/6fpo8ECA== - dependencies: - ip "^1.1.0" - safe-buffer "^5.0.1" - -dns-txt@^2.0.2: - version "2.0.2" - resolved "https://registry.yarnpkg.com/dns-txt/-/dns-txt-2.0.2.tgz#b91d806f5d27188e4ab3e7d107d881a1cc4642b6" - integrity sha1-uR2Ab10nGI5Ks+fRB9iBocxGQrY= - dependencies: - buffer-indexof "^1.0.0" - -dom-align@^1.7.0: - version "1.12.2" - resolved "https://registry.yarnpkg.com/dom-align/-/dom-align-1.12.2.tgz#0f8164ebd0c9c21b0c790310493cd855892acd4b" - integrity sha512-pHuazgqrsTFrGU2WLDdXxCFabkdQDx72ddkraZNih1KsMcN5qsRSTR9O4VJRlwTPCPb5COYg3LOfiMHHcPInHg== - -dom-converter@^0.2.0: - version "0.2.0" - resolved "https://registry.yarnpkg.com/dom-converter/-/dom-converter-0.2.0.tgz#6721a9daee2e293682955b6afe416771627bb768" - integrity sha512-gd3ypIPfOMr9h5jIKq8E3sHOTCjeirnl0WK5ZdS1AW0Odt0b1PaWaHdJ4Qk4klv+YB9aJBS7mESXjFoDQPu6DA== - dependencies: - utila "~0.4" - -dom-helpers@^5.0.1: - version "5.2.1" - resolved 
"https://registry.yarnpkg.com/dom-helpers/-/dom-helpers-5.2.1.tgz#d9400536b2bf8225ad98fe052e029451ac40e902" - integrity sha512-nRCa7CK3VTrM2NmGkIy4cbK7IZlgBE/PYMn55rrXefr5xXDP0LdtfPnblFDoVdcAfslJ7or6iqAUnx0CCGIWQA== - dependencies: - "@babel/runtime" "^7.8.7" - csstype "^3.0.2" - -dom-serializer@^1.0.1: - version "1.3.2" - resolved "https://registry.yarnpkg.com/dom-serializer/-/dom-serializer-1.3.2.tgz#6206437d32ceefaec7161803230c7a20bc1b4d91" - integrity sha512-5c54Bk5Dw4qAxNOI1pFEizPSjVsx5+bpJKmL2kPn8JhBUq2q09tTCa3mjijun2NfK78NMouDYNMBkOrPZiS+ig== - dependencies: - domelementtype "^2.0.1" - domhandler "^4.2.0" - entities "^2.0.0" - -domelementtype@^2.0.1, domelementtype@^2.2.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/domelementtype/-/domelementtype-2.2.0.tgz#9a0b6c2782ed6a1c7323d42267183df9bd8b1d57" - integrity sha512-DtBMo82pv1dFtUmHyr48beiuq792Sxohr+8Hm9zoxklYPfa6n0Z3Byjj2IV7bmr2IyqClnqEQhfgHJJ5QF0R5A== - -domhandler@^4.0.0, domhandler@^4.2.0, domhandler@^4.3.0: - version "4.3.0" - resolved "https://registry.yarnpkg.com/domhandler/-/domhandler-4.3.0.tgz#16c658c626cf966967e306f966b431f77d4a5626" - integrity sha512-fC0aXNQXqKSFTr2wDNZDhsEYjCiYsDWl3D01kwt25hm1YIPyDGHvvi3rw+PLqHAl/m71MaiF7d5zvBr0p5UB2g== - dependencies: - domelementtype "^2.2.0" - -domutils@^2.5.2, domutils@^2.8.0: - version "2.8.0" - resolved "https://registry.yarnpkg.com/domutils/-/domutils-2.8.0.tgz#4437def5db6e2d1f5d6ee859bd95ca7d02048135" - integrity sha512-w96Cjofp72M5IIhpjgobBimYEfoPjx1Vx0BSX9P30WBdZW2WIKU0T1Bd0kz2eNZ9ikjKgHbEyKx8BB6H1L3h3A== - dependencies: - dom-serializer "^1.0.1" - domelementtype "^2.2.0" - domhandler "^4.2.0" - -dot-case@^3.0.4: - version "3.0.4" - resolved "https://registry.yarnpkg.com/dot-case/-/dot-case-3.0.4.tgz#9b2b670d00a431667a8a75ba29cd1b98809ce751" - integrity sha512-Kv5nKlh6yRrdrGvxeJ2e5y2eRUpkUosIW4A2AS38zwSz27zu7ufDwQPi5Jhs3XAlGNetl3bmnGhQsMtkKJnj3w== - dependencies: - no-case "^3.0.4" - tslib "^2.0.3" - -ee-first@1.1.1: - version "1.1.1" 
- resolved "https://registry.yarnpkg.com/ee-first/-/ee-first-1.1.1.tgz#590c61156b0ae2f4f0255732a158b266bc56b21d" - integrity sha1-WQxhFWsK4vTwJVcyoViyZrxWsh0= - -electron-to-chromium@^1.4.76: - version "1.4.76" - resolved "https://registry.yarnpkg.com/electron-to-chromium/-/electron-to-chromium-1.4.76.tgz#a0494baedaf51094b1c172999919becd9975a934" - integrity sha512-3Vftv7cenJtQb+k00McEBZ2vVmZ/x+HEF7pcZONZIkOsESqAqVuACmBxMv0JhzX7u0YltU0vSqRqgBSTAhFUjA== - -emojis-list@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/emojis-list/-/emojis-list-3.0.0.tgz#5570662046ad29e2e916e71aae260abdff4f6a78" - integrity sha512-/kyM18EfinwXZbno9FyUGeFh87KC8HRQBQGildHZbEuRyWFOmv1U10o9BBp8XVZDVNNuQKyIGIu5ZYAAXJ0V2Q== - -encodeurl@~1.0.2: - version "1.0.2" - resolved "https://registry.yarnpkg.com/encodeurl/-/encodeurl-1.0.2.tgz#ad3ff4c86ec2d029322f5a02c3a9a606c95b3f59" - integrity sha1-rT/0yG7C0CkyL1oCw6mmBslbP1k= - -enhanced-resolve@^4.0.0: - version "4.5.0" - resolved "https://registry.yarnpkg.com/enhanced-resolve/-/enhanced-resolve-4.5.0.tgz#2f3cfd84dbe3b487f18f2db2ef1e064a571ca5ec" - integrity sha512-Nv9m36S/vxpsI+Hc4/ZGRs0n9mXqSWGGq49zxb/cJfPAQMbUtttJAlNPS4AQzaBdw/pKskw5bMbekT/Y7W/Wlg== - dependencies: - graceful-fs "^4.1.2" - memory-fs "^0.5.0" - tapable "^1.0.0" - -enhanced-resolve@^5.9.2: - version "5.9.2" - resolved "https://registry.yarnpkg.com/enhanced-resolve/-/enhanced-resolve-5.9.2.tgz#0224dcd6a43389ebfb2d55efee517e5466772dd9" - integrity sha512-GIm3fQfwLJ8YZx2smuHpBKkXC1yOk+OBEmKckVyL0i/ea8mqDEykK3ld5dgH1QYPNyT/lIllxV2LULnxCHaHkA== - dependencies: - graceful-fs "^4.2.4" - tapable "^2.2.0" - -entities@^2.0.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/entities/-/entities-2.2.0.tgz#098dc90ebb83d8dffa089d55256b351d34c4da55" - integrity sha512-p92if5Nz619I0w+akJrLZH0MX0Pb5DX39XOwQTtXSdQQOaYH03S1uIQp4mhOZtAXrxq4ViO67YTiLBo2638o9A== - -envinfo@^7.7.3: - version "7.8.1" - resolved 
"https://registry.yarnpkg.com/envinfo/-/envinfo-7.8.1.tgz#06377e3e5f4d379fea7ac592d5ad8927e0c4d475" - integrity sha512-/o+BXHmB7ocbHEAs6F2EnG0ogybVVUdkRunTT2glZU9XAaGmhqskrvKwqXuDfNjEO0LZKWdejEEpnq8aM0tOaw== - -errno@^0.1.3: - version "0.1.8" - resolved "https://registry.yarnpkg.com/errno/-/errno-0.1.8.tgz#8bb3e9c7d463be4976ff888f76b4809ebc2e811f" - integrity sha512-dJ6oBr5SQ1VSd9qkk7ByRgb/1SH4JZjCHSW/mr63/QcXO9zLVxvJ6Oy13nio03rxpSnVDDjFor75SjVeZWPW/A== - dependencies: - prr "~1.0.1" - -es-module-lexer@^0.9.0: - version "0.9.3" - resolved "https://registry.yarnpkg.com/es-module-lexer/-/es-module-lexer-0.9.3.tgz#6f13db00cc38417137daf74366f535c8eb438f19" - integrity sha512-1HQ2M2sPtxwnvOvT1ZClHyQDiggdNjURWpY2we6aMKCQiUVxTmVs2UYPLIrD84sS+kMdUwfBSylbJPwNnBrnHQ== - -escalade@^3.1.1: - version "3.1.1" - resolved "https://registry.yarnpkg.com/escalade/-/escalade-3.1.1.tgz#d8cfdc7000965c5a0174b4a82eaa5c0552742e40" - integrity sha512-k0er2gUkLf8O0zKJiAhmkTnJlTvINGv7ygDNPbeIsX/TJjGJZHuh9B2UxbsaEkmlEo9MfhrSzmhIlhRlI2GXnw== - -escape-html@~1.0.3: - version "1.0.3" - resolved "https://registry.yarnpkg.com/escape-html/-/escape-html-1.0.3.tgz#0258eae4d3d0c0974de1c169188ef0051d1d1988" - integrity sha1-Aljq5NPQwJdN4cFpGI7wBR0dGYg= - -eslint-scope@5.1.1: - version "5.1.1" - resolved "https://registry.yarnpkg.com/eslint-scope/-/eslint-scope-5.1.1.tgz#e786e59a66cb92b3f6c1fb0d508aab174848f48c" - integrity sha512-2NxwbF/hZ0KpepYN0cNbo+FN6XoK7GaHlQhgx/hIZl6Va0bF45RQOOwhLIy8lQDbuCiadSLCBnH2CFYquit5bw== - dependencies: - esrecurse "^4.3.0" - estraverse "^4.1.1" - -esrecurse@^4.3.0: - version "4.3.0" - resolved "https://registry.yarnpkg.com/esrecurse/-/esrecurse-4.3.0.tgz#7ad7964d679abb28bee72cec63758b1c5d2c9921" - integrity sha512-KmfKL3b6G+RXvP8N1vr3Tq1kL/oCFgn2NYXEtqP8/L3pKapUA4G8cFVaoF3SU323CD4XypR/ffioHmkti6/Tag== - dependencies: - estraverse "^5.2.0" - -estraverse@^4.1.1: - version "4.3.0" - resolved 
"https://registry.yarnpkg.com/estraverse/-/estraverse-4.3.0.tgz#398ad3f3c5a24948be7725e83d11a7de28cdbd1d" - integrity sha512-39nnKffWz8xN1BU/2c79n9nB9HDzo0niYUqx6xyqUnyoAnQyyWpOTdZEeiCch8BBu515t4wp9ZmgVfVhn9EBpw== - -estraverse@^5.2.0: - version "5.3.0" - resolved "https://registry.yarnpkg.com/estraverse/-/estraverse-5.3.0.tgz#2eea5290702f26ab8fe5370370ff86c965d21123" - integrity sha512-MMdARuVEQziNTeJD8DgMqmhwR11BRQ/cBP+pLtYdSTnf3MIO8fFeiINEbX36ZdNlfU/7A9f3gUw49B3oQsvwBA== - -etag@~1.8.1: - version "1.8.1" - resolved "https://registry.yarnpkg.com/etag/-/etag-1.8.1.tgz#41ae2eeb65efa62268aebfea83ac7d79299b0887" - integrity sha1-Qa4u62XvpiJorr/qg6x9eSmbCIc= - -eventemitter3@^4.0.0: - version "4.0.7" - resolved "https://registry.yarnpkg.com/eventemitter3/-/eventemitter3-4.0.7.tgz#2de9b68f6528d5644ef5c59526a1b4a07306169f" - integrity sha512-8guHBZCwKnFhYdHr2ysuRWErTwhoN2X8XELRlrRwpmfeY2jjuUN4taQMsULKUVo1K4DvZl+0pgfyoysHxvmvEw== - -events@^3.2.0: - version "3.3.0" - resolved "https://registry.yarnpkg.com/events/-/events-3.3.0.tgz#31a95ad0a924e2d2c419a813aeb2c4e878ea7400" - integrity sha512-mQw+2fkQbALzQ7V0MY0IqdnXNOeTtP4r0lN9z7AAawCXgqea7bDii20AYrIBrFd/Hx0M2Ocz6S111CaFkUcb0Q== - -execa@^5.0.0: - version "5.1.1" - resolved "https://registry.yarnpkg.com/execa/-/execa-5.1.1.tgz#f80ad9cbf4298f7bd1d4c9555c21e93741c411dd" - integrity sha512-8uSpZZocAZRBAPIEINJj3Lo9HyGitllczc27Eh5YYojjMFMn8yHMDMaUHE2Jqfq05D/wucwI4JGURyXt1vchyg== - dependencies: - cross-spawn "^7.0.3" - get-stream "^6.0.0" - human-signals "^2.1.0" - is-stream "^2.0.0" - merge-stream "^2.0.0" - npm-run-path "^4.0.1" - onetime "^5.1.2" - signal-exit "^3.0.3" - strip-final-newline "^2.0.0" - -express@^4.17.1: - version "4.17.3" - resolved "https://registry.yarnpkg.com/express/-/express-4.17.3.tgz#f6c7302194a4fb54271b73a1fe7a06478c8f85a1" - integrity sha512-yuSQpz5I+Ch7gFrPCk4/c+dIBKlQUxtgwqzph132bsT6qhuzss6I8cLJQz7B3rFblzd6wtcI0ZbGltH/C4LjUg== - dependencies: - accepts "~1.3.8" - array-flatten "1.1.1" - 
body-parser "1.19.2" - content-disposition "0.5.4" - content-type "~1.0.4" - cookie "0.4.2" - cookie-signature "1.0.6" - debug "2.6.9" - depd "~1.1.2" - encodeurl "~1.0.2" - escape-html "~1.0.3" - etag "~1.8.1" - finalhandler "~1.1.2" - fresh "0.5.2" - merge-descriptors "1.0.1" - methods "~1.1.2" - on-finished "~2.3.0" - parseurl "~1.3.3" - path-to-regexp "0.1.7" - proxy-addr "~2.0.7" - qs "6.9.7" - range-parser "~1.2.1" - safe-buffer "5.2.1" - send "0.17.2" - serve-static "1.14.2" - setprototypeof "1.2.0" - statuses "~1.5.0" - type-is "~1.6.18" - utils-merge "1.0.1" - vary "~1.1.2" - -fast-deep-equal@^3.1.1, fast-deep-equal@^3.1.3: - version "3.1.3" - resolved "https://registry.yarnpkg.com/fast-deep-equal/-/fast-deep-equal-3.1.3.tgz#3a7d56b559d6cbc3eb512325244e619a65c6c525" - integrity sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q== - -fast-glob@^3.2.9: - version "3.2.11" - resolved "https://registry.yarnpkg.com/fast-glob/-/fast-glob-3.2.11.tgz#a1172ad95ceb8a16e20caa5c5e56480e5129c1d9" - integrity sha512-xrO3+1bxSo3ZVHAnqzyuewYT6aMFHRAd4Kcs92MAonjwQZLsK9d0SF1IyQ3k5PoirxTW0Oe/RqFgMQ6TcNE5Ew== - dependencies: - "@nodelib/fs.stat" "^2.0.2" - "@nodelib/fs.walk" "^1.2.3" - glob-parent "^5.1.2" - merge2 "^1.3.0" - micromatch "^4.0.4" - -fast-json-stable-stringify@^2.0.0: - version "2.1.0" - resolved "https://registry.yarnpkg.com/fast-json-stable-stringify/-/fast-json-stable-stringify-2.1.0.tgz#874bf69c6f404c2b5d99c481341399fd55892633" - integrity sha512-lhd/wF+Lk98HZoTCtlVraHtfh5XYijIjalXck7saUtuanSDyLMxnHhSXEDJqHxD7msR8D0uCmqlkwjCV8xvwHw== - -fastest-levenshtein@^1.0.12: - version "1.0.12" - resolved "https://registry.yarnpkg.com/fastest-levenshtein/-/fastest-levenshtein-1.0.12.tgz#9990f7d3a88cc5a9ffd1f1745745251700d497e2" - integrity sha512-On2N+BpYJ15xIC974QNVuYGMOlEVt4s0EOI3wwMqOmK1fdDY+FN/zltPV8vosq4ad4c/gJ1KHScUn/6AWIgiow== - -fastq@^1.6.0: - version "1.13.0" - resolved 
"https://registry.yarnpkg.com/fastq/-/fastq-1.13.0.tgz#616760f88a7526bdfc596b7cab8c18938c36b98c" - integrity sha512-YpkpUnK8od0o1hmeSc7UUs/eB/vIPWJYjKck2QKIzAf71Vm1AAQ3EbuZB3g2JIy+pg+ERD0vqI79KyZiB2e2Nw== - dependencies: - reusify "^1.0.4" - -faye-websocket@^0.11.3: - version "0.11.4" - resolved "https://registry.yarnpkg.com/faye-websocket/-/faye-websocket-0.11.4.tgz#7f0d9275cfdd86a1c963dc8b65fcc451edcbb1da" - integrity sha512-CzbClwlXAuiRQAlUyfqPgvPoNKTckTPGfwZV4ZdAhVcP2lh9KUxJg2b5GkE7XbjKQ3YJnQ9z6D9ntLAlB+tP8g== - dependencies: - websocket-driver ">=0.5.1" - -fill-range@^7.0.1: - version "7.0.1" - resolved "https://registry.yarnpkg.com/fill-range/-/fill-range-7.0.1.tgz#1919a6a7c75fe38b2c7c77e5198535da9acdda40" - integrity sha512-qOo9F+dMUmC2Lcb4BbVvnKJxTPjCm+RRpe4gDuGrzkL7mEVl/djYSu2OdQ2Pa302N4oqkSg9ir6jaLWJ2USVpQ== - dependencies: - to-regex-range "^5.0.1" - -finalhandler@~1.1.2: - version "1.1.2" - resolved "https://registry.yarnpkg.com/finalhandler/-/finalhandler-1.1.2.tgz#b7e7d000ffd11938d0fdb053506f6ebabe9f587d" - integrity sha512-aAWcW57uxVNrQZqFXjITpW3sIUQmHGG3qSb9mUah9MgMC4NeWhNOlNjXEYq3HjRAvL6arUviZGGJsBg6z0zsWA== - dependencies: - debug "2.6.9" - encodeurl "~1.0.2" - escape-html "~1.0.3" - on-finished "~2.3.0" - parseurl "~1.3.3" - statuses "~1.5.0" - unpipe "~1.0.0" - -find-up@^4.0.0: - version "4.1.0" - resolved "https://registry.yarnpkg.com/find-up/-/find-up-4.1.0.tgz#97afe7d6cdc0bc5928584b7c8d7b16e8a9aa5d19" - integrity sha512-PpOwAdQ/YlXQ2vj8a3h8IipDuYRi3wceVQQGYWxNINccq40Anw7BlsEXCMbt1Zt+OLA6Fq9suIpIWD0OsnISlw== - dependencies: - locate-path "^5.0.0" - path-exists "^4.0.0" - -flow-bin@^0.118.0: - version "0.118.0" - resolved "https://registry.yarnpkg.com/flow-bin/-/flow-bin-0.118.0.tgz#fb706364a58c682d67a2ca7df39396467dc397d1" - integrity sha512-jlbUu0XkbpXeXhan5xyTqVK1jmEKNxE8hpzznI3TThHTr76GiFwK0iRzhDo4KNy+S9h/KxHaqVhTP86vA6wHCg== - -follow-redirects@^1.0.0: - version "1.14.9" - resolved 
"https://registry.yarnpkg.com/follow-redirects/-/follow-redirects-1.14.9.tgz#dd4ea157de7bfaf9ea9b3fbd85aa16951f78d8d7" - integrity sha512-MQDfihBQYMcyy5dhRDJUHcw7lb2Pv/TuE6xP1vyraLukNDHKbDxDNaOE3NbCAdKQApno+GPRyo1YAp89yCjK4w== - -forwarded@0.2.0: - version "0.2.0" - resolved "https://registry.yarnpkg.com/forwarded/-/forwarded-0.2.0.tgz#2269936428aad4c15c7ebe9779a84bf0b2a81811" - integrity sha512-buRG0fpBtRHSTCOASe6hD258tEubFoRLb4ZNA6NxMVHNw2gOcwHo9wyablzMzOA5z9xA9L1KNjk/Nt6MT9aYow== - -fresh@0.5.2: - version "0.5.2" - resolved "https://registry.yarnpkg.com/fresh/-/fresh-0.5.2.tgz#3d8cadd90d976569fa835ab1f8e4b23a105605a7" - integrity sha1-PYyt2Q2XZWn6g1qx+OSyOhBWBac= - -fs-monkey@1.0.3: - version "1.0.3" - resolved "https://registry.yarnpkg.com/fs-monkey/-/fs-monkey-1.0.3.tgz#ae3ac92d53bb328efe0e9a1d9541f6ad8d48e2d3" - integrity sha512-cybjIfiiE+pTWicSCLFHSrXZ6EilF30oh91FDP9S2B051prEa7QWfrVTQm10/dDpswBDXZugPa1Ogu8Yh+HV0Q== - -fs.realpath@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/fs.realpath/-/fs.realpath-1.0.0.tgz#1504ad2523158caa40db4a2787cb01411994ea4f" - integrity sha1-FQStJSMVjKpA20onh8sBQRmU6k8= - -fsevents@~2.3.2: - version "2.3.2" - resolved "https://registry.yarnpkg.com/fsevents/-/fsevents-2.3.2.tgz#8a526f78b8fdf4623b709e0b975c52c24c02fd1a" - integrity sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA== - -function-bind@^1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/function-bind/-/function-bind-1.1.1.tgz#a56899d3ea3c9bab874bb9773b7c5ede92f4895d" - integrity sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A== - -get-intrinsic@^1.0.2: - version "1.1.1" - resolved "https://registry.yarnpkg.com/get-intrinsic/-/get-intrinsic-1.1.1.tgz#15f59f376f855c446963948f0d24cd3637b4abc6" - integrity sha512-kWZrnVM42QCiEA2Ig1bG8zjoIMOgxWwYCEeNdwY6Tv/cOSeGpcoX4pXHfKUxNKVoArnrEr2e9srnAxxGIraS9Q== - dependencies: - function-bind "^1.1.1" - has 
"^1.0.3" - has-symbols "^1.0.1" - -get-stream@^6.0.0: - version "6.0.1" - resolved "https://registry.yarnpkg.com/get-stream/-/get-stream-6.0.1.tgz#a262d8eef67aced57c2852ad6167526a43cbf7b7" - integrity sha512-ts6Wi+2j3jQjqi70w5AlN8DFnkSwC+MqmxEzdEALB2qXZYV3X/b1CTfgPLGJNMeAWxdPfU8FO1ms3NUfaHCPYg== - -glob-parent@^5.1.2, glob-parent@~5.1.2: - version "5.1.2" - resolved "https://registry.yarnpkg.com/glob-parent/-/glob-parent-5.1.2.tgz#869832c58034fe68a4093c17dc15e8340d8401c4" - integrity sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow== - dependencies: - is-glob "^4.0.1" - -glob-to-regexp@^0.4.1: - version "0.4.1" - resolved "https://registry.yarnpkg.com/glob-to-regexp/-/glob-to-regexp-0.4.1.tgz#c75297087c851b9a578bd217dd59a92f59fe546e" - integrity sha512-lkX1HJXwyMcprw/5YUZc2s7DrpAiHB21/V+E1rHUrVNokkvB6bqMzT0VfV6/86ZNabt1k14YOIaT7nDvOX3Iiw== - -glob@^7.1.3: - version "7.2.0" - resolved "https://registry.yarnpkg.com/glob/-/glob-7.2.0.tgz#d15535af7732e02e948f4c41628bd910293f6023" - integrity sha512-lmLf6gtyrPq8tTjSmrO94wBeQbFR3HbLHbuyD69wuyQkImp2hWqMGB47OX65FBkPffO641IP9jWa1z4ivqG26Q== - dependencies: - fs.realpath "^1.0.0" - inflight "^1.0.4" - inherits "2" - minimatch "^3.0.4" - once "^1.3.0" - path-is-absolute "^1.0.0" - -globby@^11.0.1: - version "11.1.0" - resolved "https://registry.yarnpkg.com/globby/-/globby-11.1.0.tgz#bd4be98bb042f83d796f7e3811991fbe82a0d34b" - integrity sha512-jhIXaOzy1sb8IyocaruWSn1TjmnBVs8Ayhcy83rmxNJ8q2uWKCAj3CnJY+KpGSXCueAPc0i05kVvVKtP1t9S3g== - dependencies: - array-union "^2.1.0" - dir-glob "^3.0.1" - fast-glob "^3.2.9" - ignore "^5.2.0" - merge2 "^1.4.1" - slash "^3.0.0" - -graceful-fs@^4.1.2, graceful-fs@^4.2.4, graceful-fs@^4.2.6, graceful-fs@^4.2.9: - version "4.2.9" - resolved "https://registry.yarnpkg.com/graceful-fs/-/graceful-fs-4.2.9.tgz#041b05df45755e587a24942279b9d113146e1c96" - integrity sha512-NtNxqUcXgpW2iMrfqSfR73Glt39K+BLwWsPs94yR63v45T0Wbej7eRmL5cWfwEgqXnmjQp3zaJTshdRW/qC2ZQ== 
- -handle-thing@^2.0.0: - version "2.0.1" - resolved "https://registry.yarnpkg.com/handle-thing/-/handle-thing-2.0.1.tgz#857f79ce359580c340d43081cc648970d0bb234e" - integrity sha512-9Qn4yBxelxoh2Ow62nP+Ka/kMnOXRi8BXnRaUwezLNhqelnN49xKz4F/dPP8OYLxLxq6JDtZb2i9XznUQbNPTg== - -has-flag@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/has-flag/-/has-flag-4.0.0.tgz#944771fd9c81c81265c4d6941860da06bb59479b" - integrity sha512-EykJT/Q1KjTWctppgIAgfSO0tKVuZUjhgMr17kqTumMl6Afv3EISleU7qZUzoXDFTAHTDC4NOoG/ZxU3EvlMPQ== - -has-symbols@^1.0.1, has-symbols@^1.0.2: - version "1.0.3" - resolved "https://registry.yarnpkg.com/has-symbols/-/has-symbols-1.0.3.tgz#bb7b2c4349251dce87b125f7bdf874aa7c8b39f8" - integrity sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A== - -has-tostringtag@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/has-tostringtag/-/has-tostringtag-1.0.0.tgz#7e133818a7d394734f941e73c3d3f9291e658b25" - integrity sha512-kFjcSNhnlGV1kyoGk7OXKSawH5JOb/LzUc5w9B02hOTO0dfFRjbHQKvg1d6cf3HbeUmtU9VbbV3qzZ2Teh97WQ== - dependencies: - has-symbols "^1.0.2" - -has@^1.0.3: - version "1.0.3" - resolved "https://registry.yarnpkg.com/has/-/has-1.0.3.tgz#722d7cbfc1f6aa8241f16dd814e011e1f41e8796" - integrity sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw== - dependencies: - function-bind "^1.1.1" - -he@^1.2.0: - version "1.2.0" - resolved "https://registry.yarnpkg.com/he/-/he-1.2.0.tgz#84ae65fa7eafb165fddb61566ae14baf05664f0f" - integrity sha512-F/1DnUGPopORZi0ni+CvrCgHQ5FyEAHRLSApuYWMmrbSwoN2Mn/7k+Gl38gJnR7yyDZk6WLXwiGod1JOWNDKGw== - -hoist-non-react-statics@^3.3.2: - version "3.3.2" - resolved "https://registry.yarnpkg.com/hoist-non-react-statics/-/hoist-non-react-statics-3.3.2.tgz#ece0acaf71d62c2969c2ec59feff42a4b1a85b45" - integrity sha512-/gGivxi8JPKWNm/W0jSmzcMPpfpPLc3dY/6GxhX2hQ9iGj3aDfklV4ET7NjKpSinLpJ5vafa9iiGIEZg10SfBw== - dependencies: - react-is 
"^16.7.0" - -hpack.js@^2.1.6: - version "2.1.6" - resolved "https://registry.yarnpkg.com/hpack.js/-/hpack.js-2.1.6.tgz#87774c0949e513f42e84575b3c45681fade2a0b2" - integrity sha1-h3dMCUnlE/QuhFdbPEVoH63ioLI= - dependencies: - inherits "^2.0.1" - obuf "^1.0.0" - readable-stream "^2.0.1" - wbuf "^1.1.0" - -html-entities@^2.3.2: - version "2.3.2" - resolved "https://registry.yarnpkg.com/html-entities/-/html-entities-2.3.2.tgz#760b404685cb1d794e4f4b744332e3b00dcfe488" - integrity sha512-c3Ab/url5ksaT0WyleslpBEthOzWhrjQbg75y7XUsfSzi3Dgzt0l8w5e7DylRn15MTlMMD58dTfzddNS2kcAjQ== - -html-minifier-terser@^6.0.2: - version "6.1.0" - resolved "https://registry.yarnpkg.com/html-minifier-terser/-/html-minifier-terser-6.1.0.tgz#bfc818934cc07918f6b3669f5774ecdfd48f32ab" - integrity sha512-YXxSlJBZTP7RS3tWnQw74ooKa6L9b9i9QYXY21eUEvhZ3u9XLfv6OnFsQq6RxkhHygsaUMvYsZRV5rU/OVNZxw== - dependencies: - camel-case "^4.1.2" - clean-css "^5.2.2" - commander "^8.3.0" - he "^1.2.0" - param-case "^3.0.4" - relateurl "^0.2.7" - terser "^5.10.0" - -html-webpack-plugin@^5.3.1: - version "5.5.0" - resolved "https://registry.yarnpkg.com/html-webpack-plugin/-/html-webpack-plugin-5.5.0.tgz#c3911936f57681c1f9f4d8b68c158cd9dfe52f50" - integrity sha512-sy88PC2cRTVxvETRgUHFrL4No3UxvcH8G1NepGhqaTT+GXN2kTamqasot0inS5hXeg1cMbFDt27zzo9p35lZVw== - dependencies: - "@types/html-minifier-terser" "^6.0.0" - html-minifier-terser "^6.0.2" - lodash "^4.17.21" - pretty-error "^4.0.0" - tapable "^2.0.0" - -htmlparser2@^6.1.0: - version "6.1.0" - resolved "https://registry.yarnpkg.com/htmlparser2/-/htmlparser2-6.1.0.tgz#c4d762b6c3371a05dbe65e94ae43a9f845fb8fb7" - integrity sha512-gyyPk6rgonLFEDGoeRgQNaEUvdJ4ktTmmUh/h2t7s+M8oPpIPxgNACWa+6ESR57kXstwqPiCut0V8NRpcwgU7A== - dependencies: - domelementtype "^2.0.1" - domhandler "^4.0.0" - domutils "^2.5.2" - entities "^2.0.0" - -http-deceiver@^1.2.7: - version "1.2.7" - resolved 
"https://registry.yarnpkg.com/http-deceiver/-/http-deceiver-1.2.7.tgz#fa7168944ab9a519d337cb0bec7284dc3e723d87" - integrity sha1-+nFolEq5pRnTN8sL7HKE3D5yPYc= - -http-errors@1.8.1: - version "1.8.1" - resolved "https://registry.yarnpkg.com/http-errors/-/http-errors-1.8.1.tgz#7c3f28577cbc8a207388455dbd62295ed07bd68c" - integrity sha512-Kpk9Sm7NmI+RHhnj6OIWDI1d6fIoFAtFt9RLaTMRlg/8w49juAStsrBgp0Dp4OdxdVbRIeKhtCUvoi/RuAhO4g== - dependencies: - depd "~1.1.2" - inherits "2.0.4" - setprototypeof "1.2.0" - statuses ">= 1.5.0 < 2" - toidentifier "1.0.1" - -http-errors@~1.6.2: - version "1.6.3" - resolved "https://registry.yarnpkg.com/http-errors/-/http-errors-1.6.3.tgz#8b55680bb4be283a0b5bf4ea2e38580be1d9320d" - integrity sha1-i1VoC7S+KDoLW/TqLjhYC+HZMg0= - dependencies: - depd "~1.1.2" - inherits "2.0.3" - setprototypeof "1.1.0" - statuses ">= 1.4.0 < 2" - -http-parser-js@>=0.5.1: - version "0.5.6" - resolved "https://registry.yarnpkg.com/http-parser-js/-/http-parser-js-0.5.6.tgz#2e02406ab2df8af8a7abfba62e0da01c62b95afd" - integrity sha512-vDlkRPDJn93swjcjqMSaGSPABbIarsr1TLAui/gLDXzV5VsJNdXNzMYDyNBLQkjWQCJ1uizu8T2oDMhmGt0PRA== - -http-proxy-middleware@^2.0.0: - version "2.0.3" - resolved "https://registry.yarnpkg.com/http-proxy-middleware/-/http-proxy-middleware-2.0.3.tgz#5df04f69a89f530c2284cd71eeaa51ba52243289" - integrity sha512-1bloEwnrHMnCoO/Gcwbz7eSVvW50KPES01PecpagI+YLNLci4AcuKJrujW4Mc3sBLpFxMSlsLNHS5Nl/lvrTPA== - dependencies: - "@types/http-proxy" "^1.17.8" - http-proxy "^1.18.1" - is-glob "^4.0.1" - is-plain-obj "^3.0.0" - micromatch "^4.0.2" - -http-proxy@^1.18.1: - version "1.18.1" - resolved "https://registry.yarnpkg.com/http-proxy/-/http-proxy-1.18.1.tgz#401541f0534884bbf95260334e72f88ee3976549" - integrity sha512-7mz/721AbnJwIVbnaSv1Cz3Am0ZLT/UBwkC92VlxhXv/k/BBQfM2fXElQNC27BVGr0uwUpplYPQM9LnaBMR5NQ== - dependencies: - eventemitter3 "^4.0.0" - follow-redirects "^1.0.0" - requires-port "^1.0.0" - -human-signals@^2.1.0: - version "2.1.0" - resolved 
"https://registry.yarnpkg.com/human-signals/-/human-signals-2.1.0.tgz#dc91fcba42e4d06e4abaed33b3e7a3c02f514ea0" - integrity sha512-B4FFZ6q/T2jhhksgkbEW3HBvWIfDW85snkQgawt07S7J5QXTk6BkNV+0yAeZrM5QpMAdYlocGoljn0sJ/WQkFw== - -hyphenate-style-name@^1.0.3: - version "1.0.4" - resolved "https://registry.yarnpkg.com/hyphenate-style-name/-/hyphenate-style-name-1.0.4.tgz#691879af8e220aea5750e8827db4ef62a54e361d" - integrity sha512-ygGZLjmXfPHj+ZWh6LwbC37l43MhfztxetbFCoYTM2VjkIUpeHgSNn7QIyVFj7YQ1Wl9Cbw5sholVJPzWvC2MQ== - -iconv-lite@0.4.24: - version "0.4.24" - resolved "https://registry.yarnpkg.com/iconv-lite/-/iconv-lite-0.4.24.tgz#2022b4b25fbddc21d2f524974a474aafe733908b" - integrity sha512-v3MXnZAcvnywkTUEZomIActle7RXXeedOR31wwl7VlyoXO4Qi9arvSenNQWne1TcRwhCL1HwLI21bEqdpj8/rA== - dependencies: - safer-buffer ">= 2.1.2 < 3" - -icss-utils@^5.0.0, icss-utils@^5.1.0: - version "5.1.0" - resolved "https://registry.yarnpkg.com/icss-utils/-/icss-utils-5.1.0.tgz#c6be6858abd013d768e98366ae47e25d5887b1ae" - integrity sha512-soFhflCVWLfRNOPU3iv5Z9VUdT44xFRbzjLsEzSr5AQmgqPMTHdU3PMT1Cf1ssx8fLNJDA1juftYl+PUcv3MqA== - -ignore@^5.2.0: - version "5.2.0" - resolved "https://registry.yarnpkg.com/ignore/-/ignore-5.2.0.tgz#6d3bac8fa7fe0d45d9f9be7bac2fc279577e345a" - integrity sha512-CmxgYGiEPCLhfLnpPp1MoRmifwEIOgjcHXxOBjv7mY96c+eWScsOP9c112ZyLdWHi0FxHjI+4uVhKYp/gcdRmQ== - -import-local@^3.0.2: - version "3.1.0" - resolved "https://registry.yarnpkg.com/import-local/-/import-local-3.1.0.tgz#b4479df8a5fd44f6cdce24070675676063c95cb4" - integrity sha512-ASB07uLtnDs1o6EHjKpX34BKYDSqnFerfTOJL2HvMqF70LnxpjkzDB8J44oT9pu4AMPkQwf8jl6szgvNd2tRIg== - dependencies: - pkg-dir "^4.2.0" - resolve-cwd "^3.0.0" - -indent-string@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/indent-string/-/indent-string-4.0.0.tgz#624f8f4497d619b2d9768531d58f4122854d7251" - integrity sha512-EdDDZu4A2OyIK7Lr/2zG+w5jmbuk1DVBnEwREQvBzspBJkCEbRa8GxU1lghYcaGJCnRWibjDXlq779X1/y5xwg== - -inflight@^1.0.4: - version 
"1.0.6" - resolved "https://registry.yarnpkg.com/inflight/-/inflight-1.0.6.tgz#49bd6331d7d02d0c09bc910a1075ba8165b56df9" - integrity sha1-Sb1jMdfQLQwJvJEKEHW6gWW1bfk= - dependencies: - once "^1.3.0" - wrappy "1" - -inherits@2, inherits@2.0.4, inherits@^2.0.1, inherits@^2.0.3, inherits@~2.0.3: - version "2.0.4" - resolved "https://registry.yarnpkg.com/inherits/-/inherits-2.0.4.tgz#0fa2c64f932917c3433a0ded55363aae37416b7c" - integrity sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ== - -inherits@2.0.3: - version "2.0.3" - resolved "https://registry.yarnpkg.com/inherits/-/inherits-2.0.3.tgz#633c2c83e3da42a502f52466022480f4208261de" - integrity sha1-Yzwsg+PaQqUC9SRmAiSA9CCCYd4= - -inline-chunk-html-plugin@^1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/inline-chunk-html-plugin/-/inline-chunk-html-plugin-1.1.1.tgz#f64111aed16fac274d2b929f6a6a08671d82354e" - integrity sha512-6W1eGIj8z/Yla6xJx5il6jJfCxMZS3kVkbiLQThbbjdsDLRIWkUVmpnhfW2l6WAwCW+qfy0zoXVGBZM1E5XF3g== - -interpret@^2.2.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/interpret/-/interpret-2.2.0.tgz#1a78a0b5965c40a5416d007ad6f50ad27c417df9" - integrity sha512-Ju0Bz/cEia55xDwUWEa8+olFpCiQoypjnQySseKtmjNrnps3P+xfpUmGr90T7yjlVJmOtybRvPXhKMbHr+fWnw== - -ip@^1.1.0: - version "1.1.5" - resolved "https://registry.yarnpkg.com/ip/-/ip-1.1.5.tgz#bdded70114290828c0a039e72ef25f5aaec4354a" - integrity sha1-vd7XARQpCCjAoDnnLvJfWq7ENUo= - -ipaddr.js@1.9.1: - version "1.9.1" - resolved "https://registry.yarnpkg.com/ipaddr.js/-/ipaddr.js-1.9.1.tgz#bff38543eeb8984825079ff3a2a8e6cbd46781b3" - integrity sha512-0KI/607xoxSToH7GjN1FfSbLoU0+btTicjsQSWQlh/hZykN8KpmMf7uYwPW3R+akZ6R/w18ZlXSHBYXiYUPO3g== - -ipaddr.js@^2.0.1: - version "2.0.1" - resolved "https://registry.yarnpkg.com/ipaddr.js/-/ipaddr.js-2.0.1.tgz#eca256a7a877e917aeb368b0a7497ddf42ef81c0" - integrity sha512-1qTgH9NG+IIJ4yfKs2e6Pp1bZg8wbDbKHT21HrLIeYBTRLgMYKnMTPAuI3Lcs61nfx5h1xlXnbJtH1kX5/d/ng== 
- -is-arguments@^1.0.4: - version "1.1.1" - resolved "https://registry.yarnpkg.com/is-arguments/-/is-arguments-1.1.1.tgz#15b3f88fda01f2a97fec84ca761a560f123efa9b" - integrity sha512-8Q7EARjzEnKpt/PCD7e1cgUS0a6X8u5tdSiMqXhojOdoV9TsMsiO+9VLC5vAmO8N7/GmXn7yjR8qnA6bVAEzfA== - dependencies: - call-bind "^1.0.2" - has-tostringtag "^1.0.0" - -is-binary-path@~2.1.0: - version "2.1.0" - resolved "https://registry.yarnpkg.com/is-binary-path/-/is-binary-path-2.1.0.tgz#ea1f7f3b80f064236e83470f86c09c254fb45b09" - integrity sha512-ZMERYes6pDydyuGidse7OsHxtbI7WVeUEozgR/g7rd0xUimYNlvZRE/K2MgZTjWy725IfelLeVcEM97mmtRGXw== - dependencies: - binary-extensions "^2.0.0" - -is-core-module@^2.8.1: - version "2.8.1" - resolved "https://registry.yarnpkg.com/is-core-module/-/is-core-module-2.8.1.tgz#f59fdfca701d5879d0a6b100a40aa1560ce27211" - integrity sha512-SdNCUs284hr40hFTFP6l0IfZ/RSrMXF3qgoRHd3/79unUTvrFO/JoXwkGm+5J/Oe3E/b5GsnG330uUNgRpu1PA== - dependencies: - has "^1.0.3" - -is-date-object@^1.0.1: - version "1.0.5" - resolved "https://registry.yarnpkg.com/is-date-object/-/is-date-object-1.0.5.tgz#0841d5536e724c25597bf6ea62e1bd38298df31f" - integrity sha512-9YQaSxsAiSwcvS33MBk3wTCVnWK+HhF8VZR2jRxehM16QcVOdHqPn4VPHmRK4lSr38n9JriurInLcP90xsYNfQ== - dependencies: - has-tostringtag "^1.0.0" - -is-docker@^2.0.0, is-docker@^2.1.1: - version "2.2.1" - resolved "https://registry.yarnpkg.com/is-docker/-/is-docker-2.2.1.tgz#33eeabe23cfe86f14bde4408a02c0cfb853acdaa" - integrity sha512-F+i2BKsFrH66iaUFc0woD8sLy8getkwTwtOBjvs56Cx4CgJDeKQeqfz8wAYiSb8JOprWhHH5p77PbmYCvvUuXQ== - -is-extglob@^2.1.1: - version "2.1.1" - resolved "https://registry.yarnpkg.com/is-extglob/-/is-extglob-2.1.1.tgz#a88c02535791f02ed37c76a1b9ea9773c833f8c2" - integrity sha1-qIwCU1eR8C7TfHahueqXc8gz+MI= - -is-glob@^4.0.1, is-glob@~4.0.1: - version "4.0.3" - resolved "https://registry.yarnpkg.com/is-glob/-/is-glob-4.0.3.tgz#64f61e42cbbb2eec2071a9dac0b28ba1e65d5084" - integrity 
sha512-xelSayHH36ZgE7ZWhli7pW34hNbNl8Ojv5KVmkJD4hBdD3th8Tfk9vYasLM+mXWOZhFkgZfxhLSnrwRr4elSSg== - dependencies: - is-extglob "^2.1.1" - -is-in-browser@^1.0.2, is-in-browser@^1.1.3: - version "1.1.3" - resolved "https://registry.yarnpkg.com/is-in-browser/-/is-in-browser-1.1.3.tgz#56ff4db683a078c6082eb95dad7dc62e1d04f835" - integrity sha1-Vv9NtoOgeMYILrldrX3GLh0E+DU= - -is-number@^7.0.0: - version "7.0.0" - resolved "https://registry.yarnpkg.com/is-number/-/is-number-7.0.0.tgz#7535345b896734d5f80c4d06c50955527a14f12b" - integrity sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng== - -is-path-cwd@^2.2.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/is-path-cwd/-/is-path-cwd-2.2.0.tgz#67d43b82664a7b5191fd9119127eb300048a9fdb" - integrity sha512-w942bTcih8fdJPJmQHFzkS76NEP8Kzzvmw92cXsazb8intwLqPibPPdXf4ANdKV3rYMuuQYGIWtvz9JilB3NFQ== - -is-path-inside@^3.0.2: - version "3.0.3" - resolved "https://registry.yarnpkg.com/is-path-inside/-/is-path-inside-3.0.3.tgz#d231362e53a07ff2b0e0ea7fed049161ffd16283" - integrity sha512-Fd4gABb+ycGAmKou8eMftCupSir5lRxqf4aD/vd0cD2qc4HL07OjCeuHMr8Ro4CoMaeCKDB0/ECBOVWjTwUvPQ== - -is-plain-obj@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/is-plain-obj/-/is-plain-obj-3.0.0.tgz#af6f2ea14ac5a646183a5bbdb5baabbc156ad9d7" - integrity sha512-gwsOE28k+23GP1B6vFl1oVh/WOzmawBrKwo5Ev6wMKzPkaXaCDIQKzLnvsA42DRlbVTWorkgTKIviAKCWkfUwA== - -is-plain-object@^2.0.4: - version "2.0.4" - resolved "https://registry.yarnpkg.com/is-plain-object/-/is-plain-object-2.0.4.tgz#2c163b3fafb1b606d9d17928f05c2a1c38e07677" - integrity sha512-h5PpgXkWitc38BBMYawTYMWJHFZJVnBquFE57xFpjB8pJFiF6gZ+bU+WyI/yqXiFR5mdLsgYNaPe8uao6Uv9Og== - dependencies: - isobject "^3.0.1" - -is-regex@^1.0.4: - version "1.1.4" - resolved "https://registry.yarnpkg.com/is-regex/-/is-regex-1.1.4.tgz#eef5663cd59fa4c0ae339505323df6854bb15958" - integrity 
sha512-kvRdxDsxZjhzUX07ZnLydzS1TU/TJlTUHHY4YLL87e37oUA49DfkLqgy+VjFocowy29cKvcSiu+kIv728jTTVg== - dependencies: - call-bind "^1.0.2" - has-tostringtag "^1.0.0" - -is-stream@^2.0.0: - version "2.0.1" - resolved "https://registry.yarnpkg.com/is-stream/-/is-stream-2.0.1.tgz#fac1e3d53b97ad5a9d0ae9cef2389f5810a5c077" - integrity sha512-hFoiJiTl63nn+kstHGBtewWSKnQLpyb155KHheA1l39uvtO9nWIop1p3udqPcUd/xbF1VLMO4n7OI6p7RbngDg== - -is-wsl@^2.2.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/is-wsl/-/is-wsl-2.2.0.tgz#74a4c76e77ca9fd3f932f290c17ea326cd157271" - integrity sha512-fKzAra0rGJUUBwGBgNkHZuToZcn+TtXHpeCgmkMJMMYx1sQDYaCSyjJBSCa2nH1DGm7s3n1oBnohoVTBaN7Lww== - dependencies: - is-docker "^2.0.0" - -isarray@~1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/isarray/-/isarray-1.0.0.tgz#bb935d48582cba168c06834957a54a3e07124f11" - integrity sha1-u5NdSFgsuhaMBoNJV6VKPgcSTxE= - -isexe@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/isexe/-/isexe-2.0.0.tgz#e8fbf374dc556ff8947a10dcb0572d633f2cfa10" - integrity sha1-6PvzdNxVb/iUehDcsFctYz8s+hA= - -isobject@^3.0.1: - version "3.0.1" - resolved "https://registry.yarnpkg.com/isobject/-/isobject-3.0.1.tgz#4e431e92b11a9731636aa1f9c8d1ccbcfdab78df" - integrity sha1-TkMekrEalzFjaqH5yNHMvP2reN8= - -jest-worker@^27.4.5: - version "27.5.1" - resolved "https://registry.yarnpkg.com/jest-worker/-/jest-worker-27.5.1.tgz#8d146f0900e8973b106b6f73cc1e9a8cb86f8db0" - integrity sha512-7vuh85V5cdDofPyxn58nrPjBktZo0u9x1g8WtjQol+jZDaE+fhN+cIvTj11GndBnMnyfrUOG1sZQxCdjKh+DKg== - dependencies: - "@types/node" "*" - merge-stream "^2.0.0" - supports-color "^8.0.0" - -"js-tokens@^3.0.0 || ^4.0.0": - version "4.0.0" - resolved "https://registry.yarnpkg.com/js-tokens/-/js-tokens-4.0.0.tgz#19203fb59991df98e3a287050d4647cdeaf32499" - integrity sha512-RdJUflcE3cUzKiMqQgsCu06FPu9UdIJO0beYbPhHN4k6apgJtifcoCtT9bcxOpYBtpD2kCM6Sbzg4CausW/PKQ== - -json-parse-better-errors@^1.0.2: - version "1.0.2" - resolved 
"https://registry.yarnpkg.com/json-parse-better-errors/-/json-parse-better-errors-1.0.2.tgz#bb867cfb3450e69107c131d1c514bab3dc8bcaa9" - integrity sha512-mrqyZKfX5EhL7hvqcV6WG1yYjnjeuYDzDhhcAAUrq8Po85NBQBJP+ZDUT75qZQ98IkUoBqdkExkukOU7Ts2wrw== - -json-schema-traverse@^0.4.1: - version "0.4.1" - resolved "https://registry.yarnpkg.com/json-schema-traverse/-/json-schema-traverse-0.4.1.tgz#69f6a87d9513ab8bb8fe63bdb0979c448e684660" - integrity sha512-xbbCH5dCYU5T8LcEhhuh7HJ88HXuW3qsI3Y0zOZFKfZEHcpWiHU/Jxzk629Brsab/mMiHQti9wMP+845RPe3Vg== - -json-schema-traverse@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/json-schema-traverse/-/json-schema-traverse-1.0.0.tgz#ae7bcb3656ab77a73ba5c49bf654f38e6b6860e2" - integrity sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug== - -json2mq@^0.2.0: - version "0.2.0" - resolved "https://registry.yarnpkg.com/json2mq/-/json2mq-0.2.0.tgz#b637bd3ba9eabe122c83e9720483aeb10d2c904a" - integrity sha1-tje9O6nqvhIsg+lyBIOusQ0skEo= - dependencies: - string-convert "^0.2.0" - -json5@^2.1.2: - version "2.2.0" - resolved "https://registry.yarnpkg.com/json5/-/json5-2.2.0.tgz#2dfefe720c6ba525d9ebd909950f0515316c89a3" - integrity sha512-f+8cldu7X/y7RAJurMEJmdoKXGB/X550w2Nr3tTbezL6RwEE/iMcm+tZnXeoZtKuOq6ft8+CqzEkrIgx1fPoQA== - dependencies: - minimist "^1.2.5" - -jss-plugin-camel-case@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-camel-case/-/jss-plugin-camel-case-10.9.0.tgz#4921b568b38d893f39736ee8c4c5f1c64670aaf7" - integrity sha512-UH6uPpnDk413/r/2Olmw4+y54yEF2lRIV8XIZyuYpgPYTITLlPOsq6XB9qeqv+75SQSg3KLocq5jUBXW8qWWww== - dependencies: - "@babel/runtime" "^7.3.1" - hyphenate-style-name "^1.0.3" - jss "10.9.0" - -jss-plugin-default-unit@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-default-unit/-/jss-plugin-default-unit-10.9.0.tgz#bb23a48f075bc0ce852b4b4d3f7582bc002df991" - integrity 
sha512-7Ju4Q9wJ/MZPsxfu4T84mzdn7pLHWeqoGd/D8O3eDNNJ93Xc8PxnLmV8s8ZPNRYkLdxZqKtm1nPQ0BM4JRlq2w== - dependencies: - "@babel/runtime" "^7.3.1" - jss "10.9.0" - -jss-plugin-global@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-global/-/jss-plugin-global-10.9.0.tgz#fc07a0086ac97aca174e37edb480b69277f3931f" - integrity sha512-4G8PHNJ0x6nwAFsEzcuVDiBlyMsj2y3VjmFAx/uHk/R/gzJV+yRHICjT4MKGGu1cJq2hfowFWCyrr/Gg37FbgQ== - dependencies: - "@babel/runtime" "^7.3.1" - jss "10.9.0" - -jss-plugin-nested@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-nested/-/jss-plugin-nested-10.9.0.tgz#cc1c7d63ad542c3ccc6e2c66c8328c6b6b00f4b3" - integrity sha512-2UJnDrfCZpMYcpPYR16oZB7VAC6b/1QLsRiAutOt7wJaaqwCBvNsosLEu/fUyKNQNGdvg2PPJFDO5AX7dwxtoA== - dependencies: - "@babel/runtime" "^7.3.1" - jss "10.9.0" - tiny-warning "^1.0.2" - -jss-plugin-props-sort@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-props-sort/-/jss-plugin-props-sort-10.9.0.tgz#30e9567ef9479043feb6e5e59db09b4de687c47d" - integrity sha512-7A76HI8bzwqrsMOJTWKx/uD5v+U8piLnp5bvru7g/3ZEQOu1+PjHvv7bFdNO3DwNPC9oM0a//KwIJsIcDCjDzw== - dependencies: - "@babel/runtime" "^7.3.1" - jss "10.9.0" - -jss-plugin-rule-value-function@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-rule-value-function/-/jss-plugin-rule-value-function-10.9.0.tgz#379fd2732c0746fe45168011fe25544c1a295d67" - integrity sha512-IHJv6YrEf8pRzkY207cPmdbBstBaE+z8pazhPShfz0tZSDtRdQua5jjg6NMz3IbTasVx9FdnmptxPqSWL5tyJg== - dependencies: - "@babel/runtime" "^7.3.1" - jss "10.9.0" - tiny-warning "^1.0.2" - -jss-plugin-vendor-prefixer@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss-plugin-vendor-prefixer/-/jss-plugin-vendor-prefixer-10.9.0.tgz#aa9df98abfb3f75f7ed59a3ec50a5452461a206a" - integrity sha512-MbvsaXP7iiVdYVSEoi+blrW+AYnTDvHTW6I6zqi7JcwXdc6I9Kbm234nEblayhF38EftoenbM+5218pidmC5gA== - dependencies: - "@babel/runtime" 
"^7.3.1" - css-vendor "^2.0.8" - jss "10.9.0" - -jss@10.9.0, jss@^10.5.1: - version "10.9.0" - resolved "https://registry.yarnpkg.com/jss/-/jss-10.9.0.tgz#7583ee2cdc904a83c872ba695d1baab4b59c141b" - integrity sha512-YpzpreB6kUunQBbrlArlsMpXYyndt9JATbt95tajx0t4MTJJcCJdd4hdNpHmOIDiUJrF/oX5wtVFrS3uofWfGw== - dependencies: - "@babel/runtime" "^7.3.1" - csstype "^3.0.2" - is-in-browser "^1.1.3" - tiny-warning "^1.0.2" - -kind-of@^6.0.2: - version "6.0.3" - resolved "https://registry.yarnpkg.com/kind-of/-/kind-of-6.0.3.tgz#07c05034a6c349fa06e24fa35aa76db4580ce4dd" - integrity sha512-dcS1ul+9tmeD95T+x28/ehLgd9mENa3LsvDTtzm3vyBEO7RPptvAD+t44WVXaUjTBRcrpFeFlC8WCruUR456hw== - -loader-runner@^4.2.0: - version "4.2.0" - resolved "https://registry.yarnpkg.com/loader-runner/-/loader-runner-4.2.0.tgz#d7022380d66d14c5fb1d496b89864ebcfd478384" - integrity sha512-92+huvxMvYlMzMt0iIOukcwYBFpkYJdpl2xsZ7LrlayO7E8SOv+JJUEK17B/dJIHAOLMfh2dZZ/Y18WgmGtYNw== - -loader-utils@^2.0.0: - version "2.0.2" - resolved "https://registry.yarnpkg.com/loader-utils/-/loader-utils-2.0.2.tgz#d6e3b4fb81870721ae4e0868ab11dd638368c129" - integrity sha512-TM57VeHptv569d/GKh6TAYdzKblwDNiumOdkFnejjD0XwTH87K90w3O7AiJRqdQoXygvi1VQTJTLGhJl7WqA7A== - dependencies: - big.js "^5.2.2" - emojis-list "^3.0.0" - json5 "^2.1.2" - -locate-path@^5.0.0: - version "5.0.0" - resolved "https://registry.yarnpkg.com/locate-path/-/locate-path-5.0.0.tgz#1afba396afd676a6d42504d0a67a3a7eb9f62aa0" - integrity sha512-t7hw9pI+WvuwNJXwk5zVHpyhIqzg2qTlklJOf0mVxGSbe3Fp2VieZcduNYjaLDoy6p9uGpQEGWG87WpMKlNq8g== - dependencies: - p-locate "^4.1.0" - -lodash@^4.17.14, lodash@^4.17.20, lodash@^4.17.21: - version "4.17.21" - resolved "https://registry.yarnpkg.com/lodash/-/lodash-4.17.21.tgz#679591c564c3bffaae8454cf0b3df370c3d6911c" - integrity sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg== - -loose-envify@^1.1.0, loose-envify@^1.4.0: - version "1.4.0" - resolved 
"https://registry.yarnpkg.com/loose-envify/-/loose-envify-1.4.0.tgz#71ee51fa7be4caec1a63839f7e682d8132d30caf" - integrity sha512-lyuxPGr/Wfhrlem2CL/UcnUc1zcqKAImBDzukY7Y5F/yQiNdko6+fRLevlw1HgMySw7f611UIY408EtxRSoK3Q== - dependencies: - js-tokens "^3.0.0 || ^4.0.0" - -lower-case@^2.0.2: - version "2.0.2" - resolved "https://registry.yarnpkg.com/lower-case/-/lower-case-2.0.2.tgz#6fa237c63dbdc4a82ca0fd882e4722dc5e634e28" - integrity sha512-7fm3l3NAF9WfN6W3JOmf5drwpVqX78JtoGJ3A6W0a6ZnldM41w2fV5D490psKFTpMds8TJse/eHLFFsNHHjHgg== - dependencies: - tslib "^2.0.3" - -lru-cache@^6.0.0: - version "6.0.0" - resolved "https://registry.yarnpkg.com/lru-cache/-/lru-cache-6.0.0.tgz#6d6fe6570ebd96aaf90fcad1dafa3b2566db3a94" - integrity sha512-Jo6dJ04CmSjuznwJSS3pUeWmd/H0ffTlkXXgwZi+eq1UCmqQwCh+eLsYOYCwY991i2Fah4h1BEMCx4qThGbsiA== - dependencies: - yallist "^4.0.0" - -media-typer@0.3.0: - version "0.3.0" - resolved "https://registry.yarnpkg.com/media-typer/-/media-typer-0.3.0.tgz#8710d7af0aa626f8fffa1ce00168545263255748" - integrity sha1-hxDXrwqmJvj/+hzgAWhUUmMlV0g= - -memfs@^3.4.1: - version "3.4.1" - resolved "https://registry.yarnpkg.com/memfs/-/memfs-3.4.1.tgz#b78092f466a0dce054d63d39275b24c71d3f1305" - integrity sha512-1c9VPVvW5P7I85c35zAdEr1TD5+F11IToIHIlrVIcflfnzPkJa0ZoYEoEdYDP8KgPFoSZ/opDrUsAoZWym3mtw== - dependencies: - fs-monkey "1.0.3" - -"memoize-one@>=3.1.1 <6": - version "5.2.1" - resolved "https://registry.yarnpkg.com/memoize-one/-/memoize-one-5.2.1.tgz#8337aa3c4335581839ec01c3d594090cebe8f00e" - integrity sha512-zYiwtZUcYyXKo/np96AGZAckk+FWWsUdJ3cHGGmld7+AhvcWmQyGCYUh1hc4Q/pkOhb65dQR/pqCyK0cOaHz4Q== - -memoize-one@^3.1.1: - version "3.1.1" - resolved "https://registry.yarnpkg.com/memoize-one/-/memoize-one-3.1.1.tgz#ef609811e3bc28970eac2884eece64d167830d17" - integrity sha512-YqVh744GsMlZu6xkhGslPSqSurOv6P+kLN2J3ysBZfagLcL5FdRK/0UpgLoL8hwjjEvvAVkjJZyFP+1T6p1vgA== - -memoize-one@^6.0.0: - version "6.0.0" - resolved 
"https://registry.yarnpkg.com/memoize-one/-/memoize-one-6.0.0.tgz#b2591b871ed82948aee4727dc6abceeeac8c1045" - integrity sha512-rkpe71W0N0c0Xz6QD0eJETuWAJGnJ9afsl1srmwPrI+yBCkge5EycXXbYRyvL29zZVUWQCY7InPRCv3GDXuZNw== - -memory-fs@^0.5.0: - version "0.5.0" - resolved "https://registry.yarnpkg.com/memory-fs/-/memory-fs-0.5.0.tgz#324c01288b88652966d161db77838720845a8e3c" - integrity sha512-jA0rdU5KoQMC0e6ppoNRtpp6vjFq6+NY7r8hywnC7V+1Xj/MtHwGIbB1QaK/dunyjWteJzmkpd7ooeWg10T7GA== - dependencies: - errno "^0.1.3" - readable-stream "^2.0.1" - -merge-descriptors@1.0.1: - version "1.0.1" - resolved "https://registry.yarnpkg.com/merge-descriptors/-/merge-descriptors-1.0.1.tgz#b00aaa556dd8b44568150ec9d1b953f3f90cbb61" - integrity sha1-sAqqVW3YtEVoFQ7J0blT8/kMu2E= - -merge-stream@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/merge-stream/-/merge-stream-2.0.0.tgz#52823629a14dd00c9770fb6ad47dc6310f2c1f60" - integrity sha512-abv/qOcuPfk3URPfDzmZU1LKmuw8kT+0nIHvKrKgFrwifol/doWcdA4ZqsWQ8ENrFKkd67Mfpo/LovbIUsbt3w== - -merge2@^1.3.0, merge2@^1.4.1: - version "1.4.1" - resolved "https://registry.yarnpkg.com/merge2/-/merge2-1.4.1.tgz#4368892f885e907455a6fd7dc55c0c9d404990ae" - integrity sha512-8q7VEgMJW4J8tcfVPy8g09NcQwZdbwFEqhe/WZkoIzjn/3TGDwtOCYtXGxA3O8tPzpczCCDgv+P2P5y00ZJOOg== - -methods@~1.1.2: - version "1.1.2" - resolved "https://registry.yarnpkg.com/methods/-/methods-1.1.2.tgz#5529a4d67654134edcc5266656835b0f851afcee" - integrity sha1-VSmk1nZUE07cxSZmVoNbD4Ua/O4= - -micromatch@^4.0.0, micromatch@^4.0.2, micromatch@^4.0.4: - version "4.0.4" - resolved "https://registry.yarnpkg.com/micromatch/-/micromatch-4.0.4.tgz#896d519dfe9db25fce94ceb7a500919bf881ebf9" - integrity sha512-pRmzw/XUcwXGpD9aI9q/0XOwLNygjETJ8y0ao0wdqprrzDa4YnxLcz7fQRZr8voh8V10kGhABbNcHVk5wHgWwg== - dependencies: - braces "^3.0.1" - picomatch "^2.2.3" - -mime-db@1.51.0: - version "1.51.0" - resolved 
"https://registry.yarnpkg.com/mime-db/-/mime-db-1.51.0.tgz#d9ff62451859b18342d960850dc3cfb77e63fb0c" - integrity sha512-5y8A56jg7XVQx2mbv1lu49NR4dokRnhZYTtL+KGfaa27uq4pSTXkwQkFJl4pkRMyNFz/EtYDSkiiEHx3F7UN6g== - -"mime-db@>= 1.43.0 < 2": - version "1.52.0" - resolved "https://registry.yarnpkg.com/mime-db/-/mime-db-1.52.0.tgz#bbabcdc02859f4987301c856e3387ce5ec43bf70" - integrity sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg== - -mime-types@^2.1.27, mime-types@^2.1.31, mime-types@~2.1.17, mime-types@~2.1.24, mime-types@~2.1.34: - version "2.1.34" - resolved "https://registry.yarnpkg.com/mime-types/-/mime-types-2.1.34.tgz#5a712f9ec1503511a945803640fafe09d3793c24" - integrity sha512-6cP692WwGIs9XXdOO4++N+7qjqv0rqxxVvJ3VHPh/Sc9mVZcQP+ZGhkKiTvWMQRr2tbHkJP/Yn7Y0npb3ZBs4A== - dependencies: - mime-db "1.51.0" - -mime@1.6.0: - version "1.6.0" - resolved "https://registry.yarnpkg.com/mime/-/mime-1.6.0.tgz#32cd9e5c64553bd58d19a568af452acff04981b1" - integrity sha512-x0Vn8spI+wuJ1O6S7gnbaQg8Pxh4NNHb7KSINmEWKiPE4RKOplvijn+NkmYmmRgP68mc70j2EbeTFRsrswaQeg== - -mimic-fn@^2.1.0: - version "2.1.0" - resolved "https://registry.yarnpkg.com/mimic-fn/-/mimic-fn-2.1.0.tgz#7ed2c2ccccaf84d3ffcb7a69b57711fc2083401b" - integrity sha512-OqbOk5oEQeAZ8WXWydlu9HJjz9WVdEIvamMCcXmuqUYjTknH/sqsWvhQ3vgwKFRR1HpjvNBKQ37nbJgYzGqGcg== - -minimalistic-assert@^1.0.0: - version "1.0.1" - resolved "https://registry.yarnpkg.com/minimalistic-assert/-/minimalistic-assert-1.0.1.tgz#2e194de044626d4a10e7f7fbc00ce73e83e4d5c7" - integrity sha512-UtJcAD4yEaGtjPezWuO9wC4nwUnVH/8/Im3yEHQP4b67cXlD/Qr9hdITCU1xDbSEXg2XKNaP8jsReV7vQd00/A== - -minimatch@^3.0.4: - version "3.1.2" - resolved "https://registry.yarnpkg.com/minimatch/-/minimatch-3.1.2.tgz#19cd194bfd3e428f049a70817c038d89ab4be35b" - integrity sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw== - dependencies: - brace-expansion "^1.1.7" - -minimist@^1.2.5: - version "1.2.5" 
- resolved "https://registry.yarnpkg.com/minimist/-/minimist-1.2.5.tgz#67d66014b66a6a8aaa0c083c5fd58df4e4e97602" - integrity sha512-FM9nNUYrRBAELZQT3xeZQ7fmMOBg6nWNmJKTcgsJeaLstP/UODVpGsr5OhXhhXg6f+qtJ8uiZ+PUxkDWcgIXLw== - -mkdirp@^0.5.5: - version "0.5.5" - resolved "https://registry.yarnpkg.com/mkdirp/-/mkdirp-0.5.5.tgz#d91cefd62d1436ca0f41620e251288d420099def" - integrity sha512-NKmAlESf6jMGym1++R0Ra7wvhV+wFW63FaSOFPwRahvea0gMUcGUhVeAg/0BC0wiv9ih5NYPB1Wn1UEI1/L+xQ== - dependencies: - minimist "^1.2.5" - -moment@^2.24.0, moment@^2.25.3: - version "2.29.1" - resolved "https://registry.yarnpkg.com/moment/-/moment-2.29.1.tgz#b2be769fa31940be9eeea6469c075e35006fa3d3" - integrity sha512-kHmoybcPV8Sqy59DwNDY3Jefr64lK/by/da0ViFcuA4DH0vQg5Q6Ze5VimxkfQNSC+Mls/Kx53s7TjP1RhFEDQ== - -ms@2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/ms/-/ms-2.0.0.tgz#5608aeadfc00be6c2901df5f9861788de0d597c8" - integrity sha1-VgiurfwAvmwpAd9fmGF4jeDVl8g= - -ms@2.1.2: - version "2.1.2" - resolved "https://registry.yarnpkg.com/ms/-/ms-2.1.2.tgz#d09d1f357b443f493382a8eb3ccd183872ae6009" - integrity sha512-sGkPx+VjMtmA6MX27oA4FBFELFCZZ4S4XqeGOXCv68tT+jb3vk/RyaKWP0PTKyWtmLSM0b+adUTEvbs1PEaH2w== - -ms@2.1.3, ms@^2.1.1: - version "2.1.3" - resolved "https://registry.yarnpkg.com/ms/-/ms-2.1.3.tgz#574c8138ce1d2b5861f0b44579dbadd60c6615b2" - integrity sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA== - -multicast-dns-service-types@^1.1.0: - version "1.1.0" - resolved "https://registry.yarnpkg.com/multicast-dns-service-types/-/multicast-dns-service-types-1.1.0.tgz#899f11d9686e5e05cb91b35d5f0e63b773cfc901" - integrity sha1-iZ8R2WhuXgXLkbNdXw5jt3PPyQE= - -multicast-dns@^6.0.1: - version "6.2.3" - resolved "https://registry.yarnpkg.com/multicast-dns/-/multicast-dns-6.2.3.tgz#a0ec7bd9055c4282f790c3c82f4e28db3b31b229" - integrity sha512-ji6J5enbMyGRHIAkAOu3WdV8nggqviKCEKtXcOqfphZZtQrmHKycfynJ2V7eVPUA4NhJ6V7Wf4TmGbTwKE9B6g== - dependencies: - 
dns-packet "^1.3.1" - thunky "^1.0.2" - -nanoid@^3.1.31, nanoid@^3.3.1: - version "3.3.1" - resolved "https://registry.yarnpkg.com/nanoid/-/nanoid-3.3.1.tgz#6347a18cac88af88f58af0b3594b723d5e99bb35" - integrity sha512-n6Vs/3KGyxPQd6uO0eH4Bv0ojGSUvuLlIHtC3Y0kEO23YRge8H9x1GCzLn28YX0H66pMkxuaeESFq4tKISKwdw== - -negotiator@0.6.3: - version "0.6.3" - resolved "https://registry.yarnpkg.com/negotiator/-/negotiator-0.6.3.tgz#58e323a72fedc0d6f9cd4d31fe49f51479590ccd" - integrity sha512-+EUsqGPLsM+j/zdChZjsnX51g4XrHFOIXwfnCVPGlQk/k5giakcKsuxCObBRu6DSm9opw/O6slWbJdghQM4bBg== - -neo-async@^2.6.2: - version "2.6.2" - resolved "https://registry.yarnpkg.com/neo-async/-/neo-async-2.6.2.tgz#b4aafb93e3aeb2d8174ca53cf163ab7d7308305f" - integrity sha512-Yd3UES5mWCSqR+qNT93S3UoYUkqAZ9lLg8a7g9rimsWmYGK8cVToA4/sF3RrshdyV3sAGMXVUmpMYOw+dLpOuw== - -no-case@^3.0.4: - version "3.0.4" - resolved "https://registry.yarnpkg.com/no-case/-/no-case-3.0.4.tgz#d361fd5c9800f558551a8369fc0dcd4662b6124d" - integrity sha512-fgAN3jGAh+RoxUGZHTSOLJIqUc2wmoBwGR4tbpNAKmmovFoWq0OdRkb0VkldReO2a2iBT/OEulG9XSUc10r3zg== - dependencies: - lower-case "^2.0.2" - tslib "^2.0.3" - -node-fetch@^1.0.1, node-fetch@^2.6.1: - version "2.6.7" - resolved "https://registry.yarnpkg.com/node-fetch/-/node-fetch-2.6.7.tgz#24de9fba827e3b4ae44dc8b20256a379160052ad" - integrity sha512-ZjMPFEfVx5j+y2yF35Kzx5sF7kDzxuDj6ziH4FFbOp87zKDZNx8yExJIb05OGF4Nlt9IHFIMBkRl41VdvcNdbQ== - dependencies: - whatwg-url "^5.0.0" - -node-forge@^1.2.0: - version "1.2.1" - resolved "https://registry.yarnpkg.com/node-forge/-/node-forge-1.2.1.tgz#82794919071ef2eb5c509293325cec8afd0fd53c" - integrity sha512-Fcvtbb+zBcZXbTTVwqGA5W+MKBj56UjVRevvchv5XrcyXbmNdesfZL37nlcWOfpgHhgmxApw3tQbTr4CqNmX4w== - -node-releases@^2.0.2: - version "2.0.2" - resolved "https://registry.yarnpkg.com/node-releases/-/node-releases-2.0.2.tgz#7139fe71e2f4f11b47d4d2986aaf8c48699e0c01" - integrity 
sha512-XxYDdcQ6eKqp/YjI+tb2C5WM2LgjnZrfYg4vgQt49EK268b6gYCHsBLrK2qvJo4FmCtqmKezb0WZFK4fkrZNsg== - -normalize-path@^3.0.0, normalize-path@~3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/normalize-path/-/normalize-path-3.0.0.tgz#0dcd69ff23a1c9b11fd0978316644a0388216a65" - integrity sha512-6eZs5Ls3WtCisHWp9S2GUy8dqkpGi4BVSz3GaqiE6ezub0512ESztXUwUB6C6IKbQkY2Pnb/mD4WYojCRwcwLA== - -npm-run-path@^4.0.1: - version "4.0.1" - resolved "https://registry.yarnpkg.com/npm-run-path/-/npm-run-path-4.0.1.tgz#b7ecd1e5ed53da8e37a55e1c2269e0b97ed748ea" - integrity sha512-S48WzZW777zhNIrn7gxOlISNAqi9ZC/uQFnRdbeIHhZhCA6UqpkOT8T1G7BvfdgP4Er8gF4sUbaS0i7QvIfCWw== - dependencies: - path-key "^3.0.0" - -nth-check@^2.0.1: - version "2.0.1" - resolved "https://registry.yarnpkg.com/nth-check/-/nth-check-2.0.1.tgz#2efe162f5c3da06a28959fbd3db75dbeea9f0fc2" - integrity sha512-it1vE95zF6dTT9lBsYbxvqh0Soy4SPowchj0UBGj/V6cTPnXXtQOPUbhZ6CmGzAD/rW22LQK6E96pcdJXk4A4w== - dependencies: - boolbase "^1.0.0" - -object-assign@^4.1.1: - version "4.1.1" - resolved "https://registry.yarnpkg.com/object-assign/-/object-assign-4.1.1.tgz#2109adc7965887cfc05cbbd442cac8bfbb360863" - integrity sha1-IQmtx5ZYh8/AXLvUQsrIv7s2CGM= - -object-is@^1.0.1: - version "1.1.5" - resolved "https://registry.yarnpkg.com/object-is/-/object-is-1.1.5.tgz#b9deeaa5fc7f1846a0faecdceec138e5778f53ac" - integrity sha512-3cyDsyHgtmi7I7DfSSI2LDp6SK2lwvtbg0p0R1e0RvTqF5ceGx+K2dfSjm1bKDMVCFEDAQvy+o8c6a7VujOddw== - dependencies: - call-bind "^1.0.2" - define-properties "^1.1.3" - -object-keys@^1.0.12, object-keys@^1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/object-keys/-/object-keys-1.1.1.tgz#1c47f272df277f3b1daf061677d9c82e2322c60e" - integrity sha512-NuAESUOUMrlIXOfHKzD6bpPu3tYt3xvjNdRIQ+FeT0lNb4K8WR70CaDxhuNguS2XG+GjkyMwOzsN5ZktImfhLA== - -obuf@^1.0.0, obuf@^1.1.2: - version "1.1.2" - resolved "https://registry.yarnpkg.com/obuf/-/obuf-1.1.2.tgz#09bea3343d41859ebd446292d11c9d4db619084e" - integrity 
sha512-PX1wu0AmAdPqOL1mWhqmlOd8kOIZQwGZw6rh7uby9fTc5lhaOWFLX3I6R1hrF9k3zUY40e6igsLGkDXK92LJNg== - -on-finished@~2.3.0: - version "2.3.0" - resolved "https://registry.yarnpkg.com/on-finished/-/on-finished-2.3.0.tgz#20f1336481b083cd75337992a16971aa2d906947" - integrity sha1-IPEzZIGwg811M3mSoWlxqi2QaUc= - dependencies: - ee-first "1.1.1" - -on-headers@~1.0.2: - version "1.0.2" - resolved "https://registry.yarnpkg.com/on-headers/-/on-headers-1.0.2.tgz#772b0ae6aaa525c399e489adfad90c403eb3c28f" - integrity sha512-pZAE+FJLoyITytdqK0U5s+FIpjN0JP3OzFi/u8Rx+EV5/W+JTWGXG8xFzevE7AjBfDqHv/8vL8qQsIhHnqRkrA== - -once@^1.3.0: - version "1.4.0" - resolved "https://registry.yarnpkg.com/once/-/once-1.4.0.tgz#583b1aa775961d4b113ac17d9c50baef9dd76bd1" - integrity sha1-WDsap3WWHUsROsF9nFC6753Xa9E= - dependencies: - wrappy "1" - -onetime@^5.1.2: - version "5.1.2" - resolved "https://registry.yarnpkg.com/onetime/-/onetime-5.1.2.tgz#d0e96ebb56b07476df1dd9c4806e5237985ca45e" - integrity sha512-kbpaSSGJTWdAY5KPVeMOKXSrPtr8C8C7wodJbcsd51jRnmD+GZu8Y0VoU6Dm5Z4vWr0Ig/1NKuWRKf7j5aaYSg== - dependencies: - mimic-fn "^2.1.0" - -open@^8.0.9: - version "8.4.0" - resolved "https://registry.yarnpkg.com/open/-/open-8.4.0.tgz#345321ae18f8138f82565a910fdc6b39e8c244f8" - integrity sha512-XgFPPM+B28FtCCgSb9I+s9szOC1vZRSwgWsRUA5ylIxRTgKozqjOCrVOqGsYABPYK5qnfqClxZTFBa8PKt2v6Q== - dependencies: - define-lazy-prop "^2.0.0" - is-docker "^2.1.1" - is-wsl "^2.2.0" - -p-limit@^2.2.0: - version "2.3.0" - resolved "https://registry.yarnpkg.com/p-limit/-/p-limit-2.3.0.tgz#3dd33c647a214fdfffd835933eb086da0dc21db1" - integrity sha512-//88mFWSJx8lxCzwdAABTJL2MyWB12+eIY7MDL2SqLmAkeKU9qxRvWuSyTjm3FUmpBEMuFfckAIqEaVGUDxb6w== - dependencies: - p-try "^2.0.0" - -p-locate@^4.1.0: - version "4.1.0" - resolved "https://registry.yarnpkg.com/p-locate/-/p-locate-4.1.0.tgz#a3428bb7088b3a60292f66919278b7c297ad4f07" - integrity sha512-R79ZZ/0wAxKGu3oYMlz8jy/kbhsNrS7SKZ7PxEHBgJ5+F2mtFW2fK2cOtBh1cHYkQsbzFV7I+EoRKe6Yt0oK7A== - 
dependencies: - p-limit "^2.2.0" - -p-map@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/p-map/-/p-map-4.0.0.tgz#bb2f95a5eda2ec168ec9274e06a747c3e2904d2b" - integrity sha512-/bjOqmgETBYB5BoEeGVea8dmvHb2m9GLy1E9W43yeyfP6QQCZGFNa+XRceJEuDB6zqr+gKpIAmlLebMpykw/MQ== - dependencies: - aggregate-error "^3.0.0" - -p-retry@^4.5.0: - version "4.6.1" - resolved "https://registry.yarnpkg.com/p-retry/-/p-retry-4.6.1.tgz#8fcddd5cdf7a67a0911a9cf2ef0e5df7f602316c" - integrity sha512-e2xXGNhZOZ0lfgR9kL34iGlU8N/KO0xZnQxVEwdeOvpqNDQfdnxIYizvWtK8RglUa3bGqI8g0R/BdfzLMxRkiA== - dependencies: - "@types/retry" "^0.12.0" - retry "^0.13.1" - -p-try@^2.0.0: - version "2.2.0" - resolved "https://registry.yarnpkg.com/p-try/-/p-try-2.2.0.tgz#cb2868540e313d61de58fafbe35ce9004d5540e6" - integrity sha512-R4nPAVTAU0B9D35/Gk3uJf/7XYbQcyohSKdvAxIRSNghFl4e71hVoGnBNQz9cWaXxO2I10KTC+3jMdvvoKw6dQ== - -param-case@^3.0.4: - version "3.0.4" - resolved "https://registry.yarnpkg.com/param-case/-/param-case-3.0.4.tgz#7d17fe4aa12bde34d4a77d91acfb6219caad01c5" - integrity sha512-RXlj7zCYokReqWpOPH9oYivUzLYZ5vAPIfEmCTNViosC78F8F0H9y7T7gG2M39ymgutxF5gcFEsyZQSph9Bp3A== - dependencies: - dot-case "^3.0.4" - tslib "^2.0.3" - -parseurl@~1.3.2, parseurl@~1.3.3: - version "1.3.3" - resolved "https://registry.yarnpkg.com/parseurl/-/parseurl-1.3.3.tgz#9da19e7bee8d12dff0513ed5b76957793bc2e8d4" - integrity sha512-CiyeOxFT/JZyN5m0z9PfXw4SCBJ6Sygz1Dpl0wqjlhDEGGBP1GnsUVEL0p63hoG1fcj3fHynXi9NYO4nWOL+qQ== - -pascal-case@^3.1.2: - version "3.1.2" - resolved "https://registry.yarnpkg.com/pascal-case/-/pascal-case-3.1.2.tgz#b48e0ef2b98e205e7c1dae747d0b1508237660eb" - integrity sha512-uWlGT3YSnK9x3BQJaOdcZwrnV6hPpd8jFH1/ucpiLRPh/2zCVJKS19E4GvYHvaCcACn3foXZ0cLB9Wrx1KGe5g== - dependencies: - no-case "^3.0.4" - tslib "^2.0.3" - -path-exists@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/path-exists/-/path-exists-4.0.0.tgz#513bdbe2d3b95d7762e8c1137efa195c6c61b5b3" - integrity 
sha512-ak9Qy5Q7jYb2Wwcey5Fpvg2KoAc/ZIhLSLOSBmRmygPsGwkVVt0fZa0qrtMz+m6tJTAHfZQ8FnmB4MG4LWy7/w== - -path-is-absolute@^1.0.0: - version "1.0.1" - resolved "https://registry.yarnpkg.com/path-is-absolute/-/path-is-absolute-1.0.1.tgz#174b9268735534ffbc7ace6bf53a5a9e1b5c5f5f" - integrity sha1-F0uSaHNVNP+8es5r9TpanhtcX18= - -path-key@^3.0.0, path-key@^3.1.0: - version "3.1.1" - resolved "https://registry.yarnpkg.com/path-key/-/path-key-3.1.1.tgz#581f6ade658cbba65a0d3380de7753295054f375" - integrity sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q== - -path-parse@^1.0.7: - version "1.0.7" - resolved "https://registry.yarnpkg.com/path-parse/-/path-parse-1.0.7.tgz#fbc114b60ca42b30d9daf5858e4bd68bbedb6735" - integrity sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw== - -path-to-regexp@0.1.7: - version "0.1.7" - resolved "https://registry.yarnpkg.com/path-to-regexp/-/path-to-regexp-0.1.7.tgz#df604178005f522f15eb4490e7247a1bfaa67f8c" - integrity sha1-32BBeABfUi8V60SQ5yR6G/qmf4w= - -path-type@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/path-type/-/path-type-4.0.0.tgz#84ed01c0a7ba380afe09d90a8c180dcd9d03043b" - integrity sha512-gDKb8aZMDeD/tZWs9P6+q0J9Mwkdl6xMV8TjnGP3qJVJ06bdMgkbBlLU8IdfOsIsFz2BW1rNVT3XuNEl8zPAvw== - -picocolors@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/picocolors/-/picocolors-1.0.0.tgz#cb5bdc74ff3f51892236eaf79d68bc44564ab81c" - integrity sha512-1fygroTLlHu66zi26VoTDv8yRgm0Fccecssto+MhsZ0D/DGW2sm8E8AjW7NU5VVTRt5GxbeZ5qBuJr+HyLYkjQ== - -picomatch@^2.0.4, picomatch@^2.2.1, picomatch@^2.2.3: - version "2.3.1" - resolved "https://registry.yarnpkg.com/picomatch/-/picomatch-2.3.1.tgz#3ba3833733646d9d3e4995946c1365a67fb07a42" - integrity sha512-JU3teHTNjmE2VCGFzuY8EXzCDVwEqB2a8fsIvwaStHhAWJEeVd1o1QD80CU6+ZdEXXSLbSsuLwJjkCBWqRQUVA== - -pkg-dir@^4.2.0: - version "4.2.0" - resolved 
"https://registry.yarnpkg.com/pkg-dir/-/pkg-dir-4.2.0.tgz#f099133df7ede422e81d1d8448270eeb3e4261f3" - integrity sha512-HRDzbaKjC+AOWVXxAU/x54COGeIv9eb+6CkDSQoNTt4XyWoIJvuPsXizxu/Fr23EiekbtZwmh1IcIG/l/a10GQ== - dependencies: - find-up "^4.0.0" - -popper.js@1.16.1-lts: - version "1.16.1-lts" - resolved "https://registry.yarnpkg.com/popper.js/-/popper.js-1.16.1-lts.tgz#cf6847b807da3799d80ee3d6d2f90df8a3f50b05" - integrity sha512-Kjw8nKRl1m+VrSFCoVGPph93W/qrSO7ZkqPpTf7F4bk/sqcfWK019dWBUpE/fBOsOQY1dks/Bmcbfn1heM/IsA== - -portable-fetch@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/portable-fetch/-/portable-fetch-3.0.0.tgz#3cbf4aa6dbc5a5734b41c0419c9273313bfd9ad8" - integrity sha1-PL9KptvFpXNLQcBBnJJzMTv9mtg= - dependencies: - node-fetch "^1.0.1" - whatwg-fetch ">=0.10.0" - -portfinder@^1.0.28: - version "1.0.28" - resolved "https://registry.yarnpkg.com/portfinder/-/portfinder-1.0.28.tgz#67c4622852bd5374dd1dd900f779f53462fac778" - integrity sha512-Se+2isanIcEqf2XMHjyUKskczxbPH7dQnlMjXX6+dybayyHvAf/TCgyMRlzf/B6QDhAEFOGes0pzRo3by4AbMA== - dependencies: - async "^2.6.2" - debug "^3.1.1" - mkdirp "^0.5.5" - -postcss-modules-extract-imports@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/postcss-modules-extract-imports/-/postcss-modules-extract-imports-3.0.0.tgz#cda1f047c0ae80c97dbe28c3e76a43b88025741d" - integrity sha512-bdHleFnP3kZ4NYDhuGlVK+CMrQ/pqUm8bx/oGL93K6gVwiclvX5x0n76fYMKuIGKzlABOy13zsvqjb0f92TEXw== - -postcss-modules-local-by-default@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/postcss-modules-local-by-default/-/postcss-modules-local-by-default-4.0.0.tgz#ebbb54fae1598eecfdf691a02b3ff3b390a5a51c" - integrity sha512-sT7ihtmGSF9yhm6ggikHdV0hlziDTX7oFoXtuVWeDd3hHObNkcHRo9V3yg7vCAY7cONyxJC/XXCmmiHHcvX7bQ== - dependencies: - icss-utils "^5.0.0" - postcss-selector-parser "^6.0.2" - postcss-value-parser "^4.1.0" - -postcss-modules-scope@^3.0.0: - version "3.0.0" - resolved 
"https://registry.yarnpkg.com/postcss-modules-scope/-/postcss-modules-scope-3.0.0.tgz#9ef3151456d3bbfa120ca44898dfca6f2fa01f06" - integrity sha512-hncihwFA2yPath8oZ15PZqvWGkWf+XUfQgUGamS4LqoP1anQLOsOJw0vr7J7IwLpoY9fatA2qiGUGmuZL0Iqlg== - dependencies: - postcss-selector-parser "^6.0.4" - -postcss-modules-values@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/postcss-modules-values/-/postcss-modules-values-4.0.0.tgz#d7c5e7e68c3bb3c9b27cbf48ca0bb3ffb4602c9c" - integrity sha512-RDxHkAiEGI78gS2ofyvCsu7iycRv7oqw5xMWn9iMoR0N/7mf9D50ecQqUo5BZ9Zh2vH4bCUR/ktCqbB9m8vJjQ== - dependencies: - icss-utils "^5.0.0" - -postcss-selector-parser@^6.0.2, postcss-selector-parser@^6.0.4: - version "6.0.9" - resolved "https://registry.yarnpkg.com/postcss-selector-parser/-/postcss-selector-parser-6.0.9.tgz#ee71c3b9ff63d9cd130838876c13a2ec1a992b2f" - integrity sha512-UO3SgnZOVTwu4kyLR22UQ1xZh086RyNZppb7lLAKBFK8a32ttG5i87Y/P3+2bRSjZNyJ1B7hfFNo273tKe9YxQ== - dependencies: - cssesc "^3.0.0" - util-deprecate "^1.0.2" - -postcss-value-parser@^4.1.0: - version "4.2.0" - resolved "https://registry.yarnpkg.com/postcss-value-parser/-/postcss-value-parser-4.2.0.tgz#723c09920836ba6d3e5af019f92bc0971c02e514" - integrity sha512-1NNCs6uurfkVbeXG4S8JFT9t19m45ICnif8zWLd5oPSZ50QnwMfK+H3jv408d4jw/7Bttv5axS5IiHoLaVNHeQ== - -postcss@^8.2.15: - version "8.4.8" - resolved "https://registry.yarnpkg.com/postcss/-/postcss-8.4.8.tgz#dad963a76e82c081a0657d3a2f3602ce10c2e032" - integrity sha512-2tXEqGxrjvAO6U+CJzDL2Fk2kPHTv1jQsYkSoMeOis2SsYaXRO2COxTdQp99cYvif9JTXaAk9lYGc3VhJt7JPQ== - dependencies: - nanoid "^3.3.1" - picocolors "^1.0.0" - source-map-js "^1.0.2" - -prettier@^2.1.2: - version "2.5.1" - resolved "https://registry.yarnpkg.com/prettier/-/prettier-2.5.1.tgz#fff75fa9d519c54cf0fce328c1017d94546bc56a" - integrity sha512-vBZcPRUR5MZJwoyi3ZoyQlc1rXeEck8KgeC9AwwOn+exuxLxq5toTRDTSaVrXHxelDMHy9zlicw8u66yxoSUFg== - -pretty-error@^4.0.0: - version "4.0.0" - resolved 
"https://registry.yarnpkg.com/pretty-error/-/pretty-error-4.0.0.tgz#90a703f46dd7234adb46d0f84823e9d1cb8f10d6" - integrity sha512-AoJ5YMAcXKYxKhuJGdcvse+Voc6v1RgnsR3nWcYU7q4t6z0Q6T86sv5Zq8VIRbOWWFpvdGE83LtdSMNd+6Y0xw== - dependencies: - lodash "^4.17.20" - renderkid "^3.0.0" - -process-nextick-args@~2.0.0: - version "2.0.1" - resolved "https://registry.yarnpkg.com/process-nextick-args/-/process-nextick-args-2.0.1.tgz#7820d9b16120cc55ca9ae7792680ae7dba6d7fe2" - integrity sha512-3ouUOpQhtgrbOa17J7+uxOTpITYWaGP7/AhoR3+A+/1e9skrzelGi/dXzEYyvbxubEF6Wn2ypscTKiKJFFn1ag== - -prop-types@^15.6.2, prop-types@^15.7.2: - version "15.8.1" - resolved "https://registry.yarnpkg.com/prop-types/-/prop-types-15.8.1.tgz#67d87bf1a694f48435cf332c24af10214a3140b5" - integrity sha512-oj87CgZICdulUohogVAR7AjlC0327U4el4L6eAvOqCeudMDVU0NThNaV+b9Df4dXgSP1gXMTnPdhfe/2qDH5cg== - dependencies: - loose-envify "^1.4.0" - object-assign "^4.1.1" - react-is "^16.13.1" - -proxy-addr@~2.0.7: - version "2.0.7" - resolved "https://registry.yarnpkg.com/proxy-addr/-/proxy-addr-2.0.7.tgz#f19fe69ceab311eeb94b42e70e8c2070f9ba1025" - integrity sha512-llQsMLSUDUPT44jdrU/O37qlnifitDP+ZwrmmZcoSKyLKvtZxpyV0n2/bD/N4tBAAZ/gJEdZU7KMraoK1+XYAg== - dependencies: - forwarded "0.2.0" - ipaddr.js "1.9.1" - -prr@~1.0.1: - version "1.0.1" - resolved "https://registry.yarnpkg.com/prr/-/prr-1.0.1.tgz#d3fc114ba06995a45ec6893f484ceb1d78f5f476" - integrity sha1-0/wRS6BplaRexok/SEzrHXj19HY= - -punycode@^2.1.0: - version "2.1.1" - resolved "https://registry.yarnpkg.com/punycode/-/punycode-2.1.1.tgz#b58b010ac40c22c5657616c8d2c2c02c7bf479ec" - integrity sha512-XRsRjdf+j5ml+y/6GKHPZbrF/8p2Yga0JPtdqTIY2Xe5ohJPD9saDJJLPvp9+NSBprVvevdXZybnj2cv8OEd0A== - -qs@6.9.7: - version "6.9.7" - resolved "https://registry.yarnpkg.com/qs/-/qs-6.9.7.tgz#4610846871485e1e048f44ae3b94033f0e675afe" - integrity sha512-IhMFgUmuNpyRfxA90umL7ByLlgRXu6tIfKPpF5TmcfRLlLCckfP/g3IQmju6jjpu+Hh8rA+2p6A27ZSPOOHdKw== - -queue-microtask@^1.2.2: - version "1.2.3" - 
resolved "https://registry.yarnpkg.com/queue-microtask/-/queue-microtask-1.2.3.tgz#4929228bbc724dfac43e0efb058caf7b6cfb6243" - integrity sha512-NuaNSa6flKT5JaSYQzJok04JzTL1CA6aGhv5rfLW3PgqA+M2ChpZQnAC8h8i4ZFkBS8X5RqkDBHA7r4hej3K9A== - -randombytes@^2.1.0: - version "2.1.0" - resolved "https://registry.yarnpkg.com/randombytes/-/randombytes-2.1.0.tgz#df6f84372f0270dc65cdf6291349ab7a473d4f2a" - integrity sha512-vYl3iOX+4CKUWuxGi9Ukhie6fsqXqS9FE2Zaic4tNFD2N2QQaXOMFbuKK4QmDHC0JO6B1Zp41J0LpT0oR68amQ== - dependencies: - safe-buffer "^5.1.0" - -range-parser@^1.2.1, range-parser@~1.2.1: - version "1.2.1" - resolved "https://registry.yarnpkg.com/range-parser/-/range-parser-1.2.1.tgz#3cf37023d199e1c24d1a55b84800c2f3e6468031" - integrity sha512-Hrgsx+orqoygnmhFbKaHE6c296J+HTAQXoxEF6gNupROmmGJRoyzfG3ccAveqCBrwr/2yxQ5BVd/GTl5agOwSg== - -raw-body@2.4.3: - version "2.4.3" - resolved "https://registry.yarnpkg.com/raw-body/-/raw-body-2.4.3.tgz#8f80305d11c2a0a545c2d9d89d7a0286fcead43c" - integrity sha512-UlTNLIcu0uzb4D2f4WltY6cVjLi+/jEN4lgEUj3E04tpMDpUlkBo/eSn6zou9hum2VMNpCCUone0O0WeJim07g== - dependencies: - bytes "3.1.2" - http-errors "1.8.1" - iconv-lite "0.4.24" - unpipe "1.0.0" - -rc-align@^4.0.0: - version "4.0.11" - resolved "https://registry.yarnpkg.com/rc-align/-/rc-align-4.0.11.tgz#8198c62db266bc1b8ef05e56c13275bf72628a5e" - integrity sha512-n9mQfIYQbbNTbefyQnRHZPWuTEwG1rY4a9yKlIWHSTbgwI+XUMGRYd0uJ5pE2UbrNX0WvnMBA1zJ3Lrecpra/A== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - dom-align "^1.7.0" - lodash "^4.17.21" - rc-util "^5.3.0" - resize-observer-polyfill "^1.5.1" - -rc-cascader@~3.2.1: - version "3.2.7" - resolved "https://registry.yarnpkg.com/rc-cascader/-/rc-cascader-3.2.7.tgz#74ac3ab9258f930e0c84dfacffd838b122b2cedf" - integrity sha512-M8VtKtifTXXo/qqXj63p12tsMNXm1z45Lytj7tu86L6gxIF8keDPcJ16/ZqrhS5JwlBPfoJNA1VooNl/KId15A== - dependencies: - "@babel/runtime" "^7.12.5" - array-tree-filter "^2.1.0" - classnames "^2.3.1" - rc-select "~14.0.0-alpha.23" 
- rc-tree "~5.4.3" - rc-util "^5.6.1" - -rc-checkbox@~2.3.0: - version "2.3.2" - resolved "https://registry.yarnpkg.com/rc-checkbox/-/rc-checkbox-2.3.2.tgz#f91b3678c7edb2baa8121c9483c664fa6f0aefc1" - integrity sha512-afVi1FYiGv1U0JlpNH/UaEXdh6WUJjcWokj/nUN2TgG80bfG+MDdbfHKlLcNNba94mbjy2/SXJ1HDgrOkXGAjg== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - -rc-collapse@~3.1.0: - version "3.1.2" - resolved "https://registry.yarnpkg.com/rc-collapse/-/rc-collapse-3.1.2.tgz#76028a811b845d03d9460ccc409c7ea8ad09db14" - integrity sha512-HujcKq7mghk/gVKeI6EjzTbb8e19XUZpakrYazu1MblEZ3Hu3WBMSN4A3QmvbF6n1g7x6lUlZvsHZ5shABWYOQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.3.4" - rc-util "^5.2.1" - shallowequal "^1.1.0" - -rc-dialog@~8.6.0: - version "8.6.0" - resolved "https://registry.yarnpkg.com/rc-dialog/-/rc-dialog-8.6.0.tgz#3b228dac085de5eed8c6237f31162104687442e7" - integrity sha512-GSbkfqjqxpZC5/zc+8H332+q5l/DKUhpQr0vdX2uDsxo5K0PhvaMEVjyoJUTkZ3+JstEADQji1PVLVb/2bJeOQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.6" - rc-motion "^2.3.0" - rc-util "^5.6.1" - -rc-drawer@~4.4.2: - version "4.4.3" - resolved "https://registry.yarnpkg.com/rc-drawer/-/rc-drawer-4.4.3.tgz#2094937a844e55dc9644236a2d9fba79c344e321" - integrity sha512-FYztwRs3uXnFOIf1hLvFxIQP9MiZJA+0w+Os8dfDh/90X7z/HqP/Yg+noLCIeHEbKln1Tqelv8ymCAN24zPcfQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.6" - rc-util "^5.7.0" - -rc-dropdown@^3.2.0, rc-dropdown@~3.3.2: - version "3.3.2" - resolved "https://registry.yarnpkg.com/rc-dropdown/-/rc-dropdown-3.3.2.tgz#097c2ec1b6d55c10eeb94dcf6120ba034c7a58e0" - integrity sha512-49GOz42oNvLtYGoJ2X5UWXJFp7aUiSZkj9OcgTV1UpxFZqHQMw+xijkaL5k3XDkMbb92XsuFnFt7IGG3/C0DKw== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.6" - rc-trigger "^5.0.4" - -rc-field-form@~1.23.0: - version "1.23.1" - resolved 
"https://registry.yarnpkg.com/rc-field-form/-/rc-field-form-1.23.1.tgz#638c11d05d7ed2efdcb862ff3da5fe2a7d199aaa" - integrity sha512-Mun+eaFmX1Pjud9bz0fD0IvxwDfFKWk2Q8tkt4sg4aKR9/FML/rzYC5MjY77p86X45XBurBDUR3gAda+Cg/ULw== - dependencies: - "@babel/runtime" "^7.8.4" - async-validator "^4.0.2" - rc-util "^5.8.0" - -rc-image@~5.2.5: - version "5.2.5" - resolved "https://registry.yarnpkg.com/rc-image/-/rc-image-5.2.5.tgz#44e6ffc842626827960e7ab72e1c0d6f3a8ce440" - integrity sha512-qUfZjYIODxO0c8a8P5GeuclYXZjzW4hV/5hyo27XqSFo1DmTCs2HkVeQObkcIk5kNsJtgsj1KoPThVsSc/PXOw== - dependencies: - "@babel/runtime" "^7.11.2" - classnames "^2.2.6" - rc-dialog "~8.6.0" - rc-util "^5.0.6" - -rc-input-number@~7.3.0: - version "7.3.4" - resolved "https://registry.yarnpkg.com/rc-input-number/-/rc-input-number-7.3.4.tgz#674aea98260250287d36e330a7e065b174486e9d" - integrity sha512-W9uqSzuvJUnz8H8vsVY4kx+yK51SsAxNTwr8SNH4G3XqQNocLVmKIibKFRjocnYX1RDHMND9FFbgj2h7E7nvGA== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.5" - rc-util "^5.9.8" - -rc-input@^0.0.1-alpha.5: - version "0.0.1-alpha.5" - resolved "https://registry.yarnpkg.com/rc-input/-/rc-input-0.0.1-alpha.5.tgz#cc043c44570c651f4d10d9809b3d634ed12537e6" - integrity sha512-RHvNweOVWFbbx2l/y6hgnSAdOg5fXc1D1VGhX2RNkGGyGr6cemnvyiYMxwZJjcXs0al3YK9jMObm20+DgH/mpw== - dependencies: - "@babel/runtime" "^7.11.1" - classnames "^2.2.1" - rc-util "^5.18.1" - -rc-mentions@~1.6.1: - version "1.6.2" - resolved "https://registry.yarnpkg.com/rc-mentions/-/rc-mentions-1.6.2.tgz#62ed7cdd8fa86d857c3ce3f9e73438022130815e" - integrity sha512-cntfJkNMq8B910rXuvnsnOV88DfmoUidnQnSIeXzWiYiUX4RL5oWUfSZzs+HAXYRU4SL1l8Mwjx95wHETiZ/fQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.6" - rc-menu "^9.0.0" - rc-textarea "^0.3.0" - rc-trigger "^5.0.4" - rc-util "^5.0.1" - -rc-menu@^9.0.0: - version "9.3.2" - resolved "https://registry.yarnpkg.com/rc-menu/-/rc-menu-9.3.2.tgz#bb842d37ebf71da912bea201cf7ef0a27267ad49" - integrity 
sha512-h3m45oY1INZyqphGELkdT0uiPnFzxkML8m0VMhJnk2fowtqfiT7F5tJLT3znEVaPIY80vMy1bClCkgq8U91CzQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.4.3" - rc-overflow "^1.2.0" - rc-trigger "^5.1.2" - rc-util "^5.12.0" - shallowequal "^1.1.0" - -rc-menu@~9.2.1: - version "9.2.1" - resolved "https://registry.yarnpkg.com/rc-menu/-/rc-menu-9.2.1.tgz#6fbe47f4846363bb81a5a21f0960026c3ada497a" - integrity sha512-UbEtn3rflJ8zS+etYGTVQuzy7Fm+yWXR5c0Rl6ecNTS/dPknRyWAyhJcbeR0Hu1+RdQT+0VCqrUPrgKnm4iY+w== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.4.3" - rc-overflow "^1.2.0" - rc-trigger "^5.1.2" - rc-util "^5.12.0" - shallowequal "^1.1.0" - -rc-motion@^2.0.0, rc-motion@^2.0.1, rc-motion@^2.2.0, rc-motion@^2.3.0, rc-motion@^2.3.4, rc-motion@^2.4.3, rc-motion@^2.4.4: - version "2.4.5" - resolved "https://registry.yarnpkg.com/rc-motion/-/rc-motion-2.4.5.tgz#b061c50bb29ecd3d735d5f4c40924a3c78226cbd" - integrity sha512-f3uJHR4gcpeZS/s8/nYFSOrXt2Wu/h9GrEcbJmC0qmKrVNgwL1pTgrT5kW7lgG6PFeoL4yHDmpQoEKkrPtKIzQ== - dependencies: - "@babel/runtime" "^7.11.1" - classnames "^2.2.1" - rc-util "^5.18.1" - -rc-notification@~4.5.7: - version "4.5.7" - resolved "https://registry.yarnpkg.com/rc-notification/-/rc-notification-4.5.7.tgz#265e6e6a0c1a0fac63d6abd4d832eb8ff31522f1" - integrity sha512-zhTGUjBIItbx96SiRu3KVURcLOydLUHZCPpYEn1zvh+re//Tnq/wSxN4FKgp38n4HOgHSVxcLEeSxBMTeBBDdw== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.2.0" - rc-util "^5.0.1" - -rc-overflow@^1.0.0, rc-overflow@^1.2.0: - version "1.2.3" - resolved "https://registry.yarnpkg.com/rc-overflow/-/rc-overflow-1.2.3.tgz#1754216d807f5473304272b0321c3aba7615f47a" - integrity sha512-Bz6dXTn/ww8nmu70tUQfRV0wT3BkfXY6j1lB1O38OVkDPz4xwfAcGK+LJ2zewUR5cTXkJ8hAN7YULohG8z4M7Q== - dependencies: - "@babel/runtime" "^7.11.1" - classnames "^2.2.1" - rc-resize-observer "^1.0.0" - rc-util "^5.15.0" - -rc-pagination@~3.1.9: - version "3.1.15" - 
resolved "https://registry.yarnpkg.com/rc-pagination/-/rc-pagination-3.1.15.tgz#e05eddf4c15717a5858290bed0857e27e2f957ff" - integrity sha512-4L3fot8g4E+PjWEgoVGX0noFCg+8ZFZmeLH4vsnZpB3O2T2zThtakjNxG+YvSaYtyMVT4B+GLayjKrKbXQpdAg== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - -rc-picker@~2.6.4: - version "2.6.4" - resolved "https://registry.yarnpkg.com/rc-picker/-/rc-picker-2.6.4.tgz#916aa5fcd8abd11106f1c2fb64bfd549439abfa0" - integrity sha512-Mnc1udPyGNSG7/ya5SmYltUjCUcsMH7jfJnuuXVAvEaEdx9qZxDGMWtIii//+ARC06CSHQ83s5iwiGFwM+FcDw== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - date-fns "2.x" - dayjs "1.x" - moment "^2.24.0" - rc-trigger "^5.0.4" - rc-util "^5.4.0" - shallowequal "^1.1.0" - -rc-progress@~3.2.1: - version "3.2.4" - resolved "https://registry.yarnpkg.com/rc-progress/-/rc-progress-3.2.4.tgz#4036acdae2566438545bc4df2203248babaf7549" - integrity sha512-M9WWutRaoVkPUPIrTpRIDpX0SPSrVHzxHdCRCbeoBFrd9UFWTYNWRlHsruJM5FH1AZI+BwB4wOJUNNylg/uFSw== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.6" - rc-util "^5.16.1" - -rc-rate@~2.9.0: - version "2.9.1" - resolved "https://registry.yarnpkg.com/rc-rate/-/rc-rate-2.9.1.tgz#e43cb95c4eb90a2c1e0b16ec6614d8c43530a731" - integrity sha512-MmIU7FT8W4LYRRHJD1sgG366qKtSaKb67D0/vVvJYR0lrCuRrCiVQ5qhfT5ghVO4wuVIORGpZs7ZKaYu+KMUzA== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.5" - rc-util "^5.0.1" - -rc-resize-observer@^1.0.0, rc-resize-observer@^1.1.0, rc-resize-observer@^1.2.0: - version "1.2.0" - resolved "https://registry.yarnpkg.com/rc-resize-observer/-/rc-resize-observer-1.2.0.tgz#9f46052f81cdf03498be35144cb7c53fd282c4c7" - integrity sha512-6W+UzT3PyDM0wVCEHfoW3qTHPTvbdSgiA43buiy8PzmeMnfgnDeb9NjdimMXMl3/TcrvvWl5RRVdp+NqcR47pQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - rc-util "^5.15.0" - resize-observer-polyfill "^1.5.1" - -rc-select@~14.0.0-alpha.15, rc-select@~14.0.0-alpha.23, rc-select@~14.0.0-alpha.8: - version 
"14.0.0" - resolved "https://registry.yarnpkg.com/rc-select/-/rc-select-14.0.0.tgz#87735dbc548f1cc8e94d579b21682ed2d34f7653" - integrity sha512-DkoWMhyxmrfpc1KJSqPORZdkKevzgOINvjR4WI+dibRe6i6DyqGB4Jk21sencnK9di6dumzOCHf93x9t9+gp3Q== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.0.1" - rc-overflow "^1.0.0" - rc-trigger "^5.0.4" - rc-util "^5.16.1" - rc-virtual-list "^3.2.0" - -rc-slider@~10.0.0-alpha.4: - version "10.0.0-alpha.4" - resolved "https://registry.yarnpkg.com/rc-slider/-/rc-slider-10.0.0-alpha.4.tgz#f14ec0905d53f1f9d7f495c301527d6eca5781cf" - integrity sha512-ih2xwkBgXAWAf7MjZIZyCiiWo6tnoIMuHifn0UeKXVAup7sH53QdSVvT9x/cysuSZIPNMYWEf6mec184n3gbiQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.5" - rc-tooltip "^5.0.1" - rc-util "^5.18.1" - shallowequal "^1.1.0" - -rc-steps@~4.1.0: - version "4.1.4" - resolved "https://registry.yarnpkg.com/rc-steps/-/rc-steps-4.1.4.tgz#0ba82db202d59ca52d0693dc9880dd145b19dc23" - integrity sha512-qoCqKZWSpkh/b03ASGx1WhpKnuZcRWmvuW+ZUu4mvMdfvFzVxblTwUM+9aBd0mlEUFmt6GW8FXhMpHkK3Uzp3w== - dependencies: - "@babel/runtime" "^7.10.2" - classnames "^2.2.3" - rc-util "^5.0.1" - -rc-switch@~3.2.0: - version "3.2.2" - resolved "https://registry.yarnpkg.com/rc-switch/-/rc-switch-3.2.2.tgz#d001f77f12664d52595b4f6fb425dd9e66fba8e8" - integrity sha512-+gUJClsZZzvAHGy1vZfnwySxj+MjLlGRyXKXScrtCTcmiYNPzxDFOxdQ/3pK1Kt/0POvwJ/6ALOR8gwdXGhs+A== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - rc-util "^5.0.1" - -rc-table@~7.23.0: - version "7.23.0" - resolved "https://registry.yarnpkg.com/rc-table/-/rc-table-7.23.0.tgz#e5f76998ecf3246147d45ed311417c08886e6507" - integrity sha512-Q1gneB2+lUa8EzCCfbrq+jO1qNSwQv1RUUXKB84W/Stdp4EvGOt2+QqGyfotMNM4JUw0fgGLwY+WjnhUhnLuQQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.5" - rc-resize-observer "^1.1.0" - rc-util "^5.14.0" - shallowequal "^1.1.0" - -rc-tabs@~11.10.0: - version "11.10.7" - resolved 
"https://registry.yarnpkg.com/rc-tabs/-/rc-tabs-11.10.7.tgz#7d8b5dcc17f1608cf3b9425d80069f1415479335" - integrity sha512-7IKmcU7QU3CdYnJTabeXs2DDeLiXLyALC8fvOtgyWWFXUD47G5vG+4bFO3f9+AI+rcFAPpfwapZbXxgmiRuWYQ== - dependencies: - "@babel/runtime" "^7.11.2" - classnames "2.x" - rc-dropdown "^3.2.0" - rc-menu "^9.0.0" - rc-resize-observer "^1.0.0" - rc-util "^5.5.0" - -rc-textarea@^0.3.0, rc-textarea@~0.3.0: - version "0.3.7" - resolved "https://registry.yarnpkg.com/rc-textarea/-/rc-textarea-0.3.7.tgz#987142891efdedb774883c07e2f51b318fde5a11" - integrity sha512-yCdZ6binKmAQB13hc/oehh0E/QRwoPP1pjF21aHBxlgXO3RzPF6dUu4LG2R4FZ1zx/fQd2L1faktulrXOM/2rw== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.1" - rc-resize-observer "^1.0.0" - rc-util "^5.7.0" - shallowequal "^1.1.0" - -rc-tooltip@^5.0.1, rc-tooltip@~5.1.1: - version "5.1.1" - resolved "https://registry.yarnpkg.com/rc-tooltip/-/rc-tooltip-5.1.1.tgz#94178ed162d0252bc4993b725f5dc2ac0fccf154" - integrity sha512-alt8eGMJulio6+4/uDm7nvV+rJq9bsfxFDCI0ljPdbuoygUscbsMYb6EQgwib/uqsXQUvzk+S7A59uYHmEgmDA== - dependencies: - "@babel/runtime" "^7.11.2" - rc-trigger "^5.0.0" - -rc-tree-select@~5.1.1: - version "5.1.4" - resolved "https://registry.yarnpkg.com/rc-tree-select/-/rc-tree-select-5.1.4.tgz#3577135399d1f4931b0f4d8245e0845861802e2b" - integrity sha512-sA6vTUQghzbjh3u6YAwJIebKkJEHUWDPFHQpfiPObqsEYqi9TKE1LvWqbJ77NbOlOARZq0KIb7LDGF8X0dikDQ== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-select "~14.0.0-alpha.8" - rc-tree "~5.4.3" - rc-util "^5.16.1" - -rc-tree@~5.4.3: - version "5.4.4" - resolved "https://registry.yarnpkg.com/rc-tree/-/rc-tree-5.4.4.tgz#2ea3663ad3c566aef79a46ba6a1e050d24323e01" - integrity sha512-2qoObRgp31DBXmVzMJmo4qmwP20XEa4hR3imWQtRPcgN3pmljW3WKFmZRrYdOFHz7CyTnRsFZR065bBkIoUpiA== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "2.x" - rc-motion "^2.0.1" - rc-util "^5.16.1" - rc-virtual-list "^3.4.2" - -rc-trigger@^5.0.0, rc-trigger@^5.0.4, 
rc-trigger@^5.1.2, rc-trigger@^5.2.10: - version "5.2.10" - resolved "https://registry.yarnpkg.com/rc-trigger/-/rc-trigger-5.2.10.tgz#8a0057a940b1b9027eaa33beec8a6ecd85cce2b1" - integrity sha512-FkUf4H9BOFDaIwu42fvRycXMAvkttph9AlbCZXssZDVzz2L+QZ0ERvfB/4nX3ZFPh1Zd+uVGr1DEDeXxq4J1TA== - dependencies: - "@babel/runtime" "^7.11.2" - classnames "^2.2.6" - rc-align "^4.0.0" - rc-motion "^2.0.0" - rc-util "^5.5.0" - -rc-upload@~4.3.0: - version "4.3.3" - resolved "https://registry.yarnpkg.com/rc-upload/-/rc-upload-4.3.3.tgz#e237aa525e5313fa16f4d04d27f53c2f0e157bb8" - integrity sha512-YoJ0phCRenMj1nzwalXzciKZ9/FAaCrFu84dS5pphwucTC8GUWClcDID/WWNGsLFcM97NqIboDqrV82rVRhW/w== - dependencies: - "@babel/runtime" "^7.10.1" - classnames "^2.2.5" - rc-util "^5.2.0" - -rc-util@^5.0.1, rc-util@^5.0.6, rc-util@^5.0.7, rc-util@^5.12.0, rc-util@^5.14.0, rc-util@^5.15.0, rc-util@^5.16.1, rc-util@^5.18.1, rc-util@^5.2.0, rc-util@^5.2.1, rc-util@^5.3.0, rc-util@^5.4.0, rc-util@^5.5.0, rc-util@^5.6.1, rc-util@^5.7.0, rc-util@^5.8.0, rc-util@^5.9.4, rc-util@^5.9.8: - version "5.18.1" - resolved "https://registry.yarnpkg.com/rc-util/-/rc-util-5.18.1.tgz#80bd1450b5254655d2fbea63e3d34f6871e9be79" - integrity sha512-24xaSrMZUEKh1+suDOtJWfPe9E6YrwryViZcoPO0miJTKzP4qhUlV5AAlKQ82AJilz/AOHfi3l6HoX8qa1ye8w== - dependencies: - "@babel/runtime" "^7.12.5" - react-is "^16.12.0" - shallowequal "^1.1.0" - -rc-virtual-list@^3.2.0, rc-virtual-list@^3.4.2: - version "3.4.2" - resolved "https://registry.yarnpkg.com/rc-virtual-list/-/rc-virtual-list-3.4.2.tgz#1078327aa7230b5e456d679ed2ce99f3c036ebd1" - integrity sha512-OyVrrPvvFcHvV0ssz5EDZ+7Rf5qLat/+mmujjchNw5FfbJWNDwkpQ99EcVE6+FtNRmX9wFa1LGNpZLUTvp/4GQ== - dependencies: - classnames "^2.2.6" - rc-resize-observer "^1.0.0" - rc-util "^5.0.7" - -react-dom@^16.13.1: - version "16.14.0" - resolved "https://registry.yarnpkg.com/react-dom/-/react-dom-16.14.0.tgz#7ad838ec29a777fb3c75c3a190f661cf92ab8b89" - integrity 
sha512-1gCeQXDLoIqMgqD3IO2Ah9bnf0w9kzhwN5q4FGnHZ67hBm9yePzB5JJAIQCc8x3pFnNlwFq4RidZggNAAkzWWw== - dependencies: - loose-envify "^1.1.0" - object-assign "^4.1.1" - prop-types "^15.6.2" - scheduler "^0.19.1" - -react-flame-graph@^1.4.0: - version "1.4.0" - resolved "https://registry.yarnpkg.com/react-flame-graph/-/react-flame-graph-1.4.0.tgz#52d118cc94348f630a812fc0ec530a5b73c30cdb" - integrity sha512-DaCK9ZX+xK0mNca72kUE5cu6T8hGe/KLsefQWf+eT9sVt+0WP1dVxZCGD8Svfn2KrZB9Mv011Intg/yG2YWSxA== - dependencies: - flow-bin "^0.118.0" - memoize-one "^3.1.1" - react-window "^1" - -react-is@^16.12.0, react-is@^16.13.1, react-is@^16.7.0: - version "16.13.1" - resolved "https://registry.yarnpkg.com/react-is/-/react-is-16.13.1.tgz#789729a4dc36de2999dc156dd6c1d9c18cea56a4" - integrity sha512-24e6ynE2H+OKt4kqsOvNd8kBpV65zoxbA4BVsEOB3ARVWQki/DHzaUoC5KuON/BiccDaCCTZBuOcfZs70kR8bQ== - -"react-is@^16.8.0 || ^17.0.0": - version "17.0.2" - resolved "https://registry.yarnpkg.com/react-is/-/react-is-17.0.2.tgz#e691d4a8e9c789365655539ab372762b0efb54f0" - integrity sha512-w2GsyukL62IJnlaff/nRegPQR94C/XXamvMWmSHRJ4y7Ts/4ocGRmTHvOs8PSE6pB3dWOrD/nueuU5sduBsQ4w== - -react-transition-group@^4.4.0: - version "4.4.2" - resolved "https://registry.yarnpkg.com/react-transition-group/-/react-transition-group-4.4.2.tgz#8b59a56f09ced7b55cbd53c36768b922890d5470" - integrity sha512-/RNYfRAMlZwDSr6z4zNKV6xu53/e2BuaBbGhbyYIXTrmgu/bGHzmqOs7mJSJBHy9Ud+ApHx3QjrkKSp1pxvlFg== - dependencies: - "@babel/runtime" "^7.5.5" - dom-helpers "^5.0.1" - loose-envify "^1.4.0" - prop-types "^15.6.2" - -react-window@^1: - version "1.8.6" - resolved "https://registry.yarnpkg.com/react-window/-/react-window-1.8.6.tgz#d011950ac643a994118632665aad0c6382e2a112" - integrity sha512-8VwEEYyjz6DCnGBsd+MgkD0KJ2/OXFULyDtorIiTz+QzwoP94tBoA7CnbtyXMm+cCeAUER5KJcPtWl9cpKbOBg== - dependencies: - "@babel/runtime" "^7.0.0" - memoize-one ">=3.1.1 <6" - -react@^16.13.1: - version "16.14.0" - resolved 
"https://registry.yarnpkg.com/react/-/react-16.14.0.tgz#94d776ddd0aaa37da3eda8fc5b6b18a4c9a3114d" - integrity sha512-0X2CImDkJGApiAlcf0ODKIneSwBPhqJawOa5wCtKbu7ZECrmS26NvtSILynQ66cgkT/RJ4LidJOc3bUESwmU8g== - dependencies: - loose-envify "^1.1.0" - object-assign "^4.1.1" - prop-types "^15.6.2" - -readable-stream@^2.0.1: - version "2.3.7" - resolved "https://registry.yarnpkg.com/readable-stream/-/readable-stream-2.3.7.tgz#1eca1cf711aef814c04f62252a36a62f6cb23b57" - integrity sha512-Ebho8K4jIbHAxnuxi7o42OrZgF/ZTNcsZj6nRKyUmkhLFq8CHItp/fy6hQZuZmP/n3yZ9VBUbp4zz/mX8hmYPw== - dependencies: - core-util-is "~1.0.0" - inherits "~2.0.3" - isarray "~1.0.0" - process-nextick-args "~2.0.0" - safe-buffer "~5.1.1" - string_decoder "~1.1.1" - util-deprecate "~1.0.1" - -readable-stream@^3.0.6: - version "3.6.0" - resolved "https://registry.yarnpkg.com/readable-stream/-/readable-stream-3.6.0.tgz#337bbda3adc0706bd3e024426a286d4b4b2c9198" - integrity sha512-BViHy7LKeTz4oNnkcLJ+lVSL6vpiFeX6/d3oSH8zCW7UxP2onchk+vTGB143xuFjHS3deTgkKoXXymXqymiIdA== - dependencies: - inherits "^2.0.3" - string_decoder "^1.1.1" - util-deprecate "^1.0.1" - -readdirp@~3.6.0: - version "3.6.0" - resolved "https://registry.yarnpkg.com/readdirp/-/readdirp-3.6.0.tgz#74a370bd857116e245b29cc97340cd431a02a6c7" - integrity sha512-hOS089on8RduqdbhvQ5Z37A0ESjsqz6qnRcffsMU3495FuTdqSm+7bhJ29JvIOsBDEEnan5DPu9t3To9VRlMzA== - dependencies: - picomatch "^2.2.1" - -rechoir@^0.7.0: - version "0.7.1" - resolved "https://registry.yarnpkg.com/rechoir/-/rechoir-0.7.1.tgz#9478a96a1ca135b5e88fc027f03ee92d6c645686" - integrity sha512-/njmZ8s1wVeR6pjTZ+0nCnv8SpZNRMT2D1RLOJQESlYFDBvwpTA4KWJpZ+sBJ4+vhjILRcK7JIFdGCdxEAAitg== - dependencies: - resolve "^1.9.0" - -regenerator-runtime@^0.13.4: - version "0.13.9" - resolved "https://registry.yarnpkg.com/regenerator-runtime/-/regenerator-runtime-0.13.9.tgz#8925742a98ffd90814988d7566ad30ca3b263b52" - integrity 
sha512-p3VT+cOEgxFsRRA9X4lkI1E+k2/CtnKtU4gcxyaCUreilL/vqI6CdZ3wxVUx3UOUg+gnUOQQcRI7BmSI656MYA== - -regexp.prototype.flags@^1.2.0: - version "1.4.1" - resolved "https://registry.yarnpkg.com/regexp.prototype.flags/-/regexp.prototype.flags-1.4.1.tgz#b3f4c0059af9e47eca9f3f660e51d81307e72307" - integrity sha512-pMR7hBVUUGI7PMA37m2ofIdQCsomVnas+Jn5UPGAHQ+/LlwKm/aTLJHdasmHRzlfeZwHiAOaRSo2rbBDm3nNUQ== - dependencies: - call-bind "^1.0.2" - define-properties "^1.1.3" - -relateurl@^0.2.7: - version "0.2.7" - resolved "https://registry.yarnpkg.com/relateurl/-/relateurl-0.2.7.tgz#54dbf377e51440aca90a4cd274600d3ff2d888a9" - integrity sha1-VNvzd+UUQKypCkzSdGANP/LYiKk= - -renderkid@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/renderkid/-/renderkid-3.0.0.tgz#5fd823e4d6951d37358ecc9a58b1f06836b6268a" - integrity sha512-q/7VIQA8lmM1hF+jn+sFSPWGlMkSAeNYcPLmDQx2zzuiDfaLrOmumR8iaUKlenFgh0XRPIUeSPlH3A+AW3Z5pg== - dependencies: - css-select "^4.1.3" - dom-converter "^0.2.0" - htmlparser2 "^6.1.0" - lodash "^4.17.21" - strip-ansi "^6.0.1" - -require-from-string@^2.0.2: - version "2.0.2" - resolved "https://registry.yarnpkg.com/require-from-string/-/require-from-string-2.0.2.tgz#89a7fdd938261267318eafe14f9c32e598c36909" - integrity sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw== - -requires-port@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/requires-port/-/requires-port-1.0.0.tgz#925d2601d39ac485e091cf0da5c6e694dc3dcaff" - integrity sha1-kl0mAdOaxIXgkc8NpcbmlNw9yv8= - -resize-observer-polyfill@^1.5.0, resize-observer-polyfill@^1.5.1: - version "1.5.1" - resolved "https://registry.yarnpkg.com/resize-observer-polyfill/-/resize-observer-polyfill-1.5.1.tgz#0e9020dd3d21024458d4ebd27e23e40269810464" - integrity sha512-LwZrotdHOo12nQuZlHEmtuXdqGoOD0OhaxopaNFxWzInpEgaLWoVuAMbTzixuosCx2nEG58ngzW3vxdWoxIgdg== - -resolve-cwd@^3.0.0: - version "3.0.0" - resolved 
"https://registry.yarnpkg.com/resolve-cwd/-/resolve-cwd-3.0.0.tgz#0f0075f1bb2544766cf73ba6a6e2adfebcb13f2d" - integrity sha512-OrZaX2Mb+rJCpH/6CpSqt9xFVpN++x01XnN2ie9g6P5/3xelLAkXWVADpdz1IHD/KFfEXyE6V0U01OQ3UO2rEg== - dependencies: - resolve-from "^5.0.0" - -resolve-from@^5.0.0: - version "5.0.0" - resolved "https://registry.yarnpkg.com/resolve-from/-/resolve-from-5.0.0.tgz#c35225843df8f776df21c57557bc087e9dfdfc69" - integrity sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw== - -resolve@^1.9.0: - version "1.22.0" - resolved "https://registry.yarnpkg.com/resolve/-/resolve-1.22.0.tgz#5e0b8c67c15df57a89bdbabe603a002f21731198" - integrity sha512-Hhtrw0nLeSrFQ7phPp4OOcVjLPIeMnRlr5mcnVuMe7M/7eBn98A3hmFRLoFo3DLZkivSYwhRUJTyPyWAk56WLw== - dependencies: - is-core-module "^2.8.1" - path-parse "^1.0.7" - supports-preserve-symlinks-flag "^1.0.0" - -retry@^0.13.1: - version "0.13.1" - resolved "https://registry.yarnpkg.com/retry/-/retry-0.13.1.tgz#185b1587acf67919d63b357349e03537b2484658" - integrity sha512-XQBQ3I8W1Cge0Seh+6gjj03LbmRFWuoszgK9ooCpwYIrhhoO80pfq4cUkU5DkknwfOfFteRwlZ56PYOGYyFWdg== - -reusify@^1.0.4: - version "1.0.4" - resolved "https://registry.yarnpkg.com/reusify/-/reusify-1.0.4.tgz#90da382b1e126efc02146e90845a88db12925d76" - integrity sha512-U9nH88a3fc/ekCF1l0/UP1IosiuIjyTh7hBvXVMHYgVcfGvt897Xguj2UOLDeI5BG2m7/uwyaLVT6fbtCwTyzw== - -rimraf@^3.0.2: - version "3.0.2" - resolved "https://registry.yarnpkg.com/rimraf/-/rimraf-3.0.2.tgz#f1a5402ba6220ad52cc1282bac1ae3aa49fd061a" - integrity sha512-JZkJMZkAGFFPP2YqXZXPbMlMBgsxzE8ILs4lMIX/2o0L9UBw9O/Y3o6wFw/i9YLapcUJWwqbi3kdxIPdC62TIA== - dependencies: - glob "^7.1.3" - -run-parallel@^1.1.9: - version "1.2.0" - resolved "https://registry.yarnpkg.com/run-parallel/-/run-parallel-1.2.0.tgz#66d1368da7bdf921eb9d95bd1a9229e7f21a43ee" - integrity sha512-5l4VyZR86LZ/lDxZTR6jqL8AFE2S0IFLMP26AbjsLVADxHdhB/c0GUsH+y39UfCi3dzz8OlQuPmnaJOMoDHQBA== - dependencies: - queue-microtask "^1.2.2" 
- -safe-buffer@5.1.2, safe-buffer@~5.1.0, safe-buffer@~5.1.1: - version "5.1.2" - resolved "https://registry.yarnpkg.com/safe-buffer/-/safe-buffer-5.1.2.tgz#991ec69d296e0313747d59bdfd2b745c35f8828d" - integrity sha512-Gd2UZBJDkXlY7GbJxfsE8/nvKkUEU1G38c1siN6QP6a9PT9MmHB8GnpscSmMJSoF8LOIrt8ud/wPtojys4G6+g== - -safe-buffer@5.2.1, safe-buffer@>=5.1.0, safe-buffer@^5.0.1, safe-buffer@^5.1.0, safe-buffer@~5.2.0: - version "5.2.1" - resolved "https://registry.yarnpkg.com/safe-buffer/-/safe-buffer-5.2.1.tgz#1eaf9fa9bdb1fdd4ec75f58f9cdb4e6b7827eec6" - integrity sha512-rp3So07KcdmmKbGvgaNxQSJr7bGVSVk5S9Eq1F+ppbRo70+YeaDxkw5Dd8NPN+GD6bjnYm2VuPuCXmpuYvmCXQ== - -"safer-buffer@>= 2.1.2 < 3": - version "2.1.2" - resolved "https://registry.yarnpkg.com/safer-buffer/-/safer-buffer-2.1.2.tgz#44fa161b0187b9549dd84bb91802f9bd8385cd6a" - integrity sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg== - -scheduler@^0.19.1: - version "0.19.1" - resolved "https://registry.yarnpkg.com/scheduler/-/scheduler-0.19.1.tgz#4f3e2ed2c1a7d65681f4c854fa8c5a1ccb40f196" - integrity sha512-n/zwRWRYSUj0/3g/otKDRPMh6qv2SYMWNq85IEa8iZyAv8od9zDYpGSnpBEjNgcMNq6Scbu5KfIPxNF72R/2EA== - dependencies: - loose-envify "^1.1.0" - object-assign "^4.1.1" - -schema-utils@^3.0.0, schema-utils@^3.1.0, schema-utils@^3.1.1: - version "3.1.1" - resolved "https://registry.yarnpkg.com/schema-utils/-/schema-utils-3.1.1.tgz#bc74c4b6b6995c1d88f76a8b77bea7219e0c8281" - integrity sha512-Y5PQxS4ITlC+EahLuXaY86TXfR7Dc5lw294alXOq86JAHCihAIZfqv8nNCWvaEJvaC51uN9hbLGeV0cFBdH+Fw== - dependencies: - "@types/json-schema" "^7.0.8" - ajv "^6.12.5" - ajv-keywords "^3.5.2" - -schema-utils@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/schema-utils/-/schema-utils-4.0.0.tgz#60331e9e3ae78ec5d16353c467c34b3a0a1d3df7" - integrity sha512-1edyXKgh6XnJsJSQ8mKWXnN/BVaIbFMLpouRUrXgVq7WYne5kw3MW7UPhO44uRXQSIpTSXoJbmrR2X0w9kUTyg== - dependencies: - "@types/json-schema" "^7.0.9" - ajv "^8.8.0" 
- ajv-formats "^2.1.1" - ajv-keywords "^5.0.0" - -scroll-into-view-if-needed@^2.2.25: - version "2.2.29" - resolved "https://registry.yarnpkg.com/scroll-into-view-if-needed/-/scroll-into-view-if-needed-2.2.29.tgz#551791a84b7e2287706511f8c68161e4990ab885" - integrity sha512-hxpAR6AN+Gh53AdAimHM6C8oTN1ppwVZITihix+WqalywBeFcQ6LdQP5ABNl26nX8GTEL7VT+b8lKpdqq65wXg== - dependencies: - compute-scroll-into-view "^1.0.17" - -select-hose@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/select-hose/-/select-hose-2.0.0.tgz#625d8658f865af43ec962bfc376a37359a4994ca" - integrity sha1-Yl2GWPhlr0Psliv8N2o3NZpJlMo= - -selfsigned@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/selfsigned/-/selfsigned-2.0.0.tgz#e927cd5377cbb0a1075302cff8df1042cc2bce5b" - integrity sha512-cUdFiCbKoa1mZ6osuJs2uDHrs0k0oprsKveFiiaBKCNq3SYyb5gs2HxhQyDNLCmL51ZZThqi4YNDpCK6GOP1iQ== - dependencies: - node-forge "^1.2.0" - -semver@^7.3.4, semver@^7.3.5: - version "7.3.5" - resolved "https://registry.yarnpkg.com/semver/-/semver-7.3.5.tgz#0b621c879348d8998e4b0e4be94b3f12e6018ef7" - integrity sha512-PoeGJYh8HK4BTO/a9Tf6ZG3veo/A7ZVsYrSA6J8ny9nb3B1VrpkuN+z9OE5wfE5p6H4LchYZsegiQgbJD94ZFQ== - dependencies: - lru-cache "^6.0.0" - -send@0.17.2: - version "0.17.2" - resolved "https://registry.yarnpkg.com/send/-/send-0.17.2.tgz#926622f76601c41808012c8bf1688fe3906f7820" - integrity sha512-UJYB6wFSJE3G00nEivR5rgWp8c2xXvJ3OPWPhmuteU0IKj8nKbG3DrjiOmLwpnHGYWAVwA69zmTm++YG0Hmwww== - dependencies: - debug "2.6.9" - depd "~1.1.2" - destroy "~1.0.4" - encodeurl "~1.0.2" - escape-html "~1.0.3" - etag "~1.8.1" - fresh "0.5.2" - http-errors "1.8.1" - mime "1.6.0" - ms "2.1.3" - on-finished "~2.3.0" - range-parser "~1.2.1" - statuses "~1.5.0" - -serialize-javascript@^6.0.0: - version "6.0.0" - resolved "https://registry.yarnpkg.com/serialize-javascript/-/serialize-javascript-6.0.0.tgz#efae5d88f45d7924141da8b5c3a7a7e663fefeb8" - integrity 
sha512-Qr3TosvguFt8ePWqsvRfrKyQXIiW+nGbYpy8XK24NQHE83caxWt+mIymTT19DGFbNWNLfEwsrkSmN64lVWB9ag== - dependencies: - randombytes "^2.1.0" - -serve-index@^1.9.1: - version "1.9.1" - resolved "https://registry.yarnpkg.com/serve-index/-/serve-index-1.9.1.tgz#d3768d69b1e7d82e5ce050fff5b453bea12a9239" - integrity sha1-03aNabHn2C5c4FD/9bRTvqEqkjk= - dependencies: - accepts "~1.3.4" - batch "0.6.1" - debug "2.6.9" - escape-html "~1.0.3" - http-errors "~1.6.2" - mime-types "~2.1.17" - parseurl "~1.3.2" - -serve-static@1.14.2: - version "1.14.2" - resolved "https://registry.yarnpkg.com/serve-static/-/serve-static-1.14.2.tgz#722d6294b1d62626d41b43a013ece4598d292bfa" - integrity sha512-+TMNA9AFxUEGuC0z2mevogSnn9MXKb4fa7ngeRMJaaGv8vTwnIEkKi+QGvPt33HSnf8pRS+WGM0EbMtCJLKMBQ== - dependencies: - encodeurl "~1.0.2" - escape-html "~1.0.3" - parseurl "~1.3.3" - send "0.17.2" - -setprototypeof@1.1.0: - version "1.1.0" - resolved "https://registry.yarnpkg.com/setprototypeof/-/setprototypeof-1.1.0.tgz#d0bd85536887b6fe7c0d818cb962d9d91c54e656" - integrity sha512-BvE/TwpZX4FXExxOxZyRGQQv651MSwmWKZGqvmPcRIjDqWub67kTKuIMx43cZZrS/cBBzwBcNDWoFxt2XEFIpQ== - -setprototypeof@1.2.0: - version "1.2.0" - resolved "https://registry.yarnpkg.com/setprototypeof/-/setprototypeof-1.2.0.tgz#66c9a24a73f9fc28cbe66b09fed3d33dcaf1b424" - integrity sha512-E5LDX7Wrp85Kil5bhZv46j8jOeboKq5JMmYM3gVGdGH8xFpPWXUMsNrlODCrkoxMEeNi/XZIwuRvY4XNwYMJpw== - -shallow-clone@^3.0.0: - version "3.0.1" - resolved "https://registry.yarnpkg.com/shallow-clone/-/shallow-clone-3.0.1.tgz#8f2981ad92531f55035b01fb230769a40e02efa3" - integrity sha512-/6KqX+GVUdqPuPPd2LxDDxzX6CAbjJehAAOKlNpqqUpAqPM6HeL8f+o3a+JsyGjn2lv0WY8UsTgUJjU9Ok55NA== - dependencies: - kind-of "^6.0.2" - -shallowequal@^1.1.0: - version "1.1.0" - resolved "https://registry.yarnpkg.com/shallowequal/-/shallowequal-1.1.0.tgz#188d521de95b9087404fd4dcb68b13df0ae4e7f8" - integrity sha512-y0m1JoUZSlPAjXVtPPW70aZWfIL/dSP7AFkRnniLCrK/8MDKog3TySTBmckD+RObVxH0v4Tox67+F14PdED2oQ== - 
-shebang-command@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/shebang-command/-/shebang-command-2.0.0.tgz#ccd0af4f8835fbdc265b82461aaf0c36663f34ea" - integrity sha512-kHxr2zZpYtdmrN1qDjrrX/Z1rR1kG8Dx+gkpK1G4eXmvXswmcE1hTWBWYUzlraYw1/yZp6YuDY77YtvbN0dmDA== - dependencies: - shebang-regex "^3.0.0" - -shebang-regex@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/shebang-regex/-/shebang-regex-3.0.0.tgz#ae16f1644d873ecad843b0307b143362d4c42172" - integrity sha512-7++dFhtcx3353uBaq8DDR4NuxBetBzC7ZQOhmTQInHEd6bSrXdiEyzCvG07Z44UYdLShWUyXt5M/yhz8ekcb1A== - -signal-exit@^3.0.3: - version "3.0.7" - resolved "https://registry.yarnpkg.com/signal-exit/-/signal-exit-3.0.7.tgz#a9a1767f8af84155114eaabd73f99273c8f59ad9" - integrity sha512-wnD2ZE+l+SPC/uoS0vXeE9L1+0wuaMqKlfz9AMUo38JsyLSBWSFcHR1Rri62LZc12vLr1gb3jl7iwQhgwpAbGQ== - -slash@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/slash/-/slash-3.0.0.tgz#6539be870c165adbd5240220dbe361f1bc4d4634" - integrity sha512-g9Q1haeby36OSStwb4ntCGGGaKsaVSjQ68fBxoQcutl5fS1vuY18H3wSt3jFyFtrkx+Kz0V1G85A4MyAdDMi2Q== - -sockjs@^0.3.21: - version "0.3.24" - resolved "https://registry.yarnpkg.com/sockjs/-/sockjs-0.3.24.tgz#c9bc8995f33a111bea0395ec30aa3206bdb5ccce" - integrity sha512-GJgLTZ7vYb/JtPSSZ10hsOYIvEYsjbNU+zPdIHcUaWVNUEPivzxku31865sSSud0Da0W4lEeOPlmw93zLQchuQ== - dependencies: - faye-websocket "^0.11.3" - uuid "^8.3.2" - websocket-driver "^0.7.4" - -source-map-js@^1.0.2: - version "1.0.2" - resolved "https://registry.yarnpkg.com/source-map-js/-/source-map-js-1.0.2.tgz#adbc361d9c62df380125e7f161f71c826f1e490c" - integrity sha512-R0XvVJ9WusLiqTCEiGCmICCMplcCkIwwR11mOSD9CR5u+IXYdiseeEuXCVAjS54zqwkLcPNnmU4OeJ6tUrWhDw== - -source-map-support@~0.5.20: - version "0.5.21" - resolved "https://registry.yarnpkg.com/source-map-support/-/source-map-support-0.5.21.tgz#04fe7c7f9e1ed2d662233c28cb2b35b9f63f6e4f" - integrity 
sha512-uBHU3L3czsIyYXKX88fdrGovxdSCoTGDRZ6SYXtSRxLZUzHg5P/66Ht6uoUlHu9EZod+inXhKo3qQgwXUT/y1w== - dependencies: - buffer-from "^1.0.0" - source-map "^0.6.0" - -source-map@^0.6.0, source-map@^0.6.1, source-map@~0.6.0: - version "0.6.1" - resolved "https://registry.yarnpkg.com/source-map/-/source-map-0.6.1.tgz#74722af32e9614e9c287a8d0bbde48b5e2f1a263" - integrity sha512-UjgapumWlbMhkBgzT7Ykc5YXUT46F0iKu8SGXq0bcwP5dz/h0Plj6enJqjz1Zbq2l5WaqYnrVbwWOWMyF3F47g== - -source-map@~0.7.2: - version "0.7.3" - resolved "https://registry.yarnpkg.com/source-map/-/source-map-0.7.3.tgz#5302f8169031735226544092e64981f751750383" - integrity sha512-CkCj6giN3S+n9qrYiBTX5gystlENnRW5jZeNLHpe6aue+SrHcG5VYwujhW9s4dY31mEGsxBDrHR6oI69fTXsaQ== - -spdy-transport@^3.0.0: - version "3.0.0" - resolved "https://registry.yarnpkg.com/spdy-transport/-/spdy-transport-3.0.0.tgz#00d4863a6400ad75df93361a1608605e5dcdcf31" - integrity sha512-hsLVFE5SjA6TCisWeJXFKniGGOpBgMLmerfO2aCyCU5s7nJ/rpAepqmFifv/GCbSbueEeAJJnmSQ2rKC/g8Fcw== - dependencies: - debug "^4.1.0" - detect-node "^2.0.4" - hpack.js "^2.1.6" - obuf "^1.1.2" - readable-stream "^3.0.6" - wbuf "^1.7.3" - -spdy@^4.0.2: - version "4.0.2" - resolved "https://registry.yarnpkg.com/spdy/-/spdy-4.0.2.tgz#b74f466203a3eda452c02492b91fb9e84a27677b" - integrity sha512-r46gZQZQV+Kl9oItvl1JZZqJKGr+oEkB08A6BzkiR7593/7IbtuncXHd2YoYeTsG4157ZssMu9KYvUHLcjcDoA== - dependencies: - debug "^4.1.0" - handle-thing "^2.0.0" - http-deceiver "^1.2.7" - select-hose "^2.0.0" - spdy-transport "^3.0.0" - -"statuses@>= 1.4.0 < 2", "statuses@>= 1.5.0 < 2", statuses@~1.5.0: - version "1.5.0" - resolved "https://registry.yarnpkg.com/statuses/-/statuses-1.5.0.tgz#161c7dac177659fd9811f43771fa99381478628c" - integrity sha1-Fhx9rBd2Wf2YEfQ3cfqZOBR4Yow= - -string-convert@^0.2.0: - version "0.2.1" - resolved "https://registry.yarnpkg.com/string-convert/-/string-convert-0.2.1.tgz#6982cc3049fbb4cd85f8b24568b9d9bf39eeff97" - integrity sha1-aYLMMEn7tM2F+LJFaLnZvznu/5c= - 
-string_decoder@^1.1.1: - version "1.3.0" - resolved "https://registry.yarnpkg.com/string_decoder/-/string_decoder-1.3.0.tgz#42f114594a46cf1a8e30b0a84f56c78c3edac21e" - integrity sha512-hkRX8U1WjJFd8LsDJ2yQ/wWWxaopEsABU1XfkM8A+j0+85JAGppt16cr1Whg6KIbb4okU6Mql6BOj+uup/wKeA== - dependencies: - safe-buffer "~5.2.0" - -string_decoder@~1.1.1: - version "1.1.1" - resolved "https://registry.yarnpkg.com/string_decoder/-/string_decoder-1.1.1.tgz#9cf1611ba62685d7030ae9e4ba34149c3af03fc8" - integrity sha512-n/ShnvDi6FHbbVfviro+WojiFzv+s8MPMHBczVePfUpDJLwoLT0ht1l4YwBCbi8pJAveEEdnkHyPyTP/mzRfwg== - dependencies: - safe-buffer "~5.1.0" - -strip-ansi@^6.0.1: - version "6.0.1" - resolved "https://registry.yarnpkg.com/strip-ansi/-/strip-ansi-6.0.1.tgz#9e26c63d30f53443e9489495b2105d37b67a85d9" - integrity sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A== - dependencies: - ansi-regex "^5.0.1" - -strip-ansi@^7.0.0: - version "7.0.1" - resolved "https://registry.yarnpkg.com/strip-ansi/-/strip-ansi-7.0.1.tgz#61740a08ce36b61e50e65653f07060d000975fb2" - integrity sha512-cXNxvT8dFNRVfhVME3JAe98mkXDYN2O1l7jmcwMnOslDeESg1rF/OZMtK0nRAhiari1unG5cD4jG3rapUAkLbw== - dependencies: - ansi-regex "^6.0.1" - -strip-final-newline@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/strip-final-newline/-/strip-final-newline-2.0.0.tgz#89b852fb2fcbe936f6f4b3187afb0a12c1ab58ad" - integrity sha512-BrpvfNAE3dcvq7ll3xVumzjKjZQ5tI1sEUIKr3Uoks0XUl45St3FlatVqef9prk4jRDzhW6WZg+3bk93y6pLjA== - -style-loader@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/style-loader/-/style-loader-2.0.0.tgz#9669602fd4690740eaaec137799a03addbbc393c" - integrity sha512-Z0gYUJmzZ6ZdRUqpg1r8GsaFKypE+3xAzuFeMuoHgjc9KZv3wMyCRjQIWEbhoFSq7+7yoHXySDJyyWQaPajeiQ== - dependencies: - loader-utils "^2.0.0" - schema-utils "^3.0.0" - -supports-color@^7.1.0: - version "7.2.0" - resolved 
"https://registry.yarnpkg.com/supports-color/-/supports-color-7.2.0.tgz#1b7dcdcb32b8138801b3e478ba6a51caa89648da" - integrity sha512-qpCAvRl9stuOHveKsn7HncJRvv501qIacKzQlO/+Lwxc9+0q2wLyv4Dfvt80/DPn2pqOBsJdDiogXGR9+OvwRw== - dependencies: - has-flag "^4.0.0" - -supports-color@^8.0.0: - version "8.1.1" - resolved "https://registry.yarnpkg.com/supports-color/-/supports-color-8.1.1.tgz#cd6fc17e28500cff56c1b86c0a7fd4a54a73005c" - integrity sha512-MpUEN2OodtUzxvKQl72cUF7RQ5EiHsGvSsVG0ia9c5RbWGL2CI4C7EpPS8UTBIplnlzZiNuV56w+FuNxy3ty2Q== - dependencies: - has-flag "^4.0.0" - -supports-preserve-symlinks-flag@^1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/supports-preserve-symlinks-flag/-/supports-preserve-symlinks-flag-1.0.0.tgz#6eda4bd344a3c94aea376d4cc31bc77311039e09" - integrity sha512-ot0WnXS9fgdkgIcePe6RHNk1WA8+muPa6cSjeR3V8K27q9BB1rTE3R1p7Hv0z1ZyAc8s6Vvv8DIyWf681MAt0w== - -tapable@^1.0.0: - version "1.1.3" - resolved "https://registry.yarnpkg.com/tapable/-/tapable-1.1.3.tgz#a1fccc06b58db61fd7a45da2da44f5f3a3e67ba2" - integrity sha512-4WK/bYZmj8xLr+HUCODHGF1ZFzsYffasLUgEiMBY4fgtltdO6B4WJtlSbPaDTLpYTcGVwM2qLnFTICEcNxs3kA== - -tapable@^2.0.0, tapable@^2.1.1, tapable@^2.2.0: - version "2.2.1" - resolved "https://registry.yarnpkg.com/tapable/-/tapable-2.2.1.tgz#1967a73ef4060a82f12ab96af86d52fdb76eeca0" - integrity sha512-GNzQvQTOIP6RyTfE2Qxb8ZVlNmw0n88vp1szwWRimP02mnTsx3Wtn5qRdqY9w2XduFNUgvOwhNnQsjwCp+kqaQ== - -terser-webpack-plugin@^5.1.3: - version "5.3.1" - resolved "https://registry.yarnpkg.com/terser-webpack-plugin/-/terser-webpack-plugin-5.3.1.tgz#0320dcc270ad5372c1e8993fabbd927929773e54" - integrity sha512-GvlZdT6wPQKbDNW/GDQzZFg/j4vKU96yl2q6mcUkzKOgW4gwf1Z8cZToUCrz31XHlPWH8MVb1r2tFtdDtTGJ7g== - dependencies: - jest-worker "^27.4.5" - schema-utils "^3.1.1" - serialize-javascript "^6.0.0" - source-map "^0.6.1" - terser "^5.7.2" - -terser@^5.10.0, terser@^5.7.2: - version "5.12.0" - resolved 
"https://registry.yarnpkg.com/terser/-/terser-5.12.0.tgz#728c6bff05f7d1dcb687d8eace0644802a9dae8a" - integrity sha512-R3AUhNBGWiFc77HXag+1fXpAxTAFRQTJemlJKjAgD9r8xXTpjNKqIXwHM/o7Rh+O0kUJtS3WQVdBeMKFk5sw9A== - dependencies: - acorn "^8.5.0" - commander "^2.20.0" - source-map "~0.7.2" - source-map-support "~0.5.20" - -thunky@^1.0.2: - version "1.1.0" - resolved "https://registry.yarnpkg.com/thunky/-/thunky-1.1.0.tgz#5abaf714a9405db0504732bbccd2cedd9ef9537d" - integrity sha512-eHY7nBftgThBqOyHGVN+l8gF0BucP09fMo0oO/Lb0w1OF80dJv+lDVpXG60WMQvkcxAkNybKsrEIE3ZtKGmPrA== - -tiny-warning@^1.0.2: - version "1.0.3" - resolved "https://registry.yarnpkg.com/tiny-warning/-/tiny-warning-1.0.3.tgz#94a30db453df4c643d0fd566060d60a875d84754" - integrity sha512-lBN9zLN/oAf68o3zNXYrdCt1kP8WsiGW8Oo2ka41b2IM5JL/S1CTyX1rW0mb/zSuJun0ZUrDxx4sqvYS2FWzPA== - -to-regex-range@^5.0.1: - version "5.0.1" - resolved "https://registry.yarnpkg.com/to-regex-range/-/to-regex-range-5.0.1.tgz#1648c44aae7c8d988a326018ed72f5b4dd0392e4" - integrity sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ== - dependencies: - is-number "^7.0.0" - -toggle-selection@^1.0.6: - version "1.0.6" - resolved "https://registry.yarnpkg.com/toggle-selection/-/toggle-selection-1.0.6.tgz#6e45b1263f2017fa0acc7d89d78b15b8bf77da32" - integrity sha1-bkWxJj8gF/oKzH2J14sVuL932jI= - -toidentifier@1.0.1: - version "1.0.1" - resolved "https://registry.yarnpkg.com/toidentifier/-/toidentifier-1.0.1.tgz#3be34321a88a820ed1bd80dfaa33e479fbb8dd35" - integrity sha512-o5sSPKEkg/DIQNmH43V0/uerLrpzVedkUh8tGNvaeXpfpuwjKenlSox/2O/BTlZUtEe+JG7s5YhEz608PlAHRA== - -tr46@~0.0.3: - version "0.0.3" - resolved "https://registry.yarnpkg.com/tr46/-/tr46-0.0.3.tgz#8184fd347dac9cdc185992f3a6622e14b9d9ab6a" - integrity sha1-gYT9NH2snNwYWZLzpmIuFLnZq2o= - -ts-loader@^8.0.18: - version "8.3.0" - resolved "https://registry.yarnpkg.com/ts-loader/-/ts-loader-8.3.0.tgz#83360496d6f8004fab35825279132c93412edf33" - integrity 
sha512-MgGly4I6cStsJy27ViE32UoqxPTN9Xly4anxxVyaIWR+9BGxboV4EyJBGfR3RePV7Ksjj3rHmPZJeIt+7o4Vag== - dependencies: - chalk "^4.1.0" - enhanced-resolve "^4.0.0" - loader-utils "^2.0.0" - micromatch "^4.0.0" - semver "^7.3.4" - -tslib@^2.0.3: - version "2.3.1" - resolved "https://registry.yarnpkg.com/tslib/-/tslib-2.3.1.tgz#e8a335add5ceae51aa261d32a490158ef042ef01" - integrity sha512-77EbyPPpMz+FRFRuAFlWMtmgUWGe9UOG2Z25NqCwiIjRhOf5iKGuzSe5P2w1laq+FkRy4p+PCuVkJSGkzTEKVw== - -type-is@~1.6.18: - version "1.6.18" - resolved "https://registry.yarnpkg.com/type-is/-/type-is-1.6.18.tgz#4e552cd05df09467dcbc4ef739de89f2cf37c131" - integrity sha512-TkRKr9sUTxEH8MdfuCSP7VizJyzRNMjj2J2do2Jr3Kym598JVdEksuzPQCnlFPW4ky9Q+iA+ma9BGm06XQBy8g== - dependencies: - media-typer "0.3.0" - mime-types "~2.1.24" - -typescript@^4.0.3: - version "4.6.2" - resolved "https://registry.yarnpkg.com/typescript/-/typescript-4.6.2.tgz#fe12d2727b708f4eef40f51598b3398baa9611d4" - integrity sha512-HM/hFigTBHZhLXshn9sN37H085+hQGeJHJ/X7LpBWLID/fbc2acUMfU+lGD98X81sKP+pFa9f0DZmCwB9GnbAg== - -unpipe@1.0.0, unpipe@~1.0.0: - version "1.0.0" - resolved "https://registry.yarnpkg.com/unpipe/-/unpipe-1.0.0.tgz#b2bf4ee8514aae6165b4817829d21b2ef49904ec" - integrity sha1-sr9O6FFKrmFltIF4KdIbLvSZBOw= - -uri-js@^4.2.2: - version "4.4.1" - resolved "https://registry.yarnpkg.com/uri-js/-/uri-js-4.4.1.tgz#9b1a52595225859e55f669d928f88c6c57f2a77e" - integrity sha512-7rKUyy33Q1yc98pQ1DAmLtwX109F7TIfWlW1Ydo8Wl1ii1SeHieeh0HHfPeL2fMXK6z0s8ecKs9frCuLJvndBg== - dependencies: - punycode "^2.1.0" - -util-deprecate@^1.0.1, util-deprecate@^1.0.2, util-deprecate@~1.0.1: - version "1.0.2" - resolved "https://registry.yarnpkg.com/util-deprecate/-/util-deprecate-1.0.2.tgz#450d4dc9fa70de732762fbd2d4a28981419a0ccf" - integrity sha1-RQ1Nyfpw3nMnYvvS1KKJgUGaDM8= - -utila@~0.4: - version "0.4.0" - resolved "https://registry.yarnpkg.com/utila/-/utila-0.4.0.tgz#8a16a05d445657a3aea5eecc5b12a4fa5379772c" - integrity sha1-ihagXURWV6Oupe7MWxKk+lN5dyw= - 
-utils-merge@1.0.1: - version "1.0.1" - resolved "https://registry.yarnpkg.com/utils-merge/-/utils-merge-1.0.1.tgz#9f95710f50a267947b2ccc124741c1028427e713" - integrity sha1-n5VxD1CiZ5R7LMwSR0HBAoQn5xM= - -uuid@^8.3.2: - version "8.3.2" - resolved "https://registry.yarnpkg.com/uuid/-/uuid-8.3.2.tgz#80d5b5ced271bb9af6c445f21a1a04c606cefbe2" - integrity sha512-+NYs2QeMWy+GWFOEm9xnn6HCDp0l7QBD7ml8zLUmJ+93Q5NF0NocErnwkTkXVFNiX3/fpC6afS8Dhb/gz7R7eg== - -vary@~1.1.2: - version "1.1.2" - resolved "https://registry.yarnpkg.com/vary/-/vary-1.1.2.tgz#2299f02c6ded30d4a5961b0b9f74524a18f634fc" - integrity sha1-IpnwLG3tMNSllhsLn3RSShj2NPw= - -watchpack@^2.3.1: - version "2.3.1" - resolved "https://registry.yarnpkg.com/watchpack/-/watchpack-2.3.1.tgz#4200d9447b401156eeca7767ee610f8809bc9d25" - integrity sha512-x0t0JuydIo8qCNctdDrn1OzH/qDzk2+rdCOC3YzumZ42fiMqmQ7T3xQurykYMhYfHaPHTp4ZxAx2NfUo1K6QaA== - dependencies: - glob-to-regexp "^0.4.1" - graceful-fs "^4.1.2" - -wbuf@^1.1.0, wbuf@^1.7.3: - version "1.7.3" - resolved "https://registry.yarnpkg.com/wbuf/-/wbuf-1.7.3.tgz#c1d8d149316d3ea852848895cb6a0bfe887b87df" - integrity sha512-O84QOnr0icsbFGLS0O3bI5FswxzRr8/gHwWkDlQFskhSPryQXvrTMxjxGP4+iWYoauLoBvfDpkrOauZ+0iZpDA== - dependencies: - minimalistic-assert "^1.0.0" - -webidl-conversions@^3.0.0: - version "3.0.1" - resolved "https://registry.yarnpkg.com/webidl-conversions/-/webidl-conversions-3.0.1.tgz#24534275e2a7bc6be7bc86611cc16ae0a5654871" - integrity sha1-JFNCdeKnvGvnvIZhHMFq4KVlSHE= - -webpack-cli@^4.5.0: - version "4.9.2" - resolved "https://registry.yarnpkg.com/webpack-cli/-/webpack-cli-4.9.2.tgz#77c1adaea020c3f9e2db8aad8ea78d235c83659d" - integrity sha512-m3/AACnBBzK/kMTcxWHcZFPrw/eQuY4Df1TxvIWfWM2x7mRqBQCqKEd96oCUa9jkapLBaFfRce33eGDb4Pr7YQ== - dependencies: - "@discoveryjs/json-ext" "^0.5.0" - "@webpack-cli/configtest" "^1.1.1" - "@webpack-cli/info" "^1.4.1" - "@webpack-cli/serve" "^1.6.1" - colorette "^2.0.14" - commander "^7.0.0" - execa "^5.0.0" - fastest-levenshtein 
"^1.0.12" - import-local "^3.0.2" - interpret "^2.2.0" - rechoir "^0.7.0" - webpack-merge "^5.7.3" - -webpack-dev-middleware@^5.3.1: - version "5.3.1" - resolved "https://registry.yarnpkg.com/webpack-dev-middleware/-/webpack-dev-middleware-5.3.1.tgz#aa079a8dedd7e58bfeab358a9af7dab304cee57f" - integrity sha512-81EujCKkyles2wphtdrnPg/QqegC/AtqNH//mQkBYSMqwFVCQrxM6ktB2O/SPlZy7LqeEfTbV3cZARGQz6umhg== - dependencies: - colorette "^2.0.10" - memfs "^3.4.1" - mime-types "^2.1.31" - range-parser "^1.2.1" - schema-utils "^4.0.0" - -webpack-dev-server@^4.7.4: - version "4.7.4" - resolved "https://registry.yarnpkg.com/webpack-dev-server/-/webpack-dev-server-4.7.4.tgz#d0ef7da78224578384e795ac228d8efb63d5f945" - integrity sha512-nfdsb02Zi2qzkNmgtZjkrMOcXnYZ6FLKcQwpxT7MvmHKc+oTtDsBju8j+NMyAygZ9GW1jMEUpy3itHtqgEhe1A== - dependencies: - "@types/bonjour" "^3.5.9" - "@types/connect-history-api-fallback" "^1.3.5" - "@types/express" "^4.17.13" - "@types/serve-index" "^1.9.1" - "@types/sockjs" "^0.3.33" - "@types/ws" "^8.2.2" - ansi-html-community "^0.0.8" - bonjour "^3.5.0" - chokidar "^3.5.3" - colorette "^2.0.10" - compression "^1.7.4" - connect-history-api-fallback "^1.6.0" - default-gateway "^6.0.3" - del "^6.0.0" - express "^4.17.1" - graceful-fs "^4.2.6" - html-entities "^2.3.2" - http-proxy-middleware "^2.0.0" - ipaddr.js "^2.0.1" - open "^8.0.9" - p-retry "^4.5.0" - portfinder "^1.0.28" - schema-utils "^4.0.0" - selfsigned "^2.0.0" - serve-index "^1.9.1" - sockjs "^0.3.21" - spdy "^4.0.2" - strip-ansi "^7.0.0" - webpack-dev-middleware "^5.3.1" - ws "^8.4.2" - -webpack-merge@^5.7.3: - version "5.8.0" - resolved "https://registry.yarnpkg.com/webpack-merge/-/webpack-merge-5.8.0.tgz#2b39dbf22af87776ad744c390223731d30a68f61" - integrity sha512-/SaI7xY0831XwP6kzuwhKWVKDP9t1QY1h65lAFLbZqMPIuYcD9QAW4u9STIbU9kaJbPBB/geU/gLr1wDjOhQ+Q== - dependencies: - clone-deep "^4.0.1" - wildcard "^2.0.0" - -webpack-sources@^3.2.3: - version "3.2.3" - resolved 
"https://registry.yarnpkg.com/webpack-sources/-/webpack-sources-3.2.3.tgz#2d4daab8451fd4b240cc27055ff6a0c2ccea0cde" - integrity sha512-/DyMEOrDgLKKIG0fmvtz+4dUX/3Ghozwgm6iPp8KRhvn+eQf9+Q7GWxVNMk3+uCPWfdXYC4ExGBckIXdFEfH1w== - -webpack@^5.28.0: - version "5.70.0" - resolved "https://registry.yarnpkg.com/webpack/-/webpack-5.70.0.tgz#3461e6287a72b5e6e2f4872700bc8de0d7500e6d" - integrity sha512-ZMWWy8CeuTTjCxbeaQI21xSswseF2oNOwc70QSKNePvmxE7XW36i7vpBMYZFAUHPwQiEbNGCEYIOOlyRbdGmxw== - dependencies: - "@types/eslint-scope" "^3.7.3" - "@types/estree" "^0.0.51" - "@webassemblyjs/ast" "1.11.1" - "@webassemblyjs/wasm-edit" "1.11.1" - "@webassemblyjs/wasm-parser" "1.11.1" - acorn "^8.4.1" - acorn-import-assertions "^1.7.6" - browserslist "^4.14.5" - chrome-trace-event "^1.0.2" - enhanced-resolve "^5.9.2" - es-module-lexer "^0.9.0" - eslint-scope "5.1.1" - events "^3.2.0" - glob-to-regexp "^0.4.1" - graceful-fs "^4.2.9" - json-parse-better-errors "^1.0.2" - loader-runner "^4.2.0" - mime-types "^2.1.27" - neo-async "^2.6.2" - schema-utils "^3.1.0" - tapable "^2.1.1" - terser-webpack-plugin "^5.1.3" - watchpack "^2.3.1" - webpack-sources "^3.2.3" - -websocket-driver@>=0.5.1, websocket-driver@^0.7.4: - version "0.7.4" - resolved "https://registry.yarnpkg.com/websocket-driver/-/websocket-driver-0.7.4.tgz#89ad5295bbf64b480abcba31e4953aca706f5760" - integrity sha512-b17KeDIQVjvb0ssuSDF2cYXSg2iztliJ4B9WdsuB6J952qCPKmnVq4DyW5motImXHDC1cBT/1UezrJVsKw5zjg== - dependencies: - http-parser-js ">=0.5.1" - safe-buffer ">=5.1.0" - websocket-extensions ">=0.1.1" - -websocket-extensions@>=0.1.1: - version "0.1.4" - resolved "https://registry.yarnpkg.com/websocket-extensions/-/websocket-extensions-0.1.4.tgz#7f8473bc839dfd87608adb95d7eb075211578a42" - integrity sha512-OqedPIGOfsDlo31UNwYbCFMSaO9m9G/0faIHj5/dZFDMFqPTcx6UwqyOy3COEaEOg/9VsGIpdqn62W5KhoKSpg== - -whatwg-fetch@>=0.10.0: - version "3.6.2" - resolved 
"https://registry.yarnpkg.com/whatwg-fetch/-/whatwg-fetch-3.6.2.tgz#dced24f37f2624ed0281725d51d0e2e3fe677f8c" - integrity sha512-bJlen0FcuU/0EMLrdbJ7zOnW6ITZLrZMIarMUVmdKtsGvZna8vxKYaexICWPfZ8qwf9fzNq+UEIZrnSaApt6RA== - -whatwg-url@^5.0.0: - version "5.0.0" - resolved "https://registry.yarnpkg.com/whatwg-url/-/whatwg-url-5.0.0.tgz#966454e8765462e37644d3626f6742ce8b70965d" - integrity sha1-lmRU6HZUYuN2RNNib2dCzotwll0= - dependencies: - tr46 "~0.0.3" - webidl-conversions "^3.0.0" - -which@^2.0.1: - version "2.0.2" - resolved "https://registry.yarnpkg.com/which/-/which-2.0.2.tgz#7c6a8dd0a636a0327e10b59c9286eee93f3f51b1" - integrity sha512-BLI3Tl1TW3Pvl70l3yq3Y64i+awpwXqsGBYWkkqMtnbXgrMD+yj7rhW0kuEDxzJaYXGjEW5ogapKNMEKNMjibA== - dependencies: - isexe "^2.0.0" - -wildcard@^2.0.0: - version "2.0.0" - resolved "https://registry.yarnpkg.com/wildcard/-/wildcard-2.0.0.tgz#a77d20e5200c6faaac979e4b3aadc7b3dd7f8fec" - integrity sha512-JcKqAHLPxcdb9KM49dufGXn2x3ssnfjbcaQdLlfZsL9rH9wgDQjUtDxbo8NE0F6SFvydeu1VhZe7hZuHsB2/pw== - -wrappy@1: - version "1.0.2" - resolved "https://registry.yarnpkg.com/wrappy/-/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f" - integrity sha1-tSQ9jz7BqjXxNkYFvA0QNuMKtp8= - -ws@^8.4.2: - version "8.5.0" - resolved "https://registry.yarnpkg.com/ws/-/ws-8.5.0.tgz#bfb4be96600757fe5382de12c670dab984a1ed4f" - integrity sha512-BWX0SWVgLPzYwF8lTzEy1egjhS4S4OEAHfsO8o65WOVsrnSRGaSiUaa9e0ggGlkMTtBlmOpEXiie9RUcBO86qg== - -yallist@^4.0.0: - version "4.0.0" - resolved "https://registry.yarnpkg.com/yallist/-/yallist-4.0.0.tgz#9bb92790d9c0effec63be73519e11a35019a3a72" - integrity sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A== diff --git a/plugins/tensorboard-plugins/tb_plugin/setup.py b/plugins/tensorboard-plugins/tb_plugin/setup.py index 3c09006122c776df8fbe8af5836711613e3f6a9c..2d4260b2133ae00a91831a7e2867b467e029d108 100644 --- a/plugins/tensorboard-plugins/tb_plugin/setup.py +++ 
b/plugins/tensorboard-plugins/tb_plugin/setup.py @@ -1,6 +1,5 @@ # ------------------------------------------------------------------------- -# Copyright (c) Microsoft Corporation. All rights reserved. -# +# Copyright (c) Microsoft Corporation. # Copyright(c) 2023 Huawei Technologies. # All rights reserved # @@ -21,8 +20,13 @@ import os import pathlib import subprocess +from configparser import ConfigParser + import setuptools +config = ConfigParser() +config.read('./torch_tb_profiler/config/config.ini') + def read(rel_path): here = os.path.abspath(os.path.dirname(__file__)) @@ -83,17 +87,16 @@ setuptools.setup( name="torch-tb-profiler-ascend", version=get_version(os.path.join('torch_tb_profiler', '__init__.py')), description="PyTorch Ascend Profiler TensorBoard Plugin", - long_description="PyTorch Ascend Profiler TensorBoard Plugin : \ - https://gitee.com/ascend/att/tree/master/plugins/tensorboard-plugins/tb_plugin", - url="https://gitee.com/ascend/att/tree/master/plugins/tensorboard-plugins/tb_plugin", + long_description=f"PyTorch Ascend Profiler TensorBoard Plugin: {config.get('URL', 'repository_url')}", + url=config.get('URL', 'repository_url'), author="Ascend Team", - author_email="pmail_mindstudio@huawei.com", + author_email=config.get('EMAIL', 'author_email'), cmdclass={ "build_fe": build_fe }, packages=setuptools.find_packages(), package_data={ - "torch_tb_profiler": ["static/**"], + "torch_tb_profiler": ["static/**", "config/**"], }, entry_points={ "tensorboard_plugins": [ diff --git a/plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_0/worker0.1623143089861.pt.trace.json.gz b/plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_0/worker0.1623143089861.pt.trace.json.gz similarity index 100% rename from plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_0/worker0.1623143089861.pt.trace.json.gz rename to 
plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_0/worker0.1623143089861.pt.trace.json.gz diff --git a/plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_0/worker0.1623143566756.pt.trace.json.gz b/plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_0/worker0.1623143566756.pt.trace.json.gz similarity index 100% rename from plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_0/worker0.1623143566756.pt.trace.json.gz rename to plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_0/worker0.1623143566756.pt.trace.json.gz diff --git a/plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_4/worker0.1623212756351.pt.trace.json.gz b/plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_4/worker0.1623212756351.pt.trace.json.gz similarity index 100% rename from plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_4/worker0.1623212756351.pt.trace.json.gz rename to plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_4/worker0.1623212756351.pt.trace.json.gz diff --git a/plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_4/worker0.1623213129365.pt.trace.json.gz b/plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_4/worker0.1623213129365.pt.trace.json.gz similarity index 100% rename from plugins/tensorboard-plugins/tb_plugin/samples/resnet50_num_workers_4/worker0.1623213129365.pt.trace.json.gz rename to plugins/tensorboard-plugins/tb_plugin/test/resources/resnet50_num_workers_4/worker0.1623213129365.pt.trace.json.gz diff --git a/plugins/tensorboard-plugins/tb_plugin/test/test_tensorboard_end2end.py b/plugins/tensorboard-plugins/tb_plugin/test/test_tensorboard_end2end.py index fae95b49050537b921e291a4771c63a6bff35690..46636d11801a739935b4f385c6ce548009d09916 100644 --- a/plugins/tensorboard-plugins/tb_plugin/test/test_tensorboard_end2end.py +++ 
b/plugins/tensorboard-plugins/tb_plugin/test/test_tensorboard_end2end.py @@ -13,7 +13,7 @@ from urllib.error import HTTPError def get_samples_dir(): - return os.path.join(os.path.dirname(os.path.abspath(__file__)), '../samples') + return os.path.join(os.path.dirname(os.path.abspath(__file__)), 'resources') class TestEnd2End(unittest.TestCase): diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/__init__.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/__init__.py index fd7b265cfa7d67023075ec8d9bc59ed85f4e0f15..f7b951e609e5c65895a6db82d391e8d584eb37c8 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/__init__.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/__init__.py @@ -4,4 +4,4 @@ # Entry point for Pytorch TensorBoard plugin package. -__version__ = '0.4.0.8' +__version__ = '0.4.0.11' diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/config/config.ini b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/config/config.ini new file mode 100644 index 0000000000000000000000000000000000000000..500d472d27b2ca574e07829a64c50d6eb2ab7e71 --- /dev/null +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/config/config.ini @@ -0,0 +1,11 @@ +[URL] +pytorch_data_loading_url = https://pytorch.org/docs/stable/data.html#single-and-multi-process-data-loading +pytorch_amp_url = https://pytorch.org/docs/stable/amp.html +pytorch_ckp_url = https://pytorch.org/docs/stable/checkpoint.html +cuda_nn_ddp_instead_url = https://pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead +compress_url = https://pytorch.org/docs/stable/ddp_comm_hooks.html +grad_acc_url = https://towardsdatascience.com/what-is-gradient-accumulation-in-deep-learning-ec034122cfa +lamb_url = https://nvidia.github.io/apex/optimizers.html#apex.optimizers.FusedLAMB +repository_url = https://gitee.com/ascend/att/tree/master/plugins/tensorboard-plugins/tb_plugin +[EMAIL] +author_email = pmail_mindstudio@huawei.com \ No 
newline at end of file diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/consts.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/consts.py index 533effb8bb91f1f775fb1b98725b63854182ef53..b3e202af61eb9df1d210cd366e7d172075e1e570 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/consts.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/consts.py @@ -35,6 +35,8 @@ NODE_PROCESS_PATTERN = re.compile(r"""^(.*)_(\d+)""") MONITOR_RUN_REFRESH_INTERNAL_IN_SECONDS = 10 MAX_GPU_PER_NODE = 64 MAX_FILE_SIZE = 500 * 1024 * 1024 +MAX_LINUX_PATH_LENGTH = 4096 +MAX_WINDOWS_PATH_LENGTH = 260 View = namedtuple('View', 'id, name, display_name') OVERALL_VIEW = View(1, 'overall', 'Overview') diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/__init__.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/__init__.py index 6bd764e88d4fecd142e7a953b1adb5c4a72262b9..296f53b7c813b2c97b498469f49b973438d9f3ae 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/__init__.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/__init__.py @@ -1,4 +1,23 @@ +# ------------------------------------------------------------------------- +# Copyright (c) Microsoft Corporation. +# Copyright(c) 2023 Huawei Technologies. +# All rights reserved +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Modifications: Add visualization of PyTorch Ascend profiling. 
+# -------------------------------------------------------------------------- from .cache import Cache from .file import (BaseFileSystem, StatData, abspath, basename, download_file, exists, get_filesystem, glob, isdir, join, listdir, - makedirs, read, register_filesystem, relpath, walk, stat) + makedirs, read, register_filesystem, relpath, walk, stat, check_file_valid) diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/azureblob.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/azureblob.py index b0ac49a655fd3d999ea80dfc3e6fa62e33fc5269..2fcd69fee8c24393458875635c17bd74a71b0fc4 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/azureblob.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/azureblob.py @@ -20,9 +20,9 @@ class AzureBlobSystem(RemotePath, BaseFileSystem): raise ImportError('azure-storage-blob must be installed for Azure Blob support.') self.connection_string = os.environ.get('AZURE_STORAGE_CONNECTION_STRING', None) - def exists(self, dirname): + def exists(self, filename): """Returns whether the path is a directory or not.""" - basename, parts = self.split_blob_path(dirname) + basename, parts = self.split_blob_path(filename) if basename is None or parts is None: return False if basename == '': @@ -31,10 +31,10 @@ class AzureBlobSystem(RemotePath, BaseFileSystem): else: return basename == parts[0] - def read(self, filename, binary_mode=False, size=None, continue_from=None): + def read(self, file, binary_mode=False, size=None, continue_from=None): """Reads contents of a file to a string.""" - logger.info('azure blob: starting reading file %s' % filename) - account, container, path = self.container_and_path(filename) + logger.info('azure blob: starting reading file %s' % file) + account, container, path = self.container_and_path(file) client = self.create_container_client(account, container) blob_client = client.get_blob_client(path) if not blob_client.exists(): @@ -47,7 +47,7 @@ 
class AzureBlobSystem(RemotePath, BaseFileSystem): continuation_token = downloader.size data = downloader.readall() - logger.info('azure blob: file %s download is done, size is %d' % (filename, len(data))) + logger.info('azure blob: file %s download is done, size is %d' % (file, len(data))) if binary_mode: return as_bytes(data), continuation_token else: @@ -122,7 +122,7 @@ class AzureBlobSystem(RemotePath, BaseFileSystem): items.append(item) return items - def makedirs(self, dirname): + def makedirs(self, path): """No need create directory since the upload blob will automatically create""" pass diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/file.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/file.py index dc9abb056860d7a7708533bba55995a1ac6a5e79..9ef5d8485264f18426c18147663f2e1b9fb6900e 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/file.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/file.py @@ -15,32 +15,34 @@ The following functionalities are added after forking: """ import glob as py_glob import os +import platform +import sys import tempfile from .. 
import utils from .base import BaseFileSystem, LocalPath, RemotePath, StatData from .utils import as_bytes, as_text, parse_blob_url +from ..consts import MAX_FILE_SIZE, MAX_WINDOWS_PATH_LENGTH, MAX_LINUX_PATH_LENGTH logger = utils.get_logger() +S3_ENABLED = True try: import boto3 import botocore.exceptions - - S3_ENABLED = True except ImportError: S3_ENABLED = False +BLOB_ENABLED = True try: from azure.storage.blob import ContainerClient - BLOB_ENABLED = True except ImportError: BLOB_ENABLED = False +GS_ENABLED = True try: # Imports the Google Cloud client library from google.cloud import storage - GS_ENABLED = True except ImportError: GS_ENABLED = False @@ -86,19 +88,23 @@ class LocalFileSystem(LocalPath, BaseFileSystem): def __init__(self): pass + @staticmethod + def islink(path): + return os.path.islink(path) + def exists(self, filename): return os.path.exists(filename) - def read(self, filename, binary_mode=False, size=None, continue_from=None): + def read(self, file, binary_mode=False, size=None, continue_from=None): mode = "rb" if binary_mode else "r" encoding = None if binary_mode else "utf8" - if not self.exists(filename): - raise FileNotFoundError(filename) + if not self.exists(file): + raise FileNotFoundError(file) offset = None if continue_from is not None: offset = continue_from.get("opaque_offset", None) - with open(filename, mode, encoding=encoding) as f: + with open(file, mode, encoding=encoding) as f: if offset is not None: f.seek(offset) data = f.read(size) @@ -160,10 +166,6 @@ class LocalFileSystem(LocalPath, BaseFileSystem): return StatData(file_length) def walk(self, top, topdown=True, onerror=None): - # Note on followlinks=True: per the tensorboard documentation [1], users are encouraged to - # use symlink trees to have fine-grained control over the filesystem layout of runs. To - # support such trees, we must follow links. 
- # [1] https://github.com/tensorflow/tensorboard/blob/master/README.md#logdir--logdir_spec-legacy-mode yield from os.walk(top, topdown, onerror, followlinks=True) @@ -198,10 +200,10 @@ class S3FileSystem(RemotePath, BaseFileSystem): return True return False - def read(self, filename, binary_mode=False, size=None, continue_from=None): + def read(self, file, binary_mode=False, size=None, continue_from=None): """Reads contents of a file to a string.""" s3 = boto3.resource("s3", endpoint_url=self._s3_endpoint) - bucket, path = self.bucket_and_path(filename) + bucket, path = self.bucket_and_path(file) args = {} # S3 use continuation tokens of the form: {byte_offset: number} @@ -216,7 +218,7 @@ class S3FileSystem(RemotePath, BaseFileSystem): if offset != 0 or endpoint != "": args["Range"] = "bytes={}-{}".format(offset, endpoint) - logger.info("s3: starting reading file %s" % filename) + logger.info("s3: starting reading file %s" % file) try: stream = s3.Object(bucket, path).get(**args)["Body"].read() except botocore.exceptions.ClientError as exc: @@ -238,7 +240,7 @@ class S3FileSystem(RemotePath, BaseFileSystem): raise logger.info("s3: file %s download is done, size is %d" % - (filename, len(stream))) + (file, len(stream))) # `stream` should contain raw bytes here (i.e., there has been neither decoding nor newline translation), # so the byte offset increases by the expected amount. continuation_token = {"byte_offset": (offset + len(stream))} @@ -261,9 +263,6 @@ class S3FileSystem(RemotePath, BaseFileSystem): def download_file(self, file_to_download, file_to_save): logger.info("s3: starting downloading file %s as %s" % (file_to_download, file_to_save)) - # Use boto3.resource instead of boto3.client('s3') to support minio. 
- # https://docs.min.io/docs/how-to-use-aws-sdk-for-python-with-minio-server.html - # To support minio, the S3_ENDPOINT need to be set like: S3_ENDPOINT=http://localhost:9000 s3 = boto3.resource("s3", endpoint_url=self._s3_endpoint) bucket, path = self.bucket_and_path(file_to_download) s3.Bucket(bucket).download_file(path, file_to_save) @@ -321,14 +320,14 @@ class S3FileSystem(RemotePath, BaseFileSystem): keys.append(key) return keys - def makedirs(self, dirname): + def makedirs(self, path): """Creates a directory and all parent/intermediate directories.""" - if not self.exists(dirname): + if not self.exists(path): client = boto3.client("s3", endpoint_url=self._s3_endpoint) - bucket, path = self.bucket_and_path(dirname) - if not path.endswith("/"): - path += "/" - client.put_object(Body="", Bucket=bucket, Key=path) + bucket, dir_path = self.bucket_and_path(path) + if not dir_path.endswith("/"): + dir_path += "/" + client.put_object(Body="", Bucket=bucket, Key=dir_path) def stat(self, filename): """Returns file statistics for a given path.""" @@ -466,7 +465,7 @@ class File(object): if line and (line[-1] == "\n" or not self.buff): return line if not self.buff: - raise StopIteration() + return None else: index = self.buff.find("\n", self.buff_offset) if index != -1: @@ -481,7 +480,7 @@ class File(object): if line and (line[-1] == "\n" or not self.buff): return line if not self.buff: - raise StopIteration() + return None def next(self): return self.__next__() @@ -620,3 +619,40 @@ def stat(filename): def read(file): with File(file, 'rb') as f: return f.read() + + +def is_link(path): + return LocalFileSystem.islink(path) + + +def is_too_big_file(filepath): + return stat(filepath).length > MAX_FILE_SIZE + + +def has_too_long_path(filepath): + if platform.system() == 'Windows' and len(filepath) > MAX_WINDOWS_PATH_LENGTH: + logger.warning( + f'The path length of the file "{filepath}" exceeds the maximum limit of {MAX_WINDOWS_PATH_LENGTH} ' + f'and will be skipped.') + 
return True + elif len(filepath) > MAX_LINUX_PATH_LENGTH: + logger.warning( + f'The path length of the file "{filepath}" exceeds the maximum limit of {MAX_LINUX_PATH_LENGTH} ' + f'and will be skipped.') + return True + else: + return False + + +def check_file_valid(filepath): + if is_link(filepath): + logger.warning(f'File "{filepath}" is a soft link and will be skipped.') + return False + if is_too_big_file(filepath): + logger.warning( + f'File "{filepath}" exceeds the maximum limit size of 500MB and will be skipped.') + return False + if has_too_long_path(filepath): + return False + return True + diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/gs.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/gs.py index d3a46877326b12a5e8be49a65cf4c90be8157311..8596bce2b892b7188155d05330a6356a83323eff 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/gs.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/io/gs.py @@ -16,14 +16,14 @@ class GoogleBlobSystem(RemotePath, BaseFileSystem): if not storage: raise ImportError('google-cloud-storage must be installed for Google Cloud Blob support.') - def exists(self, dirname): + def exists(self, filename): """Returns whether the path is a directory or not.""" - bucket_name, path = self.bucket_and_path(dirname) + bucket_name, path = self.bucket_and_path(filename) client = self.create_google_cloud_client() bucket = client.bucket(bucket_name) return bucket.blob(path).exists() - def read(self, filename, binary_mode=False, size=None, continue_from=None): + def read(self, file, binary_mode=False, size=None, continue_from=None): raise NotImplementedError def write(self, filename, file_content, binary_mode=False): @@ -62,7 +62,7 @@ class GoogleBlobSystem(RemotePath, BaseFileSystem): items.append(item) return items - def makedirs(self, dirname): + def makedirs(self, path): """No need create directory since the upload blob will automatically create""" pass diff --git
a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/plugin.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/plugin.py index 6091fdbcd906bf49e4e631afe7d2ba57e65ce711..2651f87c087a419c950f93b201606e7601a33a08 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/plugin.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/plugin.py @@ -1,6 +1,5 @@ # ------------------------------------------------------------------------- -# Copyright (c) Microsoft Corporation. All rights reserved. -# +# Copyright (c) Microsoft Corporation. # Copyright(c) 2023 Huawei Technologies. # All rights reserved # @@ -47,6 +46,7 @@ def decorate_headers(func): headers = func(*args, **kwargs) headers.extend(TorchProfilerPlugin.headers) return headers + return wrapper @@ -344,14 +344,23 @@ class TorchProfilerPlugin(base_plugin.TBPlugin): end_ts = float(end_ts) for key in operator_memory_events: if start_ts is not None and end_ts is not None: - operator_memory_events[key] = [i for i in operator_memory_events[key] if - i[2] and start_ts <= i[2] <= end_ts] + operator_memory_events[key] = [ + i + for i in operator_memory_events[key] + if i[2] and start_ts <= i[2] <= end_ts + ] elif start_ts is not None: - operator_memory_events[key] = [i for i in operator_memory_events[key] if - i[2] and start_ts <= i[2]] + operator_memory_events[key] = [ + i + for i in operator_memory_events[key] + if i[2] and start_ts <= i[2] + ] elif end_ts is not None: - operator_memory_events[key] = [i for i in operator_memory_events[key] if - i[2] and end_ts >= i[2]] + operator_memory_events[key] = [ + i + for i in operator_memory_events[key] + if i[2] and end_ts >= i[2] + ] return self.respond_as_json(temp_memory_events, True) else: if start_ts is not None: @@ -473,9 +482,8 @@ class TorchProfilerPlugin(base_plugin.TBPlugin): def _monitor_runs(self): logger.info('Monitor runs begin') - + touched = set() try: - touched = set() while True: try: logger.debug('Scan run dir') diff --git 
a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/__init__.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/__init__.py index 9ca062abf58245753361a96890a2ee1ccdec42fb..59a0e64155546ce75e1c4607cf35c3144a28271f 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/__init__.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/__init__.py @@ -1,7 +1,6 @@ # ------------------------------------------------------------------------- # Copyright (c) Microsoft Corporation. All rights reserved. # -------------------------------------------------------------------------- +__all__ = ['RunLoader'] from .loader import RunLoader - -__all__ = ['RunLoader'] diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/communication.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/communication.py index 00f8dc98139d5bbb96daffb5989b9c3c660f2cbc..0afcdb11a66f89b8a448713bf140e3293db7e503 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/communication.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/communication.py @@ -59,7 +59,7 @@ def analyze_communication_nodes(comm_node_list: List[CommunicationNode])\ total_comm_stats[comm_node.name][0] += 1 bytes_one_value = 0 if comm_node.input_shape: - for i in range(len(comm_node.input_shape)): + for i, shape in enumerate(comm_node.input_shape): if comm_node.input_type[i] == 'long int': bytes_one_value = 8 elif comm_node.input_type[i] == 'float': @@ -76,7 +76,7 @@ def analyze_communication_nodes(comm_node_list: List[CommunicationNode])\ logger.warning('Found an unknown tensor type: {}'.format(comm_node.input_type[i])) bytes_one_value = 0 total_size = 1 - for size in comm_node.input_shape[i]: + for size in shape: total_size *= size total_comm_stats[comm_node.name][1] += total_size * bytes_one_value total_comm_stats[comm_node.name][2].extend(comm_node.kernel_ranges) diff --git 
a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/data.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/data.py index d6f9bb245eb2d170cb4a63e7f912a9c69932e28b..00544e635340c556d5346fc307bb29913c08929c 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/data.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/data.py @@ -22,14 +22,16 @@ import gzip import io as sysio import json import math +import os.path import re import tempfile from json.decoder import JSONDecodeError from typing import Dict, List, Optional +from configparser import ConfigParser from .op_tree import OpTreeBuilder from .. import io, utils -from ..consts import InputFilesType, MAX_FILE_SIZE, INPUT_FILE_LIST +from ..consts import InputFilesType, INPUT_FILE_LIST from ..utils import href from . import trace from .communication import analyze_communication_nodes @@ -44,6 +46,9 @@ from .tensor_cores_parser import TensorCoresParser from .trace import BaseEvent, EventTypes, MemoryEvent logger = utils.get_logger() +config = ConfigParser() +config_path = os.path.join(os.getcwd(), 'torch_tb_profiler', 'config', '../config/config.ini') +config.read(config_path) class RunProfileData(object): @@ -164,15 +169,8 @@ class RunProfileData(object): has_communication_overlap = False has_communication_wait_ops = False - def _check_file_size_valid(filepath): - if io.stat(filepath).length > MAX_FILE_SIZE: - logger.warning( - f'File "{filepath}" exceeds the maximum limit size of 500MB and will be skipped.') - return False - return True - for file in io.listdir(path): - if utils.is_npu_trace_path(file) and _check_file_size_valid(io.join(path, file)): + if utils.is_npu_trace_path(file) and io.check_file_valid(io.join(path, file)): has_trace = True trace_file = io.join(path, file) trace_path, trace_json = RunProfileData._preprocess_file(trace_file, cache_dir, 'Ascend') @@ -194,7 +192,7 @@ class RunProfileData(object): 
profile.profiler_start_ts = 0 for file in io.listdir(path): - if str(file) in INPUT_FILE_LIST and _check_file_size_valid(io.join(path, file)): + if str(file) in INPUT_FILE_LIST and io.check_file_valid(io.join(path, file)): if InputFilesType(file) == InputFilesType.KERNEL_DETAILS_CSV: has_kernel = True profile.kernel_file_path = io.join(path, file) @@ -262,10 +260,10 @@ class RunProfileData(object): try: trace_json = json.loads(fout.getvalue()) logger.warning('Get JSONDecodeError: %s, Re-encode it to temp file' % e.msg) - json_reencode = True except JSONDecodeError: logger.error(f'File "{trace_path}" is not in a legal JSON format and will be skipped.') return trace_path, {} + json_reencode = True # work-around to remove the 'Record Window End' events to avoid the huge end timestamp if device_target == 'Ascend': @@ -363,7 +361,7 @@ class RunProfileData(object): dataloader_ratio = self.avg_costs.costs[ProfileRole.DataLoader] / self.avg_costs.costs[ProfileRole.Total] if dataloader_ratio > 0.05: percentage = dataloader_ratio * 100 - url = 'https://pytorch.org/docs/stable/data.html#single-and-multi-process-data-loading' + url = config.get('URL', 'pytorch_data_loading_url') self.recommendations.append( f'This run has high time cost on input data loading. {percentage:.1f}% of the step ' + "time is in DataLoader. You could try to set num_workers on DataLoader's construction " + @@ -375,12 +373,11 @@ class RunProfileData(object): if self.device_props: # Tensor Cores feature is available on GPU cards with compute capability >= 7.0 - # https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications major = self.device_props[0].get('computeMajor') # If it's a pure CPU run, then self.tc_used_ratio is None, this rule will not be triggered. 
if major is not None and major >= 7: if math.isclose(self.tc_used_ratio, 0.0) and self.tc_eligible_ops_kernel_ratio > 0.0: - url = 'https://pytorch.org/docs/stable/amp.html' + url = config.get('URL', 'pytorch_amp_url') self.recommendations.append( f'Kernels with {round(self.tc_eligible_ops_kernel_ratio * 100)}%' ' time are launched by Tensor Cores eligible operators. ' @@ -395,8 +392,8 @@ class RunProfileData(object): if total_mem is not None and peak_mem > total_mem * 0.9: percentage = peak_mem / total_mem * 100 if total_mem > 0 else 0 total_mem_gb = total_mem / 1024 / 1024 / 1024 - ckp_url = 'https://pytorch.org/docs/stable/checkpoint.html' - amp_url = 'https://pytorch.org/docs/stable/amp.html' + ckp_url = config.get('URL', 'pytorch_ckp_url') + amp_url = config.get('URL', 'pytorch_amp_url') self.recommendations.append( f'Device memory usage is at the limit of device memory capacity ' f'({percentage:.1f}% of {total_mem_gb:.1f}GB on GPU{dev_id}). ' @@ -406,7 +403,7 @@ class RunProfileData(object): def _analyze_distributed_metrics(self): if self.use_dp and len(self.used_devices) > 1: - url = 'https://pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead' + url = config.get('URL', 'cuda_nn_ddp_instead_url') self.recommendations.append( f"It is recommended to {href('use DistributedDataParallel instead of DataParallel', url)}" ' to do multi-GPU training.') @@ -428,9 +425,9 @@ class RunProfileData(object): communication_ratio = self.avg_costs.costs[ProfileRole.Communication] / self.avg_costs.costs[ProfileRole.Total] if communication_ratio > 0.1: percentage = communication_ratio * 100 - compress_url = 'https://pytorch.org/docs/stable/ddp_comm_hooks.html', - grad_acc_url = 'https://towardsdatascience.com/what-is-gradient-accumulation-in-deep-learning-ec034122cfa' - lamb_url = 'https://nvidia.github.io/apex/optimizers.html#apex.optimizers.FusedLAMB' + compress_url = config.get('URL', 'compress_url') + grad_acc_url = config.get('URL', 'grad_acc_url') + lamb_url = 
config.get('URL', 'lamb_url') self.recommendations.append( f'This run has high time cost on communication. {percentage:.1f}% of the step time is in ' f"communication. You could try {href('Gradient Compression', compress_url)} or " diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/diffrun/tree.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/diffrun/tree.py index a164bd3d37390ba367f0d504910e45050227ffbf..c5cf5fad448122c74db46467cb0c70b8ce4f727e 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/diffrun/tree.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/diffrun/tree.py @@ -56,8 +56,9 @@ class DiffNode: def compare_operator_nodes( left_nodes: List[OperatorNode], right_nodes: List[OperatorNode]) -> Generator['DiffNode', None, None]: - '''Given two OperatorNode lists, find the DataLoader/Module/Backward/Optimizer node and create the child list DiffNode - ''' + """Given two OperatorNode lists, find the DataLoader/Module/Backward/Optimizer node and + create the child list DiffNode + """ right_keys = [(type(r), r.name) for r in right_nodes] # find matching points in the two list diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/event_parser.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/event_parser.py index 3cd7ce9ff662a152cc9e1e4150bfe4d762e7a691..9b364e0dbba55e07b939690d45123bbf6dc6fe23 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/event_parser.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/event_parser.py @@ -3,6 +3,7 @@ # ------------------------------------------------------------------------- import sys from collections import defaultdict +from dataclasses import dataclass from enum import IntEnum from typing import Dict, Iterable, List, Optional, Tuple @@ -31,11 +32,19 @@ class ProfileRole(IntEnum): Total = 8 +@dataclass +class NodeInfoParams: + event: 
DurationEvent + corrid_to_device: Dict[int, List[DeviceNode]] + corrid_to_runtime: Dict[int, RuntimeNode] + externalid_to_runtime: Dict[int, List[RuntimeNode]] + tid2list: Dict[int, List[OperatorNode]] + pl_tid2list: Dict[int, List[PLProfileNode]] + tid2zero_rt_list: Dict[int, List[RuntimeNode]] + + class NodeParserMixin: def __init__(self, *args, **kwargs): - """Please refer to https://stackoverflow.com/questions/9575409/calling-parent-class-init-with-multiple-inheritance-whats-the-right-way # noqa: E501 - to see the reason why we need call super().__init__ like this way - """ super().__init__(*args, **kwargs) self.communication_data: Dict[int, CommunicationNode] = {} @@ -68,14 +77,9 @@ class NodeParserMixin: for event in events: if event.type == EventTypes.MEMORY: continue - self._parse_node( - event, - corrid_to_device, - corrid_to_runtime, - externalid_to_runtime, - tid2list, - pl_tid2list, - tid2zero_rt_list) + params = NodeInfoParams(event, corrid_to_device, corrid_to_runtime, externalid_to_runtime, tid2list, + pl_tid2list, tid2zero_rt_list) + self._parse_node(params) if CommLibTypes.Nccl in self.comm_lib: for event in events: @@ -116,14 +120,14 @@ class NodeParserMixin: return comm_node is not None - def _parse_node(self, - event: DurationEvent, - corrid_to_device: Dict[int, List[DeviceNode]], - corrid_to_runtime: Dict[int, RuntimeNode], - externalid_to_runtime: Dict[int, List[RuntimeNode]], - tid2list: Dict[int, List[OperatorNode]], - pl_tid2list: Dict[int, List[PLProfileNode]], - tid2zero_rt_list: Dict[int, List[RuntimeNode]]): + def _parse_node(self, params: NodeInfoParams): + event = params.event + corrid_to_device = params.corrid_to_device + corrid_to_runtime = params.corrid_to_runtime + externalid_to_runtime = params.externalid_to_runtime + tid2list = params.tid2list + pl_tid2list = params.pl_tid2list + tid2zero_rt_list = params.tid2zero_rt_list corrid = event.correlation_id tid = event.tid if event.type in [EventTypes.KERNEL, EventTypes.MEMCPY, 
EventTypes.MEMSET]: @@ -226,8 +230,8 @@ class StepParser: self.steps.append((self.cpu_min_ts, self.cpu_max_ts)) self.steps_names.append('0') - for i in range(len(self.role_ranges)): - self.role_ranges[i] = merge_ranges(self.role_ranges[i]) + for i, role_range in enumerate(self.role_ranges): + self.role_ranges[i] = merge_ranges(role_range) def update_device_steps(self, runtime_node_list: List[RuntimeNode]): self._update_steps_duration(*self._find_device_steps(runtime_node_list)) @@ -362,9 +366,9 @@ class StepParser: # Change step time to device side on the condition that any step have device time. is_use_gpu = prev_step_end_time is not None if is_use_gpu: - for i_step in range(len(self.steps)): - step_start_time = max(prev_step_end_time, self.steps[i_step][0]) - step_end_time = self.steps[i_step][1] + for i_step, step in enumerate(self.steps): + step_start_time = max(prev_step_end_time, step[0]) + step_end_time = step[1] if steps_device[i_step][0] == sys.maxsize: # When step i_step has no device event. # Assign to step_start_time when kernel is behind host step end. 
step_end_time = max(step_end_time, step_start_time) @@ -402,7 +406,7 @@ class StepParser: class EventParser(NodeParserMixin, StepParser): def __init__(self): super().__init__() - self.comm_node_list: Dict[CommunicationNode] = None + self.comm_node_list: List[CommunicationNode] = None def parse(self, events: Iterable[BaseEvent], fwd_bwd_map: Dict[int, int]) -> Dict[int, List[OperatorNode]]: with utils.timing('EventParser: parse nodes'): @@ -439,10 +443,10 @@ class EventParser(NodeParserMixin, StepParser): header = f'[{ctx.tid}]' + '.'.join(ctx.name_stack[1:]) # omit the CallTreeRoot prefix_len = len(ctx.name_stack) * 4 - 4 - 1 if len(ctx.name_stack) > 1: - print(header) + logger.info(header) prefix = ' ' * prefix_len - print(prefix, node.name) - print(prefix, 'time:', node.start_time, '-->', node.end_time) + logger.info('%s %s', prefix, node.name) + logger.info('%s time: %s --> %s', prefix, node.start_time, node.end_time) def push(node: OperatorNode): ctx.name_stack.append(node.name) diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/kernel_parser.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/kernel_parser.py index 838fc38ce60619977c3e096791241d7fc697562d..229251e60a90d5bf4fed514d5f175199b92d3870 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/kernel_parser.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/kernel_parser.py @@ -6,7 +6,7 @@ from typing import Optional import numpy as np import pandas as pd -from .tensor_core import TC_Allowlist +from .tensor_core import TcAllowlist from .trace import EventTypes @@ -19,7 +19,7 @@ class KernelParser: events = [vars(event) for event in events if event.type == EventTypes.KERNEL] events = pd.DataFrame(events) events = events.astype({'type': 'category', 'name': 'string'}, copy=False) - events['tc_used'] = events['name'].map(lambda name: name in TcAllowlist) + events['tc_used'] = events['name'].map(lambda name: name in TcAllowlist) def
weighted_avg(x: pd.Series): try: diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/memory_parser.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/memory_parser.py index 766782be271240dabffc76bbc389d8659e601299..64b78127a4c7a5675e5b2f71877754c541dde94f 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/memory_parser.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/memory_parser.py @@ -25,7 +25,7 @@ class MemoryMetrics(IntEnum): class MemoryRecord: def __init__(self, scope: str, pid: int, tid: int, ts: int, device_type: DeviceType, device_id: int, - address: int, bytes: int, total_allocated: float, total_reserved: float): + address: int, record_bytes: int, total_allocated: float, total_reserved: float): self.scope = scope self.tid = tid self.pid = pid @@ -33,7 +33,7 @@ class MemoryRecord: self.device_type = device_type self.device_id = device_id self.addr = address - self.bytes = bytes + self.bytes = record_bytes self.total_allocated = total_allocated self.total_reserved = total_reserved self.op_name: Optional[str] = None @@ -132,7 +132,7 @@ class MemorySnapshot: for i in range(self_metric_length, metric_length): memory_metrics_keyed_by_node[node][device][i] += metrics[i] - for tid, root in tid2tree.items(): + for _, root in tid2tree.items(): for child in root.children: traverse_node_memory(child) @@ -217,7 +217,8 @@ class MemoryParser: """In the loop, one pass will process one record. The basic logic is: It will search from the node that last visited since both the records and tree is ordered already 1. it current node contains the records, then find the exactly child which just embrace it. - 2. otherwise, find the parent node and set the child_index, so that the parent node could continue from previous visited node. # noqa: E501 + 2. otherwise, find the parent node and set the child_index, so that the parent node could continue from + previous visited node. 
# noqa: E501 3. if there is not any node contains the records, then all remaining records will be ignored. """ record = records[record_index] diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/module_op.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/module_op.py index 061a503b411bb900c6a405c0b97c8a07dd986a00..15f1e4ef93a5234cdf6273f9830ac1a6f3aeaa41 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/module_op.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/module_op.py @@ -260,10 +260,3 @@ def get_module_tree(tid2tree: Dict[int, OperatorNode]): traverse_node(child, None) return modules - - -def dump_modules(level: int, modules: Iterable[Union[Module, ModuleNode]]): - """testing purpose""" - for module in modules: - print(f"{' ' * level}{module.name.replace('nn.Module: ', '')}_{module.module_id}") - dump_modules(level + 1, module.children) diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/node.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/node.py index 80860e53661e9a554de6fa9b09e6f13057fca8bb..0528491c28752b0358d79e27168d055546bd0310 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/node.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/node.py @@ -6,7 +6,7 @@ from abc import ABC from typing import List, Optional, Tuple from .. 
import utils -from .tensor_core import TC_Allowlist, TC_OP_Allowlist +from .tensor_core import TcAllowlist, TcOpAllowlist from .trace import (DurationEvent, EventTypes, KernelEvent, ModuleEvent, OperatorEvent, PLProfileEvent, NcclOpNameSet, GlooOpNameSet) @@ -16,12 +16,12 @@ ExcludeOpName = ['DataParallel.forward', 'DistributedDataParallel.forward'] class BaseNode(ABC): - def __init__(self, name: str, start_time: int, end_time: int, type: str, tid: int, + def __init__(self, name: str, start_time: int, end_time: int, node_type: str, tid: int, external_id: Optional[int] = None): self.name = name self.start_time = start_time self.end_time = end_time - self.type = type + self.type = node_type self.tid = tid self.external_id = external_id # For consistency check. @@ -31,7 +31,7 @@ class BaseNode(ABC): kwargs['name'] = event.name kwargs['start_time'] = event.ts kwargs['end_time'] = event.ts + event.duration - kwargs['type'] = event.type + kwargs['node_type'] = event.type kwargs['tid'] = event.tid external_id = getattr(event, 'external_id', None) @@ -84,15 +84,18 @@ class OperatorNode(HostNode): self.callstack = callstack self.self_host_duration = self_host_duration self.self_device_duration = self_device_duration - # self.parent_node = None - self.tc_eligible = self.name in TC_OP_Allowlist + self.tc_eligible = self.name in TcOpAllowlist self.tc_self_duration = 0 # Time of TC kernels launched by this op excluding its children operators. self.tc_total_duration = 0 # Time of TC kernels launched by this op including its children operators. 
def fill_stats(self): + def sort_key(x): + if x.start_time and x.end_time: + return x.start_time, -x.end_time + else: + return sys.maxsize, -sys.maxsize - 1 self.children.sort(key=lambda x: (x.start_time, -x.end_time)) - self.runtimes.sort(key=lambda x: (x.start_time, -x.end_time) - if x.start_time and x.end_time else (sys.maxsize, -sys.maxsize - 1)) + self.runtimes.sort(key=sort_key) for child in self.children: child.fill_stats() @@ -273,7 +276,7 @@ class DeviceNode(BaseNode): self.block = block self.regs_per_thread = regs_per_thread self.shared_memory = shared_memory - self.tc_used = self.name in TC_Allowlist + self.tc_used = self.name in TcAllowlist self.device_id = device_id @classmethod @@ -306,7 +309,7 @@ def create_operator_node(event: OperatorEvent): def is_operator_node(node: BaseNode): - return bool(type(node) is OperatorNode and node.type == EventTypes.OPERATOR and node.name not in ExcludeOpName + return bool(isinstance(node, OperatorNode) and node.type == EventTypes.OPERATOR and node.name not in ExcludeOpName and not node.name.startswith("Optimizer.")) # exclude Optimizer.zero_grad diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_agg.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_agg.py index 08a3f0d7061dc332a78ec97a6ff085bf1840a47d..d6fdb5903d368e02c4ddb9fc3f29f536696e2a2e 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_agg.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_agg.py @@ -49,7 +49,6 @@ def aggregate_ops(op_list: List[OperatorNode], agg.self_device_duration += op.self_device_duration agg.tc_self_duration += op.tc_self_duration agg.tc_total_duration += op.tc_total_duration - return agg agg_dicts: List[Dict[str, OperatorAgg]] = [{} for _ in range(len(keys_func))] for op in op_list: diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_tree.py 
b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_tree.py index 55e264617d835fb5bf94819b329fdbd2ee1c53f6..fe919b29ced02efcea862f5e83ab52704f3f0d09 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_tree.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/op_tree.py @@ -68,9 +68,10 @@ class OpTreeBuilder: if main_tid: # only append the staled device nodes into main thread self.main_tid = op_list[0].tid - root_node = self._build_tree_internal(op_list, zero_rt_list, tid, staled_device_nodes, is_ascend) + root_node = OpTreeBuilder._build_tree_internal(op_list, zero_rt_list, tid, staled_device_nodes, + is_ascend) else: - root_node = self._build_tree_internal(op_list, zero_rt_list, tid, [], is_ascend) + root_node = OpTreeBuilder._build_tree_internal(op_list, zero_rt_list, tid, [], is_ascend) tid2tree[int(tid)] = root_node return tid2tree @@ -83,7 +84,8 @@ class OpTreeBuilder: # there are multiple tids backward_tid = self._find_backward_tid() tid2len = { - tid: root.end_time - root.start_time for tid, root in self.tid2tree.items() + tid: root.end_time - root.start_time + for tid, root in self.tid2tree.items() if tid != backward_tid or backward_tid is None } # get the maximum length as the main thread @@ -97,7 +99,8 @@ class OpTreeBuilder: return None - def _build_tree_internal(self, host_node_list, zero_rt_list, tid, staled_device_nodes, is_ascend): + @staticmethod + def _build_tree_internal(host_node_list, zero_rt_list, tid, staled_device_nodes, is_ascend): """host_node_list: list of OperatorNode and ProfilerStepNode. 
zero_rt_list: list of RuntimeNode with external_id=0.""" @@ -110,7 +113,7 @@ class OpTreeBuilder: name='dummy', start_time=None, end_time=None, - type=EventTypes.RUNTIME, + node_type=EventTypes.RUNTIME, tid=0, device_nodes=staled_device_nodes)) dummpy_rt[0].fill_stats() @@ -119,7 +122,7 @@ class OpTreeBuilder: name='CallTreeRoot', start_time=-sys.maxsize - 1, end_time=sys.maxsize, - type=EventTypes.PYTHON, + node_type=EventTypes.PYTHON, tid=tid, runtimes=zero_rt_list + dummpy_rt) # Give the list of RuntimeNode with external_id=0 to root node. node_stack.append(root_node) @@ -130,7 +133,6 @@ class OpTreeBuilder: if node.end_time <= tail_node.end_time or ( is_ascend and math.isclose(node.end_time, tail_node.end_time, rel_tol=1)): tail_node.children.append(node) - # node.parent_node = weakref.ref(tail_node) node_stack.append(node) else: logger.error('Error in input data: ranges on the same thread should not intersect!' @@ -274,7 +276,7 @@ class OpTreeBuilder: if isinstance(node, ModuleNode): backward_node = BackwardNode(name=node.name + '.backward', start_time=None, end_time=None, - type='backward', tid=0) + node_type='backward', tid=0) if parent is None: result.append(backward_node) else: diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/overall_parser.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/overall_parser.py index e12fbfd1cc502accee83fb44c52b94f8253c64ce..c646a33b89a673e1738fd38704516df8bfdfaade 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/overall_parser.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/overall_parser.py @@ -23,8 +23,8 @@ class OverallParser(object): @classmethod def create_from_statistics(cls, statistics: 'OverallParser.Statistics', total_duration: int): costs = [0.] 
* len(ProfileRole) - for i in range(len(statistics.cost_ranges)): - costs[i] = get_ranges_sum(statistics.cost_ranges[i]) + for i, cost_range in enumerate(statistics.cost_ranges): + costs[i] = get_ranges_sum(cost_range) costs[ProfileRole.Total] = total_duration return cls(costs) @@ -58,8 +58,8 @@ class OverallParser(object): def intersection_with_step(self, step: Tuple[int, int]): cost_ranges: List[List[Tuple[int, int]]] = [] step = [step] - for range in self.cost_ranges: - cost_ranges.append(intersection_ranges_lists(step, range)) + for cost_range in self.cost_ranges: + cost_ranges.append(intersection_ranges_lists(step, cost_range)) return OverallParser.Statistics(cost_ranges) @@ -77,6 +77,9 @@ class OverallParser(object): def aggregate(self, steps: List[Tuple[int, int]], role_ranges: List[List[Tuple[int, int]]]): logger.debug('Overall, statistics') + if len(steps) <= 0: + logger.error('Invalid steps number of 0') + return global_stats = OverallParser.Statistics.create_from_range(steps, role_ranges) if role_ranges[ProfileRole.Kernel]: comm_comp_overlap = intersection_ranges_lists( @@ -89,7 +92,7 @@ class OverallParser(object): for i, step in enumerate(steps): steps_stat = global_stats.intersection_with_step(step) self.steps_costs.append(OverallParser.Costs.create_from_statistics(steps_stat, step[1] - step[0])) - for cost_index in range(len(self.avg_costs.costs)): + for cost_index, _ in enumerate(self.avg_costs.costs): self.avg_costs.costs[cost_index] += self.steps_costs[i].costs[cost_index] comm_costs = OverallParser.StepCommunicationCosts() @@ -107,5 +110,5 @@ class OverallParser(object): self.communication_overlap.append(comm_costs) valid_steps = len(steps) - for i in range(len(self.avg_costs.costs)): + for i, _ in enumerate(self.avg_costs.costs): self.avg_costs.costs[i] /= valid_steps diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/run_generator.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/run_generator.py 
index f2ab0452ec733783d880abfebae948f8ec4b3e6e..111dc34e81031a33ff9e0a2c03b0375522de24cf 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/run_generator.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/run_generator.py @@ -2,7 +2,6 @@ # Copyright (c) Microsoft Corporation. All rights reserved. # # Copyright(c) 2023 Huawei Technologies. -# All rights reserved # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -49,6 +48,140 @@ class RunGenerator(object): self.component_curve_data = {} self.process_data = {} + @staticmethod + def check_overlap_data(title): + # csv: step / compute time / communication_not_overlap / overlap / communication / free time + length = len(title) + if length < 5: + return [] + key = ["computing", "overlapped", "communication(not overlapped)", "free"] + get_key = list() + for j in key: + for i in range(length): + if j == title[i]: + get_key.append(i) + if len(get_key) < 4: + return [] + return get_key + + @staticmethod + def get_table_head(name: str, input_shape: str, call_stack: str, value: list): + if name is None: + return {} + temp = { + 'name': name, 'calls': 0, 'host_self_duration': 0, + 'host_total_duration': 0, 'device_self_duration': 0, 'device_total_duration': 0, + 'tc_self_ratio': 0, 'tc_total_ratio': 0, 'tc_eligible': 'Yes' + } + if input_shape is not None: + temp['input_shape'] = input_shape + if call_stack is not None: + temp['call_stack'] = call_stack + else: + temp['has_call_stack'] = False + else: + if call_stack is not None: + temp['call_stack'] = call_stack + else: + temp['has_call_stack'] = False + for vl in iter(value): + if 'has_call_stack' in temp and vl[2]: + temp['has_call_stack'] = True + temp['calls'] += 1 + temp['host_self_duration'] = round(temp['host_self_duration'] + vl[3], 2) + temp['host_total_duration'] = round(temp['host_total_duration'] + vl[4], 2) + 
temp['device_self_duration'] = round(temp['device_self_duration'] + vl[5], 2) + temp['device_total_duration'] = round(temp['device_total_duration'] + vl[6], 2) + temp['tc_self_ratio'] = round(temp['tc_self_ratio'] + vl[7], 2) + temp['tc_total_ratio'] = round(temp['tc_total_ratio'] + vl[8], 2) + temp['tc_eligible'] = 'Yes' if temp['tc_self_ratio'] > 0 or temp['tc_total_ratio'] > 0 else 'No' + temp['tc_self_ratio'] = 0 if temp['device_self_duration'] == 0 \ + else round(temp['tc_self_ratio'] / temp['device_self_duration'] * 100, 2) + temp['tc_total_ratio'] = 0 if temp['device_total_duration'] == 0 \ + else round(temp['tc_total_ratio'] / temp['device_total_duration'] * 100, 2) + return temp + + @staticmethod + def get_wait_table_by_ops(op, ops): + total_trans = 0 + total_synchronize = 0 + for key, data in op.items(): + if str(key) == "Total Op Info" and data.get("Communication Time Info"): + total_trans += float(data.get("Communication Time Info").get("Transit Time(ms)")) + total_synchronize += float(data.get("Communication Time Info").get("Synchronization Time(ms)")) + continue + k = re.sub(r'[0-9]+', ' ', key).split(" ")[0] + if k not in ops: + ops[k] = [0, 0, 0, 0] + ops[k][0] += 1 + for _, band in data.get("Communication Bandwidth Info").items(): + ops[k][1] += float(band.get("Transit Size(MB)")) + if data.get("Communication Time Info") is not None: + ops[k][2] += data.get("Communication Time Info").get("Elapse Time(ms)") + ops[k][3] += data.get("Communication Time Info").get("Transit Time(ms)") + return total_trans, total_synchronize + + @staticmethod + def trans_shape(shape: str): + result = list() + if ';' not in shape: + result.append('[' + shape.strip() + ']') + return '[' + ', '.join(result) + ']' + if len(shape.strip()) <= 1: + result.append('[]') + return '[' + ', '.join(result) + ']' + shape_spl = shape.split("\n") + for shape_div in iter(shape_spl): + result.append('[' + str(shape_div.replace(';', '')) + ']') + return '[' + ', '.join(result) + ']' + + 
@staticmethod + def get_process_peaks_and_devices_type(process_data: dict, memory_metric: str): + devices_type = [] + peaks = {} + for device in process_data: + devices_type.append(device) + reserved_list = process_data.get(device).get('Allocated') + if reserved_list is not None: + max_reserved = 0 + for array_value in reserved_list: + max_reserved = max(array_value[1], max_reserved) + peaks[device] = f'Peak Memory Usage: {max_reserved:.1f}{memory_metric}' + return devices_type, peaks + + @staticmethod + def get_pta_ge_peaks_and_devices_type(process_data: dict, memory_metric: str): + devices_type = [] + peaks = {} + for device in process_data: + devices_type.append(device) + peaks[device] = 'Reserved Peak Memory Usage:' + for component in process_data.get(device): + max_reserved = 0 + for array_value in process_data.get(device).get(component): + max_reserved = max(array_value[2], max_reserved) + peaks[device] += f' {component}-{max_reserved:.1f}{memory_metric} |' + return devices_type, peaks + + @staticmethod + def check_csv_columns(columns: list, column_idxs: dict): + column_exist_count = 0 + for idx, column in enumerate(columns): + if column in column_idxs: + column_idxs[column] = idx + column_exist_count += 1 + return column_idxs.values(), column_exist_count + + @staticmethod + def get_csv_data(path: str): + if path is None: + return [] + datas = [] + with open(path, encoding='utf-8-sig') as f: + for row in csv.reader(f, skipinitialspace=True): + datas.append(row) + return datas + def generate_run_profile(self): profile_run = RunProfile(self.worker, self.span) profile_run.is_pytorch_lightning = self.profile_data.is_pytorch_lightning @@ -85,7 +218,7 @@ class RunGenerator(object): profile_run.gpu_metrics = self.profile_data.gpu_metrics_parser.get_gpu_metrics() - gpu_infos = {gpu_id: RunGenerator._get_gpu_info(self.profile_data.device_props, gpu_id) + gpu_infos = {gpu_id: RunGenerator.get_gpu_info(self.profile_data.device_props, gpu_id) for gpu_id in 
self.profile_data.gpu_metrics_parser.gpu_ids} gpu_infos = {gpu_id: gpu_info for gpu_id, gpu_info in gpu_infos.items() if gpu_info is not None} @@ -140,11 +273,11 @@ class RunGenerator(object): def _npu_get_overlap(self): path = self.profile_data.distributed_csv_path overlap_by_steps: Dict[str, List[float]] = OrderedDict() - data = RunGenerator._get_csv_data(path) + data = RunGenerator.get_csv_data(path) if len(data) <= 1: return overlap_by_steps title = [x.lower() for x in data[0]] - title_name = RunGenerator._check_overlap_data(title) + title_name = RunGenerator.check_overlap_data(title) if not title_name: logger.error(f"Incomplete content of CSV file {path}.") return overlap_by_steps @@ -154,8 +287,10 @@ class RunGenerator(object): key = step[0] if key == '': key = 'all' - overlap = [float(step[int(title_name[0])]), float(step[int(title_name[1])]), - float(step[int(title_name[2])]), float(step[int(title_name[3])])] + overlap = [ + float(step[int(title_name[0])]), float(step[int(title_name[1])]), + float(step[int(title_name[2])]), float(step[int(title_name[3])]) + ] if key in overlap_by_steps: overlap_by_steps[key] = list(np.add(overlap, overlap_by_steps[key])) else: @@ -164,22 +299,6 @@ class RunGenerator(object): logger.error(f'File "{path}" has wrong data format in row {idx + 2} and will skip it.') return overlap_by_steps - @staticmethod - def _check_overlap_data(title): - # csv: step / compute time / communication_not_overlap / overlap / communication / free time - length = len(title) - if length < 5: - return [] - key = ["computing", "overlapped", "communication(not overlapped)", "free"] - get_key = list() - for j in key: - for i in range(length): - if j == title[i]: - get_key.append(i) - if len(get_key) < 4: - return [] - return get_key - def _npu_get_wait_table(self): path = self.profile_data.communication_json_path if not io.exists(path): @@ -214,9 +333,9 @@ class RunGenerator(object): collection_ops = data.get("collective") p2p_ops = data.get("p2p") try: 
- coll_total_trans, coll_total_synchronize = RunGenerator._get_wait_table_by_ops(collection_ops, - table_ops) - p2p_total_trans, p2p_total_synchronize = RunGenerator._get_wait_table_by_ops(p2p_ops, table_ops) + coll_total_trans, coll_total_synchronize = RunGenerator.get_wait_table_by_ops(collection_ops, + table_ops) + p2p_total_trans, p2p_total_synchronize = RunGenerator.get_wait_table_by_ops(p2p_ops, table_ops) except ValueError: logger.error(f'Time and size info must be number, please check file "{path}"') return wait_by_step, table_ops @@ -227,39 +346,21 @@ class RunGenerator(object): } return wait_by_step, table_ops - @staticmethod - def _get_wait_table_by_ops(op, ops): - total_trans = 0 - total_synchronize = 0 - for key, data in op.items(): - if str(key) == "Total Op Info" and data.get("Communication Time Info"): - total_trans += float(data.get("Communication Time Info").get("Transit Time(ms)")) - total_synchronize += float(data.get("Communication Time Info").get("Synchronization Time(ms)")) - continue - k = re.sub(r'[0-9]+', ' ', key).split(" ")[0] - if k not in ops: - ops[k] = [0, 0, 0, 0] - ops[k][0] += 1 - for _, band in data.get("Communication Bandwidth Info").items(): - ops[k][1] += float(band.get("Transit Size(MB)")) - if data.get("Communication Time Info") is not None: - ops[k][2] += data.get("Communication Time Info").get("Elapse Time(ms)") - ops[k][3] += data.get("Communication Time Info").get("Transit Time(ms)") - return total_trans, total_synchronize - def _get_operator_details_by_name(self): operator_by_name = defaultdict(list) operator_by_name_and_input_shapes = defaultdict(list) path = self.profile_data.operator_path - datas = RunGenerator._get_csv_data(path) + datas = RunGenerator.get_csv_data(path) if len(datas) <= 1: return operator_by_name, operator_by_name_and_input_shapes for idx, ls in enumerate(datas[1:]): try: - temp: list = [ls[0], RunGenerator._trans_shape(str(ls[1])), ls[2], float(ls[3]), float(ls[4]), - float(ls[5]), float(ls[6]), 
float(ls[7]), float(ls[8])] + temp: list = [ + ls[0], RunGenerator.trans_shape(str(ls[1])), ls[2], float(ls[3]), float(ls[4]), + float(ls[5]), float(ls[6]), float(ls[7]), float(ls[8]) + ] operator_by_name[ls[0]].append(temp) - key = "{}###{}".format(str(ls[0]), RunGenerator._trans_shape(str(ls[1]))) + key = "{}###{}".format(str(ls[0]), RunGenerator.trans_shape(str(ls[1]))) operator_by_name_and_input_shapes[key].append(temp) except (ValueError, IndexError): logger.error(f'File "{path}" has wrong data format in row {idx + 2} and will skip it.') @@ -281,8 +382,10 @@ class RunGenerator(object): def _get_operator_pie(self, group_by_input_shape=False): data = {} - tag = {'device_self_time': 'Device Self Time (us)', 'device_total_time': 'Device Total Time (us)', - 'host_self_time': 'Host Self Time (us)', 'host_total_time': 'Host Total Time (us)'} + tag = { + 'device_self_time': 'Device Self Time (us)', 'device_total_time': 'Device Total Time (us)', + 'host_self_time': 'Host Self Time (us)', 'host_total_time': 'Host Total Time (us)' + } for key, value in tag.items(): data[key] = { 'title': value, @@ -307,9 +410,9 @@ class RunGenerator(object): if group_by_input_shape: name = name_key.split("###")[0] shape = name_key.split("###")[1] - result.append(RunGenerator._get_table_head(name, shape, None, values)) + result.append(RunGenerator.get_table_head(name, shape, None, values)) else: - result.append(RunGenerator._get_table_head(name_key, None, None, values)) + result.append(RunGenerator.get_table_head(name_key, None, None, values)) return result def _set_name_callstack_data(self, group_by_input_shape=False): @@ -344,24 +447,10 @@ class RunGenerator(object): 'data': [] } for callstack_key, value in values.items(): - table['data'].append(RunGenerator._get_table_head(name, shape, callstack_key, value)) + table['data'].append(RunGenerator.get_table_head(name, shape, callstack_key, value)) result[name_key] = table return result - @staticmethod - def _trans_shape(shape: str): - 
result = list() - if ';' not in shape: - result.append('[' + shape.strip() + ']') - return '[' + ', '.join(result) + ']' - if len(shape.strip()) <= 1: - result.append('[]') - return '[' + ', '.join(result) + ']' - shape_spl = shape.split("\n") - for shape_div in iter(shape_spl): - result.append('[' + str(shape_div.replace(';', '')) + ']') - return '[' + ', '.join(result) + ']' - def _get_call_stack_by_name(self): result = dict() name_callstack_data = self._set_name_callstack_data() @@ -378,45 +467,10 @@ class RunGenerator(object): 'data': [] } for callstack_key, value in values.items(): - table['data'].append(RunGenerator._get_table_head(name_key, None, callstack_key, value)) + table['data'].append(RunGenerator.get_table_head(name_key, None, callstack_key, value)) result[name_key] = table return result - @staticmethod - def _get_table_head(name: str, input_shape: str, call_stack: str, value: list): - if name is None: - return {} - temp = {'name': name, 'calls': 0, 'host_self_duration': 0, - 'host_total_duration': 0, 'device_self_duration': 0, 'device_total_duration': 0, - 'tc_self_ratio': 0, 'tc_total_ratio': 0, 'tc_eligible': 'Yes'} - if input_shape is not None: - temp['input_shape'] = input_shape - if call_stack is not None: - temp['call_stack'] = call_stack - else: - temp['has_call_stack'] = False - else: - if call_stack is not None: - temp['call_stack'] = call_stack - else: - temp['has_call_stack'] = False - for vl in iter(value): - if 'has_call_stack' in temp and vl[2]: - temp['has_call_stack'] = True - temp['calls'] += 1 - temp['host_self_duration'] = round(temp['host_self_duration'] + vl[3], 2) - temp['host_total_duration'] = round(temp['host_total_duration'] + vl[4], 2) - temp['device_self_duration'] = round(temp['device_self_duration'] + vl[5], 2) - temp['device_total_duration'] = round(temp['device_total_duration'] + vl[6], 2) - temp['tc_self_ratio'] = round(temp['tc_self_ratio'] + vl[7], 2) - temp['tc_total_ratio'] = round(temp['tc_total_ratio'] + vl[8], 
2) - temp['tc_eligible'] = 'Yes' if temp['tc_self_ratio'] > 0 or temp['tc_total_ratio'] > 0 else 'No' - temp['tc_self_ratio'] = 0 if temp['device_self_duration'] == 0 \ - else round(temp['tc_self_ratio'] / temp['device_self_duration'] * 100, 2) - temp['tc_total_ratio'] = 0 if temp['device_total_duration'] == 0 \ - else round(temp['tc_total_ratio'] / temp['device_total_duration'] * 100, 2) - return temp - def _get_memory_event(self, peak_memory_events: dict): display_columns = ('Name', 'Size(KB)', 'Allocation Time(us)', 'Release Time(us)', 'Duration(us)') path = self.profile_data.memory_operator_path @@ -430,10 +484,16 @@ class RunGenerator(object): 'columns': [], 'rows': {} } - datas = RunGenerator._get_csv_data(path) + datas = RunGenerator.get_csv_data(path) + if len(datas) < 1: + return { + 'operator': table, + 'component': peak_memory_events + } + device_type_form_idx = -1 for idx, column in enumerate(datas[0]): if column == 'Device Type': - self.device_type_form_idx = idx + device_type_form_idx = idx if column in display_columns: if column == 'Name': table['columns'].append({'name': column, 'type': 'string'}) @@ -444,20 +504,22 @@ class RunGenerator(object): table['columns'].append({'name': column.replace('(us)', '(ms)'), 'type': 'number'}) required_column_idxs = {key: -1 for key in display_columns} (name_idx, size_idx, allocation_idx, release_idx, duration_idx), column_exist_count = \ - RunGenerator._check_csv_columns(datas[0], required_column_idxs) - if column_exist_count < len(required_column_idxs): - logger.error('Required column is missing in file "operator_memory.csv"') + RunGenerator.check_csv_columns(datas[0], required_column_idxs) + if device_type_form_idx < 0 or column_exist_count < len(required_column_idxs): + raise ValueError('Required column is missing in file "operator_memory.csv"') for idx, ls in enumerate(datas[1:]): - device_type = ls[self.device_type_form_idx] + device_type = ls[device_type_form_idx] # convert time metric 'us' to 'ms' # some 
operators may not have the following columns try: - nums = [ls[name_idx] if ls[name_idx] else '', abs(float(ls[size_idx])), + nums = [ + ls[name_idx] if ls[name_idx] else '', abs(float(ls[size_idx])), round((float(ls[allocation_idx]) - self.profile_data.profiler_start_ts) / 1000, 3) if ls[ allocation_idx] else None, round((float(ls[release_idx]) - self.profile_data.profiler_start_ts) / 1000, 3) if ls[ release_idx] else None, - round(float(ls[duration_idx]) / 1000, 3) if ls[duration_idx] else None] + round(float(ls[duration_idx]) / 1000, 3) if ls[duration_idx] else None + ] display_datas[device_type].append(nums) except ValueError: logger.error(f'File "{path}" has wrong data format in row {idx + 2} and will skip it.') @@ -474,8 +536,8 @@ class RunGenerator(object): time_metric: str = 'ms' memory_metric: str = 'MB' cano = Canonicalizer(time_metric, memory_metric) - process_devices_type, process_peaks = RunGenerator._get_process_peaks_and_devices_type(self.process_data, - memory_metric) + process_devices_type, process_peaks = RunGenerator.get_process_peaks_and_devices_type(self.process_data, + memory_metric) total_result = { 'metadata': { 'devices': process_devices_type, @@ -502,8 +564,8 @@ class RunGenerator(object): if len(total_result['columns'][device]) > 0: total_result['columns'][device].insert(0, {'name': f'Time ({cano.time_metric})', 'type': 'number', 'tooltip': 'Time since profiler starts.'}) - pta_ge_devices_type, pta_ge_peaks = RunGenerator._get_pta_ge_peaks_and_devices_type(self.component_curve_data, - memory_metric) + pta_ge_devices_type, pta_ge_peaks = RunGenerator.get_pta_ge_peaks_and_devices_type(self.component_curve_data, + memory_metric) component_curve_result = { 'metadata': { 'devices': pta_ge_devices_type, @@ -547,48 +609,11 @@ class RunGenerator(object): 'ptaGe': component_curve_result } - @staticmethod - def _get_process_peaks_and_devices_type(process_data: dict, memory_metric: str): - devices_type = [] - peaks = {} - for device in process_data: 
- devices_type.append(device) - reserved_list = process_data.get(device).get('Allocated') - if reserved_list is not None: - max_reserved = 0 - for array_value in reserved_list: - max_reserved = max(array_value[1], max_reserved) - peaks[device] = f'Peak Memory Usage: {max_reserved:.1f}{memory_metric}' - return devices_type, peaks - - @staticmethod - def _get_pta_ge_peaks_and_devices_type(process_data: dict, memory_metric: str): - devices_type = [] - peaks = {} - for device in process_data: - devices_type.append(device) - peaks[device] = 'Reserved Peak Memory Usage:' - for component in process_data.get(device): - max_reserved = 0 - for array_value in process_data.get(device).get(component): - max_reserved = max(array_value[2], max_reserved) - peaks[device] += f' {component}-{max_reserved:.1f}{memory_metric} |' - return devices_type, peaks - - @staticmethod - def _check_csv_columns(columns: list, column_idxs: dict): - column_exist_count = 0 - for idx, column in enumerate(columns): - if column in column_idxs: - column_idxs[column] = idx - column_exist_count += 1 - return column_idxs.values(), column_exist_count - def _handle_memory_data(self): process_data = defaultdict() pta_or_ge_data = defaultdict() path = self.profile_data.memory_curve_path - datas = RunGenerator._get_csv_data(path) + datas = RunGenerator.get_csv_data(path) required_column_idxs = { 'Component': -1, 'Device Type': -1, @@ -597,7 +622,7 @@ class RunGenerator(object): 'Total Allocated(MB)': -1 } (tag_type_idx, device_type_idx, time_idx, reserved_idx, allocated_idx), column_exist_count = \ - RunGenerator._check_csv_columns(datas[0], required_column_idxs) + RunGenerator.check_csv_columns(datas[0], required_column_idxs) if column_exist_count < len(required_column_idxs): logger.error('Required column is missing in file "memory_record.csv"') else: @@ -615,8 +640,10 @@ class RunGenerator(object): pta_or_ge_data.setdefault(device_type, {}).setdefault(ls[tag_type_idx], []).append( line_chart_data) elif 
ls[tag_type_idx] in ('PTA', 'GE'): - line_chart_data = [time_column, round(float(ls[allocated_idx]), 3), - round(float(ls[reserved_idx]), 3)] + line_chart_data = [ + time_column, round(float(ls[allocated_idx]), 3), + round(float(ls[reserved_idx]), 3) + ] pta_or_ge_data.setdefault(device_type, {}).setdefault(ls[tag_type_idx], []).append( line_chart_data) except ValueError: @@ -636,7 +663,7 @@ class RunGenerator(object): } peak_memory_rows = defaultdict(list) path = self.profile_data.memory_component_path - component_datas = RunGenerator._get_csv_data(path) + component_datas = RunGenerator.get_csv_data(path) if component_datas: required_column_idxs = { 'Component': -1, @@ -645,7 +672,7 @@ class RunGenerator(object): 'Device': -1 } (tag_type_idx, time_idx, reserved_idx, device_type_idx), column_exist_count = \ - RunGenerator._check_csv_columns(component_datas[0], required_column_idxs) + RunGenerator.check_csv_columns(component_datas[0], required_column_idxs) if column_exist_count < len(required_column_idxs): logger.error(f'Required column is missing in file "{path}"') else: @@ -691,14 +718,16 @@ class RunGenerator(object): '{}: {}us
' 'Percentage: {}%' '') - percentage = round(100 * part_cost / costs.costs[ProfileRole.Total], 2) + percentage = 0.0 if costs.costs[ProfileRole.Total] == 0 else round( + 100 * part_cost / costs.costs[ProfileRole.Total], 2) return format_str.format(step_name, costs.costs[ProfileRole.Total], part_name, part_cost, percentage) def build_avg_cost_dict(part_name: str, part_cost: float): + profiler_total_cost = self.profile_data.avg_costs.costs[ProfileRole.Total] cost_dict = {'name': part_name, 'description': '', 'value': round(part_cost), - 'extra': round(100 * part_cost / self.profile_data.avg_costs.costs[ProfileRole.Total], 2)} + 'extra': 0.0 if profiler_total_cost == 0 else round(100 * part_cost / profiler_total_cost, 2)} return cost_dict show_gpu = (self.profile_data.has_runtime @@ -717,8 +746,7 @@ class RunGenerator(object): data['steps']['columns'].extend(['DataLoader', 'CPU Exec', 'Other']) data['steps']['rows'] = [] - for i in range(len(self.profile_data.steps_costs)): - costs = self.profile_data.steps_costs[i] + for i, costs in enumerate(self.profile_data.steps_costs): step_name = self.profile_data.steps_names[i] row = [{'value': step_name}] if show_gpu: @@ -763,9 +791,11 @@ class RunGenerator(object): build_avg_cost_dict('Other', self.profile_data.avg_costs.costs[ProfileRole.Other]) ]) - data['performance'] = [{'name': 'Average Step Time', 'description': '', + data['performance'] = [ + {'name': 'Average Step Time', 'description': '', 'value': round(self.profile_data.avg_costs.costs[ProfileRole.Total]), - 'extra': 100, 'children': avg_costs}] + 'extra': 100, 'children': avg_costs} + ] if len(self.profile_data.recommendations) == 0: html = '<li>N/A</li>' @@ -915,7 +945,8 @@ class RunGenerator(object): }, 'data': table } - table['columns'] = [{'type': 'string', 'name': 'Name'}, + table['columns'] = [ + {'type': 'string', 'name': 'Name'}, {'type': 'string', 'name': 'Operator'}, {'type': 'string', 'name': 'Grid'}, {'type': 'string', 'name': 'Block'}, @@ -924,7 +955,8 @@ class RunGenerator(object): {'type': 'string', 'name': 'Kernel Uses Tensor Cores', 'tooltip': consts.TOOLTIP_KERNEL_USES_TC}, {'type': 'string', 'name': 'Op is Tensor Cores eligible', - 'tooltip': consts.TOOLTIP_KERNEL_OP_TC_ELIGIBLE}] + 'tooltip': consts.TOOLTIP_KERNEL_OP_TC_ELIGIBLE} + ] col_names = ['Calls', 'Total Duration (us)', 'Mean Duration (us)', 'Max Duration (us)', 'Min Duration (us)'] for column in col_names: table['columns'].append({'type': 'number', 'name': column}) @@ -935,14 +967,16 @@ class RunGenerator(object): kernel_list: List[KernelAggByNameOp] = sorted( self.profile_data.kernel_list_groupby_name_op, key=lambda x: x.total_duration, reverse=True) for agg_by_name_op in kernel_list: - kernel_op_row = [agg_by_name_op.name, agg_by_name_op.op_name, + kernel_op_row = [ + agg_by_name_op.name, agg_by_name_op.op_name, str(agg_by_name_op.grid), str(agg_by_name_op.block), str(agg_by_name_op.regs_per_thread or '0'), str(agg_by_name_op.shared_memory or '0'), 'Yes' if agg_by_name_op.tc_used else 'No', 'Yes' if agg_by_name_op.op_tc_eligible else 'No', agg_by_name_op.calls, agg_by_name_op.total_duration, round(agg_by_name_op.avg_duration), - agg_by_name_op.max_duration, agg_by_name_op.min_duration] + agg_by_name_op.max_duration, agg_by_name_op.min_duration + ] if self.profile_data.gpu_metrics_parser.has_blocks_per_sm: kernel_op_row.append(round(agg_by_name_op.avg_blocks_per_sm, 2)) if self.profile_data.gpu_metrics_parser.has_occupancy: @@ -965,9 +999,11 @@ class RunGenerator(object): }, 'data': table } - table['columns'] = [{'type': 'string', 'name': 'Name'}, + table['columns'] = [ + {'type': 'string', 'name': 'Name'}, {'type': 'string', 
'name': 'Tensor Cores Used', - 'tooltip': consts.TOOLTIP_KERNEL_USES_TC}] + 'tooltip': consts.TOOLTIP_KERNEL_USES_TC} + ] columns = ['count', 'sum', 'mean', 'max', 'min'] round_digits = [0, 0, 0, 0, 0] if self.profile_data.gpu_metrics_parser.has_blocks_per_sm: @@ -1011,7 +1047,8 @@ class RunGenerator(object): {'type': 'number', 'name': 'Total Durations(us)'}, {'type': 'number', 'name': 'Min Durations(us)'}, {'type': 'number', 'name': 'Avg Durations(us)'}, - {'type': 'number', 'name': 'Max Durations(us)'}] + {'type': 'number', 'name': 'Max Durations(us)'} + ] table['rows'] = [] for key, value in self.statistic_data.items(): temp = [key] @@ -1037,14 +1074,14 @@ class RunGenerator(object): 'data': table } path = self.profile_data.kernel_file_path - datas = RunGenerator._get_csv_data(path) + datas = RunGenerator.get_csv_data(path) required_column_idxs = { 'Name': -1, 'Duration(us)': -1, 'Accelerator Core': -1 } (name_idx, duration_idx, core_type_idx), column_exist_count = \ - RunGenerator._check_csv_columns(datas[0], required_column_idxs) + RunGenerator.check_csv_columns(datas[0], required_column_idxs) if column_exist_count < 3: logger.error('Required column is missing in file "kernel_details.csv"') else: @@ -1058,16 +1095,6 @@ class RunGenerator(object): table['rows'] = datas[1:] return result - @staticmethod - def _get_csv_data(path: str): - if path is None: - return [] - datas = [] - with open(path, encoding='utf-8-sig') as f: - for row in csv.reader(f, skipinitialspace=True): - datas.append(row) - return datas - def _generate_tc_pie_npu(self): pie = {'columns': [{'type': 'string', 'name': 'name'}, {'type': 'number', 'name': 'value'}], 'rows': []} for key, val in self.accelerator_data.items(): @@ -1076,7 +1103,7 @@ class RunGenerator(object): return data @staticmethod - def _get_gpu_info(device_props, gpu_id): + def get_gpu_info(device_props, gpu_id): if (device_props is None) or (gpu_id >= len(device_props)) or (gpu_id < 0): return None @@ -1117,12 +1144,17 @@ 
class RunGenerator(object): self.accelerator_data[call_type] = call_duration if self.statistic_data.get(call_name) is not None: - temp = self.statistic_data[call_name] - temp['Max'] = max(temp['Max'], call_duration) - temp['Min'] = min(temp['Min'], call_duration) - temp['Total'] = round(temp['Total'] + call_duration, 2) - temp['Calls'] += 1 - temp['Average'] = round(temp['Total'] / temp['Calls'], 2) + temp = self.statistic_data.get(call_name, {}) + temp['Max'] = max(temp.get('Max', 0), call_duration) + temp['Min'] = min(temp.get('Min', 0), call_duration) + temp['Total'] = round(temp.get('Total', 0) + call_duration, 2) + temp['Calls'] = temp.get('Calls', 0) + 1 + if temp['Calls'] == 0: + logger.error( + f'temp["Calls"] is zero which can not be divisor.') + temp['Average'] = 0 + else: + temp['Average'] = round(temp['Total'] / temp['Calls'], 2) else: self.statistic_data[call_name] = { 'Calls': 1, @@ -1172,7 +1204,7 @@ class DistributedRunGenerator(object): process_id = 'Process ' + str(process_id) result[node][process_id] = OrderedDict() for used_device in data.used_devices: - gpu_info = RunGenerator._get_gpu_info(data.device_props, used_device) + gpu_info = RunGenerator.get_gpu_info(data.device_props, used_device) if gpu_info is not None: result[node][process_id]['GPU' + str(used_device)] = gpu_info @@ -1223,7 +1255,9 @@ class DistributedRunGenerator(object): round(costs.other, 3) ] steps_to_overlap['all'][data.worker] = [ - sum(x) for x in zip(steps_to_overlap['all'][data.worker], steps_to_overlap[step_name][data.worker])] + sum(x) + for x in zip(steps_to_overlap['all'][data.worker], steps_to_overlap[step_name][data.worker]) + ] @staticmethod def _get_npu_overlap_data(data, steps_to_overlap): @@ -1235,7 +1269,9 @@ class DistributedRunGenerator(object): steps_to_overlap[k][data.worker] = list( [round(v[0] - v[1], 3), round(v[1], 3), round(v[2], 3), round(v[3], 3)]) steps_to_overlap['all'][data.worker] = [ - sum(x) for x in zip(steps_to_overlap['all'][data.worker], 
steps_to_overlap[k][data.worker])] + sum(x) + for x in zip(steps_to_overlap['all'][data.worker], steps_to_overlap[k][data.worker]) + ] @staticmethod def _get_npu_wait_data(data, steps_to_wait): @@ -1250,7 +1286,9 @@ class DistributedRunGenerator(object): wait = round(v.get('Synchronize') * 1000, 3) # 1ms = 1000us steps_to_wait[k][data.worker] = list([trans, wait]) steps_to_wait['all'][data.worker] = [ - sum(x) for x in zip(steps_to_wait['all'][data.worker], steps_to_wait[k][data.worker])] + sum(x) + for x in zip(steps_to_wait['all'][data.worker], steps_to_wait[k][data.worker]) + ] steps_to_wait['all'][data.worker] = [x / step_number for x in steps_to_wait['all'][data.worker]] @staticmethod @@ -1264,7 +1302,9 @@ class DistributedRunGenerator(object): round(comm_stats[0] - comm_stats[1], 3) ] steps_to_wait['all'][data.worker] = [ - sum(x) for x in zip(steps_to_wait['all'][data.worker], steps_to_wait[step][data.worker])] + sum(x) + for x in zip(steps_to_wait['all'][data.worker], steps_to_wait[step][data.worker]) + ] steps_to_wait['all'][data.worker] = [int(x / step_number) for x in steps_to_wait['all'][data.worker]] def _generate_wait_graph(self): @@ -1352,10 +1392,11 @@ class DistributedRunGenerator(object): op, stats[0], round(stats[1], 3), - round(stats[1] / stats[0] if stats != 0 else 0), + + round(stats[1] / stats[0] if stats[0] != 0 else 0), round(stats[2], 3), - round(stats[2] / stats[0] if stats != 0 else 0), + round(stats[2] / stats[0] if stats[0] != 0 else 0), round(stats[3], 3), - round(stats[3] / stats[0] if stats != 0 else 0) + round(stats[3] / stats[0] if stats[0] != 0 else 0) ] table['rows'].append(row) diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/tensor_core.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/tensor_core.py index 3a69cf70b881acc4588682fc4440cb5534541eb1..cc53ab217f0ee6f88817c51da6ba46da68df4e28 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/tensor_core.py 
+++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/tensor_core.py @@ -1,14 +1,13 @@ # ------------------------------------------------------------------------- # Copyright (c) Microsoft Corporation. All rights reserved. # ------------------------------------------------------------------------- -class TC_Allowlist_Meta(type): - # Enable grammar sugar as 'v in TC_Allowlist'. +class TcAllowlistMeta(type): + # Enable grammar sugar as 'v in TcAllowlist'. def __contains__(cls, item): return cls.__contains__(item) -class TC_Allowlist(metaclass=TC_Allowlist_Meta): - # Refer to https://github.com/NVIDIA/PyProf/blob/fd1b2902e3306119eee40ba6b6e8b2f816920c29/pyprof/prof/tc.py#L19 +class TcAllowlist(metaclass=TcAllowlistMeta): allowlist = ['h884', 's884', 'h1688', 's1688', 'hmma', 'i8816', '16816', 'dgrad_1x1_stride_2x2', 'first_layer_wgrad_kernel', 'conv1x1', 'conv2d_c1_k1', 'direct_group', 'xmma_implicit_gemm', @@ -24,8 +23,7 @@ class TC_Allowlist(metaclass=TC_Allowlist_Meta): return False -class TC_OP_Allowlist(metaclass=TC_Allowlist_Meta): - # Refer to https://github.com/pytorch/pytorch/blob/69b2bf70f9c0e591ce5e566afa59e19618031ead/aten/src/ATen/autocast_mode.cpp#L290-L351 # noqa: E501 +class TcOpAllowlist(metaclass=TcAllowlistMeta): allowlist = ['aten::_convolution', 'aten::conv1d', 'aten::conv2d', 'aten::conv3d', 'aten::conv_tbc', 'aten::conv_transpose1d', 'aten::conv_transpose2d', 'aten::conv_transpose3d', 'aten::convolution', 'aten::cudnn_convolution', 'aten::cudnn_convolution_transpose', diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/trace.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/trace.py index e76f8b18dd80a9f12a867c9395de6a96a39bc2c1..ea09f79666bd184956469f48fc7922854394940d 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/trace.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/profiler/trace.py @@ -1,13 +1,13 @@ # 
------------------------------------------------------------------------- # Copyright (c) Microsoft Corporation. All rights reserved. # -------------------------------------------------------------------------- +__all__ = ['EventTypes', 'create_event'] + from enum import IntEnum from typing import Dict, Optional from .. import utils -__all__ = ['EventTypes', 'create_event'] - logger = utils.get_logger() NcclOpNameSet = ['nccl:broadcast', 'nccl:reduce', 'nccl:all_reduce', 'nccl:all_gather', 'nccl:reduce_scatter'] @@ -56,8 +56,8 @@ EventTypeMap = { class BaseEvent(object): - def __init__(self, type, data): - self.type: str = type + def __init__(self, event_type, data): + self.type: str = event_type self.name: str = data.get('name') self.ts: int = data.get('ts') self.pid: int = data.get('pid') @@ -66,8 +66,8 @@ class BaseEvent(object): class DurationEvent(BaseEvent): - def __init__(self, type, data): - super().__init__(type, data) + def __init__(self, event_type, data): + super().__init__(event_type, data) self.category: str = data.get('cat', '') self.duration: int = data.get('dur') @@ -79,8 +79,8 @@ class DurationEvent(BaseEvent): class KernelEvent(DurationEvent): - def __init__(self, type, data): - super().__init__(type, data) + def __init__(self, event_type, data): + super().__init__(event_type, data) self.occupancy = self.args.get('est. 
achieved occupancy %') self.blocks_per_sm = self.args.get('blocks per SM') self.grid = self.args.get('grid') @@ -91,8 +91,8 @@ class KernelEvent(DurationEvent): class OperatorEvent(DurationEvent): - def __init__(self, type, data): - super().__init__(type, data) + def __init__(self, event_type, data): + super().__init__(event_type, data) self.callstack = self.args.get('Call stack') self.input_type = self.args.get('Input type') @@ -111,8 +111,8 @@ class ProfilerStepEvent(OperatorEvent): class MemoryEvent(BaseEvent): - def __init__(self, type, data): - super().__init__(type, data) + def __init__(self, event_type, data): + super().__init__(event_type, data) self.scope: str = data.get('s', '') self.device_id: int = self.args.get('Device Id') dtype = self.args.get('Device Type') @@ -142,8 +142,8 @@ class MemoryEvent(BaseEvent): class PythonFunctionEvent(DurationEvent): - def __init__(self, type, data): - super().__init__(type, data) + def __init__(self, event_type, data): + super().__init__(event_type, data) self.python_id: int = self.args.get('Python id') self.python_parent_id: int = self.args.get('Python parent id') diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/run.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/run.py index 2f719fb0c6139e498f51afdcad2497293e90ad1e..9e30f225244280df7acfd7d2deb95a40208cfa54 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/run.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/run.py @@ -77,7 +77,7 @@ class Run(object): if worker is not None: if self.span_view.get(worker) is None: return None - spans = self.span_view[worker] + spans = self.span_view.get(worker, []) else: spans = [s for _, s in self.profiles.keys()] diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/static/trace_embedding.html b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/static/trace_embedding.html index 
bb84da0d0c0cb92d51a2d6ab1cb92ce308b23241..462d2c395f81d932fbf0196ccc53f4b0ece6e93a 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/static/trace_embedding.html +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/static/trace_embedding.html @@ -11,7 +11,7 @@ found in the LICENSE file. 'use strict'; function onTraceViewerImportFail() { - document.addEventListener('DOMContentLoaded', function () { + document.addEventListener('DOMContentLoaded', () => { document.body.textContent = 'tracing/bin/trace_viewer_full.html is missing. ' + 'Run vulcanize_trace_viewer from $TRACE_VIEWER and reload.'; @@ -52,12 +52,11 @@ found in the LICENSE file. // warning. window.__hideTraceViewerPolyfillWarning = true; - window.addEventListener("message", event => { - const data = event.data || {} - console.log(data) - name = data.name || 'unknown' - onResult(data.data) - }) + window.addEventListener('message', event => { + const data = event.data || {}; + name = data.name || 'unknown'; + onResult(data.data); + }); function onResult(result) { model = new tr.Model(); @@ -78,7 +77,7 @@ found in the LICENSE file. overlay.visible = true; } - document.addEventListener('WebComponentsReady', function () { + document.addEventListener('WebComponentsReady', () => { const container = document.createElement('track-view-container'); container.id = 'track_view_container'; @@ -91,7 +90,7 @@ found in the LICENSE file. 
Polymer.dom(document.body).appendChild(viewer); if (window.parent) { - window.parent.postMessage({ msg: 'ready' }, '*') + window.parent.postMessage({ msg: 'ready' }, window.origin); } }); }()); diff --git a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/utils.py b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/utils.py index 8f4189d765e6e9233478d800ab2d1424597af254..5991cf2b33d1e818e6876c8d7550fbb6c87cdaa3 100644 --- a/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/utils.py +++ b/plugins/tensorboard-plugins/tb_plugin/torch_tb_profiler/utils.py @@ -23,14 +23,15 @@ import math import os import time from contextlib import contextmanager -from math import pow from . import consts +predefined_logging_level = ('CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG', 'NOTSET') + def get_logging_level(): log_level = os.environ.get('TORCH_PROFILER_LOG_LEVEL', 'INFO').upper() - if log_level not in logging._levelToName.values(): + if log_level not in predefined_logging_level: log_level = logging.getLevelName(logging.INFO) return log_level @@ -76,7 +77,6 @@ class Canonicalizer: input_time_metric='us', input_memory_metric='B'): # raw timestamp is in microsecond - # https://github.com/pytorch/pytorch/blob/v1.9.0/torch/csrc/autograd/profiler_kineto.cpp#L33 time_metric_to_factor = { 'us': 1, 'ms': 1e3, @@ -84,10 +84,10 @@ class Canonicalizer: } # raw memory is in bytes memory_metric_to_factor = { - 'B': pow(1024, 0), - 'KB': pow(1024, 1), - 'MB': pow(1024, 2), - 'GB': pow(1024, 3), + 'B': math.pow(1024, 0), + 'KB': math.pow(1024, 1), + 'MB': math.pow(1024, 2), + 'GB': math.pow(1024, 3), } # canonicalize the memory metric to a string @@ -125,7 +125,7 @@ class DisplayRounder: def __init__(self, ndigits): self.ndigits = ndigits - self.precision = pow(10, -ndigits) + self.precision = math.pow(10, -ndigits) def __call__(self, v: float): _v = abs(v) diff --git a/profiler/__init__.py b/profiler/__init__.py index 
de0604079e1323b2749bc801a6e8326893c73498..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 100644 --- a/profiler/__init__.py +++ b/profiler/__init__.py @@ -1,14 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. \ No newline at end of file diff --git a/profiler/affinity_cpu_bind/bind_core.py b/profiler/affinity_cpu_bind/bind_core.py index 0f27eb79a02d22a5cfc6ec7eb827f5c0cfb5a529..aeb0647d7dc9817e36b959d3e74e0ead0d3a02e4 100644 --- a/profiler/affinity_cpu_bind/bind_core.py +++ b/profiler/affinity_cpu_bind/bind_core.py @@ -203,6 +203,9 @@ class BindCoreManager(): logging.info('NPU total id list: %s', self.npu_id_list) def _get_npu_affinity(self) -> bool: + if not self.npu_id_list: + logging.error('NPU id list is empty') + return False cpu_num = os.cpu_count() cpu_num_for_each_npu = cpu_num // len(self.npu_id_list) get_npu_topo_cmd = 'npu-smi info -t topo' diff --git a/profiler/example/mstx_torch_plugin/README.md b/profiler/example/mstx_torch_plugin/README.md index 8f140ce17c176b64a6651350cb62621cce7c16b7..f6c5b071150965d856f3be70f231ff603889820b 100644 --- a/profiler/example/mstx_torch_plugin/README.md +++ b/profiler/example/mstx_torch_plugin/README.md @@ -2,23 +2,46 @@ Ascend Pytorch 
Profiler's [collect and parse msprof_tx data](https://www.hiascend.com/document/detail/zh/canncommercial/80RC3/devaids/devtools/profiling/atlasprofiling_16_0033.html#ZH-CN_TOPIC_0000002081898541__section5940122172516) feature already includes built-in markers for communication operators. To let users obtain timing data for more key stages without modifying business code, mstx_torch_plugin adds built-in markers to Ascend Pytorch Profiler for the functions of four key stages: **dataloader**, **forward**, **step**, and **save_checkpoint**. -**Constraints** +## Constraints PyTorch graph mode is not currently supported. -**Usage guide** +## Usage guide 1. Download the mstx_torch_plugin whl package. - whl package link: [mstx_torch_plugin](https://ptdbg.obs.myhuaweicloud.com/profiler/example/1.0/mstx_torch_plugin-1.0-py3-none-any.whl) + Download the mstx_torch_plugin whl package from the link in the table below. -2. Install mstx_torch_plugin + | Version | Release date | Download link | Checksum | + | ---- | ---------- | ------------------------------------------------------------ | ------------------------------------------------------------ | + | 1.0 | 2024-12-19 | [mstx_torch_plugin-1.0-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/example/1.0/mstx_torch_plugin-1.0-py3-none-any.whl) | 8b3500245ac0ea63f2ada832b1cc67ca8923a86d6081b165a8f62da0a276cbaa | + +2. Verify the whl package. + + 1. Download the whl package to the Linux installation environment using the download link above. + + 2. Go to the directory containing the whl package and run the following command. + + ``` + sha256sum {name}.whl + ``` + + {name} is the name of the whl package. + + If the output shows a **checksum** that matches the one listed for the corresponding whl package version, the correct performance tool whl installation package was downloaded. For example: + + ``` + sha256sum mstx_torch_plugin-1.0-py3-none-any.whl + xx *mstx_torch_plugin-1.0-py3-none-any.whl + ``` + +3. Install mstx_torch_plugin ```bash pip install mstx_torch_plugin-1.0-py3-none-any.whl ``` -3. Import the whl package in the AI task execution script. +4. Import the whl package in the AI task execution script. Make sure the import comes after import torch and import torch_npu: @@ -29,7 +52,7 @@ Ascend Pytorch Profiler's [collect and parse msprof_tx data](https://www.hiasce import mstx_torch_plugin ``` -4. Enable torch_npu.profiler and collect the marker data. +5. 
Enable torch_npu.profiler and collect the marker data. Turn on the msprof_tx switch; set profiler_level to whichever level the collection requires: @@ -57,7 +80,7 @@ Ascend Pytorch Profiler's [collect and parse msprof_tx data](https://www.hiasce prof.step() ``` -**Collection results** +## Collection results Open the collected performance data with the MindStudio Insight tool; the visualization looks as follows: diff --git a/profiler/example/mstx_torch_plugin/mstx_torch_plugin.py b/profiler/example/mstx_torch_plugin/mstx_torch_plugin.py index f6b25db7cf48cb5bcd2be687251c2499ecb30965..fbd77a670dd0a59c211e18637b7bc03c7ff0a42a 100644 --- a/profiler/example/mstx_torch_plugin/mstx_torch_plugin.py +++ b/profiler/example/mstx_torch_plugin/mstx_torch_plugin.py @@ -12,10 +12,12 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +from datetime import datetime import os import functools +import re +import site import torch -import torch_npu from torch.nn import Module from torch.utils.data import DataLoader from torch.optim.optimizer import register_optimizer_step_post_hook @@ -28,6 +30,23 @@ original_multinext = torch.utils.data.dataloader._MultiProcessingDataLoaderIter. 
origin_patch_step_function = torch.optim.Optimizer._patch_step_function +def _print_warn_msg(message: str): + time_str = datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S") + print(f"{time_str} [WARNING] [{os.getpid()}] mstx_torch_plugin.py: {message}") + + +def _check_directory_path_readable(path): + if not os.path.exists(path): + msg = f"The path dose not exist: {path}" + raise RuntimeError(msg) + if os.path.islink(path): + msg = f"Invalid path is a soft chain: {path}" + raise RuntimeError(msg) + if not os.access(path, os.R_OK): + msg = f"The path permission check failed: {path}" + raise RuntimeError(msg) + + class MstxState: def __init__(self): self.module_dict = {} @@ -144,9 +163,59 @@ def _custom_step(optimizer: torch.optim.Optimizer): mstx_state.last_optimizer_id = id(optimizer) +def _get_torch_npu_version_str(): + torch_npu_version_str = "" + site_packages = site.getsitepackages() + if site_packages and site_packages[0]: + path = site_packages[0] + version_path = os.path.join(path, "torch_npu", "version.py") + _check_directory_path_readable(version_path) + # example version info: "__version__ = '2.1.0.post11.xxxxxx'" + try: + with open(version_path, "r") as f: + for line in f: + if line.find("__version__") != -1: + torch_npu_version_str = line.strip().split("=")[-1][2:-1] + break + except Exception as e: + _print_warn_msg(f"Failed to open {version_path} to get torch npu version.") + return torch_npu_version_str + + +def _get_torch_npu_info(version_str: str): + # version info example: "2.1.0.post11.xxxxxx" + match = re.search(r"^(\d+\.\d+\.\d+)\.post(\d+)", version_str) + if match and len(match.groups()) == 2: + return match.group(1), match.group(2) + else: + return '', '' + + +def _check_pta_support_patch(): + pta_support_patch_version = { + "2.1.0": 10, + "2.3.1": 4, + "2.4.0": 2, + } + torch_npu_version_str = _get_torch_npu_version_str() + if not torch_npu_version_str: + _print_warn_msg("Failed to get torch_npu version info.") + return False + torch_branch, 
torch_npu_version = _get_torch_npu_info(torch_npu_version_str) + if not torch_branch or not torch_npu_version or not torch_npu_version.isdigit(): + _print_warn_msg("Failed to get valid torch branch or torch_npu version.") + return False + for branch, post_version in pta_support_patch_version.items(): + if torch_branch == branch and int(torch_npu_version) <= post_version: + return False + return True + + def apply_mstx_patch(): + pta_support_patch = _check_pta_support_patch() Module.__call__ = _custom_forward_call - DataLoader.__iter__ = _custom_dataloader_iter - torch.serialization.save = _custom_save(original_save) + if not pta_support_patch: + DataLoader.__iter__ = _custom_dataloader_iter + torch.serialization.save = _custom_save(original_save) torch.optim.Optimizer._patch_step_function = _custom_step register_optimizer_step_post_hook(_step_hook) diff --git a/profiler/merge_profiling_timeline/README.md b/profiler/merge_profiling_timeline/README.md deleted file mode 100644 index 24db91adee88d74bff99117189e70a6ad632ddd3..0000000000000000000000000000000000000000 --- a/profiler/merge_profiling_timeline/README.md +++ /dev/null @@ -1,115 +0,0 @@ -# Large JSON merge tool - -merge_profiling_timeline (the large JSON merge tool) merges Profiling timeline data; it can merge the timelines of specified ranks and merge specified items within a timeline. - - -## Merging multiple timelines - -### Collecting performance data - -Collect performance data with Ascend PyTorch Profiler or the E2E performance collection tool. E2E profiling will be deprecated and is not recommended. For how to collect with Ascend PyTorch Profiler, see [Profiling data collection](https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze). Copy the collected performance data from all nodes into a single directory in the current environment; the following assumes the data is under /home/test/cann_profiling. - -Example directory structure of E2E Profiling data: - -```bash -|- cann_profiling - |- PROF_*** - |- timeline - |- msprof.json - |- device_* - |- info.json.* - ... - |- PROF_*** - ... 
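-``` - -Example directory structure of Ascend PyTorch Profiler data: - -```bash -|- ascend_pytorch_profiling - |- **_ascend_pt - |- ASCEND_PROFILER_OUTPUT - |- trace_view.json - |- FRAMEWORK - |- PROF_*** - |- **_ascend_pt -``` - -### Parameters - -| Parameter | Description | Required | -| -------- | ------------------------------------------------------------ | -------- | -| -i | Path to the Profiling data directory. | Yes | -| --type | Timeline merge scenario. Allowed values: `pytorch` (profiling data collected with Ascend PyTorch Profiler; merges the trace_view.json of all cards), `e2e` (Profiling data collected with E2E Profiling; merges the overall timeline first, and if none was generated, merges the msprof_*.json files under the device directories), `custom` (user-selected timeline data to merge; see **Usage examples**). | Yes | -| -o | Output path of the merged timeline file (a file name may be given at the end of the path; see **Usage examples**). If this parameter is not set, the file is written to the current directory (default file name merged.json). | No | -| --rank | Rank IDs whose timelines are to be merged. By default all ranks are merged. | No | -| --items | Profiling data items to merge, including: python, Ascend Hardware, CANN, HCCL, PTA, Overlap Analysis. By default all items are merged. | No | - -### Usage examples - -1. Merge the timelines of a single machine with multiple cards; all cards and all data items are merged by default, and first.json is generated under path/to/cann_profiling/output/ - - ```bash - python3 main.py -i path/to/cann_profiling/ -o path/to/cann_profiling/output/first --type pytorch - ``` - -2. Merge the timelines of a single machine with multiple cards; all cards and all data items are merged by default, and when -o is not set, merge.json is generated in the current directory - - ```bash - python3 main.py -i path/to/cann_profiling/ --type pytorch - ``` - -3. Merge the timelines of a single machine with multiple cards, merging only card 0 and card 1 - - ```bash - python3 main.py -i path/to/cann_profiling/ -o path/to/cann_profiling/output/2p --type pytorch --rank 0,1 - ``` - -4. Merge the timelines of a single machine with multiple cards, merging the CANN layer and Ascend_Hardware layer data of all cards - - ```bash - python3 main.py -i path/to/cann_profiling/ --type pytorch --items "CANN,Ascend Hardware" - ``` - -5. Merge multiple timelines (custom) - - When the scenarios above do not apply, use the custom merge mode and place the timeline files to be merged in the same directory. (Note: this scenario is a special case and differs from a normal merge. The rank_id cannot be read directly from info.json, so the rank_id here is a sequence number assigned by default; it distinguishes the same layer across different files and does not represent the actual rank_id.) - Example data directory structure: - - ```bash - |- timeline - |- msprof_0.json - |- msprof_1.json - |- msprof_2.json - |- hccl_3.json - |- hccl_4.json - ... 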
- ``` - - Merge all timelines with the command below; the -o, --rank, and --items parameters are supported here as well. - - ```bash - python3 main.py -i path/to/timeline/ -o path/to/timeline/xxx --type custom - ``` - - Viewing the merged timeline: xxx.json in the directory given by -o (merged.json in the current directory when -o is not set) is the merged file. - - -## Viewing very large timeline files - -[Download the whl](https://gitee.com/aerfaliang/trace_processor/releases/download/trace_processor_37.0/trace_processor-37.0-py3-none-any.whl) package and install it with the following command (Windows): - -```bash -pip3 install trace_processor-37.0-py3-none-any.whl -``` - -After installation, run the following command directly: - -```bash -python -m trace_processor --httpd path/to/xxx_merged.json -``` - -Wait for loading to finish, refresh the [perfetto](https://ui.perfetto.dev/) page, click Use old version regardless, then click `YES, use loaded trace` to display the timeline (use W to zoom in, S to zoom out, A to move left, and D to move right when viewing the timeline file). - -![Image description](perfetto使用指导截图1.png) -![Image description](perfetto使用指导截图2.png) \ No newline at end of file diff --git a/profiler/merge_profiling_timeline/main.py b/profiler/merge_profiling_timeline/main.py deleted file mode 100644 index 722457812b8c039317cbf541d26767ee2bb91361..0000000000000000000000000000000000000000 --- a/profiler/merge_profiling_timeline/main.py +++ /dev/null @@ -1,237 +0,0 @@ -#! /usr/bin/python3 -# Copyright 2023 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import json -import os -import re - -from functools import partial -from argparse import ArgumentParser -from decimal import Decimal - - -FILTER_DIRS = [".profiler", "HCCL_PROF", "timeline", "query", 'sqlite', 'log'] -RANK_ID_POS = 1000 - -def get_path_dir(path: str) -> list: - """ - check result path exist JOB dir - path : result path - """ - path_dir_filter = filter(partial(_path_dir_filter_func, root_dir=path), os.listdir(path)) - sub_dirs = list(path_dir_filter) - return sub_dirs - - -def _path_dir_filter_func(sub_path, root_dir): - return sub_path not in FILTER_DIRS and os.path.isdir(os.path.realpath(os.path.join(root_dir, sub_path))) - - -def natural_sort(files): - def convert(text): - return int(text) if text.isdigit() else text.lower() - - def alphanum_key(key): - return [convert(c) for c in re.split('([0-9]+)', key)] - - return sorted(files, key=alphanum_key) - - -def get_timeline_info(args, prof_dirs): - timeline_info = {} - - for prof in prof_dirs: - pro_path = os.path.join(args.input, prof) - - # Read rank_id from info.json - rank_id = get_rank_id_from_info_json(pro_path) - if rank_id is None: - print(f"WARN, There is not rank id info in {pro_path}") - continue - - timeline_path = get_timeline_path(pro_path, args.type) - - if os.path.exists(timeline_path): - timeline_info[rank_id] = timeline_path - else: - print(f"WARN, The file \"{timeline_path}\" does not exist.") - return timeline_info - - -def get_timeline_path(pro_path, type): - for root, dirs, files in os.walk(pro_path): - for dir_ in dirs: - if 'ASCEND_PROFILER_OUTPUT' == dir_ and type == 'pytorch': - timeline_path = os.path.realpath(os.path.join(root, dir_, 'trace_view.json')) - return timeline_path - - for file_ in sorted(files, reverse=True): - if 'msprof' in file_: - timeline_path = os.path.join(root, file_) - return timeline_path - return None - -def get_rank_id_from_info_json(pro_path): - info_json = "" - rank_id = None - for root, _, files in os.walk(pro_path): - for file in files: - if 
"info.json." in file and ".done" not in file: - info_json = os.path.join(root, file) - break - - if info_json: - if os.path.islink(info_json): - print(f"The file: \"{info_json}\" is link. Please check the path.") - return None - try: - with open(info_json, "r+") as f: - info = json.load(f) - rank_id = info.get("rank_id") - except Exception as err: - print("[ERROR] %s" % err) - return None - return rank_id - - -def merge_timeline_general(args): - """合并e2e profiling生成的msprof*.json""" - if not os.path.isdir(args.input): - print(f"No such file or directory: \"{args.input}\". Please check the path.") - return - prof_dir = get_path_dir(args.input) - if not prof_dir: - message = f"The path \"{args.input}\" does not have PROF dir. Please check the path." - print(message) - return - timeline_info = get_timeline_info(args, prof_dir) - timeline_files_dict = {} - - # 合并部分profiling items - process_list = args.items.split(",") if args.items else None - - # 合并部分rank - if args.rank: - rank_ids = [int(rank_id) for rank_id in args.rank.split(",")] - else: - rank_ids = list(timeline_info.keys()) - - for rank_id in rank_ids: - if not timeline_info.get(rank_id): - print(f"main.py: error rank_id '{rank_id}' ") - return - timeline_files_dict[rank_id] = timeline_info.get(rank_id) - merge_timeline_events(timeline_files_dict, process_list) - - -def merge_timeline_custom(args): - """合并指定目录里所有timeline文件""" - timeline_files = natural_sort(os.listdir(args.input)) - timeline_files_dict = {} - for idx, timeline_file in enumerate(timeline_files): - timeline_files_dict[idx] = os.path.join(args.input, timeline_file) - # 合并部分profiling items - process_list = args.items.split(",") if args.items else None - merge_timeline_events(timeline_files_dict, process_list) - - -def merge_timeline_events(timeline_file_dict, process_list): - """ - 输入需要合并的timeline文件路径及对应的rank_id/id、需要合并的process_list - 输出合并timeline - """ - new_events = [] - for rank_id, timeline_path in timeline_file_dict.items(): - node = rank_id // 
8 - print("rank id: ", rank_id, "timeline file: ", timeline_path) - if os.path.islink(timeline_path): - print(f"The file: \"{timeline_path}\" is link. Please check the path.") - return - try: - with open(timeline_path, 'r+') as f: - cur_events = json.load(f) - except Exception as err: - print("[ERROR] %s" % err) - return - - proc_pid_dict = {} - for event in cur_events: - if event.get("name") == "process_name" and event.get("ph") == "M": - if event.get("args"): - proc_pid_dict[event["args"].get("name")] = event.get("pid") - process_list_tmp = process_list if process_list else list(proc_pid_dict.keys()) - # Collect the pids of the items to be merged - merged_pids = set() - for pro in process_list_tmp: - if pro not in proc_pid_dict.keys(): - print(f"main.py: error argument --items: invalid choice: '{pro}' (choose from {list(proc_pid_dict.keys())})") - return - merged_pids.add(proc_pid_dict.get(pro)) - - for event in cur_events: - - # Merge only the specified data items - if merged_pids and event.get('pid') not in merged_pids: - continue - - # convert tid to int - if not isinstance(event['tid'], int): - print(f"[WARNNING] {event['tid']} is not int type") - - # Append rank_id to the process name to distinguish ranks - if event.get("name") == "process_name" and event.get("ph") == "M": - if event.get("args") is not None and event["args"].get("name") is not None: - event["args"]["name"] = event["args"]["name"] + f"_{rank_id}" - - #modify connect id - if event.get('id') and (event.get('ph') == 's' or event.get('ph') == 'f'): - event['id'] = float(event.get('id')) * RANK_ID_POS + rank_id - - new_events.append(event) - out_path = f"{args.output}.json" - if os.path.islink(out_path): - print(f"The file: \"{out_path}\" is link. 
Please check the path.") - return - if os.path.exists(out_path): - print(f"File {out_path} existed before and is now overwritten.") - os.remove(out_path) - try: - # 设置文件权限为640,安全考虑 - with os.fdopen(os.open(out_path, os.O_WRONLY | os.O_CREAT, 0o640), 'w') as f: - json.dump(new_events, f) - except FileNotFoundError: - print(f"Param -o (output path) is not exists, please check it.") - return - print(f"timeline merged output path: {out_path}") - - -def parse_args(): - parser = ArgumentParser(description="Merge timeline for multi card") - parser.add_argument("-i", "--input", default=None, help="root dir of PROF_* data") - parser.add_argument("-o", "--output", default="./merged", help="save path of merged.json ") - parser.add_argument("--rank", default=None, help="List of ranks to be merged. By default, all ranks are merged") - parser.add_argument("--items", default=None, help="Specify the data items (python,CANN,Ascend Hardware,HCCL,..)to be merged. in the timeline.") - parser.add_argument("--type", choices=('pytorch', 'e2e', 'custom'), help="Customize the timeline file to be merged.") - arg = parser.parse_args() - return arg - - -if __name__ == "__main__": - args = parse_args() - print("========================== start merge timeline ====================") - if args.type == "custom": - merge_timeline_custom(args) - else: - merge_timeline_general(args) \ No newline at end of file diff --git "a/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2761.png" "b/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2761.png" deleted file mode 100644 index beef396ce2996c25ecd74298285ccab5011ddea1..0000000000000000000000000000000000000000 Binary files "a/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2761.png" and /dev/null differ diff --git 
"a/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2762.png" "b/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2762.png" deleted file mode 100644 index 48793f136e48f21f618ff3cb13bdcc3388f76930..0000000000000000000000000000000000000000 Binary files "a/profiler/merge_profiling_timeline/perfetto\344\275\277\347\224\250\346\214\207\345\257\274\346\210\252\345\233\2762.png" and /dev/null differ diff --git a/profiler/msprof_analyze/MANIFEST.in b/profiler/msprof_analyze/MANIFEST.in index df1488cce957db8d6135caf1e65e834103fe92ed..b4d096405c98ea1a906b8882418362d428cbf1b6 100644 --- a/profiler/msprof_analyze/MANIFEST.in +++ b/profiler/msprof_analyze/MANIFEST.in @@ -3,6 +3,5 @@ recursive-include msprof_analyze/cli/ * recursive-include msprof_analyze/prof_common/ * recursive-include msprof_analyze/compare_tools/ * recursive-include msprof_analyze/cluster_analyse/ * -recursive-include msprof_analyze/precheck/ * global-exclude */__pycache__/* global-exclude *.pyc diff --git a/profiler/msprof_analyze/OWNERS b/profiler/msprof_analyze/OWNERS index 7524470824c5552b570c09cc231e74811a15adf7..064f975772fc0ab035ceeb26254142a0602468b5 100644 --- a/profiler/msprof_analyze/OWNERS +++ b/profiler/msprof_analyze/OWNERS @@ -5,8 +5,6 @@ approvers: - aerfaliang - chenhao_1209 - feng123www -- sunboquan reviewers: -- sunboquan - Seanesmhxocism - wjchuee diff --git a/profiler/msprof_analyze/README.md b/profiler/msprof_analyze/README.md index c3be2acd6ef1a33c629a07cba10be953036cfefd..4aa658d71c700028a73e892a97015bb31f319ed8 100644 --- a/profiler/msprof_analyze/README.md +++ b/profiler/msprof_analyze/README.md @@ -117,6 +117,8 @@ Successfully installed msprof-analyze-{version} | profiler版本 | 发布日期 | 下载链接 | 校验码 | |------------|------------|-------------------------------------------------------------------------------------------------------------------------------------------| 
------------------------------------------------------------ | +| 2.0.2 | 2025-03-31 | [msprof_analyze-2.0.2-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/package/2.0.2/msprof_analyze-2.0.2-py3-none-any.whl) | 4227ff628187297b2f3bc14b9dd3a8765833ed25d527f750bc266a8d29f86935 | +| 2.0.1 | 2025-02-28 | [msprof_analyze-2.0.1-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/package/2.0.1/msprof_analyze-2.0.1-py3-none-any.whl) | 82dfe2c779dbab9015f61d36ea0c32d832b6d182454b3f7db68e6c0ed49c0423 | | 2.0.0 | 2025-02-08 | [msprof_analyze-2.0.0-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/package/2.0.0/msprof_analyze-2.0.0-py3-none-any.whl) | 8e44e5f3e7681c377bb2657a600ad9841d3bed11061ddd7844c30e8a97242101 | | 1.3.4 | 2025-01-20 | [msprof_analyze-1.3.4-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/package/1.3.4/msprof_analyze-1.3.4-py3-none-any.whl) | 8de92188d1a97105fb14cadcb0875ccd5f66629ee3bb25f37178da1906f4cce2 | | 1.3.3 | 2024-12-26 | [msprof_analyze-1.3.3-py3-none-any.whl](https://ptdbg.obs.myhuaweicloud.com/profiler/package/1.3.3/msprof_analyze-1.3.3-py3-none-any.whl) | 27676f2eee636bd0c65243f81e292c7f9d30d7f985c772ac9cbaf10b54d3584e | diff --git a/profiler/msprof_analyze/advisor/README.md b/profiler/msprof_analyze/advisor/README.md index 2c9e055a119847134f08337c559d4012b4ea31fc..9b1c5db3f381351f759171855e0988504d397108 100644 --- a/profiler/msprof_analyze/advisor/README.md +++ b/profiler/msprof_analyze/advisor/README.md @@ -84,13 +84,14 @@ The msprof-analyze advisor command line takes the following three parameters: | dimension | mode | Description | Supported frameworks | | ---------- |---------------------------------------| ------------------------------------ | ------------------------------------ | -| overall | overall summary | Breaks performance data down by computation, communication, idle time, and other dimensions | PyTorch, MindSpore | +| overall | Overall Summary | Breaks performance data down by computation, communication, idle time, and other dimensions | PyTorch, MindSpore | | | Environment Variable Issues | Recommended environment variable settings | PyTorch | | | slow rank | Slow rank detection | PyTorch, MindSpore | | | slow link | Slow link detection | 
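PyTorch, MindSpore | | computation | AICPU Issues | AI CPU tuning | PyTorch, MindSpore | | | Operator Dynamic Shape Issues | Identifies dynamic-shape operators | PyTorch | -| | Block Dim | Block Dim operator tuning | PyTorch, MindSpore | +| | AI Core Performance Analysis | Performance analysis of MatMul, FlashAttentionScore, AI_VECTOR_CORE, and MIX_AIV operators | PyTorch | +| | Block Dim Issues | Block Dim operator tuning | PyTorch, MindSpore | | | Operator No Bound Issues | Operator bottleneck analysis | PyTorch, MindSpore | | | Fusion Issues | Fused-operator graph tuning | PyTorch, MindSpore | | | AI Core Frequency Issues | AI Core operator frequency-reduction analysis | PyTorch, MindSpore | @@ -103,6 +104,7 @@ The msprof-analyze advisor command line takes the following three parameters: | | SyncBatchNorm Issues | BatchNorm synchronization detection | PyTorch, MindSpore | | | Synchronize Stream Issues | Stream synchronization detection | PyTorch, MindSpore | | | GC Analysis | Identifies abnormal garbage collection events. Requires enabling gc_detect_threshold under experimental_config when collecting with Ascend PyTorch Profiler | PyTorch | +| | Fusible Operator Analysis | Detects operator sequences with a Host bottleneck or an MTE bottleneck; the results can guide code optimization or the development of fusible operators | PyTorch, MindSpore | | dataloader | Slow Dataloader Issues | Abnormal dataloader detection | PyTorch, MindSpore | | memory | Memory Operator Issues | Identifies abnormal memory allocation and release operations | PyTorch, MindSpore | | comparison | Kernel compare of Rank\* Step\* and Rank\* Step\* | Compares Kernel data between the benchmark and the target performance data (without a benchmark, fast and slow ranks within one cluster are compared; with a benchmark, the same ranks in two clusters that show a clear time difference are compared) | PyTorch, MindSpore | @@ -134,12 +136,12 @@ The msprof-analyze advisor command line takes the following three parameters: | Parameter | Description | Required | | ---------------------------------- | ------------------------------------------------------------ | -------- | -| -d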
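    --profiling_path | Path to the performance data file or directory. For data collected with Ascend PyTorch Profiler, specify the `*_ascend_pt` result directory; for other scenarios, specify the `PROF_XXX` result directory. Collecting performance data with Ascend PyTorch Profiler is recommended.
    advisor depends on the timeline data (.json), summary data (.csv), and info.json* files parsed by the Profiling tool; make sure these files exist under the specified "profiling_path" directory. | Yes | +| -d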
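    --profiling_path | Path to the performance data file or directory. For data collected with Ascend PyTorch Profiler, specify the `*_ascend_pt` result directory; for data collected with MindSpore Profiler, specify the `*_ascend_ms` result directory. For cluster data, specify the parent directory of `*_ascend_pt` or `*_ascend_ms`. | Yes | | -bp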
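    --benchmark_profiling_path | Directory containing the baseline performance data used for comparison. The performance data is collected with the Profiling tool.
    **Not supported by computation and schedule.** | No | | -o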
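    --output_path | Output path for the analysis results; advisor saves the result data in this directory after the analysis completes. Not set by default, in which case the current directory is used. | No | | -cv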
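    --cann_version | CANN software version used when collecting with the Profiling tool. The currently compatible versions are "6.3.RC2", "7.0.RC1", "7.0.0", and "8.0.RC1"; if this field is left empty, the data is processed as version "8.0.RC1". Profiling data collected with other versions may cause unpredictable problems during analysis. To obtain the version field, run the following command in the environment: `cat /usr/local/Ascend/ascend-toolkit/latest/aarch64-linux/ascend_toolkit_install.info` | No | | -tv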
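    --torch_version | torch version of the runtime environment. The default is 1.11.0; torch 1.11.0 and torch 2.1.0 are supported. If the environment runs another version such as torch 1.11.3, ignore the minor version difference and choose the closest supported version, such as 1.11.0. | No | -| -pt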
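    --profiling_type | Type of Profiling tool used to collect the performance data. Possible values:
    ascend_pytorch_profiler: for performance data collected through the Ascend PyTorch Profiler interface. Default value.
    msprof: for performance data collected with the msprof command line. This feature is still being improved and is not recommended for now.
    mslite: for performance data collected with the [Benchmark](https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench) tool. Not recommended.
    **Not supported by schedule.** | No | +| -pt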
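    --profiling_type | Type of Profiling tool used to collect the performance data. Possible values:
    pytorch: for performance data collected through the Ascend PyTorch Profiler interface. Default value.
    mindspore: for performance data collected through the MindSpore Profiler interface.
    mslite: for performance data collected with the [Benchmark](https://gitee.com/ascend/tools/tree/master/ais-bench_workload/tool/ais_bench) tool. Not recommended.
    **Not supported by schedule.** | No | | --force | Forces advisor to run. When set, the following checks are skipped: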
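    If the owner of the specified directory or file is not the current user, the owner check is ignored and execution proceeds.
    If a csv file is larger than 5G, a json file is larger than 10G, or a db file is larger than 8G, the file-size check is ignored and execution proceeds.
    Setting this parameter enables forced execution; it is not set by default, which means forced execution is disabled. | No | | -l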
    --language | Language of the analysis output. Possible values:
    cn: Chinese output. Default value.
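    en: English output. | No | | --debug | Turn on this switch when the tool reports an error to display detailed stack information. Setting this parameter enables Debug; it is not set by default, which means Debug is disabled. | No | @@ -231,9 +233,9 @@ The communication module analyzes from the communication dimension and currently supports communication small-packet detection Byte-alignment detection for communication operators: for communication operators whose transport type is SDMA, the data volume must be divisible by 512 bytes so that the transmission bandwidth does not drop. -![byte_alignment](/img/byte_alignment.png) +![byte_alignment](./img/byte_alignment.png) -The computation module analyzes device computing performance and can identify issues such as AI CPU, dynamic Shape, Block Dim, operator bottlenecks, fused-operator graphs, and AI Core operator frequency reduction, and gives corresponding suggestions. Details are not expanded here; tune according to the report. An example follows: +The computation module analyzes device computing performance and can identify issues such as AI CPU, dynamic Shape, AI Core Performance Analysis, Block Dim, operator bottlenecks, fused-operator graphs, and AI Core operator frequency reduction, and gives corresponding suggestions. Details are not expanded here; tune according to the report. An example follows: ![computation_1](./img/computation_1.png) @@ -241,6 +243,8 @@ The computation module analyzes device computing performance and can identify AI CPU, ![op_no_bound](./img/op_no_bound.png) +![AI_Core_Performance_Analysis](./img/AI_Core_Performance_analysis.png) + For the torch_npu.npu.set_compile_mode interface shown in the figure above, see [torch_npu.npu.set_compile_mode](https://www.hiascend.com/document/detail/zh/Pytorch/60RC2/apiref/apilist/ptaoplist_000880.html); for AICPU operator replacement samples, see [Samples of AI CPU Operator Replacement](https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/advisor/doc/Samples%20of%20AI%20CPU%20Operator%20Replacement.md). When pp stages (pipeline parallelism) exist, computation analyzes by stage; each stage is one pipeline partition, for example ranks 0\~7 form stage-0 and ranks 8\~15 form stage-1. @@ -253,7 +257,22 @@ The dataloader module contains Slow Dataloader Issues, which mainly detects abnormally time-consuming dat The `pin_memory` (memory pinning) and `num_workers` (number of data-loading subprocesses) parameters in the figure above are used for [data loading optimization](https://www.hiascend.com/document/detail/zh/Pytorch/60RC2/ptmoddevg/trainingmigrguide/performance_tuning_0019.html). -The schedule module includes checks such as GC Analysis, affinity API, aclOpCompile, SyncBatchNorm, and SynchronizeStream. +The schedule module includes checks such as GC Analysis, affinity API, aclOpCompile, SyncBatchNorm, SynchronizeStream, and Fusible Operator Analysis. + +The Fusible Operator Analysis results are only printed to the screen and saved in the `mstt_advisor_{timestamp}.xlsx` file, which contains the "operator sequence analysis based on host bottleneck" and "operator sequence analysis based on mte bottleneck" sheets, as shown below: + +![Fusible_Operator_Analysis](./img/Fusible_Operator_Analysis.png) + +| Field | Description | +| ------------------ | ------------------------------------------------------------ | +| start index | Index of the sequence's first operator in 
kernel_details.csv or op_summary.csv (header excluded; indexes start at 0). | +| end index | Index of the sequence's last operator in kernel_details.csv or op_summary.csv. | +| total time(us) | Total time of the operator sequence (including gaps between operators), in us. | +| execution time(us) | Total execution time of the operators in the sequence, in us. | +| mte time(us) | Total data-transfer time of the operators in the sequence, in us. | +| occurrences | Number of times the sequence occurs. | +| mte bound | Whether the sequence is MTE-bound. | +| host bound | Whether the sequence is Host-bound. | As shown in the example below, GC Analysis reports abnormal garbage collection events. Users can handle GC issues through effective Python memory management, by adjusting the garbage collection thresholds with `gc.set_threshold()`, or by disabling gc with `gc.disable()`. @@ -266,7 +285,7 @@ The schedule module includes GC Analysis, affinity API, aclOpCompile, SyncBatchNorm, Syn - `gc.set_threshold(threshold0, threshold1, threshold2)`: sets the garbage collection thresholds. The garbage collector divides all objects into three generations (0, 1, and 2); objects that survive a collection are moved to the next generation. `threshold0` controls the collection frequency of generation 0, `threshold1` that of generation 1, and `threshold2` that of generation 2. Setting `threshold0` to 0 disables garbage collection. - `gc.disable()`: disables automatic garbage collection. After `gc.disable()` is called, the garbage collector does not run automatically until `gc.enable()` is called manually. -As shown in the example below, Affinity API Issues reports affinity APIs that can be substituted and gives the corresponding stack; users can locate the code to modify from the stack, and modification examples are provided ([API instruction](https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/advisor/doc/Samples%20of%20Fused%20Operator%20API%20Replacement.md)). +As shown in the example below, Affinity API Issues reports affinity APIs that can be substituted and gives the corresponding stack; users can locate the code to modify from the stack, and modification examples are provided ([API instructions](https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/advisor/doc/Samples%20of%20Fused%20Operator%20API%20Replacement.md)). ![schedule_3](./img/schedule_3.png) @@ -345,13 +364,13 @@ Jupyter Notebook is used as follows: Save the performance data collected by Ascend PyTorch Profiler in the installation environment. -3. Go to the mstt\profiler\advisor directory and run the following command to start Jupyter Notebook. +3. 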
Go to the mstt\profiler\msprof_analyze\advisor directory and run the following command to start Jupyter Notebook. ```bash jupyter notebook ``` - On success, the browser starts automatically and opens the mstt\profiler\advisor directory, as in the following example: + On success, the browser starts automatically and opens the mstt\profiler\msprof_analyze\advisor directory, as in the following example: ![jupyter_report](./img/jupyter_report.PNG) diff --git a/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/kernel_cluster_advice.py b/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/kernel_cluster_advice.py index a7d3a010959275ee0fd6e3be2af926f7fb46c3bb..0c0d9f15a646803b461c688cb71933a62895fd89 100644 --- a/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/kernel_cluster_advice.py +++ b/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/kernel_cluster_advice.py @@ -17,7 +17,7 @@ import os import pandas as pd from msprof_analyze.advisor.advisor_backend.common_func_advisor.constant import Constant as AdvisorConstant from msprof_analyze.advisor.advisor_backend.cluster_advice.cluster_advice_base import ClusterAdviceBase -from cluster_data_preprocess.pytorch_data_preprocessor import PytorchDataPreprocessor +from msprof_analyze.cluster_analyse.cluster_data_preprocess.pytorch_data_preprocessor import PytorchDataPreprocessor from msprof_analyze.prof_common.file_manager import FileManager from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.path_manager import PathManager diff --git a/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/slow_link_advice.py b/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/slow_link_advice.py index 2024adf8f6a020a5e09ce41949f9815831d7b563..6d2a0638913d759817b091a013d7fbce9df09f63 100644 --- a/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/slow_link_advice.py +++ b/profiler/msprof_analyze/advisor/advisor_backend/cluster_advice/slow_link_advice.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+import copy import os from collections import defaultdict from msprof_analyze.advisor.advisor_backend.common_func_advisor.constant import Constant @@ -41,7 +42,7 @@ class SlowLinkAdvice(ClusterAdviceBase): self.SDMA_TIME_MS: 0, self.SDMA_SIZE_MB: 0, } - self.rank_bw_dict = defaultdict(lambda: default_value.copy()) + self.rank_bw_dict = defaultdict(lambda: copy.deepcopy(default_value)) @staticmethod def compute_ratio(dividend: float, divisor: float): diff --git a/profiler/msprof_analyze/advisor/advisor_backend/common_func_advisor/constant.py b/profiler/msprof_analyze/advisor/advisor_backend/common_func_advisor/constant.py index 077bf0074ccc5edc1bbf0814d2d3d72b1c5475e7..162a9fd2fdde15e02d2897106b43f52bca99bde1 100644 --- a/profiler/msprof_analyze/advisor/advisor_backend/common_func_advisor/constant.py +++ b/profiler/msprof_analyze/advisor/advisor_backend/common_func_advisor/constant.py @@ -214,7 +214,7 @@ class CoreType: AICPU = "AI_CPU" MIX_AIV = "MIX_AIV" MIX_AIC = "MIX_AIC" - HCCL = "HCCL" + HCCL = "COMMUNICATION" class PerfColor(Enum): diff --git a/profiler/msprof_analyze/advisor/analyzer/cluster/slow_link_analyzer.py b/profiler/msprof_analyze/advisor/analyzer/cluster/slow_link_analyzer.py index 9c4416009e1035e1938cf5430a49b0383bbf47d7..377a084f434099bbad3a4d53296e3f66564a5fe8 100644 --- a/profiler/msprof_analyze/advisor/analyzer/cluster/slow_link_analyzer.py +++ b/profiler/msprof_analyze/advisor/analyzer/cluster/slow_link_analyzer.py @@ -175,6 +175,8 @@ class SlowLinkAnalyzer(BaseAnalyzer): if bindwidth_index is not None: data_list = [tuple_list[bindwidth_index] for tuple_list in self.format_datas.get("data", [])] + if not data_list: + return global_step_rank max_bandwidth, min_bandwidth = max(data_list), min(data_list) if self.compute_max_gap_ratio(data_list, sum(data_list) / len( diff --git a/profiler/msprof_analyze/advisor/analyzer/cluster/slow_rank_analyzer.py b/profiler/msprof_analyze/advisor/analyzer/cluster/slow_rank_analyzer.py index 
b1d05c8b4562c23ad4993d3b620d1abf27453b1c..b588ddc0db331cfcca8213b2f55a5e08898eabe9 100644 --- a/profiler/msprof_analyze/advisor/analyzer/cluster/slow_rank_analyzer.py +++ b/profiler/msprof_analyze/advisor/analyzer/cluster/slow_rank_analyzer.py @@ -167,7 +167,9 @@ class SlowRankAnalyzer(BaseAnalyzer): if dimension_index is None or rank_id_index is None: return global_step_rank - data_list = [tuple_list[dimension_index] for tuple_list in self.format_datas.get("data")] + data_list = [tuple_list[dimension_index] for tuple_list in self.format_datas.get("data", [])] + if not data_list: + return global_step_rank max_time, min_time = max(data_list), min(data_list) if self.compute_max_gap_ratio(data_list, sum(data_list) / len( diff --git a/profiler/msprof_analyze/advisor/analyzer/comparison/comparison_checker.py b/profiler/msprof_analyze/advisor/analyzer/comparison/comparison_checker.py index 8e40bcc1cfe6914470d82c86f5b76980a5c16814..57811a6c8d8bc4ee4b640cf4405e05648a42e104 100644 --- a/profiler/msprof_analyze/advisor/analyzer/comparison/comparison_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/comparison/comparison_checker.py @@ -68,7 +68,7 @@ class ComparisonChecker: return self.compare_mode = compare_mode if ("Api" in compare_mode) and self.benchmark_profiling_path.endswith("ascend_ms"): - logger.warning("The current compare mode %s does not support Mindspore.", compare_mode) + logger.info("The current compare mode %s does not support Mindspore.", compare_mode) return compare_interface = ComparisonInterface(self.profiling_path, self.benchmark_profiling_path, self.step, self.benchmark_step, diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/builder/__init__.py b/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/__init__.py similarity index 100% rename from debug/accuracy_tools/msprobe/pytorch/visualization/builder/__init__.py rename to profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/__init__.py diff 
--git a/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_analyzer.py b/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_analyzer.py new file mode 100644 index 0000000000000000000000000000000000000000..23ec775e275134e8a99336b005d9f8f198660245 --- /dev/null +++ b/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_analyzer.py @@ -0,0 +1,53 @@ +# Copyright (c) Huawei Technologies Co., Ltd. 2025. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import logging + +from msprof_analyze.advisor.analyzer.base_analyzer import BaseAnalyzer +from msprof_analyze.advisor.analyzer.computation.ai_core_performance.ai_core_performance_checker import \ + AICorePerformanceChecker +from msprof_analyze.advisor.dataset.profiling.profiling_dataset import ProfilingDataset +from msprof_analyze.advisor.result.result import OptimizeResult +from msprof_analyze.advisor.display.html.priority_background_color import PriorityBackgroundColor +from msprof_analyze.advisor.display.html.render import HTMLRender + +logger = logging.getLogger() + + +class AICorePerformanceAnalyzer(BaseAnalyzer): + dataset_cls_list = [ProfilingDataset] + + def __init__(self, collection_path, n_processes: int = 1, **kwargs) -> None: + super().__init__(collection_path, n_processes, **kwargs) + profiling_key = ProfilingDataset.get_key() + self.profiling_dataset = self.get_first_data_by_key(self.dataset_list, profiling_key) + self.result = OptimizeResult() + self.html_render = HTMLRender() + self.html = None + + def optimize(self, **kwargs): + add_render_list = kwargs.get("add_render_list", True) + ai_core_perf_checker = AICorePerformanceChecker() + ai_core_perf_checker.data_filter(self.profiling_dataset) + if not ai_core_perf_checker.ai_core_performance_issues: + return self.result + ai_core_perf_checker.check_ai_core_performance(self.profiling_dataset) + ai_core_perf_checker.make_record(self.result) + self.html = ai_core_perf_checker.make_render(self.html_render, + add_render_list, + priority=self.get_priority(), + rank=kwargs.get("rank")) + return self.result + + def get_priority(self, max_mem_op_dur=None): + return PriorityBackgroundColor.low \ No newline at end of file diff --git a/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_checker.py b/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_checker.py new file mode 100644 index 
0000000000000000000000000000000000000000..afc58578f68c0cc120a01a914c998227949ee48e --- /dev/null +++ b/profiler/msprof_analyze/advisor/analyzer/computation/ai_core_performance/ai_core_performance_checker.py @@ -0,0 +1,573 @@ +# Copyright (c) Huawei Technologies Co., Ltd. 2025. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import logging +import os +from functools import reduce + +from msprof_analyze.advisor.utils.utils import safe_division, convert_to_int_with_exception, \ + convert_to_float_with_warning +from msprof_analyze.advisor.dataset.profiling.profiling_dataset import ProfilingDataset +from msprof_analyze.advisor.result.item import OptimizeItem, OptimizeRecord +from msprof_analyze.advisor.result.result import OptimizeResult +from msprof_analyze.prof_common.additional_args_manager import AdditionalArgsManager +from msprof_analyze.prof_common.file_manager import FileManager + +logger = logging.getLogger() + + +class AICorePerformanceChecker: + """ + operator performance checker + """ + _CHECKER = "AICorePerformanceChecker" + CUBE_OPERATOR_MEMORY_SIZE_MB = 100 + INNER_AXIS_256 = 256 + INNER_AXIS_128 = 128 + + def __init__(self): + self.result = dict() + self.ai_core_performance_issues = False + self._desc = "" + self.cube_dict = {} + self.fa_dict = {} + self.fa_list = [] + self.vector_dict = {} + self.load_aicore_perf_rules() + + @staticmethod + def get_operator_list(cube_dict, profiling_dataset): + operator_list = [] + for op in 
profiling_dataset.op_summary.op_list: + if op.op_name in cube_dict: + key = op.input_shapes[1:-1] + "-" + op.output_shapes[1:-1] + if key in cube_dict[op.op_name]: + operator_list.append(op) + return operator_list + + @staticmethod + def get_vector_list(profiling_dataset, vector_dict): + vector_list = [] + for op_name in vector_dict: + for shape in vector_dict[op_name]: + for operator in profiling_dataset.op_summary.op_list: + if operator.op_name == op_name and operator.input_shapes[1:-1] + "-" + operator.output_shapes[ + 1:-1] == shape: + vector_list.append(operator) + return vector_list + + @staticmethod + def safe_divide(numerator, denominator): + if denominator == 0: + logger.warning("Warning: Division by zero is not allowed.") + return None + return numerator / denominator + + @staticmethod + def memory_size(operator): + memory = 0 + input_shapes = operator.input_shapes[1:-1].split(";") + output_shapes = operator.output_shapes[1:-1] + for shapes in input_shapes: + if "," not in shapes and shapes != "": + # 多的一维是 bias ,预先乘2 + memory += convert_to_int_with_exception(shapes) * 2 + continue + memory += reduce(lambda x, y: x * y, map(int, shapes.split(","))) + memory += reduce(lambda x, y: x * y, map(int, output_shapes.split(","))) + return memory * 2 / 1024 / 1024 + + def load_aicore_perf_rules(self): + language = AdditionalArgsManager().language + rule_path = os.path.join( + os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))), + "rules", language, "aicore_performance.yaml" + ) + + if not os.path.exists(rule_path): + logger.warning("Skip analyze aicpu issues, because %s does not exist.", rule_path) + + self.language = language + self.aicore_rules = FileManager.read_yaml_file(rule_path) + self._cube_problem = self.aicore_rules.get("cube_problem") + self._fa_problem = self.aicore_rules.get("fa_problem") + self._vector_problem = self.aicore_rules.get("vector_problem") + self._desc = self.aicore_rules.get("description") + 
self._bound_desc = self.aicore_rules.get("bound_description") + self._opti_desc = self.aicore_rules.get("optimization_description") + self._affinity_desc = self.aicore_rules.get("affinity_description") + self._cube_affinity_desc = self.aicore_rules.get("cube_affinity_desc") + self._fa_affinity_desc_head_dim_128 = self.aicore_rules.get("fa_affinity_desc_head_dim_128") + self._fa_affinity_desc_seq_len_128 = self.aicore_rules.get("fa_affinity_desc_seq_len_128") + self._fa_affinity_desc_head_dim_seq_len_128 = self.aicore_rules.get("fa_affinity_desc_head_dim_seq_len_128") + self._suggestion = self.aicore_rules.get("suggestion") + self._affinity_suggestion = self.aicore_rules.get("affinity_suggestion") + self._bound_suggestion = self.aicore_rules.get("bound_suggestion") + self._opti_suggestion = self.aicore_rules.get("optimization_suggestion") + self._operator_rules = {"cube_operators": self.aicore_rules.get("cube_operators"), + "fa_operators": self.aicore_rules.get("fa_operators"), + "vector_operators": self.aicore_rules.get("vector_operators")} + + def data_filter(self, profiling_dataset: ProfilingDataset): + if not self.check_task_list(profiling_dataset): + return + + operator_list = profiling_dataset.op_summary.op_list + total_duration = sum(convert_to_float_with_warning(operator.task_duration) for operator in operator_list) + if (total_duration == 0): + return + cube_memory_dict, vector_type_dict = {}, {} + + for op in operator_list: + if not op.input_shapes or not op.output_shapes: + continue + shapes = op.input_shapes[1:-1] + "-" + op.output_shapes[1:-1] + # preliminary filter cube operator + if op.task_type == "AI_CORE" and "matmul" in op.op_type.lower(): + cube_memory_dict.setdefault(op.op_name, {}).setdefault(shapes, 0) + cube_memory_dict[op.op_name][shapes] += self.memory_size(op) + continue + + # filter fa operator + if op.op_type == "FlashAttentionScore": + self.fa_dict.setdefault(op.op_name, set()).add(shapes) + self.fa_list.append(op) + elif op.op_type == 
"FlashAttentionScoreGrad": + self.fa_dict.setdefault(op.op_name, set()).add(shapes + "-grad") + self.fa_list.append(op) + + # preliminary filter vector operator + if op.task_type in ["AI_VECTOR_CORE", "MIX_AIV"]: + vector_type_dict.setdefault(op.op_type, set()).add(op) + + # filter cube operator + for op_name in cube_memory_dict: + for shapes in cube_memory_dict[op_name]: + if cube_memory_dict[op_name][shapes] >= self.CUBE_OPERATOR_MEMORY_SIZE_MB: + self.cube_dict.setdefault(op_name, set()).add(shapes) + + # filter vector operator + for op_type in vector_type_dict: + duration_group_by_time = sum(convert_to_float_with_warning(op.task_duration) + for op in vector_type_dict[op_type]) + if (duration_group_by_time / total_duration) >= 0.01 or duration_group_by_time >= 1000000: + for op in vector_type_dict[op_type]: + shapes = op.input_shapes[1:-1] + "-" + op.output_shapes[1:-1] + self.vector_dict.setdefault(op.op_name, set()).add(shapes) + + if any([self.cube_dict, self.fa_dict, self.vector_dict]): + self.ai_core_performance_issues = True + + def check_ai_core_performance(self, promoting_dataset: ProfilingDataset): + for operator_type in ["cube", "fa", "vector"]: + try: + self.result[operator_type] = getattr(self, f"check_{operator_type}_operator")(promoting_dataset) + except (IndexError, ValueError, AttributeError) as e: + logger.warning(f"Failed to check ai core performance {operator_type} operator, {e}.") + self.result[operator_type] = [] + + if not any([self.result["cube"], self.result["fa"], self.result["vector"]]): + self.ai_core_performance_issues = False + + def check_cube_operator(self, profiling_dataset: ProfilingDataset): + cube_dict = self.cube_dict + suggestion = self._cube_affinity_desc + optimization_queue, bound_queue, affinity_queue = [], [], [] + operator_list = self.get_operator_list(cube_dict, profiling_dataset) + for op in cube_dict: + for shape in cube_dict[op]: + affinity_flag = self._check_cube_inner_axis(shape) + if not affinity_flag: + dtype, 
shape_duration = None, 0. + for operator in operator_list: + if (operator.op_name == op and + operator.input_shapes[1:-1] + "-" + operator.output_shapes[1:-1] == shape): + dtype = operator.input_data_types + shape_duration += convert_to_float_with_warning(operator.task_duration) + affinity_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "duration": shape_duration, + "suggestion": suggestion}) + else: + shape_list = [] + for operator in operator_list: + if (operator.op_name == op and operator.input_shapes[1:-1] + "-" + + operator.output_shapes[1:-1] == shape): + shape_list.append(operator) + shape_duration = sum(convert_to_float_with_warning(operator.task_duration) + for operator in shape_list) + dtype = shape_list[0].input_data_types if shape_list else None + bound, optimization = self.del_cube_operator_bound(shape_list) + if bound is None and optimization is None: + continue + if bound: + bound_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "bound": bound, + "duration": shape_duration}) + else: + optimization_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "optimization": round(optimization * 100, 2)}) + return [sorted(optimization_queue, key=lambda x: x["optimization"], reverse=True)[:5], + sorted(bound_queue, key=lambda x: x["duration"], reverse=True)[:5], + sorted(affinity_queue, key=lambda x: x["duration"], reverse=True)[:5]] + + def del_cube_operator_bound(self, shape_list): + bound, optimization, aic_mac_ratio, aic_mte2_ratio, length = "", 0., 0., 0., 0 + for operator in shape_list: + try: + aic_mac_ratio += convert_to_float_with_warning(operator.aic_mac_ratio) + aic_mte2_ratio += convert_to_float_with_warning(operator.aic_mte2_ratio) + length += 1 + except ValueError: + continue + aic_mac_ratio = self.safe_divide(aic_mac_ratio, length) + aic_mte2_ratio = self.safe_divide(aic_mte2_ratio, length) + if aic_mac_ratio is None or aic_mte2_ratio is None: + return 
None, None + aic_mac_ratio_rule, aic_mte2_ratio_rule = None, None + for operator_rule in self._operator_rules["cube_operators"]: + if operator_rule["target"] == "aic_mac_ratio": + aic_mac_ratio_rule = operator_rule + elif operator_rule["target"] == "aic_mte2_ratio": + aic_mte2_ratio_rule = operator_rule + if (aic_mac_ratio >= aic_mac_ratio_rule["threshold"] + and aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"]): + bound = aic_mac_ratio_rule["bound"] + "_and_" + aic_mte2_ratio_rule["bound"] + "_bound" + elif aic_mac_ratio >= aic_mac_ratio_rule["threshold"]: + bound = aic_mac_ratio_rule["bound"] + elif aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"]: + bound = aic_mte2_ratio_rule["bound"] + else: + optimization = max(aic_mac_ratio_rule["threshold"] - aic_mac_ratio, + aic_mte2_ratio_rule["threshold"] - aic_mte2_ratio) + return bound, optimization + + def check_fa_operator(self, profiling_dataset: ProfilingDataset): + fa_list, fa_dict = self.fa_list, self.fa_dict + optimization_queue, bound_queue, affinity_queue = [], [], [] + # Filter non-affine operators + for op in fa_dict: + for shape in fa_dict[op]: + affinity_flag, dtype, shape_duration, suggestion = self._check_fa_inner_axis(fa_list, op, shape) + if affinity_flag: + # Non-affine operator: record its duration and add it to affinity_queue + affinity_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "suggestion": suggestion, + "duration": shape_duration}) + else: + # Handle bound operators and operators with optimization headroom + if len(shape.split("-")) > 2: + bound, optimization, dtype, shape_duration = self.del_fa_operator_bound_grad(op, shape, fa_list) + else: + bound, optimization, dtype, shape_duration = self.del_fa_operator_bound(op, shape, fa_list) + if bound is None and optimization is None: + continue + if bound: + bound_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "bound": bound, + "duration": shape_duration}) + else: + optimization_queue.append({"op_name": op, + "shape": shape.split("-")[0], + "dtype": dtype, + "optimization": 
round(optimization * 100, 2)}) + + return [sorted(optimization_queue, key=lambda x: x["optimization"], reverse=True)[:5], + sorted(bound_queue, key=lambda x: x["duration"], reverse=True)[:5], + sorted(affinity_queue, key=lambda x: x["duration"], reverse=True)[:5]] + + def del_fa_operator_bound_grad(self, op, shape, fa_list): + aic_fixpipe_ratio, aic_mte2_ratio, shape_duration, optimization, length = 0., 0., 0., 0., 0 + bound, dtype = "", None + for operator in fa_list: + if (operator.op_name == op and + operator.input_shapes[1:-1] + "-" + + operator.output_shapes[1:-1] + "-grad" == shape): + try: + aic_fixpipe_ratio += convert_to_float_with_warning(operator.aic_fixpipe_ratio) + aic_mte2_ratio += convert_to_float_with_warning(operator.aic_mte2_ratio) + shape_duration += convert_to_float_with_warning(operator.task_duration) + dtype = operator.input_data_types + length += 1 + except ValueError: + continue + aic_fixpipe_ratio = self.safe_divide(aic_fixpipe_ratio, length) + aic_mte2_ratio = self.safe_divide(aic_mte2_ratio, length) + if aic_mte2_ratio is None or aic_fixpipe_ratio is None: + return None, None, None, None + aic_fixpipe_ratio_rule, aic_mte2_ratio_rule = None, None + for rule in self._operator_rules["fa_operators"]: + if rule["target"] == "aic_fixpipe_ratio": + aic_fixpipe_ratio_rule = rule + elif rule["target"] == "aic_mte2_ratio": + aic_mte2_ratio_rule = rule + if (aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"] and + aic_fixpipe_ratio >= aic_fixpipe_ratio_rule["threshold"]): + bound = aic_fixpipe_ratio_rule["bound"] + "_and_" + aic_mte2_ratio_rule["bound"] + "_bound" + elif aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"]: + bound = aic_mte2_ratio_rule["bound"] + elif aic_fixpipe_ratio >= aic_fixpipe_ratio_rule["threshold"]: + bound = aic_fixpipe_ratio_rule["bound"] + else: + optimization = max(aic_fixpipe_ratio_rule["threshold"] - aic_fixpipe_ratio, + aic_mte2_ratio_rule["threshold"] - aic_mte2_ratio) + return bound, optimization, dtype, 
shape_duration + + def del_fa_operator_bound(self, op, shape, fa_list): + aiv_vec_ratio, aic_mte2_ratio, shape_duration, optimization, length = 0., 0., 0., 0., 0 + bound, dtype = "", None + for operator in fa_list: + if (operator.op_name == op and + operator.input_shapes[1:-1] + "-" + operator.output_shapes[1:-1] == shape): + try: + aiv_vec_ratio += convert_to_float_with_warning(operator.aiv_vec_ratio) + aic_mte2_ratio += convert_to_float_with_warning(operator.aic_mte2_ratio) + shape_duration += convert_to_float_with_warning(operator.task_duration) + length += 1 + except ValueError: + continue + aiv_vec_ratio = self.safe_divide(aiv_vec_ratio, length) + aic_mte2_ratio = self.safe_divide(aic_mte2_ratio, length) + if aiv_vec_ratio is None or aic_mte2_ratio is None: + return None, None, None, None + aiv_vec_ratio_rule, aic_mte2_ratio_rule = None, None + for rule in self._operator_rules["fa_operators"]: + if rule["target"] == "aiv_vec_ratio": + aiv_vec_ratio_rule = rule + elif rule["target"] == "aic_mte2_ratio": + aic_mte2_ratio_rule = rule + if (aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"] + and aiv_vec_ratio >= aiv_vec_ratio_rule["threshold"]): + bound = aic_mte2_ratio_rule["bound"] + "_and_" + aiv_vec_ratio_rule["bound"] + "_bound" + elif aic_mte2_ratio >= aic_mte2_ratio_rule["threshold"]: + bound = aic_mte2_ratio_rule["bound"] + elif aiv_vec_ratio >= aiv_vec_ratio_rule["threshold"]: + bound = aiv_vec_ratio_rule["bound"] + else: + optimization = max(aiv_vec_ratio_rule["threshold"] - aiv_vec_ratio, + aic_mte2_ratio_rule["threshold"] - aic_mte2_ratio) + return bound, optimization, dtype, shape_duration + + def check_vector_operator(self, profiling_dataset: ProfilingDataset): + vector_dict = self.vector_dict + optimization_queue, bound_queue = [], [] + vector_list = self.get_vector_list(profiling_dataset, vector_dict) + for op_name in vector_dict: + for shape in vector_dict[op_name]: + aiv_vec_ratio, aiv_mte2_ratio, aiv_mte3_ratio, shape_duration = 0., 0., 0., 0. 
+ length, dtype = 0, "" + for operator in vector_list: + if (operator.op_name == op_name and + operator.input_shapes[1:-1] + "-" + operator.output_shapes[1:-1] == shape): + try: + aiv_vec_ratio += convert_to_float_with_warning(operator.aiv_vec_ratio) + aiv_mte2_ratio += convert_to_float_with_warning(operator.aiv_mte2_ratio) + aiv_mte3_ratio += convert_to_float_with_warning(operator.aiv_mte3_ratio) + shape_duration += convert_to_float_with_warning(operator.task_duration) + dtype = operator.input_data_types + length += 1 + except ValueError: + continue + aiv_vec_ratio = self.safe_divide(aiv_vec_ratio, length) + aiv_mte2_ratio = self.safe_divide(aiv_mte2_ratio, length) + aiv_mte3_ratio = self.safe_divide(aiv_mte3_ratio, length) + if aiv_vec_ratio is None or aiv_mte2_ratio is None or aiv_mte3_ratio is None: + continue + bound, optimization = self.del_vector_operator_bound(aiv_mte2_ratio, aiv_mte3_ratio, aiv_vec_ratio) + if bound: + bound_queue.append({"op_name": op_name, + "shape": shape.split("-")[0], + "bound": bound, + "dtype": dtype, + "duration": shape_duration}) + else: + optimization_queue.append({"op_name": op_name, + "shape": shape.split("-")[0], + "dtype": dtype, + "optimization": round(optimization * 100, 2)}) + return [sorted(optimization_queue, key=lambda x: x["optimization"], reverse=True)[:5], + sorted(bound_queue, key=lambda x: x["duration"], reverse=True)[:5]] + + def del_vector_operator_bound(self, aiv_mte2_ratio, aiv_mte3_ratio, aiv_vec_ratio): + bound, optimization = "", 0 + aiv_vec_ratio_rule, aiv_mte2_ratio_rule, aiv_mte3_ratio_rule, total_rule = None, None, None, None + for operator_rule in self._operator_rules["vector_operators"]: + if operator_rule["target"] == "aiv_vec_ratio": + aiv_vec_ratio_rule = operator_rule + elif operator_rule["target"] == "aiv_mte2_ratio": + aiv_mte2_ratio_rule = operator_rule + elif operator_rule["target"] == "aiv_mte3_ratio": + aiv_mte3_ratio_rule = operator_rule + elif operator_rule["target"] == "total": + 
total_rule = operator_rule + if aiv_vec_ratio + aiv_mte2_ratio + aiv_mte3_ratio >= total_rule["threshold"]: + bound = total_rule["bound"] + elif aiv_mte2_ratio >= aiv_mte2_ratio_rule["threshold"]: + bound = aiv_mte2_ratio_rule["bound"] + elif aiv_mte3_ratio >= aiv_mte3_ratio_rule["threshold"]: + bound = aiv_mte3_ratio_rule["bound"] + elif aiv_vec_ratio >= aiv_vec_ratio_rule["threshold"]: + bound = aiv_vec_ratio_rule["bound"] + else: + optimization = max(aiv_vec_ratio_rule["threshold"] - aiv_vec_ratio, + aiv_mte2_ratio_rule["threshold"] - aiv_mte2_ratio, + aiv_mte3_ratio_rule["threshold"] - aiv_mte3_ratio) + return bound, optimization + + def draw_record(self, op_type: str, result: OptimizeResult): + suggestion_keys = ['opti', 'bound', 'affinity'] + desc = dict.fromkeys(suggestion_keys, "") + problem_map = { + 'cube': self._cube_problem, + 'fa': self._fa_problem, + 'vector': self._vector_problem + } + if op_type not in problem_map: + return + optimization_item = OptimizeItem(problem_map[op_type], self._desc, [self._suggestion]) + result.add(OptimizeRecord(optimization_item)) + headers = [ + "Type", + "Description", + ] + result.add_detail(problem_map[op_type], headers=headers) + for opti_issue in self.result[op_type][0]: + opti_sugg = self._opti_suggestion.format(**opti_issue) + desc["opti"] += opti_sugg + if desc["opti"]: + result.add_detail(problem_map[op_type], detail=[self._opti_desc, desc["opti"]]) + for bound_issue in self.result[op_type][1]: + bound_sugg = self._bound_suggestion.format(**bound_issue) + desc["bound"] += bound_sugg + if desc["bound"]: + result.add_detail(problem_map[op_type], detail=[self._bound_desc, desc["bound"]]) + if op_type == "vector": # vector operators have no affinity suggestions + return + for affinity_issue in self.result[op_type][2]: + affinity_sugg = self._affinity_suggestion.format(**affinity_issue) + desc["affinity"] += affinity_sugg + if desc["affinity"]: + result.add_detail(problem_map[op_type], detail=[self._affinity_desc, desc["affinity"]]) + + def 
make_record(self, result: OptimizeResult): + """ + make record for what and how to optimize + """ + if not self.ai_core_performance_issues: + return self.ai_core_performance_issues + if any(self.result["cube"]): + self.draw_record("cube", result) + if any(self.result["fa"]): + self.draw_record("fa", result) + if any(self.result["vector"]): + self.draw_record("vector", result) + + return True + + def make_render(self, html_render, add_render_list=True, **kwargs): + if not self.ai_core_performance_issues: + return self.ai_core_performance_issues + + priority = kwargs.get("priority") + return html_render.render_template(key="computation", + template_dir="templates", + template_name="ai_core_performance.html", + format_result=self.result, + language=self.language, + add_render_list=add_render_list, + priority_background_color=priority, + rank=kwargs.get("rank")) + + def check_task_list(self, profiling_dataset: ProfilingDataset) -> bool: + if not hasattr(profiling_dataset, "op_summary"): + logger.warning("Skip %s checker because of not containing %s", self._CHECKER, "op summary") + return False + if not hasattr(profiling_dataset.op_summary, "op_list"): + logger.warning("Skip %s checker because of not containing %s", self._CHECKER, "op_list") + return False + if (not hasattr(profiling_dataset.op_summary.op_list[0], "input_shapes") or + not hasattr(profiling_dataset.op_summary.op_list[0], "input_data_types")): + logger.warning("Skip %s checker because of not containing input datas", self._CHECKER) + return False + return True + + def _check_cube_inner_axis(self, shape): + # Check whether the inner axes of the input shapes are multiples of 256 + shapes = shape.split("-")[0].split(";") + if (len(shape.split("-")[0].split(";")[0].split(","))) == 4: + # NZ format + b_axis, c_axis = (convert_to_int_with_exception(shapes[0].split(",")[1]), + convert_to_int_with_exception(shapes[0].split(",")[2])) + f_axis, g_axis = (convert_to_int_with_exception(shapes[1].split(",")[1]), + convert_to_int_with_exception(shapes[1].split(",")[2])) + 
return (b_axis * c_axis % self.INNER_AXIS_256 == 0) and (f_axis * g_axis % self.INNER_AXIS_256 == 0) + elif (len(shape.split("-")[0].split(";")[0].split(","))) == 2: + # ND format + l_axis, k_axis = (convert_to_int_with_exception(shapes[0].split(",")[1]), + convert_to_int_with_exception(shapes[1].split(",")[1])) + return (l_axis % self.INNER_AXIS_256 == 0) and (k_axis % self.INNER_AXIS_256 == 0) + else: + return False + + def _check_fa_inner_axis(self, fa_list, op, shape): + shape_duration = 0. + affinity_flag = False + dtype = None + suggestion = "" + if "varlen" in op.lower(): + # Variable-length operator: affinity_flag is set to True when the shape has poor affinity + inner_axis = convert_to_int_with_exception(shape.split("-")[0].split(";")[0].split(",")[2]) + if inner_axis % self.INNER_AXIS_128 != 0: + affinity_flag = True + suggestion = self._fa_affinity_desc_head_dim_128 + for operator in fa_list: + if (operator.op_name == op and + operator.input_shapes[1:-1] + "-" + operator.output_shapes[1:-1] == shape): + shape_duration += convert_to_float_with_warning(operator.task_duration) + dtype = operator.input_data_types + else: + # Fixed-length operator: affinity_flag is set to True when the shape has poor affinity + head_dim = 0 + seq_len = convert_to_int_with_exception(shape.split("-")[1].split(";")[0].split(",")[2]) + input_first_tensor = shape.split("-")[0].split(";")[0].split(",") + if len(input_first_tensor) == 3: + head_dim = safe_division(convert_to_int_with_exception(input_first_tensor[2]), + convert_to_int_with_exception(shape.split("-")[1].split(";")[0].split(",")[1])) + else: + head_dim = convert_to_int_with_exception(input_first_tensor[3]) + if head_dim % self.INNER_AXIS_128 != 0 and seq_len % self.INNER_AXIS_128 != 0: + affinity_flag = True + suggestion = self._fa_affinity_desc_head_dim_seq_len_128 + elif head_dim % self.INNER_AXIS_128 != 0: + affinity_flag = True + suggestion = self._fa_affinity_desc_head_dim_128 + elif seq_len % self.INNER_AXIS_128 != 0: + affinity_flag = True + suggestion = self._fa_affinity_desc_seq_len_128 + if affinity_flag: + for 
operator in fa_list: + if (operator.op_name == op and + operator.input_shapes[1:-1] + "-" + + operator.output_shapes[1:-1] == shape): + shape_duration += convert_to_float_with_warning(operator.task_duration) + dtype = operator.input_data_types + return affinity_flag, dtype, shape_duration, suggestion diff --git a/profiler/msprof_analyze/advisor/analyzer/computation/operator_checker.py b/profiler/msprof_analyze/advisor/analyzer/computation/operator_checker.py index ab9d4228b470ee515ed912ab018badbba3ec2e67..4be0fc66ae8b8f75ca0518228cbdccde1a0d7c1e 100644 --- a/profiler/msprof_analyze/advisor/analyzer/computation/operator_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/computation/operator_checker.py @@ -52,6 +52,7 @@ class OperatorChecker(VersionControl): self._tune_op_list: List[str] = [] self.prompt_class = BasePrompt.get_prompt_class("OperatorChecker") + self.rank_id = self.prompt_class.RANK_ID self.pytorch_op_tune_suggestion = self.prompt_class.PYTORCH_OPERATOR_TUNE_SUGGESTION self.mslite_op_tune_suggestion = self.prompt_class.MSLITE_OPERATOR_TUNE_SUGGESTION self.pytorch_release_suggestion = self.prompt_class.PYTORCH_RELEASE_SUGGESTION @@ -118,7 +119,7 @@ class OperatorChecker(VersionControl): """ if rank is not None: - self._problem = self.prompt_class.RANK_ID.format(rank) + self._problem.lower() + self._problem = self.rank_id.format(rank) + self._problem.lower() task_duration_list = [float(op_info.get_attr("task_duration")) for op_info in self._op_list @@ -301,7 +302,7 @@ class OperatorChecker(VersionControl): def format_suggestion_content(self, profiling_data: ProfilingDataset) -> None: if profiling_data.prof_type == EnumParamsParser().profiling_type.ascend_pytorch_profiler: self._suggestion.append(self.pytorch_op_tune_suggestion) - elif profiling_data.prof_type == EnumParamsParser.profiling_type.mslite: + elif profiling_data.prof_type == EnumParamsParser().profiling_type.mslite: self._suggestion.append(self.mslite_op_tune_suggestion) def 
_check_data(self, profiling_data): diff --git a/profiler/msprof_analyze/advisor/analyzer/dataloader/dataloader_checker.py b/profiler/msprof_analyze/advisor/analyzer/dataloader/dataloader_checker.py index 45efecc728680561264cdf4e0c9502aadc3b4140..6c6dbfe96e5167afdcfd52b35550a54f2e3c2e69 100644 --- a/profiler/msprof_analyze/advisor/analyzer/dataloader/dataloader_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/dataloader/dataloader_checker.py @@ -64,7 +64,7 @@ class DataloaderChecker: if not self.dataloader_issues: return - self.optimization_item.append(OptimizeItem("Slow dataloader", self.desc, self.suggestions)) + self.optimization_item.append(OptimizeItem("Slow Dataloader Issues", self.desc, self.suggestions)) for optimization in self.optimization_item: result.add(OptimizeRecord(optimization)) diff --git a/profiler/msprof_analyze/advisor/analyzer/memory/memory_checker.py b/profiler/msprof_analyze/advisor/analyzer/memory/memory_checker.py index 82bca84cd233aadf0e0744d2dab51c341a19e3cf..99ed672359665e2f59f0f362a4723c44230bee77 100644 --- a/profiler/msprof_analyze/advisor/analyzer/memory/memory_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/memory/memory_checker.py @@ -72,7 +72,7 @@ class MemoryOpsChecker: if not self.memory_issues: return - self.optimization_item.append(OptimizeItem("Memory", self.desc, self.suggestions)) + self.optimization_item.append(OptimizeItem("Memory Operator Issues", self.desc, self.suggestions)) for optimization in self.optimization_item: result.add(OptimizeRecord(optimization)) diff --git a/profiler/msprof_analyze/advisor/analyzer/schedule/fusible_ops/fusible_operator_checker.py b/profiler/msprof_analyze/advisor/analyzer/schedule/fusible_ops/fusible_operator_checker.py index 3ab54b0dbb8729c8297606a471ce67e55715b2b8..2a408fecafc3437cc9c8d13cc21bfff9940d0012 100644 --- a/profiler/msprof_analyze/advisor/analyzer/schedule/fusible_ops/fusible_operator_checker.py +++ 
b/profiler/msprof_analyze/advisor/analyzer/schedule/fusible_ops/fusible_operator_checker.py @@ -24,7 +24,7 @@ from msprof_analyze.advisor.result.result import OptimizeResult from msprof_analyze.advisor.result.item import OptimizeItem, OptimizeRecord from msprof_analyze.prof_common.additional_args_manager import AdditionalArgsManager from msprof_analyze.prof_common.file_manager import FileManager -from msprof_analyze.advisor.utils.utils import convert_to_float_with_warning +from msprof_analyze.advisor.utils.utils import convert_to_float_with_warning, safe_division from msprof_analyze.advisor.display.html.priority_background_color import PriorityBackgroundColor logger = logging.getLogger() @@ -88,7 +88,7 @@ class FusibleOperatorChecker: @staticmethod def check_hccl(task: OpInfo): - return (task.task_type == "HCCL" or + return (task.task_type in ["COMMUNICATION", "HCCL"] or any(task.op_name.lower().startswith(item) for item in ["hcom", "lccl", "lcoc"])) @staticmethod @@ -143,7 +143,7 @@ class FusibleOperatorChecker: self.post_processing(result_dict) def check_sequence_ratio(self, detail: List): - return detail[self._TOTAL_TIME_INDEX] / self.step_duration > self.sequence_duration_threshold + return safe_division(detail[self._TOTAL_TIME_INDEX], self.step_duration) > self.sequence_duration_threshold def check_sequence_num(self, detail: List): return detail[self._COUNT_INDEX] > self.sequence_count_threshold @@ -203,9 +203,9 @@ class FusibleOperatorChecker: def compute_priority(self): sequence_total_time = sum(detail[self._TOTAL_TIME_INDEX] for detail in self.host_details + self.mte_details) - if sequence_total_time / self.step_duration > self._HIGH_PRIORITY: + if safe_division(sequence_total_time, self.step_duration) > self._HIGH_PRIORITY: return PriorityBackgroundColor.high - elif sequence_total_time / self.step_duration < self._LOW_PRIORITY: + elif safe_division(sequence_total_time, self.step_duration) < self._LOW_PRIORITY: return PriorityBackgroundColor.low else: 
return PriorityBackgroundColor.medium diff --git a/profiler/msprof_analyze/advisor/analyzer/schedule/fusion_ops/fusion_ops_analyzer.py b/profiler/msprof_analyze/advisor/analyzer/schedule/fusion_ops/fusion_ops_analyzer.py index 247088080b9a5e1b492889752405516a1a731f3e..508050ba33ff7cbc76d8b59ca894446a925723fd 100644 --- a/profiler/msprof_analyze/advisor/analyzer/schedule/fusion_ops/fusion_ops_analyzer.py +++ b/profiler/msprof_analyze/advisor/analyzer/schedule/fusion_ops/fusion_ops_analyzer.py @@ -284,7 +284,7 @@ class TimelineFusionOpsAnalyzer(BaseAnalyzer): op_pattern_list = op_rule.split(Constant.OP_SEP) format_op_pattern = "" for op_pattern in op_pattern_list: - matched_res = re.search(r'\((.*?)\)', op_pattern) + matched_res = re.search(r'\((\w+)\)', op_pattern) ops_index_range = (matched_res.start() + 1, matched_res.end() - 1) if matched_res else ( 0, len(op_pattern)) diff --git a/profiler/msprof_analyze/advisor/analyzer/schedule/syncbn/syncbn_checker.py b/profiler/msprof_analyze/advisor/analyzer/schedule/syncbn/syncbn_checker.py index 64bf40ee230fe2b5d4000deeda337fef124c9c35..f6deb5d9eaacb97bd2f2c12ec3f7ffa9817e55e6 100644 --- a/profiler/msprof_analyze/advisor/analyzer/schedule/syncbn/syncbn_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/schedule/syncbn/syncbn_checker.py @@ -54,7 +54,7 @@ class SyncBNChecker: if not self.syncbn_issues: return - self.optimization_item.append(OptimizeItem("SyncBatchNorm", self.desc, self.suggestions)) + self.optimization_item.append(OptimizeItem("SyncBatchNorm Issues", self.desc, self.suggestions)) for optimization in self.optimization_item: result.add(OptimizeRecord(optimization)) diff --git a/profiler/msprof_analyze/advisor/analyzer/schedule/synchronize_stream/synchronize_stream_checker.py b/profiler/msprof_analyze/advisor/analyzer/schedule/synchronize_stream/synchronize_stream_checker.py index 1b9c074304d3f3f7b5e8db904df5344197bbb312..5af880f941e21949c7e0dd2b1bfbadd033719b68 100644 --- 
a/profiler/msprof_analyze/advisor/analyzer/schedule/synchronize_stream/synchronize_stream_checker.py +++ b/profiler/msprof_analyze/advisor/analyzer/schedule/synchronize_stream/synchronize_stream_checker.py @@ -87,7 +87,7 @@ class SynchronizeStreamChecker(TimelineBaseChecker): if not self.synchronize_issues: return - self.optimization_item.append(OptimizeItem("SynchronizeStream", self.desc, self.suggestions)) + self.optimization_item.append(OptimizeItem("Synchronize Stream Issues", self.desc, self.suggestions)) for optimization in self.optimization_item: result.add(OptimizeRecord(optimization)) diff --git a/profiler/msprof_analyze/advisor/common/analyzer_scopes.py b/profiler/msprof_analyze/advisor/common/analyzer_scopes.py index 07ceef769440b39c93aeaaf15ded5ad99fc3f4b3..6a6261c7b75e721c0a9df75f35ecb3cd2aa1e487 100644 --- a/profiler/msprof_analyze/advisor/common/analyzer_scopes.py +++ b/profiler/msprof_analyze/advisor/common/analyzer_scopes.py @@ -1,5 +1,4 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. +# Copyright (c) Huawei Technologies Co., Ltd. 2024-2025. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -41,3 +40,4 @@ class SupportedScopes: FUSIBLE_OPERATOR_ANALYSIS = "fusible_operator_analysis" CONJECTURED_GC_ANALYSIS = "conjectured_analysis" COMPARISON = "comparison" + AICORE_PERFORMANCE_ANALYSIS = "ai_core_performance_analysis" diff --git a/profiler/msprof_analyze/advisor/common/enum_params_parser.py b/profiler/msprof_analyze/advisor/common/enum_params_parser.py index 7158af929f5711a32de835b426bbe91c2000a401..99881ee559e27cd5040b016ef3e663f6df6dac12 100644 --- a/profiler/msprof_analyze/advisor/common/enum_params_parser.py +++ b/profiler/msprof_analyze/advisor/common/enum_params_parser.py @@ -19,7 +19,7 @@ import logging import typing from msprof_analyze.advisor.common.timeline.event import AdvisorDict -from msprof_analyze.advisor.utils.utils import singleton +from msprof_analyze.prof_common.singleton import singleton from msprof_analyze.prof_common.file_manager import FileManager logger = logging.getLogger() diff --git a/profiler/msprof_analyze/advisor/common/profiling/msprof.py b/profiler/msprof_analyze/advisor/common/profiling/msprof.py index e4d537ddc78d282ef9db69b1741abfb6cde6d906..ed0da92e4eccc6f59267e51df6893c21a1bc6c12 100644 --- a/profiler/msprof_analyze/advisor/common/profiling/msprof.py +++ b/profiler/msprof_analyze/advisor/common/profiling/msprof.py @@ -114,6 +114,8 @@ class Msprof(ProfilingParser): max_time = 0.0 task_checker = TaskChecker() is_iter = False + self._tasks = [None] * len(self._raw_data) + task_index = 0 for item in self._raw_data: task = TaskInfo(item) if task.cat == "Iteration Time": @@ -130,7 +132,8 @@ class Msprof(ProfilingParser): if task_checker.is_sqe(task): continue - self._tasks.append(task) + self._tasks[task_index] = task + task_index += 1 self._parse_task(task) start_time = task.start_time @@ -146,6 +149,7 @@ class Msprof(ProfilingParser): self._iteration_time = dur self._max_time = max_time self._min_time = min_time + self._tasks = self._tasks[:task_index] if self._tasks: self._tasks.sort(key=lambda x: x.start_time) 
return True diff --git a/profiler/msprof_analyze/advisor/config/config.py b/profiler/msprof_analyze/advisor/config/config.py index 2ab735a28657caf1cf60fe357ef907c6b62e6811..80057b2a5d664c38e5bc428e5b70065df074f4c7 100644 --- a/profiler/msprof_analyze/advisor/config/config.py +++ b/profiler/msprof_analyze/advisor/config/config.py @@ -18,7 +18,7 @@ import logging import os from msprof_analyze.advisor.utils.utils import Timer -from msprof_analyze.advisor.utils.utils import singleton +from msprof_analyze.prof_common.singleton import singleton from msprof_analyze.prof_common.utils import SafeConfigReader logger = logging.getLogger() diff --git a/profiler/msprof_analyze/advisor/dataset/cluster/cluster_dataset.py b/profiler/msprof_analyze/advisor/dataset/cluster/cluster_dataset.py index b47f6d4518b45d84497fe4eac87cfe11d0fccb04..c976ac0f9de2acc1d59581375b8b306aab889934 100644 --- a/profiler/msprof_analyze/advisor/dataset/cluster/cluster_dataset.py +++ b/profiler/msprof_analyze/advisor/dataset/cluster/cluster_dataset.py @@ -12,24 +12,36 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+import json import logging import os import re - +from abc import ABC, abstractmethod from collections import defaultdict + +import numpy as np +import pandas as pd + +from msprof_analyze.prof_common.db_manager import DBManager + +from msprof_analyze.prof_common.database_service import DatabaseService + from msprof_analyze.advisor.dataset.dataset import Dataset -from msprof_analyze.advisor.utils.utils import singleton +from msprof_analyze.prof_common.singleton import singleton from msprof_analyze.prof_common.file_manager import FileManager from msprof_analyze.prof_common.constant import Constant from msprof_analyze.cluster_analyse.cluster_analysis import Interface from msprof_analyze.advisor.dataset.cluster.cluster_step_trace_time_bean import ClusterStepTraceTimeBean from msprof_analyze.advisor.dataset.cluster.hccl_collection import HcclInfo +from msprof_analyze.prof_exports.communicaion_info_export import (ClusterCommunicationInfoExport, + ClusterBandwidthInfoExport, + ClusterStepTraceTimeExport) logger = logging.getLogger() -class ClusterDataset(Dataset): +class ClusterDataset(ABC, Dataset): def __init__(self, collection_path, data: dict, **kwargs) -> None: super().__init__(collection_path, data, **kwargs) @@ -46,14 +58,23 @@ class ClusterDataset(Dataset): return True return False + def is_db_cluster_analysis_data_simplification(self): + db_path = os.path.join(self.output_path, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + return DBManager.check_tables_in_db(db_path, "CommunicationGroupMapping") + def cluster_analyze(self): if self.is_cluster_analysis_output_exist(): return parameter = { - Constant.COLLECTION_PATH: self.collection_path, - Constant.ANALYSIS_MODE: "all", + Constant.PROFILING_PATH: self.collection_path, + Constant.MODE: "all", Constant.CLUSTER_ANALYSIS_OUTPUT_PATH: self.output_path } + if self.data_type == Constant.DB: + parameter[Constant.DATA_SIMPLIFICATION] = True + parameter[Constant.PARALLEL_MODE] = 
Constant.CONCURRENT_MODE + parameter[Constant.EXPORT_TYPE] = Constant.DB logger.info("cluster analysis is in the process, please wait...") try: Interface(parameter).run() @@ -76,6 +97,26 @@ class ClusterDataset(Dataset): data = FileManager.read_json_file(json_path) return data + def load_db_data(self, table): + db_path = os.path.join(self.output_path, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + database = DatabaseService(db_path=db_path, step_range={}) + database.add_table_for_query(table) + res = database.query_data() + return res.get(table, None) + + @abstractmethod + def parse_from_text(self): + pass + + @abstractmethod + def parse_from_db(self): + pass + + def _parse(self): + self.cluster_analyze() + return self.parse_from_db() if self.data_type == Constant.DB else self.parse_from_text() + @singleton class ClusterStepTraceTimeDataset(ClusterDataset): @@ -87,7 +128,7 @@ class ClusterStepTraceTimeDataset(ClusterDataset): self._stages = [] super().__init__(collection_path, data, **kwargs) - def format_data(self, step_data: list): + def format_text_data(self, step_data: list): step_dict = defaultdict(lambda: [0, 0, 0]) for step_bean in step_data: if step_bean.type == self.RANK: @@ -110,24 +151,53 @@ class ClusterStepTraceTimeDataset(ClusterDataset): self._stages.append(stage) return step_dict + def format_db_data(self, step_df): + if step_df is None: + return None + # process stage info + self._stages = (step_df[step_df['type'] == 'stage']['index'].dropna() + .apply(lambda x: sorted(list(map(int, re.findall(r'\d+', x))))) + .tolist()) + # process rank info + rank_df = step_df[step_df['type'] == 'rank'] + rank_df['step'] = rank_df['step'].fillna(Constant.DEFAULT_STEP) + rank_df["step_rank"] = rank_df.apply(lambda row: f"{row['step']}_{row['index']}", axis=1) + step_dict = (rank_df.set_index('step_rank')[['computing', 'communication_not_overlapped', 'free']]. 
+ apply(list, axis=1).to_dict()) + return step_dict + + def get_data(self): return self._step_dict def get_stages(self): return sorted(self._stages) - def _parse(self): - self.cluster_analyze() + def parse_from_text(self): try: step_data = self.load_csv_data(Constant.CLUSTER_STEP_TIME_CSV, ClusterStepTraceTimeBean) except RuntimeError as e: - logger.error("捕获到异常:%s", e) + logger.error("Exception when run load_csv_data:%s", e) + self._step_dict = None + return False + self._step_dict = self.format_text_data(step_data) + return True + + def parse_from_db(self): + db_path = os.path.join(self.output_path, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + export = ClusterStepTraceTimeExport(db_path) + df = export.read_export_db() + try: + self._step_dict = self.format_db_data(df) + except RuntimeError as e: + logger.error("Exception when run format_db_data:%s", e) self._step_dict = None return False - self._step_dict = self.format_data(step_data) return True + @singleton class ClusterCommunicationDataset(ClusterDataset): RDMA_TIME_MS = "RDMA time(ms)" @@ -175,7 +245,7 @@ class ClusterCommunicationDataset(ClusterDataset): op_name = op.split("@")[0] for rank_id, rank_dict in op_dict.items(): try: - hccl_info = HcclInfo(group, step, rank_id, op, rank_dict) + hccl_info = HcclInfo.construct_instance_from_dict(group, step, rank_id, op, rank_dict) if self.hccl_dict[group].get(op_name) is None: self.hccl_dict[group].setdefault(op_name, defaultdict(list)) if self.hccl_dict[group][op_name].get(step) is None: @@ -213,13 +283,55 @@ class ClusterCommunicationDataset(ClusterDataset): def get_data(self): return self.rank_bw_dict - def _parse(self): - self.cluster_analyze() + def parse_from_text(self): try: communication_json = self.load_json_data(Constant.CLUSTER_COMM_JSON) except RuntimeError as e: - logger.error("捕获到异常:%s", e) + logger.error("Exception when run load_json_data:%s", e) self.rank_bw_dict = None return False 
self.process(communication_json) return True + + def parse_from_db(self): + data_simplification = self.is_db_cluster_analysis_data_simplification() + db_path = os.path.join(self.output_path, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + + self.process_bandwidth_db(db_path, data_simplification) + self.process_hccl_info_db(db_path, data_simplification) + + def process_hccl_info_db(self, db_path, data_simplification): + export = ClusterCommunicationInfoExport(db_path, data_simplification) + df = export.read_export_db() + df['sdma_dict'] = df['sdma_dict'].apply(lambda x: json.loads(x) if pd.notna(x) else {}) + df['rdma_dict'] = df['rdma_dict'].apply(lambda x: json.loads(x) if pd.notna(x) else {}) + for row in df.itertuples(index=False): + group, op_name, step = row.rank_set, row.hccl_op_name, row.step + hccl_info = HcclInfo(group, step, row.rank_id, op_name, row.start_timestamp, + row.elapsed_time, row.sdma_dict, row.rdma_dict) + self.hccl_dict[group][op_name][step].append(hccl_info) + + def process_bandwidth_db(self, db_path, data_simplification): + export = ClusterBandwidthInfoExport(db_path, data_simplification) + df = export.read_export_db() + processed_steps = df['step'].astype(str).str.lower().str.lstrip('step').replace('', str(Constant.DEFAULT_STEP)) + df['step_rank'] = processed_steps + '_' + df['rank_id'].astype(str) + bandwidth_df = df.groupby(['band_type', 'step_rank']).agg({ + 'transit_time': 'sum', + 'transit_size': 'sum' + }).reset_index() + bandwidth_df['bandwidth'] = np.where(bandwidth_df['transit_time'] > Constant.EPS, + bandwidth_df['transit_size'] / bandwidth_df['transit_time'], + 0).round(4) + for row in bandwidth_df.itertuples(index=False): + if row.band_type == self.SDMA: + self.rank_bw_dict[row.step_rank][self.SDMA_SIZE_MB] = row.transit_size + self.rank_bw_dict[row.step_rank][self.SDMA_TIME_MS] = row.transit_time + self.rank_bw_dict[row.step_rank][self.SDMA_TIME_MS] = row.transit_time + 
self.rank_bw_dict[row.step_rank][self.SDMA_BANDWIDTH] = row.bandwidth + elif row.band_type == self.RDMA: + self.rank_bw_dict[row.step_rank][self.RDMA_SIZE_MB] = row.transit_size + self.rank_bw_dict[row.step_rank][self.RDMA_TIME_MS] = row.transit_time + self.rank_bw_dict[row.step_rank][self.RDMA_BANDWIDTH] = row.bandwidth + diff --git a/profiler/msprof_analyze/advisor/dataset/cluster/hccl_collection.py b/profiler/msprof_analyze/advisor/dataset/cluster/hccl_collection.py index 6a7156e496db05da19f30d7c049794046d2ceebb..4d90e49b62430102d4f55509aced011afb14ff7e 100644 --- a/profiler/msprof_analyze/advisor/dataset/cluster/hccl_collection.py +++ b/profiler/msprof_analyze/advisor/dataset/cluster/hccl_collection.py @@ -17,19 +17,22 @@ hccl info """ import logging +import pandas as pd + logger = logging.getLogger() -class HcclInfo(): - def __init__(self, group: str, step: str, rank: str, op: str, rank_dict: dict) -> None: +class HcclInfo: + def __init__(self, group: str, step: str, rank: str, op_name: str, + start_time: float, elapse_time: float, sdma_info: dict, rdma_info: dict): self._group = group self._step = step self._rank = rank - self._name = op.split("@")[0] - self._ts = self.get_communication_time_info(rank_dict, "Start Timestamp(us)") - self._elapse_time = self.get_communication_time_info(rank_dict, "Elapse Time(ms)") - self._sdma_info = self.get_communication_info(rank_dict, "SDMA") - self._rdma_info = self.get_communication_info(rank_dict, "RDMA") + self._name = op_name + self._ts = start_time + self._elapse_time = elapse_time + self._sdma_info = sdma_info + self._rdma_info = rdma_info @property def group(self): @@ -73,6 +76,15 @@ class HcclInfo(): communication_time_info = rank_dict.get('Communication Time Info', dict()) return communication_time_info.get(name, 0) + @classmethod + def construct_instance_from_dict(cls, group: str, step: str, rank: str, op: str, rank_dict: dict): + return cls(group, step, rank, op.split("@")[0], + 
HcclInfo.get_communication_time_info(rank_dict, "Start Timestamp(us)"), + HcclInfo.get_communication_time_info(rank_dict, "Elapse Time(ms)"), + HcclInfo.get_communication_info(rank_dict, "SDMA"), + HcclInfo.get_communication_info(rank_dict, "RDMA") + ) + def get_rdma_transmit_time(self): return self.rdma_info.get('Transit Time(ms)', 0) @@ -81,3 +93,4 @@ class HcclInfo(): def get_rdma_bandwidth(self): return self.rdma_info.get('Bandwidth(GB/s)', 0) + diff --git a/profiler/msprof_analyze/advisor/dataset/communication/communication_dataset.py b/profiler/msprof_analyze/advisor/dataset/communication/communication_dataset.py index 44efddccffbc4a0759f62bd5b53d732617f9f887..399d9c11f0bf22e1b05f632c7e3f60ce0069a3cb 100644 --- a/profiler/msprof_analyze/advisor/dataset/communication/communication_dataset.py +++ b/profiler/msprof_analyze/advisor/dataset/communication/communication_dataset.py @@ -12,38 +12,40 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+import json import logging import os from collections import defaultdict -from msprof_analyze.advisor.utils.utils import singleton + +import numpy as np +import pandas as pd +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant + +from msprof_analyze.prof_common.singleton import singleton from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.file_manager import FileManager from msprof_analyze.advisor.dataset.cluster.hccl_collection import HcclInfo from msprof_analyze.advisor.utils.utils import CheckPathAccess +from msprof_analyze.advisor.dataset.dataset import Dataset +from msprof_analyze.prof_exports.communicaion_info_export import CommunicationInfoExport logger = logging.getLogger() @singleton -class CommunicationDataset: +class CommunicationDataset(Dataset): RANK = "rank" hccl_dict = defaultdict(list) def __init__(self, collection_path, data: dict, **kwargs) -> None: - self.timeline_dir = collection_path - if not self.timeline_dir.endswith("ascend_pt") and not self.timeline_dir.endswith("ascend_ms"): + self.collection_path = collection_path + if not collection_path.endswith("ascend_pt") and not collection_path.endswith("ascend_ms"): return - self.timeline_data_list = self.get_file_path_from_directory( - self.timeline_dir, - lambda file: file.endswith(Constant.COMMUNICATION_JSON) - ) + self.is_pta = collection_path.endswith("ascend_pt") + self.communication_file = "" self.hccl_dict = defaultdict(list) self.step = kwargs.get("step") - if self.parse(): - key = self.get_key() - if key not in data: - data[key] = [] - data[key].append(self) + super().__init__(collection_path, data, **kwargs) @staticmethod def load_json_data(json_path): @@ -84,20 +86,37 @@ class CommunicationDataset: """ return cls.__module__.rsplit('.', maxsplit=1)[-1] - def parse(self): - if len(self.timeline_data_list) == 0: - logger.warning("Please ensure communication.json in %s, skip timeline analysis.", self.timeline_dir) + def 
get_communication_file_list(self): + file_name = "" + if self.data_type == Constant.TEXT: + file_name = Constant.COMMUNICATION_JSON + elif self.data_type == Constant.DB: + file_name = Constant.DB_COMMUNICATION_ANALYZER if self.collection_path.endswith("ascend_pt") \ + else Constant.DB_MS_COMMUNICATION_ANALYZER + if not file_name: + logger.error("Invalid collection path, can not get communication file name pattern") + return False + + communication_data_list = self.get_file_path_from_directory( + self.collection_path, + lambda file: file.endswith(file_name) + ) + if len(communication_data_list) == 0: + logger.warning(f"Please ensure {file_name} in {self.collection_path}, skip timeline analysis.") return False + if len(communication_data_list) > 1: + logger.warning(f"Found multiple {file_name} in {self.collection_path}, " + f"load the file of device 0 for analysis.") + self.communication_file = sorted(communication_data_list)[0] + return True - if len(self.timeline_data_list) > 1: - logger.warning("Found multiple communication.json in %s, load the file of device 0 for analysis.", - self.timeline_dir) - json_data = self.load_json_data(sorted(self.timeline_data_list)[0]) - self.process(json_data) + def parse_from_text(self): + json_data = self.load_json_data(self.communication_file) + self.process_communication_json(json_data) return True - def process(self, communication_json: dict): + def process_communication_json(self, communication_json: dict): for step, step_dict in communication_json.items(): for group, group_dict in step_dict.items(): for op, op_dict in group_dict.items(): @@ -105,10 +124,32 @@ class CommunicationDataset: def process_hccl_info(self, group, step, op, op_dict): try: - hccl_info = HcclInfo(group, step, "None", op, op_dict) + hccl_info = HcclInfo.construct_instance_from_dict(group, step, "None", op, op_dict) if self.hccl_dict.get(step) is None: self.hccl_dict.setdefault(step, list()) self.hccl_dict[step].append(hccl_info) except ValueError as e: 
msg = "[ERROR] Cluster_communication.json has invalid structure." raise ValueError(msg) from e + + def parse_from_db(self): + export = CommunicationInfoExport(self.communication_file, self.is_pta) + df = export.read_export_db() + if TableConstant.STEP not in df.columns: + df[TableConstant.STEP] = 'step' + if TableConstant.TYPE not in df.columns: + is_p2p = df[TableConstant.HCCL_OP_NAME].str.lower().str.contains('send|receive|recv', regex=True) + df[Constant.TYPE] = np.where(is_p2p, Constant.P2P, Constant.COLLECTIVE) + + df['sdma_dict'] = df['sdma_dict'].apply(lambda x: json.loads(x) if pd.notna(x) else {}) + df['rdma_dict'] = df['rdma_dict'].apply(lambda x: json.loads(x) if pd.notna(x) else {}) + for row in df.itertuples(index=False): + self.hccl_dict[row.step].append(HcclInfo(row.type, row.step, "None", row.hccl_op_name, + row.start_timestamp, row.elapse_time, + row.sdma_dict, row.rdma_dict)) + return True + + def _parse(self): + if not self.get_communication_file_list(): + return False + return self.parse_from_db() if self.data_type == Constant.DB else self.parse_from_text() diff --git a/profiler/msprof_analyze/advisor/dataset/communication/hccl_detail_dataset.py b/profiler/msprof_analyze/advisor/dataset/communication/hccl_detail_dataset.py index fac5603b99bfd4956503fc76d6355edb8da54941..cf693b1a12fd6e0abe9b87868032706545fec5a9 100644 --- a/profiler/msprof_analyze/advisor/dataset/communication/hccl_detail_dataset.py +++ b/profiler/msprof_analyze/advisor/dataset/communication/hccl_detail_dataset.py @@ -14,7 +14,7 @@ # limitations under the License. 
import logging from typing import List -from msprof_analyze.advisor.utils.utils import singleton +from msprof_analyze.prof_common.singleton import singleton from msprof_analyze.advisor.common.profiling.msprof import Msprof from msprof_analyze.advisor.dataset.profiling.info_collection import TaskInfo, HcclOp, HcclTask @@ -39,7 +39,8 @@ class HcclDetailDataset: @staticmethod def _get_hccl_pid(tasks: List[TaskInfo]): for task in tasks: - if task.name == "process_name" and hasattr(task, "args") and task.args.get("name", None) == "HCCL": + if task.name == "process_name" and hasattr(task, "args") \ + and task.args.get("name", None) in ["Communication", "HCCL"]: return task.pid return -1 diff --git a/profiler/msprof_analyze/advisor/dataset/dataset.py b/profiler/msprof_analyze/advisor/dataset/dataset.py index 3cc669480db99d26076fd76caabb50ac6fe59bdb..92cd30a829dfd5c0304556a1e35061966df52203 100644 --- a/profiler/msprof_analyze/advisor/dataset/dataset.py +++ b/profiler/msprof_analyze/advisor/dataset/dataset.py @@ -18,6 +18,11 @@ dataset module """ import logging import os +import re + +from msprof_analyze.prof_common.constant import Constant + +from msprof_analyze.prof_common.file_manager import FileManager from msprof_analyze.advisor.config.config import Config @@ -35,6 +40,7 @@ class Dataset: data = {} self.collection_path = os.path.abspath(os.path.join(Config().work_path, collection_path)) self.output_path = kwargs.get("output_path", None) + self.data_type = self.get_data_type() if not self.output_path: self.output_path = self.collection_path logger.debug("init %s with %s", self.__class__.__name__, self.collection_path) @@ -55,3 +61,19 @@ class Dataset: :return: key """ return cls.__name__.rsplit('.', maxsplit=1)[-1] + + def get_data_type(self): + pytorch_pattern = re.compile(r'ascend_pytorch_profiler_\d+\.db$') + mindspore_pattern = re.compile(r'ascend_mindspore_profiler_\d+\.db$') + + # Recursively search for the ASCEND_PROFILER_OUTPUT folder + for root, dirs, _ in os.walk(self.collection_path): + 
if Constant.ASCEND_PROFILER_OUTPUT in dirs: + profiler_dir = os.path.join(root, Constant.ASCEND_PROFILER_OUTPUT) + + # Check the files in the profiler directory + for file in os.listdir(profiler_dir): + if pytorch_pattern.match(file) or mindspore_pattern.match(file): + return Constant.DB # Return as soon as either kind of .db file is found + + return Constant.TEXT diff --git a/profiler/msprof_analyze/advisor/dataset/profiling/profiling_dataset.py b/profiler/msprof_analyze/advisor/dataset/profiling/profiling_dataset.py index 7981e4140f03eda4a07392f700b5432909c7f497..a9ead2b7e06e712703f5d0603e6a60cfc91efe3e 100644 --- a/profiler/msprof_analyze/advisor/dataset/profiling/profiling_dataset.py +++ b/profiler/msprof_analyze/advisor/dataset/profiling/profiling_dataset.py @@ -26,12 +26,14 @@ from msprof_analyze.advisor.common.enum_params_parser import EnumParamsParser from msprof_analyze.advisor.dataset.dataset import Dataset from msprof_analyze.advisor.dataset.profiling.device_info import DeviceInfoParser from msprof_analyze.advisor.utils.utils import join_prof_path +from msprof_analyze.advisor.utils.utils import singleton from msprof_analyze.prof_common.file_manager import FileManager logger = logging.getLogger() +@singleton class ProfilingDataset(Dataset): prof_type = "" diff --git a/profiler/msprof_analyze/advisor/display/html/render.py b/profiler/msprof_analyze/advisor/display/html/render.py index 09211220677a0d7d50e8af52a27f068a85d0859b..57a074420ba3e2baba02ff4a08639e8d88509fea 100644 --- a/profiler/msprof_analyze/advisor/display/html/render.py +++ b/profiler/msprof_analyze/advisor/display/html/render.py @@ -21,7 +21,8 @@ from collections import defaultdict, OrderedDict from jinja2 import Environment, FileSystemLoader from msprof_analyze.prof_common.constant import Constant from msprof_analyze.advisor.config.config import Config -from msprof_analyze.advisor.utils.utils import singleton, safe_write +from msprof_analyze.advisor.utils.utils import safe_write +from msprof_analyze.prof_common.singleton import singleton logger = 
logging.getLogger() diff --git a/profiler/msprof_analyze/advisor/display/html/templates/ai_core_frequency.html b/profiler/msprof_analyze/advisor/display/html/templates/ai_core_frequency.html index 405460ac9616740613bc337d705d617cc9de9287..87b2e30c997718e8fc18218f21be135f62be3a7f 100644 --- a/profiler/msprof_analyze/advisor/display/html/templates/ai_core_frequency.html +++ b/profiler/msprof_analyze/advisor/display/html/templates/ai_core_frequency.html @@ -1,6 +1,6 @@ {% if data|length > 0 %}
    -

    AI CORE Frequency Issues

    +

    AI Core Frequency Issues

    {% if rank is not none %} Analysis of rank {{ rank|safe }}. diff --git a/profiler/msprof_analyze/advisor/display/html/templates/ai_core_performance.html b/profiler/msprof_analyze/advisor/display/html/templates/ai_core_performance.html new file mode 100644 index 0000000000000000000000000000000000000000..60faaf020c7896e2b6978f6e9a87a207d163ec23 --- /dev/null +++ b/profiler/msprof_analyze/advisor/display/html/templates/ai_core_performance.html @@ -0,0 +1,204 @@ +{% if format_result|length > 0 %} + +
    +

    AI Core Performance Analysis

    +
    + {% if language == "cn" %} + {% set title_ns = namespace(type='类别', desc='描述及建议', opti_set='性能优化算子集合', bound_set='bound算子集合', affinity_set='不亲和算子集合', + opti_refer=' 参考性能优化空间', bound_refer=' bound类型为', affinity_refer=' 不亲和类型为', title_desc='算子相关分析,参考如下: ') %} + {% else %} + {% set title_ns = namespace(type='Type', desc='Description and Suggestion', opti_set='set of performance optimization operators', + bound_set='set of bound operators', affinity_set='set of unaffine operators', opti_refer=' refer to Performance Optimization Space', + bound_refer=' bound type', affinity_refer=' type of disaffinity', title_desc=' Operator related analysis, referenced below: ') %} + {% endif %} + {% if format_result.cube[0]|length + format_result.cube[1]|length + format_result.cube[2]|length > 0 %} + Cube{{ title_ns.title_desc }} +
    +
    + + + + + {% set opti_ns = namespace(total_opti='') %} + {% for opti in format_result.cube[0] %} + {% if not loop.first %} + {% set opti_ns.total_opti = opti_ns.total_opti ~ "" %} + {% else %} + {% set opti_ns.total_opti = "" %} + {% endif %} + {% endfor %} + {% if opti_ns.total_opti|length > 0 %} + + + + + {% endif %} + {% set bound_ns = namespace(total_bound='') %} + {% for bound in format_result.cube[1] %} + {% if not loop.first %} + {% set bound_ns.total_bound = bound_ns.total_bound ~ "" %} + {% else %} + {% set bound_ns.total_bound = "" %} + {% endif %} + {% endfor %} + {% if bound_ns.total_bound|length > 0 %} + + + + + {% endif %} + {% set affinity_ns = namespace(total_affinity='') %} + {% for affinity in format_result.cube[2] %} + {% if not loop.first %} + {% set affinity_ns.total_affinity = affinity_ns.total_affinity ~ "" %} + {% else %} + {% set affinity_ns.total_affinity = "" %} + {% endif %} + {% endfor %} + {% if affinity_ns.total_affinity|length > 0 %} + + + + + {% endif %} +
    {{ title_ns.type }}{{ title_ns.desc }}
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    {{ title_ns.opti_set }} + + + {{ opti_ns.total_opti | safe }} +
    nameshapedtype{{ title_ns.opti_refer }}
    +
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    {{ title_ns.bound_set }} + + + {{ bound_ns.total_bound | safe }} +
    nameshapedtype{{ title_ns.bound_refer }}
    +
    " ~ affinity.op_name ~ "" ~ affinity.shape ~ "" ~ affinity.dtype ~ "" ~ affinity.suggestion ~ "
    " ~ affinity.op_name ~ "" ~ affinity.shape ~ "" ~ affinity.dtype ~ "" ~ affinity.suggestion ~ "
    {{ title_ns.affinity_set }} + + + {{ affinity_ns.total_affinity | safe }} +
    nameshapedtype{{ title_ns.affinity_refer }}
    +
    + {% endif %} + + {% if format_result.fa[0]|length + format_result.fa[1]|length + format_result.fa[2]|length > 0 %} + FA{{ title_ns.title_desc }} +
    + + + + + + {% set opti_ns = namespace(total_opti='') %} + {% for opti in format_result.fa[0] %} + {% if not loop.first %} + {% set opti_ns.total_opti = opti_ns.total_opti ~ "" %} + {% else %} + {% set opti_ns.total_opti = "" %} + {% endif %} + {% endfor %} + {% if opti_ns.total_opti|length > 0 %} + + + + + {% endif %} + {% set bound_ns = namespace(total_bound='') %} + {% for bound in format_result.fa[1] %} + {% if not loop.first %} + {% set bound_ns.total_bound = bound_ns.total_bound ~ "" %} + {% else %} + {% set bound_ns.total_bound = "" %} + {% endif %} + {% endfor %} + {% if bound_ns.total_bound|length > 0 %} + + + + + {% endif %} + {% set affinity_ns = namespace(total_affinity='') %} + {% for affinity in format_result.fa[2] %} + {% if not loop.first %} + {% set affinity_ns.total_affinity = affinity_ns.total_affinity ~ "" %} + {% else %} + {% set affinity_ns.total_affinity = "" %} + {% endif %} + {% endfor %} + {% if affinity_ns.total_affinity|length > 0 %} + + + + + {% endif %} +
    {{ title_ns.type }}{{ title_ns.desc }}
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    {{ title_ns.opti_set }} + + + {{ opti_ns.total_opti | safe }} +
    nameshapedtype{{ title_ns.opti_refer }}
    +
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    {{ title_ns.bound_set }} + + + {{ bound_ns.total_bound | safe }} +
    nameshapedtype{{ title_ns.bound_refer }}
    +
    " ~ affinity.op_name ~ "" ~ affinity.shape ~ "" ~ affinity.dtype ~ "" ~ affinity.suggestion ~ "
    " ~ affinity.op_name ~ "" ~ affinity.shape ~ "" ~ affinity.dtype ~ "" ~ affinity.suggestion ~ "
    {{ title_ns.affinity_set }} + + + {{ affinity_ns.total_affinity | safe }} +
    nameshapedtype{{ title_ns.affinity_refer }}
    +
    + {% endif %} + + {% if format_result.vector[0]|length + format_result.vector[1]|length > 0 %} + Vector{{ title_ns.title_desc }} +
    + + + + + + {% set opti_ns = namespace(total_opti='') %} + {% for opti in format_result.vector[0] %} + {% if not loop.first %} + {% set opti_ns.total_opti = opti_ns.total_opti ~ "" %} + {% else %} + {% set opti_ns.total_opti = "" %} + {% endif %} + {% endfor %} + {% if opti_ns.total_opti|length > 0 %} + + + + + {% endif %} + {% set bound_ns = namespace(total_bound='') %} + {% for bound in format_result.vector[1] %} + {% if not loop.first %} + {% set bound_ns.total_bound = bound_ns.total_bound ~ "" %} + {% else %} + {% set bound_ns.total_bound = "" %} + {% endif %} + {% endfor %} + {% if bound_ns.total_bound|length > 0 %} + + + + + {% endif %} +
    {{ title_ns.type }}{{ title_ns.desc }}
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    " ~ opti.op_name ~ "" ~ opti.shape ~ "" ~ opti.dtype ~ "" ~ opti.optimization ~ "%
    {{ title_ns.opti_set }} + + + {{ opti_ns.total_opti | safe }} +
    nameshapedtype{{ title_ns.opti_refer }}
    +
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    " ~ bound.op_name ~ "" ~ bound.shape ~ "" ~ bound.dtype ~ "" ~ bound.bound ~ "
    {{ title_ns.bound_set }} + + + {{ bound_ns.total_bound | safe }} +
    nameshapedtype{{ title_ns.bound_refer }}
    +
    + {% endif %} +
    +
    +{% endif %} \ No newline at end of file diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/ai_core_freq_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/ai_core_freq_prompt.py index 7737a372ad29d6ac35bfc1d61e75ce747395f8af..17560d39878e2ed2ce9b4ef6fb13d630c6f95db5 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/ai_core_freq_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/ai_core_freq_prompt.py @@ -15,7 +15,7 @@ class AICoreFreqPrompt(object): RANK_ID = "RANK {} " - PROBLEM = "AI Core Frequency" + PROBLEM = "AI Core Frequency Issues" DESCRIPTION = "{} operators are found during frequency reduction, and the reduction " \ "ratio is larger than {}." RANK_DESCRIPTION = "For rank {}, " diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/block_dim_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/block_dim_prompt.py index 410fcdd41cf569481bbc4c64750e9457bfd5191c..a7e73ddf04869da007b569b3cf93f4d812f06033 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/block_dim_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/block_dim_prompt.py @@ -14,7 +14,7 @@ # limitations under the License. 
class BlockDimPrompt(object): - PROBLEM = "block dim" + PROBLEM = "Block Dim Issues" DESCRIPTION = "some operator does not make full use of {} ai core" AIV_NUM_DESCRIPTION = " or {} ai vector core" TOP_DURATION_OP_DESCRIPTION = ";\n Top-{} operator of task duration are as follows:\n" diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/dynamic_shape_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/dynamic_shape_prompt.py index fab738319b8277b618bece649541378750454c35..2e475c0769dd9bf91b28ccacc21711c93aa43074 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/dynamic_shape_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/dynamic_shape_prompt.py @@ -15,7 +15,7 @@ class DynamicShapePrompt(object): RANK_ID = "RANK {} " - PROBLEM = "Dynamic Shape Operator" + PROBLEM = "Operator Dynamic Shape Issues" DESCRIPTION = "Found all operators are dynamic shape" ENABLE_COMPILED_SUGGESTION = \ "Please place the following code at the entrance of the python script to disable jit compile.\n " \ diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/environment_variable_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/environment_variable_prompt.py index c099ea99196182f92435644cafee1bb459f4272d..a826949471e5912579bc4df646251f522a7d3cc3 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/environment_variable_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/environment_variable_prompt.py @@ -14,6 +14,6 @@ # limitations under the License. 
class EnvironmentVariablePrompt(object): - PROBLEM = "Environment Variable Analysis" + PROBLEM = "Environment Variable Issues" DESCRIPTION = "Describe and suggest the optimal environment variable settings" SUGGESTION = "Please set the optimal environment variable" diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/graph_fusion_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/graph_fusion_prompt.py index e9e3edd3b8b08726fe0390a97765a0290e65e413..f48c9988d6d23be4273b12ca79bb060d8290cd81 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/graph_fusion_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/graph_fusion_prompt.py @@ -14,6 +14,6 @@ # limitations under the License. class GraphFusionPrompt(object): - PROBLEM = "Fusion Issue" + PROBLEM = "Fusion Issues" DESCRIPTION = "Found {} fusion issues" SUGGESTION = "Check fusion issues detail in mstt_advisor*.html" diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/op_dispatch_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/op_dispatch_prompt.py index 4f791154c1a8f4d7171885e1481fd128077632e4..20a2d6dbbc146a9382ced5d1652d929bcbe2001f 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/op_dispatch_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/op_dispatch_prompt.py @@ -14,7 +14,7 @@ # limitations under the License. class OpDispatchPrompt(object): - PROBLEM = "Operator Dispatch" + PROBLEM = "Operator Dispatch Issues" DESCRIPTION = "Found {} operator compile issues." SUGGESTION = "Please place the following code at the entrance of the python script to disable jit compile. 
" \ "Code: `torch_npu.npu.set_compile_mode(jit_compile=False); " \ diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/operator_bound_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/operator_bound_prompt.py index f4f29f25d1be7adb2203ea08375ccafe423dbdbe..434b85e9693591638da2410e67539ccde7a0b4aa 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/operator_bound_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/operator_bound_prompt.py @@ -14,6 +14,6 @@ # limitations under the License. class OperatorBoundPrompt(object): - PROBLEM = "operator no bound" + PROBLEM = "Operator No Bound Issues" DESCRIPTION = "There is no mte, cube, vector, scalar ratio is more than {},\n" \ "Top task duration operators need to be tuned are as follows: \n" diff --git a/profiler/msprof_analyze/advisor/display/prompt/en/timeline_fusion_ops_prompt.py b/profiler/msprof_analyze/advisor/display/prompt/en/timeline_fusion_ops_prompt.py index a52dec4caaf852ebf5dfd0f5cb40ad9d8542cc71..97da3c4a2789a637109b52876deacfb197358bae 100644 --- a/profiler/msprof_analyze/advisor/display/prompt/en/timeline_fusion_ops_prompt.py +++ b/profiler/msprof_analyze/advisor/display/prompt/en/timeline_fusion_ops_prompt.py @@ -14,7 +14,7 @@ # limitations under the License. 
class TimelineFusionOpsPrompt(object): - PROBLEM = "Affinity Apis" + PROBLEM = "Affinity API Issues" DESCRIPTION = "On the runtime env cann-{} and torch-{}, found {} apis to be replaced" SUGGESTION = "Please replace training api according to sub table 'Affinity training api'" EMPTY_STACK_DESCRIPTION = ", but with no stack" diff --git a/profiler/msprof_analyze/advisor/img/AI_Core_Performance_analysis.png b/profiler/msprof_analyze/advisor/img/AI_Core_Performance_analysis.png new file mode 100644 index 0000000000000000000000000000000000000000..37708366c990fb899a9b4a846dc81fa43d5e1d43 Binary files /dev/null and b/profiler/msprof_analyze/advisor/img/AI_Core_Performance_analysis.png differ diff --git a/profiler/msprof_analyze/advisor/img/Fusible_Operator_Analysis.png b/profiler/msprof_analyze/advisor/img/Fusible_Operator_Analysis.png new file mode 100644 index 0000000000000000000000000000000000000000..332b9ff838130e0daa625691aef88059dd31918d Binary files /dev/null and b/profiler/msprof_analyze/advisor/img/Fusible_Operator_Analysis.png differ diff --git a/profiler/msprof_analyze/advisor/interface/interface.py b/profiler/msprof_analyze/advisor/interface/interface.py index b3afefee57c8c62030af17130f79413238588f8f..60459da0795051cc5001c7475ec8fe1d1ee32b40 100644 --- a/profiler/msprof_analyze/advisor/interface/interface.py +++ b/profiler/msprof_analyze/advisor/interface/interface.py @@ -1,5 +1,4 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. +# Copyright (c) Huawei Technologies Co., Ltd. 2024-2025. All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
@@ -44,6 +43,8 @@ from msprof_analyze.advisor.analyzer.schedule.gc.gc_analyzer import GcAnalyzer from msprof_analyze.advisor.analyzer.schedule.conjectured_gc.conjectured_gc_analyzer import ConjecturedGcAnalyzer from msprof_analyze.advisor.analyzer.comparison.comparison_analyzer import ComparisonAnalyzer from msprof_analyze.advisor.analyzer.schedule.fusible_ops.fusible_operator_analyzer import FusibleOperatorAnalyzer +from msprof_analyze.advisor.analyzer.computation.ai_core_performance.ai_core_performance_analyzer import \ + AICorePerformanceAnalyzer logger = logging.getLogger() @@ -74,7 +75,8 @@ class Interface: SupportedScopes.OPERATOR_NO_BOUND_ANALYSIS: OperatorBoundAnalyzer, SupportedScopes.BLOCK_DIM_ANALYSIS: BlockDimAnalyzer, SupportedScopes.GRAPH: FusionOPAnalyzer, - SupportedScopes.FREQ_ANALYSIS: AICoreFreqAnalyzer + SupportedScopes.FREQ_ANALYSIS: AICoreFreqAnalyzer, + SupportedScopes.AICORE_PERFORMANCE_ANALYSIS: AICorePerformanceAnalyzer }), COMMUNICATION: OrderedDict({SupportedScopes.PACKET: PacketAnalyzer, SupportedScopes.COMMUNICATION_RETRANSMISSION_DETECTION: RDMARetransmissionAnalyzer, @@ -137,3 +139,4 @@ class Interface: if __name__ == "__main__": Interface() + diff --git a/profiler/msprof_analyze/advisor/result/result.py b/profiler/msprof_analyze/advisor/result/result.py index 422fd43ec4a3c5652aa5a3a1271cbba4c8bdd50e..3ab352b7bd10495ca6d3a52acf5fde5693967266 100644 --- a/profiler/msprof_analyze/advisor/result/result.py +++ b/profiler/msprof_analyze/advisor/result/result.py @@ -23,9 +23,10 @@ from prettytable import ALL, PrettyTable from msprof_analyze.prof_common.additional_args_manager import AdditionalArgsManager from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.advisor.utils.utils import singleton, logger +from msprof_analyze.advisor.utils.utils import logger from msprof_analyze.advisor.config.config import Config from msprof_analyze.advisor.utils.file import check_dir_writable +from msprof_analyze.prof_common.singleton 
import singleton from msprof_analyze.prof_common.file_manager import FileManager diff --git a/profiler/msprof_analyze/advisor/rules/cn/aicore_performance.yaml b/profiler/msprof_analyze/advisor/rules/cn/aicore_performance.yaml new file mode 100644 index 0000000000000000000000000000000000000000..dcdc3e188f4684c4a80e5a3e064878fb823e3b70 --- /dev/null +++ b/profiler/msprof_analyze/advisor/rules/cn/aicore_performance.yaml @@ -0,0 +1,48 @@ +cube_problem: "Cube算子性能分析" +fa_problem: "FA算子性能分析" +vector_problem: "Vector算子性能分析" +description: "提供一些AICORE算子的参考瓶颈" +bound_description: "bound算子集合" +optimization_description: "性能优化算子集合" +affinity_description: "不亲和算子集合" +cube_affinity_desc: "内轴无法被256整除" +fa_affinity_desc_head_dim_128: "D不能被128整除" +fa_affinity_desc_seq_len_128: "S不能被128整除" +fa_affinity_desc_head_dim_seq_len_128: "D和S均不能被128整除" +suggestion: "请根据亲和性、bound类型或优化空间尝试分析筛选出来的算子" +affinity_suggestion: "{op_name}算子 shape: {shape} dtype: {dtype} 有不亲和特征: {suggestion}\n" +bound_suggestion: "{op_name}算子 shape: {shape} dtype: {dtype} bound类型为: {bound} bound\n" +optimization_suggestion: "{op_name}算子 shape: {shape} dtype: {dtype} 疑似有性能优化空间,参考性能优化空间: {optimization}%\n" + +cube_operators: + - target: aic_mac_ratio + bound: mac + threshold: 0.8 + - target: aic_mte2_ratio + bound: mte2 + threshold: 0.95 + +fa_operators: + - target: aic_mte2_ratio + bound: mac + threshold: 0.8 + - target: aic_fixpipe_ratio + bound: fixpipe + threshold: 0.75 + - target: aiv_vec_ratio + bound: vec + threshold: 0.75 + +vector_operators: + - target: total + bound: vec_mte2_mte3 + threshold: 0.9 + - target: aiv_vec_ratio + bound: vec + threshold: 0.7 + - target: aiv_mte2_ratio + bound: mte2 + threshold: 0.7 + - target: aiv_mte3_ratio + bound: mte3 + threshold: 0.7 \ No newline at end of file diff --git a/profiler/msprof_analyze/advisor/rules/en/aicore_performance.yaml b/profiler/msprof_analyze/advisor/rules/en/aicore_performance.yaml new file mode 100644 index 
0000000000000000000000000000000000000000..5f9f63869054a850aabfe41a2a4dd088afa70cfa --- /dev/null +++ b/profiler/msprof_analyze/advisor/rules/en/aicore_performance.yaml @@ -0,0 +1,48 @@ +cube_problem: "Cube Operator Perf Analysis" +fa_problem: "FA Operator Perf Analysis" +vector_problem: "Vector Operator Perf Analysis" +description: "Provide some reference bottlenecks for the AICORE operators" +bound_description: "set of bound operators" +optimization_description: "set of performance optimization operators" +affinity_description: "set of unaffine operators" +cube_affinity_desc: "The inner axis is not divisible by 256" +fa_affinity_desc_head_dim_128: "D is not divisible by 128" +fa_affinity_desc_seq_len_128: "S is not divisible by 128" +fa_affinity_desc_head_dim_seq_len_128: "Neither D nor S is divisible by 128" +suggestion: "Please try to analyze the filtered operators based on affinity, bound type or optimization space" +affinity_suggestion: "{op_name} Op shape: {shape} dtype: {dtype} with non-affinity characteristics: {suggestion}\n" +bound_suggestion: "{op_name} Op shape: {shape} dtype: {dtype} bound type: {bound} bound\n" +optimization_suggestion: "{op_name} Op shape: {shape} dtype: {dtype} suspect there is room for performance optimization, refer to Performance Optimization Space: {optimization}%\n" + +cube_operators: + - target: aic_mac_ratio + bound: mac + threshold: 0.8 + - target: aic_mte2_ratio + bound: mte2 + threshold: 0.95 + +fa_operators: + - target: aic_mte2_ratio + bound: mac + threshold: 0.8 + - target: aic_fixpipe_ratio + bound: fixpipe + threshold: 0.75 + - target: aiv_vec_ratio + bound: vec + threshold: 0.75 + +vector_operators: + - target: total + bound: vec_mte2_mte3 + threshold: 0.9 + - target: aiv_vec_ratio + bound: vec + threshold: 0.7 + - target: aiv_mte2_ratio + bound: mte2 + threshold: 0.7 + - target: aiv_mte3_ratio + bound: mte3 + threshold: 0.7 \ No newline at end of file diff --git 
a/profiler/msprof_analyze/advisor/rules/en/aicpu_rules.yaml b/profiler/msprof_analyze/advisor/rules/en/aicpu_rules.yaml index b414c71a2ee8b9c4c45e37114a6624f0b3471fcf..33ccd7452ca33a0fb5decce593a1fba3bd838102 100644 --- a/profiler/msprof_analyze/advisor/rules/en/aicpu_rules.yaml +++ b/profiler/msprof_analyze/advisor/rules/en/aicpu_rules.yaml @@ -1,4 +1,4 @@ -problem: "AICPU operator" +problem: "AICPU Issues" description: "Some operators and task duration exceed {} us, such as :\n" suggestion: "Modify code to avoid aicpu operator" double_suggestion: "Try to convert double type operator to float, such as {}" diff --git a/profiler/msprof_analyze/advisor/rules/en/fusible_operator.yaml b/profiler/msprof_analyze/advisor/rules/en/fusible_operator.yaml index db481a84522349db4dabedff7d1bea8b4359142a..7fc191b1a17014dcf05046f1ab8e6198de644630 100644 --- a/profiler/msprof_analyze/advisor/rules/en/fusible_operator.yaml +++ b/profiler/msprof_analyze/advisor/rules/en/fusible_operator.yaml @@ -1,4 +1,4 @@ -problem: "FUSIBLE OPERATOR ANALYSIS" +problem: "Fusible Operator Analysis" mte_problem: "OPERATOR SEQUENCE ANALYSIS BASED ON MTE BOUND" host_problem: "OPERATOR SEQUENCE ANALYSIS BASED ON MTE BOUND" description: "Detected a total of {count} operator sequences with fusion value, with a total end-to-end duration of {wall_duration}ms, including {npu_time}ms for NPU time, {host_threshold} for host bottleneck duration percentage, diff --git a/profiler/msprof_analyze/advisor/rules/en/packet.yaml b/profiler/msprof_analyze/advisor/rules/en/packet.yaml index c74cd16fd9c7c913c4e2b036e2a37275ecd252e9..47eee37a9bb844281f24e56c148a0319fbb828e5 100644 --- a/profiler/msprof_analyze/advisor/rules/en/packet.yaml +++ b/profiler/msprof_analyze/advisor/rules/en/packet.yaml @@ -1,4 +1,4 @@ -problem: "Packet analysis" +problem: "Packet Analysis" description: "Excessive small communication packets may cause host delivery bottlenecks.\n" sdma_problem: "In the SDMA communication, {abnormal_ratio} of the 
communication data volume is less than {min_size} MB, and the total time is {abnormal_time} ms.\n" rdma_problem: "In the RDMA communication, {abnormal_ratio} of the communication data volume is less than {min_size} MB, and the total time is {abnormal_time} ms." diff --git a/profiler/msprof_analyze/advisor/rules/en/rdma_analysis.yaml b/profiler/msprof_analyze/advisor/rules/en/rdma_analysis.yaml index a21f9fa98be09f9cc8f8f9da6821b641b1c99c4d..2a6aa3bc9416f9cf14d693467c19a95b2b86c782 100644 --- a/profiler/msprof_analyze/advisor/rules/en/rdma_analysis.yaml +++ b/profiler/msprof_analyze/advisor/rules/en/rdma_analysis.yaml @@ -1,4 +1,4 @@ -problem: "Communication retransmission analysis" +problem: "Communication Retransmission Analysis" description: "RDMA communication retransmission occurs. A single retransmission takes more than 4s. Retransmission problems are detected in {group_count} communication domains. \n Advised to perform the following suggestions" diff --git a/profiler/msprof_analyze/advisor/rules/timeline_fusion_ops.yaml b/profiler/msprof_analyze/advisor/rules/timeline_fusion_ops.yaml index 3337c938625ccd4b4ea77a0dafa9879222cf1bfe..cd4b8ab3eabe2dec6401288533e6065780a15d87 100644 --- a/profiler/msprof_analyze/advisor/rules/timeline_fusion_ops.yaml +++ b/profiler/msprof_analyze/advisor/rules/timeline_fusion_ops.yaml @@ -4,10 +4,17 @@ operator_rules: aten: add: - torch_npu.npu_confusion_transpose: [ "(permute|transpose)-(contiguous){0,1}-(reshape|view)", - "(reshape|view)-(contiguous){0,1}-(permute|transpose)" ] + torch_npu.npu_confusion_transpose: [ "(permute)-(contiguous){0,1}-(reshape)", + "(permute)-(contiguous){0,1}-(view)", + "(transpose)-(contiguous){0,1}-(reshape)", + "(transpose)-(contiguous){0,1}-(view)", + "(reshape)-(contiguous){0,1}-(permute)", + "(reshape)-(contiguous){0,1}-(transpose)", + "(view)-(contiguous){0,1}-(permute)", + "(view)-(contiguous){0,1}-(transpose)" ] torch_npu.fast_gelu: [ gelu ] - torch_npu.npu_scaled_masked_softmax: [ 
"softmax-(mul){0,1}-(masked_fill_|add)" ] + torch_npu.npu_scaled_masked_softmax: [ "softmax-(mul){0,1}-(masked_fill_)", + "softmax-(mul){0,1}-(add)"] optimizer.clip_grad_norm_fused_: [ add-reciprocal-mul ] Optimizer: add: @@ -29,8 +36,10 @@ operator_rules: aten: add: - torch_npu.npu_fusion_attention: ["matmul-(add){0,1}-(mul){0,1}-(masked_fill_|add){0,1}-softmax-(dropout){0,1}-matmul"] - torch_npu.npu_rotary_mul: ["(chunk|slice)-neg-cat-(mul){0,2}-add"] + torch_npu.npu_fusion_attention: ["matmul-(add){0,1}-(mul){0,1}-(masked_fill_){0,1}-softmax-(dropout){0,1}-matmul", + "matmul-(add){0,1}-(mul){0,1}-(add){0,1}-softmax-(dropout){0,1}-matmul",] + torch_npu.npu_rotary_mul: ["(chunk)-neg-cat-(mul){0,2}-add", + "(slice)-neg-cat-(mul){0,2}-add"] - cann_version: 7.0.0 torch_version: [1.11.0, 2.1.0] @@ -40,9 +49,11 @@ aten: add: torch_npu.npu_rms_norm: ["(pow){0,1}-(mean){0,1}-(add){0,1}-rsqrt-mul-(type_as){0,1}"] - torch_npu.npu_swiglu: [ "(slice|chunk)-silu-mul", "(slice|chunk)-mul-silu", - "(slice|chunk)-sigmoid-mul-mul", "(slice|chunk)-mul-sigmoid-mul", - "(slice|chunk)-mul-mul-sigmoid" ] + torch_npu.npu_swiglu: [ "(slice)-silu-mul", "(chunk)-silu-mul", + "(slice)-mul-silu", "(chunk)-mul-silu", + "(slice)-sigmoid-mul-mul", "(chunk)-sigmoid-mul-mul", + "(slice)-mul-sigmoid-mul", "(chunk)-mul-sigmoid-mul", + "(slice)-mul-mul-sigmoid", "(chunk)-mul-mul-sigmoid" ] - cann_version: 8.0.RC1 torch_version: [1.11.0, 2.1.0] @@ -51,7 +62,8 @@ operator_rules: aten: add: - torch_npu.npu_geglu: [ "(slice|chunk)-gelu-mul", "(slice|chunk)-mul-gelu" ] + torch_npu.npu_geglu: [ "(slice)-gelu-mul", "(chunk)-gelu-mul", + "(slice)-mul-gelu", "(chunk)-mul-gelu"] torch_npu.npu_group_norm_silu: [ "group_norm-silu" ] torch.addmm: [ "mul-mul-add" ] torch_npu.npu_add_layer_norm: [ "add-layer_norm" ] @@ -63,7 +75,27 @@ operator_rules: aten: add: - torch_npu.npu_geglu: [ "(slice|chunk)-gelu-mul", "(slice|chunk)-mul-gelu" ] + torch_npu.npu_geglu: [ "(slice)-gelu-mul", "(chunk)-gelu-mul", + 
"(slice)-mul-gelu", "(chunk)-mul-gelu"] torch_npu.npu_group_norm_silu: [ "group_norm-silu" ] torch.addmm: [ "mul-mul-add" ] - torch_npu.npu_add_layer_norm: [ "add-layer_norm" ] \ No newline at end of file + torch_npu.npu_add_layer_norm: [ "add-layer_norm" ] + +- cann_version: 8.0.RC3 + torch_version: [1.11.0, 2.1.0] + unique_id: 4 + inherit_unique_id: 3 + operator_rules: + aten: + add: + mindspeed.ops.npu_matmul_add: [ "matmul-add" ] + +- cann_version: 8.0.RC3 + torch_version: [1.11.0, 2.1.0] + unique_id: 5 + inherit_unique_id: 4 + operator_rules: + aten: + add: + mindspeed.ops.npu_moe_token_permute: ["argsort-argsort-index_select"] + mindspeed.ops.npu_moe_token_unpermute: ["index_select-mul-reduce_sum"] \ No newline at end of file diff --git a/profiler/msprof_analyze/advisor/utils/utils.py b/profiler/msprof_analyze/advisor/utils/utils.py index 2ddf91c76a3052fffd8fa3cd8f5a92b6b6f52ea9..5e9bd3a23d693ae0a2456866c21ffdbba9bb2b1a 100644 --- a/profiler/msprof_analyze/advisor/utils/utils.py +++ b/profiler/msprof_analyze/advisor/utils/utils.py @@ -13,7 +13,6 @@ # See the License for the specific language governing permissions and # limitations under the License. -import inspect import json import logging @@ -31,8 +30,9 @@ import ijson import click from tqdm import tqdm -from msprof_analyze.prof_common.constant import Constant from msprof_analyze.advisor.utils.log import init_logger, get_log_level +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.singleton import singleton logger = logging.getLogger() logger.setLevel(get_log_level()) @@ -52,83 +52,6 @@ def debug_option(f): help="Debug Mode. 
Shows full stack trace when error occurs.")(f) -def get_class_absolute_path(cls): - module = inspect.getmodule(cls) - if module is not None: - module_path = module.__name__ - class_name = cls.__name__ - return f"{module_path}.{class_name}" - else: - return None - - -def is_static_func(function_obj): - return isinstance(function_obj, staticmethod) - - -def singleton(cls): - """ - :param cls: any class - :return: singleton handle - - When using the singleton function, you need to manually specify collection_path='dataSet_path'. Otherwise, the - singleton function is initialized by class name. - if cls has 'collection_path' property, _instance map will build by class_name and 'collection_path', the - default value of collection path is class absolute path. - - _instance = {cls.name: {collection_path: instance}} - """ - _instance = {} - - @wraps(cls) # use the wraps decorator - def _singleton(*args, **kw): - # Support multi-process asynchronous calls: keep the singleton classes of different child processes isolated - pid = os.getpid() - if pid not in _instance: - _instance[pid] = {} - - collection_path = kw.get("collection_path") - if not collection_path: - collection_path = get_class_absolute_path(cls) - if cls in _instance[pid] and collection_path in _instance[pid][cls]: - return _instance[pid][cls].get(collection_path) - if cls not in _instance[pid]: - _instance[pid][cls] = {collection_path: cls(*args, **kw)} - else: - _instance[pid][cls][collection_path] = cls(*args, **kw) - return _instance[pid][cls].get(collection_path) - - def reset_all_instances(): - """ - For unit tests: clears the singleton instances so that test cases do not interfere with each other - """ - _instance.clear() - - # Preserve the attributes and methods of the original class - _singleton.__name__ = cls.__name__ - _singleton.__module__ = cls.__module__ - _singleton.__doc__ = cls.__doc__ - - # Copy the class methods and static methods of the original class - _singleton.__dict__.update(cls.__dict__) - for base_class in inspect.getmro(cls)[::-1]: - # Get all members of the class - members = inspect.getmembers(base_class) - - # Keep only the function objects - function_objs = [member[1] - for member in members - if inspect.isfunction(member[1]) or inspect.ismethod(member[1]) - ] - for 
function_obj in function_objs: - if inspect.isfunction(function_obj) and not is_static_func(function_obj): - continue - setattr(_singleton, function_obj.__name__, function_obj) - - _singleton.reset_all_instances = reset_all_instances - singleton.reset_all_instances = reset_all_instances - - return _singleton def lazy_property(func): @@ -524,7 +447,7 @@ def convert_to_int(data: any) -> int: try: int_value = int(convert_to_float(data)) except ValueError: - logger.error(f"Can not convert %s to int.", data) + logger.warning(f"Can not convert %s to int.", data) return 0 return int_value diff --git a/profiler/msprof_analyze/cli/cluster_cli.py b/profiler/msprof_analyze/cli/cluster_cli.py index 0cdb2bd2b10b2ede411d10221e36a51e3f015e12..adaf0f8d7cab8eff139125fbb4699ea962e3a427 100644 --- a/profiler/msprof_analyze/cli/cluster_cli.py +++ b/profiler/msprof_analyze/cli/cluster_cli.py @@ -34,6 +34,7 @@ context_settings['ignore_unknown_options'] = True @click.option("--parallel_mode", type=str, help="context mode", default="concurrent") @click.option("--export_type", help="recipe export type", type=click.Choice(["db", "notebook"]), default="db") @click.option("--rank_list", type=str, help="Rank id list", default='all') +@click.option("--step_id", type=int, help="Step id", default=Constant.VOID_STEP) @click.argument('args', nargs=-1) def cluster_cli(**kwargs) -> None: Interface(kwargs).run() diff --git a/profiler/msprof_analyze/cli/entrance.py b/profiler/msprof_analyze/cli/entrance.py index 0aa61f1b6aee2a5b6b321e8e3fb7a04ed63ff98a..534a9b133c7e60d1442cb290490a79e9256ce43d 100644 --- a/profiler/msprof_analyze/cli/entrance.py +++ b/profiler/msprof_analyze/cli/entrance.py @@ -22,7 +22,6 @@ from msprof_analyze.cli.complete_cli import auto_complete_cli from msprof_analyze.cli.compare_cli import compare_cli from msprof_analyze.cli.cluster_cli import cluster_cli from msprof_analyze.advisor.version import print_version_callback, cli_version -from msprof_analyze.cli.precheck_cli import 
precheck_cli logger = logging.getLogger() CONTEXT_SETTINGS = dict(help_option_names=['-H', '-h', '--help'], @@ -32,8 +31,7 @@ COMMAND_PRIORITY = { "advisor": 1, "compare": 2, "cluster": 3, - "precheck": 4, - "auto-completion": 5 + "auto-completion": 4 } @@ -68,6 +66,5 @@ def msprof_analyze_cli(**kwargs): msprof_analyze_cli.add_command(analyze_cli, name="advisor") msprof_analyze_cli.add_command(compare_cli, name="compare") msprof_analyze_cli.add_command(cluster_cli, name="cluster") -msprof_analyze_cli.add_command(precheck_cli, name="precheck") msprof_analyze_cli.add_command(auto_complete_cli, name="auto-completion") diff --git a/profiler/msprof_analyze/cli/precheck_cli.py b/profiler/msprof_analyze/cli/precheck_cli.py deleted file mode 100644 index c70b540ce9e3a8c9f718fe6be34917f8fed7b85d..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cli/precheck_cli.py +++ /dev/null @@ -1,159 +0,0 @@ -#!/usr/bin/python -# -*- coding: utf-8 -*- -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-import sys -import ipaddress -import logging -from functools import wraps - -import click - -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) -CONTEXT_SETTINGS = dict(help_option_names=['-H', '-h', '--help']) - - -@click.group(context_settings=CONTEXT_SETTINGS) -def precheck_cli(): - """Profiler precheck tool""" - pass - - -def common_options(f): - """Common options for both precheck and runner commands""" - - @wraps(f) - def wrapper(*args, **kwargs): - return f(*args, **kwargs) - - wrapper = click.option('--master_addr', required=True, - help='IP address of the master node (first node in the cluster)')(wrapper) - wrapper = click.option('--master_port', type=int, default=29500, - help='Port on the master node for communication. Default is 29500')(wrapper) - wrapper = click.option('--nnodes', type=int, required=True, - help='Total number of nodes in the distributed setup')(wrapper) - wrapper = click.option('--nproc_per_node', type=int, required=True, - help='Number of processes to run per node')(wrapper) - wrapper = click.option('--node_prof_save_dir', default='', callback=PathManager.expanduser_for_cli, - help='Directory for saving node profiling data')(wrapper) - wrapper = click.option('--master_prof_gather_dir', default='', callback=PathManager.expanduser_for_cli, - help='Directory for saving gathered profiling data in master node')(wrapper) - wrapper = click.option('--output_dir', default='./output', callback=PathManager.expanduser_for_cli, - help='Directory to save profiling dump data, logs, and advisor reports')(wrapper) - wrapper = click.option('--task_name', default='', - help='Name of the task or experiment')(wrapper) - wrapper = click.option('--static', is_flag=True, - help='If set, run profiling in static mode')(wrapper) - wrapper = click.option('--profiling_cmd', default="", - help='Command to run the profiler script')(wrapper) - wrapper = click.option('--prof_in_shared_storage', is_flag=True, - 
help='If set, skip data collection as profiling data is in shared storage')(wrapper) - return wrapper - - -def validate_ip_list(ctx, param, value): - if not value: - return [] - try: - ips = [ip.strip() for ip in value.split(',')] - # Validate each IP - for ip in ips: - ipaddress.ip_address(ip) - return ips - except ValueError as e: - raise click.BadParameter(f'Invalid IP address in list: {e}') - - -@precheck_cli.command(context_settings=CONTEXT_SETTINGS, - name="start_all", - short_help='Start precheck on all nodes via ssh') -@common_options -@click.option('--host_ips', - callback=validate_ip_list, - help='Comma-separated list of IP addresses for nodes in distributed training (e.g., "192.168.1.1,192.168.1.2")') -@click.option('--python_path', default=sys.executable, callback=PathManager.expanduser_for_cli, - help='Path to the Python interpreter') -@click.option('--host_config_file', default='', callback=PathManager.expanduser_for_cli, - help='Path to the host configuration file (CSV format with node connection details)') -def precheck_start_all(**kwargs): - """Run precheck command""" - # Add validation - if not kwargs.get('host_ips') and not kwargs.get('host_config_file'): - raise click.UsageError('Either --host_ips or --host_config_file must be specified') - - if kwargs.get('host_ips') and kwargs.get('host_config_file'): - raise click.UsageError('Cannot specify both --host_ips and --host_config_file') - - from msprof_analyze.precheck.manager.args_manager import PrecheckArgsManager - from msprof_analyze.precheck.__main__ import main as precheck_main - - args = PrecheckArgsManager(type('Args', (), kwargs)) - click.echo(args) - precheck_main(args) - - -@precheck_cli.command(context_settings=CONTEXT_SETTINGS, - name="start_node", - short_help='Start one node precheck, if your nnodes > 1, you need to run this command on each node') -@common_options -@click.option('--node_rank', type=int, required=True, - help='Rank of the current node') -def 
precheck_start_node(**kwargs): - """Run precheck runner command""" - from msprof_analyze.precheck.manager.args_manager import PrecheckRunnerArgsManager - from msprof_analyze.precheck.runner.__main__ import main as runner_main - - args = PrecheckRunnerArgsManager(type('Args', (), kwargs)) - click.echo(args) - - runner_main(args) - - -@precheck_cli.command(context_settings=CONTEXT_SETTINGS, - name="env", - short_help='execute environment precheck') -@click.option('--nproc_per_node', type=int, required=True, - help='Number of processes to run per node') -@click.option('--nnodes', type=int, required=True, - help='Total number of nodes in the distributed setup') -@click.option('--node_rank', type=int, required=True, - help='Rank of the current node') -@click.option('--master_addr', type=str, required=False, - help='IP address of the master node', default="localhost") -@click.option('--master_port', type=int, required=False, - help='Port on the master node for communication', default=6000) -@click.option('--tensor-model-parallel-size', type=int, required=False, - help='Degree of tensor parallelism', default=1) -@click.option('--pipeline-model-parallel-size', type=int, required=False, - help='Degree of pipeline parallelism', default=1) -@click.option('--context-parallel-size', type=int, required=False, - help='Degree of context parallelism', default=1) -@click.option('--expert-model-parallel-size', type=int, required=False, - help='Degree of expert parallelism', default=1) -@click.option('--output', type=str, required=False, - help='Output path', default="./output") -@click.option('--check-type', type=str, required=False, - help='Environment precheck type', default="all") -def environment_precheck(**kwargs): - from msprof_analyze.precheck.precheck import Precheck - - click.echo(kwargs) - Precheck.env_precheck(**kwargs) - - -if __name__ == '__main__': - precheck_cli() \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/README.md 
b/profiler/msprof_analyze/cluster_analyse/README.md index 5147fa651481f7ea894f9158cec1b36a56641634..d4d57bad9a9ee3a79659de79f8c7fe136206ba72 100644 --- a/profiler/msprof_analyze/cluster_analyse/README.md +++ b/profiler/msprof_analyze/cluster_analyse/README.md @@ -2,11 +2,11 @@ cluster_analyse (the cluster analysis tool) analyzes cluster data in cluster scenarios. It currently focuses on communication-domain-based analysis of time spent within an iteration, communication time analysis, and communication matrix analysis, in order to locate slow cards, slow nodes, and slow links. ## Performance Data Collection -The cluster tuning tool currently supports cluster data collected with Ascend PyTorch Profiler in PyTorch scenarios and with MindSpore Profiler in MindSpore scenarios. +The cluster tuning tool currently supports cluster data collected with Ascend PyTorch Profiler in PyTorch scenarios, with MindSpore Profiler in MindSpore scenarios, and with the msprof command-line tool. The tool only needs NPU performance data as input. -For the Ascend PyTorch Profiler collection method, see "[NPU Performance Data Collection](https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze)"; for the MindSpore Profiler collection method, see "[Performance Debugging](https://www.mindspore.cn/mindinsight/docs/zh-CN/r2.3/performance_profiling_ascend.html)". +For the Ascend PyTorch Profiler collection method, see "[NPU Performance Data Collection](https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze#npu性能数据采集)"; for the MindSpore Profiler collection method, see "[Performance Debugging](https://www.mindspore.cn/mindinsight/docs/zh-CN/r2.3/performance_profiling_ascend.html)"; for the msprof command-line collection method, see "[msprof Command-Line Tool](https://www.hiascend.com/document/detail/zh/canncommercial/800/devaids/devtools/profiling/atlasprofiling_16_0010.html)". Data of at least level L1 is required. ```python @@ -16,19 +16,26 @@ experimental_config = torch_npu.profiler._ExperimentalConfig( ``` ### Checking Whether the Data Is Usable -Open the data collected for one card (a folder ending in \*ascend_pt or \*ascend_ms). Usable data should contain: +After obtaining performance data in any of the three ways above, open the data collected for one card. Usable data should contain: -- ./profiler_info_x.json, -- ./ASCEND_PROFILER_OUTPUT/step_trace_time.csv, -- ./ASCEND_PROFILER_OUTPUT/trace_view.json, -- ./ASCEND_PROFILER_OUTPUT/kernel_details.csv, -- ./ASCEND_PROFILER_OUTPUT/communication.json, -- ./ASCEND_PROFILER_OUTPUT/communication_matrix.json +- The \*ascend_pt directory collected by Ascend PyTorch Profiler or the \*ascend_ms directory collected by MindSpore Profiler: -Or it should contain: + - ./profiler_info_x.json, + - ./ASCEND_PROFILER_OUTPUT/step_trace_time.csv, + - ./ASCEND_PROFILER_OUTPUT/trace_view.json, + - ./ASCEND_PROFILER_OUTPUT/kernel_details.csv, + - 
./ASCEND_PROFILER_OUTPUT/communication.json, + - ./ASCEND_PROFILER_OUTPUT/communication_matrix.json -- analysis.db -- ascend_pytorch_profiler_{rank_id}.db + Or it should contain: + + - analysis.db + - ascend_pytorch_profiler_{rank_id}.db + +- The PROF_XXX directory collected by the msprof command line: + + - Parsed with --type=db and --export=on: msprof_{timestamp}.db + - Parsed with --type=db and --analyze=on: analyze/communication_analyzer.db Only one of the two kinds above (csv and json files, or db files) may be present; otherwise the cluster analysis tool fails to parse the data. MindSpore scenarios do not yet support the db files above. @@ -52,6 +59,18 @@ experimental_config = torch_npu.profiler._ExperimentalConfig( python3 cluster_analysis.py -d {cluster profiling data path} [-m mode] [-o output_path] [--data_simplification] [--force] ``` + Command examples: + + ```bash + msprof-analyze cluster -d ./cluster_profiling_data_path -m cann_api_sum --parallel_mode concurrent + ``` + + or + + ```bash + python3 cluster_analysis.py -d ./cluster_profiling_data_path -m cann_api_sum --parallel_mode concurrent + ``` + Parameter description: | Parameter | Description | Required | @@ -61,45 +80,38 @@ experimental_config = torch_npu.profiler._ExperimentalConfig( | --mode or -m | Data parsing mode. For the valid values, see the "**--mode parameter description**" table. | No | | --data_simplification | Data simplification mode. For performance data db files that are too large, setting this parameter simplifies the data and improves the tool's analysis efficiency. Setting the parameter enables data simplification; it is disabled by default when not set. | No | | --force | Forces cluster to run. When set, the following checks are skipped:
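    The specified directory or file is not owned by the current user: the owner check is skipped and execution proceeds.
    A csv file is larger than 5 GB, a json file is larger than 10 GB, or a db file is larger than 8 GB: the file-size check is skipped and execution proceeds.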
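    Setting this parameter enables forced execution; it is disabled by default when not set. | No | - | --parallel_mode | Sets the concurrency mode used when collecting db data from multiple cards and nodes. The value is concurrent (concurrency implemented with a concurrent.futures process pool).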
    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, or mstx_sum.** | No | - | --export_type | Sets the export format. Valid values are db (a .db file) and notebook (a Jupyter Notebook file); the default is db.
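    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, or mstx_sum.** | No | - | --rank_list | Collects statistics for specific ranks. The default is all (statistics for every rank). Set it according to the actual Rank IDs of the cards, using integers greater than or equal to 0. If a configured value exceeds the Rank IDs of the cards actually used for training, only data for valid Rank IDs is parsed. For example, if the environment has Rank IDs 0 to 7 and training actually runs on cards 0 to 3, then configuring Rank IDs 0, 3, 4 or a nonexistent value such as 10 parses only 0 and 3. Example: --rank_list 0, 1, 2.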
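    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, or mstx_sum.** | No | + | --parallel_mode | Sets the concurrency mode used when collecting db data from multiple cards and nodes. The value is concurrent (concurrency implemented with a concurrent.futures process pool).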
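    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, mstx_sum, or a custom analysis parameter.** | No | + | --export_type | Sets the export format. Valid values are db (a .db file) and notebook (a Jupyter Notebook file); the default is db.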
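    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, mstx_sum, or a custom analysis parameter.** | No | + | --rank_list | Collects statistics for specific ranks. The default is all (statistics for every rank). Set it according to the actual Rank IDs of the cards, using integers greater than or equal to 0. If a configured value exceeds the Rank IDs of the cards actually used for training, only data for valid Rank IDs is parsed. For example, if the environment has Rank IDs 0 to 7 and training actually runs on cards 0 to 3, then configuring Rank IDs 0, 3, 4 or a nonexistent value such as 10 parses only 0 and 3. Example: --rank_list 0, 1, 2.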
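    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, mstx_sum, or a custom analysis parameter.** | No | + | --step_id | Step ID of the performance data. When set, only the performance data of that step is analyzed. It must be a Step ID that actually exists in the performance data. It is not set by default, which means all data is analyzed. Example: --step_id=1.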
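    **This parameter can be set only when -m is cann_api_sum, compute_op_sum, hccl_sum, mstx_sum, or a custom analysis parameter.** | No | | --top_num | Sets the number of communication operators to report in the TopN ranking by duration. The default is 15. Example: --top_num 20.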
    **This parameter can be set only when -m is hccl_sum.** | No | - | --exclude_op_name | Controls whether the compute_op_name result includes op_name. Example: --exclude_op_name, with no value after it.
    **This parameter can be set only when -m is compute_op_sum.** | No | + | --exclude_op_name | Controls whether the compute_op_name result includes op_name. Example: --exclude_op_name, with no value after it.
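    **This parameter can be set only when -m is compute_op_sum.** | No | --mode parameter description: - | Parameter | Description | Required | - |----------------------|----------------------------------------|------| - | communication_matrix | Parses communication matrix data. | No | - | communication_time | Parses communication time data. | No | - | all | Parses both the communication matrix (communication_matrix) and the communication time data (communication_time). The default value of --mode is all. | No | - | cann_api_sum | Summary analysis of cluster API performance data. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/CannApiSum directory. | No | - | compute_op_sum | Summary analysis of the operators run on the device in cluster performance data. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/ComputeOpSum directory. Decide whether to enable --exclude_op_name as needed. | No | - | hccl_sum | Duration analysis of collective communication operators. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/HcclSum directory. | No | - | mstx_sum | Summary analysis of mstx marker information in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/MstxSum directory. | No | - | slow_link | Analysis of slow-link anomalies in the cluster. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/SlowLink directory. | No | - | cluster_time_summary | Performance data analysis in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db and analysis.db files. When --export_type is db, the deliverable cluster_analysis.db is output, and it contains the ClusterTimeSummary table; notebook export is not supported. | No | - | cluster_time_compare_summary | 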
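Comparative analysis of cluster performance data. The cluster data must first be analyzed with cluster_time_summary, and this mode must be used together with the --bp parameter. The input performance data must be based on the cluster_analysis.db file under cluster_analysis_output. When --export_type is db, the deliverable cluster_analysis.db is output, and it contains the comparison result table ClusterTimeCompareSummary; notebook export is not supported. | No | - | slow_rank_pp_stage | Comparative analysis of pp stage communication in cluster performance data. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. If the MetaData table in the input data does not contain the parallel strategy of the training job, pass it manually with --tp --pp --dp, as positive integers. When --export_type is db, the deliverable cluster_analysis.db is output, and it contains the analysis results PPAnalysisResult and P2PAnalysisResult; notebook export is not supported. | No | - | p2p_pairing | Generates a global association index for P2P operators in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. The generated index is appended as a new field `opConnectionId` to the `COMMUNICATION_OP` table of the original ascend_pytorch_profiler_{rank_id}.db file. | No | - - Examples of the --parallel_mode parameter: + The --mode parameter selects the data parsing mode. The analysis can generate the cluster_analysis.db deliverable; for its contents, see [cluster_analysis.db deliverable table structure](#cluster_analysisdb交付件表结构说明). - ```bash - msprof-analyze cluster -d {cluster profiling data path} -m cann_api_sum --parallel_mode concurrent - ``` - - or - - ```bash - python3 cluster_analysis.py -d {cluster profiling data path} -m cann_api_sum --parallel_mode concurrent - ``` + | Parameter | Description | Required | + |--------------------------| ------------------------------------------------------------ | -------- | + | communication_matrix | Parses communication matrix data. | No | + | communication_time | Parses communication time data. | No | + | all | Parses the following: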
    The communication matrix (communication_matrix)
    Communication time data (communication_time)
    A summary of node information in the cluster (generated from ascend_pytorch_profiler_{rank_id}.db)
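    The default value of --mode is all. | No | + | cann_api_sum | Summary analysis of cluster API performance data. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/CannApiSum directory. | No | + | compute_op_sum | Summary analysis of the operators run on the device in cluster performance data. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/ComputeOpSum directory. Decide whether to enable --exclude_op_name as needed. | No | + | hccl_sum | Duration analysis of collective communication operators. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/HcclSum directory. | No | + | mstx_sum | Summary analysis of mstx marker information in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. When --export_type is db, the deliverable cluster_analysis.db is output; when --export_type is notebook, the deliverable stats.ipynb is output in the cluster_analysis_output/MstxSum directory. | No | + | communication_group_map | Presents communication domains and parallel strategies in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files and analysis.db. When --export_type is db, the deliverable cluster_analysis.db is output. | No | + | communication_time_sum | Summary analysis of communication time and bandwidth in cluster scenarios. The input performance data must be based on analysis.db. When --export_type is db, the deliverable cluster_analysis.db is output.| No | + | communication_matrix_sum | Summary analysis of the communication matrix in cluster scenarios. The input performance data must be based on analysis.db. When --export_type is db, the deliverable cluster_analysis.db is output.| No | + | freq_analysis | Summary analysis of aicore frequency information in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. Prints whether any aicore is idle (frequency of 800 MHz) or abnormal (frequency other than 1800 MHz or 800 MHz). If so, the corresponding card and frequency information is added to the deliverable cluster_analysis.db. | No | + | ep_load_balance | Summary analysis of MoE load information in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. The EPTokensSummary and TopEPTokensInfo analysis tables are added to the deliverable cluster_analysis.db. | No | + | slow_rank | Summary analysis of fast and slow cards for communication operators in cluster scenarios. The input performance data must be based on ascend_pytorch_profiler_{rank_id}.db files. The deliverable cluster_analysis.db shows the fast/slow-card impact count of each rank, as computed by the current fast/slow-card statistics algorithm.|| + | Custom analysis parameter | Similar in function to cann_api_sum, compute_op_sum, hccl_sum, and so on. Users can define their own set of analysis rules for performance data, which requires a detailed understanding of the performance analysis rules. For development guidance, see "[Custom Analysis Rule Development Guide](#自定义分析规则开发指导)". | No | ### Deliverables -The deliverables of the cluster analysis tool are displayed with the MindStudio
Insight tool; see the "[MindStudio Insight User Guide](https://www.hiascend.com/document/detail/zh/mindstudio/70RC2/GUI-baseddevelopmenttool/msascendinsightug/AscendInsight_0002.html)". +The deliverables of the cluster analysis tool are displayed with the MindStudio Insight tool; see the "[MindStudio Insight User Guide](https://www.hiascend.com/document/detail/zh/mindstudio/70RC3/msinsightug/msascendinsightug/Insight_userguide_0002.html)". #### cluster_step_trace_time.csv @@ -107,7 +119,7 @@ experimental_config = torch_npu.profiler._ExperimentalConfig( Column A: Step number, set when the performance data is collected. One step is usually enough for cluster performance data; if several steps were collected, filter them first. -Column B: Type, either rank or stage, closely tied to the index that follows. One is a single-card rank and the other is a rank group (a stage in pp parallelism). If type is stage, columns D-K hold the maximum values within the rank group. +Column B: Type, either rank or stage, closely tied to the Index that follows. One is a single-card rank and the other is a rank group (a stage in pp parallelism). If type is stage, columns D-K hold the maximum values within the rank group. Column C: Index, related to type; it indicates the card number. @@ -135,7 +147,7 @@ Column N: PP Index, the index of the PP group that the cluster data belongs to after being split by parallel strategy Column O: TP Index, the index of the TP group that the cluster data belongs to after being split by parallel strategy; not shown if it was not collected. -**Tips**: First filter column B for type stage and check for problems between stages, then filter column B for type rank and check for problems on individual ranks, using the points below. +**Tips**: First filter column B for type stage and check for problems between stages, then filter column B for type rank and check for problems on individual ranks, using the points below. * Use differences in Computing time to judge whether there are slow cards or load imbalance. @@ -143,27 +155,28 @@ Column O: TP Index, the index of the TP group that the cluster data belongs to after being split by parallel strategy * Use the Communication(Not Overlapped and Exclude Receive) time to judge whether communication takes too large a share of the time. -* Use the share of Bubble time and the theoretical formula to judge whether the bubble setting is reasonable and whether there is imbalance between stages. +* Use the share of Bubble time and the theoretical formula to judge whether the bubble setting is reasonable and whether the stages are imbalanced. -In theory the times above should all be roughly level, i.e., the maximum is less than 5% of the minimum; otherwise a slow card may be present. +In theory the times above should all be roughly level, i.e., (max - min) / mean ≤ 5%; otherwise a slow card may be present. #### cluster_communication_matrix.json Generated when the data parsing mode is communication_matrix or all. -Open the json directly (in vscode or a json viewer) and search for "Total". There will be several matches; link bandwidth information generally has this structure: +Open the json directly (in vscode or a json viewer) and search for “Total”. There will be several matches; link bandwidth information generally has this structure: ```bash {src_rank}-{dst_rank}: { "Transport Type": "LOCAL", "Transit Time(ms)": 0.02462, "Transit Size(MB)": 16.777216, + "Op Name": "", "Bandwidth(GB/s)": 681.4466 } ``` **Tips**: The bandwidth and link type between ranks can be used to judge whether there is a slow-link problem. -- "LOCAL" is an on-chip copy, the fastest. +- “LOCAL” is an on-chip copy, the fastest. - “HCCS” or “PCIE” is a chip-to-chip copy within a node, of medium speed. - “RDMA” is a copy between nodes, the slowest. @@ -175,7 +188,7 @@ Column O: TP Index, the index of the TP group that the cluster data belongs to after being split by parallel strategy #### cluster_analysis.db 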
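-The deliverable generated by parsing analysis.db or ascend_pytorch_profiler_{rank_id}.db. Different data is parsed depending on the data parsing mode. It can be displayed with the MindStudio Insight tool. +The deliverable generated by parsing analysis.db or ascend_pytorch_profiler_{rank_id}.db. Different data is parsed depending on the data parsing mode. For details, see [cluster_analysis.db deliverable table structure](#cluster_analysisdb交付件表结构说明). #### communication_group.json @@ -199,9 +212,649 @@ Column O: TP Index, the index of the TP group that the cluster data belongs to after being split by parallel strategy It can be opened with jupyter notebook or the MindStudio Insight tool. It mainly shows mstx marker information in cluster scenarios, divided into framework-side, CANN-side, and Device-side markers. -- Generated when the data parsing mode is slow_link, and saved in the cluster_analysis_output/SlowLink directory. +## cluster_analysis.db Deliverable Table Structure + +Description: + +When the --mode parameter is set, msprof-analyze can analyze the data and output the cluster_analysis.db deliverable. This section describes the tables and fields of that deliverable. + +### compute_op_sum + +When -m compute_op_sum is set, the following tables are generated. + +#### ComputeOpAllRankStats + +Description: + +Based on db-format cluster performance data, computes duration statistics for compute operators across all ranks, grouped by OpType and TaskType. + +Format: + +| Field | Type | Meaning | +| ------ | ---- | ---- | +| OpType | TEXT | Compute operator type | +| TaskType | TEXT | Type of accelerator the operator runs on | +| Count | INTEGER | Number of operators in the group, grouped by OpType and TaskType | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | + +#### ComputeOpPerRankStatsByOpType + +Description: + +Based on db-format cluster performance data, computes duration statistics for compute operators on each rank, grouped by OpType and TaskType. + +Format: + +| Field | Type | Meaning | +| ------ | ---- | ---- | +| OpType | TEXT | Compute operator type | +| TaskType | TEXT | Type of accelerator the operator runs on | +| Count | INTEGER | Number of operators in the group, grouped by OpType and TaskType | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| Rank | INTEGER | rank_id | + +#### ComputeOpPerRankStatsByOpName + +Description: + +This table is not generated when the --exclude_op_name parameter is set. +Based on db-format cluster performance data, computes duration statistics for compute operators on each rank, grouped by OpName, OpType, TaskType, and InputShapes. + +Format: + +| Field | Type | Meaning | +| ------ | ---- | ---- | +| OpName | TEXT | 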
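Compute operator name | +| OpType | TEXT | Compute operator type | +| TaskType | TEXT | Type of accelerator that executes the operator | +| InputShapes | TEXT | Input dimensions of the operator | +| Count | INTEGER | Number of operators in this group | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| Rank | INTEGER | rank_id | + +### cann_api_sum + +The following tables are generated when -m cann_api_sum is set. + +#### CannApiSum + +Description: + +Based on cluster profiling data in db format, computes statistics on the duration of each API (distinguished by name) across all ranks. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| name | TEXT | API name | +| timeRatio | REAL | Percentage of the total duration of all APIs taken by this API | +| totalTimeNs | INTEGER | Total duration of the API, in ns | +| totalCount | INTEGER | Number of occurrences of the API | +| averageNs | REAL | Mean duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| medNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| minNs | REAL | Minimum duration, in ns | +| maxNs | REAL | Maximum duration, in ns | +| stdev | REAL | Standard deviation of the duration, in ns | +| minRank | TEXT | Set of ranks corresponding to minNs | +| maxRank | TEXT | Set of ranks corresponding to maxNs | + +#### CannApiSumRank + +Description: + +Based on cluster profiling data in db format, computes statistics on the duration of each API (distinguished by name) for each rank. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| name | TEXT | API name | +| durationRatio | REAL | Percentage of the total duration of all APIs on the card taken by this API | +| totalTimeNs | INTEGER | Total duration of the API, in ns | +| totalCount | INTEGER | Number of occurrences of the API | +| averageNs | REAL | Mean duration, in ns | +| minNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| medNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| maxNs | REAL | Maximum duration, in ns | +| stdev | REAL | Standard deviation of the duration, in ns | +| rank | INTEGER | rank_id | + +### hccl_sum + +The following tables are generated when -m hccl_sum is set. + +#### HcclAllRankStats + +Description: + +Based on cluster profiling data in db format, computes statistics on the duration of each communication operator type (for example hcom_broadcast_) across all ranks. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| OpType | TEXT | Communication operator type | +| Count | INTEGER | Count | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 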
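25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | + +#### HcclPerRankStats + +Description: + +Based on cluster profiling data in db format, computes statistics on the duration of each communication operator type (for example hcom_broadcast_) for each rank. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| OpType | TEXT | Communication operator type | +| Count | INTEGER | Count | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| Rank | INTEGER | rank_id | + +#### HcclGroupNameMap + +Description: + +The ranks contained in a communication group. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| GroupName | TEXT | Communication group, for example: 10.170.22.98%enp67s0f5_60000_0_1708156014257149 | +| GroupId | TEXT | Last three digits of the communication group's hash value | +| Ranks | TEXT | All ranks in the communication group | + +#### HcclTopOpStats + +Description: + +Based on cluster profiling data in db format, analyzes communication operator durations across all ranks and shows the data of the TOP N (default 15) communication operators ranked by mean duration. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| OpName | TEXT | Communication operator name, for example hcom_allReduce__606_0_1 | +| Count | INTEGER | Count | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| MinRank | INTEGER | Rank on which this communication operator has the shortest duration | +| MaxRank | INTEGER | Rank on which this communication operator has the longest duration | + +### communication_group_map + +The following table is generated when -m communication_group_map is set. + +#### CommunicationGroupMapping + +Description: + +Based on cluster profiling data in db format, generates the mapping between communication groups and parallel strategies. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| type | TEXT | Operator type, either collective or p2p; operators whose names contain "send", "recv" or "receive" are treated as p2p | +| rank_set | TEXT | Ranks (global rank) contained in the communication group | +| group_name | TEXT | Hash value of the communication group; can be mapped to group_id | +| group_id | TEXT | Communication group name defined inside hccl, for example: 10.170.22.98%enp67s0f5_60000_0_1708156014257149 | +| pg_name | TEXT | 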
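Communication group name defined by the application, for example "dp", "dp_cp", "mp", and so on | + +### communication_time_sum + +The following tables are generated when -m communication_time_sum is set. + +#### ClusterCommunicationTime + +Description: + +Based on cluster profiling data in db format, analyzes cluster communication time. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| step | TEXT | Step the operator belongs to | +| rank_id | INTEGER | global rank | +| hccl_op_name | TEXT | Communication operator name, for example hcom_allReduce__606_0_1 | +| group_name | TEXT | Communication group hashId, for example 3915571125887837303 | +| start_timestamp | REAL | Start time, in us | +| elapsed_time | REAL | Total communication duration, in ms | +| transit_time | REAL | Transit time, in ms | +| wait_time | REAL | Wait time, in ms | +| synchronization_time | REAL | Synchronization time, in ms | +| idle_time | REAL | Idle time, in ms | +| synchronization_time_ratio | REAL | Synchronization time ratio: synchronization_time / (transit_time + synchronization_time) | +| wait_time_ratio | REAL | Wait time ratio: wait_time / (transit_time + wait_time) | + +#### ClusterCommunicationBandwidth + +Description: + +Based on cluster profiling data in db format, analyzes cluster communication bandwidth. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| step | TEXT | Step the operator belongs to | +| rank_id | INTEGER | global rank | +| hccl_op_name | TEXT | Communication operator name, for example hcom_allReduce__606_0_1 | +| group_name | TEXT | Communication group hashId, for example 3915571125887837303 | +| band_type | TEXT | Transport type, including LOCAL, SDMA, RDMA, HCCS, and others | +| transit_size | REAL | Amount of data transferred, in MB | +| transit_time | REAL | Transit duration, in ms | +| bandwidth | REAL | Bandwidth, in GB/s | +| large_packet_ratio | REAL | Ratio of large packets | +| package_size | REAL | Size of the communication packet in a single transfer, in MB | +| count | INTEGER | Number of transfers | +| total_duration | REAL | Total transfer duration, in ms | + +### communication_matrix_sum + +The following table is generated when -m communication_matrix_sum is set. + +#### ClusterCommunicationMatrix + +Description: + +Based on cluster profiling data in db format, generates the communication matrix data. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| step | TEXT | Step the operator belongs to | +| hccl_op_name | TEXT | Simplified operator name after matrix analysis, for example: send-top1 | +| group_name | TEXT | Communication group hashId, for example 3915571125887837303 | +| src_rank | REAL | rankId that sends the data, for example: 0 | +| dst_rank | REAL | rankId that receives the data, for example: 1 | +| transport_type | TEXT | Transport type, including LOCAL, SDMA, RDMA, and others | +| op_name | TEXT | Original name of the operator | +| transit_size | REAL | Amount of data transferred, in MB | +| 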
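transit_time | REAL | Transit duration, in ms | +| bandwidth | REAL | Bandwidth, in GB/s | + +### mstx_sum + +The following tables are generated when -m mstx_sum is set. + +#### MSTXAllFrameworkStats + +Description: + +Based on cluster profiling data in db format, analyzes the framework-side duration of mstx marker data (not broken down by rank). + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| Name | TEXT | Information carried by the mstx marker | +| Count | INTEGER | Number of markers within the iteration, grouped by Name | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| StepId | INTEGER | Iteration id | + +#### MSTXAllCannStats + +Description: + +Based on cluster profiling data in db format, analyzes the cann-layer duration of mstx marker data (not broken down by rank). + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| Name | TEXT | Information carried by the mstx marker | +| Count | INTEGER | Number of markers within the iteration, grouped by Name | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| StepId | INTEGER | Iteration id | + +#### MSTXAllDeviceStats + +Description: + +Based on cluster profiling data in db format, analyzes the device-side duration of mstx marker data (not broken down by rank). + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| Name | TEXT | Information carried by the mstx marker | +| Count | INTEGER | Number of markers within the iteration, grouped by Name | +| MeanNs | REAL | Mean duration, in ns | +| StdNs | REAL | Standard deviation of the duration, in ns | +| MinNs | REAL | Minimum duration, in ns | +| Q1Ns | REAL | 25th percentile of the duration, in ns | +| MedianNs | REAL | 50th percentile of the duration, in ns | +| Q3Ns | REAL | 75th percentile of the duration, in ns | +| MaxNs | REAL | Maximum duration, in ns | +| SumNs | REAL | Total duration, in ns | +| StepId | INTEGER | Iteration id | + +#### MSTXMarkStats + +Description: + +Based on cluster profiling data in db format, computes statistics on mstx marker durations for each rank, grouped by Rank and StepId. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| Name | TEXT | Information carried by the mstx marker | +| FrameworkDurationNs | REAL | Framework-side duration, in ns | +| CannDurationNs | REAL | CANN-layer duration, in ns | +| DeviceDurationNs | REAL | Device-side duration, in ns | +| Rank | INTEGER | global rank | +| StepId | INTEGER | Iteration id | + 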
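A table such as MSTXMarkStats can be inspected with a short query. The following is a minimal sketch, assuming that cluster_analysis.db is an ordinary SQLite file and that MSTXMarkStats has exactly the columns listed in the README text of this diff. The in-memory database, the sample rows and the helper names `build_demo_db` and `slowest_marks` are hypothetical and are not part of msprof-analyze.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory stand-in for cluster_analysis.db (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    # Column names and types follow the MSTXMarkStats description in the README diff.
    conn.execute(
        'CREATE TABLE MSTXMarkStats ('
        '"Name" TEXT, "FrameworkDurationNs" REAL, "CannDurationNs" REAL, '
        '"DeviceDurationNs" REAL, "Rank" INTEGER, "StepId" INTEGER)'
    )
    rows = [
        ("forward", 1200.0, 900.0, 5000.0, 0, 1),
        ("forward", 1300.0, 950.0, 9000.0, 1, 1),
        ("backward", 2000.0, 1500.0, 7000.0, 0, 1),
    ]
    conn.executemany("INSERT INTO MSTXMarkStats VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def slowest_marks(conn, top_n=2):
    """Return (Name, Rank, StepId, DeviceDurationNs) rows with the longest device-side duration."""
    sql = (
        'SELECT "Name", "Rank", "StepId", "DeviceDurationNs" '
        'FROM MSTXMarkStats ORDER BY "DeviceDurationNs" DESC LIMIT ?'
    )
    return conn.execute(sql, (top_n,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for name, rank, step_id, duration in slowest_marks(demo):
        print(f"{name} rank={rank} step={step_id} device={duration} ns")
```

To try this on a real deliverable, `build_demo_db()` would be replaced by `sqlite3.connect(...)` on the generated file; whether the table exists depends on the --mode that was run.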
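+### freq_analysis + +Description: + +Based on cluster profiling data in db format, analyzes the aicore frequency and provides one-click detection of npu frequency reduction. There are three cases: +* Normally, the frequency should stay stable at 1800MHz. +* When the npu has been idle for a long time, the device automatically lowers its frequency to 800MHz. +* When the npu reduces its frequency for any other reason, abnormal frequencies other than 1800MHz and 800MHz appear. + +When -m freq_analysis is set, the following tables are generated if frequency reduction occurs. + +#### FreeFrequencyRanks + +Description: + +Corresponds to the second case. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| rankId | INTEGER | global rank | +| aicoreFrequency | TEXT | [800, 1800] | + +#### AbnormalFrequencyRanks + +Description: + +Corresponds to the third case. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| rankId | INTEGER | global rank | +| aicoreFrequency | TEXT | List of abnormal frequencies, for example: [800, 1150, 1450, 1800] | + +### ep_load_balance + +Description: + +In cluster training, MOE load imbalance means that, in a distributed environment, different expert models process uneven amounts of work, so that some experts are overloaded (handling too many tasks) while others sit idle. This imbalance lowers overall system efficiency and may even cause performance bottlenecks. + +The following tables are generated when -m ep_load_balance is set. + +#### EPTokensSummary + +Description: + +Based on cluster profiling data in db format, analyzes the shape information of GroupedMatmul operators. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| rank | INTEGER | global rank | +| epRanks | TEXT | Set of ranks in the same ep (Expert Parallelism) group, for example 0,1 | +| inputShapesSummary | INTEGER | Sum of the first dimension of the inputshapes of the GroupedMatmul operators on this rank | + +#### TopEPTokensInfo + +Description: + +The ep groups with load imbalance. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| epRanks | TEXT | Set of ranks of the imbalanced ep (Expert Parallelism) group, for example 0,1 | +| tokensDiff | INTEGER | Difference between the maximum and minimum values within the same ep | + +### slow_rank + +The following table is generated when -m slow_rank is set. + +#### SlowRank + +Description: + +Based on cluster profiling data in db format, performs slow-card analysis. + +Format: + +| Field | Type | Description | +| ------ | ---- | ---- | +| rankId | INTEGER | Slow card | +| slowAffectCount | INTEGER | Number of communications affected by this rank | + +## Appendix + +### Guide to Developing Custom Analysis Rules + +Custom analysis rules are developed on top of performance data analysis of the Profiling files analysis.db and ascend_pytorch_profiler_{rank_id}.db. In the same way that cann_api_sum, compute_op_sum, hccl_sum and similar parameters are implemented, you can define your own set of analysis rules for performance data as follows: + +1. In the mstt code repository, create an xxx directory and an xxx.py file under profiler/msprof_analyze/cluster_analyse/recipes. + + For example: profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/cann_api_sum.py. The directory name and file name must match, and the directory name also serves as the switch parameter that starts this custom analysis in the msprof-analyze cluster tool. + +2. 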
Develop the performance data analysis rule in xxx.py. The class must inherit from BaseRecipeAnalysis and implement the run function. + + A typical run implementation: + + ```python + def run(self, context): + mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + if self._export_type == "db": + self.save_db() + elif self._export_type == "notebook": + self.save_notebook() + else: + logger.error("Unknown export type.") + ``` + + 1. `mapper_func`: queries data from multiple cards and returns the merged result. Because every card's data in the cluster is processed the same way, context is used to process the cluster data in parallel and return the results assembled in order. Developers only need to implement the single-card processing function `self._mapper_func`. + + ```python + def mapper_func(self, context): + return context.wait( + context.map( + self._mapper_func, + self._get_rank_db(), + analysis_class=self._recipe_name + ) + ) + ``` + + ```python + def _mapper_func(self, data_map, analysis_class): + """ + Extract the profiling data required for cluster analysis from each device, and then aggregate the + results from each device to be processed by a reduce function. + Params: + data_map: eg. {"RANK_ID": 1, "profiler_db_path": "xxxx/ascend_pytorch_profiler_1.db"} + analysis_class: hccl_sum, compute_op_sum, cann_api_sum, mstx_sum...... + """ + pass + ``` + + 2. `reducer_func`: analyzes the results from multiple cards. It receives the return value of `mapper_func` and performs further cluster-level aggregation and analysis; the data structure is a dataframe. + + 3. `save_db`: saves the analysis result in cluster_analysis.db. + + 4. `save_notebook`: saves the analysis result as csv and stats.ipynb. + +3. `self._mapper_func` relies on querying a single db, which can be done in either of the following two ways. + + 1. Use DatabaseService, which supports configurable single-table queries. + + Reference: https://gitee.com/ascend/mstt/blob/pre-research/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/mstx2commop.py + + Usage example: + + ```Python + service = DatabaseService(profiler_db_path) + service.add_table_for_query("ENUM_HCCL_DATA_TYPE", ["id", "name"]) # first argument: table name; second argument: list of fields, default None, meaning select * when omitted + service.add_table_for_query("STRING_IDS", ["id", "value"]) # multiple tables can be added + df_dict = service.query_data() # queries all configured tables in order and returns a dict whose keys are table names and whose values are the query results as dataframes + ``` + + 2. 
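Maintain the query under msprof_analyze/prof_exports: create a new py file with a class that inherits from BaseStatsExport (note: before adding one, check whether an existing export already fits, to avoid duplication). Example (taking hccl_sum_export.py): + + ```Python + from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport + + QUERY = """ + SELECT + NAME_IDS.value AS "OpName", + TYPE_IDS.value AS "OpType", + round(endNs - startNs) AS "Duration", + GROUP_NAME_IDS.value AS "GroupName" + FROM + COMMUNICATION_OP + LEFT JOIN + STRING_IDS AS TYPE_IDS + ON TYPE_IDS.id == COMMUNICATION_OP.opType + LEFT JOIN + STRING_IDS AS NAME_IDS + ON NAME_IDS.id == COMMUNICATION_OP.opName + LEFT JOIN + STRING_IDS AS GROUP_NAME_IDS + ON GROUP_NAME_IDS.id == COMMUNICATION_OP.groupName + """ + + + class HcclSumExport(BaseStatsExport): + def __init__(self, db_path, recipe_name): + super().__init__(db_path, recipe_name) + self._query = QUERY + ``` + + Usage example: df = HcclSumExport(profiler_db_path, analysis_class).read_export_db(); the returned data type is a dataframe. + +4. Add extension parameters to the analysis rule. + + Implement the add_parser_argument function, for example: + + ```Python + @classmethod + def add_parser_argument(cls, parser): + parser.add_argument("--top_num", type=str, help="Duration cost top count", default=cls.DEFAULT_TOP_NUM) + ``` + + Get the corresponding extension parameters from self._extra_args: + + ```Python + def __init__(self, params): + super().__init__(params) + top_num = self._extra_args.get(self.TOP_NUM, self.DEFAULT_TOP_NUM) + self.top_num = int(top_num) if isinstance(top_num, str) and top_num.isdigit() else self.DEFAULT_TOP_NUM + ``` + +5. Run the custom analysis rule command. + + ```bash + msprof-analyze cluster -d {cluster profiling data path} --mode xxx --top_num 10 + ``` + +### Development and Code Submission Process + +Development must follow the process below. + +1. **Requirement clarification and walkthrough** + + Once it is decided to take on a requirement, first confirm its **iteration schedule**. Development must strictly follow our iteration schedule, and you must attend the requirement clarification and walkthrough for it (we will arrange the meetings). Requirement clarification can be done by the DE (aligning on inputs, outputs and the flow chart); the requirement walkthrough must be given by the developer, who needs to prepare the **design document and test cases** (a document template exists; contact the SE or DE to get it). + +2. **UT** + + To make sure that later developers who modify your code do not break your feature, or at least can see the impact of their change (for example on the algorithm implementation or on fields), add UT together with the code submission. + UT can be modeled on other test cases already in the repository. A four-part naming scheme is recommended: test_{target method name}_should_{expected result}_when_{branch condition}_given_{input parameters}. Mocks can be used flexibly to construct fake return values. + +3. 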
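**Documentation** + + Currently, when adding a new analysis capability, add a description of the corresponding parameter to the “--mode parameter description” in subsection 2 of [Procedure](#操作步骤), briefly stating what the capability does and its inputs and outputs. + In addition, add the table structure to [cluster_analysis.db Deliverable Table Structure](#cluster_analysisdb交付件表结构说明), making the inputs and outputs explicit. You may describe in detail the **main scenarios, uses and even the algorithm** of your analysis capability, so that users know what it can do and how it helps with tuning. (See the description of [freq_analysis](#freq_analysis) for reference.) + +4. **CI** + + Regular commercial-release requirements are merged into the master branch, pre-research requirements into the pre-research branch, and poc requirements into the poc branch. + After opening a PR, comment **compile** to trigger the online CI, which runs cleancode and smoke tests. Code review can be requested only when everything is green. Merging a PR requires the lgtm label and the approve label (the committers in the group can add the labels). - It can be opened with jupyter notebook or MindStudio Insight. It mainly shows the analysis of abnormally slow links in the cluster scenario (all links in the cluster are aggregated and shown in charts) and the aggregated duration analysis of slow links (showing the data where slow links may have been detected). +5. **Code review** + Code must be reviewed before it is merged. Post the link in the **msprof-analyze code review group**, state the title of the PR, and @ the relevant people to review it. After addressing the review comments, @ the committer again to merge the code. + To make the results trustworthy and to make it easy for other developers or testers to use this analysis capability, write test cases and provide a **self-verification report** as evidence. diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/analysis_facade.py b/profiler/msprof_analyze/cluster_analyse/analysis/analysis_facade.py index aa9658f0c6229264e40eeacf6cfcc9fcd3a18538..2adef926706f52e22f9f82ba7563698e3241c195 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/analysis_facade.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/analysis_facade.py @@ -16,17 +16,18 @@ from multiprocessing import Process, Value, Lock from tqdm import tqdm from msprof_analyze.cluster_analyse.analysis.communication_analysis import CommunicationAnalysis -from msprof_analyze.cluster_analyse.analysis.communication_analysis import CommunicationAnalysisOptimized from msprof_analyze.cluster_analyse.analysis.comm_matrix_analysis import CommMatrixAnalysis -from msprof_analyze.cluster_analyse.analysis.comm_matrix_analysis import CommMatrixAnalysisOptimized from msprof_analyze.cluster_analyse.analysis.step_trace_time_analysis import StepTraceTimeAnalysis from msprof_analyze.cluster_analyse.analysis.host_info_analysis import HostInfoAnalysis from msprof_analyze.cluster_analyse.analysis.cluster_base_info_analysis import ClusterBaseInfoAnalysis from msprof_analyze.cluster_analyse.common_func.context import Context - from 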
msprof_analyze.cluster_analyse.common_func.analysis_loader import get_class_from_name from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.cluster_analyse.recipes.communication_group_map.communication_group_map import CommunicationGroupMap +from msprof_analyze.cluster_analyse.recipes.communication_time_sum.communication_time_sum import \ + CommunicationTimeSum +from msprof_analyze.cluster_analyse.recipes.communication_matrix_sum.communication_matrix_sum import CommMatrixSum logger = get_logger() @@ -34,10 +35,7 @@ logger = get_logger() class AnalysisFacade: default_module = {CommunicationAnalysis, StepTraceTimeAnalysis, CommMatrixAnalysis, HostInfoAnalysis, ClusterBaseInfoAnalysis} - simplified_module = { - CommunicationAnalysisOptimized, StepTraceTimeAnalysis, CommMatrixAnalysisOptimized, HostInfoAnalysis, - ClusterBaseInfoAnalysis - } + simplified_module = {StepTraceTimeAnalysis, ClusterBaseInfoAnalysis, HostInfoAnalysis} def __init__(self, params: dict): self.params = params @@ -47,6 +45,7 @@ class AnalysisFacade: process_list = [] if self.params.get(Constant.DATA_SIMPLIFICATION) and self.params.get(Constant.DATA_TYPE) == Constant.DB: analysis_module = self.simplified_module + self.cluster_analyze_with_recipe() else: analysis_module = self.default_module @@ -78,6 +77,9 @@ class AnalysisFacade: pbar.refresh() def do_recipe(self, recipe_class): + if not recipe_class or len(recipe_class) != 2: + logger.error(f"Invalid input recipe_class, should be two elements, e.g. 
(class_name, class)") + return try: logger.info(f"Recipe {recipe_class[0]} analysis is starting to launch.") with Context.create_context(self.params.get(Constant.PARALLEL_MODE)) as context: @@ -92,3 +94,12 @@ class AnalysisFacade: recipe_class = get_class_from_name(self.params.get(Constant.ANALYSIS_MODE)) if recipe_class: self.do_recipe(recipe_class) + + def cluster_analyze_with_recipe(self): + recipes = [["CommunicationGroupMap", CommunicationGroupMap]] + if self.params.get(Constant.ANALYSIS_MODE) in (Constant.ALL, Constant.COMMUNICATION_TIME): + recipes.append(["CommunicationTimeSum", CommunicationTimeSum]) + if self.params.get(Constant.ANALYSIS_MODE) in (Constant.ALL, Constant.COMMUNICATION_MATRIX): + recipes.append(["CommMatrixSum", CommMatrixSum]) + for recipe_class in recipes: + self.do_recipe(recipe_class) diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/base_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/base_analysis.py index 59b824c1defd3ea93655a71d9727962871d07537..0d14af7693abbf433af14431ff4958e5a24a3cde 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/base_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/base_analysis.py @@ -54,7 +54,7 @@ class BaseAnalysis: if stat_name in op_name: if stat_name != total: return False - return True + return True @abstractmethod def run(self): diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/cluster_base_info_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/cluster_base_info_analysis.py index c5cb2652a1159f9bb645b96c4f60535c74a67859..48d5c6c1d51039d01dbcb919da82de9b6a6a20ec 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/cluster_base_info_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/cluster_base_info_analysis.py @@ -28,7 +28,6 @@ logger = get_logger() class ClusterBaseInfoAnalysis(BaseAnalysis): - KEY_DISTRIBUTED_ARGS = "distributed_args" def __init__(self, param: dict): super().__init__(param) @@ 
-56,7 +55,7 @@ class ClusterBaseInfoAnalysis(BaseAnalysis): result_db = os.path.join(output_path, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) conn, curs = DBManager.create_connect_db(result_db) DBManager.create_tables(result_db, Constant.TABLE_CLUSTER_BASE_INFO) - save_distributed_args = [[json.dumps(self.distributed_args)]] + save_distributed_args = [[Constant.DISTRIBUTED_ARGS, json.dumps(self.distributed_args)]] sql = "insert into {} values ({value})".format(Constant.TABLE_CLUSTER_BASE_INFO, value="?," * (len(save_distributed_args[0]) - 1) + "?") DBManager.executemany_sql(conn, sql, save_distributed_args) @@ -72,9 +71,9 @@ class ClusterBaseInfoAnalysis(BaseAnalysis): except RuntimeError as e: logger.error("Read json failed. %s", str(e)) continue - if not meta_data.get(self.KEY_DISTRIBUTED_ARGS): + if not meta_data.get(Constant.DISTRIBUTED_ARGS): continue - for key, value in meta_data[self.KEY_DISTRIBUTED_ARGS].items(): + for key, value in meta_data[Constant.DISTRIBUTED_ARGS].items(): if key == "rank": continue self.distributed_args.setdefault(key, value) diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/comm_matrix_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/comm_matrix_analysis.py index 3a538509b88e7ce3996aa539e49bf714bb163766..d4df5466c3845f7a2db7f3b5b439059155bc4d47 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/comm_matrix_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/comm_matrix_analysis.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+import copy import os from collections import defaultdict @@ -21,6 +22,8 @@ from msprof_analyze.prof_common.db_manager import DBManager from msprof_analyze.cluster_analyse.common_func.utils import increase_shared_value from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.cluster_analyse.common_func.utils import double_hash +from msprof_analyze.prof_common.file_manager import FileManager logger = get_logger() @@ -69,48 +72,66 @@ class CommMatrixAnalysis(BaseAnalysis): self.combine_link_info(step_dict) def merge_same_links(self, step_dict: dict): - def process_link_key(rank_id, rank_dict): + def update_rank_map(step_dict): + for op_name, op_dict in step_dict.items(): + group_name = op_name.split("@")[-1] + for rank_id, rank_dict in op_dict.items(): + for link_key in rank_dict: + if '-' not in link_key: + logger.warning("%s has an invalid link key %s!", str(op_name), str(link_key)) + break + src_rank = link_key.split('-')[0] + dst_rank = link_key.split('-')[1] + if src_rank == dst_rank: + if src_rank not in project_local_global_rank_map.get(group_name, {}): + project_local_global_rank_map.setdefault(group_name, {})[src_rank] = rank_id + elif project_local_global_rank_map.get(group_name, {}).get(src_rank) != rank_id: + logger.warning(f"In the same communication group {group_name}, global rank {rank_id} " + f"and {project_local_global_rank_map.get(group_name, {}).get(src_rank)} " + f"get the same local rank {src_rank}!") + + def process_link_key(rank_dict): for link_key in rank_dict: if '-' not in link_key: logger.warning("%s has an invalid link key %s!", str(op_name), str(link_key)) break - src_rank = link_key.split('-')[0] - dst_rank = link_key.split('-')[1] - if src_rank == dst_rank: - if src_rank not in project_local_global_rank_map: - project_local_global_rank_map[src_rank] = rank_id - elif project_local_global_rank_map.get(src_rank) != rank_id: - logger.warning("In the same 
communication group, local ranks projecting to global ranks " - "repeat!") self.combine_link(link_info[link_key], rank_dict[link_key]) - def convert_local_to_global_rank(): + def convert_local_to_global_rank(rank_map): tmp_link = {} for link_key, link_dict in link_info.items(): src_rank = link_key.split('-')[0] dst_rank = link_key.split('-')[1] - src_rank = project_local_global_rank_map[src_rank] \ - if src_rank in project_local_global_rank_map else src_rank - dst_rank = project_local_global_rank_map[dst_rank] \ - if dst_rank in project_local_global_rank_map else dst_rank + if src_rank not in rank_map: + logger.warning(f"The src local rank {src_rank} of the operator {op_name} " + f"cannot be mapped to the global rank.") + continue + if dst_rank not in rank_map: + logger.warning(f"The dst local rank {dst_rank} of the operator {op_name} " + f"cannot be mapped to the global rank.") + continue + src_rank = rank_map[src_rank] + dst_rank = rank_map[dst_rank] link_dict[Constant.BANDWIDTH_GB_S] = \ self.compute_ratio(link_dict.get(Constant.TRANSIT_SIZE_MB, 0), link_dict.get(Constant.TRANSIT_TIME_MS, 0)) tmp_link[f"{src_rank}-{dst_rank}"] = link_dict return tmp_link - project_local_global_rank_map = dict() default_value = { Constant.TRANSPORT_TYPE: '', Constant.TRANSIT_TIME_MS: 0, Constant.TRANSIT_SIZE_MB: 0, Constant.OP_NAME: '' } + project_local_global_rank_map = self.get_parallel_group_info() + update_rank_map(step_dict) for op_name, op_dict in step_dict.items(): - link_info = defaultdict(lambda: default_value.copy()) - for rank_id, rank_dict in op_dict.items(): - process_link_key(rank_id, rank_dict) - step_dict[op_name] = convert_local_to_global_rank() + link_info = defaultdict(lambda: copy.deepcopy(default_value)) + group_name = op_name.split("@")[-1] + for rank_dict in op_dict.values(): + process_link_key(rank_dict) + step_dict[op_name] = convert_local_to_global_rank(project_local_global_rank_map.get(group_name, {})) def combine_link_info(self, step_dict: dict): 
default_value = { @@ -119,7 +140,7 @@ class CommMatrixAnalysis(BaseAnalysis): Constant.TRANSIT_SIZE_MB: 0, Constant.OP_NAME: '' } - total_op_info = defaultdict(lambda: default_value.copy()) + total_op_info = defaultdict(lambda: copy.deepcopy(default_value)) for op_name, op_dict in step_dict.items(): if self.check_add_op(op_name): for link_key, link_dict in op_dict.items(): @@ -130,23 +151,15 @@ class CommMatrixAnalysis(BaseAnalysis): link_dict.get(Constant.TRANSIT_TIME_MS, 0)) step_dict[Constant.TOTAL_OP_INFO] = total_op_info - -class CommMatrixAnalysisOptimized(CommMatrixAnalysis): - SAVED_JSON = "cluster_communication_matrix.json" - COMMUNICATION_MATRIX_TABLE = "ClusterCommunicationMatrix" - - def __init__(self, param: dict): - super().__init__(param) - - def dump_db(self): - res_comm_matrix = self.adapter.transfer_matrix_from_json_to_db(self.comm_ops_struct) - output_path = os.path.join(self.cluster_analysis_output_path, Constant.CLUSTER_ANALYSIS_OUTPUT) - result_db = os.path.join(output_path, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) - DBManager.create_tables(result_db, self.COMMUNICATION_MATRIX_TABLE) - conn, cursor = DBManager.create_connect_db(result_db) - if res_comm_matrix: - res_matrix_value = [list(data.values())[1:] for data in res_comm_matrix] - sql = "insert into {} values ({value})".format(self.COMMUNICATION_MATRIX_TABLE, - value="?," * (len(res_matrix_value[0]) - 1) + "?") - DBManager.executemany_sql(conn, sql, res_matrix_value) - DBManager.destroy_db_connect(conn, cursor) + def get_parallel_group_info(self): + parallel_group_info = {} + for profiler_path in self.data_map.values(): + meta_json = os.path.join(profiler_path, "profiler_metadata.json") + if os.path.exists(meta_json): + meta_data = FileManager.read_json_file(meta_json) + for group_name, group_info in meta_data.get("parallel_group_info", {}).items(): + global_ranks = group_info.get("global_ranks") + if isinstance(global_ranks, list) and global_ranks: + global_ranks.sort() + 
parallel_group_info[double_hash(group_name)] = dict(enumerate(global_ranks)) + return parallel_group_info diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/communication_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/communication_analysis.py index 47846522a9543511b3e55579d05b814d1ca9717d..e8ca793f525b0279053bc9848f99f21016ea6295 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/communication_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/communication_analysis.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. +import copy import os from collections import defaultdict @@ -78,7 +79,7 @@ class CommunicationAnalysis(BaseAnalysis): Constant.COMMUNICATION_TIME_INFO: defaultdict(float), Constant.COMMUNICATION_BANDWIDTH_INFO: {} } - total_rank_dict = defaultdict(lambda: default_value.copy()) + total_rank_dict = defaultdict(lambda: copy.deepcopy(default_value)) for _, rank_dict in comm_ops.items(): for rank_id, communication_op_info in rank_dict.items(): for com_info, com_info_dict in communication_op_info.items(): @@ -136,147 +137,3 @@ class CommunicationBandwidthParams: self.step_id = step_id self.transport_type = transport_type self.package_size = package_size - - -class CommunicationAnalysisOptimized(BaseAnalysis): - COMMUNICATION_BANDWIDTH_TABLE = "ClusterCommunicationBandwidth" - COMMUNICATION_TIME_TABLE = "ClusterCommunicationTime" - - def __init__(self, param: dict): - super().__init__(param) - self._communication_ops = param.get(Constant.COMM_DATA_DICT, {}).get(Constant.COMMUNICATION_OPS) - self._communication_group = param.get(Constant.COMM_DATA_DICT, {}).get(Constant.COMMUNICATION_GROUP) - self._aggregate_time = {} - self._aggregate_bandwidth = {} - self._output_time = [] - self._output_bandwidth = [] - - @staticmethod - def _execute(conn, res_data, table_name): - if res_data: - sql = "insert into {} values 
({value})".format(table_name, value="?," * (len(res_data[0]) - 1) + "?") - DBManager.executemany_sql(conn, sql, res_data) - - @staticmethod - def _format_time_data(communication_data): - data_dict = {} - for single_op in communication_data: - formatted_data = CommunicationTimeBean(single_op) - data_dict.setdefault(formatted_data.step_id, {}). \ - setdefault(formatted_data.rank_id, {}). \ - setdefault(formatted_data.group_name, []).extend([formatted_data]) - return data_dict - - def run(self, completed_processes, lock): - if not self._communication_ops[0] or not self._communication_ops[1]: - increase_shared_value(completed_processes, lock) - logger.info("CommunicationAnalysisOptimized completed") - return - self._aggregate_time = self._format_time_data(self._communication_ops[0]) - self._aggregate_bandwidth = self._format_bandwidth_data(self._communication_ops[1]) - self._compute_total_info() - self._dump_data() - increase_shared_value(completed_processes, lock) - logger.info("CommunicationAnalysisOptimized completed") - - def _format_bandwidth_data(self, communication_data: dict): - data_dict = {} - for single_op in communication_data: - formatted_data = CommunicationBandwidthBean(single_op) - rank_set = str(self.collective_group_dict.get(formatted_data.group_name, formatted_data.group_name)) - data_dict.setdefault(rank_set, {}).setdefault(formatted_data.step_id, {}). \ - setdefault(formatted_data.rank_id, {}). \ - setdefault(formatted_data.transport_type, {}). 
\ - setdefault(formatted_data.package_size, []).extend([formatted_data]) - return data_dict - - def _dump_data(self): - output_path = os.path.join(self.cluster_analysis_output_path, Constant.CLUSTER_ANALYSIS_OUTPUT) - result_db = os.path.join(output_path, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) - DBManager.create_tables(result_db, self.COMMUNICATION_TIME_TABLE) - DBManager.create_tables(result_db, self.COMMUNICATION_BANDWIDTH_TABLE) - conn, cursor = DBManager.create_connect_db(result_db) - self._execute(conn, self._output_time, self.COMMUNICATION_TIME_TABLE) - self._execute(conn, self._output_bandwidth, self.COMMUNICATION_BANDWIDTH_TABLE) - DBManager.destroy_db_connect(conn, cursor) - - def _compute_time_info(self): - for step_id, rank_dict in self._aggregate_time.items(): - for rank_id, communication_op_info in rank_dict.items(): - rank_set_dict = {} - for group_name, single_group_op_info in communication_op_info.items(): - total_dict = { - TableConstant.RANK_ID: rank_id, - TableConstant.STEP: step_id, - TableConstant.GROUP_NAME: group_name, - TableConstant.HCCL_OP_NAME: Constant.TOTAL_OP_INFO - } - total_time_info = CommunicationTimeBean(total_dict) - for com_info_dict in single_group_op_info: - total_time_info += com_info_dict - self._output_time.append(com_info_dict.convert_output()) - rank_set = str(self.collective_group_dict.get(group_name)) - if not rank_set: - logger.warning("failed to find rank set with group name: %s.", str(group_name)) - continue - if rank_set_dict.get(rank_set): - rank_set_dict[rank_set] += total_time_info - else: - rank_set_dict[rank_set] = total_time_info - for _, total_time_info in rank_set_dict.items(): - total_time_info.compute_ratio() - self._output_time.append(total_time_info.convert_output()) - - def _process_package_info(self, package_info, total_transit_size, total_transit_time, op_group_set, - communication_bandwidth_params): - total_bw_info = CommunicationBandwidthBean({ - TableConstant.RANK_ID: 
communication_bandwidth_params.rank_id, - TableConstant.STEP: communication_bandwidth_params.step_id, - TableConstant.GROUP_NAME: '', - TableConstant.HCCL_OP_NAME: Constant.TOTAL_OP_INFO, - TableConstant.TRANSPORT_TYPE: communication_bandwidth_params.transport_type, - TableConstant.TRANSIT_SIZE: 0.0, - TableConstant.TRANSIT_TIME: 0.0, - TableConstant.BANDWIDTH: 0.0, - TableConstant.PACKAGE_SIZE: communication_bandwidth_params.package_size - }) - for bandwidth_package_info in package_info: - total_bw_info += bandwidth_package_info - if not total_bw_info.group_name: - total_bw_info.set_group_name(bandwidth_package_info.group_name) - self._output_bandwidth.append(bandwidth_package_info.convert_output()) - op_group = bandwidth_package_info.hccl_op_name + "@" + bandwidth_package_info.group_name - if op_group not in op_group_set: - op_group_set.add(op_group) - total_transit_size += bandwidth_package_info.transit_size - total_transit_time += bandwidth_package_info.transit_time - return total_bw_info, total_transit_size, total_transit_time - - def _compute_bandwidth_info(self): - for _, step_dict in self._aggregate_bandwidth.items(): - for step_id, rank_dict in step_dict.items(): - for rank_id, communication_op_info in rank_dict.items(): - for transport_type, bandwidth_info in communication_op_info.items(): - total_transit_size = 0.0 - total_transit_time = 0.0 - total_info = [] - op_group_set = set() - for package_size, package_info in bandwidth_info.items(): - total_bandwidth_info, total_transit_size, total_transit_time = self._process_package_info( - package_info, total_transit_size, total_transit_time, op_group_set, - CommunicationBandwidthParams(rank_id, step_id, transport_type, package_size) - ) - total_info.append(total_bandwidth_info) - total_bandwidth = total_transit_size / total_transit_time if total_transit_time else 0.0 - for single_total_info in total_info: - single_total_info.set_transit_size(total_transit_size) - 
single_total_info.set_transit_time(total_transit_time) - single_total_info.set_bandwidth(total_bandwidth) - self._output_bandwidth.append(single_total_info.convert_output()) - - def _compute_total_info(self): - if not self._aggregate_time or not self._aggregate_bandwidth: - logger.error("communication data is null.") - return - self._compute_time_info() - self._compute_bandwidth_info() diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/host_info_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/host_info_analysis.py index e8821b3c2971d1b9707a8347b164f031a44d2922..05cd3c9a2ab7332fb885993148505b3128f128fc 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/host_info_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/host_info_analysis.py @@ -21,6 +21,8 @@ from msprof_analyze.cluster_analyse.common_func.utils import increase_shared_val from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.cluster_analyse.cluster_data_preprocess.msprof_data_preprocessor import MsprofDataPreprocessor +from msprof_analyze.cluster_analyse.cluster_data_preprocess.mindspore_data_preprocessor import MindsporeDataPreprocessor logger = get_logger() @@ -33,6 +35,8 @@ class HostInfoAnalysis(BaseAnalysis): super().__init__(param) self.all_rank_host_info = {} self.all_rank_device_info = [] + self.is_msprof = param.get(Constant.IS_MSPROF) + self.is_mindspore = param.get(Constant.IS_MINDSPORE) def run(self, completed_processes=None, lock=None): if self.data_type != Constant.DB: @@ -83,7 +87,7 @@ class HostInfoAnalysis(BaseAnalysis): for rank_id, profiling_dir in self.data_map.items(): host_info = [] rank_device_info = [] - db_path = os.path.join(profiling_dir, Constant.SINGLE_OUTPUT, f"ascend_pytorch_profiler_{rank_id}.db") + db_path = self._get_db_path(rank_id, profiling_dir) if (os.path.exists(db_path) and 
DBManager.check_tables_in_db(db_path, self.TABLE_HOST_INFO)): conn, curs = DBManager.create_connect_db(db_path) sql = "select * from {0}".format(self.TABLE_HOST_INFO) @@ -98,6 +102,13 @@ class HostInfoAnalysis(BaseAnalysis): sql = "select * from {0}".format(self.TABLE_RANK_DEVICE_MAP) rank_device_info = DBManager.fetch_all_data(curs, sql, is_dict=False) DBManager.destroy_db_connect(conn, curs) + if self.is_msprof: + device_id = MsprofDataPreprocessor.get_device_id(profiling_dir) + rank_device_info = [[rank_id, device_id]] + if self.is_mindspore: + prof_dir = MindsporeDataPreprocessor.get_msprof_dir(profiling_dir) + device_id = MsprofDataPreprocessor.get_device_id(prof_dir) + rank_device_info = [[rank_id, device_id]] if not (rank_device_info and rank_device_info[0]): if not print_empty_host_info: print_empty_host_info = f"No {self.TABLE_RANK_DEVICE_MAP} data in {self.data_type} file." @@ -109,3 +120,10 @@ class HostInfoAnalysis(BaseAnalysis): self.all_rank_device_info.extend(rank_device_info) if print_empty_host_info: logger.warning(print_empty_host_info) + + def _get_db_path(self, rank_id, profiling_dir): + if self.is_msprof: + return MsprofDataPreprocessor.get_msprof_profiler_db_path(profiling_dir) + if self.is_mindspore: + return os.path.join(profiling_dir, Constant.SINGLE_OUTPUT, f"ascend_mindspore_profiler_{rank_id}.db") + return os.path.join(profiling_dir, Constant.SINGLE_OUTPUT, f"ascend_pytorch_profiler_{rank_id}.db") diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/msprof_step_trace_time_adapter.py b/profiler/msprof_analyze/cluster_analyse/analysis/msprof_step_trace_time_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..799fd86a477b5ea3e0b59e1831fdc5a7ff398a6b --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/analysis/msprof_step_trace_time_adapter.py @@ -0,0 +1,149 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd +# All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from msprof_analyze.cluster_analyse.prof_bean.step_trace_time_bean import StepTraceTimeBean +from msprof_analyze.prof_common.utils import convert_to_float +from msprof_analyze.prof_common.file_manager import FileManager +from msprof_analyze.cluster_analyse.common_func.time_range_calculator import RangeCaculator +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant +from msprof_analyze.cluster_analyse.common_func.time_range_calculator import CommunicationTimeRange +from msprof_analyze.prof_common.constant import Constant + +logger = get_logger() + + +class MsprofStepTraceTimeAdapter: + COMPUTE = "Computing" + COMM_NOT_OVERLAP = "Communication(Not Overlapped)" + OVERLAPPED = "Overlapped" + COMMUNICATION = "Communication" + FREE = "Free" + STAGE = "Stage" + BUBBLE = "Bubble" + COMM_NOT_OVERLAP_EXCLUDE_RECEIVE = "Communication(Not Overlapped and Exclude Receive)" + PREPARE = "Preparing" + STEP = "Step" + + def __init__(self, file_path): + self.file_path = file_path + self._data = {self.STEP: None, self.COMPUTE: 0, self.COMM_NOT_OVERLAP: 0, self.OVERLAPPED: 0, + self.COMMUNICATION: 0, self.FREE: 0, self.STAGE: 0, self.BUBBLE: 0, + self.COMM_NOT_OVERLAP_EXCLUDE_RECEIVE: 0, self.PREPARE: 0} + + def generate_step_trace_time_data(self): + json_str = [] + for file_path in 
self.file_path: + json_str.extend(FileManager.read_json_file(file_path)) + receive_comm = [] + analysis_data = {} + for data in json_str: + event_name = data.get("name", "") + if event_name in {self.COMMUNICATION, self.COMPUTE, self.FREE, self.COMM_NOT_OVERLAP}: + analysis_data.setdefault(event_name, []).append(data) + elif event_name.startswith('hcom_receive'): + receive_comm.append(data) + for event_type, event_list in analysis_data.items(): + self._data[event_type] = sum((convert_to_float(event.get("dur", 0)) for event in event_list)) + self._data[self.BUBBLE] = sum((convert_to_float(event.get("dur", 0)) for event in receive_comm)) + self._data[self.COMM_NOT_OVERLAP_EXCLUDE_RECEIVE] = self._data[self.COMM_NOT_OVERLAP] - self._data[self.BUBBLE] + self._data[self.OVERLAPPED] = self._data[self.COMMUNICATION] - self._data[self.COMM_NOT_OVERLAP] + e2e_time = self._data[self.FREE] + self._data[self.COMPUTE] + self._data[self.COMM_NOT_OVERLAP] + self._data[self.STAGE] = e2e_time - self._data[self.BUBBLE] + return [StepTraceTimeBean(self._data)] + + +class MsprofStepTraceTimeDBAdapter(MsprofStepTraceTimeAdapter): + OP_NAME = 0 + START_NS = 1 + END_NS = 2 + + def __init__(self, file_path): + super().__init__(file_path) + self.task_db_con = None + self.task_db_curs = None + self.string_id_map = None + self.compute_task_info = None + self.communication_op_info = None + + def generate_step_trace_time_data(self): + try: + self._init_task_info_from_db() + except Exception as err: + logger.error(err) + DBManager.destroy_db_connect(self.task_db_con, self.task_db_curs) + return [] + origin_compute_data = self._get_compute_data() + origin_communication_data, bubble_data = self._get_communication_data() + compute_data = RangeCaculator.merge_continuous_intervals(origin_compute_data) + self._data[self.COMPUTE] = sum(data.end_ts - data.start_ts for data in compute_data) + communication_data = RangeCaculator.merge_continuous_intervals(origin_communication_data) + 
self._data[self.COMMUNICATION] = sum(data.end_ts - data.start_ts for data in communication_data) + pure_communication_data, free_data = RangeCaculator.compute_pipeline_overlap(communication_data, compute_data) + self._data[self.COMM_NOT_OVERLAP] = sum(data.end_ts - data.start_ts for data in pure_communication_data) + self._data[self.FREE] = sum(data.end_ts - data.start_ts for data in free_data) + self._data[self.BUBBLE] = sum(data.end_ts - data.start_ts for data in bubble_data) + self._data[self.COMM_NOT_OVERLAP_EXCLUDE_RECEIVE] = self._data[self.COMM_NOT_OVERLAP] - self._data[self.BUBBLE] + self._data[self.OVERLAPPED] = self._data[self.COMMUNICATION] - self._data[self.COMM_NOT_OVERLAP] + e2e_time = self._data[self.FREE] + self._data[self.COMPUTE] + self._data[self.COMM_NOT_OVERLAP] + self._data[self.STAGE] = e2e_time - self._data[self.BUBBLE] + return [[self._data[self.STEP], self._data[self.COMPUTE] / Constant.NS_TO_US, + self._data[self.COMM_NOT_OVERLAP] / Constant.NS_TO_US, self._data[self.OVERLAPPED] / Constant.NS_TO_US, + self._data[self.COMMUNICATION] / Constant.NS_TO_US, self._data[self.FREE] / Constant.NS_TO_US, + self._data[self.STAGE] / Constant.NS_TO_US, self._data[self.BUBBLE] / Constant.NS_TO_US, + self._data[self.COMM_NOT_OVERLAP_EXCLUDE_RECEIVE] / Constant.NS_TO_US, + self._data[self.PREPARE] / Constant.NS_TO_US]] + + def _init_task_info_from_db(self): + db_path = self.file_path.get(Constant.PROFILER_DB_PATH) + conn, curs = DBManager.create_connect_db(db_path) + if not (conn and curs): + logger.warning(f"Failed to connect to db file: {db_path}") + return + self.task_db_con = conn + self.task_db_curs = curs + if DBManager.judge_table_exists(curs, TableConstant.TABLE_STRING_IDS): + sql = "select id, value from {}".format(TableConstant.TABLE_STRING_IDS) + string_id_data = DBManager.fetch_all_data(curs, sql, is_dict=False) + self.string_id_map = {data[0]: data[1] for data in string_id_data} + if DBManager.judge_table_exists(curs, 
TableConstant.TABLE_COMPUTE_TASK_INFO): + sql = f"select TASK.startNs, TASK.endNs from {TableConstant.TABLE_COMPUTE_TASK_INFO} LEFT JOIN " \ + f"{TableConstant.TABLE_TASK} on {TableConstant.TABLE_TASK}.globalTaskId == " \ + f"{TableConstant.TABLE_COMPUTE_TASK_INFO}.globalTaskId" + self.compute_task_info = DBManager.fetch_all_data(curs, sql, is_dict=False) + if DBManager.judge_table_exists(curs, TableConstant.TABLE_COMMUNICATION_OP): + sql = "select opName, startNs, endNs from {}".format(TableConstant.TABLE_COMMUNICATION_OP) + self.communication_op_info = DBManager.fetch_all_data(curs, sql, is_dict=False) + DBManager.destroy_db_connect(conn, curs) + + def _get_communication_data(self): + communication_data = [] + bubble_data = [] + for op_info in self.communication_op_info or []: + op_start_time = op_info[self.START_NS] + time_range = RangeCaculator.generate_time_range( + op_start_time, op_info[self.END_NS], class_range=CommunicationTimeRange) + communication_data.append(time_range) + op_name = (self.string_id_map or {}).get(op_info[self.OP_NAME], '') + if op_name.startswith('hcom_receive'): + bubble_data.append(time_range) + return communication_data, bubble_data + + def _get_compute_data(self): + compute_data = [] + for compute_task in self.compute_task_info or []: + compute_data.append(RangeCaculator.generate_time_range(compute_task[0], compute_task[1])) + return compute_data diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/stage_group_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/stage_group_analysis.py new file mode 100644 index 0000000000000000000000000000000000000000..d84bbe7de3fc78657ef2497d58aa49622bfbe655 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/analysis/stage_group_analysis.py @@ -0,0 +1,155 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import os +from copy import deepcopy + +import pandas as pd + +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant +from msprof_analyze.cluster_analyse.common_func.utils import UnionFind +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.database_service import DatabaseService +from msprof_analyze.prof_common.file_manager import FileManager + +logger = get_logger() + + +class StageInfoAnalysis: + + def __init__(self, param: dict): + self.cluster_analysis_output_path = param.get(Constant.CLUSTER_ANALYSIS_OUTPUT_PATH, "") + self.cluster_analysis_output_dir = os.path.join(self.cluster_analysis_output_path, + Constant.CLUSTER_ANALYSIS_OUTPUT) + self.data_type = param.get(Constant.DATA_TYPE) + self.simplified_mode = param.get(Constant.DATA_SIMPLIFICATION) + self.communication_data_dict = param.get(Constant.COMM_DATA_DICT, {}) + self.collective_group_dict = {} + self.p2p_link = [] + self.p2p_union_group = [] + self.stage_group = [] + + def run(self): + if not self.prepare_data(): + return [] + self.generate_p2p_union_group() + self.generate_stage_group() + return self.stage_group + + def prepare_data(self): + if Constant.KEY_COMM_GROUP_PARALLEL_INFO in self.communication_data_dict: + comm_group_df = pd.DataFrame(self.communication_data_dict.get(Constant.KEY_COMM_GROUP_PARALLEL_INFO)) + else: + comm_group_df = self.load_communication_group_df() + return self.extract_infos(comm_group_df) + + def load_communication_group_df(self): + if 
not os.path.exists(self.cluster_analysis_output_path): + logger.warning(f"StageInfoAnalysis: {self.cluster_analysis_output_path} does not exist!") + return None + return self.load_communication_group_df_for_text() if self.data_type == Constant.TEXT else ( + self.load_communication_group_df_for_db()) + + def load_communication_group_df_for_text(self): + # check that the file exists + communication_group_json = os.path.join(self.cluster_analysis_output_dir, Constant.COMMUNICATION_GROUP_JSON) + if not os.path.exists(communication_group_json): + logger.warning(f"{communication_group_json} does not exist!") + return None + # read comm_group_parallel_info from communication_group.json + group_data = FileManager.read_json_file(communication_group_json) + if Constant.KEY_COMM_GROUP_PARALLEL_INFO not in group_data: + logger.warning(f"{Constant.KEY_COMM_GROUP_PARALLEL_INFO} not in {Constant.COMMUNICATION_GROUP_JSON}") + return None + # convert to dataframe + comm_group_df = pd.DataFrame(group_data.get(Constant.KEY_COMM_GROUP_PARALLEL_INFO)) + comm_group_df[TableConstant.RANK_SET] = comm_group_df[TableConstant.RANK_SET].apply(set) + return comm_group_df + + def load_communication_group_df_for_db(self): + # load data from cluster_analysis.db + if not os.path.exists(self.cluster_analysis_output_dir): + logger.warning("db path %s does not exist.", self.cluster_analysis_output_dir) + cluster_analysis_db = os.path.join(self.cluster_analysis_output_dir, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + data_service = DatabaseService(cluster_analysis_db, {}) + if self.simplified_mode: + table_communication_group = Constant.TABLE_COMMUNICATION_GROUP_MAPPING + else: + table_communication_group = Constant.TABLE_COMMUNICATION_GROUP + data_service.add_table_for_query(table_communication_group) + data_dict = data_service.query_data() + comm_group_df = data_dict.get(table_communication_group, None) + if comm_group_df is None or comm_group_df.empty: + logger.error(f"There is no
{table_communication_group} data in {cluster_analysis_db}.") + return None + + # process rank_set + comm_group_df[TableConstant.RANK_SET] = comm_group_df[TableConstant.RANK_SET].apply( + lambda s: set(map(int, s.strip('()').split(',')))) + return comm_group_df + + def extract_infos(self, comm_group_df): + if comm_group_df is None: + return False + self.collective_group_dict = \ + comm_group_df[comm_group_df[TableConstant.TYPE] == Constant.COLLECTIVE].set_index(TableConstant.GROUP_NAME)[ + TableConstant.RANK_SET].to_dict() + pp_df = comm_group_df[comm_group_df[TableConstant.TYPE] == Constant.P2P] + pp_df = pp_df[pp_df[TableConstant.PG_NAME].str.lower().str.startswith('pp', na=False)] + self.p2p_link = pp_df[TableConstant.RANK_SET].to_list() + return len(self.p2p_link) > 0 + + def generate_p2p_union_group(self): + self.p2p_link.sort(key=lambda x: min(x)) + while self.p2p_link: + union_set = deepcopy(self.p2p_link[0]) + rm_list = [self.p2p_link[0]] + for link_rank_set_x in self.p2p_link[1:]: + if UnionFind.is_connected(link_rank_set_x, union_set): + union_set = union_set.union(link_rank_set_x) + rm_list.append(link_rank_set_x) + self.p2p_union_group.append(union_set) + self.p2p_link = [element for element in self.p2p_link if element not in rm_list] + + def generate_stage_group(self): + stage_group = {} + for _, rank_set in self.collective_group_dict.items(): + if not self.whether_valid_comm_group(rank_set): + continue + unioned_set = set() + remove_key = [] + for first_rank, stage in stage_group.items(): + if UnionFind.is_connected(rank_set, stage): + unioned_set = UnionFind.union(rank_set, stage, unioned_set) + remove_key.append(first_rank) + if unioned_set: + for key in remove_key: + del stage_group[key] + stage_group[min(unioned_set)] = unioned_set + else: + stage_group[min(rank_set)] = rank_set + first_rank_sort_list = sorted(first_rank for first_rank in stage_group) + self.stage_group = [list(stage_group.get(first_rank, {})) for first_rank in 
first_rank_sort_list] + + def whether_valid_comm_group(self, rank_set: set): + """ + When determining which communication groups can be used to infer stage info, the following groups are ignored: + 1. groups that contain more than one rank from any single p2p union group + """ + for p2p_rank_set in self.p2p_union_group: + if len(rank_set.intersection(p2p_rank_set)) > 1: + return False + return True diff --git a/profiler/msprof_analyze/cluster_analyse/analysis/step_trace_time_analysis.py b/profiler/msprof_analyze/cluster_analyse/analysis/step_trace_time_analysis.py index 5168f63aef5113221766252ac6ef35f8d9504147..5514d4ec25f6c20ee4bb2da1b19e6e4a86e78d3e 100644 --- a/profiler/msprof_analyze/cluster_analyse/analysis/step_trace_time_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/analysis/step_trace_time_analysis.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. import os +import re from msprof_analyze.prof_common.db_manager import DBManager from msprof_analyze.cluster_analyse.common_func.utils import increase_shared_value @@ -21,6 +22,10 @@ from msprof_analyze.cluster_analyse.prof_bean.step_trace_time_bean import StepTr from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.file_manager import FileManager from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.cluster_analyse.analysis.msprof_step_trace_time_adapter import MsprofStepTraceTimeAdapter +from msprof_analyze.cluster_analyse.cluster_data_preprocess.msprof_data_preprocessor import MsprofDataPreprocessor +from msprof_analyze.cluster_analyse.analysis.msprof_step_trace_time_adapter import MsprofStepTraceTimeDBAdapter +from msprof_analyze.cluster_analyse.analysis.stage_group_analysis import StageInfoAnalysis logger = get_logger() @@ -35,11 +40,14 @@ class StepTraceTimeAnalysis: self.collection_path = param.get(Constant.COLLECTION_PATH) self.cluster_analysis_output_path =
param.get(Constant.CLUSTER_ANALYSIS_OUTPUT_PATH) self.data_map = param.get(Constant.DATA_MAP) - self.communication_group = param.get(Constant.COMM_DATA_DICT, {}).get(Constant.COMMUNICATION_GROUP) + self.communication_data_dict = param.get(Constant.COMM_DATA_DICT, {}) self.step_time_dict = {} self.step_data_list = [] self.data_type = param.get(Constant.DATA_TYPE) + self.data_simplification = param.get(Constant.DATA_SIMPLIFICATION) self.distributed_args = None + self.is_msprof = param.get(Constant.IS_MSPROF) + self.is_mindspore = param.get(Constant.IS_MINDSPORE) @staticmethod def get_max_data_row(data_group_list: list): @@ -50,13 +58,33 @@ class StepTraceTimeAnalysis: ret.append(max(item)) return ret + @staticmethod + def find_msprof_json(path): + msprof_pattern = r'^msprof_\d{14}\.json$' + msprof_slice_pattern = r'^msprof_slice_\d{1}_\d{14}\.json$' + msprof_dict, msprof_slice_dict = {}, {} + for file_name in os.listdir(path): + if re.match(msprof_pattern, file_name): + timestamp = re.search(r"\d{14}", file_name).group() + msprof_dict.setdefault(timestamp, []).append(os.path.join(path, file_name)) + elif re.match(msprof_slice_pattern, file_name): + timestamp = re.search(r"\d{14}", file_name).group() + msprof_slice_dict.setdefault(timestamp, []).append(os.path.join(path, file_name)) + if msprof_dict: + max_timestamp = max(msprof_dict.keys()) + return msprof_dict.get(max_timestamp) + if msprof_slice_dict: + max_timestamp = max(msprof_slice_dict.keys()) + return msprof_slice_dict.get(max_timestamp) + return [] + def run(self, completed_processes, lock): self.load_step_trace_time_data() self.analyze_step_time() self.partition_ranks_data() self.dump_data() increase_shared_value(completed_processes, lock) - logger.warning("StepTraceTimeAnalysis completed") + logger.info("StepTraceTimeAnalysis completed") def partition_ranks_data(self): if not self.distributed_args: @@ -132,19 +160,31 @@ class StepTraceTimeAnalysis: metadata = FileManager.read_json_file(metadata_path) 
self.distributed_args = metadata.get(Constant.DISTRIBUTED_ARGS, None) if metadata else None if self.data_type == Constant.TEXT: - step_time_file = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, Constant.STEP_TIME_CSV) - if os.path.exists(step_time_file): - self.step_time_dict[rank_id] = FileManager.read_csv_file(step_time_file, StepTraceTimeBean) + if self.is_msprof: + msprof_json = self.find_msprof_json(os.path.join(profiling_dir_path, "mindstudio_profiler_output")) + self.step_time_dict[rank_id] = MsprofStepTraceTimeAdapter( + msprof_json).generate_step_trace_time_data() + else: + step_time_file = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, Constant.STEP_TIME_CSV) + if os.path.exists(step_time_file): + self.step_time_dict[rank_id] = FileManager.read_csv_file(step_time_file, StepTraceTimeBean) else: - step_time_file = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, - Constant.DB_COMMUNICATION_ANALYZER) - if (os.path.exists(step_time_file) and - DBManager.check_tables_in_db(step_time_file, Constant.TABLE_STEP_TRACE)): - conn, cursor = DBManager.create_connect_db(step_time_file) - sql = "select * from {0}".format(Constant.TABLE_STEP_TRACE) - data = DBManager.fetch_all_data(cursor, sql, is_dict=False) - self.step_time_dict[rank_id] = data - DBManager.destroy_db_connect(conn, cursor) + if self.is_msprof or self.is_mindspore: + profiler_db = MsprofDataPreprocessor.get_msprof_profiler_db_path(profiling_dir_path) if \ + self.is_msprof else os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, + f"ascend_mindspore_profiler_{rank_id}.db") + self.step_time_dict[rank_id] = MsprofStepTraceTimeDBAdapter( + {Constant.PROFILER_DB_PATH: profiler_db}).generate_step_trace_time_data() + else: + step_time_file = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, + Constant.DB_COMMUNICATION_ANALYZER) + if (os.path.exists(step_time_file) and + DBManager.check_tables_in_db(step_time_file, Constant.TABLE_STEP_TRACE)): + conn, cursor = 
DBManager.create_connect_db(step_time_file) + sql = "select * from {0}".format(Constant.TABLE_STEP_TRACE) + data = DBManager.fetch_all_data(cursor, sql, is_dict=False) + self.step_time_dict[rank_id] = data + DBManager.destroy_db_connect(conn, cursor) if not self.step_time_dict.get(rank_id): logger.warning("Rank %s does not have a valid step_trace_time data in %s file.", str(rank_id), str(self.data_type)) @@ -156,7 +196,8 @@ class StepTraceTimeAnalysis: self.step_data_list.append([data_bean.step, Constant.RANK, rank_id] + data_bean.row) else: self.step_data_list.append([data_bean[0], Constant.RANK, rank_id] + list(data_bean[1:])) - stage_list = self.communication_group.get(Constant.P2P) + + stage_list = self.generate_stage_group_list() if not stage_list: return step_group_dict = {} @@ -184,3 +225,16 @@ class StepTraceTimeAnalysis: elif self.step_time_dict.get(rank): return self.step_time_dict[rank][0].all_headers return [] + + def generate_stage_group_list(self): + if Constant.STAGE in self.communication_data_dict: + return self.communication_data_dict[Constant.STAGE] + params = { + Constant.CLUSTER_ANALYSIS_OUTPUT_PATH: self.cluster_analysis_output_path, + Constant.DATA_TYPE: self.data_type, + Constant.DATA_SIMPLIFICATION: self.data_simplification, + Constant.COMM_DATA_DICT: self.communication_data_dict + } + stage_analyzer = StageInfoAnalysis(params) + stage_list = stage_analyzer.run() + return stage_list diff --git a/profiler/msprof_analyze/cluster_analyse/cluster_analysis.py b/profiler/msprof_analyze/cluster_analyse/cluster_analysis.py index 6464bb732ddf57b2790d99ac7148ce3ecaf327ce..233bc5b3bc7579644119f263b1ad0be32808988f 100644 --- a/profiler/msprof_analyze/cluster_analyse/cluster_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/cluster_analysis.py @@ -13,6 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. 
import argparse +import copy import os import sys @@ -21,6 +22,7 @@ sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath( from msprof_analyze.cluster_analyse.analysis.analysis_facade import AnalysisFacade from msprof_analyze.cluster_analyse.cluster_data_preprocess.pytorch_data_preprocessor import PytorchDataPreprocessor from msprof_analyze.cluster_analyse.cluster_data_preprocess.mindspore_data_preprocessor import MindsporeDataPreprocessor +from msprof_analyze.cluster_analyse.cluster_data_preprocess.msprof_data_preprocessor import MsprofDataPreprocessor from msprof_analyze.cluster_analyse.communication_group.communication_group_generator import CommunicationGroupGenerator from msprof_analyze.prof_common.additional_args_manager import AdditionalArgsManager from msprof_analyze.prof_common.constant import Constant @@ -47,6 +49,7 @@ ALL_FEATURE_LIST = COMM_FEATURE_LIST + get_all_recipes() class Interface: ASCEND_PT = "ascend_pt" ASCEND_MS = "ascend_ms" + PROF = "PROF_" def __init__(self, params: dict): self.collection_path = PathManager.get_realpath(params.get(Constant.PROFILING_PATH)) @@ -58,7 +61,6 @@ class Interface: self.matrix_ops = [] self.origin_params = params self.cluster_analysis_output_path = self.get_cluster_analysis_output_path(params) - self.force = params.get(Constant.FORCE, False) AdditionalArgsManager().init(params) def get_cluster_analysis_output_path(self, params): @@ -70,27 +72,41 @@ class Interface: def allocate_prof_data(self): ascend_pt_dirs = [] ascend_ms_dirs = [] + prof_dirs = [] for root, dirs, _ in os.walk(self.collection_path): for dir_name in dirs: if dir_name.endswith(self.ASCEND_PT): ascend_pt_dirs.append(os.path.join(root, dir_name)) if dir_name.endswith(self.ASCEND_MS): ascend_ms_dirs.append(os.path.join(root, dir_name)) + if dir_name.startswith(self.PROF): + prof_dirs.append(os.path.join(root, dir_name)) pytorch_processor = PytorchDataPreprocessor(ascend_pt_dirs) pt_data_map = pytorch_processor.get_data_map() 
- data_type = pytorch_processor.get_data_type() - ms_data_map = MindsporeDataPreprocessor(ascend_ms_dirs).get_data_map() + pt_data_type = pytorch_processor.get_data_type() + ms_processor = MindsporeDataPreprocessor(ascend_ms_dirs) + ms_data_map = ms_processor.get_data_map() + ms_data_type = ms_processor.get_data_type() if pt_data_map and ms_data_map: logger.error("Can not analyze pytorch and mindspore meantime.") - return [] - return (pt_data_map, data_type) if pt_data_map else (ms_data_map, Constant.TEXT) + return {} + if pt_data_map: + return {Constant.DATA_MAP: pt_data_map, Constant.DATA_TYPE: pt_data_type, Constant.IS_MSPROF: False} + if ms_data_map: + return {Constant.DATA_MAP: ms_data_map, Constant.DATA_TYPE: ms_data_type, Constant.IS_MSPROF: False, + Constant.IS_MINDSPORE: True} + msprof_processor = MsprofDataPreprocessor(prof_dirs) + prof_data_map = msprof_processor.get_data_map() + prof_data_type = msprof_processor.get_data_type() + return {Constant.DATA_MAP: prof_data_map, Constant.DATA_TYPE: prof_data_type, Constant.IS_MSPROF: True} def run(self): PathManager.check_input_directory_path(self.collection_path) PathManager.check_input_directory_path(self.cluster_analysis_output_path) PathManager.check_path_owner_consistent([self.collection_path, self.cluster_analysis_output_path]) - data_map, data_type = self.allocate_prof_data() + data_dict = self.allocate_prof_data() + data_map, data_type = data_dict.get(Constant.DATA_MAP), data_dict.get(Constant.DATA_TYPE) if not data_map: logger.warning("Can not get rank info or profiling data.") return @@ -98,34 +114,33 @@ class Interface: logger.error("The current folder contains both DB and other files. 
Please check.") return - params = { + params = copy.deepcopy(self.origin_params) + params.update({ Constant.COLLECTION_PATH: self.collection_path, + Constant.ANALYSIS_MODE: self.analysis_mode, Constant.DATA_MAP: data_map, Constant.DATA_TYPE: data_type, - Constant.ANALYSIS_MODE: self.analysis_mode, - Constant.CLUSTER_ANALYSIS_OUTPUT_PATH: self.cluster_analysis_output_path, - Constant.DATA_SIMPLIFICATION: self.origin_params.get(Constant.DATA_SIMPLIFICATION, False), - Constant.FORCE: self.force - } - + Constant.IS_MSPROF: data_dict.get(Constant.IS_MSPROF, False), + Constant.IS_MINDSPORE: data_dict.get(Constant.IS_MINDSPORE, False), + Constant.CLUSTER_ANALYSIS_OUTPUT_PATH: self.cluster_analysis_output_path + }) if self.analysis_mode in COMM_FEATURE_LIST: FileManager.create_output_dir(self.cluster_analysis_output_path) PathManager.check_path_writeable(self.cluster_analysis_output_path) logger.info("Begin generate communication data.") - comm_data_dict = CommunicationGroupGenerator(params).generate() - logger.info("Communication data read completed.") - params[Constant.COMM_DATA_DICT] = comm_data_dict + if data_type == Constant.TEXT or not params.get(Constant.DATA_SIMPLIFICATION): + comm_data_dict = CommunicationGroupGenerator(params).generate() + logger.info("Communication data read completed.") + params[Constant.COMM_DATA_DICT] = comm_data_dict AnalysisFacade(params).cluster_analyze() logger.info("The cluster analysis result file has been generated: %s", self.cluster_analysis_output_path) - return - - if data_type != Constant.DB: + elif data_type == Constant.TEXT: logger.error("The current analysis node only supports DB as input data. 
Please check.") - return - FileManager.create_output_dir(self.cluster_analysis_output_path, is_overwrite=True) - self.origin_params.update(params) - AnalysisFacade(self.origin_params).recipe_analyze() + else: + FileManager.create_output_dir(self.cluster_analysis_output_path, is_overwrite=True) + PathManager.check_path_writeable(self.cluster_analysis_output_path) + AnalysisFacade(params).recipe_analyze() def cluster_analysis_main(): @@ -142,10 +157,16 @@ def cluster_analysis_main(): parser.add_argument("--parallel_mode", type=str, help="context mode", default="concurrent") parser.add_argument("--export_type", type=str, help="recipe export type", choices=["db", "notebook"], default="db") parser.add_argument("--rank_list", type=str, help="Rank id list", default='all') + parser.add_argument("--step_id", type=int, help="Step id", default=Constant.VOID_STEP) args, extra_args = parser.parse_known_args() parameter = vars(args) - parameter[Constant.EXTRA_ARGS] = extra_args + if extra_args: + if parameter.get(Constant.MODE) in COMM_FEATURE_LIST: + unknown_args = " ".join(extra_args) + logger.warning(f"Invalid parameters: {unknown_args}. It will not have any effect.") + else: + parameter[Constant.EXTRA_ARGS] = extra_args Interface(parameter).run() diff --git a/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/mindspore_data_preprocessor.py b/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/mindspore_data_preprocessor.py index eaa14fb71f9583b65b0ed18c6ad9727d913eb2fe..c22ecb1ad907f8d20bfe2b380db798928d0343b9 100644 --- a/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/mindspore_data_preprocessor.py +++ b/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/mindspore_data_preprocessor.py @@ -12,11 +12,14 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+import os +import re from collections import defaultdict from msprof_analyze.cluster_analyse.cluster_data_preprocess.data_preprocessor import DataPreprocessor - from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.file_manager import FileManager logger = get_logger() @@ -25,6 +28,15 @@ class MindsporeDataPreprocessor(DataPreprocessor): def __init__(self, path_list: list): super().__init__(path_list) + self.data_type = set() + + @classmethod + def get_msprof_dir(cls, profiling_path): + prof_pattren = r"^PROF_\d+_\d+_[0-9a-zA-Z]+" + for file_name in os.listdir(profiling_path): + if re.match(prof_pattren, file_name): + return os.path.join(profiling_path, file_name) + return "" def get_data_map(self) -> dict: rank_id_map = defaultdict(list) @@ -33,6 +45,15 @@ class MindsporeDataPreprocessor(DataPreprocessor): if rank_id < 0: logger.error("fail to get rankid or rankid invalid.") continue + for file_name in os.listdir(dir_name): + if file_name.startswith(self.PROFILER_INFO_HEAD) and file_name.endswith(self.PROFILER_INFO_EXTENSION): + file_path = os.path.join(dir_name, file_name) + config = FileManager.read_json_file(file_path) + export_type = (config.get(Constant.PROFILER_PARAMETER, {}).get(Constant.EXPORT_TYPE, Constant.TEXT)) + if isinstance(export_type, list): + self.data_type.add(Constant.DB if Constant.DB in export_type else Constant.TEXT) + else: + self.data_type.add(export_type) rank_id_map[rank_id].append(dir_name) try: @@ -42,3 +63,8 @@ class MindsporeDataPreprocessor(DataPreprocessor): except Exception as e: raise RuntimeError("Found invalid directory name!") from e return self.data_map + + def get_data_type(self): + if len(self.data_type) == 1: + return self.data_type.pop() + return Constant.INVALID diff --git a/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/msprof_data_preprocessor.py 
b/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/msprof_data_preprocessor.py new file mode 100644 index 0000000000000000000000000000000000000000..f751de56fe3d622e705c481220cf4a6760b163d0 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/cluster_data_preprocess/msprof_data_preprocessor.py @@ -0,0 +1,120 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import os +import re +from collections import defaultdict + +from msprof_analyze.cluster_analyse.cluster_data_preprocess.data_preprocessor import DataPreprocessor +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.file_manager import FileManager + +logger = get_logger() + + +class MsprofDataPreprocessor(DataPreprocessor): + DEVICE_PATTERN = r"device_\d{1,2}$" + INFO_JSON_PATTERN = r"^info\.json\.\d{1,2}$" + DB_PATTERN = r"^msprof_\d{1,20}\.db$" + + def __init__(self, path_list: list): + super().__init__(path_list) + self.data_type = set() + + @classmethod + def get_msprof_profiler_db_path(cls, data_path): + msprof_db_pattern = r"^msprof_\d{14}\.db$" + msprof_db_list = [] + for file_name in os.listdir(data_path): + if re.match(msprof_db_pattern, file_name): + msprof_db_list.append(file_name) + if msprof_db_list: + msprof_db_list.sort(key=lambda x: x.split(".")[0].split("_")[-1]) + return os.path.join(data_path, 
msprof_db_list[-1]) + return "" + + @classmethod + def get_device_id(cls, data_path): + for file_name in os.listdir(data_path): + if re.match(cls.DEVICE_PATTERN, file_name): + return int(file_name.split("_")[-1]) + return None + + def get_data_map(self) -> dict: + prof_data_uid = defaultdict(list) + prof_data_rank = defaultdict(list) + for dir_name in self.path_list: + info_json_file = self._find_info_json_file(dir_name) + if not info_json_file: + logger.error(f"Profiling data in not completed, please check the info.json file in the path {dir_name}") + continue + + if self._check_db_type(dir_name): + self.data_type.add(Constant.DB) + elif os.path.exists(os.path.join(dir_name, "mindstudio_profiler_output")): + if os.path.exists(os.path.join(dir_name, "analyze")): + self.data_type.add(Constant.TEXT) + else: + logger.error(f"The profiling data has not been fully parsed. You can parse it by executing " + f"the following command: msprof --analyze=on --output={dir_name}") + continue + else: + logger.error(f"The profiling data has not been fully parsed. 
You can parse it by executing " + f"the following command: msprof --export=on --output={dir_name}; " + f"msprof --analyze=on --output={dir_name}") + continue + info_json = FileManager.read_json_file(info_json_file) + rank_id = info_json.get("rank_id") + if rank_id != Constant.INVALID_RETURN: + prof_data_rank[rank_id].append(dir_name) + continue + host_id = info_json.get("hostUid") + device_id = int(os.path.basename(info_json_file).split(".")[-1]) + prof_data_uid[(host_id, device_id)].append(dir_name) + + if prof_data_rank: + for rank_id, dir_list in prof_data_rank.items(): + dir_list.sort(key=lambda x: x.split('_')[-2]) + self.data_map[rank_id] = dir_list[0] + else: + ordered_keys = sorted(prof_data_uid.keys(), key=lambda x: (x[0], x[1])) + rank_id = 0 + for key in ordered_keys: + dir_list = prof_data_uid[key] + dir_list.sort(key=lambda x: x.split('_')[-2]) + self.data_map[rank_id] = dir_list[0] + rank_id += 1 + return self.data_map + + def get_data_type(self): + if len(self.data_type) == 1: + return self.data_type.pop() + return Constant.INVALID + + def _find_info_json_file(self, dir_name): + for file_name in os.listdir(dir_name): + file_path = os.path.join(dir_name, file_name) + if not os.path.isdir(file_path): + continue + for device_file in os.listdir(file_path): + if re.match(self.INFO_JSON_PATTERN, device_file): + return os.path.join(dir_name, file_name, device_file) + return None + + def _check_db_type(self, dir_name): + for file_name in os.listdir(dir_name): + if re.match(self.DB_PATTERN, file_name): + return True + return False diff --git a/profiler/msprof_analyze/cluster_analyse/common_func/context.py b/profiler/msprof_analyze/cluster_analyse/common_func/context.py index e4f716e90d991645de514e2bc6ecd12920c0c9e1..cde351508c0e814867d224b7ce5c454e0813a71b 100644 --- a/profiler/msprof_analyze/cluster_analyse/common_func/context.py +++ b/profiler/msprof_analyze/cluster_analyse/common_func/context.py @@ -16,7 +16,6 @@ import os from functools import partial 
from concurrent import futures -from collections import defaultdict from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger @@ -69,7 +68,6 @@ class ConcurrentContext(Context): super().__init__() self._custom = executor is None self._executor = executor or futures.ProcessPoolExecutor(max_workers=os.cpu_count()) - self.future_dict = defaultdict(list) def __enter__(self): if self._executor is None: @@ -86,15 +84,12 @@ class ConcurrentContext(Context): def map(self, func, *iterables, **kwargs): partial_func = partial(func, **kwargs) - return list(self._executor.map(partial_func, *iterables)) + try: + res = list(self._executor.map(partial_func, *iterables)) + except Exception as err: + logger.error(err) + return [] + return res def wait(self, waitable): return waitable - - def submit(self, name, func, *args, **kwargs): - self.future_dict[name].append(self._executor.submit(func, *args, **kwargs)) - - def wait_all_futures(self): - for _, future_list in self.future_dict.items(): - for future in future_list: - future.result() \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/common_func/table_constant.py b/profiler/msprof_analyze/cluster_analyse/common_func/table_constant.py index 3acb8713e21f0337dae4973044667fb64707eba1..eae250b0f7a7b5ab138aeae0eed9aba863e1cbf1 100644 --- a/profiler/msprof_analyze/cluster_analyse/common_func/table_constant.py +++ b/profiler/msprof_analyze/cluster_analyse/common_func/table_constant.py @@ -13,7 +13,6 @@ # See the License for the specific language governing permissions and # limitations under the License. 
class TableConstant: - RANK_SET = "rank_set" STEP = "step" RANK_ID = "rank_id" @@ -39,21 +38,15 @@ class TableConstant: DST_RANK = "dst_rank" TRANSPORT_TYPE = "transport_type" OPNAME = "op_name" + GROUP_ID = "group_id" + PG_NAME = "pg_name" + NAME = "name" + VALUE = "value" - -class ProfilerTableConstant: - - # COMMUNICATION OP - OP_ID = "opId" - OP_NAME = "opName" - START_NS = "startNS" - END_NS = "endNS" - CONNECTION_ID = "connectionId" - GROUP_NAME = "groupName" - RELAY = "relay" - RETRY = "retry" - DATA_TYPE = "dataType" - ALG_TYPE = "algType" - COUNT = "count" - OP_TYPE = "opType" - WAIT_NS = "waitNS" + # table name + TABLE_STRING_IDS = "STRING_IDS" + TABLE_COMPUTE_TASK_INFO = "COMPUTE_TASK_INFO" + TABLE_COMMUNICATION_OP = "COMMUNICATION_OP" + TABLE_TASK = "TASK" + TABLE_META_DATA = "META_DATA" + TABLE_COMM_ANALYZER_MATRIX = "CommAnalyzerMatrix" diff --git a/profiler/msprof_analyze/cluster_analyse/common_func/tables_config.py b/profiler/msprof_analyze/cluster_analyse/common_func/tables_config.py index 7c948ead594dcf5c67d1e70ff417b7bedf2b9265..7a40d9977d89883f8a33b743e5e80d73f2bfa328 100644 --- a/profiler/msprof_analyze/cluster_analyse/common_func/tables_config.py +++ b/profiler/msprof_analyze/cluster_analyse/common_func/tables_config.py @@ -31,7 +31,10 @@ class TablesConfig: ], "CommunicationGroupMap": [ ("type", "TEXT, null"), - ("rank_set", "TEXT, null") + ("rank_set", "TEXT, null"), + ("group_name", "TEXT, null"), + ("group_id", "TEXT, null"), + ("pg_name", "TEXT, null") ], "ClusterCommAnalyzerBandwidthMap": [ ("rank_set", "TEXT, null"), @@ -130,10 +133,13 @@ class TablesConfig: ], "CommunicationGroupMappingMap": [ ("type", "TEXT, null"), + ("rank_set", "TEXT, null"), ("group_name", "TEXT, null"), - ("rank_set", "TEXT, null") + ("group_id", "TEXT, null"), + ("pg_name", "TEXT, null") ], "ClusterBaseInfoMap": [ - ("distributed_args", "TEXT, null") + ("key", "TEXT, null"), + ("value", "TEXT, null") ] } diff --git 
a/profiler/msprof_analyze/cluster_analyse/common_func/time_range_calculator.py b/profiler/msprof_analyze/cluster_analyse/common_func/time_range_calculator.py new file mode 100644 index 0000000000000000000000000000000000000000..36ee94067ac37aa57f04cc1f64855be19b9807e0 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/common_func/time_range_calculator.py @@ -0,0 +1,99 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from dataclasses import dataclass + +DEFAULT_INT_VALUE = -1 + + +@dataclass +class TimeRange: + start_ts: int = DEFAULT_INT_VALUE + end_ts: int = DEFAULT_INT_VALUE + + +class CommunicationTimeRange(TimeRange): + + def __init__(self): + super().__init__() + + +class RangeCaculator: + + @staticmethod + def generate_time_range(start, end, class_range=TimeRange): + time_range = class_range() + time_range.start_ts, time_range.end_ts = start, end + return time_range + + @staticmethod + def merge_continuous_intervals(time_range_list: list): + result = [] + if not time_range_list: + return result + time_range_list.sort(key=lambda x: x.start_ts) + current_range = time_range_list[0] + for time_range in time_range_list: + if time_range.start_ts <= current_range.end_ts: + current_range.end_ts = max(current_range.end_ts, time_range.end_ts) + else: + result.append(current_range) + current_range = time_range + result.append(current_range) + return result + + @staticmethod + def compute_pipeline_overlap(communication_range, compute_range): + free_time_range = [] + pure_communication_range = [] + time_range_list = sorted(communication_range + compute_range, key=lambda x: x.start_ts) + if not time_range_list: + return pure_communication_range, free_time_range + + min_range = time_range_list.pop(0) + for time_range in time_range_list: + if min_range.end_ts - time_range.start_ts < 0: + free_time_range.append( + RangeCaculator.generate_time_range(min_range.end_ts, time_range.start_ts) + ) + if isinstance(min_range, CommunicationTimeRange): + pure_communication_range.append( + RangeCaculator.generate_time_range(min_range.start_ts, min_range.end_ts) + ) + min_range = time_range + continue + if min_range.end_ts - time_range.end_ts < 0: + if isinstance(min_range, CommunicationTimeRange): + pure_communication_range.append( + RangeCaculator.generate_time_range(min_range.start_ts, time_range.start_ts) + ) + min_range = RangeCaculator.generate_time_range(min_range.end_ts, time_range.end_ts) 
+ if isinstance(time_range, CommunicationTimeRange): + min_range = RangeCaculator.generate_time_range( + min_range.end_ts, time_range.end_ts, class_range=CommunicationTimeRange + ) + else: + if isinstance(min_range, CommunicationTimeRange): + pure_communication_range.append( + RangeCaculator.generate_time_range(min_range.start_ts, time_range.start_ts) + ) + min_range = RangeCaculator.generate_time_range( + time_range.end_ts, min_range.end_ts, class_range=CommunicationTimeRange + ) + if isinstance(time_range, CommunicationTimeRange): + min_range = RangeCaculator.generate_time_range(time_range.end_ts, min_range.end_ts) + if isinstance(min_range, CommunicationTimeRange): + pure_communication_range.append(min_range) + return pure_communication_range, free_time_range diff --git a/profiler/msprof_analyze/cluster_analyse/common_func/utils.py b/profiler/msprof_analyze/cluster_analyse/common_func/utils.py index 7c867cba32a5ca72423988ac1805d88d0de75a0e..1c6a23a9ff008823c074c34d37dfa293a838bcdd 100644 --- a/profiler/msprof_analyze/cluster_analyse/common_func/utils.py +++ b/profiler/msprof_analyze/cluster_analyse/common_func/utils.py @@ -18,6 +18,10 @@ from multiprocessing import Value, Lock import numpy as np import pandas as pd +from msprof_analyze.prof_common.logger import get_logger + +logger = get_logger() + def format_columns(df: pd.DataFrame): formatted_df = df.rename( @@ -66,7 +70,7 @@ def stdev(df, aggregated): var_sum = np.dot(df["totalCount"] - 1, df["stdev"] ** 2) deviation = df["averageNs"] - aggregated["averageNs"].loc[df.name] dev_sum = np.dot(df["totalCount"], deviation ** 2) - return np.sqrt((var_sum + dev_sum) / (instance - 1)) + return np.sqrt((var_sum + dev_sum) / (instance - 1)) if (instance - 1) else 0 def convert_unit(df: pd.DataFrame, src_unit, dst_unit): @@ -81,32 +85,37 @@ def increase_shared_value(shared_value: Value, lock: Lock): shared_value.value += 1 -def detect_outliers_z_score(data, threshold=3): - """ - Use the Z-Score method to determine whether outliers exist. - Z-Score
is a statistical measure of how many standard deviations a data point lies from the mean. - If a data point's Z-Score exceeds the threshold (3 by default), it is treated as an outlier. - - Returns: - - True: outliers exist - - False: no outliers exist - """ - # Compute the mean of the data - mean = np.mean(data) # the mean is the center of the data - - # Compute the standard deviation of the data - std = np.std(data) # the standard deviation measures the spread of the data - - # If the standard deviation is 0, return False directly (no outliers) - if std == 0: - return False - - # Compute the upper and lower Z-Score thresholds - z_scores_upper_threshold = threshold * std + mean - z_scores_lower_threshold = -threshold * std + mean - - # Check whether any data point's Z-Score exceeds the thresholds - has_outliers = any(x > z_scores_upper_threshold or x < z_scores_lower_threshold for x in data) - - # Return a boolean indicating whether outliers exist - return has_outliers \ No newline at end of file +def double_hash(data): + uint32_bits = 32 + uint32_max = 0xFFFFFFFF # maximum value of a 32-bit unsigned integer + prime = [29, 131] + hash_values = [0, 0] + + for d in data: + hash_values[0] = (hash_values[0] * prime[0] + ord(d)) & uint32_max + hash_values[1] = (hash_values[1] * prime[1] + ord(d)) & uint32_max + + return ((hash_values[0] << uint32_bits) | hash_values[1]) + + +class UnionFind(object): + """Disjoint Set Union""" + + @classmethod + def union(cls, *args): + result = set() + for s in args: + if not isinstance(s, set): + logger.warning(f"All arguments must be sets, got {type(s).__name__}") + return set() + result |= s + return result + + @classmethod + def is_connected(cls, first_set: set, second_set: set): + """ + check whether set p and set q are connected + """ + if not isinstance(first_set, set) or not isinstance(second_set, set): + return False + return len(first_set & second_set) > 0 diff --git a/profiler/msprof_analyze/cluster_analyse/communication_group/__init__.py b/profiler/msprof_analyze/cluster_analyse/communication_group/__init__.py index 7101187a2c2619f3b1c20dded14b433950b4c662..de0604079e1323b2749bc801a6e8326893c73498 100644 --- a/profiler/msprof_analyze/cluster_analyse/communication_group/__init__.py +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/__init__.py @@ -11,4 +11,4 @@ # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and -# limitations under the License. +# limitations under the License. \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/communication_group/base_communication_group.py b/profiler/msprof_analyze/cluster_analyse/communication_group/base_communication_group.py index 8f6625f8f6bbf646cfd77099b70c36398680f67a..2765ba263c809a0b0c4bf0d42da5d60b6066e182 100644 --- a/profiler/msprof_analyze/cluster_analyse/communication_group/base_communication_group.py +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/base_communication_group.py @@ -18,36 +18,45 @@ from abc import abstractmethod from collections import defaultdict from copy import deepcopy from multiprocessing import Pool +import pandas as pd from msprof_analyze.cluster_analyse.cluster_utils.data_transfer_adapter import DataTransferAdapter +from msprof_analyze.cluster_analyse.common_func.utils import double_hash from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.file_manager import FileManager logger = get_logger() class BaseCommunicationGroup: + KEY_PARALLEL_GROUP_INFO = "parallel_group_info" + KEY_COMM_GROUP_PARALLEL_INFO = "comm_group_parallel_info" + def __init__(self, params: dict): self.collection_path = params.get(Constant.COLLECTION_PATH) self.cluster_analysis_output_path = params.get(Constant.CLUSTER_ANALYSIS_OUTPUT_PATH) self.data_map = params.get(Constant.DATA_MAP) self.data_type = params.get(Constant.DATA_TYPE) self.analysis_mode = params.get(Constant.ANALYSIS_MODE) + self.is_msprof = params.get(Constant.IS_MSPROF) self.rank_comm_dir_dict = {} - self.p2p_link = [] self.collective_group_dict = defaultdict(set) - self.p2p_comm_group = [] + self.p2p_group_dict = defaultdict(set) self.communication_group = {} + self.parallel_group_info = {} 
self.communication_ops = [] self.matrix_ops = [] self.adapter = DataTransferAdapter() + self.comm_group_parallel_info_df = None def load_communication_data(self): comm_op_dirs = [] for rank_id, profiling_dir_path in self.data_map.items(): if self.data_type == Constant.TEXT: - comm_dir = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, Constant.COMM_JSON) - matrix_dir = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, Constant.COMM_MATRIX_JSON) + output_dir = "analyze" if self.is_msprof else Constant.SINGLE_OUTPUT + comm_dir = os.path.join(profiling_dir_path, output_dir, Constant.COMM_JSON) + matrix_dir = os.path.join(profiling_dir_path, output_dir, Constant.COMM_MATRIX_JSON) else: comm_dir = os.path.join(profiling_dir_path, Constant.SINGLE_OUTPUT, Constant.DB_COMMUNICATION_ANALYZER) matrix_dir = comm_dir @@ -61,64 +70,36 @@ class BaseCommunicationGroup: with Pool(processes=max_processes) as p: self.rank_comm_dir_dict = p.map(self.read_communication_func, comm_op_dirs) - def set_p2p_groups(self): - self.p2p_link = sorted(self.p2p_link, key=lambda x: min(x)) - while self.p2p_link: - union_set = deepcopy(self.p2p_link[0]) - rm_list = [self.p2p_link[0]] - for _, link_rank_set_x in enumerate(self.p2p_link[1:]): - if UnionFind.is_connected(link_rank_set_x, union_set): - union_set = union_set.union(link_rank_set_x) - rm_list.append(link_rank_set_x) - self.p2p_comm_group.append(union_set) - self.p2p_link = [element for element in self.p2p_link if element not in rm_list] - - def generate_collective_communication_group(self): + def generate_communication_group(self): self.communication_group[Constant.COLLECTIVE] = \ [list(group) for _, group in self.collective_group_dict.items()] - - def generate_p2p_communication_group(self): - stage_group = {} - for _, rank_set in self.collective_group_dict.items(): - if not self.whether_valid_comm_group(rank_set): - continue - unioned_set = set() - remove_key = [] - for first_rank, stage in stage_group.items(): - if 
UnionFind.is_connected(rank_set, stage): - unioned_set = UnionFind.union(rank_set, stage, unioned_set) - remove_key.append(first_rank) - if unioned_set: - for key in remove_key: - del stage_group[key] - stage_group[min(unioned_set)] = unioned_set - else: - stage_group[min(rank_set)] = rank_set - first_rank_sort_list = sorted([first_rank for first_rank in stage_group]) self.communication_group[Constant.P2P] = \ - [list(stage_group.get(first_rank, {})) for first_rank in first_rank_sort_list] - - def whether_valid_comm_group(self, rank_set: set): - """ - while distinguish which communication group should be used to infer stage info, these group should be ignored: - 1. group can not include more than 1 rank in every single p2p group - """ - for p2p_rank_set in self.p2p_comm_group: - if len(rank_set.intersection(p2p_rank_set)) > 1: - return False - return True + [list(group) for _, group in self.p2p_group_dict.items()] @abstractmethod def read_communication_func(self, params: tuple): pass + def read_parallel_group_info(self): + for _, profiling_dir_path in self.data_map.items(): + meta_file = os.path.join(profiling_dir_path, Constant.PROFILER_METADATA) + if not os.path.exists(meta_file): + continue + meta_data = FileManager.read_json_file(meta_file) + if self.KEY_PARALLEL_GROUP_INFO not in meta_data: + continue + for group_id, group_info in meta_data[self.KEY_PARALLEL_GROUP_INFO].items(): + if group_id not in self.parallel_group_info: + self.parallel_group_info[group_id] = group_info + def analyze_communication_data(self): for rank_id, rank_id_comm_dict, rank_id_matrix_dict in self.rank_comm_dir_dict: for step_id, step_id_dict in rank_id_comm_dict.items(): if not isinstance(step_id_dict, dict): logger.warning("rank%s's communication.json has a wrong data struct.", rank_id) continue - self.get_collective_ops_name(rank_id, step_id_dict.get(Constant.COLLECTIVE)) + self.add_collective_group_rank_map(rank_id, step_id_dict.get(Constant.COLLECTIVE, {})) + 
self.add_p2p_group_rank_map(rank_id, step_id_dict.get(Constant.P2P, {})) for comm_op_type, comm_op_dict in step_id_dict.items(): self.add_communication_ops(rank_id, step_id, comm_op_type, comm_op_dict) @@ -126,8 +107,10 @@ class BaseCommunicationGroup: if not isinstance(step_id_dict, dict): logger.warning("rank%s's communication_matrix.json has a wrong data struct.", rank_id) continue - self.set_p2p_link(rank_id, step_id, rank_id_matrix_dict) - self.get_collective_ops_name(rank_id, step_id_dict.get(Constant.COLLECTIVE)) + self.add_matrix_ops(rank_id, step_id, step_id_dict) + self.add_collective_group_rank_map(rank_id, step_id_dict.get(Constant.COLLECTIVE, {})) + self.add_p2p_group_rank_map(rank_id, step_id_dict.get(Constant.P2P, {})) + @abstractmethod def dump_data(self): @@ -145,44 +128,26 @@ class BaseCommunicationGroup: def generate(self): self.load_communication_data() self.analyze_communication_data() - self.set_p2p_groups() - self.generate_collective_communication_group() - self.generate_p2p_communication_group() + self.read_parallel_group_info() + self.generate_communication_group() + self.analyze_parallel_group_info() self.dump_data() return self.collect_comm_data() - def set_p2p_link(self, rank_id: int, step_id: str, rank_id_matrix_dict: dict): - ops = rank_id_matrix_dict.get(step_id, {}) - self.add_matrix_ops(rank_id, step_id, ops) - if not ops: - logger.warning( - "rank%s %s do not have communication matrix ops data.", rank_id, step_id - ) - return - p2p_ops = ops.get(Constant.P2P, {}) - for op_name, link_dict in p2p_ops.items(): - self.append_p2p_link(op_name, link_dict) - - def append_p2p_link(self, op_name, link_dict): - for link in link_dict: - if '-' not in link: - logger.warning("%s has an invalid link key %s!", op_name, link) - break - src_rank = int(link.split('-')[0]) - dst_rank = int(link.split('-')[1]) - if src_rank != dst_rank: - rank_set = {src_rank, dst_rank} - if rank_set in self.p2p_link: - continue - self.p2p_link.append(rank_set) - - 
def get_collective_ops_name(self, rank_id: int, comm_op_dict: dict): + def add_collective_group_rank_map(self, rank_id: int, comm_op_dict: dict): for comm_op in comm_op_dict: if comm_op.startswith('Total'): continue group_name = comm_op.split('@')[-1] self.collective_group_dict[group_name].add(rank_id) + def add_p2p_group_rank_map(self, rank_id: int, comm_op_dict: dict): + for comm_op in comm_op_dict: + if comm_op.startswith('Total'): + continue + group_name = comm_op.split('@')[-1] + self.p2p_group_dict[group_name].add(rank_id) + def add_communication_ops(self, rank_id: str, step_id: str, comm_op_type: str, comm_op_dict: dict): for comm_op in comm_op_dict: if comm_op.startswith('Total'): @@ -215,21 +180,28 @@ class BaseCommunicationGroup: Constant.COMM_OP_INFO: op_link_info }) + def analyze_parallel_group_info(self): + # create comm group dataframe + comm_group_cols = ["type", "rank_set", "group_name"] + comm_group_df = pd.DataFrame(columns=comm_group_cols) + for group_name, rank_set in self.collective_group_dict.items(): + comm_group_df.loc[comm_group_df.shape[0]] = [Constant.COLLECTIVE, list(rank_set), group_name] + for group_name, rank_set in self.p2p_group_dict.items(): + comm_group_df.loc[comm_group_df.shape[0]] = [Constant.P2P, list(rank_set), group_name] + + # create parallel group dataframe + parallel_group_cols = ["group_name", "group_id", "pg_name"] + parallel_group_df = pd.DataFrame(columns=parallel_group_cols) + for group_id, parallel_info in self.parallel_group_info.items(): + group_name = str(double_hash(group_id)) # group_name is hashed group_id + pg_name = parallel_info.get("group_name", "") + if not pg_name: + continue + parallel_group_df.loc[parallel_group_df.shape[0]] = [group_name, group_id, pg_name] -class UnionFind(object): - """Disjoint Set Union""" + # merge by group_name + df = pd.merge(comm_group_df, parallel_group_df, on='group_name', how='left') + df.fillna("", inplace=True) - @classmethod - def union(cls, first_set: set, second_set: 
set, third_set: set): - """make p and q the same set""" - return first_set | second_set | third_set + self.comm_group_parallel_info_df = df - @classmethod - def is_connected(cls, first_set: set, second_set: set): - """ - check whether set p and set q are connected - """ - if first_set & second_set: - return True - else: - return False diff --git a/profiler/msprof_analyze/cluster_analyse/communication_group/communication_db_group.py b/profiler/msprof_analyze/cluster_analyse/communication_group/communication_db_group.py index 7d1b4ec250ba1d25079a86f1b0bf95fd2c8906aa..f570ce204d95662569392bdbf416200ca66418c3 100644 --- a/profiler/msprof_analyze/cluster_analyse/communication_group/communication_db_group.py +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/communication_db_group.py @@ -76,12 +76,9 @@ class CommunicationDBGroup(BaseCommunicationGroup): return rank_id, comm_data, comm_matrix_data def dump_data(self): - res = [] - for data_type, data_list in self.communication_group.items(): - for data in data_list: - rank_set = "(" + ",".join(str(i) for i in data) + ")" - data = [data_type, rank_set] - res.append(data) + self.comm_group_parallel_info_df["rank_set"] = (self.comm_group_parallel_info_df["rank_set"]. 
+ apply(lambda x: "(" + ",".join(str(i) for i in x) + ")")) + res = self.comm_group_parallel_info_df.values.tolist() dump_group_db(res, self.COMMUNICATION_GROUP_TABLE, self.cluster_analysis_output_path) @@ -103,13 +100,16 @@ class CommunicationDBGroupOptimized(BaseCommunicationGroup): comm_time_data = (time_data, bandwidth_data) return rank_id, comm_time_data, comm_matrix_data - def set_collective_group(self, rank_id: int, time_data: list): + def set_group_rank_map(self, rank_id: int, time_data: list): for single_time_data in time_data: - if single_time_data.get('type') == Constant.P2P: - continue + group_type = single_time_data.get(Constant.TYPE) group_name = single_time_data.get(Constant.GROUP_NAME) - if group_name: + if not group_name: + return + if group_type == Constant.COLLECTIVE: self.collective_group_dict[group_name].add(rank_id) + elif group_type == Constant.P2P: + self.p2p_group_dict[group_name].add(rank_id) def analyze_communication_data(self): for rank_id, comm_time_data, comm_matrix_data in self.rank_comm_dir_dict: @@ -118,7 +118,7 @@ class CommunicationDBGroupOptimized(BaseCommunicationGroup): if not time_data: logger.warning("[WARNING] rank %s has error format in time data.", rank_id) continue - self.set_collective_group(rank_id, time_data) + self.set_group_rank_map(rank_id, time_data) self.communication_ops.extend(self._merge_data_with_rank(rank_id, time_data)) self.bandwidth_data.extend(self._merge_data_with_rank(rank_id, bandwidth_data)) if self.analysis_mode in [Constant.ALL, Constant.COMMUNICATION_MATRIX]: @@ -129,8 +129,8 @@ class CommunicationDBGroupOptimized(BaseCommunicationGroup): if not isinstance(step_id_dict, dict): logger.warning("[WARNING] rank %s has error format in matrix data.", rank_id) continue - self.set_p2p_link(rank_id, step_id, comm_matrix_data) - self.get_collective_ops_name(rank_id, step_id_dict.get(Constant.COLLECTIVE)) + self.add_matrix_ops(rank_id, step_id, step_id_dict) + self.set_group_rank_map(rank_id, time_data) def 
generate_collective_communication_group(self): collective_group = [] @@ -148,16 +148,9 @@ class CommunicationDBGroupOptimized(BaseCommunicationGroup): return comm_data_dict def dump_data(self): - res = [] - for data_type, data_list in self.communication_group.items(): - if data_type == Constant.P2P: - for data in data_list: - rank_set = "(" + ",".join(str(i) for i in data) + ")" - res.append([data_type, "", rank_set]) - continue - for group_name, data in data_list: - rank_set = "(" + ",".join(str(i) for i in data) + ")" - res.append([data_type, group_name, rank_set]) + self.comm_group_parallel_info_df["rank_set"] = (self.comm_group_parallel_info_df["rank_set"]. + apply(lambda x: "(" + ",".join(str(i) for i in x) + ")")) + res = self.comm_group_parallel_info_df.values.tolist() dump_group_db(res, self.COMMUNICATION_GROUP_MAPPING_TABLE, self.cluster_analysis_output_path) def _merge_data_with_rank(self, rank_id: int, data_list: list): diff --git a/profiler/msprof_analyze/cluster_analyse/communication_group/communication_json_group.py b/profiler/msprof_analyze/cluster_analyse/communication_group/communication_json_group.py index 97948228264f7b6fb2aed8d8b8766b3515626d40..e6fd3b41eeaddd0f01673e8b6c60a042dfb808f2 100644 --- a/profiler/msprof_analyze/cluster_analyse/communication_group/communication_json_group.py +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/communication_json_group.py @@ -14,9 +14,14 @@ # limitations under the License. 
import os - +from copy import deepcopy + from msprof_analyze.cluster_analyse.communication_group.base_communication_group import BaseCommunicationGroup from msprof_analyze.prof_common.file_manager import FileManager +from msprof_analyze.cluster_analyse.communication_group.msprof_communication_matrix_adapter import \ + MsprofCommunicationMatrixAdapter +from msprof_analyze.cluster_analyse.communication_group.msprof_communication_time_adapter import \ + MsprofCommunicationTimeAdapter class CommunicationJsonGroup(BaseCommunicationGroup): @@ -26,8 +31,10 @@ class CommunicationJsonGroup(BaseCommunicationGroup): super().__init__(params) def dump_data(self): + res = deepcopy(self.communication_group) + res[self.KEY_COMM_GROUP_PARALLEL_INFO] = self.comm_group_parallel_info_df.to_dict(orient="records") FileManager.create_json_file( - self.cluster_analysis_output_path, self.communication_group, self.COMMUNICATION_GROUP_JSON + self.cluster_analysis_output_path, res, self.COMMUNICATION_GROUP_JSON ) def read_communication_func(self: any, params: tuple): @@ -39,7 +46,11 @@ class CommunicationJsonGroup(BaseCommunicationGroup): comm_data = {} matrix_data = {} if os.path.exists(comm_json_path) and self.analysis_mode in ["all", "communication_time"]: - comm_data = FileManager.read_json_file(comm_json_path) + comm_data = MsprofCommunicationTimeAdapter( + comm_json_path).generate_comm_time_data() if self.is_msprof else FileManager.read_json_file( + comm_json_path) if os.path.exists(matrix_json_path) and self.analysis_mode in ["all", "communication_matrix"]: - matrix_data = FileManager.read_json_file(matrix_json_path) + matrix_data = MsprofCommunicationMatrixAdapter( + matrix_json_path).generate_comm_matrix_data() if self.is_msprof else FileManager.read_json_file( + matrix_json_path) return rank_id, comm_data, matrix_data diff --git a/profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_matrix_adapter.py 
b/profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_matrix_adapter.py new file mode 100644 index 0000000000000000000000000000000000000000..7f1aef80b96cbd45dc362e6c507303e4ad8d0424 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_matrix_adapter.py @@ -0,0 +1,102 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import re +from collections import defaultdict + +from msprof_analyze.prof_common.file_manager import FileManager +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger + +from msprof_analyze.prof_common.utils import compute_ratio + +logger = get_logger() + + +class MsprofCommunicationMatrixAdapter: + P2P_HCOM = ["hcom_send", "hcom_receive", "hcom_batchsendrecv"] + HCCL_PATTERN = r"send|reduce|invalid|broadcast|allreduce|" \ + r"receive|allgather|reducescatter|scatter|alltoall|alltoallv|alltoallvc|batchsendrecv" + BANDWIDTH_GB_S = "Bandwidth(GB/s)" + TRANSPORT_TYPE = "Transport Type" + TRANSIT_SIZE_MB = "Transit Size(MB)" + TRANSIT_TIME_MS = "Transit Time(ms)" + + def __init__(self, file_path): + self.file_path = file_path + + def generate_comm_matrix_data(self): + output_comm_matrix = {"step": {Constant.P2P: {}, Constant.COLLECTIVE: {}}} + comm_matrix_data = FileManager.read_json_file(self.file_path) + split_comm_dict = {Constant.P2P: {}, 
Constant.COLLECTIVE: {}} + for communication_op, comm_matrix_info in comm_matrix_data.items(): + lower_op_name = communication_op.lower() + if any(lower_op_name.startswith(start_str) for start_str in self.P2P_HCOM): + split_comm_dict[Constant.P2P][communication_op] = comm_matrix_info + elif lower_op_name.startswith(Constant.TOTAL): + continue + else: + split_comm_dict[Constant.COLLECTIVE][communication_op] = comm_matrix_info + output_comm_matrix["step"][Constant.P2P] = self.integrate_matrix_data( + self.get_comm_type(split_comm_dict[Constant.P2P])) + output_comm_matrix["step"][Constant.COLLECTIVE] = self.integrate_matrix_data( + self.get_comm_type(split_comm_dict[Constant.COLLECTIVE])) + return output_comm_matrix + + def get_comm_type(self, op_data: dict) -> dict: + new_comm_op_dict = defaultdict(list) + for communication_op, communication_info in op_data.items(): + match_obj = re.compile(self.HCCL_PATTERN).search((communication_op.lower())) + if match_obj: + comm_op_type = match_obj.group() + else: + comm_op_type = communication_op.split("__")[0] + logger.warning(f"Unknown communication op type: {comm_op_type}") + for link, data in communication_info.items(): + new_comm_op_name = (comm_op_type, communication_op.split("@")[-1], link) + data['Op Name'] = communication_op.split("@")[0] + new_comm_op_dict[new_comm_op_name].append(data) + return new_comm_op_dict + + def integrate_matrix_data(self, new_comm_op_dict: dict): + """integrate the matrix data""" + comm_op_dict = defaultdict(dict) + for new_comm_op_name, data in new_comm_op_dict.items(): + data.sort(key=lambda x: x[self.BANDWIDTH_GB_S], reverse=True) + t_type = data[0].get(self.TRANSPORT_TYPE, '') + t_size = sum(x.get(self.TRANSIT_SIZE_MB, 0) for x in data) + t_time = sum(x.get(self.TRANSIT_TIME_MS, 0) for x in data) + bandwidth = compute_ratio(t_size, t_time) + + link = new_comm_op_name[2] + new_comm_op_name_top1 = f'{new_comm_op_name[0]}-top1@{new_comm_op_name[1]}' + new_comm_op_name_middle = 
f'{new_comm_op_name[0]}-middle@{new_comm_op_name[1]}' + new_comm_op_name_bottom1 = f'{new_comm_op_name[0]}-bottom1@{new_comm_op_name[1]}' + new_comm_op_name_bottom2 = f'{new_comm_op_name[0]}-bottom2@{new_comm_op_name[1]}' + new_comm_op_name_bottom3 = f'{new_comm_op_name[0]}-bottom3@{new_comm_op_name[1]}' + new_comm_op_name_total = f'{new_comm_op_name[0]}-total@{new_comm_op_name[1]}' + comm_op_dict[new_comm_op_name_top1].update({link: data[0]}) + comm_op_dict[new_comm_op_name_middle].update({link: data[len(data) // 2]}) + comm_op_dict[new_comm_op_name_bottom1].update({link: data[-1]}) + comm_op_dict[new_comm_op_name_total].update({link: { + self.TRANSPORT_TYPE: t_type, + self.TRANSIT_SIZE_MB: t_size, + self.TRANSIT_TIME_MS: t_time, + self.BANDWIDTH_GB_S: bandwidth + }}) + if len(data) >= 2: + comm_op_dict[new_comm_op_name_bottom2].update({link: data[-2]}) + if len(data) >= 3: + comm_op_dict[new_comm_op_name_bottom3].update({link: data[-3]}) + return comm_op_dict diff --git a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/utils.py b/profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_time_adapter.py similarity index 33% rename from debug/accuracy_tools/msprobe/pytorch/nan_analyse/utils.py rename to profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_time_adapter.py index 2eb54dc488ccbd98ddba132a285a99655deba815..7b63b700f5cb001eb85da3bf0a75d7bb8ae36ae7 100644 --- a/debug/accuracy_tools/msprobe/pytorch/nan_analyse/utils.py +++ b/profiler/msprof_analyze/cluster_analyse/communication_group/msprof_communication_time_adapter.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# Copyright (c) 2025, Huawei Technologies Co., Ltd # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -12,39 +12,27 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
# See the License for the specific language governing permissions and # limitations under the License. -import hashlib -from typing import Any - - -CHECK_FIELDS = ['Max', 'Min', 'Mean'] -OVERFLOW_VALUES = ['inf', '-inf', 'nan'] - - -def singleton(cls): - """ - :param cls: any class - :return: singleton handle - """ - _instance = {} - - def _singleton(*args: any, **kw: any) -> any: - if cls not in _instance: - _instance[cls] = cls(*args, **kw) - return _instance.get(cls) - - return _singleton - - -def has_nan_inf(value: Any) -> bool: - """Check whether the value contains NaN or Inf""" - if isinstance(value, dict): - for k, v in value.items(): - if k in CHECK_FIELDS and str(v).lower() in OVERFLOW_VALUES: - return True - return False - - -def generate_hash(input_string): - sha256_hash = hashlib.sha256() - sha256_hash.update(input_string.encode('utf-8')) - return sha256_hash.hexdigest() +from msprof_analyze.prof_common.file_manager import FileManager +from msprof_analyze.prof_common.constant import Constant + + +class MsprofCommunicationTimeAdapter: + P2P_HCOM = ["hcom_send", "hcom_receive", "hcom_batchsendrecv"] + TOTAL = "total" + + def __init__(self, file_path): + self.file_path = file_path + + def generate_comm_time_data(self): + output_communication = {"step": {Constant.P2P: {}, Constant.COLLECTIVE: {}}} + communication_data = FileManager.read_json_file(self.file_path) + for communication_op, communication_info in communication_data.items(): + lower_op_name = communication_op.lower() + if any(lower_op_name.startswith(start_str) for start_str in self.P2P_HCOM): + output_communication["step"][Constant.P2P][communication_op] = communication_info + elif lower_op_name.startswith(self.TOTAL): + continue + else: + output_communication["step"][Constant.COLLECTIVE][communication_op] = communication_info + + return output_communication diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/base_recipe_analysis.py b/profiler/msprof_analyze/cluster_analyse/recipes/base_recipe_analysis.py index
7975c3675373ea80d5eba4d827fdc757a409c7af..e466c5f2122b552c2e01ef7dd0f3135415dff123 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/base_recipe_analysis.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/base_recipe_analysis.py @@ -26,6 +26,8 @@ from msprof_analyze.cluster_analyse.common_func.utils import convert_unit from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.cluster_analyse.cluster_data_preprocess.msprof_data_preprocessor import MsprofDataPreprocessor +from msprof_analyze.prof_common.file_manager import FileManager logger = get_logger() @@ -42,6 +44,8 @@ class BaseRecipeAnalysis(ABC): self._recipe_name = params.get(Constant.RECIPE_NAME, "") self._parallel_mode = params.get(Constant.PARALLEL_MODE, "") self._export_type = params.get(Constant.EXPORT_TYPE, "") + self._is_msprof = params.get(Constant.IS_MSPROF) + self._is_mindspore = params.get(Constant.IS_MINDSPORE) self._cluster_analysis_output_path = os.path.join( params.get(Constant.CLUSTER_ANALYSIS_OUTPUT_PATH, self._collection_dir), Constant.CLUSTER_ANALYSIS_OUTPUT) self._output_path = self._cluster_analysis_output_path if self._export_type == "db" else os.path.join( @@ -49,7 +53,8 @@ class BaseRecipeAnalysis(ABC): rank_list = params.get(Constant.RANK_LIST, 'all') self._rank_list = rank_list if rank_list == "all" else [int(rank) for rank in rank_list.split(",") if rank.isdigit()] - self._extra_args = self.get_extra_argument(params.get(Constant.EXTRA_ARGS)) + self._step_id = params.get(Constant.STEP_ID, Constant.VOID_STEP) + self._extra_args = self.get_extra_argument(params.get(Constant.EXTRA_ARGS, [])) PathManager.make_dir_safety(self._output_path) def __enter__(self): @@ -85,7 +90,10 @@ class BaseRecipeAnalysis(ABC): def get_extra_argument(cls, args_list) -> dict: parser = argparse.ArgumentParser() cls.add_parser_argument(parser) - args, _ 
= parser.parse_known_args(args_list) + args, unknown_args = parser.parse_known_args(args_list) + if unknown_args: + unknown_args = " ".join(unknown_args) + logger.warning(f"Invalid parameters: {unknown_args}. It will not have any effect.") return vars(args) @abstractmethod @@ -114,7 +122,7 @@ class BaseRecipeAnalysis(ABC): result_csv = os.path.join(self.output_path, file_name) if isinstance(data, pd.DataFrame): data = convert_unit(data, self.DB_UNIT, self.UNIT) - data.to_csv(result_csv, index=index) + FileManager.create_csv_from_dataframe(result_csv, data, index=index) else: logger.error(f"Unknown dump data type: {type(data)}") @@ -127,13 +135,12 @@ class BaseRecipeAnalysis(ABC): template_file = os.path.join(template_path, self.base_dir, filename) if replace_dict is None: shutil.copy(template_file, output_file_path) + os.chmod(output_file_path, Constant.FILE_AUTHORITY) else: - with open(template_file, 'r') as f: - template_content = f.read() - for key, value in replace_dict.items(): - template_content = template_content.replace(str(key), str(value)) - with open(output_file_path, 'w') as f: - f.write(template_content) + template_content = FileManager.read_common_file(template_file) + for key, value in replace_dict.items(): + template_content = template_content.replace(str(key), str(value)) + FileManager.create_common_file(output_file_path, template_content) logger.info(f"Notebook export path is: {output_file_path}") def add_helper_file(self, helper_file): @@ -142,6 +149,7 @@ class BaseRecipeAnalysis(ABC): if helper_file_path is not None: shutil.copy(helper_file_path, helper_output_path) + os.chmod(helper_output_path, Constant.FILE_AUTHORITY) def _get_rank_db(self): invalid_rank_id = [] @@ -157,27 +165,85 @@ class BaseRecipeAnalysis(ABC): db_paths = [] for rank_id in rank_ids: rank_path = self._data_map[rank_id] - profiler_db_path = os.path.join(rank_path, Constant.SINGLE_OUTPUT, f"ascend_pytorch_profiler_{rank_id}.db") - analysis_db_path = os.path.join(rank_path, 
Constant.SINGLE_OUTPUT, f"analysis.db") - if not os.path.exists(profiler_db_path): - logger.warning(f"Profiler DB file not found, rank id: {rank_id}, db path: {profiler_db_path}") - continue - db_path_dict = {Constant.RANK_ID: rank_id, Constant.PROFILER_DB_PATH: profiler_db_path} + db_path_dict = {Constant.RANK_ID: rank_id, Constant.PROFILER_DB_PATH: "", Constant.ANALYSIS_DB_PATH: "", + Constant.STEP_RANGE: {}} + profiler_db_path = self._get_profiler_db_path(rank_id, rank_path) + analysis_db_path = self._get_analysis_db_path(rank_path) + if os.path.exists(profiler_db_path): + db_path_dict[Constant.PROFILER_DB_PATH] = profiler_db_path + db_path_dict[Constant.STEP_RANGE] = self._get_step_range(profiler_db_path) + else: + logger.warning(f"Profiler DB file not found, rank id: {rank_id}, db path: {profiler_db_path}.") + if os.path.exists(analysis_db_path): db_path_dict[Constant.ANALYSIS_DB_PATH] = analysis_db_path else: - logger.warning(f"Analysis DB file not found, rank id: {rank_id}, db path: {analysis_db_path}") - db_paths.append(db_path_dict) + logger.warning(f"Analysis DB file not found, rank id: {rank_id}, db path: {analysis_db_path}.") + if db_path_dict.get(Constant.PROFILER_DB_PATH): + db_paths.append(db_path_dict) if invalid_rank_id: - logger.warning(f"Invalid Rank id : [{','.join(invalid_rank_id)}].") + logger.warning(f"Invalid Rank id: [{','.join(invalid_rank_id)}].") return db_paths + def _get_profiler_db_path(self, rank_id, data_path): + if self._is_msprof: + db_path = MsprofDataPreprocessor.get_msprof_profiler_db_path(data_path) + return db_path if db_path else os.path.join(data_path, "msprof_xx.db") + if self._is_mindspore: + return os.path.join(data_path, Constant.SINGLE_OUTPUT, f"ascend_mindspore_profiler_{rank_id}.db") + return os.path.join(data_path, Constant.SINGLE_OUTPUT, f"ascend_pytorch_profiler_{rank_id}.db") + + def _get_analysis_db_path(self, data_path): + if self._is_msprof: + return os.path.join(data_path, Constant.ANALYZE_DIR, 
"communication_analyzer.db") + if self._is_mindspore: + return os.path.join(data_path, Constant.SINGLE_OUTPUT, "communication_analyzer.db") + return os.path.join(data_path, Constant.SINGLE_OUTPUT, "analysis.db") + + def _get_step_range(self, db_path): + step_range = {} + if self._step_id == Constant.VOID_STEP: + return step_range + conn, cursor = DBManager.create_connect_db(db_path) + if not DBManager.judge_table_exists(cursor, "STEP_TIME"): + logger.error(f"The STEP_TIME table does not exist in the database: {db_path}, " + f"the parameter step_id will not take effect.") + DBManager.destroy_db_connect(conn, cursor) + return step_range + + step_time = [] + sql = f"select id, startNs, endNs from STEP_TIME" + try: + step_time = DBManager.fetch_all_data(cursor, sql) + except Exception as err: + logger.error(err) + finally: + DBManager.destroy_db_connect(conn, cursor) + + for step_data in step_time: + if step_data.get("id") == self._step_id: + step_range = step_data + break + if not step_range: + step_list = ", ".join([str(step.get("id", "")) for step in step_time]) + logger.error(f"Invalid step_id {self._step_id} in the database: {db_path}, " + f"step_id must be an element of the set ({step_list}), " + f"the parameter step_id will not take effect.") + return step_range + def _mapper_func(self, data_map, analysis_class): """ Extract the profiling data required for cluster analysis from each device, and then aggregate the results from each device to be processed by a reduce function. Params: - data_map: eg. {"RANK_ID": 1, "profiler_db_path": "xxxx/ascend_pytorch_profiler_1.db"} + data_map: eg1. {"RANK_ID": 1, + "profiler_db_path": "xxx/ASCEND_PROFILER_OUTPUT/ascend_pytorch_profiler_1.db", + "analysis_db_path": "xxx/ASCEND_PROFILER_OUTPUT/analysis.db", + "step_range": {"id": 2, "startNs": 12345, "endNs": 12443}} + eg2.
{"RANK_ID": 1, + "profiler_db_path": "xxx/msprof_20250227145123.db", + "analysis_db_path": "xxx/analyze/communication_analyzer.db", + "step_range": {"id": 2, "startNs": 12345, "endNs": 12443}} analysis_class: hccl_sum, compute_op_sum, cann_api_sum, mstx_sum…… """ - pass \ No newline at end of file + pass diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/cann_api_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/cann_api_sum.py index 17f8f698960ac4ebd10bfbdfd0c5712fa016b7f1..329890bff23a84c916a6c7806d92bbb6912e89c7 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/cann_api_sum.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/cann_api_sum.py @@ -42,10 +42,10 @@ class CannApiSum(BaseRecipeAnalysis): grouped = stats_res.groupby("name") res = {} total_time = grouped["totalTimeNs"].sum() - res["timeRatio"] = total_time / total_time.sum() * 100.0 + res["timeRatio"] = total_time / total_time.sum() * 100.0 if total_time.sum() else 0 res["totalTimeNs"] = total_time res["totalCount"] = grouped["totalCount"].sum() - res["averageNs"] = res["totalTimeNs"] / res["totalCount"] + res["averageNs"] = res["totalTimeNs"] / res["totalCount"].where(res["totalCount"] != 0, other=0) res["Q1Ns"] = grouped["Q1Ns"].min() res["medNs"] = grouped["medNs"].median() res["Q3Ns"] = grouped["Q3Ns"].max() @@ -90,14 +90,16 @@ class CannApiSum(BaseRecipeAnalysis): self.add_helper_file("cluster_display.py") def save_db(self): - self.dump_data(self._stats_rank_data, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "CannApiSumRank") + self.dump_data(self._stats_rank_data, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "CannApiSumRank", + index=False) self.dump_data(self._stats_data, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "CannApiSum") def _mapper_func(self, data_map, analysis_class): profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) rank_id = data_map.get(Constant.RANK_ID) - df =
CannApiSumExport(profiler_db_path, analysis_class).read_export_db() + step_range = data_map.get(Constant.STEP_RANGE) + df = CannApiSumExport(profiler_db_path, analysis_class, step_range).read_export_db() if df is None or df.empty: logger.warning(f"There is no stats data in {profiler_db_path}.") return None, None - return rank_id, df \ No newline at end of file + return rank_id, df diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/stats.ipynb b/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/stats.ipynb index c97f039c5a01a6e7cce2968d569d79e137e76f8c..2bc1b77e9b14777b57771313233beb7fa255d2e9 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/stats.ipynb +++ b/profiler/msprof_analyze/cluster_analyse/recipes/cann_api_sum/stats.ipynb @@ -72,7 +72,7 @@ "outputs": [], "source": [ "per_rank_df = pd.read_csv(\"rank_stats.csv\")\n", - "cluster_display.display_stats_per_operation(per_rank_df, xaxis_title='rank', yaxis_title='duration (ns)')" + "cluster_display.display_stats_per_operation(per_rank_df, box=False, scatter=False)" ] } ], diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_display.py b/profiler/msprof_analyze/cluster_analyse/recipes/cluster_display.py index fbf89bc4909c28fc1ec3a4f2c38a7414fe8b986d..5a23a280fff9b3c0492f1c8cd2fac20824afb708 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_display.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/cluster_display.py @@ -14,8 +14,6 @@ # limitations under the License. 
import logging -import math -import matplotlib.pyplot as plt import numpy as np import pandas as pd import plotly.graph_objects as go @@ -240,74 +238,3 @@ def display_stats_optional_combobox(options, display_func, args, description="Op dropdown.value = options[0] elif len(options) == 1: display_func(options[0], args) - - -def compute_quantile_intervals(lst, num_intervals): - lst.sort(reverse=False) - if len(lst) > num_intervals: - min_value = min(lst) - max_value = max(lst) - interval_size = len(lst) / num_intervals - result = [min_value] - for i in range(1, num_intervals): - index = int(math.ceil(i * interval_size)) - 1 - result.append(lst[index]) - result.append(max_value) - else: - result = lst - return result[::-1] - - -def calculate_zscore(x, mean, std): - if std != 0: - zscore = (x - mean) / std - elif x > mean: - zscore = 100 - else: - zscore = -100 - return zscore - - -def process_data(df, group_cols, value_col, num_intervals): - grouped = df.groupby(group_cols)[value_col].apply(list).to_dict() - data = {k: compute_quantile_intervals(v, num_intervals) for k, v in grouped.items()} - max_len = max(len(v) for v in data.values()) - data_dict = { - k: v + [np.nan] * (max_len - len(v)) - for k, v in data.items() - } - # Sort the dictionary keys with sorted() and a lambda; reverse=True means descending order - sorted_items = sorted(data_dict.items(), key=lambda item: item[0], reverse=True) - # Convert the sorted list back into a dictionary - data_dict = dict(sorted_items) - data_dealed = pd.DataFrame(data_dict) - return data_dealed - - -def plot_data(df, title, ylabel): - ax = df.plot(kind='bar', figsize=(12, 6)) - ax.set_title(title, fontsize=14) - ax.set_xlabel('opTypeRelatedRanksDataSize', fontsize=12) - ax.set_ylabel(ylabel, fontsize=12) - ax.legend(title='Percentiles', bbox_to_anchor=(1.05, 1)) - plt.tight_layout() - plt.show() - - -def display_transmittime_bar(slowlinkops_df, ratio_set=0.05, optype='hcom_allGather_', - relatedranks=5, datasize=1024): - slowlinkops_df_f = slowlinkops_df[(slowlinkops_df['opType'] == optype) & -
(slowlinkops_df['relatedRanks'] == relatedranks) & (slowlinkops_df['dataSize'] == datasize)] - slowlinkops_df_f['relatedRanks'] = slowlinkops_df_f['relatedRanks'].apply(str) - slowlinkops_df_f['dataSize'] = slowlinkops_df_f['dataSize'].apply(str) - slowlinkops_df_f['opTypeRelatedRanksDataSize'] = slowlinkops_df_f['opType'] + \ - slowlinkops_df_f['relatedRanks'] + '_' + slowlinkops_df_f['dataSize'] - slowlinkops_df_f['transmitTime_Zscore'] = slowlinkops_df_f['transmitTime'].apply( - lambda x: calculate_zscore(x, slowlinkops_df_f['transmitTime'].mean(), slowlinkops_df_f['transmitTime'].std())) - num_intervals = int(1 / ratio_set) - - data_tt = process_data(slowlinkops_df_f, 'opTypeRelatedRanksDataSize', 'transmitTime', num_intervals) - data_ttzscore = process_data(slowlinkops_df_f, 'opTypeRelatedRanksDataSize', 'transmitTime_Zscore', num_intervals) - - plot_data(data_tt, 'Transmit Time Distribution', 'Time (ns)') - plot_data(data_ttzscore, 'Z-Score of Transmit Time Distribution', 'Z-Score') \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_compare_summary/cluster_time_compare_summary.py b/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_compare_summary/cluster_time_compare_summary.py deleted file mode 100644 index 71a5fbee9d40c34e0f74930f7615ec23bec44d44..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_compare_summary/cluster_time_compare_summary.py +++ /dev/null @@ -1,115 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import os - -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.database_service import DatabaseService -from msprof_analyze.prof_common.db_manager import DBManager -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_common.path_manager import PathManager - -logger = get_logger() - - -class ClusterTimeCompareSummary(BaseRecipeAnalysis): - BP = "bp" # path parameter of the baseline data to compare against - TABLE_CLUSTER_TIME_COMPARE_SUMMARY = "ClusterTimeCompareSummary" - CLUSTER_TIME_SUMMARY_CSV = "cluster_time_summary.csv" - CLUSTER_TIME_SUMMARY_COLUMNS = [ - "rank", - "step", - "computation", - "communicationNotOverlapComputation", - "communicationOverlapComputation", - "communication", - "free", - "communicationWaitStageTime", - "communicationTransmitStageTime", - "memory", - "memoryNotOverlapComputationCommunication", - "taskLaunchDelayAvgTime" - ] - - def __init__(self, params): - super().__init__(params) - self.db_path = os.path.join(self._collection_dir, Constant.CLUSTER_ANALYSIS_OUTPUT, - Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) - self.base_db_path = os.path.join(self._extra_args.get(self.BP, ""), Constant.CLUSTER_ANALYSIS_OUTPUT, - Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) - self.compare_result = None - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - @classmethod - def add_parser_argument(cls, parser): - BaseRecipeAnalysis.add_parser_argument(parser) - parser.add_argument('--bp',
type=PathManager.expanduser_for_argumentparser, default="", - help="base profiling data path") - - def run(self, context=None): - logger.info("ClusterTimeCompareSummary starts running.") - if not self.check_params_is_valid(): - return - self.get_compare_data() - self.save_db() - - def check_params_is_valid(self) -> bool: - base_path = self._extra_args.get(self.BP, "") - if not base_path: - logger.error("Must specify the --bp parameter.") - return False - if self._export_type == Constant.NOTEBOOK: - logger.error("For cluster_time_compare_summary, the export_type parameter only supports db.") - return False - try: - PathManager.check_input_directory_path(base_path) # validate the directory - except RuntimeError: - logger.error(f"{base_path} is not valid.") - return False - if not DBManager.check_tables_in_db(self.db_path, Constant.TABLE_CLUSTER_TIME_SUMMARY): - logger.error(f"{Constant.TABLE_CLUSTER_TIME_SUMMARY} in {self.db_path} does not exist.") - return False - if not DBManager.check_tables_in_db(self.base_db_path, Constant.TABLE_CLUSTER_TIME_SUMMARY): - logger.error(f"{Constant.TABLE_CLUSTER_TIME_SUMMARY} in {self.base_db_path} does not exist.") - return False - return True - - - def get_compare_data(self): - database_service_for_db = DatabaseService(self.db_path) - database_service_for_db.add_table_for_query(Constant.TABLE_CLUSTER_TIME_SUMMARY, - self.CLUSTER_TIME_SUMMARY_COLUMNS) - cluster_time_summary_df_dict = database_service_for_db.query_data() - cluster_time_summary_df = cluster_time_summary_df_dict.get(Constant.TABLE_CLUSTER_TIME_SUMMARY) - database_service_for_base_db = DatabaseService(self.base_db_path) - database_service_for_base_db.add_table_for_query(Constant.TABLE_CLUSTER_TIME_SUMMARY, - self.CLUSTER_TIME_SUMMARY_COLUMNS) - base_cluster_time_summary_df_dict = database_service_for_base_db.query_data() - base_cluster_time_summary_df = base_cluster_time_summary_df_dict.get(Constant.TABLE_CLUSTER_TIME_SUMMARY) - self.compare_result = ( -
cluster_time_summary_df.set_index(["rank", "step"]) - .subtract(base_cluster_time_summary_df.set_index(["rank", "step"])) - .dropna() - .reset_index() - .rename(columns=lambda x: f"{x}Diff" if x not in ["rank", "step"] else x) - ) - - def save_db(self): - self.dump_data(self.compare_result, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, - self.TABLE_CLUSTER_TIME_COMPARE_SUMMARY, index=False) \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_summary/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_summary/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_summary/cluster_time_summary.py b/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_summary/cluster_time_summary.py deleted file mode 100644 index a574850ec6d0aa2ace4d5d5b1ffaa3a3c71b6759..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_summary/cluster_time_summary.py +++ /dev/null @@ -1,186 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import pandas as pd - -from msprof_analyze.cluster_analyse.common_func.context import ConcurrentContext -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.cluster_time_summary_export import CommunicationTimeExport -from msprof_analyze.prof_exports.cluster_time_summary_export import MemoryAndDispatchTimeExport -from msprof_analyze.prof_common.database_service import DatabaseService - -logger = get_logger() - - -class OverlapInfo: - def __init__(self, start, end, overlap_type): - self.start = start - self.end = end - self.type = overlap_type - - -class ClusterTimeSummary(BaseRecipeAnalysis): - COMPUTING_TYPE = 0 - COMMUNICATION_TYPE = 1 - MEMORY_TYPE = 4 - STEP_TRACE = "step_trace" - COMMUNICATION = "communication" - MEMORY_AND_DISPATCH = "memory_and_dispatch" - - def __init__(self, params): - super().__init__(params) - self.db_paths = self._get_rank_db() - self.stats_data = None - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - @staticmethod - def aggregate_stats(context: ConcurrentContext): - step_trace_df_list = [future.result() for future in context.future_dict[ClusterTimeSummary.STEP_TRACE]] - communication_df_list = [ - future.result() - for future in context.future_dict[ClusterTimeSummary.COMMUNICATION] - ] - memory_and_dispatch_df_list = [ - future.result() - for future in context.future_dict[ClusterTimeSummary.MEMORY_AND_DISPATCH] - ] - step_trace_df = pd.concat(step_trace_df_list, ignore_index=True) - communication_df = pd.concat(communication_df_list, ignore_index=True) - memory_and_dispatch_df = pd.concat(memory_and_dispatch_df_list, ignore_index=True) - communication_df["communicationTransmitStageTime"] = \ - communication_df.groupby(["groupName", "opName", "step"])["communication_time"].transform("min") - 
communication_df["communicationWaitStageTime"] = \ - communication_df["communication_time"] - communication_df["communicationTransmitStageTime"] - transmit_and_wait_df = communication_df.groupby(["rank", "step"])[ - ["communicationWaitStageTime", "communicationTransmitStageTime"]].sum().reset_index() - all_dfs = [step_trace_df, transmit_and_wait_df, memory_and_dispatch_df] - merged_df = all_dfs[0] - for df in all_dfs[1:]: - merged_df = pd.merge(merged_df, df, on=['rank', 'step'], how='outer') - # sort the merged DataFrame by the step and rank columns - merged_df = merged_df.sort_values(by=['rank', 'step']) - merged_df["free"] = merged_df["free"] - merged_df["memoryNotOverlapComputationCommunication"] - merged_df = merged_df.rename(columns={ - 'computing': 'computation', - 'overlapped': 'communicationOverlapComputation', - 'communication_not_overlapped': 'communicationNotOverlapComputation'}) - return merged_df.sort_values(by=['rank', 'step']) - - @classmethod - def get_memory_not_overlap(cls, df: pd.DataFrame): - memory_not_overlap_time = 0 # total memory time within free periods (asynchronous copy) - cur_block = OverlapInfo(df.iloc[0]["start"], df.iloc[0]["start"], -1) - for time_info in df.itertuples(): - if cur_block.type == cls.MEMORY_TYPE: - tmp_start = cur_block.start - tmp_end = cur_block.end if time_info.start > cur_block.end else time_info.start - if tmp_start < tmp_end: - memory_not_overlap_time += tmp_end - tmp_start - if time_info.start > cur_block.end: - cur_block.end = time_info.end - cur_block.type = time_info.type - cur_block.start = time_info.start - else: - cur_block.type = time_info.type if time_info.end > cur_block.end else cur_block.type - cur_block.start = cur_block.end if time_info.end > cur_block.end else time_info.end - cur_block.end = time_info.end if time_info.end > cur_block.end else cur_block.end - # account for the last block of data - if cur_block.type == cls.MEMORY_TYPE: - memory_not_overlap_time += cur_block.end - cur_block.start - return memory_not_overlap_time / Constant.TIME_UNIT_SCALE - - 
@classmethod - def calculate_dispatch_time(cls, df: pd.DataFrame) -> pd.DataFrame: - filtered_df = df[df['type'].isin([cls.COMPUTING_TYPE, cls.COMMUNICATION_TYPE])] - result = filtered_df.groupby(['step'])['dispatch'].mean().reset_index() - result = result.rename(columns={'dispatch': 'taskLaunchDelayAvgTime'}) - return result - - @classmethod - def calculate_memory_time(cls, df: pd.DataFrame) -> pd.DataFrame: - filtered_df = df[df['type'].isin([cls.MEMORY_TYPE])].copy() - filtered_df['memory'] = filtered_df['end'] - filtered_df['start'] - result = filtered_df.groupby(['step'])['memory'].sum().reset_index() - result['memory'] = result['memory'] / Constant.TIME_UNIT_SCALE - return result - - def calculate_step_trace_time(self, data_map, analysis_class): - analysis_db_path = data_map.get(Constant.ANALYSIS_DB_PATH) - rank_id = data_map.get(Constant.RANK_ID) - data_service = DatabaseService(analysis_db_path) - data_service.add_table_for_query(Constant.TABLE_STEP_TRACE, ["step", "computing", - "communication_not_overlapped", "overlapped", - "communication", "free", ]) - df = data_service.query_data().get(Constant.TABLE_STEP_TRACE) - if df is None or df.empty: - logger.warning(f"There is no stats data in {analysis_db_path}.") - return None - df.insert(0, "rank", rank_id) - df["step"] = df["step"].astype(int) - return df - - def calculate_communication_time(self, data_map, analysis_class): - analysis_db_path = data_map.get(Constant.PROFILER_DB_PATH) - df = CommunicationTimeExport(analysis_db_path, analysis_class).read_export_db() - return df - - def calculate_memory_and_dispatch_time(self, data_map, analysis_class): - """ - rank step memory computing_dispatch communication_dispatch - 0 1 120 150 200 - 0 2 130 150 200 - """ - profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) - rank_id = data_map.get(Constant.RANK_ID) - df = MemoryAndDispatchTimeExport(profiler_db_path, analysis_class).read_export_db() - if df is None or df.empty: - logger.warning(f"There is no 
stats data in {profiler_db_path}.") - return None - memory_df = ClusterTimeSummary.calculate_memory_time(df) - memory_not_overlap_df = (df.groupby(["step"]).apply(ClusterTimeSummary.get_memory_not_overlap). - reset_index(name="memoryNotOverlapComputationCommunication")) - dispatch_df = ClusterTimeSummary.calculate_dispatch_time(df) - result_df = pd.merge(memory_df, memory_not_overlap_df, on='step', how='inner') - result_df = pd.merge(result_df, dispatch_df, on='step', how='inner') - result_df.insert(0, "rank", rank_id) - return result_df - - def mapper_func(self, context: ConcurrentContext): - for db_map in self.db_paths: - context.submit(self.STEP_TRACE, self.calculate_step_trace_time, db_map, self._recipe_name) - context.submit(self.COMMUNICATION, self.calculate_communication_time, - db_map, self._recipe_name) - context.submit(self.MEMORY_AND_DISPATCH, self.calculate_memory_and_dispatch_time, - db_map, self._recipe_name) - - def run(self, context: ConcurrentContext): - logger.info("ClusterTimeSummary init.") - self.mapper_func(context) - context.wait_all_futures() - self.stats_data = self.aggregate_stats(context) - if self._export_type == Constant.DB: - self.save_db() - else: - logger.warning("cluster_time_summary only supports export db.") - - def save_db(self): - self.dump_data(self.stats_data, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, - Constant.TABLE_CLUSTER_TIME_SUMMARY, index=False) diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/compare/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_group_map/__init__.py similarity index 100% rename from debug/accuracy_tools/msprobe/pytorch/visualization/compare/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/communication_group_map/__init__.py diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/communication_group_map/communication_group_map.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_group_map/communication_group_map.py 
new file mode 100644 index 0000000000000000000000000000000000000000..40bac754fe6f109e08ce78ca8d25df2e2df0ddda --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/communication_group_map/communication_group_map.py @@ -0,0 +1,122 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import json +import os +import pandas as pd + +from msprof_analyze.cluster_analyse.common_func.utils import double_hash +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.database_service import DatabaseService + +logger = get_logger() + + +class CommunicationGroupMap(BaseRecipeAnalysis): + COMMUNICATION_GROUP_MAPPING_TABLE = "CommunicationGroupMapping" + + def __init__(self, params): + super().__init__(params) + logger.info("CommunicationGroupMap init.") + self.group_df = None + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + @staticmethod + def get_comm_type_from_op_name(op_name: str): + op_name_lower = op_name.lower() + return Constant.P2P if ("send" in op_name_lower or "receive" in op_name_lower or "recv" in op_name_lower) \ + else Constant.COLLECTIVE + + def run(self, context): + 
mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + if self._export_type == Constant.DB: + self.save_db() + else: + logger.error(f"CommGroupMap: {self._export_type} is not supported for export type.") + + def reducer_func(self, mapper_res): + # concat and process all comm group + comm_group_df_list = [df for df, _ in mapper_res] + comm_group_combined_df = pd.concat(comm_group_df_list).drop_duplicates() + comm_group_combined_df = (comm_group_combined_df.groupby([TableConstant.TYPE, TableConstant.GROUP_NAME]) + [TableConstant.RANK_ID].apply(lambda x: sorted(set(x))).reset_index()) + comm_group_combined_df[TableConstant.RANK_SET] = (comm_group_combined_df[TableConstant.RANK_ID]. + apply(lambda x: "(" + ",".join(str(i) for i in x) + ")")) + + comm_group_combined_df = comm_group_combined_df.drop(columns=[TableConstant.RANK_ID]) + # concat all parallel group info + parallel_info_df_list = [df for _, df in mapper_res] + parallel_info_combined_df = pd.concat(parallel_info_df_list).drop_duplicates() + # merge by group_name + group_df = pd.merge(comm_group_combined_df, parallel_info_combined_df, on=TableConstant.GROUP_NAME, how="left") + group_df.fillna("", inplace=True) + # column order + column_order = [TableConstant.TYPE, TableConstant.RANK_SET, TableConstant.GROUP_NAME, + TableConstant.GROUP_ID, TableConstant.PG_NAME] + self.group_df = group_df[column_order] + + def save_db(self): + self.dump_data(self.group_df, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, + self.COMMUNICATION_GROUP_MAPPING_TABLE, index=False) + + def _mapper_func(self, data_map, analysis_class): + rank_id = data_map.get(Constant.RANK_ID) + # read CommAnalyzerTime table + analysis_db_path = data_map.get(Constant.ANALYSIS_DB_PATH) + analysis_data_service = DatabaseService(analysis_db_path, {}) + analysis_data_service.add_table_for_query(Constant.TABLE_COMM_ANALYZER_TIME, + [TableConstant.HCCL_OP_NAME, TableConstant.GROUP_NAME]) + comm_time_res = analysis_data_service.query_data() + # 
process comm_time_df: group_name, type, rank_id + comm_time_df = comm_time_res.get(Constant.TABLE_COMM_ANALYZER_TIME) + comm_time_df[TableConstant.RANK_ID] = rank_id + comm_time_df[TableConstant.TYPE] = (comm_time_df[TableConstant.HCCL_OP_NAME]. + apply(lambda x: self.get_comm_type_from_op_name(x))) + comm_time_df = comm_time_df.drop(columns=[TableConstant.HCCL_OP_NAME]) + comm_time_df = comm_time_df.drop_duplicates() + + # read META_DATA table + profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) + profiler_data_service = DatabaseService(profiler_db_path, {}) + profiler_data_service.add_table_for_query(Constant.TABLE_META_DATA, + [TableConstant.NAME, TableConstant.VALUE]) + meta_data_res = profiler_data_service.query_data() + meta_data_df = meta_data_res.get(Constant.TABLE_META_DATA) + # process parallel_info_df + parallel_info_df = pd.DataFrame(columns=[TableConstant.GROUP_NAME, + TableConstant.GROUP_ID, TableConstant.PG_NAME]) + if Constant.PARALLEL_GROUP_INFO not in meta_data_df[TableConstant.NAME].values: + return comm_time_df, parallel_info_df + info_str = meta_data_df.loc[meta_data_df[TableConstant.NAME] == Constant.PARALLEL_GROUP_INFO, + TableConstant.VALUE].values[0] + info_dict = json.loads(info_str) + for group_id, parallel_info in info_dict.items(): + group_name = str(double_hash(group_id)) # group_name is hashed group_id + pg_name = parallel_info.get(TableConstant.GROUP_NAME, "") + if not pg_name: + continue + parallel_info_df.loc[parallel_info_df.shape[0]] = [group_name, group_id, pg_name] + + return comm_time_df, parallel_info_df + + + + diff --git a/debug/accuracy_tools/msprobe/pytorch/visualization/graph/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_matrix_sum/__init__.py similarity index 100% rename from debug/accuracy_tools/msprobe/pytorch/visualization/graph/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/communication_matrix_sum/__init__.py diff --git 
a/profiler/msprof_analyze/cluster_analyse/recipes/communication_matrix_sum/communication_matrix_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_matrix_sum/communication_matrix_sum.py new file mode 100644 index 0000000000000000000000000000000000000000..8b91626fe50dfa9ebfa62321545c92bb4dde39d0 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/communication_matrix_sum/communication_matrix_sum.py @@ -0,0 +1,207 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import ast +import os + +import pandas as pd +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.database_service import DatabaseService +from msprof_analyze.cluster_analyse.common_func.utils import double_hash + +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant + +logger = get_logger() + + +class CommMatrixSum(BaseRecipeAnalysis): + TABLE_CLUSTER_COMM_MATRIX = "ClusterCommunicationMatrix" + RANK_MAP = "rank_map" + MATRIX_DATA = "matrix_data" + RANK_SET = "rank_set" + P2P_HCOM = ["hcom_send", "hcom_receive", "hcom_batchsendrecv"] + + def __init__(self, params): + super().__init__(params) + self.cluster_matrix_df = None + logger.info("CommMatrixSum init.") + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + @classmethod + def _get_parallel_group_info(cls, profiler_db_path): + rank_map = {} + data_service = DatabaseService(profiler_db_path, {}) + data_service.add_table_for_query(TableConstant.TABLE_META_DATA) + meta_df = data_service.query_data().get(TableConstant.TABLE_META_DATA, None) + if meta_df is None or meta_df.empty: + return rank_map + filtered_df = meta_df[meta_df['name'] == "parallel_group_info"] + if filtered_df.shape[0] == 1 and filtered_df.shape[1] == 2: + parallel_group_info = ast.literal_eval(filtered_df['value'].tolist()[0]) + for group_name, group_info in parallel_group_info.items(): + global_ranks = group_info.get("global_ranks") + if isinstance(global_ranks, list) and global_ranks: + global_ranks.sort() + rank_map[double_hash(group_name)] = dict(enumerate(global_ranks)) + return rank_map + + @classmethod + def _trans_msprof_matrix_data(cls, matrix_data): + matrix_data["step"] = "step" + matrix_data["type"] = Constant.COLLECTIVE + for index, row in matrix_data.iterrows(): + lower_op_name 
= row["hccl_op_name"].lower() + if any(lower_op_name.startswith(start_str) for start_str in cls.P2P_HCOM): + matrix_data.at[index, "type"] = Constant.P2P + matrix_data = matrix_data.rename(columns={'hccl_op_name': 'op_name'}) + matrix_data["hccl_op_name"] = matrix_data["op_name"].str.split("__").str[0] + + # group by multiple fields + grouped_df = matrix_data.groupby(['type', 'step', 'group_name', 'hccl_op_name', 'src_rank', 'dst_rank']) + + # define a function that extracts specific records + def get_specific_rows(group): + # sort by bandwidth + sorted_group = group.sort_values(by='bandwidth') + bottom1 = sorted_group.iloc[-1] + bottom2 = sorted_group.iloc[-2] if len(group) > 1 else pd.Series() + bottom3 = sorted_group.iloc[-3] if len(group) > 2 else pd.Series() + top1 = sorted_group.iloc[0] + mid_index = len(group) // 2 + middle = sorted_group.iloc[mid_index] + return pd.DataFrame([top1, bottom1, bottom2, bottom3, middle], + index=['top1', 'bottom1', 'bottom2', 'bottom3', 'middle']).reset_index() + + example_df = grouped_df.apply(get_specific_rows).reset_index(drop=True) + example_df = example_df.dropna().reset_index(drop=True) + example_df["hccl_op_name"] = example_df["hccl_op_name"].astype(str) + "-" + example_df["index"].astype(str) + example_df = example_df.drop(columns="index") + + # total + total_df = matrix_data.groupby(['type', 'step', 'group_name', 'hccl_op_name', 'src_rank', 'dst_rank']).agg( + {'transport_type': 'first', "transit_size": "sum", "transit_time": "sum"}) + total_df = total_df.reset_index() + total_df["op_name"] = None + total_df["hccl_op_name"] = total_df["hccl_op_name"].astype(str) + "-total" + total_df['bandwidth'] = total_df['transit_size'] / total_df['transit_time'].where(total_df['transit_time'] != 0, + other=0) + return pd.concat([example_df, total_df], ignore_index=True) + + def run(self, context): + mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + + if self._export_type == "db": + self.save_db() + else: + logger.error("communication_matrix_sum is not supported for 
notebook export type.") + + def reducer_func(self, mapper_res): + rank_map = self._generate_rank_map(mapper_res) + concat_df = pd.DataFrame() + for rank_data in mapper_res: + matrix_df = rank_data.get(self.MATRIX_DATA) + concat_df = pd.concat([concat_df, matrix_df], ignore_index=True) + if concat_df.empty: + logger.error("Communication matrix data is None.") + return + concat_df[self.RANK_SET] = "" + for index, row in concat_df.iterrows(): + if row["type"] == Constant.P2P: + concat_df.at[index, self.RANK_SET] = Constant.P2P + continue + rank_list = sorted(rank_map.get(row["group_name"], {}).values()) + concat_df.at[index, self.RANK_SET] = ",".join([str(rank) for rank in rank_list]) + grouped_df = concat_df.groupby( + [self.RANK_SET, 'step', "hccl_op_name", "group_name", "src_rank", "dst_rank"]).agg( + {'transport_type': 'first', 'op_name': 'first', "transit_size": "sum", "transit_time": "sum"}) + grouped_df = grouped_df.reset_index() + grouped_df["is_mapped"] = False + grouped_df["bandwidth"] = None + for index, row in grouped_df.iterrows(): + src_rank = row["src_rank"] + dst_rank = row["dst_rank"] + group_name = row["group_name"] + group_rank_map = rank_map.get(group_name, {}) + if src_rank not in group_rank_map: + logger.warning(f"The src local rank {src_rank} of the group_name {group_name} " + f"cannot be mapped to the global rank.") + continue + if dst_rank not in group_rank_map: + logger.warning(f"The dst local rank {dst_rank} of the group_name {group_name} " + f"cannot be mapped to the global rank.") + continue + grouped_df.at[index, 'src_rank'] = group_rank_map[src_rank] + grouped_df.at[index, 'dst_rank'] = group_rank_map[dst_rank] + grouped_df.at[index, 'is_mapped'] = True + grouped_df.at[index, 'bandwidth'] = row["transit_size"] / row["transit_time"] if row["transit_time"] else 0 + filtered_df = grouped_df[grouped_df["is_mapped"]].drop(columns="is_mapped") + total_op_info = filtered_df[filtered_df['hccl_op_name'].str.contains('total', na=False)].groupby( + 
[self.RANK_SET, 'step', "src_rank", "dst_rank"]).agg( + {"group_name": "first", 'transport_type': 'first', 'op_name': 'first', "transit_size": "sum", + "transit_time": "sum"} + ) + total_op_info = total_op_info.reset_index() + total_op_info["hccl_op_name"] = Constant.TOTAL_OP_INFO + total_op_info['bandwidth'] = total_op_info['transit_size'] / total_op_info['transit_time'].where( + total_op_info['transit_time'] != 0, other=0) + self.cluster_matrix_df = pd.concat([filtered_df, total_op_info], ignore_index=True).drop(columns=self.RANK_SET) + + def save_db(self): + self.dump_data(self.cluster_matrix_df, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, + self.TABLE_CLUSTER_COMM_MATRIX, index=False) + + def _generate_rank_map(self, mapper_res): + rank_map = {} + rank_map_df = pd.DataFrame({"group_name": [], "src_rank": [], Constant.RANK_ID: []}) + for rank_data in mapper_res: + rank_map.update(rank_data.get(self.RANK_MAP)) + matrix_df = rank_data.get(self.MATRIX_DATA) + filter_matrix_df = matrix_df[matrix_df["src_rank"] == matrix_df["dst_rank"]] + grouped_matrix_df = filter_matrix_df[['group_name', 'src_rank']].drop_duplicates() + grouped_matrix_df[Constant.RANK_ID] = rank_data.get(Constant.RANK_ID) + rank_map_df = pd.concat([grouped_matrix_df, rank_map_df], ignore_index=True) + rank_map_df = rank_map_df.drop_duplicates() + for _, row in rank_map_df.iterrows(): + group_name = row["group_name"] + local_rank = row["src_rank"] + global_rank = row[Constant.RANK_ID] + if group_name not in rank_map: + rank_map[group_name] = {local_rank: global_rank} + continue + if local_rank not in rank_map[group_name]: + rank_map[group_name][local_rank] = global_rank + continue + if rank_map[group_name][local_rank] != global_rank: + logger.warning(f"In the same communication group {group_name}, global rank {global_rank} " + f"and {rank_map[group_name][local_rank]} get the same local rank {local_rank}!") + return rank_map + + def _mapper_func(self, data_map, analysis_class): + result_data = 
{Constant.RANK_ID: data_map.get(Constant.RANK_ID)} + profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) + result_data[self.RANK_MAP] = self._get_parallel_group_info(profiler_db_path) + analysis_db_path = data_map.get(Constant.ANALYSIS_DB_PATH) + data_service = DatabaseService(analysis_db_path, {}) + data_service.add_table_for_query(TableConstant.TABLE_COMM_ANALYZER_MATRIX) + matrix_data = data_service.query_data().get(TableConstant.TABLE_COMM_ANALYZER_MATRIX) + if self._is_msprof or self._is_mindspore: + matrix_data = self._trans_msprof_matrix_data(matrix_data) + result_data[self.MATRIX_DATA] = matrix_data + return result_data diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/p2p_pairing/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_time_sum/__init__.py similarity index 100% rename from profiler/msprof_analyze/cluster_analyse/recipes/p2p_pairing/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/communication_time_sum/__init__.py diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/communication_time_sum/communication_time_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/communication_time_sum/communication_time_sum.py new file mode 100644 index 0000000000000000000000000000000000000000..fd6182a923a28766230148cdb12280031e1f2890 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/communication_time_sum/communication_time_sum.py @@ -0,0 +1,238 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +import os + +import numpy as np +import pandas as pd +from msprof_analyze.cluster_analyse.common_func.analysis_loader import get_class_from_name +from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.database_service import DatabaseService +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.db_manager import DBManager + +logger = get_logger() + + +class CommunicationTimeSum(BaseRecipeAnalysis): + TABLE_CLUSTER_COMM_TIME = "ClusterCommunicationTime" + TABLE_CLUSTER_COMM_BANDWIDTH = "ClusterCommunicationBandwidth" + + TABLE_COMMUNICATION_GROUP_MAPPING = "CommunicationGroupMapping" + + def __init__(self, params): + super().__init__(params) + self.params = params + logger.info("CommunicationTimeSum init.") + self.communication_time = None + self.communication_bandwidth = None + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + def run(self, context): + if not self.check_table_exist(self.TABLE_COMMUNICATION_GROUP_MAPPING): + if not self.run_communication_group_map_recipe(context): + logger.error("Create CommunicationGroupMap table failed!") + return + mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + if self._export_type == Constant.DB: + self.save_db() + else: + logger.error("Unknown export type.") + + def reducer_func(self, mapper_res): + mapper_res_time = list(item[0] for item in mapper_res if item[0] is not None) + mapper_res_bw = list(item[1] for item in mapper_res if item[1] is not None) + if not mapper_res_time and not mapper_res_bw: + logger.error("Mapper data is None.") + return + cluster_db_path = os.path.join(self.output_path, 
Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + data_service = DatabaseService(cluster_db_path, None) + data_service.add_table_for_query(self.TABLE_COMMUNICATION_GROUP_MAPPING, + [TableConstant.RANK_SET, TableConstant.GROUP_NAME]) + df_dict = data_service.query_data() + rank_set_df = df_dict.get(self.TABLE_COMMUNICATION_GROUP_MAPPING, None) + if rank_set_df is None or rank_set_df.empty: + logger.error(f"There is no {self.TABLE_COMMUNICATION_GROUP_MAPPING} data in {cluster_db_path}.") + return + communication_time = pd.concat(mapper_res_time) + communication_bandwidth = pd.concat(mapper_res_bw) + self._compute_time_info(communication_time, rank_set_df) + self._compute_bandwidth_info(communication_bandwidth, rank_set_df) + + def save_db(self): + self.dump_data(self.communication_time, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, + self.TABLE_CLUSTER_COMM_TIME, index=False) + self.dump_data(self.communication_bandwidth, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, + self.TABLE_CLUSTER_COMM_BANDWIDTH, index=False) + + def check_table_exist(self, table): + db_path = os.path.join(self.output_path, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + conn, cursor = DBManager.create_connect_db(db_path) + table_exist = DBManager.judge_table_exists(cursor, table) + DBManager.destroy_db_connect(conn, cursor) + return table_exist + + def run_communication_group_map_recipe(self, context): + """ + Run Recipe to create CommunicationGroupMapping table + """ + logger.info(f"Run CommunicationGroupMap recipe first to get {self.TABLE_COMMUNICATION_GROUP_MAPPING} table") + recipe_class = get_class_from_name("communication_group_map") + if not recipe_class or len(recipe_class) != 2: # 2: (class_name, class) + return False + try: + group_map_recipe = recipe_class[1](self.params) + group_map_recipe.run(context) + except Exception as e: + logger.error(f"Run CommunicationGroupMap recipe failed: {e}!") + return False + return self.check_table_exist(self.TABLE_COMMUNICATION_GROUP_MAPPING) + + def 
_compute_time_info(self, communication_time, rank_set_df): + """ + communication_time: ['hccl_op_name', 'group_name', 'start_timestamp', 'elapse_time', + 'transit_time', 'wait_time', 'synchronization_time', 'idle_time', + 'step', 'type', 'rank_id'] + rank_set_df: ['rank_set', 'group_name'] + output: ['step', 'rank_id', 'hccl_op_name', 'group_name', 'start_timestamp', 'elapse_time', 'transit_time', + 'wait_time', 'synchronization_time', 'idle_time', 'synchronization_time_ratio', 'wait_time_ratio'] + + Group by the "step", "rank_id" and "rank_set" fields, sum the time data such as "elapse_time", "transit_time", "wait_time", + "synchronization_time" and "idle_time", and insert the resulting summary rows into communication_time + """ + merged_df = pd.merge(communication_time, rank_set_df, on=TableConstant.GROUP_NAME, how='left') + summed_df = merged_df.groupby([TableConstant.STEP, TableConstant.RANK_ID, TableConstant.RANK_SET]).agg({ + TableConstant.GROUP_NAME: "first", + TableConstant.ELAPSED_TIME: "sum", + TableConstant.TRANSIT_TIME: "sum", + TableConstant.WAIT_TIME: "sum", + TableConstant.SYNCHRONIZATION_TIME: "sum", + TableConstant.IDLE_TIME: "sum" + }).reset_index() + summed_df[TableConstant.HCCL_OP_NAME] = Constant.TOTAL_OP_INFO + summed_df[TableConstant.START_TIMESTAMP] = 0 + # compute synchronization_time_ratio and wait_time_ratio + summed_df[TableConstant.SYNCHRONIZATION_TIME_RATIO] = ( + summed_df[TableConstant.SYNCHRONIZATION_TIME] / + (summed_df[TableConstant.TRANSIT_TIME] + summed_df[TableConstant.SYNCHRONIZATION_TIME]).replace(0, + np.nan) + ).fillna(0).round(4) + summed_df[TableConstant.WAIT_TIME_RATIO] = ( + summed_df[TableConstant.WAIT_TIME] / + (summed_df[TableConstant.TRANSIT_TIME] + summed_df[TableConstant.WAIT_TIME]).replace(0, np.nan) + ).fillna(0).round(4) + + communication_time[TableConstant.SYNCHRONIZATION_TIME_RATIO] = 0 + communication_time[TableConstant.WAIT_TIME_RATIO] = 0 + desired_order = [TableConstant.STEP, TableConstant.RANK_ID, TableConstant.HCCL_OP_NAME, + TableConstant.GROUP_NAME, TableConstant.START_TIMESTAMP, 
TableConstant.ELAPSED_TIME, + TableConstant.TRANSIT_TIME, TableConstant.WAIT_TIME, TableConstant.SYNCHRONIZATION_TIME, + TableConstant.IDLE_TIME, TableConstant.SYNCHRONIZATION_TIME_RATIO, + TableConstant.WAIT_TIME_RATIO] + # concatenate the summary DataFrame + final_df = pd.concat([communication_time, summed_df], axis=0).reindex(columns=desired_order) + final_df.rename(columns={'elapse_time': 'elapsed_time'}, inplace=True) + self.communication_time = final_df + + def _compute_bandwidth_info(self, communication_bandwidth, rank_set_df): + """ + communication_bandwidth: ['hccl_op_name', 'group_name', 'transport_type', 'transit_size', + 'transit_time', 'bandwidth', 'large_packet_ratio', 'package_size', + 'count', 'total_duration', 'step', 'type', 'rank_id'] + output: ['step', 'rank_id', 'hccl_op_name', 'group_name', 'band_type', 'transit_size', 'transit_time', + 'bandwidth', 'large_packet_ratio', 'package_size', 'count', 'total_duration'] + rank_set_df: ['rank_set', 'group_name'] + Group by 'rank_set', 'step', 'rank_id', 'transport_type', 'package_size' and sum 'count' and 'total_duration'; + for data under the same 'rank_set', 'step', 'rank_id', 'transport_type', sum 'transit_size' and 'transit_time', + counting rows with the same 'hccl_op_name' + 'group_name' only once + """ + merged_df = pd.merge(communication_bandwidth, rank_set_df, on=TableConstant.GROUP_NAME, how='left') + # compute the deduplicated totals of transit_size and transit_time for each rank_set/step/rank_id/transport_type group + sum_transit_size = 'sum_transit_size' + sum_transit_time = 'sum_transit_time' + sum_transit = merged_df.groupby( + [TableConstant.RANK_SET, TableConstant.STEP, TableConstant.RANK_ID, TableConstant.TRANSPORT_TYPE]).apply( + self._get_sum_distinct_op).reset_index().rename(columns={ + TableConstant.TRANSIT_SIZE: sum_transit_size, + TableConstant.TRANSIT_TIME: sum_transit_time + }) + joined_df = pd.merge(merged_df, sum_transit, + on=[TableConstant.RANK_SET, TableConstant.STEP, TableConstant.RANK_ID, + TableConstant.TRANSPORT_TYPE]) + # aggregate by 'rank_set', 'step', 'rank_id', 'transport_type', 'package_size' 
+ agg_result = joined_df.groupby( + [TableConstant.RANK_SET, TableConstant.STEP, TableConstant.RANK_ID, TableConstant.TRANSPORT_TYPE, + TableConstant.PACKAGE_SIZE] + ).agg({ + TableConstant.COUNT: 'sum', + TableConstant.TOTAL_DURATION: 'sum', + TableConstant.HCCL_OP_NAME: 'first', + TableConstant.GROUP_NAME: 'first', + sum_transit_size: 'first', + sum_transit_time: 'first' + }).reset_index() + agg_result[TableConstant.LARGE_PACKET_RATIO] = 0 + agg_result[TableConstant.HCCL_OP_NAME] = Constant.TOTAL_OP_INFO + # compute the bandwidth of the aggregated data + agg_result[TableConstant.BANDWIDTH] = ( + agg_result[sum_transit_size] / agg_result[sum_transit_time].replace(0, np.nan) + ).fillna(0).round(4) + agg_result = agg_result.rename(columns={ + sum_transit_size: TableConstant.TRANSIT_SIZE, + sum_transit_time: TableConstant.TRANSIT_TIME + }) + desired_order = [TableConstant.STEP, TableConstant.RANK_ID, TableConstant.HCCL_OP_NAME, + TableConstant.GROUP_NAME, TableConstant.TRANSPORT_TYPE, TableConstant.TRANSIT_SIZE, + TableConstant.TRANSIT_TIME, TableConstant.BANDWIDTH, TableConstant.LARGE_PACKET_RATIO, + TableConstant.PACKAGE_SIZE, TableConstant.COUNT, TableConstant.TOTAL_DURATION] + final_df = pd.concat([communication_bandwidth, agg_result], axis=0).reindex(columns=desired_order) + final_df.rename(columns={TableConstant.TRANSPORT_TYPE: TableConstant.BAND_TYPE}, inplace=True) + self.communication_bandwidth = final_df + + def _get_sum_distinct_op(self, op_df): + return op_df.drop_duplicates(subset=[TableConstant.HCCL_OP_NAME, TableConstant.GROUP_NAME])[ + [TableConstant.TRANSIT_SIZE, TableConstant.TRANSIT_TIME]].sum() + + def _mapper_func(self, data_map, analysis_class): + analysis_db_path = data_map.get(Constant.ANALYSIS_DB_PATH) + rank_id = data_map.get(Constant.RANK_ID) + step_range = data_map.get(Constant.STEP_RANGE) + date_service = DatabaseService(analysis_db_path, step_range) + date_service.add_table_for_query(Constant.TABLE_COMM_ANALYZER_TIME) + 
date_service.add_table_for_query(Constant.TABLE_COMM_ANALYZER_BANDWIDTH) + df_dict = date_service.query_data() + time_df = df_dict.get(Constant.TABLE_COMM_ANALYZER_TIME) + bandwidth_df = df_dict.get(Constant.TABLE_COMM_ANALYZER_BANDWIDTH) + + is_time_df_empty = time_df is None or time_df.empty + is_bandwidth_df_empty = bandwidth_df is None or bandwidth_df.empty + if is_time_df_empty or is_bandwidth_df_empty: + logger.warning(f"There is no stats data in {analysis_db_path}.") + return None, None + # Fill in the step and rank_id fields + time_df[TableConstant.RANK_ID] = rank_id + bandwidth_df[TableConstant.RANK_ID] = rank_id + if TableConstant.STEP not in time_df.columns: + time_df[TableConstant.STEP] = TableConstant.STEP + if TableConstant.STEP not in bandwidth_df.columns: + bandwidth_df[TableConstant.STEP] = TableConstant.STEP + return time_df, bandwidth_df diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/compute_op_sum/compute_op_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/compute_op_sum/compute_op_sum.py index a5d44c3f17f6d2a31f097506c38d829c18d5d74f..528534be399e3ceacadbe7d1acf7294d7b3ff37d 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/compute_op_sum/compute_op_sum.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/compute_op_sum/compute_op_sum.py @@ -108,10 +108,11 @@ class ComputeOpSum(BaseRecipeAnalysis): def _mapper_func(self, data_map, analysis_class): profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) rank_id = data_map.get(Constant.RANK_ID) + step_range = data_map.get(Constant.STEP_RANGE) if self.exclude_op_name: - df = ComputeOpSumExportExcludeOpName(profiler_db_path, analysis_class).read_export_db() + df = ComputeOpSumExportExcludeOpName(profiler_db_path, analysis_class, step_range).read_export_db() else: - df = ComputeOpSumExport(profiler_db_path, analysis_class).read_export_db() + df = ComputeOpSumExport(profiler_db_path, analysis_class, step_range).read_export_db() if df is None or df.empty: logger.warning(f"There is 
no stats data in {profiler_db_path}.") return None diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/filter_db/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/ep_load_balance/__init__.py similarity index 100% rename from profiler/msprof_analyze/cluster_analyse/recipes/filter_db/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/ep_load_balance/__init__.py diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/ep_load_balance/ep_load_balance.py b/profiler/msprof_analyze/cluster_analyse/recipes/ep_load_balance/ep_load_balance.py new file mode 100644 index 0000000000000000000000000000000000000000..5c58b173167dcd2afdee65ac6a74eec12c9a8f52 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/ep_load_balance/ep_load_balance.py @@ -0,0 +1,131 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import json + +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_exports.ep_load_balance_ecport import InputShapeExport +from msprof_analyze.prof_common.database_service import DatabaseService + +logger = get_logger() + + +class EPLoadBalance(BaseRecipeAnalysis): + + EP_TOKENS_SUMMARY = "EPTokensSummary" + TOP_EP_TOKENS_INFO = "TopEPTokensInfo" + META_DATA = "META_DATA" + Top_Num = 20 + GROUPEP = "exp" + + def __init__(self, params): + super().__init__(params) + logger.info("EPLoadBalance init.") + self.ep_tokens_summary = None + self.top_ep_tokens_map = None + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + def process_input_shapes(self, df): + def calculate_seqlength(shape_str): + shape_str = shape_str.strip('"') + parts = shape_str.split(";") + non_empty_parts = [part for part in parts if part] + # Keep the first n-2 non-empty parts + if len(non_empty_parts) > 1: + non_empty_parts = non_empty_parts[: len(non_empty_parts) - 2] + else: + return None + seqlength = 0 + for part in non_empty_parts: + part = part.strip() + try: + first_dim = int(part.split(",")[0]) + except (IndexError, ValueError) as e: + return None + seqlength += first_dim + return seqlength + + df["InputShapes"] = df["InputShapes"].apply(calculate_seqlength) + return df + + def reducer_func(self, mapper_res): + mapper_res = list(filter(lambda df: df is not None, mapper_res)) + if not mapper_res: + logger.error("Mapper data is None.") + return + for i, df in enumerate(mapper_res): + mapper_res[i] = self.process_input_shapes(df) + mapper_res = [df.dropna() for df in mapper_res] + for df in mapper_res: + df["epRanks"] = df["epRanks"].apply(lambda x: ",".join(map(str, x))) + combined_df = pd.concat(mapper_res) + self.ep_tokens_summary = 
combined_df.groupby(["Rank", "epRanks"]).agg({"InputShapes": "sum"}).reset_index() + self.ep_tokens_summary.columns = ["rank", "epRanks", "inputShapesSummary"] + self.top_ep_tokens_map = ( + self.ep_tokens_summary.groupby("epRanks")["inputShapesSummary"] + .agg(tokensDiff=lambda x: x.max() - x.min()) + .reset_index() + ) + self.top_ep_tokens_map = self.top_ep_tokens_map.sort_values(by="tokensDiff", ascending=False).head(self.Top_Num) + + def run(self, context): + mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + + if self._export_type == "db": + self.save_db() + else: + logger.error("ep_load_balance is only supported for db export type.") + + def save_db(self): + self.dump_data(self.ep_tokens_summary, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, self.EP_TOKENS_SUMMARY, + index=False) + self.dump_data(self.top_ep_tokens_map, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, self.TOP_EP_TOKENS_INFO, + index=False) + + def _mapper_func(self, data_map, analysis_class): + profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) + rank_id = data_map.get(Constant.RANK_ID) + step_range = data_map.get(Constant.STEP_RANGE) + analysis_data_service = DatabaseService(profiler_db_path, {}) + analysis_data_service.add_table_for_query(self.META_DATA) + meta_map = analysis_data_service.query_data()[self.META_DATA] + parallel_group_info = meta_map.loc[meta_map['name'] == 'parallel_group_info', 'value'].iloc[0] + try: + data_dict = json.loads(parallel_group_info) + except json.JSONDecodeError as e: + logger.error(f"{profiler_db_path}'s parallel_group_info is illegal") + return None + if not isinstance(data_dict, dict): + raise TypeError('{} must be dict, not {}.'.format(data_dict, type(data_dict).__name__)) + for _, value in data_dict.items(): + if value["group_name"] == self.GROUPEP: + global_ranks = value["global_ranks"] + break + df = InputShapeExport(profiler_db_path, analysis_class, step_range).read_export_db() + if df is None or df.empty: + 
logger.warning(f"There is no stats data in {profiler_db_path}.") + return None + df["Rank"] = rank_id + df["epRanks"] = [global_ranks] * len(df) + return df \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/filter_db/filter_db.py b/profiler/msprof_analyze/cluster_analyse/recipes/filter_db/filter_db.py deleted file mode 100644 index 29db8f637376fa04629e5b728a2aa53c9251944c..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/filter_db/filter_db.py +++ /dev/null @@ -1,80 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import shutil - -from msprof_analyze.prof_common.db_manager import DBManager -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_common.path_manager import PathManager -from msprof_analyze.prof_exports.filter_db_export import OPFilter -from msprof_analyze.prof_exports.filter_db_export import TaskFilter -from msprof_analyze.prof_exports.filter_db_export import CANNFilter -from msprof_analyze.prof_exports.filter_db_export import PYTORCHFilter - -logger = get_logger() - -FILTER_COMPUTE = "COMPUTE_TASK_INFO" -FILTER_TASK = "TASK" -FILTER_CANN = "CANN_API" -FILTER_PYTORCH = "PYTORCH_API" - - -class DatabaseFilter(BaseRecipeAnalysis): - def __init__(self, params): - super().__init__(params) - logger.info("filter_db init.") - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - def run(self, context): - mapper_res = self.mapper_func(context) - logger.info("Filtering database completed.") - - def _mapper_func(self, data_map, analysis_class): - profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) - rank_id = data_map.get(Constant.RANK_ID) - - paths = profiler_db_path.split(os.path.sep) - sub_path = os.path.join(*paths[-3:-1]) - - output_path = os.path.join(self._output_path, "filter_db", sub_path) - PathManager.make_dir_safety(output_path) - - filtered_db = os.path.join(output_path, f"ascend_pytorch_profiler_{rank_id}.db") - shutil.copyfile(profiler_db_path, filtered_db) - - conn, cursor = DBManager.create_connect_db(filtered_db) - - op = OPFilter(filtered_db, analysis_class).read_export_db() - op.to_sql(FILTER_COMPUTE, conn, if_exists="replace", index=False) - task = TaskFilter(filtered_db, analysis_class).read_export_db() - task.to_sql(FILTER_TASK, conn, if_exists="replace", index=False) - cann = CANNFilter(filtered_db, 
analysis_class).read_export_db() - cann.to_sql(FILTER_CANN, conn, if_exists="replace", index=False) - pytorch = PYTORCHFilter(filtered_db, analysis_class).read_export_db() - pytorch.to_sql(FILTER_PYTORCH, conn, if_exists="replace", index=False) - - DBManager.execute_sql(conn, "DROP TABLE IF EXISTS COMMUNICATION_TASK_INFO;") - DBManager.execute_sql(conn, "DROP TABLE IF EXISTS TASK_PMU_INFO;") - - cursor.execute("VACUUM;") - conn.commit() - - DBManager.destroy_db_connect(conn, cursor) diff --git a/profiler/merge_profiling_timeline/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/freq_analysis/__init__.py similarity index 100% rename from profiler/merge_profiling_timeline/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/freq_analysis/__init__.py diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/freq_analysis/freq_analysis.py b/profiler/msprof_analyze/cluster_analyse/recipes/freq_analysis/freq_analysis.py new file mode 100644 index 0000000000000000000000000000000000000000..985288ca40cb0cab5b6c7677b163d202e71ed23e --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/freq_analysis/freq_analysis.py @@ -0,0 +1,112 @@ +# Copyright (c) 2024, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +from collections import defaultdict +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.database_service import DatabaseService + +logger = get_logger() + + +class FreqAnalysis(BaseRecipeAnalysis): + COMMON_FREQ = 1800 + FREE_FREQ = 800 + + def __init__(self, params): + super().__init__(params) + self.free_freq_ranks = [] + self.abnormal_freq_ranks = [] + self.abnormal_freq_ranks_map = {} + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + def reducer_func(self, mapper_res): + if self._is_msprof: + logger.warning("Freq analysis do not support msprof db now.") + return + mapper_res = list(filter(lambda res: res[0] is not None, mapper_res)) + if not mapper_res: + logger.error("Mapper data is None, load profiling data failed.") + return + for freqs, rank_id in mapper_res: + if freqs == [self.COMMON_FREQ]: + continue + elif set(freqs) == {self.COMMON_FREQ, self.FREE_FREQ}: + self.free_freq_ranks.append(rank_id) + else: + self.abnormal_freq_ranks.append(rank_id) + self.abnormal_freq_ranks_map[rank_id] = str(freqs) + self.free_freq_ranks.sort() + self.abnormal_freq_ranks.sort() + + def save_db(self): + if len(self.free_freq_ranks) > 0: + logger.info(f"Found {len(self.free_freq_ranks)} ranks with free time, " + f"aicore frequency in {[self.FREE_FREQ, self.COMMON_FREQ]}.") + free_ranks_df = pd.DataFrame() + free_ranks_df["rankId"] = self.free_freq_ranks + free_ranks_df["aicoreFrequency"] = str([self.FREE_FREQ, self.COMMON_FREQ]) + free_ranks_df.set_index(["rankId"], inplace=True) + self.dump_data(free_ranks_df, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "FreeFrequencyRanks") + else: + logger.info("No rank found with free time.") + if len(self.abnormal_freq_ranks) > 0: + logger.info(f"Found 
{len(self.abnormal_freq_ranks)} ranks with abnormal aicore frequency.") + + abnormal_ranks_df = pd.DataFrame.from_dict(self.abnormal_freq_ranks_map, + orient="index", columns=["aicoreFrequency"]) + abnormal_ranks_df = abnormal_ranks_df.reset_index().rename(columns={"index": "rankId"}) + abnormal_ranks_df.set_index(["rankId"], inplace=True) + self.dump_data(abnormal_ranks_df, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "AbnormalFrequencyRanks") + else: + logger.info("No rank found with abnormal aicore frequency.") + if len(self.free_freq_ranks) > 0 or len(self.abnormal_freq_ranks) > 0: + logger.info("Please verify result in output file.") + + def run(self, context): + mapper_res = self.mapper_func(context) + self.reducer_func(mapper_res) + + if self._export_type == Constant.DB: + self.save_db() + else: + logger.error("Frequence analysis is not supported for notebook export type.") + + def _mapper_func(self, data_map, analysis_class): + profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) + service = DatabaseService(profiler_db_path, None) + service.add_table_for_query("AICORE_FREQ", ["deviceId", "freq"]) + service.add_table_for_query("RANK_DEVICE_MAP", ["rankId"]) + service_res = service.query_data() + aic_freq = service_res.get("AICORE_FREQ", None) + rank_id = service_res.get("RANK_DEVICE_MAP", None) + if aic_freq is None or aic_freq.empty: + logger.error(f"No aic freq data found in {profiler_db_path}.") + return None, None + if rank_id is None or rank_id.empty: + logger.error(f"No rank_id data found in {profiler_db_path}.") + return None, None + rank_id = rank_id["rankId"].values[0] + freq_arr = aic_freq["freq"].values + freqs = list(set(freq_arr)) + freqs.sort() + return freqs, rank_id diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/hccl_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/hccl_sum.py index a78603ee0ac2894fb8b60a21f411e7fef9d144db..84ff40ac7e5d78d6ea30127739e18dfd1654e2c0 100644 --- 
a/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/hccl_sum.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/hccl_sum.py @@ -128,9 +128,10 @@ class HcclSum(BaseRecipeAnalysis): def _mapper_func(self, data_map, analysis_class): profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) rank_id = data_map.get(Constant.RANK_ID) - df = HcclSumExport(profiler_db_path, analysis_class).read_export_db() + step_range = data_map.get(Constant.STEP_RANGE) + df = HcclSumExport(profiler_db_path, analysis_class, step_range).read_export_db() if df is None or df.empty: logger.warning(f"There is no stats data in {profiler_db_path}.") return None df["Rank"] = rank_id - return df \ No newline at end of file + return df diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/stats.ipynb b/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/stats.ipynb index 87f8c6d736240531e2c28c0cf33df087ecfe38e8..51a08a854b97161ba8e88ec94809b728582d6631 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/stats.ipynb +++ b/profiler/msprof_analyze/cluster_analyse/recipes/hccl_sum/stats.ipynb @@ -4,9 +4,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# HCCL Summary\n", + "# COMMUNICATION Summary\n", "\n", - "Analysis of Hccl operator data in cluster scenarios\n", + "Analysis of communication operator data in cluster scenarios\n", "\n", "It mainly covers the following 3 statistics:\n", "1. Time statistics of communication operators across the whole cluster, grouped by operator type\n", diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/mstx2commop.py b/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/mstx2commop.py deleted file mode 100644 index 6ca80230abbe90cca79d2a17da466ed5bab83c03..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/mstx2commop/mstx2commop.py +++ /dev/null @@ -1,162 +0,0 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import os -import pandas as pd - -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.db_manager import DBManager -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.mstx2commop_export import Mstx2CommopExport -from msprof_analyze.prof_common.database_service import DatabaseService - -logger = get_logger() - -TABLE_COMMUNICATION_OP = "COMMUNICATION_OP" -TABLE_STRING_IDS = "STRING_IDS" - - -def double_hash(data): - uint32_bits = 32 - uint32_max = 0xFFFFFFFF # maximum value of a 32-bit unsigned integer - prime = [29, 131] - hash_values = [0, 0] - - for d in data: - hash_values[0] = (hash_values[0] * prime[0] + ord(d)) & uint32_max - hash_values[1] = (hash_values[1] * prime[1] + ord(d)) & uint32_max - - return ((hash_values[0] << uint32_bits) | hash_values[1]) - - -class Mstx2Commop(BaseRecipeAnalysis): - - def __init__(self, params): - super().__init__(params) - logger.info("Mstx2Commop init.") - self.communication_op = None - self.string_ids_insert = None - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - def run(self, context): - self.mapper_func(context) - - def _mapper_func(self, data_map, analysis_class): - profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) - data_service = DatabaseService(profiler_db_path) - data_service.add_table_for_query("ENUM_HCCL_DATA_TYPE", ["id", "name"]) - data_service.add_table_for_query("STRING_IDS", ["id", "value"]) - df_dict = 
data_service.query_data() - - df = Mstx2CommopExport(profiler_db_path, analysis_class).read_export_db() - - if df is None or df.empty: - logger.warning(f"There is no stats data in {profiler_db_path}.") - return None - - df_hccl_dt = df_dict.get("ENUM_HCCL_DATA_TYPE") - - if df_hccl_dt is None or df_hccl_dt.empty: - logger.warning(f"There is no stats data in {profiler_db_path}.") - return None - - df_string_ids = df_dict.get("STRING_IDS") - - if df_string_ids is None or df_string_ids.empty: - logger.warning(f"There is no stats data in {profiler_db_path}.") - return None - - df['value_list'] = df['value'].apply(lambda x: x.split(',')) - df['value_list_len'] = df['value_list'].apply(len) - df = df[df['value_list_len'] == 4] - df['opType_primal'] = df['value_list'].apply(lambda x: 'hcom_' + x[0][9:] + '_') - df['groupName_primal'] = df['value_list'].apply(lambda x: x[1]) - df['dataType'] = df['value_list'].apply(lambda x: x[2]) - df['count'] = df['value_list'].apply(lambda x: x[3]) - - df['groupName_hash'] = df['groupName_primal'].apply(double_hash).apply(str) - - df['gN_oT'] = df['groupName_primal'] + df['opType_primal'] - - gnot_set = set(list(df['gN_oT'])) - - df_concat = pd.DataFrame() - for g_o in gnot_set: - df_split = df[df['gN_oT'] == g_o] - df_split['queue'] = list(range(len(df_split))) - df_concat = pd.concat([df_concat, df_split], axis=0) - - df_concat['queue'] = df_concat['queue'].apply(str) - - df_concat['groupId'] = df_concat['groupName_hash'].apply(lambda x: "_" + x[-3:]) - - df_concat['opName_primal'] = df_concat['opType_primal'] + df_concat['groupId'] + '_' + df_concat['queue'] + '_1' - - df_concat['opId'] = list(range(len(df_concat))) - df_concat['relay'] = None - df_concat['retry'] = None - df_concat['algType'] = None - - df_hccl_dt['name'] = df_hccl_dt['name'].apply(lambda x: x.lower()) - hccl_data_type_dict = dict(zip(df_hccl_dt['name'], df_hccl_dt['id'])) - - string_ids_dict = dict(zip(df_string_ids['value'], df_string_ids['id'])) - - 
string_ids_max = df_string_ids['id'].max() - - df_concat['dataType'] = df_concat['dataType'].apply(lambda x: hccl_data_type_dict[x]) - - df_concat['string_id_opType_primal'] = df_concat['opType_primal'].apply( - lambda x: 1 if x in string_ids_dict else 0) - df_concat['string_id_opName_primal'] = df_concat['opName_primal'].apply( - lambda x: 1 if x in string_ids_dict else 0) - df_concat['string_id_groupName_primal'] = df_concat['groupName_primal'].apply( - lambda x: 1 if x in string_ids_dict else 0) - optype_primal_list = list(set(df_concat[df_concat['string_id_opType_primal'] == 0]['opType_primal'])) - opname_primal_list = list(set(df_concat[df_concat['string_id_opName_primal'] == 0]['opName_primal'])) - groupname_primal_list = list(set(df_concat[df_concat['string_id_groupName_primal'] == 0]['groupName_primal'])) - - special_primal_list = optype_primal_list + opname_primal_list + groupname_primal_list - special_id_list = list(range(string_ids_max + 1, string_ids_max + len(special_primal_list) + 1)) - - special_id_dict = dict(zip(special_primal_list, special_id_list)) - - df_concat['opType'] = df_concat['opType_primal'].apply( - lambda x: string_ids_dict[x] if x in string_ids_dict else special_id_dict[x] - ) - df_concat['opName'] = df_concat['opName_primal'].apply( - lambda x: string_ids_dict[x] if x in string_ids_dict else special_id_dict[x] - ) - df_concat['groupName'] = df_concat['groupName_primal'].apply( - lambda x: string_ids_dict[x] if x in string_ids_dict else special_id_dict[x] - ) - - communication_op = df_concat[ - ['opName', 'startNs', 'endNs', 'connectionId', 'groupName', 'opId', 'relay', 'retry', 'dataType', 'algType', - 'count', 'opType']] - communication_op.sort_values('startNs', ascending=True, inplace=True) - communication_op.set_index('opId', inplace=True) - string_ids_insert = list(map(list, zip(special_id_list, special_primal_list))) - - DBManager.insert_data_into_db(data_map.get(Constant.PROFILER_DB_PATH), TABLE_STRING_IDS, string_ids_insert) - 
- self.dump_data(data=communication_op, file_name=data_map.get(Constant.PROFILER_DB_PATH), - table_name=TABLE_COMMUNICATION_OP, custom_db_path=data_map.get(Constant.PROFILER_DB_PATH)) - - return data_map.get(Constant.RANK_ID) diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/mstx_sum/mstx_sum.py b/profiler/msprof_analyze/cluster_analyse/recipes/mstx_sum/mstx_sum.py index 69b4b056850b85a856634c7feb0121cfcb34494b..db6aae0de869853ccc5debac729492bfc9853695 100644 --- a/profiler/msprof_analyze/cluster_analyse/recipes/mstx_sum/mstx_sum.py +++ b/profiler/msprof_analyze/cluster_analyse/recipes/mstx_sum/mstx_sum.py @@ -21,7 +21,7 @@ from msprof_analyze.cluster_analyse.common_func.utils import describe_duration from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.mstx_mark_export import MstxMarkExport +from msprof_analyze.prof_exports.mstx_event_export import MstxMarkExport, MstxRangeExport from msprof_analyze.prof_exports.mstx_step_export import MstxStepExport logger = get_logger() @@ -43,16 +43,28 @@ def format_mark_info(df: pd.DataFrame, start_idx, stop_idx, name) -> MarkInfo: ) -def rename_mark_msg_name(mark_stats_df: pd.DataFrame): +def format_range_info(df: pd.DataFrame, idx, name) -> MarkInfo: + range_series = df.iloc[idx] + return MarkInfo( + name=name, + framework_duration=float(0), + cann_duration=float(range_series["cann_end_ts"] - range_series["cann_start_ts"]), + device_duration=float(range_series["device_end_ts"] - range_series["device_start_ts"]), + tid=range_series["tid"], + start_ns=range_series["cann_start_ts"] + ) + + +def rename_mark_msg_name(mstx_stats_df: pd.DataFrame): msg_idx_counter = {} - for idx, mark_info in enumerate(mark_stats_df.itertuples(index=False)): + for idx, mark_info in enumerate(mstx_stats_df.itertuples(index=False)): 
msg_idx_counter.setdefault(mark_info.step_id, {}).setdefault(mark_info.name, []).append(idx) for msg_dict in msg_idx_counter.values(): for msg, idx_list in msg_dict.items(): if len(idx_list) <= 1: continue for i, idx in enumerate(idx_list): - mark_stats_df.loc[idx, 'name'] = f"{msg}_{i}" + mstx_stats_df.loc[idx, 'name'] = f"{msg}_{i}" def compute_step_id(mark_stat, step_stats_df: pd.DataFrame): @@ -80,6 +92,45 @@ def format_columns(df: pd.DataFrame): return formatted_df[cols] +def handle_mark_data(mark_df: pd.DataFrame, rank_id: int) -> list: + res = [] + mark_df["framework_ts"] = mark_df["framework_ts"].astype("int64") + mark_info = {} + mismatch_msg = [] + for idx, row in enumerate(mark_df.itertuples(index=False)): + if row.msg.endswith(MstxSum.START_SUFFIX): + msg = row.msg[:-len(MstxSum.START_SUFFIX)] + mark_info.setdefault(row.tid, {}).setdefault(msg, []).append(idx) + elif row.msg.endswith(MstxSum.STOP_SUFFIX): + msg = row.msg[:-len(MstxSum.STOP_SUFFIX)] + idx_list = mark_info.get(row.tid, {}).get(msg, []) + if not idx_list: + mismatch_msg.append((row.msg, idx)) + continue + start_idx = idx_list.pop() + res.append(format_mark_info(mark_df, start_idx, idx, msg)) + + # Collect the mark messages that were not matched + for msg_info in mark_info.values(): + for msg, idx_list in msg_info.items(): + if not idx_list: + continue + mismatch_msg.extend((msg + MstxSum.START_SUFFIX, idx) for idx in idx_list) + if mismatch_msg: + mismatch_msg.sort(key=lambda msg: msg[1]) + logger.warning(f"The following mark messages do not match anyone in " + f"rank {rank_id}: {','.join(msg[0] for msg in mismatch_msg)}.") + + return res + + +def handle_range_data(range_df: pd.DataFrame) -> list: + res = [] + for idx, row in enumerate(range_df.itertuples(index=False)): + res.append(format_range_info(range_df, idx, row.msg)) + return res + + class MstxSum(BaseRecipeAnalysis): TABLE_FRAMEWORK_STATS = "MSTXAllFrameworkStats" TABLE_CANN_STATS = "MSTXAllCannStats" @@ -154,44 +205,23 @@ class MstxSum(BaseRecipeAnalysis): def 
_mapper_func(self, data_map, analysis_class): profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) rank_id = data_map.get(Constant.RANK_ID) - step_df = MstxStepExport(profiler_db_path, analysis_class).read_export_db() + step_range = data_map.get(Constant.STEP_RANGE) + step_df = MstxStepExport(profiler_db_path, analysis_class, step_range).read_export_db() if step_df is None or step_df.empty: step_df = pd.DataFrame({"start_ns": [0], "end_ns": [float("inf")], "step_id": [0]}) - mark_df = MstxMarkExport(profiler_db_path, analysis_class).read_export_db() - if mark_df is None or mark_df.empty: - logger.warning(f"There is no mark data in {profiler_db_path}.") + mark_df = MstxMarkExport(profiler_db_path, analysis_class, step_range).read_export_db() + range_df = MstxRangeExport(profiler_db_path, analysis_class, step_range).read_export_db() + mstx_res = [] + if not mark_df.empty: + mstx_res += handle_mark_data(mark_df, rank_id) + if not range_df.empty: + mstx_res += handle_range_data(range_df) + if not mstx_res: + logger.warning(f"There is no mstx data in {profiler_db_path}.") return None - mark_df["framework_ts"] = mark_df["framework_ts"].astype("int64") - - mark_info = {} - mark_res = [] - mismatch_msg = [] - for idx, row in enumerate(mark_df.itertuples(index=False)): - if row.msg.endswith(MstxSum.START_SUFFIX): - msg = row.msg[:-len(MstxSum.START_SUFFIX)] - mark_info.setdefault(row.tid, {}).setdefault(msg, []).append(idx) - elif row.msg.endswith(MstxSum.STOP_SUFFIX): - msg = row.msg[:-len(MstxSum.STOP_SUFFIX)] - idx_list = mark_info.get(row.tid, {}).get(msg, []) - if not idx_list: - mismatch_msg.append((row.msg, idx)) - continue - start_idx = idx_list.pop() - mark_res.append(format_mark_info(mark_df, start_idx, idx, msg)) - - # Collect the mark messages that were not matched - for msg_info in mark_info.values(): - for msg, idx_list in msg_info.items(): - if not idx_list: - continue - mismatch_msg.extend((msg + MstxSum.START_SUFFIX, idx) for idx in idx_list) - if mismatch_msg: - 
mismatch_msg.sort(key=lambda msg: msg[1]) - logger.warning(f"The following mark messages do not match anyone in " - f"rank {rank_id}: {','.join(msg[0] for msg in mismatch_msg)}.") - - mark_stats_df = pd.DataFrame(mark_res).assign(Rank=rank_id) - mark_stats_df["step_id"] = mark_stats_df.apply(compute_step_id, axis=1, step_stats_df=step_df) - rename_mark_msg_name(mark_stats_df) - mark_stats_df = format_columns(mark_stats_df).set_index("Name", drop=True) - return mark_stats_df \ No newline at end of file + + mstx_stats_df = pd.DataFrame(mstx_res).assign(Rank=rank_id) + mstx_stats_df["step_id"] = mstx_stats_df.apply(compute_step_id, axis=1, step_stats_df=step_df) + rename_mark_msg_name(mstx_stats_df) + mstx_stats_df = format_columns(mstx_stats_df).set_index("Name", drop=True) + return mstx_stats_df diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/p2p_pairing/p2p_pairing.py b/profiler/msprof_analyze/cluster_analyse/recipes/p2p_pairing/p2p_pairing.py deleted file mode 100644 index b3cce9d214ebbd62b13494c9d68c9bdfe9629d3b..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/p2p_pairing/p2p_pairing.py +++ /dev/null @@ -1,243 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -from json import JSONDecodeError - -import numpy as np -import pandas as pd - -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.cluster_analyse.common_func.table_constant import ProfilerTableConstant -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.db_manager import DBManager -from msprof_analyze.prof_common.file_manager import FileManager -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.p2p_pairing_export import P2PPairingExport - - -logger = get_logger() - - -class P2PPairing(BaseRecipeAnalysis): - - P2P_OP_NAME_PATTERN = r"^hcom_([Ss]end|[Rr](ecv|eceive))__\d+_\d+_\d+$" - DOMAIN_ID_EXTRACT_PATTERN = r"__(\d+)_\d+_\d+" - RECEIVE_OP_MATCH_PATTERN = r"[Rr]ecv|[Rr]eceive" - VALID_DST_RANK_TASK_TYPE = [Constant.NOTIFY_RECORD, Constant.NOTIFY_WAIT] - # intermediate dataframe column names - COL_NAME_IS_UNIQUE_VALUE = "isUniqueValue" - COL_NAME_OP_DST_RANK = "opDstRank" - COL_NAME_DOMAIN_ID = "domainId" - COL_NAME_IS_RECEIVE = "isReceive" - COL_NAME_OP_NAMING_INDEX = "opNamingIndex" - # output column name - COL_NAME_P2P_CONNECTION_ID = "opConnectionId" - # export params - TARGET_TABLE_NAME = Constant.TABLE_COMMUNICATION_OP - - def __init__(self, params): - super().__init__(params) - logger.info("P2PPairing init.") - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - def run(self, context): - self.mapper_func(context) - logger.info("P2PPairing completed.") - - def update_connection_info_to_table(self, df_result, profiler_db_path): - """ - 将生成好的连接ID添加至COMMUNICATION OP表中,新增列`opConnectionId`。目前只处理Send和Recv算子,对应的opId会更新具体的连接ID, - 否则置空 - """ - conn, cursor = DBManager.create_connect_db(profiler_db_path) - ret = DBManager.check_columns_exist(cursor, self.TARGET_TABLE_NAME, {self.COL_NAME_P2P_CONNECTION_ID}) - if ret is None: - logger.error("Failed to connect to the database. 
Please check the database configurations") - return - if self.COL_NAME_P2P_CONNECTION_ID in ret: - logger.error(f"`{self.COL_NAME_P2P_CONNECTION_ID}` already exists in the {self.TARGET_TABLE_NAME}. " - f"Exiting to prevent result overwrite.") - return - DBManager.execute_sql( - conn, - f"ALTER TABLE {self.TARGET_TABLE_NAME} ADD COLUMN {self.COL_NAME_P2P_CONNECTION_ID} TEXT" - ) - DBManager.execute_sql( - conn, - f"UPDATE {self.TARGET_TABLE_NAME} SET {self.COL_NAME_P2P_CONNECTION_ID} = NULL" - ) - DBManager.executemany_sql( - conn, - f""" - UPDATE {self.TARGET_TABLE_NAME} - SET {self.COL_NAME_P2P_CONNECTION_ID} = ? - WHERE {ProfilerTableConstant.OP_ID} = ?;""", - [(row[self.COL_NAME_P2P_CONNECTION_ID], row[P2PPairingExport.CO_OP_NAME]) - for _, row in df_result.iterrows()] - ) - DBManager.destroy_db_connect(conn, cursor) - - def generate_p2p_connection_index(self, df): - """ - 生成每一个P2P的算子的对应连接ID,连接ID的生成规则按照`通信域_Send卡号_Recv卡号_算子index`。 - 其中通信域为通信域字符串的哈希值后三位表示;Send卡和Recv卡分别为这个通信域内的local rank号;算子index是这两张卡之间按时间线排序, - 出现Send和Recv算子已有的频次。比如说,一个算子的名称为`hcom_send_233_58_1`,自己在通信域内的rank号为0,对端的rank号为1;在这之前 - 并没有存在0卡向1卡的Send任务。因此生成的id为`233_0_1_0` - """ - df[self.COL_NAME_DOMAIN_ID] = df[P2PPairingExport.OP_NAME]. \ - str.extract(self.DOMAIN_ID_EXTRACT_PATTERN)[0] - df[self.COL_NAME_IS_RECEIVE] = df[P2PPairingExport.OP_NAME]. \ - str.contains(self.RECEIVE_OP_MATCH_PATTERN) - df.loc[ - df[self.COL_NAME_IS_RECEIVE], [P2PPairingExport.SRC_RANK, self.COL_NAME_OP_DST_RANK] - ] = df.loc[ - df[self.COL_NAME_IS_RECEIVE], [self.COL_NAME_OP_DST_RANK, P2PPairingExport.SRC_RANK] - ].values - df[self.COL_NAME_OP_NAMING_INDEX] = df.sort_values(by=[P2PPairingExport.START_TIME]). 
\ - groupby([P2PPairingExport.SRC_RANK, self.COL_NAME_OP_DST_RANK]).cumcount() - df[self.COL_NAME_P2P_CONNECTION_ID] = (df[self.COL_NAME_DOMAIN_ID].astype(str) + "_" - + df[P2PPairingExport.SRC_RANK].astype(str) + "_" - + df[self.COL_NAME_OP_DST_RANK].astype(str) + "_" - + df[self.COL_NAME_OP_NAMING_INDEX].astype(str)) - return df.reset_index() - - def fine_filtering_src_dst_ranks(self, df: pd.DataFrame): - """ - 精筛符合条件的数据: - 1、小算子任务包含了“Notify_Record”和“Notify_Wait”的数据 - 2、上一步得到的数据中对端卡号是否一致,如果不一致则会抛出warning - 3、步骤1得到数据中本端卡号是否一致,如果不一致则会报出error返回空值 - """ - df = df[df[P2PPairingExport.TASK_TYPE].isin(self.VALID_DST_RANK_TASK_TYPE)] - - def check_dst_rank_unique(group): - return group[P2PPairingExport.DST_RANK].nunique() == 1 - - unique_dst_rank: pd.DataFrame = (df.groupby(P2PPairingExport.OP_NAME) - .apply(check_dst_rank_unique, include_groups=False)) - - def get_dst_rank_value(group): - if group[P2PPairingExport.DST_RANK].nunique() == 1: - return group[P2PPairingExport.DST_RANK].iloc[0] - return np.nan - - dst_rank_value: pd.DataFrame = (df.groupby(P2PPairingExport.OP_NAME, group_keys=False). - apply(get_dst_rank_value, include_groups=False)) - - df = df.copy() - df[self.COL_NAME_IS_UNIQUE_VALUE] = df[P2PPairingExport.OP_NAME].map(unique_dst_rank) - df[self.COL_NAME_OP_DST_RANK] = df[P2PPairingExport.OP_NAME].map(dst_rank_value) - df[self.COL_NAME_OP_DST_RANK] = df[self.COL_NAME_OP_DST_RANK].fillna(Constant.INVALID_RANK_NUM) - df[self.COL_NAME_OP_DST_RANK] = df[self.COL_NAME_OP_DST_RANK].astype(df[P2PPairingExport.DST_RANK].dtype) - - check_dst_rank_unique_false: pd.DataFrame = df[~df[self.COL_NAME_IS_UNIQUE_VALUE]] - if not check_dst_rank_unique_false.empty: - logger.warning(f"There are communication op entries with multiple destination ranks! 
" - f"Please check the corresponding profiler database file.") - - df = df[df[self.COL_NAME_IS_UNIQUE_VALUE]] - - src_rank_unique_values: int = df[P2PPairingExport.SRC_RANK].nunique() - if src_rank_unique_values != 1: - logger.error(f"There are communication op entries with multiple source ranks! " - f"Please check the corresponding profiler database file.") - return None - return df.reset_index() - - def filter_data_by_group_name(self, df: pd.DataFrame): - """ - 初步筛选出目标数据: - 1、筛选出Send和Recv的算子 - 2、筛选出同一opId在COMMUNICATION OP中groupName和COMMUNICATION TASK INFO中groupName一致的数据 - """ - df = df[df[P2PPairingExport.OP_NAME].str.match(self.P2P_OP_NAME_PATTERN)] - filtered_df = df[df[P2PPairingExport.CO_GROUP_NAME] == df[P2PPairingExport.CTI_GROUP_NAME]] - anomaly_group_match = df[df[P2PPairingExport.CO_GROUP_NAME] != df[P2PPairingExport.CTI_GROUP_NAME]] - if not anomaly_group_match.empty: - logger.warning(f"Group name mismatch in {len(anomaly_group_match)} entries. Please check the" - f" profiler database in communication task info.") - return filtered_df.reset_index() - - def extract_pp_group_from_metadata(self, profiler_parent_path) -> any: - """ - 从profiler_metadata.json的文件中获取pp通信域的信息 - """ - metadata_path = os.path.join(profiler_parent_path, Constant.PROFILER_METADATA) - try: - if os.path.exists(metadata_path): - metadata = FileManager.read_json_file(metadata_path) - parallel_group_info: dict = metadata.get(Constant.PARALLEL_GROUP_INFO, None) if metadata else None - else: - raise FileNotFoundError(f"No `{Constant.PROFILER_METADATA}` found in {profiler_parent_path}.") - except (FileNotFoundError, JSONDecodeError) as e: - logger.error(f"Failed to load profiler metadata: {e}") - return None - - if parallel_group_info is None: - logger.error(f"No key name `{Constant.PARALLEL_GROUP_INFO}` found in {metadata_path}") - return None - - pp_group_info = [] - for name in parallel_group_info: - each_group_info: dict = parallel_group_info[name] - if 
each_group_info[Constant.GROUP_NAME] == Constant.PP: - pp_group_info.append(parallel_group_info[name]) - if not pp_group_info: - logger.error(f"No pipeline parallel info found in {metadata_path}") - return None - - return pp_group_info - - def _mapper_func(self, data_map, analysis_class): - profiler_db_path: str = data_map.get(Constant.PROFILER_DB_PATH) - profiler_parent_path: str = os.path.dirname(os.path.dirname(profiler_db_path)) - - df: pd.DataFrame = P2PPairingExport(profiler_db_path, analysis_class).read_export_db() - if df is None or df.empty: - logger.warning(f"There is no stats data in {profiler_db_path}.") - return None - - pp_group_info = self.extract_pp_group_from_metadata(profiler_parent_path) # 暂时没用到,预留给后续确认用全局rank - if pp_group_info is None: - logger.error(f"Cannot obtain pipeline parallel info from the metadata. " - f"Please check the corresponding {Constant.PROFILER_METADATA}") - - df = self.filter_data_by_group_name(df) - if df.empty: - return None - - df_filtered = self.fine_filtering_src_dst_ranks(df.copy()) - if df_filtered is None: - logger.error("Got error when trying to match rank numbers!") - return None - - df_result = df_filtered.groupby([P2PPairingExport.OP_NAME, P2PPairingExport.CO_OP_NAME]).agg( - { - P2PPairingExport.START_TIME: "first", - P2PPairingExport.SRC_RANK: "first", - self.COL_NAME_OP_DST_RANK: "first" - } - ).reset_index() - - df_result = self.generate_p2p_connection_index(df_result) - - df_result = df_result[[P2PPairingExport.CO_OP_NAME, self.COL_NAME_P2P_CONNECTION_ID]] - - self.update_connection_info_to_table(df_result, profiler_db_path) - return data_map.get(Constant.RANK_ID) diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git 
a/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/slow_link.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/slow_link.py deleted file mode 100644 index f2c5e5fe7d8004687a8cd0ab6eae929659b662b9..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/slow_link.py +++ /dev/null @@ -1,216 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# http://www.apache.org/licenses/LICENSE-2.0 -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import os -from collections import defaultdict - -import pandas as pd -import numpy as np -from tqdm import tqdm - -from msprof_analyze.cluster_analyse.common_func.utils import describe_duration -from msprof_analyze.cluster_analyse.common_func.utils import detect_outliers_z_score -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.slow_link_export import SlowLinkExport - -logger = get_logger() - - -class SlowLink(BaseRecipeAnalysis): - TABLE_SLOW_LINK_SUM = "SlowLinkSum" - TABLE_SLOW_LINK_OPS = "SlowLinkOps" - - TOP_NUM = "top_num" - DEFAULT_TOP_NUM = 15 - - def __init__(self, params): - super().__init__(params) - logger.info("SlowLink init.") - self.slow_link_sum = [] - self.slow_link_ops = [] - top_num = self._extra_args.get(self.TOP_NUM, self.DEFAULT_TOP_NUM) - self.top_num = int(top_num) if 
isinstance(top_num, str) and top_num.isdigit() else self.DEFAULT_TOP_NUM - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - @classmethod - def add_parser_argument(cls, parser): - parser.add_argument("--top_num", type=str, help="Duration cost top count", default=cls.DEFAULT_TOP_NUM) - - def merge_func(self, mapper_res): - # 过滤掉mapper_res中为None的元素 - mapper_res = list(filter(lambda df: df is not None, mapper_res)) - - # 如果过滤后mapper_res为空,记录错误并返回 - if not mapper_res: - logger.error("Mapper data is empty. Please check the input or data source.") - return - dataframes = [pd.DataFrame(item) for item in mapper_res] - mapper_res = pd.concat(dataframes, ignore_index=True) - # 从mapper_res中提取各个字段的值 - rank_id_arr = mapper_res["rankId"].values # 提取rankId数组 - num_ranks = len(rank_id_arr) # 获取rankId数组的长度 - group_name_arr = mapper_res["groupName"].values # 提取groupName数组 - communication_time_arr = mapper_res["communicationTime"].values # 提取通信时间数组 - op_name_arr = mapper_res["opName"].values # 提取操作名称数组 - - # 初始化用于存储分组信息的字典和数组 - process_group = defaultdict(lambda: defaultdict(list)) # 用于存储按组和操作名分组的索引 - transmit_time_arr = np.zeros(num_ranks, dtype=np.int64) # 初始化传输时间数组 - related_ranks_arr = np.zeros(num_ranks, dtype=np.int32) # 初始化相关rank数量数组 - - # 遍历所有记录,按groupName和opName分组 - for idx in range(num_ranks): - # 如果操作名称中包含"send"或"receive",跳过(可能是发送或接收操作) - if "send" in op_name_arr[idx] or "receive" in op_name_arr[idx]: - continue - # 将当前索引添加到对应的分组中 - process_group[group_name_arr[idx]][op_name_arr[idx]].append(idx) - - # 遍历分组后的数据,计算每个操作的传输时间和相关rank数量 - for _, ops_same_group in tqdm(process_group.items(), desc="Processing database data..."): - for _, ops in ops_same_group.items(): - # 提取当前分组中所有操作的通信时间 - communication_time_list = [communication_time_arr[op_idx] for op_idx in ops] - # 计算最小通信时间作为传输时间 - transmit_time = min(communication_time_list) - # 计算当前分组中操作的数量作为相关rank数量 - related_ranks_num = len(ops) - - # 更新传输时间和相关rank数量数组 - for op_idx in ops: - 
transmit_time_arr[op_idx] = transmit_time - related_ranks_arr[op_idx] = related_ranks_num - - # 将计算得到的传输时间和相关rank数量添加到mapper_res中 - mapper_res.insert(mapper_res.shape[1], 'transmitTime', transmit_time_arr) - mapper_res.insert(mapper_res.shape[1], 'relatedRanks', related_ranks_arr) - - # 调用过滤函数处理mapper_res - self.filter_func(mapper_res) - - def filter_func(self, mapper_res): - """ - 处理数据,分组并检测异常值。 - """ - # 按 opType, dataSize, related_ranks 分组 - grouped = mapper_res.groupby(['opType', 'dataSize', 'relatedRanks']) - - for _, group in grouped: - # 提取分组数据中的 transmit_time 列 - transmit_time_data = group['transmitTime'].values - - # 检测异常值 - outliers = detect_outliers_z_score(transmit_time_data) - - if outliers: - # 如果存在异常值,将整个分组数据存入 Slow_Link_Ops - self.slow_link_ops.append(group) - - if self.slow_link_ops: - self.slow_link_ops = pd.concat(self.slow_link_ops, ignore_index=True) - # 重置索引并去掉多余的索引列 - data = pd.DataFrame(self.slow_link_ops) - - # 按 'opType', 'dataSize', 'related_ranks' 分组 - grouped = data.groupby(['opType', 'dataSize', 'relatedRanks']) - - # 计算统计信息 - group_data = describe_duration(grouped['transmitTime']) - - # 找到每个组中 transmit_time 最小值和最大值对应的 rankId - min_rank = grouped['transmitTime'].idxmin().map(data['rankId']) - max_rank = grouped['transmitTime'].idxmax().map(data['rankId']) - - # 将最大值和最小值对应的 rankId 添加到 group_data - group_data['maxRank'] = max_rank.values - group_data['minRank'] = min_rank.values - - # 构造 filteringName - group_data['opTypeRelatedRanksDataSize'] = group_data.index.map(lambda x: f"{x[0]}{x[2]}_{x[1]}") - # 将 filteringName 移动到第一列 - cols = ['opTypeRelatedRanksDataSize'] + [col for col in group_data.columns if - col != 'opTypeRelatedRanksDataSize'] - group_data = group_data[cols] - - # 重置索引 - group_data = group_data.reset_index(drop=True) - # 计算最大值和最小值与均值的绝对值 - group_data['abs_max_mean'] = abs(group_data['MaxNs'] - group_data['MeanNs']) - group_data['abs_min_mean'] = abs(group_data['MinNs'] - group_data['MeanNs']) - - # 计算最大值和最小值与均值的绝对值中的较大值 - 
group_data['max_abs_mean'] = group_data[['abs_max_mean', 'abs_min_mean']].max(axis=1) - - # 计算偏移比值 - group_data['offsetRatio'] = group_data['max_abs_mean'] / group_data['StdNs'] - - # 按偏移比值降序排序 - group_data = group_data.sort_values(by='offsetRatio', ascending=False) - - # 根据 self.top_num 筛选出偏移比值最大的前 N 条记录 - group_data = group_data.head(self.top_num) - - # 删除辅助列 'abs_max_mean', 'abs_min_mean', 'max_abs_mean' - group_data = group_data.drop(columns=['abs_max_mean', 'abs_min_mean', 'max_abs_mean']) - - # 调整列的顺序,将 offsetRatio 移到 MinRank 和 MaxRank 之前 - columns = [col for col in group_data.columns if col not in ['maxRank', 'minRank', 'offsetRatio']] - columns.insert(len(columns), 'offsetRatio') # 将 offsetRatio 插入到倒数第三的位置 - columns.extend(['maxRank', 'minRank']) # 添加 MaxRank 和 MinRank 到列的最后 - - # 重新排列列的顺序 - group_data = group_data[columns] - - # 在处理 group_data 的最后部分并保存 - self.slow_link_sum = group_data - - def run(self, context): - if self.top_num <= 0: - logger.warning(f"SlowLink: top_num is set to a invalid value, " - f"it will be reset to default value({self.DEFAULT_TOP_NUM}).") - self.top_num = self.DEFAULT_TOP_NUM - mapper_res = self.mapper_func(context) - self.merge_func(mapper_res) - - if self._export_type == "db": - self.save_db() - elif self._export_type == "notebook": - self.save_notebook() - else: - logger.error("Unknown export type.") - - def save_notebook(self): - self.dump_data(self.slow_link_sum, "slow_link_sum.csv", index=False) - self.dump_data(self.slow_link_ops, "slow_link_ops.csv", index=False) - self.create_notebook("stats.ipynb") - self.add_helper_file("cluster_display.py") - - def save_db(self): - self.dump_data(self.slow_link_sum, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, self.TABLE_SLOW_LINK_SUM, - index=False) - self.dump_data(self.slow_link_ops, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, self.TABLE_SLOW_LINK_OPS, - index=False) - - def _mapper_func(self, data_map, analysis_class): - profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) - 
rank_id = data_map.get(Constant.RANK_ID) - df = SlowLinkExport(profiler_db_path, analysis_class).read_export_db() - if df is None or df.empty: - logger.warning(f"There is no stats data in {profiler_db_path}.") - return None - df.insert(0, "rankId", rank_id) - return df \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/stats.ipynb b/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/stats.ipynb deleted file mode 100644 index 30edbc245379aa6b02e8895427bc7ad5db6656b3..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/slow_link/stats.ipynb +++ /dev/null @@ -1,111 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# SLOWLINK Summary\n", - "\n", - "集群场景快慢卡数据分析\n", - "\n", - "主要包含以下2个统计内容:\n", - "1. 按算子类型分组的,整个集群通信算子耗时的统计情况\n", - "2. 整个集群异常的opType_relatedRanks_dataSize" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 数据准备" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from IPython.display import display, HTML\n", - "display(HTML(\"\"))\n", - "\n", - "import matplotlib.pyplot as plt\n", - "\n", - "import pandas as pd\n", - "pd.set_option(\"display.max_rows\", 100)\n", - "pd.set_option(\"display.width\", 1000)\n", - "\n", - "import cluster_display\n", - "\n", - "slow_link_ops_df = pd.read_csv(\"slow_link_ops.csv\")\n", - "slow_link_sum_df = pd.read_csv(\"slow_link_sum.csv\", index_col=\"opTypeRelatedRanksDataSize\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cluster_display.display_transmittime_bar(slow_link_ops_df, 0.05, 'hcom_allGather_', 5, 1024)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### 集群异常的opType_relatedRanks_dataSize分析\n", - "\n", - 
"统计集群异常的opType_relatedRanks_dataSize,时间单位为微秒(us)\n", - "\n", - "包含以下统计项:\n", - "- Count:算子数量\n", - "- Mean:平均耗时\n", - "- Std:标准差\n", - "- Min:最小值\n", - "- Q1:四分之一分位数\n", - "- Median:中位数\n", - "- Q3:四分之三分位数\n", - "- Max:最大值\n", - "- Sum:总耗时\n", - "- MinRank:耗时最少算子所在的Rank\n", - "- MaxRank:耗时最长算子所在的Rank" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "display(slow_link_sum_df)\n", - "fig_slow_link_ops = cluster_display.display_duration_boxplots(None, slow_link_sum_df, x_title=\"opTypeRelatedRanksDataSize\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.8" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank_pp_stage/__init__.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/__init__.py similarity index 100% rename from profiler/msprof_analyze/cluster_analyse/recipes/slow_rank_pp_stage/__init__.py rename to profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/__init__.py diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/dixon_table.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/dixon_table.py new file mode 100644 index 0000000000000000000000000000000000000000..7bf7e2c80621f6756b2d9ad82051eefac263d141 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/dixon_table.py @@ -0,0 +1,117 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +# 单边狄克逊检验表,995置信度 +DIXON_TABLE_995 = { + 3: 0.994, + 4: 0.920, + 5: 0.823, + 6: 0.744, + 7: 0.680, + 8: 0.723, + 9: 0.676, + 10: 0.638, + 11: 0.707, + 12: 0.675, + 13: 0.649, + 14: 0.672, + 15: 0.649, + 16: 0.629, + 17: 0.611, + 18: 0.595, + 19: 0.580, + 20: 0.568, + 21: 0.556, + 22: 0.545, + 23: 0.536, + 24: 0.526, + 25: 0.519, + 26: 0.510, + 27: 0.503, + 28: 0.496, + 29: 0.489, + 30: 0.484, + 31: 0.478, + 32: 0.473, + 33: 0.468, + 34: 0.463, + 35: 0.458, + 36: 0.454, + 37: 0.450, + 38: 0.446, + 39: 0.442, + 40: 0.439, + 41: 0.435, + 42: 0.432, + 43: 0.429, + 44: 0.425, + 45: 0.423, + 46: 0.420, + 47: 0.417, + 48: 0.414, + 49: 0.412, + 50: 0.409, + 51: 0.407, + 52: 0.405, + 53: 0.402, + 54: 0.400, + 55: 0.398, + 56: 0.396, + 57: 0.394, + 58: 0.392, + 59: 0.391, + 60: 0.388, + 61: 0.387, + 62: 0.385, + 63: 0.383, + 64: 0.382, + 65: 0.380, + 66: 0.379, + 67: 0.377, + 68: 0.376, + 69: 0.374, + 70: 0.372, + 71: 0.371, + 72: 0.370, + 73: 0.368, + 74: 0.368, + 75: 0.366, + 76: 0.365, + 77: 0.364, + 78: 0.363, + 79: 0.361, + 80: 0.360, + 81: 0.359, + 82: 0.358, + 83: 0.356, + 84: 0.356, + 85: 0.355, + 86: 0.353, + 87: 0.352, + 88: 0.352, + 89: 0.351, + 90: 0.350, + 91: 0.349, + 92: 0.348, + 93: 0.347, + 94: 0.346, + 95: 0.345, + 96: 0.344, + 97: 0.344, + 98: 0.343, + 99: 0.341, + 100: 0.341, +} \ No newline at end of file diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/slow_rank.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/slow_rank.py new file mode 100644 index 
0000000000000000000000000000000000000000..48f8ba6ed9079ab196085d4b3ff800c90a38e895 --- /dev/null +++ b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank/slow_rank.py @@ -0,0 +1,183 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from collections import defaultdict + +import pandas as pd +import numpy as np + +from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_exports.cluster_time_summary_export import CommunicationTimeExport +from msprof_analyze.cluster_analyse.recipes.slow_rank.dixon_table import DIXON_TABLE_995 + +logger = get_logger() + + +def judge_norm(time_list, threshold=3): + t_max = max(time_list) + t_min = min(time_list) + t_mean = np.mean(time_list) + t_std = np.std(time_list) + threshold_high = t_mean + threshold * t_std + threshold_low = t_mean - threshold * t_std + + # 耗时低于下阈值的卡认为是慢卡 + outliers_idx = [i for i, time in enumerate(time_list) if time < threshold_low] + + # 如果存在高于上阈值的卡,则将耗时最短的卡加到慢卡的list中 + if t_max > threshold_high: + if time_list.index(t_min) not in outliers_idx: + outliers_idx.append(time_list.index(t_min)) + return outliers_idx + + +def judge_dixon(time_list): + n = len(time_list) + if n in [0, 1, 2]: + return [] + sorted_list = sorted(time_list) + + # 
判断计算检验指标时分母是否可能为0 + if len(set(sorted_list)) <= 3: + return [] + + # 计算狄克逊检验的检验指标,次小值和最小值差,比上最大值和最小值的差。根据数据数量改变次小值和最大值的选取 + if n <= Constant.MAX_DIXON_NUM: + if n <= Constant.DIXON_THRESHOLD_1: + flag = (sorted_list[1] - sorted_list[0]) / (sorted_list[-1] - sorted_list[0]) \ + if (sorted_list[-1] - sorted_list[0]) else 0 + elif n <= Constant.DIXON_THRESHOLD_2: + flag = (sorted_list[1] - sorted_list[0]) / (sorted_list[-2] - sorted_list[0]) \ + if (sorted_list[-2] - sorted_list[0]) else 0 + elif n <= Constant.DIXON_THRESHOLD_3: + flag = (sorted_list[2] - sorted_list[0]) / (sorted_list[-2] - sorted_list[0]) \ + if (sorted_list[-2] - sorted_list[0]) else 0 + else: + flag = (sorted_list[2] - sorted_list[0]) / (sorted_list[-3] - sorted_list[0]) \ + if (sorted_list[-3] - sorted_list[0]) else 0 + + # 根据数据数量查表,若计算的检验指标较大,则认为有异常值,耗时最短的卡是慢卡 + if flag > DIXON_TABLE_995[n]: + return [time_list.index(sorted_list[0])] + return [] + + +def judge_slow_rank(time_list): + """根据time list长度 选择狄克逊检验或三倍标准差""" + if len(time_list) <= Constant.MAX_DIXON_NUM: + return judge_dixon(time_list) + else: + return judge_norm(time_list) + + +class SlowRankAnalysis(BaseRecipeAnalysis): + def __init__(self, params): + super().__init__(params) + logger.info("Slow Rank Analysis init.") + + @property + def base_dir(self): + return os.path.basename(os.path.dirname(__file__)) + + def reducer_func(self, mapper_res): + mapper_res = list(filter(lambda df: df is not None, mapper_res)) + if not mapper_res: + logger.error("Mapper data is None.") + return None + concated_df = pd.concat(mapper_res) + return concated_df + + def run(self, context): + if self._is_msprof: + logger.warning("Slow rank analysis do not support msprof db now.") + return + + mapper_res = self.mapper_func(context) + comm_ops_df = self.reducer_func(mapper_res) + if comm_ops_df is None: + return + + analyzer = SlowRankVoteAnalysis(comm_ops_df) + perpector_df = analyzer.run() + + if self._export_type == Constant.DB: + self.save_db(perpector_df) 
+ else: + logger.error("SlowRank analysis is not supported for notebook export type.") + + def save_db(self, perpector_df): + self.dump_data(perpector_df, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "SlowRank") + + def _mapper_func(self, data_map, analysis_class): + profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) + step_range = data_map.get(Constant.STEP_RANGE) + df = CommunicationTimeExport(profiler_db_path, analysis_class, step_range).read_export_db() + return df + + +class SlowRankVoteAnalysis: + def __init__(self, comm_ops): + self.comm_ops = comm_ops + + def grouping_ops(self): + """按照通信域、算子名称对通信算子进行分组""" + grouped_ops_dict = defaultdict(lambda: defaultdict(list)) + self.comm_ops = self.comm_ops[~self.comm_ops["opName"].str.contains("send")] + self.comm_ops = self.comm_ops[~self.comm_ops["opName"].str.contains("receive")] + grouped_df = self.comm_ops.groupby("groupName") + exclude_groups = [] + for group_name in grouped_df.groups.keys(): + ops_groupby_group_name = grouped_df.get_group(group_name) + ops_num = ops_groupby_group_name.groupby("opName").size().values + if len(set(ops_num)) > 1: + exclude_groups.append(group_name) + for exclude_group in exclude_groups: + self.comm_ops.drop(self.comm_ops[self.comm_ops["groupName"] == exclude_group].index, inplace=True) + self.comm_ops.reset_index(drop=True, inplace=True) + n = len(self.comm_ops) + group_name_arr = self.comm_ops["groupName"].values + op_name_arr = self.comm_ops["opName"].values + for idx in range(n): + group_name = group_name_arr[idx] + op_name = op_name_arr[idx] + grouped_ops_dict[group_name][op_name].append(idx) + return grouped_ops_dict + + def run(self): + grouped_ops_dict = self.grouping_ops() + perpector_dict = self.analysis(grouped_ops_dict) + return perpector_dict + + def analysis(self, grouped_ops_dict): + rank_id_arr = self.comm_ops["rankId"].values + comm_time_arr = self.comm_ops["communication_time"].values + perpector_dict = defaultdict(lambda: 0) + for _, ops_same_group in 
grouped_ops_dict.items(): + for _, ops_list in ops_same_group.items(): + time_list = [comm_time_arr[op_idx] for op_idx in ops_list] + perpector_rank_idx = judge_slow_rank(time_list) + if perpector_rank_idx: + for rank_idx in perpector_rank_idx: + slow_rank = rank_id_arr[ops_list[rank_idx]] + perpector_dict[slow_rank] += 1 + + perpector_df = pd.DataFrame(columns=["rankId", "slowAffectCount"]) + for rank, perpector_times in perpector_dict.items(): + perpector_df.loc[len(perpector_df)] = [rank, perpector_times] + perpector_df.set_index(["rankId"], inplace=True) + return perpector_df diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank_pp_stage/slow_rank_pp_stage.py b/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank_pp_stage/slow_rank_pp_stage.py deleted file mode 100644 index fd5bdc05dc04156fc22ff51f0c19e0d2dba64190..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/cluster_analyse/recipes/slow_rank_pp_stage/slow_rank_pp_stage.py +++ /dev/null @@ -1,295 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import json -from collections import defaultdict - -import pandas as pd - -from msprof_analyze.cluster_analyse.recipes.base_recipe_analysis import BaseRecipeAnalysis -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.logger import get_logger -from msprof_analyze.prof_exports.cluster_time_summary_export import CommunicationTimeExport -from msprof_analyze.prof_common.database_service import DatabaseService - -logger = get_logger() - - -class SlowRankPPStageAnalysis(BaseRecipeAnalysis): - TP_SIZE = "tensor_model_parallel_size" - PP_SIZE = "pipeline_model_parallel_size" - DP_SIZE = "data_parallel_size" - - def __init__(self, params): - super().__init__(params) - logger.info("SlowRank PPstage analysis init.") - - self.p2p_analysis_result = None - self.pp_analysis_result = None - self.p2p_vote_result = None - self.pp_vote_result = None - - self.distributed_args = self.load_distributed_args() - - @property - def base_dir(self): - return os.path.basename(os.path.dirname(__file__)) - - @classmethod - def add_parser_argument(cls, parser): - parser.add_argument("--tp", type=int, help=cls.TP_SIZE, default=None) - parser.add_argument("--pp", type=int, help=cls.PP_SIZE, default=None) - parser.add_argument("--dp", type=int, help=cls.DP_SIZE, default=None) - - def reducer_func(self, mapper_res): - mapper_res = list(filter(lambda df: df is not None, mapper_res)) - if not mapper_res: - logger.error("Mapper data is None.") - return None - concated_df = pd.concat(mapper_res) - return concated_df - - def run(self, context): - if self.distributed_args is None: - return - mapper_res = self.mapper_func(context) - comm_ops_df = self.reducer_func(mapper_res) - if comm_ops_df is None: - return - - p2p_analysis_result_list = [] - p2p_vote_result_list = [] - pp_analysis_result_list = [] - pp_vote_result_list = [] - - pp_stage_rank_map = self.map_rank_pp_stage() - - for _, df_one_step in comm_ops_df.groupby("step"): - p2p_analysis_result, 
p2p_vote_result, pp_analysis_result, pp_vote_result = \ - SlowRankPPStageStepAnalysis(df_one_step).analysis(pp_stage_rank_map) - p2p_analysis_result_list.append(p2p_analysis_result) - p2p_vote_result_list.append(p2p_vote_result) - pp_analysis_result_list.append(pp_analysis_result) - pp_vote_result_list.append(pp_vote_result) - - for step_id, (p2p_analysis_result, p2p_vote_result, pp_analysis_result, pp_vote_result) in \ - enumerate( - zip( - p2p_analysis_result_list, - p2p_vote_result_list, - pp_analysis_result_list, - pp_vote_result_list - )): - p2p_analysis_result["step"] = step_id - p2p_vote_result["step"] = step_id - pp_analysis_result["step"] = step_id - pp_vote_result["step"] = step_id - - self.p2p_analysis_result = pd.concat(p2p_analysis_result_list) - self.p2p_vote_result = pd.concat(p2p_vote_result_list) - self.pp_analysis_result = pd.concat(pp_analysis_result_list) - self.pp_vote_result = pd.concat(pp_vote_result_list) - - if self._export_type == Constant.DB: - self.save_db() - else: - logger.error("SlowRank PPstage is not supported for notebook export type.") - - def save_db(self): - self.dump_data(self.p2p_vote_result, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "P2PAnalysisResult") - self.dump_data(self.pp_vote_result, Constant.DB_CLUSTER_COMMUNICATION_ANALYZER, "PPAnalysisResult") - - def map_rank_pp_stage(self): - tp_size = self.distributed_args.get(self.TP_SIZE, 1) - pp_size = self.distributed_args.get(self.PP_SIZE, 1) - dp_size = self.distributed_args.get(self.DP_SIZE, 1) - - rank_pp_stage_map = {} - rank = 0 - for i in range(pp_size): - for _ in range(tp_size * dp_size): - rank_pp_stage_map[rank] = i - rank += 1 - return rank_pp_stage_map - - def load_distributed_args(self): - tp_size = self._extra_args.get("tp", None) - pp_size = self._extra_args.get("pp", None) - dp_size = self._extra_args.get("dp", None) - - if tp_size and pp_size and dp_size: - if tp_size <= 0 or pp_size <= 0 or dp_size <= 0: - logger.error("Invalid distributed_args, tp pp dp 
< 0.") - return None - return { - self.TP_SIZE: tp_size, - self.DP_SIZE: dp_size, - self.PP_SIZE: pp_size, - } - else: - rank_id = list(self._data_map.keys())[0] - profiler_db_path = self._data_map[rank_id] - db_path = os.path.join(profiler_db_path, Constant.SINGLE_OUTPUT, f"ascend_pytorch_profiler_{rank_id}.db") - if os.path.exists(db_path): - try: - service = DatabaseService(db_path) - service.add_table_for_query("META_DATA", ["name", "value"]) - df = service.query_data().get("META_DATA", None) - distributed_args = df.loc[df["name"] == "distributed_args", "value"] - if distributed_args.empty: - logger.error("Distributed args not in profiling files, please input manually.") - else: - distributed_args = json.loads(distributed_args.values[0]) - except Exception as err: - logger.error(err) - logger.error("Distributed args not in profiling files, please input manually.") - return None - - tp_size = distributed_args.get(self.TP_SIZE, 1) - pp_size = distributed_args.get(self.PP_SIZE, 1) - dp_size = distributed_args.get(self.DP_SIZE, 1) - if not isinstance(tp_size, int) or not isinstance(pp_size, int) or not isinstance(dp_size, int): - logger.error("Invalid distributed_args in profiling files, please input manually.") - return None - if tp_size <= 0 or pp_size <= 0 or dp_size <= 0: - logger.error("Invalid distributed_args in profiling files, please input manually.") - return None - return { - self.TP_SIZE: tp_size, - self.PP_SIZE: pp_size, - self.DP_SIZE: dp_size, - } - - logger.error(f"Db_file: {db_path} not exist.") - return None - - def _mapper_func(self, data_map, analysis_class): - profiler_db_path = data_map.get(Constant.PROFILER_DB_PATH) - df = CommunicationTimeExport(profiler_db_path, analysis_class).read_export_db() - return df - - -class SlowRankPPStageStepAnalysis: - def __init__(self, comm_ops): - self.comm_ops = comm_ops - self.exclude_ranks = [] - - def grouping_pp_stage_ops(self, pp_stage_rank_map): - p2p_op_group = defaultdict(lambda: defaultdict(list)) - 
pp_op_group = defaultdict(lambda: defaultdict(list)) - - def divid_opname(op_name): - # op_name的格式:输入 OPTYPE__GORUPHASH_IDX_1 输出 OPTYPE_IDX - splited_name = op_name.split("__") - if len(splited_name) != 2: - return None - splited_num = splited_name[1].split("_") - if len(splited_num) != 3: - return None - return "_".join([splited_name[0], splited_num[1]]) - - ops_num = len(self.comm_ops) - op_name_arr = self.comm_ops["opName"].values - rank_id_arr = self.comm_ops["rank"].values - for idx in range(ops_num): - rank = rank_id_arr[idx] - op_name = op_name_arr[idx] - op_name_short = divid_opname(op_name) - if op_name_short is None: - continue - pp_stage_idx = pp_stage_rank_map[rank] - if rank in self.exclude_ranks: - continue - if "send" in op_name_short or "receive" in op_name_short: - p2p_op_group[pp_stage_idx][op_name_short].append(idx) - else: - pp_op_group[pp_stage_idx][op_name_short].append(idx) - - return p2p_op_group, pp_op_group - - def analysis_pp_stage(self, vote_group): - min_time_dict = defaultdict(lambda: defaultdict(lambda: 0)) - max_time_dict = defaultdict(lambda: defaultdict(lambda: 0)) - mean_time_dict = defaultdict(lambda: defaultdict(lambda: 0)) - count_dict = defaultdict(lambda: defaultdict(lambda: 0)) - rank_vote = defaultdict(lambda: 0) - perpetrator_dict = defaultdict(lambda: defaultdict(lambda: 0)) - minimum_rank_op_name = defaultdict(list) - - communication_time_arr = self.comm_ops["communication_time"].values - rank_id_arr = self.comm_ops["rank"].values - for pp_idx, ops_same_group in vote_group.items(): - for op_name, ops in ops_same_group.items(): - communication_time_list = [communication_time_arr[op_idx] for op_idx in ops] - min_time = min(communication_time_list) - min_op_idx = ops[communication_time_list.index(min_time)] - min_op_rank = rank_id_arr[min_op_idx] - rank_vote[min_op_rank] += 1 - perpetrator_dict[pp_idx][op_name] = min_op_rank - minimum_rank_op_name[min_op_rank].append(op_name) - - max_time = max(communication_time_list) - 
mean_time = sum(communication_time_list) // len(communication_time_list) - min_time_dict[pp_idx][op_name] = min_time - max_time_dict[pp_idx][op_name] = max_time - mean_time_dict[pp_idx][op_name] = mean_time - count_dict[pp_idx][op_name] = len(ops) - - analysis_result = pd.DataFrame( - columns=[ - "ppIdx", - "opName", - "minTime", - "maxTime", - "meanTime", - "count", - "perpetratorRank" - ] - ) - - for pp_idx in min_time_dict.keys(): - for op_name in min_time_dict[pp_idx].keys(): - analysis_result.loc[len(analysis_result)] = [ - pp_idx, op_name, - min_time_dict[pp_idx][op_name], - max_time_dict[pp_idx][op_name], - mean_time_dict[pp_idx][op_name], - count_dict[pp_idx][op_name], - perpetrator_dict[pp_idx][op_name] - ] - - vote_result = pd.DataFrame(columns=["rankId", "minimumTimes"]) - for rank, minimum_times in rank_vote.items(): - vote_result.loc[len(vote_result)] = [rank, minimum_times] - vote_result.set_index(["rankId"], inplace=True) - - return analysis_result, vote_result - - def analysis(self, pp_stage_rank_map): - self.select_exclude_ranks() - p2p_op_group, pp_op_group = self.grouping_pp_stage_ops(pp_stage_rank_map) - p2p_analysis_result, p2p_vote_result = self.analysis_pp_stage(p2p_op_group) - pp_analysis_result, pp_vote_result = self.analysis_pp_stage(pp_op_group) - return p2p_analysis_result, p2p_vote_result, pp_analysis_result, pp_vote_result - - def select_exclude_ranks(self): - grouped_df = self.comm_ops.groupby("rank") - for rank in grouped_df.groups.keys(): - ops_groupby_rank = grouped_df.get_group(rank) - ops_num = ops_groupby_rank.groupby("opName").size().values - if len(set(ops_num)) > 1: - self.exclude_ranks.append(rank) - \ No newline at end of file diff --git a/profiler/msprof_analyze/compare_tools/README.md b/profiler/msprof_analyze/compare_tools/README.md index ada936c90aea2916b6ddfec37b85e0735b8732a8..54d7283b2bca4b919a04c9aba5da968ef5e7f4dc 100644 --- a/profiler/msprof_analyze/compare_tools/README.md +++ 
b/profiler/msprof_analyze/compare_tools/README.md @@ -69,7 +69,7 @@ PyTorch Profiler采集结果数据目录结构如下: #### NPU性能数据采集 -通过Ascend PyTorch Profiler工具采集NPU的性能数据,采集参数配置与GPU基本一致,只需将GPU的性能数据采集代码中torch.profiler替换成torch_npu.profiler。,参考链接:[Profiling数据采集](https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze)。 +通过Ascend PyTorch Profiler工具采集NPU的性能数据,采集参数配置与GPU基本一致,只需将GPU的性能数据采集代码中torch.profiler替换成torch_npu.profiler,参考链接:[NPU性能数据采集](https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze#npu性能数据采集)。 Ascend PyTorch Profiler采集结果数据目录结构如下: @@ -151,9 +151,11 @@ python performance_compare.py [基准性能数据文件所在路径] [比对性 | --enable_kernel_compare | 开启kernel性能比对。仅针对NPU与NPU比对的场景。支持扩展参数请参见“**kernel性能比对特有参数说明**”。 | 否 | | --enable_api_compare | 开启API性能比对。MindSpore场景暂不支持。需要使用性能数据中的trace_view.csv文件。 | 否 | | --disable_details | 隐藏明细比对,只进行统计级比对。 | 否 | -| --base_step | 基准性能数据step ID,配置后使用基准性能数据对应step的数据进行比对。为整数,需配置实际数据存在的step ID,默认未配置,比对所有性能数据,需要与--comparison_step同时配置。配置示例:--base_step=1。 | 否 | -| --comparison_step | 比对性能数据step ID,配置后使用比对性能数据对应step的数据进行比对。为整数,需配置实际数据存在的step ID,默认未配置,比对所有性能数据,需要与--base_step同时配置。配置示例:--comparison_step=1。 | 否 | +| --base_step | 基准性能数据step ID,配置后使用基准性能数据对应step的数据进行比对。为整数,需配置实际数据存在的step ID,默认未配置,比对所有性能数据,需要与--comparison_step同时配置。配置示例:--base_step=1。
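    **This parameter takes effect only when --enable_operator_compare, --enable_communication_compare, --enable_memory_compare, --enable_kernel_compare, or --enable_api_compare is enabled.** | No | +| --comparison_step | Step ID of the comparison performance data. When set, the data of the corresponding step in the comparison performance data is used for the comparison. Must be an integer and a step ID that actually exists in the data. Not set by default, in which case all performance data is compared. Must be set together with --base_step. Example: --comparison_step=1.
    **This parameter takes effect only when --enable_operator_compare, --enable_communication_compare, --enable_memory_compare, --enable_kernel_compare, or --enable_api_compare is enabled.** | No | | --force | Forces compare to run. When set, the following checks are skipped:
    The owner of the specified directory or file is not the current user: the owner check is ignored and execution proceeds.
    A csv file is larger than 5G, a json file is larger than 10G, or a db file is larger than 8G: the file-size check is ignored and execution proceeds.
    Setting this parameter enables forced execution; it is disabled by default when not set. | No | +| --debug | Turn on this switch when the tool reports an error to display detailed stack trace information. Setting this parameter enables Debug; it is disabled by default when not set. | No | +| -h,-H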
    --help | 在需要查询当前命令附属子命令或相关参数时,给出帮助建议。 | 否 | 说明:以上开关均不设置的情况下,**工具默认开启所有的性能比对**,当用户设置了以上开关,则按照用户设置的开关进行性能比对,示例如下: @@ -193,7 +195,7 @@ MindSpore场景暂不支持。 #### 自定义比对算子 -一般情况下compare功能按照默认配置的算子进行比对,若用户需要对特定算子的性能进行比对和分析,可以通过在[compare_config.ini](https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/compare_tools/compare_backend/compare_config/compare_config.ini)文件中配置需要比对的算子名的识别关键词,之后再执行比对操作(msprof-analyze compare),比对结果在结果文件performance_comparison_result_{timestamp}.csv中呈现。 +一般情况下compare功能按照默认配置的算子进行比对,若用户需要对特定算子的性能进行比对和分析,可以通过在[compare_config.ini](https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/compare_tools/compare_backend/compare_config/compare_config.ini)文件中配置需要比对的算子名的识别关键词,之后再执行比对操作(msprof-analyze compare),比对结果在结果文件performance_comparison_result_{timestamp}.xlsx中呈现。 配置算子名的识别关键词为算子名称中的一部分,代表只要算子名称中包含该关键词,那么该算子会进行比对。 @@ -209,7 +211,7 @@ MindSpore场景暂不支持。 MindSpore场景仅支持**总体性能**、**通信性能**和**kernel性能**的对比。 -比对结果分为打屏和performance_comparison_result_{timestamp}.csv两种形式输出,其中打屏输出为概要信息,csv文件保存详细结果。 +比对结果分为打屏和performance_comparison_result_{timestamp}.xlsx两种形式输出,其中打屏输出为概要信息,xlsx文件保存详细结果。 ### 总体性能 @@ -233,7 +235,7 @@ MindSpore场景仅支持**总体性能**、**通信性能**和**kernel性能** | RDMA Bandwidth(GB/s) | RDMA带宽,单位GB/s。 | | SDMA Bandwidth(GB/s) | SDMA带宽,单位GB/s。 | | SDMA Time(Num) | 拷贝类任务耗时,Num表示计算的次数。 | -| Free Time | 调度耗时 = E2E耗时 - 算子耗时 - 通信不可掩盖耗时。Free的定义为Device侧既不在通信又不在计算的时间,因此包含拷贝时间(SDMA Time)。 | +| Free Time | 调度耗时 = E2E耗时 - 算子耗时 - 通信不可掩盖耗时。Free的定义为Device侧既不在通信也不在计算的时间,因此包含拷贝时间(SDMA Time)。 | | E2E Time(Not minimal profiling) | E2E总耗时,计算流端到端耗时。当存在Not minimal profiling时,表示该时间存在性能膨胀,会影响通信和调度耗时。 | | Other Time | AI CPU、DSA、TensorMove等其他算子耗时。 | @@ -254,42 +256,45 @@ MindSpore场景仅支持**总体性能**、**通信性能**和**kernel性能** Index列完整字段说明: -| 字段 | | | 说明 | -| ---------------------------- | :------------------ | ----------------------------------- | ------------------------------------------------------------ | -| Computing Time | | | 计算流耗时,计算流所有event耗时总和。如果有多条并发计算,计算流耗时对重叠部分只会计算一次。
    NPU场景下,仅当采集性能数据的Level等级为L1及以上且aic_metrics取值为PipeUtilization时才可拆分出Computing Time的二级字段Flash Attention、Conv等。 | -| | AllGatherMatmul | | AllGatherMatmul算子。MC²算子,仅为示例。 | -| | | Computing | AllGatherMatmul算子的计算算子。 | -| | | Communication | AllGatherMatmul算子的通信算子。 | -| | MatmulReduceScatter | | MatmulReduceScatter算子。MC²算子,仅为示例。 | -| | | Computing | MatmulReduceScatter算子的计算算子。 | -| | | Communication | MatmulReduceScatter算子的通信算子。 | -| | Flash Attention | | Flash Attention算子。 | -| | | Flash Attention (Forward) (Cube) | Flash Attention前向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | -| | | Flash Attention (Forward) (Vector) | Flash Attention前向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | -| | | Flash Attention (Backward) (Cube) | Flash Attention反向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | -| | | Flash Attention (Backward) (Vector) | Flash Attention反向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | -| | Conv | | Conv算子。 | -| | | Conv (Forward) (Cube) | Conv前向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | -| | | Conv (Forward) (Vector) | Conv前向Vector算子。Conv前向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | -| | | Conv (Backward) (Cube) | Conv反向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | -| | | Conv (Backward) (Vector) | Conv反向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | -| | Matmul | | Matmul算子。 | -| | | Matmul (Cube) | Matmul算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | -| | | Matmul (Vector) | Matmul算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | -| | Paged Attention | | Paged Attention算子。 | -| | Vector | | Vector算子。 | -| | | Vector (Trans) | 转换类Vector算子,主要包含Cast、TransPose、TransData算子。(仅针对NPU数据) | -| | | Vector ( No Trans) | 非转换类Vector算子。 | -| | Cube | | 未识别出Flash Attention、Conv和Matmul的Cube算子。 | -| | SDMA (Tensor Move) | | 拷贝类任务。 | -| | Other | | AI CPU、DSA等其他算子。 | -| Uncovered Communication Time | | | 通信未掩盖耗时,包含卡间等待时间。 | -| | Wait | | 卡间同步等待耗时。(仅针对NPU数据) | -| | Transmit | | 通信传输耗时。 | -| Free Time | | | 调度耗时 = E2E耗时 - 算子耗时 - 通信不可掩盖耗时。Free的定义为Device侧既不在通信又不在计算的时间,因此包含拷贝时间(SDMA Time)。 | -| | SDMA | | NPU为除Tensor 
Move外的拷贝类任务,GPU为所有拷贝类任务。 | -| | Free | | 排除SDMA的空闲耗时。 | -| E2E Time | | | E2E总耗时,计算流端到端耗时。当存在Not minimal profiling时,表示该时间存在性能膨胀,会影响通信和调度耗时。 | +| 字段 | | | 说明 | +| ---------------------------- | :--------------------------------------------- | ----------------------------------- | ------------------------------------------------------------ | +| Computing Time | | | 计算流耗时,计算流所有event耗时总和。如果有多条并发计算,计算流耗时对重叠部分只会计算一次。
    NPU场景下,仅当采集性能数据的Level等级为L1及以上且aic_metrics取值为PipeUtilization时才可拆分出Computing Time的二级字段Flash Attention、Conv等。 | +| | AllGatherMatmul | | AllGatherMatmul算子。MC²算子,仅为示例。 | +| | | Computing | AllGatherMatmul算子的计算算子。 | +| | | Communication | AllGatherMatmul算子的通信算子。 | +| | MatmulReduceScatter | | MatmulReduceScatter算子。MC²算子,仅为示例。 | +| | | Computing | MatmulReduceScatter算子的计算算子。 | +| | | Communication | MatmulReduceScatter算子的通信算子。 | +| | Flash Attention | | Flash Attention算子。 | +| | | Flash Attention (Forward) (Cube) | Flash Attention前向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | +| | | Flash Attention (Forward) (Vector) | Flash Attention前向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | +| | | Flash Attention (Backward) (Cube) | Flash Attention反向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | +| | | Flash Attention (Backward) (Vector) | Flash Attention反向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | +| | Conv | | Conv算子。 | +| | | Conv (Forward) (Cube) | Conv前向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | +| | | Conv (Forward) (Vector) | Conv前向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | +| | | Conv (Backward) (Cube) | Conv反向算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | +| | | Conv (Backward) (Vector) | Conv反向算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | +| | Matmul | | Matmul算子。 | +| | | Matmul (Cube) | Matmul算子下发的所有Cube类Kernel,一般为执行该算子核心计算的算子。 | +| | | Matmul (Vector) | Matmul算子下发的所有Vector类Kernel,一般为插入的转换类算子,如TransData。 | +| | Paged Attention | | Paged Attention算子。 | +| | Vector | | Vector算子。 | +| | | Vector (Trans) | 转换类Vector算子,主要包含Cast、TransPose、TransData算子。(仅针对NPU数据) | +| | | Vector ( No Trans) | 非转换类Vector算子。 | +| | Cube | | 未识别出Flash Attention、Conv和Matmul的Cube算子。 | +| | SDMA (Tensor Move) | | 拷贝类任务。 | +| | Other | | AI CPU、DSA等其他算子。 | +| Uncovered Communication Time | | | 通信未掩盖耗时,包含卡间等待时间。 | +| | {group_name}: Group group_name_* Communication | | 通信域,格式为:\{通信域名\}: Group group\_name\_* Communication,*表示通信域的编号。 | +| | | Wait | 卡间同步等待耗时。(仅针对NPU数据) | +| | | Transmit | 通信传输耗时。 | +| | Uncovered 
Communication Overlapped | | 两个通信域之间的未被计算掩盖的并行耗时。 | +| | | {group_name} & {group_name} | 两个通信域,比如tp & pp,表示tp域和pp域未被计算掩盖的并行耗时。 | +| Free Time | | | 调度耗时 = E2E耗时 - 算子耗时 - 通信不可掩盖耗时。Free的定义为Device侧既不在通信也不在计算的时间,因此包含拷贝时间(SDMA Time)。 | +| | SDMA | | NPU为除Tensor Move外的拷贝类任务,GPU为所有拷贝类任务。 | +| | Free | | 排除SDMA的空闲耗时。 | +| E2E Time | | | E2E总耗时,计算流端到端耗时。当存在Not minimal profiling时,表示该时间存在性能膨胀,会影响通信和调度耗时。 | 可以采取最简性能数据采集的方式来减少E2E耗时的性能膨胀,示例代码如下: @@ -343,7 +348,7 @@ MindSpore场景暂不支持。 - Device Total Time(ms):该模块调用的算子(包含子模块)在device侧执行的总耗时,单位ms。 - Device Total Time Diff(ms):GPU与NPU的Device Total Time(ms)差值。 - Device Self Time Diff(ms):GPU与NPU的Device Self Time(ms)差值。 -- Total Time Ratio:GPU与NPU的Device Total Time(ms)比值。 +- Diff Total Ratio:GPU与NPU的Device Total Time(ms)比值。 - Base Call Stack:基准文件模块的调用栈。 - Comparison Call Stack:比较文件模块的调用栈。 diff --git a/profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_compare_summary/__init__.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/__init__.py similarity index 100% rename from profiler/msprof_analyze/cluster_analyse/recipes/cluster_time_compare_summary/__init__.py rename to profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/__init__.py diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/framework_api_bean.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/framework_api_bean.py new file mode 100644 index 0000000000000000000000000000000000000000..ebc3df3eb63c561a8363ad8acbe580add157f6db --- /dev/null +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/framework_api_bean.py @@ -0,0 +1,97 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from decimal import Decimal + +from msprof_analyze.compare_tools.compare_backend.utils.common_func import convert_to_decimal +from msprof_analyze.compare_tools.compare_backend.utils.common_func import convert_to_float +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.compare_tools.compare_backend.compare_config.compare_config import CompareConfig + + +class FrameworkApiBean: + + def __init__(self, data): + self._data = data + self.is_torch_op = False + + @property + def dur(self) -> float: + return convert_to_float(self.end_time - self.start_time) + + @property + def start_time(self) -> Decimal: + return convert_to_decimal(self._data.get("startNs", 0)) / Constant.NS_TO_US + + @property + def end_time(self) -> Decimal: + return convert_to_decimal(self._data.get("endNs", 0)) / Constant.NS_TO_US + + @property + def name(self) -> str: + return self._data.get("name", "") + + @property + def lower_name(self) -> str: + return self.name.lower() + + @property + def connection_id(self): + return self._data.get("connectionId", "") + + @property + def cann_connection_id(self): + return self._data.get("cann_connectionId", "") + + @property + def input_dims(self): + return self._data.get("inputShapes", Constant.NA) + + @property + def input_type(self): + return self._data.get("inputDtypes", Constant.NA) + + @property + def call_stack(self): + return self._data.get("callStack", Constant.NA) + + def 
is_step_profiler(self): + return self.name.find("ProfilerStep#") != -1 + + def is_fa_for_cpu_op(self) -> bool: + """ + This class is used for both cpu op and gpu data; this check applies at the cpu op stage + """ + return any(cube_mask in self.lower_name for cube_mask in CompareConfig().fa_mask) + + def is_conv_for_cpu_op(self) -> bool: + """ + This class is used for both cpu op and gpu data; this check applies at the cpu op stage + """ + return any(conv_mask in self.lower_name for conv_mask in CompareConfig().conv_mask) + + def is_matmul_for_cpu_op(self) -> bool: + """ + This class is used for both cpu op and gpu data; this check applies at the cpu op stage + """ + return any(bwd_mask in self.lower_name for bwd_mask in CompareConfig().mm_mask) + + def is_bwd_for_cpu_op(self) -> bool: + """ + This class is used for both cpu op and gpu data; this check applies at the cpu op stage + """ + return any(bwd_mask in self.lower_name for bwd_mask in Constant.BWD_LIST) + + def is_cpu_cube_op(self) -> bool: + return self.is_matmul_for_cpu_op() or self.is_fa_for_cpu_op() or self.is_conv_for_cpu_op() diff --git a/profiler/msprof_analyze/precheck/env_check/environment_check.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_op_bean.py similarity index 48% rename from profiler/msprof_analyze/precheck/env_check/environment_check.py rename to profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_op_bean.py index 98d54ac506400ec53348cdaeea613d555dc81290..407d2d2d9adea392043c72a8fe2ebcf9c8b87d5f 100644 --- a/profiler/msprof_analyze/precheck/env_check/environment_check.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_op_bean.py @@ -12,39 +12,38 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License.
-from abc import ABC, abstractmethod +from msprof_analyze.prof_common.constant import Constant -class EnvironmentCheck(ABC): - CHECK_TYPE = "" +class HcclOpBean: - def __init__(self, **kwargs): - self.output = kwargs.get("output", "./output") + def __init__(self, data): + self._data = data - def init(self): - pass + @property + def name(self): + return self._data.get("opName", "") - def uninit(self): - pass + @property + def dur(self): + return self._data.get("Duration", 0) / Constant.NS_TO_US - @abstractmethod - def check(self): - pass + @property + def task_id(self): + return self._data.get("opId", "") + @property + def op_type(self): + return self._data.get("OpType", "") -class HardwareCheck(EnvironmentCheck): - def __init__(self, **kwargs): - super().__init__(**kwargs) + @property + def task_type(self): + return "" - @abstractmethod - def check(self): - pass + @property + def input_shapes(self): + return "" - -class SoftwareCheck(EnvironmentCheck): - def __init__(self, **kwargs): - super().__init__(**kwargs) - - @abstractmethod - def check(self): - pass + @property + def connection_id(self): + return self._data.get("connectionId", "") diff --git a/profiler/msprof_analyze/precheck/env_check/communication_check.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_task_bean.py similarity index 56% rename from profiler/msprof_analyze/precheck/env_check/communication_check.py rename to profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_task_bean.py index 807d4008115422ff312d2877495273bc25312eea..be6e05cf6d0dcf5f2bda06cfac78f2edcc6fd75a 100644 --- a/profiler/msprof_analyze/precheck/env_check/communication_check.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/hccl_task_bean.py @@ -12,14 +12,26 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
# See the License for the specific language governing permissions and # limitations under the License. -from msprof_analyze.precheck.env_check.environment_check import HardwareCheck +from msprof_analyze.prof_common.constant import Constant -class CommunicationCheck(HardwareCheck): - CHECK_TYPE = "communication" +class HcclTaskBean: - def __init__(self, **kwargs): - super().__init__(**kwargs) + def __init__(self, data): + self._data = data - def check(self): - pass + @property + def name(self): + return self._data.get("taskName", "") + + @property + def dur(self): + return self._data.get("Duration", 0) / Constant.NS_TO_US + + @property + def task_id(self): + return self._data.get("opId", "") + + @property + def group_name(self): + return self._data.get("GroupName", "") diff --git a/profiler/msprof_analyze/precheck/precheck.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/kernel_bean.py similarity index 47% rename from profiler/msprof_analyze/precheck/precheck.py rename to profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/kernel_bean.py index 64c938983813ab35e07d4ef5208fbe5423808100..cb92586b4f71876e6707b5be6f4641eb2441bf11 100644 --- a/profiler/msprof_analyze/precheck/precheck.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/db_data_bean/kernel_bean.py @@ -12,22 +12,34 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
-from msprof_analyze.precheck.env_check.check_item_factory import CheckItemFactory +from msprof_analyze.prof_common.constant import Constant -class Precheck: +class KernelBean: - @staticmethod - def env_precheck(**kwargs): - check_type = kwargs.get("check_type") - if not check_type: - return - check_items = CheckItemFactory.get_check_item(check_type) - for check_item in check_items: - check_obj = check_item(**kwargs) - check_obj.check() - return + def __init__(self, data): + self._data = data + @property + def name(self): + return self._data.get("OpName", "") -if __name__ == '__main__': - Precheck.env_precheck(check_type="env_variable") + @property + def dur(self): + return self._data.get("Duration") / Constant.NS_TO_US if self._data.get("Duration") else 0 + + @property + def task_id(self): + return self._data.get("TaskId", "") + + @property + def task_type(self): + return self._data.get("TaskType", "") + + @property + def input_shapes(self): + return self._data.get("InputShapes", "") + + @property + def connection_id(self): + return self._data.get("connectionId", "") diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/trace_event_bean.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/trace_event_bean.py index ab12d640a1aad9478ca067c56db3bcc10a156a0c..f177532d79f73b700c52be428f2c2d67a7f1d5da 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/trace_event_bean.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/origin_data_bean/trace_event_bean.py @@ -132,6 +132,18 @@ class TraceEventBean: def is_torch_op(self) -> bool: return self._is_torch_op + @property + def input_dims(self): + return self.args.get("Input Dims", Constant.NA) + + @property + def input_type(self): + return self.args.get("Input type", Constant.NA) + + @property + def call_stack(self): + return self.args.get("Call stack", Constant.NA) + 
@is_torch_op.setter def is_torch_op(self, value: bool): self._is_torch_op = value @@ -193,7 +205,7 @@ class TraceEventBean: return self._args.get("name", "").find("Communication") != -1 def is_hccl_process_name(self) -> bool: - return self.process_name == "HCCL" + return self.process_name in ["Communication", "HCCL"] def is_overlap_process_name(self) -> bool: return self.process_name == "Overlap Analysis" diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/overall_metrics_bean.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/overall_metrics_bean.py index 059416ec15e54b7732eb40cbf8b0e6ef1227bf90..19514b7716b966ab44d75f459046594650cd4a91 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/overall_metrics_bean.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/overall_metrics_bean.py @@ -24,6 +24,7 @@ class OverallMetricsBean: TABLE_NAME = Constant.OVERALL_METRICS_TABLE HEADERS = ExcelConfig.HEADERS.get(TABLE_NAME) OVERHEAD = ExcelConfig.OVERHEAD.get(TABLE_NAME) + DEFAULT_VALUE = [0, 0, "/"] def __init__(self, base_info: ProfilingInfo, comparison_info: ProfilingInfo): self._base_data = OverallMetricsInfo(base_info).overall_metrics @@ -39,55 +40,90 @@ class OverallMetricsBean: base_mc2_data = self._base_data.get("mc2", {}) comparison_mc2_data = self._comparison_data.get("mc2", {}) - default_value = [0, 0, "/"] for kernel_name, base_data in base_mc2_data.items(): comparison_data = comparison_mc2_data.pop(kernel_name, {}) - self._append_data(rows_data, self._get_row_data(kernel_name, base_data.get("mc2", default_value), - comparison_data.get("mc2", default_value))) + self._append_data(rows_data, self._get_row_data(kernel_name, base_data.get("mc2", self.DEFAULT_VALUE), + comparison_data.get("mc2", self.DEFAULT_VALUE))) self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMPUTING_TIME, - base_data.get(ExcelConfig.MC2_COMPUTING_TIME, default_value), - 
comparison_data.get(ExcelConfig.MC2_COMPUTING_TIME, default_value))) + base_data.get(ExcelConfig.MC2_COMPUTING_TIME, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.MC2_COMPUTING_TIME, + self.DEFAULT_VALUE))) self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMMUNICATION_TIME, - base_data.get(ExcelConfig.MC2_COMMUNICATION_TIME, default_value), + base_data.get(ExcelConfig.MC2_COMMUNICATION_TIME, self.DEFAULT_VALUE), comparison_data.get(ExcelConfig.MC2_COMMUNICATION_TIME, - default_value))) + self.DEFAULT_VALUE))) for kernel_name, comparison_data in comparison_mc2_data.items(): - self._append_data(rows_data, self._get_row_data(kernel_name, default_value, - comparison_data.get("mc2", default_value))) - self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMPUTING_TIME, default_value, + self._append_data(rows_data, self._get_row_data(kernel_name, self.DEFAULT_VALUE, + comparison_data.get("mc2", self.DEFAULT_VALUE))) + self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMPUTING_TIME, self.DEFAULT_VALUE, comparison_data.get(ExcelConfig.MC2_COMPUTING_TIME, - default_value))) - self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMMUNICATION_TIME, default_value, + self.DEFAULT_VALUE))) + self._append_data(rows_data, self._get_row_data(ExcelConfig.MC2_COMMUNICATION_TIME, self.DEFAULT_VALUE, comparison_data.get(ExcelConfig.MC2_COMMUNICATION_TIME, - default_value))) + self.DEFAULT_VALUE))) rows_data.extend( self._get_rows(self._base_data.get("before_group", {}), self._comparison_data.get("before_group", {}))) base_group_data = self._base_data.get("group", {}) comparison_group_data = self._comparison_data.get("group", {}) - default_value = [0, 0, "/"] + base_pg_name_dict = self._base_data.get("pg_name_dict", {}) + comparison_pg_name_dict = self._comparison_data.get("pg_name_dict", {}) + # handle base and comparison data that can be matched by pg_name + for base_pg_name, base_group_name_list in base_pg_name_dict.items():
+ if len(base_group_name_list) != 1 or base_pg_name == Constant.UNKNOWN: + continue + comparison_group_name_list = comparison_pg_name_dict.get(base_pg_name, []) + if len(comparison_group_name_list) != 1: + continue + + base_data = base_group_data.pop(base_group_name_list[0], {}) + comparison_data = comparison_group_data.pop(comparison_group_name_list[0], {}) + description = f"\t{base_pg_name}: Communication" + ExcelConfig.ROW_STYLE_MAP[description] = CellFormatType.LIGHT_BLUE_NORMAL + self._append_data(rows_data, + self._get_row_data(description, + base_data.get(ExcelConfig.COMMUNICATION_TIME, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.COMMUNICATION_TIME, + self.DEFAULT_VALUE))) + self._append_data(rows_data, + self._get_row_data(ExcelConfig.WAIT, base_data.get(ExcelConfig.WAIT, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.WAIT, self.DEFAULT_VALUE))) + self._append_data(rows_data, + self._get_row_data(ExcelConfig.TRANSMIT, + base_data.get(ExcelConfig.TRANSMIT, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.TRANSMIT, self.DEFAULT_VALUE))) + for group_name, base_data in base_group_data.items(): comparison_data = comparison_group_data.pop(group_name, {}) - self._append_data(rows_data, self._get_row_data(group_name, base_data.get("group", default_value), - comparison_data.get("group", default_value))) self._append_data(rows_data, - self._get_row_data(ExcelConfig.WAIT, base_data.get(ExcelConfig.WAIT, default_value), - comparison_data.get(ExcelConfig.WAIT, default_value))) + self._get_row_data(base_data.get("description", group_name), + base_data.get(ExcelConfig.COMMUNICATION_TIME, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.COMMUNICATION_TIME, + self.DEFAULT_VALUE))) + self._append_data(rows_data, + self._get_row_data(ExcelConfig.WAIT, base_data.get(ExcelConfig.WAIT, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.WAIT, self.DEFAULT_VALUE))) self._append_data(rows_data, self._get_row_data(ExcelConfig.TRANSMIT, - 
base_data.get(ExcelConfig.TRANSMIT, default_value), - comparison_data.get(ExcelConfig.TRANSMIT, default_value))) + base_data.get(ExcelConfig.TRANSMIT, self.DEFAULT_VALUE), + comparison_data.get(ExcelConfig.TRANSMIT, self.DEFAULT_VALUE))) for group_name, comparison_data in comparison_group_data.items(): - self._append_data(rows_data, self._get_row_data(group_name, default_value, - comparison_data.get("group", default_value))) - self._append_data(rows_data, self._get_row_data(ExcelConfig.WAIT, default_value, - comparison_data.get(ExcelConfig.WAIT, default_value))) - self._append_data(rows_data, self._get_row_data(ExcelConfig.TRANSMIT, default_value, - comparison_data.get(ExcelConfig.TRANSMIT, default_value))) + self._append_data(rows_data, + self._get_row_data(comparison_data.get("description", group_name), + self.DEFAULT_VALUE, + comparison_data.get(ExcelConfig.COMMUNICATION_TIME, + self.DEFAULT_VALUE))) + self._append_data(rows_data, self._get_row_data(ExcelConfig.WAIT, self.DEFAULT_VALUE, + comparison_data.get(ExcelConfig.WAIT, self.DEFAULT_VALUE))) + self._append_data(rows_data, self._get_row_data(ExcelConfig.TRANSMIT, self.DEFAULT_VALUE, + comparison_data.get(ExcelConfig.TRANSMIT, + self.DEFAULT_VALUE))) + rows_data.extend( + self._get_rows(self._base_data.get("group_overlap", {}), self._comparison_data.get("group_overlap", {}))) rows_data.extend( self._get_rows(self._base_data.get("after_group", {}), self._comparison_data.get("after_group", {}))) return rows_data @@ -96,10 +132,14 @@ class OverallMetricsBean: def _get_rows(cls, base_data_dict, comparison_data_dict): rows_data = [] for index, base_data in base_data_dict.items(): - comparison_data = comparison_data_dict.get(index) + comparison_data = comparison_data_dict.pop(index, cls.DEFAULT_VALUE) row = cls._get_row_data(index, base_data, comparison_data) if row: rows_data.append(row) + for index, comparison_data in comparison_data_dict.items(): + row = cls._get_row_data(index, cls.DEFAULT_VALUE, 
comparison_data) + if row: + rows_data.append(row) return rows_data @classmethod @@ -373,13 +413,30 @@ class OverallMetricsInfo: } if self._comm_group_list: for group_name in self._comm_group_list: - group_name_index = f"\t{group_name}" - ExcelConfig.ROW_STYLE_MAP[group_name_index] = CellFormatType.LIGHT_BLUE_NORMAL - overall_metrics_data.setdefault("group", {})[group_name_index] = { - "group": self.communication_data_by_group(group_name), + pg_name = self._profiling_info.get_pg_name_by_group(group_name) + description = " ".join([pg_name + ":" if pg_name != Constant.UNKNOWN else "", group_name]).strip() + ExcelConfig.ROW_STYLE_MAP[f"\t{description}"] = CellFormatType.LIGHT_BLUE_NORMAL + overall_metrics_data.setdefault("group", {})[group_name] = { + "description": f"\t{description}", + ExcelConfig.COMMUNICATION_TIME: self.communication_data_by_group(group_name), ExcelConfig.WAIT: self.wait_data_by_group(group_name), ExcelConfig.TRANSMIT: self.transmit_data_by_group(group_name) } + overall_metrics_data.setdefault("pg_name_dict", {}).setdefault(pg_name, []).append(group_name) + + if self._profiling_info.communication_overlap_time: + ExcelConfig.ROW_STYLE_MAP[ExcelConfig.UNCOVERED_COMM_OVERLAP] = CellFormatType.LIGHT_BLUE_NORMAL + comm_overlap_time = sum(self._profiling_info.communication_overlap_time.values()) + overall_metrics_data.setdefault("group_overlap", {})[ExcelConfig.UNCOVERED_COMM_OVERLAP] = [ + comm_overlap_time, comm_overlap_time / self.e2e_time, "/"] + for group_set, overlap_time in self._profiling_info.communication_overlap_time.items(): + pg_name_1 = self._profiling_info.get_pg_name_by_group(group_set[0]) + pg_name_2 = self._profiling_info.get_pg_name_by_group(group_set[1]) + pg_name = f"\t\t{pg_name_1 if pg_name_1 != Constant.UNKNOWN else group_set[0]} & " \ + f"{pg_name_2 if pg_name_2 != Constant.UNKNOWN else group_set[1]}" + overall_metrics_data.setdefault("group_overlap", {})[pg_name] = [overlap_time, + overlap_time / self.e2e_time, "/"] + for 
kernel_name in self._profiling_info.mc2_time_dict.keys(): mc2_name_index = f"\t{kernel_name}" ExcelConfig.ROW_STYLE_MAP[mc2_name_index] = CellFormatType.LIGHT_BLUE_NORMAL diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/profiling_info.py b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/profiling_info.py index bcbce59c016338195d49044508614307f2db1896..353d79aafbc85a71b0eeb96d31e50584b3f86547 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/profiling_info.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/compare_bean/profiling_info.py @@ -27,7 +27,8 @@ class ProfilingInfo: 'page_attention_time', 'page_attention_num', 'vector_time_trans', 'vector_num_trans', 'vector_time_notrans', 'vector_num_notrans', 'sdma_time_tensor_move', 'sdma_num_tensor_move', 'sdma_time_stream', 'sdma_num_stream', 'other_cube_time', 'other_cube_num', 'rdma_bandwidth', - 'sdma_bandwidth', 'communication_group_time', 'mc2_time_dict'] + 'sdma_bandwidth', 'communication_group_time', 'mc2_time_dict', 'pg_name_dict', + 'communication_overlap_time'] TABLE_NAME = Constant.PERFORMANCE_TABLE HEADERS = [] OVERHEAD = [] @@ -93,6 +94,10 @@ class ProfilingInfo: # Inter-card wait time and transmit time of communication, shown per group self.communication_group_time = {} + # Mapping between communication_group and pg_name + self.pg_name_dict = {} + # Overlap time between communications + self.communication_overlap_time = {} @property def e2e_time_ms(self): @@ -334,6 +339,12 @@ class ProfilingInfo: for time in time_dict.values(): self.wait_time += time.get(Constant.WAIT_TIME, 0) + def update_communication_overlap_time(self, time_dict: dict): + self.communication_overlap_time = time_dict + + def update_communication_group_pg_name(self, pg_name_dict: dict): + self.pg_name_dict = pg_name_dict + def set_memory_used(self, memory: float): self.memory_used = memory @@ -401,3 +412,6 @@ class ProfilingInfo: def get_mc2_number_by_name(self, kernel_name: str): return self.mc2_time_dict.get(kernel_name,
{}).get(Constant.MC2_NUMBER, 0) + + def get_pg_name_by_group(self, group: str): + return self.pg_name_dict.get(group, Constant.UNKNOWN) diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/comparison_generator.py b/profiler/msprof_analyze/compare_tools/compare_backend/comparison_generator.py index 2c3c3d920ce3896336f396eb51d4df3920ded7d7..e15d5e2f276410f7e6c3bced5e6b18fcfe29802c 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/comparison_generator.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/comparison_generator.py @@ -24,6 +24,8 @@ from msprof_analyze.compare_tools.compare_backend.utils.args_manager import Args from msprof_analyze.prof_common.constant import Constant from msprof_analyze.prof_common.additional_args_manager import AdditionalArgsManager from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.compare_tools.compare_backend.profiling_parser.npu_profiling_db_parser import \ + NPUProfilingDbParser logger = get_logger() @@ -52,14 +54,26 @@ class ComparisonGenerator: logger.error("%s", e) def load_data(self): - self._data_dict[Constant.BASE_DATA] = self.PARSER_DICT.get(self._args_manager.base_profiling_type)( - self._args_manager.args, - self._args_manager.base_path_dict, - self._args_manager.base_step).load_data() - self._data_dict[Constant.COMPARISON_DATA] = self.PARSER_DICT.get(self._args_manager.comparison_profiling_type)( - self._args_manager.args, - self._args_manager.comparison_path_dict, - self._args_manager.comparison_step).load_data() + if self._args_manager.base_path_dict.get(Constant.PROFILER_DB_PATH): + self._data_dict[Constant.BASE_DATA] = NPUProfilingDbParser(self._args_manager.args, + self._args_manager.base_path_dict, + self._args_manager.base_step).load_data() + else: + self._data_dict[Constant.BASE_DATA] = self.PARSER_DICT.get(self._args_manager.base_profiling_type)( + self._args_manager.args, + self._args_manager.base_path_dict, + 
self._args_manager.base_step).load_data() + if self._args_manager.comparison_path_dict.get(Constant.PROFILER_DB_PATH): + self._data_dict[Constant.COMPARISON_DATA] = \ + NPUProfilingDbParser(self._args_manager.args, + self._args_manager.comparison_path_dict, + self._args_manager.comparison_step).load_data() + else: + self._data_dict[Constant.COMPARISON_DATA] = self.PARSER_DICT.get( + self._args_manager.comparison_profiling_type)( + self._args_manager.args, + self._args_manager.comparison_path_dict, + self._args_manager.comparison_step).load_data() def generate_compare_result(self): overall_data = { diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/data_prepare/sequence_pre_matching.py b/profiler/msprof_analyze/compare_tools/compare_backend/data_prepare/sequence_pre_matching.py index 5c2590c723e646660b456acbdf3f114fb2726190..cdca93a92767f169f4f4c014ed18f5aa7d407a7a 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/data_prepare/sequence_pre_matching.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/data_prepare/sequence_pre_matching.py @@ -91,7 +91,7 @@ class SequencePreMatching: base_index += 1 comparison_index += 1 while comparison_index < comparison_data_len: - result_data.extend(self._match_torch_op([], comparison_data[0].get(Constant.OPS, []))) + result_data.extend(self._match_torch_op([], comparison_data[comparison_index].get(Constant.OPS, []))) comparison_index += 1 return result_data diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/base_profiling_parser.py b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/base_profiling_parser.py index b3d9a29944fe7f722b2fc54ac0381f8c23c1af14..bb7dfe7d90a543f36d6bcf0e4705cd00eef7dffb 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/base_profiling_parser.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/base_profiling_parser.py @@ -12,13 +12,14 @@ # WITHOUT 
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +import os from abc import abstractmethod, ABC from decimal import Decimal import ijson from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.compare_event import ( - KernelEvent, + KernelEvent, MemoryEvent ) from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.kernel_details_bean \ @@ -91,6 +92,7 @@ class BaseProfilingParser(ABC): self._enable_communication_compare = args.enable_communication_compare self._enable_api_compare = args.enable_api_compare self._enable_kernel_compare = args.enable_kernel_compare + self._step_id = step_id self._dispatch_func = self._get_dispatch_func() self._result_data = ProfilingResult(self._profiling_type) self._memory_events = [] @@ -103,7 +105,7 @@ class BaseProfilingParser(ABC): self._categorize_performance_index = 0 self._cpu_cube_op = None self._bwd_tid = None - self._step_id = step_id + self._step_range = None @property def cpu_cube_op(self): @@ -114,6 +116,27 @@ class BaseProfilingParser(ABC): self._cpu_cube_op = cpu_cube_op return self._cpu_cube_op + @property + def step_range(self): + if self._step_range is not None: + return self._step_range + self._step_range = [] + if self._step_id == Constant.VOID_STEP: + return self._step_range + step_list = [] + events = self._result_data.torch_op_data or self._trace_event_generator(self._profiling_type) + for event in events: + if event.is_step_profiler(): + step_id = event.name.split("#")[-1] + step_list.append(step_id) + if int(step_id) == int(self._step_id): + self._step_range = [event.start_time, event.end_time] + break + if not self._step_range: + valid_step = ", ".join(step_list) + raise RuntimeError(f"Invalid step id: {self._step_id}, please choose from the valid steps: {valid_step}") + return self._step_range + @abstractmethod def _update_kernel_details(self): 
raise NotImplementedError("Function _update_kernel_details need to be implemented.") @@ -156,6 +179,7 @@ class BaseProfilingParser(ABC): self._dispatch_events() self._update_kernel_dict() self._update_communication_dict() + self._update_pg_name_map() if self._enable_memory_compare: self._update_memory_list() if self._enable_profiling_compare: @@ -325,15 +349,25 @@ class BaseProfilingParser(ABC): self._comm_list = list(filter(lambda x: x.is_nccl_name(), self._all_kernels.values())) self._comm_list.sort(key=lambda x: x.start_time) self._comm_task_list.sort(key=lambda x: x.start_time) + if len(self.step_range) == 2: + comm_list = [event + for event in self._comm_list + if self.step_range[0] <= event.start_time <= self.step_range[1]] + comm_task_list = [event + for event in self._comm_task_list + if self.step_range[0] <= event.start_time <= self.step_range[1]] + else: + comm_list = self._comm_list + comm_task_list = self._comm_task_list task_index = 0 - for communication_op in self._comm_list: + for communication_op in comm_list: name_list = communication_op.lower_name.split("_") if len(name_list) < 2: continue - comm_name = name_list[1] + comm_name = name_list[1] if name_list[0] == "hcom" else name_list[0] self._result_data.update_communication_dict(comm_name, communication_op.dur) - while task_index < len(self._comm_task_list): - task_event = self._comm_task_list[task_index] + while task_index < len(comm_task_list): + task_event = comm_task_list[task_index] if task_event.start_time < communication_op.start_time: task_index += 1 continue @@ -369,3 +403,17 @@ class BaseProfilingParser(ABC): with open(self._json_path, 'r') as file: for event in ijson.items(file, item): yield TraceEventBean(event) + + def _update_pg_name_map(self): + meta_file = os.path.join(self._profiling_path, Constant.PROFILER_METADATA) + if not os.path.exists(meta_file): + return + meta_data = FileManager.read_json_file(meta_file) + if Constant.PARALLEL_GROUP_INFO not in meta_data: + return + 
pg_name_map = {} + for group_id, group_info in meta_data[Constant.PARALLEL_GROUP_INFO].items(): + if group_id not in pg_name_map: + format_group_id = " ".join(["Group", group_id, "Communication"]) + pg_name_map[format_group_id] = group_info.get('group_name', "") + self._result_data.overall_metrics.update_communication_group_pg_name(pg_name_map) diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_db_parser.py b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_db_parser.py new file mode 100644 index 0000000000000000000000000000000000000000..447af32fdc7b760fe8028e804c39d159a9a3a2de --- /dev/null +++ b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_db_parser.py @@ -0,0 +1,288 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.compare_tools.compare_backend.profiling_parser.base_profiling_parser import ProfilingResult +from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.db_data_bean.framework_api_bean import \ + FrameworkApiBean +from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.db_data_bean.kernel_bean import \ + KernelBean +from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.db_data_bean.hccl_op_bean import \ + HcclOpBean +from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.db_data_bean.hccl_task_bean import \ + HcclTaskBean +from msprof_analyze.prof_common.logger import get_logger + +logger = get_logger() + + +class NPUProfilingDbParser: + pytorch_api_sql = """ + SELECT + PYTORCH_API.startNs AS "startNs", + PYTORCH_API.endNs AS "endNs", + PYTORCH_API.connectionId AS "connectionId", + STRING_IDS.value AS "name", + CONNECTION_IDS.connectionId AS "cann_connectionId" + FROM + PYTORCH_API + LEFT JOIN + CONNECTION_IDS ON PYTORCH_API.connectionId=CONNECTION_IDS.id + LEFT JOIN + STRING_IDS ON PYTORCH_API.name=STRING_IDS.id + LEFT JOIN + ENUM_API_TYPE ON PYTORCH_API.type=ENUM_API_TYPE.id + WHERE + ENUM_API_TYPE.name=? 
{} + """ + + def __init__(self, args: any, path_dict: dict, step_id: int = Constant.VOID_STEP): + self._args = args + self._result_data = ProfilingResult(Constant.NPU) + self._db_path = path_dict.get(Constant.PROFILER_DB_PATH) + self._step_id = step_id + self._enable_profiling_compare = args.enable_profiling_compare + self._enable_operator_compare = args.enable_operator_compare + self._enable_memory_compare = args.enable_memory_compare + self._enable_communication_compare = args.enable_communication_compare + self._enable_api_compare = args.enable_api_compare + self._enable_kernel_compare = args.enable_kernel_compare + self.conn, self.cursor = DBManager.create_connect_db(self._db_path) + self._step_range = [] + self._compute_op_data = [] + self._comm_op_data = [] + self._comm_task_data = [] + + def load_data(self) -> ProfilingResult: + self._prepare_data() + if self._enable_communication_compare: + self._update_communication_dict() + return self._result_data + + def _update_communication_dict(self): + hccl_task_dict = {} + for task in self._comm_task_data: + hccl_task_dict.setdefault(task.task_id, []).append(task) + for comm_op in self._comm_op_data: + name_list = comm_op.op_type.lower().split("_") + if len(name_list) < 2: + continue + comm_name = name_list[1] if name_list[0] == "hcom" else name_list[0] + self._result_data.update_communication_dict(comm_name, comm_op.dur) + tasks = hccl_task_dict.get(comm_op.task_id, []) + for task in tasks: + self._result_data.update_comm_task_data(comm_name, task) + + def _prepare_data(self): + self._get_step_range() + self._query_torch_op_data() + self._query_compute_op_data() + self._query_comm_op_data() + self._query_comm_task_data() + self._query_memory_data() + + def _get_step_range(self): + if self._step_id != Constant.VOID_STEP: + sql = "SELECT id, startNs, endNs FROM STEP_TIME" + all_data = DBManager.fetch_all_data(self.cursor, sql) + if not all_data: + raise RuntimeError('The profiling data lacks step markers. 
Please re-collect it.') + for data in all_data: + if int(data[0]) == int(self._step_id): + self._step_range = [data[1], data[2]] + if not self._step_range: + raise RuntimeError(f"Invalid Step Id: {self._step_id}") + + def _query_torch_op_data(self): + if any((self._enable_memory_compare, self._enable_operator_compare, self._enable_profiling_compare, + self._enable_api_compare)): + sql = self.pytorch_api_sql.format( + "AND PYTORCH_API.startNs>=? AND PYTORCH_API.startNs<=?") if len(self._step_range) == 2 else \ + self.pytorch_api_sql.format("") + param = ('op', self._step_range[0], self._step_range[1]) if len(self._step_range) == 2 else ('op',) + all_data = DBManager.fetch_all_data(self.cursor, sql, param=param) + for data in all_data: + self._result_data.update_torch_op_data(FrameworkApiBean(data)) + + def _query_compute_op_data(self): + if self._enable_operator_compare or self._args.max_kernel_num or self._enable_profiling_compare: + sql = """ + SELECT + NAME_IDS.value AS "OpName", + COMPUTE_TASK_INFO.globalTaskId AS "TaskId", + OPTYPE_IDS.value AS "opType", + TASKTYPE_IDS.value AS "TaskType", + INPUTSHAPES_IDS.value AS "InputShapes", + round(TASK.endNs - TASK.startNs) AS "Duration", + TASK.connectionId AS "connectionId" + FROM + COMPUTE_TASK_INFO + LEFT JOIN TASK + ON TASK.globalTaskId == COMPUTE_TASK_INFO.globalTaskId + LEFT JOIN + STRING_IDS AS NAME_IDS + ON NAME_IDS.id == COMPUTE_TASK_INFO.name + LEFT JOIN + STRING_IDS AS OPTYPE_IDS + ON OPTYPE_IDS.id == COMPUTE_TASK_INFO.opType + LEFT JOIN + STRING_IDS AS TASKTYPE_IDS + ON TASKTYPE_IDS.id == COMPUTE_TASK_INFO.taskType + LEFT JOIN + STRING_IDS AS INPUTSHAPES_IDS + ON INPUTSHAPES_IDS.id == COMPUTE_TASK_INFO.inputShapes + {} + """ + sql = sql.format("WHERE TASK.startNs>=? 
AND TASK.startNs<=?") if self._step_range else sql.format("") + if self._step_range: + all_data = DBManager.fetch_all_data(self.cursor, sql, param=self._step_range) + else: + all_data = DBManager.fetch_all_data(self.cursor, sql) + for data in all_data: + data_bean = KernelBean(data) + if data_bean.connection_id: + self._result_data.update_kernel_dict(data_bean.connection_id, data_bean) + if self._enable_kernel_compare: + self._compute_op_data.append(data_bean) + + def _query_comm_op_data(self): + if self._enable_communication_compare or self._enable_profiling_compare: + sql = """ + SELECT + NAME_IDS.value AS "opName", + COMMUNICATION_OP.opId AS "opId", + TYPE_IDS.value AS "OpType", + round(endNs - startNs) AS "Duration", + GROUP_NAME_IDS.value AS "GroupName", + COMMUNICATION_OP.connectionId AS "connectionId" + FROM + COMMUNICATION_OP + LEFT JOIN + STRING_IDS AS TYPE_IDS + ON TYPE_IDS.id == COMMUNICATION_OP.opType + LEFT JOIN + STRING_IDS AS NAME_IDS + ON NAME_IDS.id == COMMUNICATION_OP.opName + LEFT JOIN + STRING_IDS AS GROUP_NAME_IDS + ON GROUP_NAME_IDS.id == COMMUNICATION_OP.groupName + {} + """ + sql = sql.format("WHERE COMMUNICATION_OP.startNs>=? 
AND COMMUNICATION_OP.startNs<=?") \ + if self._step_range else sql.format("") + if self._step_range: + all_data = DBManager.fetch_all_data(self.cursor, sql, param=self._step_range) + else: + all_data = DBManager.fetch_all_data(self.cursor, sql) + self._comm_op_data = [HcclOpBean(data) for data in all_data] + + def _query_comm_task_data(self): + if self._enable_communication_compare or self._enable_profiling_compare: + sql = """ + SELECT + NAME_IDS.value AS "taskName", + COMMUNICATION_TASK_INFO.opId AS "opId", + round(TASK.endNs - TASK.startNs) AS "Duration", + GROUP_NAME_IDS.value AS "GroupName" + FROM + COMMUNICATION_TASK_INFO + LEFT JOIN + TASK + ON TASK.globalTaskId == COMMUNICATION_TASK_INFO.globalTaskId + LEFT JOIN + STRING_IDS AS NAME_IDS + ON NAME_IDS.id == COMMUNICATION_TASK_INFO.taskType + LEFT JOIN + STRING_IDS AS GROUP_NAME_IDS + ON GROUP_NAME_IDS.id == COMMUNICATION_TASK_INFO.groupName + {} + """ + sql = sql.format("WHERE TASK.startNs>=? AND TASK.startNs<=?") if self._step_range else sql.format("") + if self._step_range: + all_data = DBManager.fetch_all_data(self.cursor, sql, param=self._step_range) + else: + all_data = DBManager.fetch_all_data(self.cursor, sql) + self._comm_task_data = [HcclTaskBean(data) for data in all_data] + + def _query_memory_data(self): + if self._enable_memory_compare: + sql = """ + SELECT + STRING_IDS.value AS "opName", + OP_MEMORY.size AS "size", + OP_MEMORY.allocationTime AS "allocationTime", + OP_MEMORY.releaseTime AS "releaseTime", + OP_MEMORY.duration AS "duration" + FROM + OP_MEMORY + LEFT JOIN + STRING_IDS + ON OP_MEMORY.name == STRING_IDS.id + {} + """ + sql = sql.format( + "WHERE OP_MEMORY.releaseTime>=? AND OP_MEMORY.allocationTime<=? 
ORDER BY OP_MEMORY.releaseTime") \ + if self._step_range else sql.format("ORDER BY OP_MEMORY.releaseTime") + if self._step_range: + memory_data = DBManager.fetch_all_data(self.cursor, sql, param=self._step_range) + else: + memory_data = DBManager.fetch_all_data(self.cursor, sql) + + sql = self.pytorch_api_sql.format( + "AND PYTORCH_API.startNs>=? AND PYTORCH_API.startNs<=?") if len(self._step_range) == 2 else \ + self.pytorch_api_sql.format("") + param = ('queue', self._step_range[0], self._step_range[1]) if len(self._step_range) == 2 else ('queue',) + task_queue_data = DBManager.fetch_all_data(self.cursor, sql, param=param) + queue_dict = {} + for data in task_queue_data: + if data.get("name") == "Enqueue": + queue_dict.setdefault(data.get("connectionId"), {})["enqueue"] = data + else: + queue_dict.setdefault(data.get("connectionId"), {})["dequeue"] = data + task_queue_data = [] + for data in queue_dict.values(): + enqueue_data = data.get("enqueue") + dequeue_data = data.get("dequeue") + if enqueue_data and dequeue_data: + task_queue_data.append( + {Constant.TS: enqueue_data.get("startNs"), Constant.START_NS: dequeue_data.get("startNs"), + Constant.END_NS: dequeue_data.get("endNs")}) + task_queue_data.sort(key=lambda x: x.get(Constant.START_NS)) + + self._update_memory_data(memory_data, task_queue_data) + + def _update_memory_data(self, memory_data, task_queue_data): + task_queue_index = 0 + for op_memory in memory_data: + allocation_time = op_memory.get("allocationTime") if op_memory.get("allocationTime") else 0 + release_time = op_memory.get("releaseTime") if op_memory.get("releaseTime") else 0 + if "cann::" in op_memory.get("opName", ""): + while task_queue_index < len(task_queue_data): + task_queue = task_queue_data[task_queue_index] + if allocation_time < task_queue.get(Constant.START_NS): + break + if allocation_time > task_queue.get(Constant.END_NS): + task_queue_index += 1 + continue + self._result_data.update_memory_list({Constant.SIZE: 
op_memory.get("size"), + Constant.TS: task_queue.get(Constant.TS) / Constant.NS_TO_US, + Constant.ALLOCATION_TIME: allocation_time / Constant.NS_TO_US, + Constant.RELEASE_TIME: release_time / Constant.NS_TO_US}) + break + else: + self._result_data.update_memory_list({Constant.SIZE: op_memory.get("size"), + Constant.TS: allocation_time / Constant.NS_TO_US, + Constant.ALLOCATION_TIME: allocation_time / Constant.NS_TO_US, + Constant.RELEASE_TIME: release_time / Constant.NS_TO_US}) diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_parser.py b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_parser.py index 6e7e9e1c1e5963303ceb2cd73abda2089d7ae069..c543fa614ffa6a0fdd037c8101544e8f9457453e 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_parser.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/profiling_parser/npu_profiling_parser.py @@ -58,7 +58,6 @@ class NPUProfilingParser(BaseProfilingParser): self._hccl_tid_name_dict = {} self._c_core_sqe_list = [] self._c_core_sqe_index = 0 - self._dispatch_func = self._get_dispatch_func() if any((self._enable_profiling_compare, self._enable_operator_compare, self._enable_memory_compare, self._enable_api_compare, self._enable_communication_compare)): self._filter_meta_id() @@ -73,6 +72,34 @@ class NPUProfilingParser(BaseProfilingParser): return Constant.ASCEND_OUTPUT_PATH return Constant.PROFILING_PATH + @staticmethod + def __calculate_uncovered_comm_range(comm_events, uncovered_comm_events): + class Event: + def __init__(self, start_time, end_time): + self.start_time = start_time + self.end_time = end_time + + uncovered_comm_range = [] + index = 0 + for comm_event in comm_events: + while index < len(uncovered_comm_events): + if uncovered_comm_events[index].end_time < comm_event.start_time: + index += 1 + continue + if uncovered_comm_events[index].start_time > comm_event.end_time: + break 
+ if uncovered_comm_events[index].end_time < comm_event.end_time: + uncovered_comm_range.append( + Event(max(comm_event.start_time, uncovered_comm_events[index].start_time), + uncovered_comm_events[index].end_time)) + index += 1 + continue + uncovered_comm_range.append( + Event(max(comm_event.start_time, uncovered_comm_events[index].start_time), + min(comm_event.end_time, uncovered_comm_events[index].end_time))) + break + return uncovered_comm_range + @staticmethod def __calculate_overlap_time_with_uncovered_communication(uncovered_communication_events: list, events: list): overlap_time = 0 @@ -127,6 +154,8 @@ class NPUProfilingParser(BaseProfilingParser): func_list.add(self._picking_flow_event) if self._enable_api_compare: func_list.add(self._picking_torch_op_event) + if self._step_id != Constant.VOID_STEP: + func_list.add(self._picking_torch_op_event) return list(func_list) def _update_kernel_details(self): @@ -251,11 +280,38 @@ class NPUProfilingParser(BaseProfilingParser): self.__add_sdma_time() self.__add_overlap_analysis_time() self.__add_communication_wait_time() + self.__add_uncovered_communication_overlap_time() self._result_data.overall_metrics.calculate_schedule_time() self._result_data.overall_metrics.trans_time_to_s() self._result_data.overall_metrics.calculate_other_time() self._update_bandwidth() + def __add_uncovered_communication_overlap_time(self): + comm_overlap_time_dict = {} + comm_tid_list = list(self._group_comm_tid_dict.keys()) + if not comm_tid_list: + return + uncovered_communication_events = list(filter(lambda x: x.is_comm_not_overlap(), self._overlap_analysis)) + uncovered_communication_events.sort(key=lambda x: x.start_time) + for index, comm_tid in enumerate(comm_tid_list): + if index == len(comm_tid_list) - 1: + continue + for index_2 in range(index + 1, len(comm_tid_list)): + comm_op_events_1 = list(filter(lambda x: x.tid == comm_tid, self._comm_list)) + comm_op_events_1.sort(key=lambda x: x.start_time) + uncovered_comm_op_events_1 = 
self.__calculate_uncovered_comm_range(comm_op_events_1, + uncovered_communication_events) + comm_op_events_2 = list(filter(lambda x: x.tid == comm_tid_list[index_2], self._comm_list)) + comm_op_events_2.sort(key=lambda x: x.start_time) + uncovered_comm_op_events_2 = self.__calculate_uncovered_comm_range(comm_op_events_2, + uncovered_communication_events) + overlap_time = self.__calculate_overlap_time_with_uncovered_communication(uncovered_comm_op_events_1, + uncovered_comm_op_events_2) + if overlap_time: + comm_overlap_time_dict[(self._hccl_tid_name_dict.get(comm_tid), self._hccl_tid_name_dict.get( + comm_tid_list[index_2]))] = overlap_time / Constant.MILLISECONDS_TO_MICROSECONDS + self._result_data.overall_metrics.update_communication_overlap_time(comm_overlap_time_dict) + def __add_communication_wait_time(self): """ Collect the inter-card wait time and transmit time of the uncovered communication time per group. The plane with the longest transmit time is used as the group's inter-card wait time and transmit time. @@ -405,10 +461,6 @@ class NPUProfilingParser(BaseProfilingParser): level = json_data.get('config', {}).get('experimental_config', {}).get('_profiler_level', '') if self.LEVEL_0 != level: return - self._result_data.overall_metrics.is_level0 = True - if self.ACTIVE_CPU in json_data.get('config', {}).get('common_config', {}).get('activities', []): - return - self._result_data.overall_metrics.minimal_profiling = True def __add_lccl_time(self): for event in self._all_kernels.values(): diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/utils/args_manager.py b/profiler/msprof_analyze/compare_tools/compare_backend/utils/args_manager.py index 6ac463982b434901390b4440ef0bcde956d62362..738aaf84ed600bb9b273510a9718c6c1984dd39d 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/utils/args_manager.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/utils/args_manager.py @@ -137,14 +137,27 @@ class ArgsManager: ascend_output = os.path.join(file_path, "ASCEND_PROFILER_OUTPUT") profiler_output = ascend_output if os.path.isdir(ascend_output)
else file_path json_path = os.path.join(profiler_output, "trace_view.json") + db_path = "" if not os.path.isfile(json_path): - msg = (f"The data is not collected by PyTorch Adaptor mode or the data is not parsed. " - f"Invalid profiling path: {profiler_output}") - raise RuntimeError(msg) - path_dict = { - Constant.PROFILING_TYPE: Constant.NPU, Constant.PROFILING_PATH: file_path, - Constant.TRACE_PATH: json_path, Constant.ASCEND_OUTPUT_PATH: profiler_output - } + sub_dirs = os.listdir(profiler_output) + for sub_dir in sub_dirs: + if sub_dir.startswith(("ascend_pytorch_profiler", "ascend_mindspore_profiler")) and sub_dir.endswith( + ".db"): + db_path = os.path.join(profiler_output, sub_dir) + break + if not db_path: + msg = (f"The data is not collected by PyTorch or Mindspore mode or the data is not parsed. " + f"Invalid profiling path: {profiler_output}") + raise RuntimeError(msg) + path_dict = { + Constant.PROFILING_TYPE: Constant.NPU, Constant.PROFILING_PATH: file_path, + Constant.PROFILER_DB_PATH: db_path, Constant.ASCEND_OUTPUT_PATH: profiler_output + } + else: + path_dict = { + Constant.PROFILING_TYPE: Constant.NPU, Constant.PROFILING_PATH: file_path, + Constant.TRACE_PATH: json_path, Constant.ASCEND_OUTPUT_PATH: profiler_output + } sub_dirs = os.listdir(file_path) for dir_name in sub_dirs: if dir_name == "profiler_info.json" or re.match(r"profiler_info_[0-9]+\.json", dir_name): diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/utils/common_func.py b/profiler/msprof_analyze/compare_tools/compare_backend/utils/common_func.py index ac9b4726eadb2cb5998ae10b4eaa995b6d7f0252..b26bdb826250b1b233fc4fa6dcbb8655758a1a8b 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/utils/common_func.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/utils/common_func.py @@ -39,7 +39,7 @@ def convert_to_float(data: any) -> float: try: float_value = float(data) except Exception: - logger.error('Invalid profiling data which failed to 
convert data to float.') + logger.warning('Invalid profiling data which failed to convert data to float.') return 0.0 return float_value @@ -48,8 +48,8 @@ def convert_to_decimal(data: any) -> Decimal: try: decimal_value = Decimal(data) except Exception: - logger.error('Invalid profiling data which failed to convert data to decimal.') - return 0.0 + logger.warning('Invalid profiling data which failed to convert data to decimal.') + return Decimal(0) return decimal_value diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/utils/excel_config.py b/profiler/msprof_analyze/compare_tools/compare_backend/utils/excel_config.py index 75b1c64b9eddaadbe55bebb020597e9d4c572a10..e3a646970d48487c6d50d98fbe21f5f7c9d6f4ea 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/utils/excel_config.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/utils/excel_config.py @@ -320,6 +320,7 @@ class ExcelConfig(object): COMMUNICATION_TIME = "Uncovered Communication Time" WAIT = "\t\tWait" TRANSMIT = "\t\tTransmit" + UNCOVERED_COMM_OVERLAP = "\tUncovered Communication Overlapped" # free time FREE_TIME = "Free Time" diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/utils/torch_op_node.py b/profiler/msprof_analyze/compare_tools/compare_backend/utils/torch_op_node.py index 2b72ad3d990f1bb1a73aa071d359b32d269934b1..00af2be1bd6c82ffbd1219fa6968b4bdccebb8fa 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/utils/torch_op_node.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/utils/torch_op_node.py @@ -46,19 +46,19 @@ class TorchOpNode: @property def input_shape(self): - return str(self._event.args.get("Input Dims", Constant.NA)) + return str(self._event.input_dims) @property def origin_input_shape(self): - return self._event.args.get("Input Dims", Constant.NA) + return self._event.input_dims @property def input_type(self): - return str(self._event.args.get("Input type", Constant.NA)) + return 
str(self._event.input_type) @property def call_stack(self): - return str(self._event.args.get("Call stack", Constant.NA)) + return str(self._event.call_stack) @property def parent(self): diff --git a/profiler/msprof_analyze/compare_tools/compare_backend/utils/tree_builder.py b/profiler/msprof_analyze/compare_tools/compare_backend/utils/tree_builder.py index 6872bed55f67df6e5736be9cdd0ade0d97f4f598..1f0a5614fbc684c9e09b929698715ce468d0a9de 100644 --- a/profiler/msprof_analyze/compare_tools/compare_backend/utils/tree_builder.py +++ b/profiler/msprof_analyze/compare_tools/compare_backend/utils/tree_builder.py @@ -18,6 +18,8 @@ from queue import Queue from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.trace_event_bean import TraceEventBean from msprof_analyze.compare_tools.compare_backend.utils.module_node import ModuleNode from msprof_analyze.compare_tools.compare_backend.utils.torch_op_node import TorchOpNode +from msprof_analyze.compare_tools.compare_backend.compare_bean.origin_data_bean.db_data_bean.framework_api_bean import \ + FrameworkApiBean class TreeBuilder: @@ -43,7 +45,10 @@ class TreeBuilder: last_node.add_child_node(tree_node) last_node = tree_node if kernel_dict: - tree_node.set_kernel_list(kernel_dict.get(event.start_time, [])) + if isinstance(event, FrameworkApiBean): + tree_node.set_kernel_list(kernel_dict.get(event.cann_connection_id, [])) + else: + tree_node.set_kernel_list(kernel_dict.get(event.start_time, [])) else: event.set_name(last_node.name) last_node.set_memory_allocated(event) diff --git a/profiler/msprof_analyze/module_visualization/__init__.py b/profiler/msprof_analyze/module_visualization/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/module_visualization/graph/__init__.py b/profiler/msprof_analyze/module_visualization/graph/__init__.py deleted file mode 100644 index 
e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/module_visualization/graph/prof_node.py b/profiler/msprof_analyze/module_visualization/graph/prof_node.py deleted file mode 100644 index 1f39ee9bfa2dd86cc02ea74400de8bcebdd75e21..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/module_visualization/graph/prof_node.py +++ /dev/null @@ -1,217 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.base_node import BaseNode -from msprof_analyze.prof_common.trace_event_bean import TraceEventBean - - -class ProfNode(BaseNode): - - def __init__(self, event: TraceEventBean, parent_node=None): - super().__init__(event, parent_node) - self._kernel_total_list = [] - self._communication_total_list = [] - self._precision_index = 1 - self._computing_time = 0 - self._uncovered_comm_time = 0 - self._free_time = 0 - self._step_id = None - self._micro_step_id = None - self._bwd_overall_data = {} - - @property - def node_id(self): - return self._event.unique_id - - @property - def node_type(self): - if self._event.event_type is None: - return Constant.VIRTUAL_TYPE - return self._event.event_type - - @property - def step_id(self): - return self._step_id - - @property - def micro_step_id(self): - return self._micro_step_id - - @property - def is_backward(self): - return self.node_id.startswith(Constant.BACKWARD_MODULE) - - @property - def fwd_bwd_id(self): - return self._event.fwd_bwd_id - - @property - def is_bwd(self): - return "BACKWARD" in self.node_id - - @property - def total_kernels(self): - if self.node_type == Constant.VIRTUAL_TYPE: - return [kernel for node in self.child_nodes for kernel in node.total_kernels] - return self._kernel_total_list - - @property - def total_communications(self): - if self.node_type == Constant.VIRTUAL_TYPE: - return [comm for node in self.child_nodes for comm in node.total_communications] - return self._communication_total_list - - @property - def host_total_dur(self): - if self.node_type == Constant.VIRTUAL_TYPE: - return sum((node.host_total_dur for node in self.child_nodes)) - return self._event.dur - - @property - def host_self_dur(self): - if self.node_type == Constant.VIRTUAL_TYPE: - return 0 - return self.host_total_dur - sum((node.host_total_dur for node in self.child_nodes)) - - @property - def device_total_dur(self): - return 
sum((kernel.dur for kernel in self.total_kernels)) - - @property - def device_self_dur(self): - if self.node_type == Constant.VIRTUAL_TYPE: - return 0 - return self.device_total_dur - sum((node.device_total_dur for node in self.child_nodes)) - - @property - def input_data(self) -> dict: - data = {} - input_dim = self._event.args.get("Input Dims") - if input_dim: - data["Input Dims"] = input_dim - input_type = self._event.args.get("Input type") - if input_type: - data["Input type"] = input_type - return data - - @property - def kernel_data(self) -> list: - return [kernel.kernel_info for kernel in self.total_kernels] - - @property - def communication_data(self) -> list: - return [[comm.name, comm.dur] for comm in self.total_communications] - - @property - def overall_data(self): - return {"Computing Time(us)": round(self._computing_time, 3), - "Uncovered Communication Time(us)": round(self._uncovered_comm_time, 3), - "Free Time(us)": round(self._free_time, 3)} - - @property - def data(self): - data = { - "Overall Metrics": self.overall_data} if self.node_type != Constant.OPERATOR_TYPE else {} - if self._bwd_overall_data: - data.update({"Backward Overall Metrics": self._bwd_overall_data}) - data.update({"Input Data": self.input_data, - "precision_index": self.precision_index, - "Host Self Duration(us)": round(self.host_self_dur, 3), - "Host Total Duration(us)": round(self.host_total_dur, 3), - "Device Self Duration(us)": round(self.device_self_dur, 3), - "Device Total Duration(us)": round(self.device_total_dur, 3), - "kernels": self.kernel_data, - "Communications": self.communication_data}) - return data - - @property - def info(self): - info = {"id": self.node_id, - "node_type": self.node_type, - "data": self.data, - "upnode": self.parent_node.node_id if self.parent_node else "None", - "subnodes": [node.node_id for node in iter(self.child_nodes)]} - if self.step_id is not None: - info.update({"step_id": self.step_id}) - if self.micro_step_id is not None: - 
info.update({"micro_step_id": self.micro_step_id}) - return info - - @property - def is_root_node(self): - return self.node_id == Constant.NPU_ROOT_ID - - @property - def precision_index(self): - return self._precision_index - - @precision_index.setter - def precision_index(self, precision_index): - self._precision_index = precision_index - - @step_id.setter - def step_id(self, step_id): - self._step_id = step_id - - @micro_step_id.setter - def micro_step_id(self, micro_step_id): - self._micro_step_id = micro_step_id - - def update_child_nodes(self, node): - self._child_nodes.append(node) - - def reset_child_nodes(self, nodes): - self._child_nodes = nodes - - def update_kernel_total_list(self, kernel_list: list): - self._kernel_total_list.extend(kernel_list) - - def update_communication_total_list(self, communication_list: list): - self._communication_total_list.extend(communication_list) - - def update_child_precision_index(self): - if not self.child_nodes: - return - max_dur = max((node.device_total_dur for node in self.child_nodes)) - min_dur = min((node.device_total_dur for node in self.child_nodes)) - diff_dur = max_dur - min_dur - for node in self.child_nodes: - node.precision_index = 1 - (node.device_total_dur - min_dur) / diff_dur if diff_dur else 1 - - def update_overall_metrics(self, overlap_analysis_event): - if not self.total_kernels and not self.total_communications: - return - device_events = [] - device_events.extend(self.total_kernels) - device_events.extend(self.total_communications) - device_events.sort(key=lambda x: x.start_time) - device_start = device_events[0].start_time - device_end = device_events[-1].end_time - for event in overlap_analysis_event: - if event.start_time >= device_end: - break - if event.end_time <= device_start: - continue - duration_us = float( - min(device_end, event.end_time) - max(device_start, event.start_time)) - if event.name == Constant.COMPUTING_EVENT: - self._computing_time += duration_us - elif event.name == 
Constant.FREE_EVENT: - self._free_time += duration_us - elif event.name == Constant.UNCOVERED_COMMUNICATION_EVENT: - self._uncovered_comm_time += duration_us - - def update_bwd_overall_metrics(self, overall_metrics): - self._bwd_overall_data = overall_metrics diff --git a/profiler/msprof_analyze/module_visualization/graph_build/__init__.py b/profiler/msprof_analyze/module_visualization/graph_build/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/module_visualization/graph_build/fwd_module_node.py b/profiler/msprof_analyze/module_visualization/graph_build/fwd_module_node.py deleted file mode 100644 index 27bb52da7960ccb7f7ac51d92552cf461196903d..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/module_visualization/graph_build/fwd_module_node.py +++ /dev/null @@ -1,33 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.prof_common.base_node import BaseNode -from msprof_analyze.prof_common.trace_event_bean import TraceEventBean - - -class FwdModuleNode(BaseNode): - def __init__(self, event: TraceEventBean, parent_node=None): - super().__init__(event, parent_node) - self._bwd_op_list = [] - - @property - def bwd_op_list(self): - return self._bwd_op_list - - @property - def event(self): - return self._event - - def update_bwd_op(self, bwd_op_list: list): - self._bwd_op_list.extend(bwd_op_list) diff --git a/profiler/msprof_analyze/module_visualization/graph_build/prof_graph_builder.py b/profiler/msprof_analyze/module_visualization/graph_build/prof_graph_builder.py deleted file mode 100644 index a0c00ef92b3e15e2c2c7c81f76d451db5eacb183..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/module_visualization/graph_build/prof_graph_builder.py +++ /dev/null @@ -1,237 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from decimal import Decimal - -from msprof_analyze.module_visualization.graph.prof_node import ProfNode -from msprof_analyze.module_visualization.graph_build.fwd_module_node import FwdModuleNode -from msprof_analyze.prof_common.tree_builder import TreeBuilder -from msprof_analyze.prof_common.trace_event_bean import TraceEventBean -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.module_visualization.prof_parse.prof_data_pre_process import ProfDataPreProcess - - -class ProfGraphBuilder: - - def __init__(self, prof_data_path: str): - self._prof_data_path = prof_data_path - self._prof_data = {} - self._fwd_bwd_id = 1 - - @classmethod - def _create_event_bean_from_ops(cls, op_list: list, name: str) -> TraceEventBean: - min_start = min((op.start_time for op in iter(op_list))) - max_end = max((op.end_time for op in iter(op_list))) - # 以反向算子的区间作为反向module的区间范围,为了module包含算子,做了-0.0001 +0.0001处理 - event = TraceEventBean( - {"ts": min_start - Decimal("0.0001"), "dur": float(max_end - min_start + Decimal("0.0001")), "name": name}) - event.event_type = Constant.MODULE_TYPE - return event - - @classmethod - def _trans_flow_to_dict(cls, flow_events: dict, end_events: list) -> dict: - end_event_dict = {} - for event in end_events: - end_event_dict[event.start_time] = event - result_data = {} - for flow in flow_events.values(): - start_point = flow.get("start") - end_point = flow.get("end") - if not start_point or not end_point: - continue - end_event = end_event_dict.get(end_point.start_time) - if end_event: - result_data.setdefault(start_point.start_time, []).append(end_event) - return result_data - - @classmethod - def _create_virtual_node(cls, all_nodes: list): - root_node = all_nodes[0] - virtual_nodes = [] - first_level_nodes = root_node.child_nodes - root_node.reset_child_nodes([]) - merged_nodes = [] - order_id = 1 - for node in first_level_nodes: - if node.node_type == Constant.OPERATOR_TYPE: - merged_nodes.append(node) - continue - if 
len(merged_nodes) >= 2: - virtual_node = ProfNode(TraceEventBean({"ts": min((node.start_time for node in merged_nodes))}, - f"Operators_Between_Modules_{order_id}"), root_node) - root_node.update_child_nodes(virtual_node) - order_id += 1 - for op_node in merged_nodes: - op_node.parent_node = virtual_node - virtual_node.update_child_nodes(op_node) - virtual_nodes.append(virtual_node) - elif len(merged_nodes) == 1: - root_node.update_child_nodes(merged_nodes[0]) - root_node.update_child_nodes(node) - merged_nodes = [] - if len(merged_nodes) >= 2: - virtual_node = ProfNode(TraceEventBean({"ts": min((node.start_time for node in merged_nodes))}, - f"Operators_Between_Modules_{order_id}"), root_node) - root_node.update_child_nodes(virtual_node) - for op_node in merged_nodes: - op_node.parent_node = virtual_node - virtual_node.update_child_nodes(op_node) - virtual_nodes.append(virtual_node) - elif len(merged_nodes) == 1: - root_node.update_child_nodes(merged_nodes[0]) - all_nodes.extend(virtual_nodes) - - @classmethod - def _set_event_order_id(cls, all_events: list): - name_dict = {} - for event in all_events: - order_id = name_dict.get(event.name, 0) - event.set_id(f"{event.name}_{order_id}") - name_dict[event.name] = order_id + 1 - - def build_graph(self): - self._prof_data = ProfDataPreProcess(self._prof_data_path).run() - all_data = [*self._prof_data.get(Constant.MODULE_EVENT, []), - *self.find_bwd_module(), - *self._prof_data.get(Constant.CPU_OP_EVENT, [])] - all_data.sort(key=lambda x: x.start_time) - self._set_event_order_id(all_data) - all_nodes = TreeBuilder.build_tree(all_data, ProfNode, TraceEventBean({}, Constant.NPU_ROOT_ID)) - if len(all_nodes) < 2: - msg = "Failed to build graph." 
- raise RuntimeError(msg) - self._update_kernel_details(all_nodes[0]) - self._update_communication_details(all_nodes[0]) - self._create_virtual_node(all_nodes) - self._update_precision_index_and_overall_metrics(all_nodes) - self._update_step_info(all_nodes[0]) - return all_nodes - - def find_bwd_module(self) -> list: - bwd_module_list = [] - fwdbwd_flow = self._prof_data.get(Constant.FWD_BWD_FLOW, {}) - fwdbwd_flow = {key: value for key, value in fwdbwd_flow.items() if - value.get("start") and value.get("end") and value.get("start").tid != value.get("end").tid} - module_list = self._prof_data.get(Constant.MODULE_EVENT, []) - cpu_op_list = self._prof_data.get(Constant.CPU_OP_EVENT, []) - if not fwdbwd_flow or not module_list or not cpu_op_list: - return bwd_module_list - fwd_tid = module_list[0].tid - bwd_tid = fwd_tid - for end_point in (flow.get("end") for flow in fwdbwd_flow.values()): - if end_point: - bwd_tid = end_point.tid - break - if fwd_tid == bwd_tid: - return bwd_module_list - # 将每一个反向包成一个module,名字叫“nn.Module: BACKWARD_0” - cpu_op_list.sort(key=lambda x: x.start_time) - pre_status = Constant.FWD_OR_OPT - bwd_op_list = [] - for op in cpu_op_list: - if op.tid == bwd_tid: - bwd_op_list.append(op) - pre_status = Constant.BACKWARD - continue - elif pre_status == Constant.BACKWARD: - bwd_module_list.append(self._create_event_bean_from_ops(bwd_op_list, Constant.BACKWARD_MODULE)) - bwd_module_list.extend(self._match_fwd_module(module_list, fwdbwd_flow, bwd_op_list)) - bwd_op_list.clear() - pre_status = Constant.FWD_OR_OPT - if bwd_op_list: - bwd_module_list.append(self._create_event_bean_from_ops(bwd_op_list, Constant.BACKWARD_MODULE)) - bwd_module_list.extend(self._match_fwd_module(module_list, fwdbwd_flow, bwd_op_list)) - bwd_op_list.clear() - return bwd_module_list - - def _match_fwd_module(self, module_list, fwdbwd_flow, bwd_op_list): - # 通过连线匹配正向module,构建出反向的整体module关系 - bwd_module_list = [] - all_nodes = TreeBuilder.build_tree(module_list, FwdModuleNode, 
TraceEventBean({})) - root_node = all_nodes[0] - fwdbwd_flow_dict = self._trans_flow_to_dict(fwdbwd_flow, bwd_op_list) - for start_time, end_events in fwdbwd_flow_dict.items(): - matched_node = root_node.binary_search(start_time) - while matched_node != Constant.INVALID_RETURN: - matched_node.update_bwd_op(end_events) - matched_node = matched_node.binary_search(start_time) - for module_node in all_nodes: - if module_node.bwd_op_list: - module_node.event.fwd_bwd_id = self._fwd_bwd_id - bwd_module_list.append( - self._create_event_bean_from_ops(module_node.bwd_op_list, f"{module_node.name} [BACKWARD]")) - bwd_module_list[-1].fwd_bwd_id = self._fwd_bwd_id - self._fwd_bwd_id += 1 - return bwd_module_list - - def _update_kernel_details(self, root_node): - kernel_flow_dict = self._trans_flow_to_dict(self._prof_data.get(Constant.TORCH_TO_NPU_FLOW, {}), - self._prof_data.get(Constant.KERNEL_EVENT, [])) - for start_time, kernels in kernel_flow_dict.items(): - matched_node = root_node.binary_search(start_time) - while matched_node != Constant.INVALID_RETURN: - matched_node.update_kernel_total_list(kernels) - matched_node = matched_node.binary_search(start_time) - - def _update_communication_details(self, root_node): - communication_flow_dict = self._trans_flow_to_dict(self._prof_data.get(Constant.TORCH_TO_NPU_FLOW, {}), - self._prof_data.get(Constant.HCCL_EVENT, [])) - for start_time, communications in communication_flow_dict.items(): - matched_node = root_node.binary_search(start_time) - while matched_node != Constant.INVALID_RETURN: - matched_node.update_communication_total_list(communications) - matched_node = matched_node.binary_search(start_time) - - def _update_step_info(self, root_node): - first_level_nodes = root_node.child_nodes - step_events = self._prof_data.get(Constant.STEP_EVENT, []) - node_dict = {} - if not step_events: - node_dict[None] = first_level_nodes - else: - for node in first_level_nodes: - for step_event in step_events: - if step_event.start_time <= 
node.start_time <= step_event.end_time: - node.step_id = step_event.step_id - node_dict.setdefault(step_event.step_id, []).append(node) - break - for nodes in node_dict.values(): - micro_step_list = [] - micro_events = [] - for node in nodes: - micro_events.append(node) - if node.is_backward: - micro_step_list.append(micro_events) - micro_events = [] - if micro_step_list: - micro_step_list[-1].extend(micro_events) - else: - micro_step_list.append(micro_events) - for index, micro_events in enumerate(micro_step_list): - for node in micro_events: - node.micro_step_id = index - - def _update_precision_index_and_overall_metrics(self, all_nodes: list): - overlap_analysis_event = self._prof_data.get(Constant.OVERLAP_ANALYSIS_EVENT, []) - overlap_analysis_event.sort(key=lambda x: x.start_time) - bwd_infos = {} - for node in all_nodes: - node.update_child_precision_index() - if node.node_type != Constant.OPERATOR_TYPE: - node.update_overall_metrics(overlap_analysis_event) - if node.is_bwd and node.fwd_bwd_id: - bwd_infos[node.fwd_bwd_id] = node.overall_data - for node in all_nodes: - if node.node_type != Constant.OPERATOR_TYPE and not node.is_bwd: - node.update_bwd_overall_metrics(bwd_infos.get(node.fwd_bwd_id, {})) diff --git a/profiler/msprof_analyze/module_visualization/prof_graph_export.py b/profiler/msprof_analyze/module_visualization/prof_graph_export.py deleted file mode 100644 index acb178f7e7e60ea733f93ccbcb7bccdbae458442..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/module_visualization/prof_graph_export.py +++ /dev/null @@ -1,58 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -import logging -import os.path -from datetime import datetime - -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.file_reader import FileReader -from msprof_analyze.prof_common.path_manager import PathManager -from msprof_analyze.module_visualization.graph_build.prof_graph_builder import ProfGraphBuilder - - -class ProfGraphExport: - @classmethod - def export_to_json(cls, prof_data_path: str, output_path: str): - logging.basicConfig(format="%(asctime)s - %(levelname)s - %(message)s") - output_path = os.path.abspath(output_path) - prof_data_path = os.path.abspath(prof_data_path) - try: - PathManager.input_path_common_check(prof_data_path) - PathManager.check_input_directory_path(output_path) - PathManager.make_dir_safety(output_path) - PathManager.check_path_writeable(output_path) - except RuntimeError as err: - logging.error(err) - try: - cls.generate_graph_data(prof_data_path, output_path) - except RuntimeError as err: - logging.error(err) - - @classmethod - def generate_graph_data(cls, prof_data_path: str, output_path: str): - all_nodes = ProfGraphBuilder(prof_data_path).build_graph() - result_data = {"root": Constant.NPU_ROOT_ID, "node": {}} - for node in all_nodes: - result_data["node"][node.node_id] = node.info - step_list = list(set([node.step_id for node in all_nodes[0].child_nodes if node.step_id is not None])) - if step_list: - result_data["StepList"] = step_list - micro_steps = len( - set([node.micro_step_id for node in all_nodes[0].child_nodes if node.micro_step_id is not None])) - 
result_data["MicroSteps"] = micro_steps - file_name = "prof_graph_json_{}.vis".format(datetime.utcnow().strftime("%Y%m%d%H%M%S%f")[:-3]) - FileReader.write_json_file(output_path, result_data, file_name) - logging.info("Performance data has been converted into a graph-structured file: %s", - os.path.join(output_path, file_name)) diff --git a/profiler/msprof_analyze/module_visualization/prof_parse/__init__.py b/profiler/msprof_analyze/module_visualization/prof_parse/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/module_visualization/prof_parse/prof_data_pre_process.py b/profiler/msprof_analyze/module_visualization/prof_parse/prof_data_pre_process.py deleted file mode 100644 index 2d39649d58d543fc295bdc0296bd244c25f506f3..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/module_visualization/prof_parse/prof_data_pre_process.py +++ /dev/null @@ -1,137 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-import logging -import os - -from msprof_analyze.prof_common.file_reader import FileReader -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.kernel_bean import KernelBean -from msprof_analyze.prof_common.trace_event_bean import TraceEventBean - - -class ProfDataPreProcess: - def __init__(self, prof_data_path: str): - self._prof_data_path = prof_data_path - self._trace_path = "" - self._kernel_details_path = "" - self._kernel_pid = None - self._hccl_pid = None - self._overlap_analysis_pid = None - self._result_data = {Constant.CPU_OP_EVENT: [], Constant.MODULE_EVENT: [], Constant.KERNEL_EVENT: [], - Constant.TORCH_TO_NPU_FLOW: {}, Constant.FWD_BWD_FLOW: {}, Constant.HCCL_EVENT: [], - Constant.OVERLAP_ANALYSIS_EVENT: [], Constant.STEP_EVENT: []} - - @staticmethod - def _check_trace_data(trace_data): - if not isinstance(trace_data, list): - msg = f"Invalid profiling data path, this feature only supports performance data " \ - f"collected by Ascend PyTorch Profiler." - raise RuntimeError(msg) - - def run(self) -> dict: - self._check_trace_path() - self._parse_trace_events() - self._parse_kernel_details() - self._check_result_data() - return self._result_data - - def _check_trace_path(self): - if os.path.isfile(self._prof_data_path): - (split_file_path, split_file_name) = os.path.split(self._prof_data_path) - (shot_name, extension) = os.path.splitext(split_file_name) - if extension != ".json": - msg = f"Invalid profiling path suffix: {self._prof_data_path}. " \ - f"You should input in a json file path, such as trace_view.json." - raise RuntimeError(msg) - self._trace_path = self._prof_data_path - return - ascend_output = os.path.join(self._prof_data_path, "ASCEND_PROFILER_OUTPUT") - profiler_output = ascend_output if os.path.isdir(ascend_output) else self._prof_data_path - json_path = os.path.join(profiler_output, "trace_view.json") - if not os.path.isfile(json_path): - msg = f"Invalid profiling path: {self._prof_data_path}. 
The data path should be the " \ - f"folder that ends with the ascend_pt collected by the Ascend PyTorch Profiler." - raise RuntimeError(msg) - kernel_path = os.path.join(profiler_output, "kernel_details.csv") - if os.path.isfile(kernel_path): - self._kernel_details_path = kernel_path - self._trace_path = json_path - - def _parse_trace_events(self): - trace_data = FileReader.read_json_file(self._trace_path) - self._check_trace_data(trace_data) - iter_trace_data = [TraceEventBean(data) for data in trace_data] - for event in iter_trace_data: - if self._kernel_pid is not None and self._hccl_pid is not None and self._overlap_analysis_pid is not None: - break - if not event.is_meta(): - continue - if event.is_npu_process(): - self._kernel_pid = event.pid - elif event.is_hccl_process(): - self._hccl_pid = event.pid - elif event.is_overlap_analysis_process(): - self._overlap_analysis_pid = event.pid - if self._kernel_pid is None: - msg = "There is no operator on the NPU side for this data, please check whether the NPU switch is enabled." 
- raise RuntimeError(msg) - for event in iter_trace_data: - if event.is_optimizer(): - event.event_type = Constant.MODULE_TYPE - self._result_data[Constant.MODULE_EVENT].append(event) - elif event.is_cpu_op(): - if event.is_step(): - self._result_data[Constant.STEP_EVENT].append(event) - else: - event.event_type = Constant.OPERATOR_TYPE - self._result_data[Constant.CPU_OP_EVENT].append(event) - elif event.is_nn_module(): - event.event_type = Constant.MODULE_TYPE - self._result_data[Constant.MODULE_EVENT].append(event) - elif event.is_torch_to_npu(): - if event.is_flow_start(): - self._result_data[Constant.TORCH_TO_NPU_FLOW].setdefault(event.id, {})["start"] = event - else: - self._result_data[Constant.TORCH_TO_NPU_FLOW].setdefault(event.id, {})["end"] = event - elif event.is_fwd_bwd_flow(): - if event.is_flow_start(): - self._result_data[Constant.FWD_BWD_FLOW].setdefault(event.id, {})["start"] = event - else: - self._result_data[Constant.FWD_BWD_FLOW].setdefault(event.id, {})["end"] = event - elif event.is_kernel_event(self._kernel_pid): - self._result_data[Constant.KERNEL_EVENT].append(event) - elif event.is_hccl_event(self._hccl_pid): - self._result_data[Constant.HCCL_EVENT].append(event) - elif event.is_overlap_analysis_event(self._overlap_analysis_pid): - self._result_data[Constant.OVERLAP_ANALYSIS_EVENT].append(event) - - def _parse_kernel_details(self): - if not self._kernel_details_path: - return - try: - all_kernels = FileReader.read_csv_file(self._kernel_details_path, KernelBean) - except Exception as e: - logging.error(e) - kernels = list(filter(lambda x: x.is_computing_op, all_kernels)) - if kernels: - self._result_data[Constant.KERNEL_EVENT] = kernels - - def _check_result_data(self): - if not self._result_data.get(Constant.CPU_OP_EVENT): - msg = "This data does not have any aten operator, please make sure to enable the CPU switch." 
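- raise RuntimeError(msg) - if not [event for event in self._result_data.get(Constant.MODULE_EVENT) if event.is_nn_module()]: - msg = "This data does not collect any modules, please make sure to enable the with_stack or with_modules." - raise RuntimeError(msg) diff --git a/profiler/msprof_analyze/osrt_trace/README.md b/profiler/msprof_analyze/osrt_trace/README.md deleted file mode 100644 index 0ffb70415c60922fdff1c9de07313f8ee81f6aed..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/README.md +++ /dev/null @@ -1,157 +0,0 @@ -# MSOSRT Trace: System Library Function Latency Detection - -OSRT (OS runtime libraries trace) collects call information for user-level library function APIs from the Linux operating system runtime libraries. MSOSRT (MindStudio OSRT) collects typical high-latency interfaces in the Linux C library and the POSIX threads (pthread) library, that is, functions that may block the user process (such as read, ioctl, and pthread_mutex_lock), and records their durations to help users analyze why a process is blocked. - -## Usage - -1. Constraints: only Linux is supported, and a g++ build environment plus standard libraries such as glibc and pthread are required. -2. Download the mstt repository locally, enter the profiler/msprof_analyze/osrt_trace directory, and run `bash build.sh` to generate `libmsosrt_trace.so`. -3. Run `export LD_PRELOAD=./libmsosrt_trace.so:$LD_PRELOAD` to add `libmsosrt_trace.so` to the LD_PRELOAD environment variable. -4. Set the environment variables for the detection threshold and the export directory: - - ```bash - # Detection threshold: positive integer; only library calls exceeding the threshold are recorded. Unit: ns; default: 10000000 - export MSOSRT_TRACE_THRESHOLD=10000000 - # Export directory: string; the directory where detection results are exported. Default: current directory - export MSOSRT_EXPORT_PATH="./osrt_trace_result" - ``` - -5. Run the user process, for example `python main.py`. - -6. 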
After the user process finishes, the detection results are generated under the MSOSRT_EXPORT_PATH path as a file named msosrt_trace\_{pid}\_{process name}.csv, for example `msosrt_trace_2328177_python3.csv`. The file contains the pid, tid, function name, start time, and duration, as shown below: - - | Pid | Tid | Function | StartTime(ns) | Duration(ns) | - | ------: | ------: | ----------------: | ------------------: | -----------: | - | 2328177 | 2328280 | pthread_cond_wait | 1725398310787080000 | 3088062410 | - | 2328177 | 2328282 | pthread_cond_wait | 1725398310787170000 | 3087994240 | - | 2328177 | 2328480 | read | 1725398318916180000 | 100509970 | - | 2328177 | 2328440 | ioctl | 1725398319218640000 | 512040720 | - | 2328177 | 2328177 | free | 1725398330504550000 | 56386880 | - -## Detected Interfaces - -MSOSRT can detect the following operating system library functions: - -- Memory operations - - ```c - malloc - realloc - free - mmap - munmap - mremap - msync - mprotect - brk - ``` - -- File operations - - ```c - dup - dup2 - dup3 - tee - splice - fallocate - fdatasync - fsync - fcntl - flock - lockf - truncate - ftruncate - ioctl - open - openat - pipe - pipe2 - mkfifo - mkfifoat - read - pread - readv - preadv - preadv2 - write - pwrite - writev - pwritev - pwritev2 - copy_file_range - sync - syncfs - sync_file_range - vmsplice - process_vm_readv - process_vm_writev - fclose - fcloseall - fflush - fgetc - fgets - fputc - fputs - flockfile - ftrylockfile - funlockfile - fopen - freopen - fread - fwrite - getdelim - getline - getc - putc - getc_unlocked - putc_unlocked - fflush_unlocked - fgetc_unlocked - fputc_unlocked - fread_unlocked - fwrite_unlocked - fgets_unlocked - fputs_unlocked - ``` - -- Network operations - - ```c - socket - socketpair - epoll_ctl - epoll_wait - epoll_pwait - select - listen - accept - accept4 - bind - poll - ppoll - send - sendto - sendmsg - sendmmsg - sendfile - recv - recvfrom - recvmsg - recvmmsg - ``` - -- Thread operations - - ```c - pthread_mutex_lock - pthread_mutex_timedlock - pthread_cond_signal - pthread_cond_broadcast - pthread_cond_wait - pthread_cond_timedwait - pthread_rwlock_rdlock - pthread_rwlock_timedrdlock - pthread_rwlock_wrlock - pthread_rwlock_timedwrlock - ``` \ 
No newline at end of file diff --git a/profiler/msprof_analyze/osrt_trace/build.sh b/profiler/msprof_analyze/osrt_trace/build.sh deleted file mode 100644 index bb153e6247122c922dc5cea247be43bfec3d5430..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/build.sh +++ /dev/null @@ -1 +0,0 @@ -g++ ./src/*.cpp -std=c++11 -fPIC -fstack-protector-all -fno-strict-aliasing -fno-common -fvisibility=hidden -fvisibility-inlines-hidden -Wfloat-equal -Wextra -O2 -shared -lpthread -ldl -o libmsosrt_trace.so \ No newline at end of file diff --git a/profiler/msprof_analyze/osrt_trace/src/file_func.cpp b/profiler/msprof_analyze/osrt_trace/src/file_func.cpp deleted file mode 100644 index 319dcb227b139adf158d55fe762f97afdfa5fdd8..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/file_func.cpp +++ /dev/null @@ -1,664 +0,0 @@ -#include "file_func.h" - -#include - -#include "msosrt_trace.h" - -void FileFuncProxy::loadFunc() -{ - LOAD_FUNC(dup, DupFunc); - LOAD_FUNC(dup2, Dup2Func); - LOAD_FUNC(dup3, Dup3Func); - LOAD_FUNC(tee, TeeFunc); - LOAD_FUNC(splice, SpliceFunc); - LOAD_FUNC(fallocate, FallocateFunc); - LOAD_FUNC(fdatasync, FdatasyncFunc); - LOAD_FUNC(fsync, FsyncFunc); - LOAD_FUNC(fcntl, FcntlFunc); - LOAD_FUNC(flock, FlockFunc); - LOAD_FUNC(lockf, LockfFunc); - LOAD_FUNC(truncate, TruncateFunc); - LOAD_FUNC(ftruncate, FtruncateFunc); - LOAD_FUNC(ioctl, IoctlFunc); - LOAD_FUNC(open, OpenFunc); - LOAD_FUNC(openat, OpenatFunc); - LOAD_FUNC(pipe, PipeFunc); - LOAD_FUNC(pipe2, Pipe2Func); - LOAD_FUNC(mkfifo, MkfifoFunc); - LOAD_FUNC(mkfifoat, MkfifoatFunc); - LOAD_FUNC(read, ReadFunc); - LOAD_FUNC(pread, PreadFunc); - LOAD_FUNC(readv, ReadvFunc); - LOAD_FUNC(preadv, PreadvFunc); - LOAD_FUNC(preadv2, Preadv2Func); - LOAD_FUNC(write, WriteFunc); - LOAD_FUNC(pwrite, PwriteFunc); - LOAD_FUNC(writev, WritevFunc); - LOAD_FUNC(pwritev, PwritevFunc); - LOAD_FUNC(pwritev2, Pwritev2Func); - LOAD_FUNC(copy_file_range, 
CopyFileRangeFunc); - LOAD_FUNC(sync, SyncFunc); - LOAD_FUNC(syncfs, SyncfsFunc); - LOAD_FUNC(sync_file_range, SyncFileRangeFunc); - LOAD_FUNC(vmsplice, VmspliceFunc); - LOAD_FUNC(process_vm_readv, ProcessVmReadvFunc); - LOAD_FUNC(process_vm_writev, ProcessVmWritevFunc); - LOAD_FUNC(fclose, FcloseFunc); - LOAD_FUNC(fcloseall, FcloseallFunc); - LOAD_FUNC(fflush, FflushFunc); - LOAD_FUNC(fgetc, FgetcFunc); - LOAD_FUNC(fgets, FgetsFunc); - LOAD_FUNC(fputc, FputcFunc); - LOAD_FUNC(fputs, FputsFunc); - LOAD_FUNC(flockfile, FlockfileFunc); - LOAD_FUNC(ftrylockfile, FtrylockfileFunc); - LOAD_FUNC(funlockfile, FunlockfileFunc); - LOAD_FUNC(fopen, FopenFunc); - LOAD_FUNC(freopen, FreopenFunc); - LOAD_FUNC(fread, FreadFunc); - LOAD_FUNC(fwrite, FwriteFunc); - LOAD_FUNC(getdelim, GetdelimFunc); - LOAD_FUNC(getline, GetlineFunc); - LOAD_FUNC(getc, GetcFunc); - LOAD_FUNC(putc, PutcFunc); - LOAD_FUNC(getc_unlocked, GetcUnlockedFunc); - LOAD_FUNC(putc_unlocked, PutcUnlockedFunc); - LOAD_FUNC(fflush_unlocked, FflushUnlockedFunc); - LOAD_FUNC(fgetc_unlocked, FgetcUnlockedFunc); - LOAD_FUNC(fputc_unlocked, FputcUnlockedFunc); - LOAD_FUNC(fread_unlocked, FreadUnlockedFunc); - LOAD_FUNC(fwrite_unlocked, FwriteUnlockedFunc); - LOAD_FUNC(fgets_unlocked, FgetsUnlockedFunc); - LOAD_FUNC(fputs_unlocked, FputsUnlockedFunc); -} - -int dup(int oldfd) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_dup(oldfd); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int dup2(int oldfd, int newfd) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_dup2(oldfd, newfd); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int dup3(int oldfd, int newfd, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = 
global_osrt_func.file_func.real_dup3(oldfd, newfd, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t tee(int fd_in, int fd_out, size_t len, unsigned int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_tee(fd_in, fd_out, len, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t splice(int fd_in, off_t* off_in, int fd_out, off_t* off_out, size_t len, unsigned int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_splice(fd_in, off_in, fd_out, off_out, len, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fallocate(int fd, int mode, off_t offset, off_t len) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fallocate(fd, mode, offset, len); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fdatasync(int fildes) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fdatasync(fildes); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fsync(int fd) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fsync(fd); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fcntl(int fd, int op, ...) 
-{ - global_osrt_func.loadFunc(); - va_list args; - va_start(args, op); - void* arg = va_arg(args, void*); - va_end(args); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fcntl(fd, op, arg); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int flock(int fd, int op) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_flock(fd, op); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int lockf(int fd, int op, off_t len) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_lockf(fd, op, len); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int truncate(const char* path, off_t length) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_truncate(path, length); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int ftruncate(int fildes, off_t length) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_ftruncate(fildes, length); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int ioctl(int fd, int op, ...) -{ - global_osrt_func.loadFunc(); - va_list args; - va_start(args, op); - void* arg = va_arg(args, void*); - va_end(args); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_ioctl(fd, op, arg); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int open(const char* pathname, int flags, ...) 
-{ - global_osrt_func.loadFunc(); - va_list args; - va_start(args, flags); - mode_t arg = va_arg(args, mode_t); - va_end(args); - uint64_t start_time = nsec_now(); - auto ret = arg ? global_osrt_func.file_func.real_open(pathname, flags, arg) : global_osrt_func.file_func.real_open(pathname, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int openat(int dirfd, const char *pathname, int flags, ...) -{ - global_osrt_func.loadFunc(); - va_list args; - va_start(args, flags); - mode_t arg = va_arg(args, mode_t); - va_end(args); - uint64_t start_time = nsec_now(); - auto ret = arg ? global_osrt_func.file_func.real_openat(dirfd, pathname, flags, arg) : global_osrt_func.file_func.real_openat(dirfd, pathname, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pipe(int pipefd[2]) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_pipe(pipefd); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pipe2(int pipefd[2], int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_pipe2(pipefd, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int mkfifo(const char* pathname, mode_t mode) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_mkfifo(pathname, mode); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int mkfifoat(int dirfd, const char* pathname, mode_t mode) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_mkfifoat(dirfd, pathname, mode); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; 
-} - -ssize_t read(int fd, void* buf, size_t count) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_read(fd, buf, count); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t pread(int fd, void* buf, size_t count, off_t offset) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_pread(fd, buf, count, offset); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t readv(int fd, const struct iovec* iov, int iovcnt) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_readv(fd, iov, iovcnt); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t preadv(int fd, const struct iovec* iov, int iovcnt, off_t offset) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_preadv(fd, iov, iovcnt, offset); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t preadv2(int fd, const struct iovec* iov, int iovcnt, off_t offset, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_preadv2(fd, iov, iovcnt, offset, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t write(int fd, const void* buf, size_t count) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_write(fd, buf, count); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t pwrite(int fd, const void* buf, size_t count, off_t offset) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); 
- auto ret = global_osrt_func.file_func.real_pwrite(fd, buf, count, offset); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t writev(int fd, const struct iovec* iov, int iovcnt) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_writev(fd, iov, iovcnt); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t pwritev(int fd, const struct iovec* iov, int iovcnt, off_t offset) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_pwritev(fd, iov, iovcnt, offset); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t pwritev2(int fd, const struct iovec* iov, int iovcnt, off_t offset, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_pwritev2(fd, iov, iovcnt, offset, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t copy_file_range(int fd_in, off_t* off_in, int fd_out, off_t* off_out, size_t len, unsigned int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_copy_file_range(fd_in, off_in, fd_out, off_out, len, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void sync(void) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - global_osrt_func.file_func.real_sync(); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); -} - -int syncfs(int fd) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_syncfs(fd); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; 
-} - -int sync_file_range(int fd, off_t offset, off_t nbytes, unsigned int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_sync_file_range(fd, offset, nbytes, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t vmsplice(int fd, const struct iovec* iov, size_t nr_segs, unsigned int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_vmsplice(fd, iov, nr_segs, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t process_vm_readv(pid_t pid, const struct iovec* local_iov, unsigned long liovcnt, - const struct iovec* remote_iov, unsigned long riovcnt, unsigned long flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_process_vm_readv(pid, local_iov, liovcnt, remote_iov, riovcnt, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t process_vm_writev(pid_t pid, const struct iovec* local_iov, unsigned long liovcnt, - const struct iovec* remote_iov, unsigned long riovcnt, unsigned long flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_process_vm_writev(pid, local_iov, liovcnt, remote_iov, riovcnt, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fclose(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fclose(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fcloseall(void) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = 
global_osrt_func.file_func.real_fcloseall(); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fflush(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fflush(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fgetc(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fgetc(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -char* fgets(char* s, int size, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - char* ret = global_osrt_func.file_func.real_fgets(s, size, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fputc(int c, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fputc(c, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fputs(const char* s, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fputs(s, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void flockfile(FILE* filehandle) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - global_osrt_func.file_func.real_flockfile(filehandle); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); -} - -int ftrylockfile(FILE* filehandle) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_ftrylockfile(filehandle); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - 
return ret; -} - -void funlockfile(FILE* filehandle) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - global_osrt_func.file_func.real_funlockfile(filehandle); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); -} - -FILE* fopen(const char* pathname, const char* mode) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fopen(pathname, mode); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -FILE* freopen(const char* pathname, const char* mode, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_freopen(pathname, mode, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -size_t fread(void* ptr, size_t size, size_t nmemb, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fread(ptr, size, nmemb, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -size_t fwrite(const void* ptr, size_t size, size_t nitems, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fwrite(ptr, size, nitems, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t getdelim(char** lineptr, size_t* n, int delimiter, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_getdelim(lineptr, n, delimiter, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t getline(char** lineptr, size_t* n, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = 
global_osrt_func.file_func.real_getline(lineptr, n, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int getc(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_getc(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int putc(int c, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_putc(c, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int getc_unlocked(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_getc_unlocked(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int putc_unlocked(int c, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_putc_unlocked(c, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fflush_unlocked(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fflush_unlocked(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fgetc_unlocked(FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fgetc_unlocked(stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fputc_unlocked(int c, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fputc_unlocked(c, stream); - 
global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -size_t fread_unlocked(void* ptr, size_t size, size_t n, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fread_unlocked(ptr, size, n, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -size_t fwrite_unlocked(const void* ptr, size_t size, size_t n, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fwrite_unlocked(ptr, size, n, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -char* fgets_unlocked(char* s, int n, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - char* ret = global_osrt_func.file_func.real_fgets_unlocked(s, n, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int fputs_unlocked(const char* s, FILE* stream) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.file_func.real_fputs_unlocked(s, stream); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} diff --git a/profiler/msprof_analyze/osrt_trace/src/file_func.h b/profiler/msprof_analyze/osrt_trace/src/file_func.h deleted file mode 100644 index 23c6a25eeeddd734a1ab10ecfcb7d3035d2f9a6a..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/file_func.h +++ /dev/null @@ -1,144 +0,0 @@ -#pragma once - -#ifndef _GNU_SOURCE -#define _GNU_SOURCE -#endif - -#include -#include -#include - -using DupFunc = int(*)(int); -using Dup2Func = int(*)(int, int); -using Dup3Func = int(*)(int, int, int); -using TeeFunc = ssize_t(*)(int, int, size_t, unsigned int); -using SpliceFunc = ssize_t(*)(int, off_t*, int, off_t*, size_t, 
unsigned int); -using FallocateFunc = int(*)(int, int, off_t, off_t); -using FdatasyncFunc = int(*)(int); -using FsyncFunc = int(*)(int); -using FcntlFunc = int(*)(int, int, ...); -using FlockFunc = int(*)(int, int); -using LockfFunc = int(*)(int, int, off_t); -using TruncateFunc = int(*)(const char*, off_t); -using FtruncateFunc = int(*)(int, off_t); -using IoctlFunc = int(*)(int, int, ...); -using OpenFunc = int(*)(const char*, int, ...); -using OpenatFunc = int(*)(int, const char*, int, ...); -using PipeFunc = int(*)(int*); -using Pipe2Func = int(*)(int*, int); -using MkfifoFunc = int(*)(const char*, mode_t); -using MkfifoatFunc = int(*)(int, const char*, mode_t); -using ReadFunc = ssize_t(*)(int, void*, size_t); -using PreadFunc = ssize_t(*)(int, void*, size_t, off_t); -using ReadvFunc = ssize_t(*)(int, const struct iovec*, int); -using PreadvFunc = ssize_t(*)(int, const struct iovec*, int, off_t); -using Preadv2Func = ssize_t(*)(int, const struct iovec*, int, off_t, int); -using WriteFunc = ssize_t(*)(int, const void*, size_t); -using PwriteFunc = ssize_t(*)(int, const void*, size_t, off_t); -using WritevFunc = ssize_t(*)(int, const struct iovec*, int); -using PwritevFunc = ssize_t(*)(int, const struct iovec*, int, off_t); -using Pwritev2Func = ssize_t(*)(int, const struct iovec*, int, off_t, int); -using CopyFileRangeFunc = ssize_t(*)(int, off_t*, int, off_t*, size_t, unsigned int); -using SyncFunc = void(*)(void); -using SyncfsFunc = int(*)(int); -using SyncFileRangeFunc = int(*)(int, off_t, off_t, unsigned int); -using VmspliceFunc = ssize_t(*)(int, const struct iovec*, size_t, unsigned int); -using ProcessVmReadvFunc = ssize_t(*)(pid_t, const struct iovec*, unsigned long, const struct iovec*, unsigned long, unsigned long); -using ProcessVmWritevFunc = ssize_t(*)(pid_t, const struct iovec*, unsigned long, const struct iovec*, unsigned long, unsigned long); -using FcloseFunc = int(*)(FILE*); -using FcloseallFunc = int(*)(void); -using FflushFunc = 
int(*)(FILE*); -using FgetcFunc = int(*)(FILE*); -using FgetsFunc = char*(*)(char*, int, FILE*); -using FputcFunc = int(*)(int, FILE*); -using FputsFunc = int(*)(const char*, FILE*); -using FlockfileFunc = void(*)(FILE*); -using FtrylockfileFunc = int(*)(FILE*); -using FunlockfileFunc = void(*)(FILE*); -using FopenFunc = FILE*(*)(const char*, const char*); -using FreopenFunc = FILE*(*)(const char*, const char*, FILE*); -using FreadFunc = size_t(*)(void*, size_t, size_t, FILE*); -using FwriteFunc = size_t(*)(const void*, size_t, size_t, FILE*); -using GetdelimFunc = ssize_t(*)(char**, size_t*, int, FILE*); -using GetlineFunc = ssize_t(*)(char**, size_t*, FILE*); -using GetcFunc = int(*)(FILE*); -using PutcFunc = int(*)(int, FILE*); -using GetcUnlockedFunc = int(*)(FILE*); -using PutcUnlockedFunc = int(*)(int, FILE*); -using FflushUnlockedFunc = int(*)(FILE*); -using FgetcUnlockedFunc = int(*)(FILE*); -using FputcUnlockedFunc = int(*)(int, FILE*); -using FreadUnlockedFunc = size_t(*)(void*, size_t, size_t, FILE*); -using FwriteUnlockedFunc = size_t(*)(const void*, size_t, size_t, FILE*); -using FgetsUnlockedFunc = char*(*)(char*, int, FILE*); -using FputsUnlockedFunc = int(*)(const char*, FILE*); - -struct FileFuncProxy -{ - DupFunc real_dup = nullptr; - Dup2Func real_dup2 = nullptr; - Dup3Func real_dup3 = nullptr; - TeeFunc real_tee = nullptr; - SpliceFunc real_splice = nullptr; - FallocateFunc real_fallocate = nullptr; - FdatasyncFunc real_fdatasync = nullptr; - FsyncFunc real_fsync = nullptr; - FcntlFunc real_fcntl = nullptr; - FlockFunc real_flock = nullptr; - LockfFunc real_lockf = nullptr; - TruncateFunc real_truncate = nullptr; - FtruncateFunc real_ftruncate = nullptr; - IoctlFunc real_ioctl = nullptr; - OpenFunc real_open = nullptr; - OpenatFunc real_openat = nullptr; - PipeFunc real_pipe = nullptr; - Pipe2Func real_pipe2 = nullptr; - MkfifoFunc real_mkfifo = nullptr; - MkfifoatFunc real_mkfifoat = nullptr; - ReadFunc real_read = nullptr; - PreadFunc 
real_pread = nullptr; - ReadvFunc real_readv = nullptr; - PreadvFunc real_preadv = nullptr; - Preadv2Func real_preadv2 = nullptr; - WriteFunc real_write = nullptr; - PwriteFunc real_pwrite = nullptr; - WritevFunc real_writev = nullptr; - PwritevFunc real_pwritev = nullptr; - Pwritev2Func real_pwritev2 = nullptr; - CopyFileRangeFunc real_copy_file_range = nullptr; - SyncFunc real_sync = nullptr; - SyncfsFunc real_syncfs = nullptr; - SyncFileRangeFunc real_sync_file_range = nullptr; - VmspliceFunc real_vmsplice = nullptr; - ProcessVmReadvFunc real_process_vm_readv = nullptr; - ProcessVmWritevFunc real_process_vm_writev = nullptr; - FcloseFunc real_fclose = nullptr; - FcloseallFunc real_fcloseall = nullptr; - FflushFunc real_fflush = nullptr; - FgetcFunc real_fgetc = nullptr; - FgetsFunc real_fgets = nullptr; - FputcFunc real_fputc = nullptr; - FputsFunc real_fputs = nullptr; - FlockfileFunc real_flockfile = nullptr; - FtrylockfileFunc real_ftrylockfile = nullptr; - FunlockfileFunc real_funlockfile = nullptr; - FopenFunc real_fopen = nullptr; - FreopenFunc real_freopen = nullptr; - FreadFunc real_fread = nullptr; - FwriteFunc real_fwrite = nullptr; - GetdelimFunc real_getdelim = nullptr; - GetlineFunc real_getline = nullptr; - GetcFunc real_getc = nullptr; - PutcFunc real_putc = nullptr; - GetcUnlockedFunc real_getc_unlocked = nullptr; - PutcUnlockedFunc real_putc_unlocked = nullptr; - FflushUnlockedFunc real_fflush_unlocked = nullptr; - FgetcUnlockedFunc real_fgetc_unlocked = nullptr; - FputcUnlockedFunc real_fputc_unlocked = nullptr; - FreadUnlockedFunc real_fread_unlocked = nullptr; - FwriteUnlockedFunc real_fwrite_unlocked = nullptr; - FgetsUnlockedFunc real_fgets_unlocked = nullptr; - FputsUnlockedFunc real_fputs_unlocked = nullptr; - - void loadFunc(); -}; diff --git a/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.cpp b/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.cpp deleted file mode 100644 index 
a3a88b05480193ce9bee6c26480e214f69e4ddf0..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.cpp +++ /dev/null @@ -1,476 +0,0 @@ -#include "msosrt_trace.h" - -#include -#include -#include -#include -#include -#include -#include -#include - -#if !defined (__linux__) || !defined(__GLIBC__) -#error "This tool only works on Linux!" -#endif - -#ifdef __cplusplus -extern "C" { -#endif -static void setup_trace() __attribute ((constructor)); -static void end_trace() __attribute ((destructor)); -#ifdef __cplusplus -} -#endif - -// Special handling exit func -static void (*real_exit)(int status) __attribute__((noreturn)) = nullptr; -static void (*real__exit)(int status) __attribute__((noreturn)) = nullptr; -static void (*real__Exit)(int status) __attribute__((noreturn)) = nullptr; - -static __thread bool RECURSIVE = false; -static volatile bool INITIALIZED = false; - -namespace { -pid_t GetPid() -{ - static thread_local pid_t pid = getpid(); - return pid; -} - -pid_t GetTid() -{ - static thread_local pid_t tid = gettid(); - return tid; -} - -const char* DUMP_FILE = "msosrt_trace_"; -char EXPORT_PATH[PATH_MAX]; - -const size_t RECORD_LENGTH = 512 * 1024; // Default number of trace data records -struct { - OSRTRecord data_[RECORD_LENGTH]; - std::atomic index_{0}; - bool is_full_ = false; - - void recordData(const char* function, uint64_t start_time, uint64_t duration) - { - size_t index = index_.load(std::memory_order_relaxed); - if (index + 1 >= RECORD_LENGTH) { - index_.store(0, std::memory_order_relaxed); - is_full_ = true; - } else { - index_.fetch_add(1, std::memory_order_relaxed); - } - auto& record = data_[index]; - record.pid = GetPid(); - record.tid = GetTid(); - record.function = function; - record.start_time = start_time; - record.duration = duration; - } - - size_t size() - { - return is_full_ ? 
RECORD_LENGTH : index_.load(std::memory_order_relaxed); - } - - bool hasValidData() - { - pid_t pid = getpid(); - for (size_t i = 0, len = size(); i < len; ++i) { - if (data_[i].pid == pid && data_[i].function != nullptr) { - return true; - } - } - return false; - } -} OSRT_RECORD_QUEUE; -} - -OSRTFunc global_osrt_func; - -void OSRTFunc::loadFunc() -{ - static volatile bool loaded = false; - if (LIKELY(loaded)) { - return; - } - RECURSIVE = true; - LOAD_FUNC(malloc, MallocFunc); - LOAD_FUNC(realloc, ReallocFunc); - LOAD_FUNC(free, FreeFunc); - LOAD_FUNC(mmap, MmapFunc); - LOAD_FUNC(munmap, MunmapFunc); - LOAD_FUNC(mremap, MremapFunc); - LOAD_FUNC(msync, MsyncFunc); - LOAD_FUNC(mprotect, MprotectFunc); - LOAD_FUNC(brk, BrkFunc); - - LOAD_FUNC(pthread_mutex_lock, PthreadMutexLockFunc); - LOAD_FUNC(pthread_mutex_timedlock, PthreadMutexTimedlockFunc); - LOAD_FUNC(pthread_cond_signal, PthreadCondSignalFunc); - LOAD_FUNC(pthread_cond_broadcast, PthreadCondBroadcastFunc); - LOAD_FUNC(pthread_cond_wait, PthreadCondWaitFunc); - LOAD_FUNC(pthread_cond_timedwait, PthreadCondTimedwaitFunc); - LOAD_FUNC(pthread_rwlock_rdlock, PthreadRwlockRdlockFunc); - LOAD_FUNC(pthread_rwlock_timedrdlock, PthreadRwlockTimedrdlockFunc); - LOAD_FUNC(pthread_rwlock_wrlock, PthreadRwlockWrlockFunc); - LOAD_FUNC(pthread_rwlock_timedwrlock, PthreadRwlockTimedwrlockFunc); - - real_exit = reinterpret_cast(dlsym(RTLD_NEXT, "exit")); - real__exit = reinterpret_cast(dlsym(RTLD_NEXT, "_exit")); - real__Exit = reinterpret_cast(dlsym(RTLD_NEXT, "_Exit")); - - file_func.loadFunc(); - socket_func.loadFunc(); - - loaded = true; - RECURSIVE = false; -} - -void OSRTFunc::recordFunc(uint64_t start_time, uint64_t duration, const char* name) -{ - if (UNLIKELY(!INITIALIZED || RECURSIVE)) { - return; - } - if (UNLIKELY(duration >= threshold_)) { - RECURSIVE = true; - OSRT_RECORD_QUEUE.recordData(name, start_time, duration); - RECURSIVE = false; - } -} - -void OSRTFunc::dumpFunc() -{ - if (!INITIALIZED) { - return; - 
} - static std::mutex dump_mutex; - static bool dumped = false; - - std::lock_guard lock(dump_mutex); - if (!dumped) { - RECURSIVE = true; - if (OSRT_RECORD_QUEUE.hasValidData()) { - std::string dump_file; - pid_t pid = getpid(); - // The glibc program_invocation_short_name contains the basename that was used to invoke the calling program - if (program_invocation_short_name != nullptr) { - dump_file = std::string(EXPORT_PATH) + "/" + DUMP_FILE + std::to_string(pid) + "_" + program_invocation_short_name + ".csv"; - } else { - dump_file = std::string(EXPORT_PATH) + "/" + DUMP_FILE + std::to_string(pid) + ".csv"; - } - if (!PathUtils::IsFileExist(dump_file) && !PathUtils::CreateFile(dump_file)) { - fprintf(stderr, "[ERROR] Create msosrt trace file failed.\n"); - RECURSIVE = false; - return; - } - auto fd = fopen(dump_file.c_str(), "ab"); - if (fd == nullptr) { - RECURSIVE = false; - return; - } - fprintf(fd, "%s\n", "Pid,Tid,Function,StartTime(ns),Duration(ns)"); - for (size_t i = 0, len = OSRT_RECORD_QUEUE.size(); i < len; ++i) { - if (OSRT_RECORD_QUEUE.data_[i].pid == pid && OSRT_RECORD_QUEUE.data_[i].function != nullptr) { - fprintf(fd, "%" PRIdMAX ",%" PRIdMAX ",%s,%" PRIu64 ",%" PRIu64 "\n", - static_cast(pid), - static_cast(OSRT_RECORD_QUEUE.data_[i].tid), - OSRT_RECORD_QUEUE.data_[i].function, - OSRT_RECORD_QUEUE.data_[i].start_time, - OSRT_RECORD_QUEUE.data_[i].duration); - } - } - fclose(fd); - } - RECURSIVE = false; - } - dumped = true; -} - -static void setup_trace() -{ - if (LIKELY(INITIALIZED)) { - return; - } - global_osrt_func.loadFunc(); - INITIALIZED = true; - - RECURSIVE = true; - const char* threshold_env_val = getenv("MSOSRT_TRACE_THRESHOLD"); - int64_t threshold = 0; - if (threshold_env_val == nullptr || str_to_i64(threshold_env_val, threshold) != 0) { - fprintf(stderr, "[WARNING] Parse MSOSRT_TRACE_THRESHOLD failed, use default value\n"); - } else { - if (threshold > 0) { - global_osrt_func.threshold_ = threshold; - } else { - fprintf(stderr, 
"[WARNING] MSOSRT_TRACE_THRESHOLD must be a positive integer, use default value\n"); - } - } - - const char* export_path_env_val = getenv("MSOSRT_EXPORT_PATH"); - std::string dump_path; - if (export_path_env_val != nullptr) { - dump_path = export_path_env_val; - } - if (dump_path.empty()) { - fprintf(stderr, "[WARNING] MSOSRT_EXPORT_PATH is not set, data will export to current working directory\n"); - char cwd_path[PATH_MAX] = {0}; - if (getcwd(cwd_path, PATH_MAX) != nullptr) { - dump_path = cwd_path; - } - } - std::string abs_path = PathUtils::RelativeToAbsPath(dump_path); - if (PathUtils::DirPathCheck(abs_path)) { - std::string real_path = PathUtils::RealPath(abs_path); - strncpy(EXPORT_PATH, real_path.c_str(), real_path.size() < PATH_MAX ? real_path.size() : PATH_MAX); - fprintf(stderr, "[INFO] MSOSRT result export path is: %s\n", real_path.c_str()); - } else { - fprintf(stderr, "[ERROR] Invalid export path, data will not be exported.\n"); - } - RECURSIVE = false; -} - -static void end_trace() -{ - global_osrt_func.dumpFunc(); -} - -void* malloc(size_t size) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return (void*)global_osrt_func.real_malloc(size); - } - uint64_t start_time = nsec_now(); - void* ret = global_osrt_func.real_malloc(size); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void* realloc(void* ptr, size_t size) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return (void*)global_osrt_func.real_realloc(ptr, size); - } - uint64_t start_time = nsec_now(); - void* ret = global_osrt_func.real_realloc(ptr, size); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void free(void* ptr) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - global_osrt_func.real_free(ptr); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); -} - -void* mmap(void* addr, size_t length, int 
prot, int flags, int fd, off_t offset) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - void* ret = global_osrt_func.real_mmap(addr, length, prot, flags, fd, offset); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void* mremap(void* old_address, size_t old_size, size_t new_size, int flags, ...) -{ - global_osrt_func.loadFunc(); - va_list args; - va_start(args, flags); - void* arg = va_arg(args, void*); - va_end(args); - uint64_t start_time = nsec_now(); - void* ret = global_osrt_func.real_mremap(old_address, old_size, new_size, flags, arg); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int munmap(void* addr, size_t length) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_munmap(addr, length); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int msync(void* addr, size_t length, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_msync(addr, length, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int mprotect(void* addr, size_t len, int prot) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_mprotect(addr, len, prot); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int brk(void* addr) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_brk(addr); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_mutex_lock(pthread_mutex_t* mutex) -{ - if (UNLIKELY(!INITIALIZED && RECURSIVE)) { - // During the initialization phase we might be called inside of dlsym(). 
- // Since we'd enter an endless loop if we tried to resolved the real - // pthread_mutex_lock() here then we simply fake the lock which should - // be safe since no thread can be running yet. - return 0; - } - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_mutex_lock(mutex); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_mutex_timedlock(pthread_mutex_t* mutex, const struct timespec* abstime) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_mutex_timedlock(mutex, abstime); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_mutex_timedlock(mutex, abstime); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_cond_signal(pthread_cond_t* cond) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_cond_signal(cond); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_cond_signal(cond); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_cond_broadcast(pthread_cond_t* cond) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_cond_broadcast(cond); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_cond_broadcast(cond); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_cond_wait(pthread_cond_t* cond, pthread_mutex_t* mutex) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_cond_wait(cond, mutex); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_cond_wait(cond, mutex); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, 
__FUNCTION__); - return ret; -} - -int pthread_cond_timedwait(pthread_cond_t* cond, pthread_mutex_t* mutex, const struct timespec* abstime) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_cond_timedwait(cond, mutex, abstime); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_cond_timedwait(cond, mutex, abstime); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_rwlock_rdlock(pthread_rwlock_t* rwlock) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_rwlock_rdlock(rwlock); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_rwlock_rdlock(rwlock); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_rwlock_timedrdlock(pthread_rwlock_t* rwlock, const struct timespec* abstime) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_rwlock_timedrdlock(rwlock, abstime); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_rwlock_timedrdlock(rwlock, abstime); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_rwlock_wrlock(pthread_rwlock_t* rwlock) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - global_osrt_func.real_pthread_rwlock_wrlock(rwlock); - } - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.real_pthread_rwlock_wrlock(rwlock); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int pthread_rwlock_timedwrlock(pthread_rwlock_t* rwlock, const struct timespec* abstime) -{ - global_osrt_func.loadFunc(); - if (UNLIKELY(RECURSIVE)) { - return global_osrt_func.real_pthread_rwlock_timedwrlock(rwlock, abstime); - } - uint64_t start_time = nsec_now(); - auto ret = 
global_osrt_func.real_pthread_rwlock_timedwrlock(rwlock, abstime); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -void exit(int status) -{ - if (LIKELY(INITIALIZED)) { - global_osrt_func.dumpFunc(); - } - real_exit(status); -} - -void _exit(int status) -{ - if (LIKELY(INITIALIZED)) { - global_osrt_func.dumpFunc(); - } - real__exit(status); -} - -void _Exit(int status) -{ - if (LIKELY(INITIALIZED)) { - global_osrt_func.dumpFunc(); - } - real__Exit(status); -} diff --git a/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.h b/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.h deleted file mode 100644 index e153ef5138883cd597c0a5a524adc5ec5b555ea4..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/msosrt_trace.h +++ /dev/null @@ -1,207 +0,0 @@ -#pragma once - -#ifndef _GNU_SOURCE -#define _GNU_SOURCE -#endif - -#include -#include -#include -#include -#include -#include -#include - -#include "utils.h" -#include "file_func.h" -#include "socket_func.h" - -#define TRACE_API __attribute__((visibility("default"))) -#define LOAD_FUNC(name, func_type) \ - do { \ - (real_##name) = reinterpret_cast<func_type>(dlsym(RTLD_NEXT, #name)); \ - } while (false) - -#ifdef __cplusplus -extern "C" { -#endif -// memory func -TRACE_API void* malloc(size_t size); -TRACE_API void* realloc(void* ptr, size_t size); -TRACE_API void free(void* ptr); -TRACE_API void* mmap(void* addr, size_t length, int prot, int flags, int fd, off_t offset); -TRACE_API int munmap(void* addr, size_t length); -TRACE_API void* mremap(void* old_address, size_t old_size, size_t new_size, int flags, ...
/* void *new_address */); -TRACE_API int msync(void* addr, size_t length, int flags); -TRACE_API int mprotect(void* addr, size_t len, int prot); -TRACE_API int brk(void* addr); -// pthread func -TRACE_API int pthread_mutex_lock(pthread_mutex_t* mutex); -TRACE_API int pthread_mutex_timedlock(pthread_mutex_t* mutex, const struct timespec* abstime); -TRACE_API int pthread_cond_signal(pthread_cond_t* cond); -TRACE_API int pthread_cond_broadcast(pthread_cond_t* cond); -TRACE_API int pthread_cond_wait(pthread_cond_t* cond, pthread_mutex_t* mutex); -TRACE_API int pthread_cond_timedwait(pthread_cond_t* cond, pthread_mutex_t* mutex, const struct timespec* abstime); -TRACE_API int pthread_rwlock_rdlock(pthread_rwlock_t* rwlock); -TRACE_API int pthread_rwlock_timedrdlock(pthread_rwlock_t* rwlock, const struct timespec* abstime); -TRACE_API int pthread_rwlock_wrlock(pthread_rwlock_t* rwlock); -TRACE_API int pthread_rwlock_timedwrlock(pthread_rwlock_t* rwlock, const struct timespec* abstime); -// exit func -TRACE_API void exit(int status) __attribute__((noreturn)); -TRACE_API void _exit(int status) __attribute__((noreturn)); -TRACE_API void _Exit(int status) __attribute__((noreturn)); -// file func -TRACE_API int dup(int oldfd); -TRACE_API int dup2(int oldfd, int newfd); -TRACE_API int dup3(int oldfd, int newfd, int flags); -TRACE_API ssize_t tee(int fd_in, int fd_out, size_t len, unsigned int flags); -TRACE_API ssize_t splice(int fd_in, off_t* off_in, int fd_out, off_t* off_out, size_t len, unsigned int flags); -TRACE_API int fallocate(int fd, int mode, off_t offset, off_t len); -TRACE_API int fdatasync(int fildes); -TRACE_API int fsync(int fd); -TRACE_API int fcntl(int fd, int op, ...); -TRACE_API int flock(int fd, int op); -TRACE_API int lockf(int fd, int op, off_t len); -TRACE_API int truncate(const char* path, off_t length); -TRACE_API int ftruncate(int fildes, off_t length); -TRACE_API int ioctl(int fd, int op, ...); -TRACE_API int open(const char* pathname, int flags, 
... /* mode_t mode */ ); -TRACE_API int openat(int dirfd, const char* pathname, int flags, ... /* mode_t mode */ ); -TRACE_API int pipe(int pipefd[2]); -TRACE_API int pipe2(int pipefd[2], int flags); -TRACE_API int mkfifo(const char* pathname, mode_t mode); -TRACE_API int mkfifoat(int dirfd, const char* pathname, mode_t mode); -TRACE_API ssize_t read(int fd, void* buf, size_t count); -TRACE_API ssize_t pread(int fd, void* buf, size_t count, off_t offset); -TRACE_API ssize_t readv(int fd, const struct iovec* iov, int iovcnt); -TRACE_API ssize_t preadv(int fd, const struct iovec* iov, int iovcnt, off_t offset); -TRACE_API ssize_t preadv2(int fd, const struct iovec* iov, int iovcnt, off_t offset, int flags); -TRACE_API ssize_t write(int fd, const void* buf, size_t count); -TRACE_API ssize_t pwrite(int fd, const void* buf, size_t count, off_t offset); -TRACE_API ssize_t writev(int fd, const struct iovec* iov, int iovcnt); -TRACE_API ssize_t pwritev(int fd, const struct iovec* iov, int iovcnt, off_t offset); -TRACE_API ssize_t pwritev2(int fd, const struct iovec* iov, int iovcnt, off_t offset, int flags); -TRACE_API ssize_t copy_file_range(int fd_in, off_t* off_in, int fd_out, off_t* off_out, size_t len, unsigned int flags); -TRACE_API void sync(void); -TRACE_API int syncfs(int fd); -TRACE_API int sync_file_range(int fd, off_t offset, off_t nbytes, unsigned int flags); -TRACE_API ssize_t vmsplice(int fd, const struct iovec* iov, size_t nr_segs, unsigned int flags); -TRACE_API ssize_t process_vm_readv(pid_t pid, const struct iovec* local_iov, unsigned long liovcnt, - const struct iovec* remote_iov, unsigned long riovcnt, unsigned long flags); -TRACE_API ssize_t process_vm_writev(pid_t pid, const struct iovec* local_iov, unsigned long liovcnt, - const struct iovec* remote_iov, unsigned long riovcnt, unsigned long flags); -TRACE_API int fclose(FILE* stream); -TRACE_API int fcloseall(void); -TRACE_API int fflush(FILE* stream); -TRACE_API int fgetc(FILE* stream); -TRACE_API 
char* fgets(char* s, int size, FILE* stream); -TRACE_API int fputc(int c, FILE* stream); -TRACE_API int fputs(const char* s, FILE* stream); -TRACE_API void flockfile(FILE* filehandle); -TRACE_API int ftrylockfile(FILE* filehandle); -TRACE_API void funlockfile(FILE* filehandle); -TRACE_API FILE* fopen(const char* pathname, const char* mode); -TRACE_API FILE* freopen(const char* pathname, const char* mode, FILE* stream); -TRACE_API size_t fread(void* ptr, size_t size, size_t nmemb, FILE* stream); -TRACE_API size_t fwrite(const void* ptr, size_t size, size_t nitems, FILE* stream); -TRACE_API ssize_t getdelim(char** lineptr, size_t* n, int delimiter, FILE* stream); -TRACE_API ssize_t getline(char** lineptr, size_t* n, FILE* stream); -TRACE_API int getc(FILE* stream); -TRACE_API int putc(int c, FILE* stream); -TRACE_API int getc_unlocked(FILE* stream); -TRACE_API int putc_unlocked(int c, FILE* stream); -TRACE_API int fflush_unlocked(FILE* stream); -TRACE_API int fgetc_unlocked(FILE* stream); -TRACE_API int fputc_unlocked(int c, FILE* stream); -TRACE_API size_t fread_unlocked(void* ptr, size_t size, size_t n, FILE* stream); -TRACE_API size_t fwrite_unlocked(const void* ptr, size_t size, size_t n, FILE* stream); -TRACE_API char* fgets_unlocked(char* s, int n, FILE* stream); -TRACE_API int fputs_unlocked(const char* s, FILE* stream); -// socket func -TRACE_API int socket(int domain, int type, int protocol); -TRACE_API int socketpair(int domain, int type, int protocol, int sv[2]); -TRACE_API int epoll_ctl(int epfd, int op, int fd, struct epoll_event* event); -TRACE_API int epoll_wait(int epfd, struct epoll_event* events, int maxevents, int timeout); -TRACE_API int epoll_pwait(int epfd, struct epoll_event* events, int maxevents, int timeout, const sigset_t* sigmask); -TRACE_API int select(int nfds, fd_set* readfds, fd_set* writefds, fd_set* exceptfds, struct timeval* timeout); -TRACE_API int listen(int sockfd, int backlog); -TRACE_API int accept(int sockfd, struct sockaddr* 
addr, socklen_t* addrlen); -TRACE_API int accept4(int sockfd, struct sockaddr* addr, socklen_t* addrlen, int flags); -TRACE_API int bind(int sockfd, const struct sockaddr* addr, socklen_t addrlen); -TRACE_API int poll(struct pollfd* fds, nfds_t nfds, int timeout); -TRACE_API int ppoll(struct pollfd* fds, nfds_t nfds, const struct timespec* tmo_p, const sigset_t* sigmask); -TRACE_API ssize_t send(int sockfd, const void* buf, size_t len, int flags); -TRACE_API ssize_t sendto(int sockfd, const void* buf, size_t len, int flags, const struct sockaddr* dest_addr, socklen_t addrlen); -TRACE_API ssize_t sendmsg(int sockfd, const struct msghdr* msg, int flags); -TRACE_API int sendmmsg(int sockfd, struct mmsghdr* msgvec, unsigned int vlen, int flags); -TRACE_API ssize_t sendfile(int out_fd, int in_fd, off_t* offset, size_t count); -TRACE_API ssize_t recv(int sockfd, void* buf, size_t len, int flags); -TRACE_API ssize_t recvfrom(int sockfd, void* buf, size_t len, int flags, struct sockaddr* src_addr, socklen_t* addrlen); -TRACE_API ssize_t recvmsg(int sockfd, struct msghdr* msg, int flags); -TRACE_API int recvmmsg(int sockfd, struct mmsghdr* msgvec, unsigned int vlen, int flags, struct timespec* timeout); -#ifdef __cplusplus -} -#endif - -using MallocFunc = void*(*)(size_t); -using ReallocFunc = void*(*)(void*, size_t); -using FreeFunc = void(*)(void*); -using MmapFunc = void*(*)(void*, size_t, int, int, int, off_t); -using MunmapFunc = int(*)(void*, size_t); -using MremapFunc = void*(*)(void*, size_t, size_t, int, ...); -using MsyncFunc = int(*)(void*, size_t, int); -using MprotectFunc = int(*)(void*, size_t, int); -using BrkFunc = int(*)(void*); -using PthreadMutexLockFunc = int(*)(pthread_mutex_t*); -using PthreadMutexTimedlockFunc = int(*)(pthread_mutex_t*, const struct timespec*); -using PthreadCondSignalFunc = int(*)(pthread_cond_t*); -using PthreadCondBroadcastFunc = int(*)(pthread_cond_t*); -using PthreadCondWaitFunc = int(*)(pthread_cond_t*, pthread_mutex_t*); -using 
PthreadCondTimedwaitFunc = int(*)(pthread_cond_t*, pthread_mutex_t*, const struct timespec*); -using PthreadRwlockRdlockFunc = int(*)(pthread_rwlock_t*); -using PthreadRwlockTimedrdlockFunc = int(*)(pthread_rwlock_t*, const struct timespec*); -using PthreadRwlockWrlockFunc = int(*)(pthread_rwlock_t*); -using PthreadRwlockTimedwrlockFunc = int(*)(pthread_rwlock_t*, const struct timespec*); - -struct OSRTRecord { - pid_t pid = 0; - pid_t tid = 0; - const char* function = nullptr; - uint64_t start_time = 0; - uint64_t duration = 0; -}; - -const uint64_t DEFAULT_THRESHOLD = 10 * 1000 * 1000; // 10ms - -struct OSRTFunc { - uint64_t threshold_ = DEFAULT_THRESHOLD; - - MallocFunc real_malloc = nullptr; - ReallocFunc real_realloc = nullptr; - FreeFunc real_free = nullptr; - MmapFunc real_mmap = nullptr; - MunmapFunc real_munmap = nullptr; - MremapFunc real_mremap = nullptr; - MsyncFunc real_msync = nullptr; - MprotectFunc real_mprotect = nullptr; - BrkFunc real_brk = nullptr; - PthreadMutexLockFunc real_pthread_mutex_lock = nullptr; - PthreadMutexTimedlockFunc real_pthread_mutex_timedlock = nullptr; - PthreadCondSignalFunc real_pthread_cond_signal = nullptr; - PthreadCondBroadcastFunc real_pthread_cond_broadcast = nullptr; - PthreadCondWaitFunc real_pthread_cond_wait = nullptr; - PthreadCondTimedwaitFunc real_pthread_cond_timedwait = nullptr; - PthreadRwlockRdlockFunc real_pthread_rwlock_rdlock = nullptr; - PthreadRwlockTimedrdlockFunc real_pthread_rwlock_timedrdlock = nullptr; - PthreadRwlockWrlockFunc real_pthread_rwlock_wrlock = nullptr; - PthreadRwlockTimedwrlockFunc real_pthread_rwlock_timedwrlock = nullptr; - - FileFuncProxy file_func; - SocketFuncProxy socket_func; - - void loadFunc(); - void recordFunc(uint64_t start_time, uint64_t duration, const char* name); - void dumpFunc(); -}; - -extern OSRTFunc global_osrt_func; diff --git a/profiler/msprof_analyze/osrt_trace/src/socket_func.cpp b/profiler/msprof_analyze/osrt_trace/src/socket_func.cpp deleted file mode 
100644 index f2863c6a515f3d5159eb5e7e1212499d78301df9..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/socket_func.cpp +++ /dev/null @@ -1,217 +0,0 @@ -#include "socket_func.h" - -#include "msosrt_trace.h" - -void SocketFuncProxy::loadFunc() -{ - LOAD_FUNC(socket, SocketFunc); - LOAD_FUNC(socketpair, SocketpairFunc); - LOAD_FUNC(epoll_ctl, EpollCtlFunc); - LOAD_FUNC(epoll_wait, EpollWaitFunc); - LOAD_FUNC(epoll_pwait, EpollPwaitFunc); - LOAD_FUNC(select, SelectFunc); - LOAD_FUNC(listen, ListenFunc); - LOAD_FUNC(accept, AcceptFunc); - LOAD_FUNC(accept4, Accept4Func); - LOAD_FUNC(bind, BindFunc); - LOAD_FUNC(poll, PollFunc); - LOAD_FUNC(ppoll, PpollFunc); - LOAD_FUNC(send, SendFunc); - LOAD_FUNC(sendto, SendtoFunc); - LOAD_FUNC(sendmsg, SendmsgFunc); - LOAD_FUNC(sendmmsg, SendmmsgFunc); - LOAD_FUNC(sendfile, SendfileFunc); - LOAD_FUNC(recv, RecvFunc); - LOAD_FUNC(recvfrom, RecvfromFunc); - LOAD_FUNC(recvmsg, RecvmsgFunc); - LOAD_FUNC(recvmmsg, RecvmmsgFunc); -} - -int socket(int domain, int type, int protocol) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_socket(domain, type, protocol); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int socketpair(int domain, int type, int protocol, int sv[2]) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_socketpair(domain, type, protocol, sv); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int epoll_ctl(int epfd, int op, int fd, struct epoll_event* event) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_epoll_ctl(epfd, op, fd, event); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int epoll_wait(int epfd, struct epoll_event* 
events, int maxevents, int timeout) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_epoll_wait(epfd, events, maxevents, timeout); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int epoll_pwait(int epfd, struct epoll_event* events, int maxevents, int timeout, const sigset_t* sigmask) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_epoll_pwait(epfd, events, maxevents, timeout, sigmask); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int select(int nfds, fd_set* readfds, fd_set* writefds, fd_set* exceptfds, struct timeval* timeout) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_select(nfds, readfds, writefds, exceptfds, timeout); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int listen(int sockfd, int backlog) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_listen(sockfd, backlog); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int accept(int sockfd, struct sockaddr* addr, socklen_t* addrlen) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_accept(sockfd, addr, addrlen); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int accept4(int sockfd, struct sockaddr* addr, socklen_t* addrlen, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_accept4(sockfd, addr, addrlen, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int bind(int 
sockfd, const struct sockaddr* addr, socklen_t addrlen) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_bind(sockfd, addr, addrlen); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int poll(struct pollfd* fds, nfds_t nfds, int timeout) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_poll(fds, nfds, timeout); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int ppoll(struct pollfd* fds, nfds_t nfds, const struct timespec* tmo_p, const sigset_t* sigmask) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_ppoll(fds, nfds, tmo_p, sigmask); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t send(int sockfd, const void* buf, size_t len, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_send(sockfd, buf, len, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t sendto(int sockfd, const void* buf, size_t len, int flags, const struct sockaddr* dest_addr, socklen_t addrlen) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_sendto(sockfd, buf, len, flags, dest_addr, addrlen); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t sendmsg(int sockfd, const struct msghdr* msg, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_sendmsg(sockfd, msg, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int sendmmsg(int sockfd, 
struct mmsghdr* msgvec, unsigned int vlen, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_sendmmsg(sockfd, msgvec, vlen, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t sendfile(int out_fd, int in_fd, off_t* offset, size_t count) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_sendfile(out_fd, in_fd, offset, count); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t recv(int sockfd, void* buf, size_t len, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_recv(sockfd, buf, len, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t recvfrom(int sockfd, void* buf, size_t len, int flags, struct sockaddr* src_addr, socklen_t* addrlen) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_recvfrom(sockfd, buf, len, flags, src_addr, addrlen); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -ssize_t recvmsg(int sockfd, struct msghdr* msg, int flags) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_recvmsg(sockfd, msg, flags); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return ret; -} - -int recvmmsg(int sockfd, struct mmsghdr* msgvec, unsigned int vlen, int flags, struct timespec* timeout) -{ - global_osrt_func.loadFunc(); - uint64_t start_time = nsec_now(); - auto ret = global_osrt_func.socket_func.real_recvmmsg(sockfd, msgvec, vlen, flags, timeout); - global_osrt_func.recordFunc(start_time, nsec_now() - start_time, __FUNCTION__); - return 
ret; -} diff --git a/profiler/msprof_analyze/osrt_trace/src/socket_func.h b/profiler/msprof_analyze/osrt_trace/src/socket_func.h deleted file mode 100644 index 361ce1d6382eada6cd942d74c2f3e0e7cd8621a0..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/socket_func.h +++ /dev/null @@ -1,60 +0,0 @@ -#pragma once - -#ifndef _GNU_SOURCE -#define _GNU_SOURCE -#endif - -#include -#include -#include -#include -#include - -using SocketFunc = int(*)(int, int, int); -using SocketpairFunc = int(*)(int, int, int, int* sv); -using EpollCtlFunc = int(*)(int, int, int, struct epoll_event*); -using EpollWaitFunc = int(*)(int, struct epoll_event*, int, int); -using EpollPwaitFunc = int(*)(int, struct epoll_event*, int, int, const sigset_t*); -using SelectFunc = int(*)(int, fd_set*, fd_set*, fd_set*, struct timeval*); -using ListenFunc = int(*)(int, int); -using AcceptFunc = int(*)(int, struct sockaddr*, socklen_t*); -using Accept4Func = int(*)(int, struct sockaddr*, socklen_t*, int); -using BindFunc = int(*)(int, const struct sockaddr*, socklen_t); -using PollFunc = int(*)(struct pollfd*, nfds_t, int); -using PpollFunc = int(*)(struct pollfd*, nfds_t, const struct timespec*, const sigset_t*); -using SendFunc = ssize_t(*)(int, const void*, size_t, int); -using SendtoFunc = ssize_t(*)(int, const void*, size_t, int, const struct sockaddr*, socklen_t); -using SendmsgFunc = ssize_t(*)(int, const struct msghdr*, int); -using SendmmsgFunc = int(*)(int, struct mmsghdr*, unsigned int, int); -using SendfileFunc = ssize_t(*)(int, int, off_t*, size_t); -using RecvFunc = ssize_t(*)(int, void*, size_t, int); -using RecvfromFunc = ssize_t(*)(int, void*, size_t, int, struct sockaddr*, socklen_t*); -using RecvmsgFunc = ssize_t(*)(int, struct msghdr*, int); -using RecvmmsgFunc = int(*)(int, struct mmsghdr*, unsigned int, int, struct timespec*); - -struct SocketFuncProxy -{ - SocketFunc real_socket = nullptr; - SocketpairFunc real_socketpair = nullptr; - 
EpollCtlFunc real_epoll_ctl = nullptr; - EpollWaitFunc real_epoll_wait = nullptr; - EpollPwaitFunc real_epoll_pwait = nullptr; - SelectFunc real_select = nullptr; - ListenFunc real_listen = nullptr; - AcceptFunc real_accept = nullptr; - Accept4Func real_accept4 = nullptr; - BindFunc real_bind = nullptr; - PollFunc real_poll = nullptr; - PpollFunc real_ppoll = nullptr; - SendFunc real_send = nullptr; - SendtoFunc real_sendto = nullptr; - SendmsgFunc real_sendmsg = nullptr; - SendmmsgFunc real_sendmmsg = nullptr; - SendfileFunc real_sendfile = nullptr; - RecvFunc real_recv = nullptr; - RecvfromFunc real_recvfrom = nullptr; - RecvmsgFunc real_recvmsg = nullptr; - RecvmmsgFunc real_recvmmsg = nullptr; - - void loadFunc(); -}; diff --git a/profiler/msprof_analyze/osrt_trace/src/utils.cpp b/profiler/msprof_analyze/osrt_trace/src/utils.cpp deleted file mode 100644 index 82382d23039e63c7ab2d4475d0dcf7fe2aec9fad..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/osrt_trace/src/utils.cpp +++ /dev/null @@ -1,159 +0,0 @@ -#include "utils.h" - -#include -#include -#include -#include -#include -#include - -int str_to_i64(const std::string& str, int64_t& num) -{ - if (str.empty()) { - return -1; - } - size_t pos = 0; - try { - num = std::stoll(str, &pos); - } catch (...) { - return -1; - } - if (pos != str.size()) { - return -1; - } - return 0; -} - -bool PathUtils::IsFileExist(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return false; - } - return (access(path.c_str(), F_OK) == 0) ? true : false; -} - -bool PathUtils::IsFileWritable(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return false; - } - return (access(path.c_str(), W_OK) == 0) ? 
true : false; -} - -bool PathUtils::IsDir(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return false; - } - struct stat st{}; - int ret = lstat(path.c_str(), &st); - if (ret != 0) { - return false; - } - return S_ISDIR(st.st_mode) ? true : false; -} - -bool PathUtils::CreateDir(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return false; - } - if (IsFileExist(path)) { - return IsDir(path) ? true : false; - } - size_t pos = 0; - while ((pos = path.find_first_of('/', pos)) != std::string::npos) { - std::string base_dir = path.substr(0, ++pos); - if (IsFileExist(base_dir)) { - if (IsDir(base_dir)) { - continue; - } else { - return false; - } - } - if (mkdir(base_dir.c_str(), DATA_DIR_AUTHORITY) != 0) { - return false; - } - } - return (mkdir(path.c_str(), DATA_DIR_AUTHORITY) == 0) ? true : false; -} - -std::string PathUtils::RealPath(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return ""; - } - char realPath[PATH_MAX] = {0}; - if (realpath(path.c_str(), realPath) == nullptr) { - return ""; - } - return std::string(realPath); -} - -std::string PathUtils::RelativeToAbsPath(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX) { - return ""; - } - if (path[0] != '/') { - char pwd_path[PATH_MAX] = {0}; - if (getcwd(pwd_path, PATH_MAX) != nullptr) { - return std::string(pwd_path) + "/" + path; - } - return ""; - } - return std::string(path); -} - -std::string PathUtils::DirName(const std::string &path) -{ - if (path.empty()) { - return ""; - } - char temp_path[PATH_MAX] = {0}; - strncpy(temp_path, path.c_str(), path.size() < PATH_MAX ? path.size() : PATH_MAX); - char* path_c = dirname(temp_path); - return path_c ? 
std::string(path_c) : ""; -} - -bool PathUtils::CreateFile(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX || !CreateDir(DirName(path))) { - return false; - } - int fd = creat(path.c_str(), DATA_FILE_AUTHORITY); - return (fd < 0 || close(fd) != 0) ? false : true; -} - -bool PathUtils::IsSoftLink(const std::string &path) -{ - if (path.empty() || path.size() > PATH_MAX || !IsFileExist(path)) { - return false; - } - struct stat st{}; - if (lstat(path.c_str(), &st) != 0) { - return false; - } - return S_ISLNK(st.st_mode); -} - -bool PathUtils::DirPathCheck(const std::string& abs_path) -{ - if (abs_path.empty() || abs_path.size() > PATH_MAX) { - fprintf(stderr, "[ERROR] The length of Path %s is invalid.\n", abs_path.c_str()); - return false; - } - if (IsSoftLink(abs_path)) { - fprintf(stderr, "[ERROR] Path %s is soft link.\n", abs_path.c_str()); - return false; - } - if (!IsFileExist(abs_path) && !CreateDir(abs_path)) { - fprintf(stderr, "[ERROR] Path %s not exist and create failed.\n", abs_path.c_str()); - return false; - } - if (!IsDir(abs_path) || !IsFileWritable(abs_path)) { - fprintf(stderr, "[ERROR] %s is not a directory or is not writable.\n", abs_path.c_str()); - return false; - } - return true; -} diff --git a/profiler/msprof_analyze/precheck/README.md b/profiler/msprof_analyze/precheck/README.md deleted file mode 100644 index 882cc4e12fff0b64c905d6146d9b7a0d98e7bad9..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/README.md +++ /dev/null @@ -1,393 +0,0 @@ -# Profiler Precheck User Guide - -Welcome to the Profiler Precheck tool. This guide describes what the tool does, how to use it, and how it works internally, so that you can get started quickly and make full use of its performance analysis capabilities. - -## Contents -- [1. Overview](#1-overview) -- [2. Overall Architecture](#2-overall-architecture) -- [3. Usage](#3-usage) - - [3.1 Cloud Container Scenario](#31-cloud-container-scenario) - - [3.2 Bare-Metal Scenario](#32-bare-metal-scenario) -- [4. Command Parameters](#4-command-parameters) -- [5. FAQ](#5-faq) - -## 1. Overview - -Profiler Precheck is a performance analysis tool for distributed training jobs. It automatically collects hardware and software environment information from each node in the cluster and, based on historical data and expert knowledge, analyzes the configuration and resource usage of the current training job. It then gives optimization suggestions and warnings that help users find and resolve potential performance bottlenecks and improve training efficiency. - -## 2. 
Overall Architecture - -Profiler Precheck uses a master-slave architecture consisting of one master node and multiple slave nodes: - -- **Master node**: - - Receives the user's task request - - Distributes the Precheck code to each slave node - - Coordinates the analysis process across nodes - - Aggregates the analysis results into the final report - - Is usually the host of the rank=0 device in cluster training - -- **Slave node**: - - Runs the user's training script on the local node - - Runs the Profiler to collect performance metrics - - Sends the results back to the master node - -### Precheck Workflow -1. **Preparation**: the user submits a precheck request on the master node, and the master node distributes the code to each slave node -2. **Collection**: each node starts the training script and runs the Profiler to collect performance data -3. **Aggregation**: the master node aggregates the performance data reported by the slave nodes -4. **Analysis**: the master node analyzes the aggregated data and generates the analysis report - -### Typical Scenarios - -## 3. Usage - -### 3.1 Cloud Container Scenario -For the detailed precheck flow, see [Cloud scenario precheck flow](assert/code_structure_startnode_docker.svg) - -#### 3.1.1 Deployment - -1. **Prepare the base environment** -```bash -# Download and load the base image -docker load -i user_image.tar - -# Create the training container -docker run -it --name user_container \ - --device=/dev/davinci0 \ - --device=/dev/davinci_manager \ - --device=/dev/devmm_svm \ - --device=/dev/hisi_hdc \ - -v /usr/local/Ascend:/usr/local/Ascend \ - -v /path/to/data:/data \ - -v /path/to/model:/model \ - user_image:latest -``` - -2. **Build the precheck environment** -```bash -# Install the precheck tool -pip install msprof-analyze-xx.whl - -# Create the precheck launch script -cat > /usr/local/bin/run_node_precheck.sh << 'EOF' -#!/bin/bash -msprof-analyze precheck start_node \ - --node_rank ${NODE_RANK:-0} \ - --master_addr ${MASTER_ADDR:-"127.0.0.1"} \ - --master_port ${MASTER_PORT:-29500} \ - --nnodes ${NNODES:-1} \ - --nproc_per_node ${NPUS_PER_NODE:-8} \ - --task_name ${TASK_NAME:-"container_test"} \ - --profiling_cmd ${PROFILING_CMD:-"run.sh"} -EOF -chmod +x /usr/local/bin/run_node_precheck.sh - -# Save the precheck image -docker commit user_container precheck_image:latest -docker save -o precheck_image.tar precheck_image:latest -``` - -3. 
**Distribute and launch** -```bash -# Load the image on every node -docker load -i precheck_image.tar - -# Start the master node container -docker run -d --name precheck_master \ - --network host \ - --device=/dev/davinci* \ - -v /usr/local/Ascend:/usr/local/Ascend \ - -v /path/to/data:/data \ - -e MASTER_ADDR=192.168.0.1 \ - -e MASTER_PORT=29500 \ - -e NNODES=2 \ - -e NODE_RANK=0 \ - -e NPUS_PER_NODE=8 \ - -e TASK_NAME=container_test \ - -e PROFILING_CMD="run.sh" \ - precheck_image:latest \ - /usr/local/bin/run_node_precheck.sh - -# Start the slave node container -docker run -d --name precheck_worker \ - --network host \ - --device=/dev/davinci* \ - -v /usr/local/Ascend:/usr/local/Ascend \ - -v /path/to/data:/data \ - -e MASTER_ADDR=192.168.0.1 \ - -e MASTER_PORT=29500 \ - -e NNODES=2 \ - -e NODE_RANK=1 \ - -e NPUS_PER_NODE=8 \ - -e TASK_NAME=container_test \ - -e PROFILING_CMD="run.sh" \ - precheck_image:latest \ - /usr/local/bin/run_node_precheck.sh -``` - -#### 3.1.2 Configuration - -##### Container environment variables -| Variable | Description | Default | -|--------|------|--------| -| MASTER_ADDR | Master node IP address | 127.0.0.1 | -| MASTER_PORT | Master node port | 29500 | -| NNODES | Total number of nodes | 1 | -| NODE_RANK | Node rank | 0 | -| NPUS_PER_NODE | Number of NPUs per node | 8 | -| TASK_NAME | Precheck task name | container_test | -| PROFILING_CMD | Training command | run.sh | - -##### Container mounts -| Mount point | Description | Required | -|--------|------|------| -| /usr/local/Ascend | CANN toolkit | Yes | -| /data | Training data directory | No | -| /model | Model file directory | No | -| /output | Output directory | No | - -### 3.2 Bare-Metal Scenario -For the detailed precheck flow, see [Bare-metal scenario precheck flow](assert/code_structure_startall.svg) - -#### 3.2.1 Verifying the Environment Configuration - -Before using the precheck tool, make sure the cluster environment is configured correctly. A set of verification scripts is provided to help you check the environment quickly: - -##### 1. Passwordless SSH setup -```bash -# 1. Generate an SSH key (skipped if one already exists) -[ ! -f ~/.ssh/id_rsa ] && ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa - -# 2. Copy the key to the other nodes (replace the user name and IP) -ssh-copy-id user@192.168.0.2 -``` - -##### 2. Environment checks -Two scripts are provided to help verify the cluster configuration: - -1. **SSH connectivity check** -```bash -# Basic check (5-second timeout by default) -HOST_IPS="192.168.0.1,192.168.0.2" bash test_hosts_ssh.sh - -# Custom timeout -HOST_IPS="192.168.0.1,192.168.0.2" TIMEOUT=10 bash test_hosts_ssh.sh -``` - -2. 
**Cluster environment consistency check** -```bash -# Basic environment check (Python, PyTorch, etc.) -HOST_IPS="192.168.0.1,192.168.0.2" bash test_hosts_env.sh - -# Full environment check (including the CANN environment; under development) -HOST_IPS="192.168.0.1,192.168.0.2" CHECK_CANN=1 bash test_hosts_env.sh -``` - -Example output: -``` -🔍 Cluster Environment Checker -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - -📊 Step 1: Collecting local environment info... -Detecting Python environment... -Checking installed packages... -Checking CANN environment... - -📌 Local Environment Summary: - • Python Path: /usr/bin/python3 - • Python Version: Python 3.8.10 - • Msprof-analyze: v1.3.0 - • Torch: v2.4.0 - • Torch_NPU: v2.2.0 -``` - -#### 3.2.2 Starting the Precheck - -The precheck tool supports two modes: the built-in benchmark and a custom training script. - -##### Option 1: Use the built-in ResNet benchmark - -```bash -# Using an IP list -msprof-analyze precheck start_all \ - --host_ips "192.168.0.1,192.168.0.2" \ - --master_addr 192.168.0.1 \ - --nnodes 2 \ - --nproc_per_node 8 \ - --task_name resnet_test \ - --profiling_cmd "[resnet]" - -# Using a configuration file -msprof-analyze precheck start_all \ - --host_config_file hosts.csv \ - --master_addr 192.168.0.1 \ - --nnodes 2 \ - --nproc_per_node 8 \ - --profiling_cmd "[resnet]" -``` - -##### Option 2: Use a custom training script - -1. **Prepare the training script** -```python -# train.py -import torch_npu -from torch_npu.profiler import profile, ProfilerActivity - -def train(node_prof_save_dir): - # Configure the profiler - with profile( - activities=[ProfilerActivity.CPU, ProfilerActivity.NPU], - on_trace_ready=torch_npu.profiler.tensorboard_trace_handler(node_prof_save_dir) - ) as prof: - # Training code - for epoch in range(num_epochs): - for batch in dataloader: - # Training logic - ... - prof.step() # Record profiling data -``` - -2. **Create the launch script** - -An example run.sh: -```bash -#!/bin/bash - -# Set the directory where profiling data is saved -NODE_PROF_SAVE_DIR=${NODE_PROF_SAVE_DIR:-"./output/prof_data"} -mkdir -p "$NODE_PROF_SAVE_DIR" - -# Start training -python3 train.py \ - --prof_dir "$NODE_PROF_SAVE_DIR" \ - --batch-size 32 \ - --epochs 10 \ - "$@" # Allows extra arguments to be passed in -``` - -3. 
**Start the precheck analysis** -```bash -# Make the script executable -chmod +x run.sh - -# Using a relative path -msprof-analyze precheck start_all \ - --host_ips "192.168.0.1,192.168.0.2" \ - --master_addr 192.168.0.1 \ - --nnodes 2 \ - --nproc_per_node 8 \ - --task_name custom_test \ - --profiling_cmd "./run.sh --extra-args value" - -# Using an absolute path (recommended) -msprof-analyze precheck start_all \ - --host_ips "192.168.0.1,192.168.0.2" \ - --master_addr 192.168.0.1 \ - --nnodes 2 \ - --nproc_per_node 8 \ - --task_name custom_test \ - --profiling_cmd "/path/to/run.sh --extra-args value" -``` - -#### 3.2.3 Usage Notes - -1. **Paths** - - Use an absolute path to specify the script location - - Make sure the script path is the same on all nodes - - Check the read and write permissions of directories and files - -2. **Environment variables** - - `NODE_PROF_SAVE_DIR`: directory where profiling data is saved - - Extra training arguments can be passed through `"$@"` - -3. **Common issues** - - Make sure run.sh is executable - - Verify that the working directory is correct - - Check that the profiling data directory is writable - -## 4. Command Parameters - -### Basic Usage -```bash -msprof-analyze precheck [options] - -Commands: - start_all Start the precheck on all nodes - start_node Start the precheck on a single node - stop Stop the precheck (todo) - status Show the precheck status (todo) -``` - -### Common Parameters -| Parameter | Type | Required | Default | Description | -|--------|------|------|-----------------------------------------------------------------------------------------------------------------|------| -| master_addr | str | Yes | - | Master node IP address | -| master_port | int | No | 29500 | Master node communication port | -| nnodes | int | Yes | - | Total number of nodes | -| nproc_per_node | int | Yes | - | Number of processes per node | -| task_name | str | No | auto_timestamp | Task name | -| output_dir | str | No | ./output | Output directory | -| node_prof_save_dir | str | No | {output_dir}/{task_name}/node_prof_save_dir | Directory where each node saves its profiling data | -| master_prof_gather_dir | str | No | {output_dir}/{task_name}/master_prof_gather_dir | Directory where the master node gathers data | -| static | bool | No | False | Whether to use the static profiler collection mode | -| prof_in_shared_storage | bool | No | False | Whether shared storage is used (skips data gathering) | -| profiling_cmd | str | Yes | Training command:
    - `[resnet]`: runs the ResNet benchmark
    - `python train.py [args]`: custom training script
    - `bash run.sh [args]`: custom training script | A custom script must save its profiler data to {node_prof_save_dir} - -### Parameters Specific to start_all -| Parameter | Type | Required | Description | -|--------|------|------|------| -| host_ips | str | Yes* | Comma-separated list of node IPs | -| host_config_file | str | Yes* | Path to the SSH configuration file | - -*Note: one of host_ips and host_config_file must be provided - -### Parameters Specific to start_node -| Parameter | Type | Required | Description | -|--------|------|------|------| -| node_rank | int | Yes | Rank of the current node (0 to nnodes-1) | - -## 5. FAQ - -### 5.1 Common Issues in the Container Scenario - -1. **The container fails to start** -```bash -# Check the device mounts -ls -l /dev/davinci* - -# Check the logs -docker logs precheck_container -``` - -2. **Network connection problems** -```bash -# Check the network configuration -docker network inspect precheck_net - -# Test connectivity between containers -docker exec precheck_master ping precheck_worker -``` - -### 5.2 Common Issues in the Bare-Metal Scenario - -1. **SSH connection timeout** -```bash -# Increase the connection timeout -HOST_IPS="192.168.0.1,192.168.0.2" TIMEOUT=10 bash test_hosts_ssh.sh -``` - -2. **Inconsistent environments** -```bash -# Run the detailed environment check -HOST_IPS="192.168.0.1,192.168.0.2" CHECK_CANN=1 bash test_hosts_env.sh -``` - -3. **CANN environment problems** -```bash -# Check the CANN tools -npu-smi info - -# Check the environment variables -echo $LD_LIBRARY_PATH | grep Ascend -``` \ No newline at end of file diff --git a/profiler/msprof_analyze/precheck/__init__.py b/profiler/msprof_analyze/precheck/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/__main__.py b/profiler/msprof_analyze/precheck/__main__.py deleted file mode 100644 index deb0a713199c5629195ed16d54d1ae67c8df3d78..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/__main__.py +++ /dev/null @@ -1,98 +0,0 @@ -import os -from copy import deepcopy -import logging - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.common.logger import add_file_handler, create_logger -from msprof_analyze.precheck.common.utils import cn_now -from msprof_analyze.precheck.manager.args_manager import PrecheckArgsManager -from msprof_analyze.precheck.tools.ssh_utils import 
run_remote_command -from msprof_analyze.prof_common.path_manager import PathManager - - -def get_command_tpl(): - cwd = os.getcwd() - from msprof_analyze.precheck.runner.__main__ import get_conda_envs_info - _, conda_activate_cmd = get_conda_envs_info() - - EXECUTOR = f'source ~/.bashrc && {conda_activate_cmd} && cd {cwd} && {Constant.MS_PROF_PRECHECK_CMD} start_node' - ARGS = ('--nnodes={nnodes}', '--nproc_per_node={nproc_per_node}', - '--node_rank={node_rank}', '--master_addr={master_addr}', - '--master_port={master_port}', - '--nproc_per_node={nproc_per_node}', - '--node_prof_save_dir={node_prof_save_dir}', - '--master_prof_gather_dir={master_prof_gather_dir}', - '--task_name={task_name}', - '--profiling_cmd="{profiling_cmd}"', - '--output_dir={output_dir}', - ) - TPL = EXECUTOR + " " + " ".join(ARGS) - return TPL - - -def start_precheck(args: PrecheckArgsManager, logger): - config = dict( - nnodes=args.nnodes, - node_rank=-1, - nproc_per_node=args.nproc_per_node, - master_addr=args.master_addr, - master_port=args.master_port, - node_prof_save_dir=args.node_prof_save_dir, - master_prof_gather_dir=args.master_prof_gather_dir, - static=args.static, - task_name=args.task_name, - python_path=args.python_path, - output_dir=args.output_dir, - profiling_cmd=args.profiling_cmd, - prof_in_shared_storage=args.prof_in_shared_storage, - ) - - hosts_info = [] - for node_id, host in enumerate(args.host_ips): - node_config = deepcopy(config) - node_config['node_rank'] = node_id - - TPL = get_command_tpl() - cmd = TPL.format(**node_config) - if node_config.get('static', False) is True: - cmd += ' --static' - if node_config.get('prof_in_shared_storage', False) is True: - cmd += ' --prof_in_shared_storage' - - host_info = { - "host": host, - "username": os.getenv('USER'), - "key_filename": "~/.ssh/id_rsa", - "command": cmd, - "port": 22 - } - - if args.host_config_file: - host_info.update(args.ssh_remote_hosts[host]) - - hosts_info.append(host_info) - - logger.info("Starting 
remote command execution on %d hosts", len(hosts_info)) - run_remote_command(hosts_info) - logger.info("Precheck main processes have been started on all hosts") - - -def main(args=None): - logger = create_logger("profiler.precheck", Constant.LOGGING_LEVEL, use_memory_handler=True) - - PathManager.make_dir_safety(args.task_output_dir) - - timestamp = cn_now().strftime('%Y%m%d_%H%M%S') - log_filename = f'precheck_{timestamp}.log' - log_file_path = os.path.join(args.task_output_dir, log_filename) - PathManager.create_file_safety(log_file_path) - PathManager.check_path_writeable(log_file_path) - - logger = add_file_handler(logger, log_file_path) - logger.info("Starting precheck, Precheck log file will be saved at %s", log_file_path) - logger.info("Precheck arguments: %s", args) - - try: - start_precheck(args, logger) - except Exception as e: - logger.error("Precheck runner failed with error: %s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) diff --git a/profiler/msprof_analyze/precheck/analyze/__init__.py b/profiler/msprof_analyze/precheck/analyze/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/analyze/advisor_adaptor.py b/profiler/msprof_analyze/precheck/analyze/advisor_adaptor.py deleted file mode 100644 index 491969e804f2622c4077d9a2abb52b9915b38ca7..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/analyze/advisor_adaptor.py +++ /dev/null @@ -1,56 +0,0 @@ -import sys -import os -import logging -from pathlib import Path - -sys.path.append(os.path.join(os.path.dirname(os.path.dirname(__file__)), "compare_tools")) -sys.path.append(os.path.join(os.path.dirname(os.path.dirname(__file__)), "cluster_analyse")) - -from msprof_analyze.advisor.analyzer.analyzer_controller import AnalyzerController -from msprof_analyze.advisor.interface.interface import Interface -from msprof_analyze.prof_common.path_manager import 
PathManager - -logger = logging.getLogger(__name__) - - -class advisor_adaptor: - def __init__(self): - pass - - @staticmethod - def _check_profiling_path_valid(profiling_path): - PathManager.input_path_common_check(profiling_path) - PathManager.check_path_owner_consistent(profiling_path) - if not Path(profiling_path).exists(): - logger.error(" Invalid profiling path: %s", profiling_path) - return False - return True - - @staticmethod - def _check_output_path_valid(output_path): - if not output_path: - return False - - if not os.path.exists(output_path): - return PathManager.make_dir_safety(output_path) - - PathManager.check_input_directory_path(output_path) - PathManager.input_path_common_check(output_path) - PathManager.check_path_owner_consistent(output_path) - return True - - def analyze(self, input_profiling_path, output_path): - if self._check_profiling_path_valid(input_profiling_path) and self._check_output_path_valid(output_path): - try: - reduced_dimensions = Interface.all_dimension[:-1] # advisor runs all dimensions by default; this method does not need compare, so the list is trimmed - AnalyzerController().do_analysis(dimensions=reduced_dimensions, - profiling_path=input_profiling_path, - benchmark_profiling_path=None, - output_path=output_path, - ) - except RuntimeError as e: - logger.error("RuntimeError during analysis: %s", e) - except Exception as e: - logger.error("Unexpected error during analysis: %s", e) - else: - logger.error("Invalid paths provided; analysis aborted.") diff --git a/profiler/msprof_analyze/precheck/assert/code_structure_startall.svg b/profiler/msprof_analyze/precheck/assert/code_structure_startall.svg deleted file mode 100644 index 9502f093c35d4a05eef603a0c9e3089075d6ea5b..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/assert/code_structure_startall.svg +++ /dev/null @@ -1 +0,0 @@ -[SVG diagram; markup lost in extraction, text labels only] Layers: Launch Layer; Precheck Control Layer; Precheck Execution Layer; Data Collection & Analysis Layer. Participants: User; run_llama2_precheck.sh/run_precheck.sh; precheck_cli.py; precheck/__main__.py; SSH Runner; precheck_cli.py (start_node); runner/__main__.py; User Training Script; train_with_profiler.py; CollectorRunner; AdvisorRunner. Steps: Execute script; msprof-analyze precheck start_all; Configuration: 1. Node IPs 2. Master node settings 3. Distributed parameters 4. Output directories; start_precheck(); run_remote_command(); loop [for each host]: Execute on remote node; start_precheck_runner(); get_conda_envs_info(); Auto-detect conda/python env; alt [profiling_cmd == "[resnet]"]: Execute example model; Initialize profiler; Training loop: 1. Load model & dataset 2. Configure optimizer 3. Execute training steps 4. Collect metrics; Complete training; [profiling_cmd == custom command]: Prepare environment; Set distributed env vars: MASTER_ADDR, MASTER_PORT, NNODES, NODE_RANK, NPROC_PER_NODE; Execute via bash; Example: torchrun $DISTRIBUTED_ARGS \ pretrain_gpt.py \ $MODEL_ARGS \ $PROFILE_ARGS \ ...; Training complete; alt [not prof_in_shared_storage]: Package profiling data; zip_directory(): 1. Compress profiling data 2. Filter by whitelist patterns 3. Check archive size limits; transport(): 1. Transfer to master node 2. Handle node rank specific logic; Collection complete; alt [rank == 0]: Analyze collected data; run_analyzer(): 1. Extract archives 2. Process ascend_pt files 3. Generate reports; Analysis complete; Execution complete; Node complete; All nodes complete; Precheck complete; Command complete; Display completion \ No newline at end of file diff --git a/profiler/msprof_analyze/precheck/assert/code_structure_startnode_docker.svg b/profiler/msprof_analyze/precheck/assert/code_structure_startnode_docker.svg deleted file mode 100644 index a3bcca97fefddc8b3fa3123452c879a58d074e6c..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/assert/code_structure_startnode_docker.svg +++ /dev/null @@ -1 +0,0 @@ -[SVG diagram; markup lost in extraction, text labels only] Layers: Cloud Platform Layer; Launch Layer; Precheck Execution Layer; Data Collection & Analysis Layer. Participants: User; Cloud Platform; Docker Containers; run_node_precheck.sh; precheck_cli.py; runner/__main__.py; User Training Script; train_with_profiler.py; CollectorRunner; AdvisorRunner. Steps: Platform Configuration: 1. Upload Docker image 2. Configure cluster settings (nodes, NPUs per node) 3. Set training parameters (model, dataset, etc.); Container Deployment: Deploy containers across cluster nodes; Precheck Execution; loop [For each container in parallel]: Execute with env vars (MASTER_ADDR, NODES, NODE_RANK, etc.); msprof-analyze precheck start_node; Initialize precheck session; get_conda_envs_info(): 1. Detect conda environment 2. Get activation command 3. Setup environment vars; alt [profiling_cmd == "[resnet]"]: Execute example model; Initialize profiler; Training loop: 1. Load model & dataset 2. Configure optimizer 3. Execute training steps 4. Collect metrics; Complete training; [profiling_cmd == custom command]: Prepare environment; Set distributed env vars: MASTER_ADDR, MASTER_PORT, NNODES, NODE_RANK, NPROC_PER_NODE; Execute via bash; Example: torchrun $DISTRIBUTED_ARGS \ custom_training.py \ $MODEL_ARGS \ $PROFILE_ARGS \ ...; Initialize profiler; Training loop: 1. Load custom configuration 2. Setup distributed env 3. Execute training steps 4. Collect profiling data; Training complete; alt [not prof_in_shared_storage]: Package profiling data; zip_directory(): 1. Compress profiling data 2. Filter by whitelist patterns 3. Check archive size limits; transport(): 1. Transfer to master node 2. Handle node rank specific logic; Collection complete; alt [rank == 0]: Analyze collected data; run_analyzer(): 1. Extract archives 2. Process ascend_pt files 3. Generate reports; Analysis complete; Precheck complete; Command finished; Container task complete; All containers finished; Results: Return profiling results and analysis report \ No newline at end of file diff --git a/profiler/msprof_analyze/precheck/collect/__init__.py b/profiler/msprof_analyze/precheck/collect/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/collect/collector.py b/profiler/msprof_analyze/precheck/collect/collector.py deleted file mode 100644 index ca74b3e6769106ec23c8e789ca8eb170fbef600a..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/collect/collector.py +++ /dev/null @@ -1,458 +0,0 @@ -import sys -import os -from typing import Any, Dict - -sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) - -import logging -from pathlib import Path -import argparse -import time -import math - -import torch -import torch_npu -import torch.distributed as dist -import numpy as np -import torch.multiprocessing as mp - -from msprof_analyze.prof_common.path_manager import PathManager -from msprof_analyze.precheck.manager.group_manager import GroupManager, EnvGroup, SubGroup -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.common.time_stat import TimeStat -from msprof_analyze.precheck.common.utils import create_npu_event, event_elaspe_second, parition_sub_group_ranks, \ - get_master_rank_collect_dir, get_slave_rank_collect_dir, cat_files, is_equal_file_hash, 
get_quick_hash, \ - compress_directory -from msprof_analyze.precheck.manager.disk_manager import DiskManager - - -class Collector: - - def __init__(self): - self.stream = None - self.time_stat = None - self.world_size = None - self.device = None - self.local_rank = None - self.rank = None - self.logger = logging.getLogger(__name__) - - def init(self, slave_env: EnvGroup): - self.rank = slave_env.rank - self.local_rank = slave_env.local_rank - torch.npu.set_device(self.local_rank) - self.device = torch.device('npu:%d' % self.local_rank) - self.world_size = slave_env.world_size - self.time_stat = TimeStat() - self.stream = torch_npu.npu.current_stream() - - def gather_rank_data(self, group, gather_tensor, all_gather=False, dst_rank=None) -> tuple: - cur_group_size = dist.get_world_size(group) - self.logger.debug( - "[Rank %d] Local rank %d, gather data from %d ranks" % (self.rank, self.local_rank, cur_group_size)) - wait_event = create_npu_event(self.stream) - dist.barrier(group=group) - start_event = create_npu_event(self.stream) - wait_time = event_elaspe_second(self.stream, wait_event, start_event) - if all_gather: - gather_list = [] - for _ in range(cur_group_size): - gather_list.append(torch.zeros_like(gather_tensor, dtype=gather_tensor.dtype, device=self.device)) - dist.all_gather(gather_list, gather_tensor, group=group) - else: - if self.rank == dst_rank: - gather_list = [] - for _ in range(cur_group_size): - gather_list.append(torch.zeros_like(gather_tensor, dtype=gather_tensor.dtype, device=self.device)) - else: - gather_list = None - dist.gather(gather_tensor, gather_list=gather_list, dst=dst_rank, group=group) - end_event = create_npu_event(self.stream) - transfer_time = event_elaspe_second(self.stream, start_event, end_event) - - return gather_list, wait_time, transfer_time - - def create_sub_group(self, file_sizes_hash, master_rank_num): - # Partition the sub_group ranks according to file_sizes - file_sizes = [item[0] for item in file_sizes_hash[master_rank_num:]] - 
partitions = parition_sub_group_ranks(master_rank_num, file_sizes) - self.logger.debug("[Rank %d] subgroup partiitons %s" % (self.rank, partitions)) - - wait_time = 0 - transfer_time = 0 - for ranks in partitions: - if len(ranks) > 1: - wait_event = create_npu_event(self.stream) - dist.barrier() - start_event = create_npu_event(self.stream) - wait_time = event_elaspe_second(self.stream, wait_event, start_event) - sub_group = dist.new_group(ranks=ranks, backend='hccl') - end_event = create_npu_event(self.stream) - transfer_time = event_elaspe_second(self.stream, start_event, end_event) - - self.logger.info( - '[Rank %d] after new group, ranks: %s, file_sizes_hash %s' % (self.rank, ranks, file_sizes_hash)) - cur_file_sizes = [file_sizes_hash[r].cpu().tolist()[0] for r in ranks[1:]] - cur_file_hashes = [file_sizes_hash[r].cpu().tolist()[1:] for r in ranks[1:]] - - GroupManager().add_rank_sub_group(sub_group=sub_group, ranks=ranks, file_sizes=cur_file_sizes, - file_hashes=cur_file_hashes) - else: - self.logger.debug('[Rank %d] ranks %s not enough for creating subgroup' % (self.rank, ranks)) - self.time_stat.init_pg_stat.sub_group_init = [wait_time, transfer_time] - - def bd_split_file_size(self, sub_group, split_size=None): - split_size_bd = torch.tensor([split_size], dtype=torch.int64, device=self.device) \ - if self.rank == sub_group.master_rank else torch.zeros(1, dtype=torch.int64, device=self.device) - wait_event = create_npu_event(self.stream) - dist.barrier(group=sub_group.group) - start_event = create_npu_event(self.stream) - wait_time = event_elaspe_second(self.stream, wait_event, start_event) - self.logger.info("[Rank %d] after split size barrier" % self.rank) - dist.broadcast(split_size_bd, group=sub_group.group, src=sub_group.master_rank) - end_event = create_npu_event(self.stream) - transfer_time = event_elaspe_second(self.stream, start_event, end_event) - self.logger.info("[Rank %d] after split size bd, %s" % (self.rank, split_size_bd)) - - 
self.time_stat.com_stat.broad_splits = [wait_time, transfer_time] - return split_size_bd.cpu().item() - - def gather_file_split(self, sub_group, tensor, master_rank_num, output_file_dir=None): - for i in range(sub_group.max_splits): - # is master node - if self.rank < master_rank_num: - cur_tensor = torch.zeros(sub_group.split_file_size, dtype=torch.uint8, device=self.device) - else: - start_time = time.perf_counter() - cur_tensor = tensor[i * sub_group.split_file_size: (i + 1) * sub_group.split_file_size] - if len(cur_tensor) < sub_group.split_file_size: - cur_tensor = np.pad(cur_tensor, (0, sub_group.split_file_size - len(cur_tensor)), 'constant', - constant_values=0) - cur_tensor = torch.tensor(cur_tensor, dtype=torch.uint8, device=self.device) - end_time = time.perf_counter() - self.time_stat.disk_stat.read_input_file_splits.append(end_time - start_time) - - # gather rank data内部有barrier与计时 - file_tensor_list, wait_time, transfer_time = self.gather_rank_data(dst_rank=sub_group.master_rank, - group=sub_group.group, - gather_tensor=cur_tensor) - self.logger.debug("[Rank %d] gather file split %d, wait time: %f, gather time: %f seconds" % ( - self.rank, i, wait_time, transfer_time)) - self.time_stat.com_stat.gather_file_splits.append([wait_time, transfer_time]) - - # 记录从memory_on_chip刷到硬盘中的耗时 - if file_tensor_list: - master_rank_collect_dir = get_master_rank_collect_dir(output_file_dir, self.rank) - memory_on_chip_ram_times = [] - ram_disk_times = [] - for rank_i, rank in enumerate(sub_group.ranks): - if rank != sub_group.master_rank: - group_rank = rank - master_rank_num - rank_dir = get_slave_rank_collect_dir(master_rank_collect_dir, group_rank) - if not os.path.exists(rank_dir): - os.makedirs(rank_dir, exist_ok=True) - rank_file = os.path.join(rank_dir, 'split_%d' % i) - cur_split_size = sub_group.splits[rank_i - 1][i] - if cur_split_size > 0: - start_time = time.perf_counter() - data = file_tensor_list[rank_i][:cur_split_size].cpu().numpy().tobytes() - ram_time 
= time.perf_counter() - with open(rank_file, 'wb') as f: - f.write(data) - end_time = time.perf_counter() - memory_on_chip_ram_times.append(ram_time - start_time) - ram_disk_times.append(end_time - ram_time) - - self.time_stat.disk_stat.memory_on_chip.append(memory_on_chip_ram_times) - self.time_stat.disk_stat.ram_disk.append(ram_disk_times) - - for tensor in file_tensor_list: - del tensor - del file_tensor_list - torch.npu.empty_cache() - - def concat_file_split(self, output_file_dir: str, sub_group: SubGroup, master_rank_num): - cur_rank_collect_dir = get_master_rank_collect_dir(output_file_dir, self.rank) - concat_times = [] - verify_hash_times = [] - for rank_i, rank in enumerate(sub_group.ranks): - # 只提取slave rank的case - if rank == self.rank: - continue - group_rank = rank - master_rank_num - rank_dir = get_slave_rank_collect_dir(cur_rank_collect_dir, group_rank) - output_file_name = os.path.join(rank_dir, 'merge.zip') - file_split_names = [] - start_time = time.perf_counter() - with open(output_file_name, 'wb') as output_file: - for split_i in range(sub_group.max_splits): - file_split = os.path.join(rank_dir, 'split_%d' % split_i) - if not os.path.exists(file_split): - self.logger.error('[Rank %d] not exist file split %s' % (self.rank, file_split)) - else: - file_split_names.append(file_split) - cat_files(output_file_name, input_files=file_split_names) - for file_split in file_split_names: - os.remove(file_split) - - end_time = time.perf_counter() - concat_times.append(end_time - start_time) - self.logger.debug( - '[Rank %d] concatenate slave rank %s, time: %f seconds' % (self.rank, rank, end_time - start_time)) - - start_time = time.perf_counter() - output_file_hash = get_quick_hash(output_file_name) - self.logger.debug('[Rank %d] rank_i %d, file_hashs:%s' % (self.rank, rank_i, sub_group.file_hashes)) - if not is_equal_file_hash(output_file_hash, sub_group.file_hashes[rank_i - 1]): - self.logger.error('[Rank %d] Not equal merge file hash. %s. 
%s' % ( - self.rank, output_file_hash, sub_group.file_hashes[rank_i - 1])) - end_time = time.perf_counter() - verify_hash_times.append(end_time - start_time) - - self.time_stat.disk_stat.hash_output_file = verify_hash_times - self.time_stat.disk_stat.concat_file = concat_times - - def master_node_run(self, master_env: EnvGroup, output_file_dir, split_file_size=None): - try: - # 设置环境变量,这些会在torch.dist中用到 - # 因为master node rank为0, 所以global rank直接等于local rank - master_env.set_env() - self.init(master_env) - - start_event = create_npu_event(self.stream) - self.logger.info('[Rank %d] Start master node process' % self.rank) - torch.npu.set_device(self.device) - init_process_group_event = create_npu_event(self.stream) - elp_time = event_elaspe_second(self.stream, start_event, init_process_group_event) - self.logger.debug('[Rank %d] init process group time %f seconds' % (self.rank, elp_time)) - self.time_stat.init_pg_stat.global_group_init = elp_time - - self.logger.info("[Rank %d] master node run" % (self.rank)) - # Step 2. Gather tensor size from slave node. 
- gather_tensor = torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=torch.int64, device=self.device) - # 分为 (file_size, file_hash) - dist.init_process_group(backend='hccl', rank=self.rank, world_size=self.world_size) - if not (dist.is_available() and dist.is_initialized()): - raise RuntimeError("Distributed environment is not available") - - file_sizes_hash, wait_time, transfer_time = self.gather_rank_data(group=dist.group.WORLD, - gather_tensor=gather_tensor, - all_gather=True) - self.time_stat.com_stat.gather_file_size = [wait_time, transfer_time] - - self.logger.debug('[Rank %d] gather file size time %f seconds' % (self.rank, transfer_time)) - - # 判断硬盘空间是否足够,解压的过程中需要额外的空间存储临时文件与原压缩包 - file_sizes = [item[0] for item in file_sizes_hash[master_env.local_world_size:]] - total_file_size = sum(file_sizes) - - total_size_gb = Constant.UNZIP_DISK_SIZE_RAIO * total_file_size / (1024 * 1024 * 1024) - - self.logger.debug( - '[Rank %d] collect file sizes %s, total size %fgb' % (self.rank, file_sizes, total_file_size)) - DiskManager.check_disk_space(output_file_dir, total_size_gb) - - # Step 3. 
Broadcast the sub-group configuration and create the sub-groups - self.logger.info("[Rank %d] creating sub group %s" % (self.rank, file_sizes_hash)) - self.create_sub_group(file_sizes_hash, master_env.local_world_size) - sub_group = GroupManager().get_rank_sub_group(self.rank) - - # From here on, the logic is specific to each sub-group - if sub_group: - self.logger.info("[Rank %d] Subgroup ranks %s, file_sizes %s" % ( - self.rank, sub_group.ranks, sub_group.file_sizes)) - - # If split file size is not specified, compute it as memory_on_chip / rank_num - if not split_file_size: - if len(sub_group.ranks) > 0: - split_file_size = math.floor(Constant.MASTER_RANK_MEMORY_ON_CHIP / (len(sub_group.ranks))) - else: - self.logger.error("Value of sub_group.ranks is invalid, %d.", len(sub_group.ranks)) - self.bd_split_file_size(sub_group, split_file_size) - sub_group.split_size(split_file_size) - self.logger.info("[Rank %d] Subgroup split file size %s, splits %s" % ( - self.rank, sub_group.split_file_size, sub_group.splits)) - self.gather_file_split(sub_group=sub_group, tensor=None, master_rank_num=master_env.local_world_size, - output_file_dir=output_file_dir) - self.logger.debug("[Rank %d] start concat file split" % self.rank) - self.concat_file_split(output_file_dir, sub_group, master_env.local_world_size) - if len(sub_group.ranks) > 1: - self.logger.info(self.time_stat.to_string()) - else: - self.logger.info("[Rank %d] master rank not in sub group" % self.rank) - dist.barrier() - except Exception as e: - self.logger.error("%s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise e - finally: - dist.destroy_process_group() - - def slave_node_run(self, slave_env: EnvGroup, input_file_dir, master_rank_num): - try: - self.logger.debug('Enter slave node run wrapper') - # Set the environment variables that torch.dist relies on - slave_env.set_env() - self.init(slave_env) - torch.npu.set_device(self.device) - start_event = create_npu_event(self.stream) - init_process_group_event = create_npu_event(self.stream) - elp_time = event_elaspe_second(self.stream, start_event, init_process_group_event) - 
self.time_stat.init_pg_stat.global_group_init = elp_time - - self.logger.debug('[Rank %d] init process group time %f seconds' % (self.rank, elp_time)) - self.logger.info('[Rank %d] Start slave node process' % self.rank) - - # Step2. 先压缩文件,统计文件大小,再进入到gather逻辑里 - if os.path.isfile(input_file_dir): - file_path = input_file_dir - else: - PathManager.check_path_writeable(input_file_dir) - file_path = os.path.join(str(Path(input_file_dir).parent), 'compress.tar') - start_time = time.perf_counter() - compress_directory(input_file_dir, file_path) - end_time = time.perf_counter() - self.time_stat.disk_stat.compress_input_file = end_time - start_time - self.logger.info("[Rank %d] Compress directory time: %f seconds" % (self.rank, end_time - start_time)) - file_size = os.path.getsize(file_path) - start_time = time.perf_counter() - file_hash_chunks = get_quick_hash(file_path) - end_time = time.perf_counter() - self.time_stat.disk_stat.hash_input_file = end_time - start_time - self.logger.info("[Rank %d] Hash input file time: %f seconds" % (self.rank, end_time - start_time)) - file_hash_chunks.insert(0, file_size) - self.logger.info( - "[Rank %d] File hash chunks (first element is file size): %s" % (self.rank, file_hash_chunks)) - gather_tensor = torch.tensor(file_hash_chunks, dtype=torch.int64, device=self.device) - - dist.init_process_group(backend='hccl', rank=self.rank, world_size=self.world_size) - if not (dist.is_available() and dist.is_initialized()): - raise RuntimeError("Distributed environment is not available") - - file_sizes_hash, wait_time, transfer_time = self.gather_rank_data(group=dist.group.WORLD, - gather_tensor=gather_tensor, - all_gather=True) - self.time_stat.com_stat.gather_file_size = [wait_time, transfer_time] - self.logger.info("[Rank %d] Gather file size - wait time: %f seconds, transfer time: %f seconds" % ( - self.rank, wait_time, transfer_time)) - # Step3. 
Create the sub-groups - self.logger.debug("[Rank %d] creating sub group %s" % (self.rank, file_sizes_hash)) - self.create_sub_group(file_sizes_hash, master_rank_num) - sub_group = GroupManager().get_rank_sub_group(self.rank) - - # Logic specific to each sub-group - if sub_group: - # Step4. Broadcast the split size - self.logger.info("[Rank %d] Subgroup ranks %s, file_sizes %s" % ( - self.rank, sub_group.ranks, sub_group.file_sizes)) - split_file_size = self.bd_split_file_size(sub_group) - sub_group.split_size(split_file_size) - file_tensor = np.memmap(file_path, dtype=np.uint8, mode='r') - self.gather_file_split(sub_group=sub_group, tensor=file_tensor, master_rank_num=master_rank_num) - self.logger.info(self.time_stat.to_string()) - else: - self.logger.warning("[Rank %d] slave rank not in sub group" % (self.rank)) - dist.barrier() - except Exception as e: - self.logger.error("%s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise e - finally: - dist.destroy_process_group() - - def run(self, args_dict: Dict[str, Any]): - input_file_dir = args_dict.get("input_file_dir") - output_file_dir = args_dict.get("output_file_dir") - nnodes = args_dict.get("nnodes") - node_rank = args_dict.get("node_rank") - master_addr = args_dict.get("master_addr") - master_port = args_dict.get("master_port") - master_rank_num = args_dict.get("master_rank_num") - split_file_size = args_dict.get("split_file_size") - time_out = args_dict.get("time_out") - log_file = args_dict.get("log_file") - - logging.basicConfig( - filename=log_file, # File to write logs to - level=logging.DEBUG, # Minimum logging level to write to the file - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' # Log message format - ) - self.logger.info({"message": "Run method arguments", - "class": self.__class__.__name__, - "method": sys._getframe().f_code.co_name, - "args": args_dict}) - - # Calculate world size - world_size = nnodes + master_rank_num - 1 - # Master node logic - if node_rank == 0: - processes = [] - for i in 
range(master_rank_num): - master_env = EnvGroup(rank=i, local_rank=i, world_size=world_size, master_addr=master_addr, - master_port=master_port, group_rank=0, local_world_size=master_rank_num) - process = mp.Process(target=self.master_node_run, args=(master_env, output_file_dir, split_file_size)) - self.logger.info("Start master node subprocess %d." % i) - process.start() - processes.append(process) - start_time = time.perf_counter() - try: - while True: - all_done = all(not process.is_alive() for process in processes) - if all_done: - self.logger.info("All subprocesses finished successfully.") - break - elapsed_time = time.perf_counter() - start_time - time.sleep(5) - if elapsed_time > time_out: - raise TimeoutError("Timeout reached. Terminating all subprocesses.") - - except TimeoutError as e: - self.logger.error("%s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - for process in processes: - if process.is_alive(): - process.terminate() - process.join() - finally: - # Ensure all processes are cleaned up - for process in processes: - process.join() - # Slave node logic - else: - rank = node_rank + master_rank_num - 1 - slave_env = EnvGroup(rank=rank, local_rank=0, world_size=world_size, master_addr=master_addr, - master_port=master_port, group_rank=node_rank, local_world_size=1) - self.slave_node_run(slave_env, input_file_dir, master_rank_num) - - -if __name__ == "__main__": - parser = argparse.ArgumentParser() - parser.add_argument("--input_file_dir", type=str, help='input profiling data dir') - parser.add_argument("--output_file_dir", type=str, help='output profiling data dir') - parser.add_argument("--nnodes", type=int, help='the total node number') - parser.add_argument("--node_rank", type=int, help='node rank in the cluster') - parser.add_argument("--master_addr", type=str, help='master address') - parser.add_argument("--master_port", type=int, default=29501, help='master port') - parser.add_argument("--master_rank_num", type=int, default=8, help='master 
rank nums') - - parser.add_argument("--split_file_size", type=int, default=None, help='split file size') - - # Overall timeout for the master node - parser.add_argument("--time_out", type=int, default=Constant.DEFAULT_TIME_OUT, - help='total process time out in seconds') - parser.add_argument("--log_file", type=str, default=None, help='logging file') - args = parser.parse_args() - - logging.basicConfig( - filename=args.log_file, # File to write logs to - level=logging.DEBUG, # Minimum logging level to write to the file - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' # Log message format - ) - logger = logging.getLogger(__name__) - - collector = Collector() - logger.debug(vars(args)) - args_dict = vars(args) - - try: - collector.run(args_dict) - except Exception as e: - logger.error("%s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise e diff --git a/profiler/msprof_analyze/precheck/common/__init__.py b/profiler/msprof_analyze/precheck/common/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/common/constant.py b/profiler/msprof_analyze/precheck/common/constant.py deleted file mode 100644 index 1fc724e7524917c4c5abb21b83e26e573d0f956b..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/common/constant.py +++ /dev/null @@ -1,45 +0,0 @@ -import logging -import os -import stat -from datetime import timezone, timedelta - - -class Constant: - DEFAULT_SPLIT_FILE_SIZE = 15 * 1024 # 15 KB default split size, kept small to ease testing of multi-file splits - MASTER_RANK_MEMORY_ON_CHIP = 10 * 1024 * 1024 * 1024 # 10 GB of on-chip memory available for transferring data - UNZIP_DISK_SIZE_RAIO = 1.0 # decompression needs x times the archive size in disk space - DEFAULT_TIME_OUT = 1200 - - ARG_MAX_LEN = 255 # maximum argument length - ARG_MIN_INT_VALUE = - (1 << 31) # minimum 32-bit integer value - ARG_MAX_INT_VALUE = (1 << 31) - 1 # maximum 32-bit integer value - ARG_MIN_PORT_VALUE = 0 - ARG_MAX_PORT_VALUE = 65535 - - PROFILER_FILE_PATTERNS = [r'profiler_metadata\.json', 
r'profiler_info_\d{1,10}\.json', r'ASCEND_PROFILER_OUTPUT/.*'] - - COLLECTOR_MASTER_RANK_NUM = 4 - COLLECTOR_DEFAULT_TIMEOUT = 1200 # seconds - COLLECTOR_SPLIT_FILE_SIZE = None # split chunk size for file transfer; by default computed automatically from device memory - LOCALHOST_ADDRESSES = {'localhost', '127.0.0.1'} - - MAX_ARCHIVE_SIZE = 20 * 1024 * 1024 * 1024 # 20 GB - MAX_ARCHIVE_FILE_COUNT = 10000 - MAX_ARCHIVE_RATIO = 10 - - DEFAULT_PROFILING_COMMANDS = { - "[resnet]": "resnet", - } - - MS_PROF_PRECHECK_CMD = "msprof-analyze precheck" - - ENABLE_STACKTRACE_LOGGING = False - LOGGING_LEVEL = logging.INFO - - -class TimeConstant: - """Time related constants""" - UTC = timezone.utc - CHINA_OFFSET = timedelta(hours=8) - CHINA_TIMEZONE = timezone(CHINA_OFFSET, name='Asia/Shanghai') - MS_TO_S = 1 / 1000 # Milliseconds to seconds conversion factor diff --git a/profiler/msprof_analyze/precheck/common/logger.py b/profiler/msprof_analyze/precheck/common/logger.py deleted file mode 100644 index 04346a80343098491e7610610730f9ea52fded8a..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/common/logger.py +++ /dev/null @@ -1,103 +0,0 @@ -import logging -import logging.handlers - - -def create_logger(name: str, level: int = logging.DEBUG, use_memory_handler: bool = True) -> logging.Logger: - """ - Create a logger with optional memory handler for buffering logs. - - Args: - name: The name of the logger. Using the module name (__name__) is recommended. - level: The logging level, default is DEBUG. - use_memory_handler: Whether to add a memory handler for buffering logs, default is True. - - Returns: - A configured logger instance. 
- - Examples: - # Create a logger with memory handler - logger = create_logger("my_logger", logging.INFO, use_memory_handler=True) - - # Create a logger without memory handler - logger = create_logger("my_logger", logging.INFO, use_memory_handler=False) - - Notes: - When use_memory_handler is True, a memory handler is added to buffer logs until a specific log level - (default is ERROR) is reached, then logs are flushed to the target handler. This can avoid frequent - file writes and improve performance. Buffered logs can be manually flushed by calling logger.handlers[1].flush() - if no file handler is created yet. - - When use_memory_handler is False, no memory handler is added, and logs are written to the target handler - (e.g., console or file) in real-time. - """ - logger = logging.getLogger(name) - logger.handlers.clear() - - logger.setLevel(level) - logger.propagate = False - - console_handler = logging.StreamHandler() - console_handler.setLevel(level) - formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') - console_handler.setFormatter(formatter) - logger.addHandler(console_handler) - - if use_memory_handler: - memory_handler = logging.handlers.MemoryHandler(capacity=1000, flushLevel=logging.ERROR) - memory_handler.setLevel(level) - memory_handler.setFormatter(formatter) - logger.addHandler(memory_handler) - - return logger - - -def add_file_handler(logger: logging.Logger, log_file: str) -> logging.Logger: - """ - Add a file handler to an existing logger and handle the memory handler if present. - - Args: - logger: An existing logger instance. - log_file: The path to the log file. - - Returns: - The updated logger instance. 
- - Example: - # Initialize a logger - logger = create_logger("my_logger", logging.DEBUG, use_memory_handler=True) - - # Add a file handler to the logger - logger = add_file_handler(logger, "output.log") - - Notes: - This function adds a file handler to the given logger, inheriting the log level from the logger. - If a memory handler was previously added to the logger, its target handler is set to the new file handler, - buffered logs are flushed to the file, and then the memory handler is removed. - This ensures that both buffered logs and subsequent logs are written to the file after using the file handler. - """ - file_handler = logging.FileHandler(log_file) - file_handler.setLevel(logger.level) - formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') - file_handler.setFormatter(formatter) - logger.addHandler(file_handler) - - for handler in logger.handlers: - if isinstance(handler, logging.handlers.MemoryHandler): - handler.setTarget(file_handler) - handler.flush() - logger.removeHandler(handler) - - return logger - - -if __name__ == "__main__": - logger = create_logger("test_logger", logging.DEBUG, use_memory_handler=True) - logger.info("This is an info message from initial logger with memory handler") - - import tempfile - - with tempfile.NamedTemporaryFile(mode='w', delete=False) as temp_file: - temp_file_path = temp_file.name - add_file_handler(logger, temp_file_path) - logger.info("This is an info message from logger with file handler") - logger.info("The log file is {}".format(temp_file_path)) diff --git a/profiler/msprof_analyze/precheck/common/singleton.py b/profiler/msprof_analyze/precheck/common/singleton.py deleted file mode 100644 index b645f284d642d3ba84c6e0cde374d865f16b7105..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/common/singleton.py +++ /dev/null @@ -1,9 +0,0 @@ -def singleton(cls: any) -> any: - _instance = {} - - def _singleton(*args: any, **kw: any) -> any: - if cls not in 
_instance: - _instance[cls] = cls(*args, **kw) - return _instance.get(cls) - - return _singleton diff --git a/profiler/msprof_analyze/precheck/common/time_stat.py b/profiler/msprof_analyze/precheck/common/time_stat.py deleted file mode 100644 index 69df61cb797310adb4262c1261eec651770c08d8..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/common/time_stat.py +++ /dev/null @@ -1,74 +0,0 @@ -from dataclasses import dataclass, field -from typing import List - -@dataclass -class InitProcessGroupStat: - global_group_init: float = None - sub_group_init: List[float] = field(default_factory=list) #wait time, transfer time - def sum_transfer_time(self): - return self.global_group_init + self.sub_group_init[1] - - def to_str_list(self): - str_list = ['[InitPGStat]:'] - str_list.append(' global group init: %f seconds:' % self.global_group_init) - str_list.append(' sub group init: %f seconds:' % self.sub_group_init[1]) - return str_list - -@dataclass -class ComStat: - gather_file_size: List[float] = field(default_factory=list) - broad_splits: List[float] = field(default_factory=list) - gather_file_splits: List[List[float]] = field(default_factory=list) - def sum_transfer_time(self): - return self.gather_file_size[1] + self.broad_splits[1] - - def to_str_list(self): - str_list = ['[ComStat]:'] - str_list.append(' gather file size: %f seconds:' % self.gather_file_size[1]) - str_list.append(' broad splits: %f seconds:' % self.broad_splits[1]) - file_split_times = [t[1] for t in self.gather_file_splits] - str_list.append(' gather file splits: %s seconds:' % file_split_times) - return str_list - -@dataclass -class DiskStat: - memory_on_chip: List[List[float]] = field(default_factory=list) - ram_disk: List[List[float]] = field(default_factory=list) - - concat_file: List[float] = field(default_factory=list) - hash_output_file: List[float] = field(default_factory=list) - - read_input_file_splits: List[float] = field(default_factory=list) - 
hash_input_file: float = None - - def to_str_list(self): - str_list = ['[DiskStat]:'] - if len(self.memory_on_chip) > 0: - for memory_on_chip, ram_disk in zip(self.memory_on_chip, self.ram_disk): - str_list.append(' File Split: ') - str_list.append(' hdm_ram time: %s' % memory_on_chip) - str_list.append(' ram_disk time: %s' % ram_disk) - str_list.append(' concat file time for slave ranks: %s' % self.concat_file) - str_list.append(' verify file hash time for slave ranks: %s' % self.hash_output_file) - - #slave node - else: - str_list.append(' hash file time: %s' % self.hash_input_file) - str_list.append(' read file split times: %s' % self.read_input_file_splits) - - return str_list - - -@dataclass -class TimeStat: - init_pg_stat: InitProcessGroupStat = field(default_factory=InitProcessGroupStat) - com_stat: ComStat = field(default_factory=ComStat) - disk_stat: DiskStat = field(default_factory=DiskStat) - - #print it for logging, 应当区分master node rank与slave node。 - def to_string(self): - str_list = ['[TimeStat]:'] - str_list.extend(' %s' %s for s in self.init_pg_stat.to_str_list()) - str_list.extend(' %s' %s for s in self.com_stat.to_str_list()) - str_list.extend(' %s' %s for s in self.disk_stat.to_str_list()) - return '\n'.join(str_list) diff --git a/profiler/msprof_analyze/precheck/common/utils.py b/profiler/msprof_analyze/precheck/common/utils.py deleted file mode 100644 index 03f4e7d7ee31032067298351c25f426505cfc2ff..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/common/utils.py +++ /dev/null @@ -1,193 +0,0 @@ -import os -import sys -import hashlib -import subprocess -import logging -from datetime import datetime - -import torch_npu -from msprof_analyze.precheck.common.constant import TimeConstant -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -def get_file_md5(filepath, chunk_size=4096, split_hash_size=4): - PathManager.check_input_file_path(filepath) - 
PathManager.check_path_readable(filepath) - md5_hash = hashlib.md5() - with open(filepath, "rb") as file: - for chunk in iter(lambda: file.read(chunk_size), b""): - md5_hash.update(chunk) - hash_bytes = int(md5_hash.hexdigest(), 16).to_bytes(16, 'big') - - chunks = [] - for i in range(0, 16, split_hash_size): - chunks.append(int.from_bytes(hash_bytes[i:i + split_hash_size], 'big')) - return chunks - - -def get_quick_hash(file_path, sample_size=65536, hash_spilt_size=4): - PathManager.check_input_file_path(file_path) - PathManager.check_path_readable(file_path) - file_size = os.path.getsize(file_path) - if file_size < sample_size * 5: - return get_file_md5(file_path) - hash_md5 = hashlib.md5() - with open(file_path, "rb") as f: - hash_md5.update(f.read(sample_size)) - f.seek(max(0, (os.path.getsize(file_path) // 2) - (sample_size // 2))) - hash_md5.update(f.read(sample_size)) - f.seek(-sample_size, 2) - hash_md5.update(f.read(sample_size)) - hash_bytes = int(hash_md5.hexdigest(), 16).to_bytes(16, 'big') - - chunks = [] - for i in range(0, 16, hash_spilt_size): - chunks.append(int.from_bytes(hash_bytes[i:i + hash_spilt_size], 'big')) - return chunks - - -def is_equal_file_hash(chunks1, chunks2): - for chunk1, chunk2 in zip(chunks1, chunks2): - if chunk1 != chunk2: - return False - return True - - -def cat_files(output_file, input_files): - """ - Concatenate multiple binary input files into a single output file using cat command. 
- - Args: - output_file (str): Path to the output file - input_files (list): List of input file paths to concatenate - - Returns: - bool: True if concatenation was successful - - Raises: - subprocess.CalledProcessError: If the cat command fails - """ - PathManager.check_input_file_path(output_file) - cmd = ["cat"] + list(input_files) - - try: - with open(output_file, 'wb') as outfile: - result = subprocess.run(cmd, stdout=outfile, stderr=subprocess.PIPE) - - if result.returncode == 0: - return True - else: - logger.error("Error occurred during concatenation: %s", - result.stderr.decode('utf-8', errors='replace')) - raise subprocess.CalledProcessError(result.returncode, cmd, - output=None, - stderr=result.stderr) - - except OSError as e: - logger.error("OS error occurred during file operation: %s", str(e)) - raise - - -def compress_directory(src_dir, output_file): - PathManager.check_input_directory_path(src_dir) - PathManager.check_path_readable(src_dir) - if not os.path.isdir(src_dir): - raise FileNotFoundError(f"The directory '{src_dir}' does not exist.") - try: - result = subprocess.run( - ["/bin/tar", "-czf", output_file, "-C", src_dir, "."], - check=True, # Raise an error if the command fails - stdout=subprocess.PIPE, - stderr=subprocess.PIPE - ) - except subprocess.CalledProcessError as e: - raise RuntimeError( - f"Failed to compress directory '{src_dir}' into '{output_file}'. 
" - f"Error: {e.stderr.decode('utf-8')}" - ) from e - - -def get_master_rank_collect_dir(output_file_dir, master_rank_i): - return os.path.join(output_file_dir, 'rank_%d_collect' % master_rank_i) - - -def get_slave_rank_collect_dir(master_rank_collect_dir, group_rank): - return os.path.join(master_rank_collect_dir, 'node_%d' % group_rank) - - -def parition_sub_group_ranks(master_rank_num, file_sizes): - master_rank_num = int(master_rank_num) - indexed_lst = sorted(enumerate(file_sizes), key=lambda x: x[1]) - sorted_indices = [index + master_rank_num for index, value in indexed_lst] - if master_rank_num != 0: - base_size = len(file_sizes) // master_rank_num - else: - logging.error("%s value can not be 0", master_rank_num) - extra_items = len(file_sizes) % master_rank_num - partitions = [] - start = 0 - for i in range(master_rank_num): - end = start + base_size + (1 if i < extra_items else 0) - partition_indices = [i] - partition_indices.extend(sorted_indices[start:end]) - partitions.append(partition_indices) - start = end - return partitions - - -def get_split_file_size(memory_on_chip_size, sub_group_rank_num): - if sub_group_rank_num != 0: - return memory_on_chip_size // sub_group_rank_num - else: - logging.error("%s value can not be 0", sub_group_rank_num) - return None - - -def create_npu_event(stream): - event = torch_npu.npu.Event(enable_timing=True) - stream.record_event(event) - return event - - -def event_elaspe_second(stream, event1, event2): - stream.synchronize() - return event1.elapsed_time(event2) * TimeConstant.MS_TO_S - - -def cn_now() -> datetime: - """ - Get current time in China timezone as a formatted string. - - Returns: - datetime: Current time in China timezone - """ - return datetime.now(tz=TimeConstant.UTC).astimezone(TimeConstant.CHINA_TIMEZONE) - - -def check_file_owner_and_permission(file_path): - """ - Check if the file belongs to current user and only owner has write permission. 
- - Args: - file_path: Path to the file to check - - Raises: - RuntimeError: If file not found, not owned by current user, or has wrong permissions - """ - PathManager.check_path_readable(file_path) - - if not os.path.isfile(file_path): - raise RuntimeError(f"File not found at {file_path}") - - # Check file owner - if os.stat(file_path).st_uid != os.getuid(): - raise RuntimeError(f"File {file_path} is not owned by current user") - - # Check file permissions (only owner should have write permission) - current_mode = os.stat(file_path).st_mode - desired_mode = 0o700 # rwx------ (only owner has read/write/execute) - if (current_mode & 0o777) != desired_mode: - os.chmod(file_path, desired_mode) - logger.warning("File %s has wrong permissions, has been changed to %o", file_path, desired_mode) diff --git a/profiler/msprof_analyze/precheck/env_check/__init__.py b/profiler/msprof_analyze/precheck/env_check/__init__.py deleted file mode 100644 index b14094e3f9a77a0970342980ed8de1017f58ce19..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/env_check/__init__.py +++ /dev/null @@ -1,14 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
\ No newline at end of file diff --git a/profiler/msprof_analyze/precheck/env_check/check_item_factory.py b/profiler/msprof_analyze/precheck/env_check/check_item_factory.py deleted file mode 100644 index 0ea14bfe0d37768828291a1e9c71a1b890c7bd0d..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/env_check/check_item_factory.py +++ /dev/null @@ -1,57 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.precheck.env_check.environment_variable_check import EnvironmentVariableCheck -from msprof_analyze.precheck.env_check.python_library_check import PythonLibraryCheck -from msprof_analyze.precheck.env_check.cpu_check import CPUCheck -from msprof_analyze.precheck.env_check.npu_check import NPUCheck -from msprof_analyze.precheck.env_check.communication_check import CommunicationCheck -from msprof_analyze.precheck.env_check.io_check import IOCheck - - -HARDWARE_CHECK_LIST = [ - CPUCheck, - NPUCheck, - CommunicationCheck, - IOCheck, -] - -SOFTWARE_CHECK_LIST = [ - EnvironmentVariableCheck, - PythonLibraryCheck, -] - - -class CheckItemFactory: - CHECK_ITEMS = { - check_item.CHECK_TYPE: check_item - for check_item in SOFTWARE_CHECK_LIST + HARDWARE_CHECK_LIST - } - - @staticmethod - def get_check_item(check_type: str) -> list: - if check_type == "all": - return SOFTWARE_CHECK_LIST + HARDWARE_CHECK_LIST - if check_type == "software": - return SOFTWARE_CHECK_LIST - if check_type == "hardware": - return HARDWARE_CHECK_LIST - check_type_list = check_type.split("|") - check_items = [] - for check_type in check_type_list: - check_item = CheckItemFactory.CHECK_ITEMS.get(check_type) - if not check_item: - continue - check_items.append(check_item) - return check_items diff --git a/profiler/msprof_analyze/precheck/env_check/io_check.py b/profiler/msprof_analyze/precheck/env_check/io_check.py deleted file mode 100644 index 5cfd5c425f0d18d7021c8ef8dca7447c9df6dfc6..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/env_check/io_check.py +++ /dev/null @@ -1,25 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from msprof_analyze.precheck.env_check.environment_check import HardwareCheck - - -class IOCheck(HardwareCheck): - CHECK_TYPE = "io" - - def __init__(self, **kwargs): - super().__init__(**kwargs) - - def check(self): - pass diff --git a/profiler/msprof_analyze/precheck/env_check/npu_check.py b/profiler/msprof_analyze/precheck/env_check/npu_check.py deleted file mode 100644 index c7ffa4997da7a75f566461c70af22393e9b97fb1..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/env_check/npu_check.py +++ /dev/null @@ -1,25 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.precheck.env_check.environment_check import HardwareCheck - - -class NPUCheck(HardwareCheck): - CHECK_TYPE = "npu" - - def __init__(self, **kwargs): - super().__init__(**kwargs) - - def check(self): - pass diff --git a/profiler/msprof_analyze/precheck/env_check/python_library_check.py b/profiler/msprof_analyze/precheck/env_check/python_library_check.py deleted file mode 100644 index 81de7000ce7cdf37c1a6c52ff6d650df95d86d9b..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/env_check/python_library_check.py +++ /dev/null @@ -1,25 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.precheck.env_check.environment_check import SoftwareCheck - - -class PythonLibraryCheck(SoftwareCheck): - CHECK_TYPE = "python_lib" - - def __init__(self, **kwargs): - super().__init__(**kwargs) - - def check(self): - pass diff --git a/profiler/msprof_analyze/precheck/examples/__init__.py b/profiler/msprof_analyze/precheck/examples/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/examples/profiler/__init__.py b/profiler/msprof_analyze/precheck/examples/profiler/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/examples/profiler/dynamic_prof.py b/profiler/msprof_analyze/precheck/examples/profiler/dynamic_prof.py deleted file mode 100644 index f4b1e9b849b32380978b45661d18c03447ee6482..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/profiler/dynamic_prof.py +++ /dev/null @@ -1,71 +0,0 @@ -import json -import os -import logging -from copy import deepcopy - -logger = logging.getLogger(__name__) - -DEFAULT_DP_CONFIG = { - "activities": ["CPU", "NPU"], - "prof_dir": "./prof_result", - "analyse": False, - "record_shapes": False, - "profile_memory": False, - "with_stack": False, - "with_flops": False, - "with_modules": False, - "active": 1, - "is_rank": False, - "rank_list": [], - "experimental_config": { - "profiler_level": "Level0", - "aic_metrics": "AiCoreNone", - "l2_cache": False, - "op_attr": False, - "gc_detect_threshold": None, - "data_simplification": True, - "record_op_args": False, - "export_type": "text", - "msprof_tx": False - } -} - - -def _get_prof_config_json(prof_dp_path): - prof_config_json = os.path.join(prof_dp_path, "profiler_config.json") - return prof_config_json - - -def _set_default_prof_config(prof_config_json): - with open(prof_config_json, "w") 
as f: - json.dump(DEFAULT_DP_CONFIG, f, indent=4) - - -def get_dynamic_prof_config_path(): - cwd = os.path.dirname(os.path.realpath(__file__)) - prof_dp_path = os.path.join(cwd, './local_config/config_dynamic') - - prof_config_json = _get_prof_config_json(prof_dp_path) - os.makedirs(os.path.dirname(prof_config_json), exist_ok=True) - - if not os.path.exists(prof_config_json): - _set_default_prof_config(prof_config_json) - logger.info("Created default dynamic profiler config file at {}".format(prof_config_json)) - - return prof_dp_path - - -def start_dynamic_profiler(prof_dp_path, prof_save_dir): - prof_config_json = _get_prof_config_json(prof_dp_path) - if prof_save_dir is not None: - if not os.path.exists(prof_config_json): - data = deepcopy(DEFAULT_DP_CONFIG) - else: - with open(prof_config_json, 'r') as f: - data = json.load(f) - data['prof_dir'] = prof_save_dir - - with open(prof_config_json, 'w') as f: - json.dump(data, f, indent=4) - - logger.info('has started dynamic profiling') diff --git a/profiler/msprof_analyze/precheck/examples/profiler/models.py b/profiler/msprof_analyze/precheck/examples/profiler/models.py deleted file mode 100644 index 4a0f8cc0de62efcd92081a632fb9786188f05de3..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/profiler/models.py +++ /dev/null @@ -1,67 +0,0 @@ -import logging -from typing import Dict, Any, Tuple - -import torch -import torch.nn as nn -from torch.utils.data import Dataset - -logger = logging.getLogger(__name__) - - -# ============= Models ============= -class SimpleResNet(nn.Module): - def __init__(self, num_classes: int = 10): - super().__init__() - self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3) - self.bn1 = nn.BatchNorm2d(64) - self.relu = nn.ReLU(inplace=True) - self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) - self.fc = nn.Linear(64 * 56 * 56, num_classes) - - def forward(self, x: torch.Tensor) -> torch.Tensor: - x = self.conv1(x) - x = 
self.bn1(x) - x = self.relu(x) - x = self.maxpool(x) - x = torch.flatten(x, 1) - x = self.fc(x) - return x - - -# ============= Datasets ============= -class DummyImageDataset(Dataset): - def __init__(self, input_shape: Tuple[int, ...], num_samples: int = 100): - self.input_shape = input_shape - self.num_samples = num_samples - - def __len__(self) -> int: - return self.num_samples - - def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.Tensor]: - x = torch.randn(self.input_shape) - y = torch.randint(0, 10, ()) - return x, y - - -# ============= Example Registry ============= -class ExampleRegistry: - @staticmethod - def get_example_config(example_name: str) -> Dict[str, Any]: - configs = { - "resnet": { - "model_class": SimpleResNet, - "model_args": {"num_classes": 10}, - "dataset_class": DummyImageDataset, - "dataset_args": {"input_shape": (3, 224, 224), "num_samples": 800}, - "batch_size": 8, - }, - } - - if example_name not in configs: - available_models = ", ".join(configs.keys()) - raise ValueError( - f"Unknown example name: {example_name}. " - f"Available models are: {available_models}" - ) - - return configs[example_name] diff --git a/profiler/msprof_analyze/precheck/examples/profiler/train_with_profiler.py b/profiler/msprof_analyze/precheck/examples/profiler/train_with_profiler.py deleted file mode 100644 index 9e6eb482c4cc10e4f31026b36738654d305409f2..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/profiler/train_with_profiler.py +++ /dev/null @@ -1,286 +0,0 @@ -""" -Example Usage: -1. Single node training examples: -torchrun --nproc_per_node=8 \ - --nnodes=1 \ - --node_rank=0 \ - --master_addr="127.0.0.1" \ - --master_port=29500 \ - train_with_profiler.py \ - --example_name bert \ - --prof_output_dir ./profiler_output - -2. 
Distributed training examples: - - # Multiple nodes (2 nodes, 8 GPUs each) - # On node 0 (master node): - torchrun --nproc_per_node=8 \ - --nnodes=2 \ - --node_rank=0 \ - --master_addr="192.168.1.1" \ - --master_port=29500 \ - train_with_profiler.py \ - --example_name bert \ - --prof_output_dir ./profiler_output - - # On node 1: - torchrun --nproc_per_node=8 \ - --nnodes=2 \ - --node_rank=1 \ - --master_addr="192.168.1.1" \ - --master_port=29500 \ - train_with_profiler.py \ - --example_name bert \ - --prof_output_dir ./profiler_output - -Distributed Training Parameters: ---nproc_per_node: Number of processes per node (typically number of GPUs) ---nnodes: Total number of nodes ---node_rank: Rank of current node (0 to nnodes-1) ---master_addr: IP address of master node ---master_port: Port for master node communication - -Available Models: -- resnet: ResNet model implementation - -Environment Variables (automatically set by torchrun): -- RANK: Global rank of the process -- WORLD_SIZE: Total number of processes -- LOCAL_RANK: Local rank within the current node -- MASTER_ADDR: Master node address -- MASTER_PORT: Master node port -""" - -import os -import argparse -import ipaddress -import datetime -import logging -from typing import Optional, List - -import torch -import torch_npu -import torch.nn as nn -import torch.distributed as dist -from torch.utils.data import Dataset, DataLoader -from tqdm import tqdm - -try: - from torch_npu.profiler import dynamic_profile as dp -except ImportError: - dp = None - -from msprof_analyze.precheck.examples.profiler.models import ExampleRegistry -from msprof_analyze.precheck.examples.profiler.dynamic_prof import get_dynamic_prof_config_path -from msprof_analyze.precheck.common.constant import Constant - -logger = logging.getLogger(__name__) - - -class ProfilerCallback: - """Callback for handling profiling operations""" - - def __init__(self, prof_save_dir, - is_dynamic=False, dynamic_prof_path=None): - self.profiler = None - 
self.is_dynamic = is_dynamic - if is_dynamic: - self.dynamic_prof_path = dynamic_prof_path if dynamic_prof_path else get_dynamic_prof_config_path() - self.prof_save_dir = prof_save_dir - - def on_train_begin(self): - if self.is_dynamic: - dp.init(self.dynamic_prof_path) - dist.barrier() - if dist.get_rank() == 0: - from msprof_analyze.precheck.examples.profiler.dynamic_prof import start_dynamic_profiler - start_dynamic_profiler(self.dynamic_prof_path, - self.prof_save_dir) - self.profiler = dp - else: - experimental_config = torch_npu.profiler._ExperimentalConfig( - aic_metrics=torch_npu.profiler.AiCMetrics.PipeUtilization, - profiler_level=torch_npu.profiler.ProfilerLevel.Level2, - l2_cache=False, - data_simplification=False - ) - self.profiler = torch_npu.profiler.profile( - activities=[ - torch_npu.profiler.ProfilerActivity.CPU, - torch_npu.profiler.ProfilerActivity.NPU - ], - with_stack=True, - record_shapes=True, - profile_memory=True, - schedule=torch_npu.profiler.schedule( - wait=5, warmup=5, active=20, repeat=1, skip_first=10), - experimental_config=experimental_config, - with_flops=True, - with_modules=True, - on_trace_ready=torch_npu.profiler.tensorboard_trace_handler( - self.prof_save_dir) - ) - self.profiler.__enter__() - - def on_step_end(self): - if self.profiler: - self.profiler.step() - - def on_train_end(self): - if not self.is_dynamic and self.profiler: - self.profiler.__exit__(None, None, None) - - -class Trainer: - def __init__( - self, - model: nn.Module, - dataloader: Optional[Dataset] = None, - callbacks: Optional[List[ProfilerCallback]] = None, - criterion: Optional[nn.Module] = None, - optimizer: Optional[torch.optim.Optimizer] = None, - ): - self.model = model - self.dataloader = dataloader - self.callbacks = callbacks or [] - - # Setup loss and optimizer with defaults - self.criterion = criterion or nn.CrossEntropyLoss() - self.optimizer = optimizer or torch.optim.Adam(self.model.parameters()) - - # get dist config from env - self.rank = 
int(os.environ.get("RANK", 0)) - self.world_size = int(os.environ.get("WORLD_SIZE", 1)) - self.local_rank = int(os.environ.get("LOCAL_RANK", 0)) - self.device = f"npu:{self.local_rank}" - - # Setup device and distributed training - self.setup_distributed(self.rank, self.world_size, self.local_rank) - - # Move model and criterion to device - self.model = self.model.to(self.device) - self.criterion = self.criterion.to(self.device) - - @staticmethod - def setup_distributed(rank, world_size, local_rank): - if dist.is_initialized(): - return - - torch.npu.set_device(local_rank) - dist.init_process_group( - backend='hccl', - rank=rank, - world_size=world_size, - timeout=datetime.timedelta(seconds=1800) - ) - logger.info(f"[Rank {rank}] Initialized distributed training") - - def cleanup(self): - """Explicitly cleanup distributed training resources""" - if dist.is_initialized(): - dist.destroy_process_group() - logger.info(f"[Rank {self.rank}] Destroyed distributed training") - - def train(self, epoch: int = 1): - # Call training start callbacks - for callback in self.callbacks: - callback.on_train_begin() - - # Training loop - for epoch_idx in range(epoch): - if self.rank == 0: - pbar = tqdm( - total=len(self.dataloader), - desc=f'Epoch {epoch_idx + 1}/{epoch}', - unit='batch' - ) - - for step, (inputs, labels) in enumerate(self.dataloader): - # Move data to device - inputs = inputs.to(self.device) - labels = labels.to(self.device) - - # Forward pass - self.optimizer.zero_grad() - outputs = self.model(inputs) - loss = self.criterion(outputs, labels) - - # Backward pass - loss.backward() - self.optimizer.step() - - if self.rank == 0: - pbar.update(1) - pbar.set_postfix({ - 'step': f'{step + 1}/{len(self.dataloader)}', - 'loss': f'{loss.item():.4f}' - }) - - dist.barrier() - - # Call step end callbacks - for callback in self.callbacks: - callback.on_step_end() - - if self.rank == 0: - pbar.close() - - # Call training end callbacks - for callback in self.callbacks: - 
callback.on_train_end() - - -def main(): - parser = argparse.ArgumentParser() - parser.add_argument('--example_name', default='resnet', - choices=['resnet'], - help='Name of the example to run') - parser.add_argument('--prof_output_dir', required=True) - parser.add_argument('--static', action='store_true', required=False, default=False) - args = parser.parse_args() - - # Get example configuration - example_config = ExampleRegistry.get_example_config(args.example_name) - - # Create model and dataset - model = example_config["model_class"](**example_config["model_args"]) - dataset = example_config["dataset_class"](**example_config["dataset_args"]) - - # Create loss and optimizer (optional; the defaults also work) - criterion = nn.CrossEntropyLoss() - optimizer = torch.optim.Adam(model.parameters(), lr=0.001) - - # Create profiler callback - profiler_callback = ProfilerCallback( - args.prof_output_dir, - is_dynamic=(not args.static) - ) - - dataloader = DataLoader(dataset, batch_size=example_config["batch_size"]) - - # Initialize trainer - trainer = Trainer( - model=model, - dataloader=dataloader, - callbacks=[profiler_callback], - criterion=criterion, # optional - optimizer=optimizer, # optional - ) - - try: - trainer.train() - finally: - trainer.cleanup() - - -if __name__ == '__main__': - logging.basicConfig( - level=logging.INFO, - format='%(asctime)s - %(name)s - %(levelname)s - %(message)s' - ) - - try: - main() - except Exception as e: - logger.error(f"Unexpected error: {e}", exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise diff --git a/profiler/msprof_analyze/precheck/examples/scripts/precheck_run_llama2.sh b/profiler/msprof_analyze/precheck/examples/scripts/precheck_run_llama2.sh deleted file mode 100644 index e3bf0859e7565ecbea7857bb1601fc9e58812b57..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/scripts/precheck_run_llama2.sh +++ /dev/null @@ -1,128 +0,0 @@ -#!/bin/bash - -export CUDA_DEVICE_MAX_CONNECTIONS=1 -export
PYTORCH_NPU_ALLOC_CONF=expandable_segments:True - -GPUS_PER_NODE=${GPUS_PER_NODE:-8} -MASTER_ADDR=${MASTER_ADDR:-"192.168.0.1"} -MASTER_PORT=${MASTER_PORT:-6000} -NNODES=${NNODES:-2} -NODE_RANK=${NODE_RANK:-0} -WORLD_SIZE=$(($GPUS_PER_NODE*$NNODES)) - -CKPT_SAVE_DIR=${CKPT_SAVE_DIR:-"./ckpt/llama-2-7b"} -CKPT_LOAD_DIR=${CKPT_LOAD_DIR:-"./model_weights/llama-2-7b-legacy"} -TOKENIZER_MODEL=${TOKENIZER_MODEL:-"./model_from_hf/llama-2-7b-hf/tokenizer.model"} -DATA_PATH=${DATA_PATH:-"./dataset/enwiki_text_document"} - -TP=${TP:-2} -PP=${PP:-4} - -# Result directory -OUTPUT_DIR=${OUTPUT_DIR:-"./result/precheck/llama2-1129-2130"} - -PROF_NODE_RES_DIR="$OUTPUT_DIR/node_prof_save_dir" -LOG_FILE="$OUTPUT_DIR/precheck.log" - -# Check if profiling output directory exists before running training -# This prevents starting a long training job if the directory is missing -if [ ! -d "$OUTPUT_DIR" ]; then - echo "Error: Result directory $OUTPUT_DIR does not exist." \ - "Please create the directory before running training" \ - "(in ${BASH_SOURCE[0]})" >&2 - exit 1 -fi - -# Get the directory of the current script and cd into it -# SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" -# echo "Script directory: $SCRIPT_DIR" -# cd "$SCRIPT_DIR" -# echo "Changed working directory to: $(pwd)" - - -DISTRIBUTED_ARGS=" - --nproc_per_node $GPUS_PER_NODE \ - --nnodes $NNODES \ - --node_rank $NODE_RANK \ - --master_addr $MASTER_ADDR \ - --master_port $MASTER_PORT -" - -GPT_ARGS=" - --tensor-model-parallel-size ${TP} \ - --pipeline-model-parallel-size ${PP} \ - --sequence-parallel \ - --num-layers 32 \ - --hidden-size 4096 \ - --ffn-hidden-size 11008 \ - --num-attention-heads 32 \ - --tokenizer-type Llama2Tokenizer \ - --tokenizer-model ${TOKENIZER_MODEL} \ - --seq-length 4096 \ - --max-position-embeddings 4096 \ - --micro-batch-size 1 \ - --global-batch-size 256 \ - --make-vocab-size-divisible-by 1 \ - --lr 1.25e-6 \ - --train-iters 5 \ - --lr-decay-style cosine \ - 
--untie-embeddings-and-output-weights \ - --disable-bias-linear \ - --attention-dropout 0.0 \ - --init-method-std 0.01 \ - --hidden-dropout 0.0 \ - --position-embedding-type rope \ - --normalization RMSNorm \ - --use-fused-rmsnorm \ - --swiglu \ - --use-flash-attn \ - --no-masked-softmax-fusion \ - --attention-softmax-in-fp32 \ - --min-lr 1.25e-7 \ - --weight-decay 1e-1 \ - --lr-warmup-fraction 0.01 \ - --clip-grad 1.0 \ - --adam-beta1 0.9 \ - --initial-loss-scale 65536 \ - --adam-beta2 0.95 \ - --no-gradient-accumulation-fusion \ - --no-load-optim \ - --no-load-rng \ - --use-distributed-optimizer \ - --use-fused-swiglu \ - --use-fused-rotary-pos-emb \ - --overlap-grad-reduce \ - --bf16" - -DATA_ARGS=" \ - --data-path $DATA_PATH \ - --split 949,50,1" - -PROFILE_ARGS=" \ - --profile \ - --profile-step-start 2 \ - --profile-step-end 4 \ - --profile-ranks -1 \ - --profile-level level0 \ - --profile-with-cpu \ - --profile-save-path $PROF_NODE_RES_DIR" - -OUTPUT_ARGS=" \ - --log-interval 1 \ - --save-interval 10000 \ - --eval-interval 1000 \ - --eval-iters 0" - -# Add precheck arguments -# PRECHECK_ARGS=" \ -# --do_precheck" - -torchrun $DISTRIBUTED_ARGS pretrain_gpt.py \ - $GPT_ARGS \ - $DATA_ARGS \ - $OUTPUT_ARGS \ - $PROFILE_ARGS \ - --distributed-backend nccl \ - --load $CKPT_LOAD_DIR \ - --save $CKPT_SAVE_DIR \ - | tee $LOG_FILE diff --git a/profiler/msprof_analyze/precheck/examples/scripts/run_llama2_precheck.sh b/profiler/msprof_analyze/precheck/examples/scripts/run_llama2_precheck.sh deleted file mode 100644 index 495dab8ca6fdeaab6ca87df61a6be0d4d7830f6c..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/scripts/run_llama2_precheck.sh +++ /dev/null @@ -1,40 +0,0 @@ -#!/bin/bash - -# You should set the IP addresses of the nodes in the NODES_IP variable -# Change the IP addresses to the actual IP addresses of your nodes -NODES_IP="${NODES_IP:-192.168.0.1,192.168.0.2}" - -# Convert comma-separated NODES_IP to an array nodes_ip 
-IFS=',' read -r -a nodes_ip <<< "$NODES_IP" - - -echo "Starting distributed precheck with ${#nodes_ip[@]} nodes" -echo "Master node: ${nodes_ip[0]}" -echo "All nodes: ${nodes_ip[*]}" - -output_dir_base="./result/demo_precheck" - -# Add timestamp to task name -timestamp=$(date +"%Y%m%d_%H%M%S") -task_name="llama2-demo_${timestamp}" - -output_dir="${output_dir_base}/${task_name}" -node_prof_save_dir="${output_dir}/node_prof_save_dir" - -# Join array elements with commas -host_ips=$(IFS=,; echo "${nodes_ip[*]}") - -# Run precheck with distributed configuration -msprof-analyze precheck start_all \ - --host_ips "${host_ips}" \ - --master_addr "${nodes_ip[0]}" \ - --master_port 29500 \ - --nnodes ${#nodes_ip[@]} \ - --nproc_per_node 8 \ - --output_dir ${output_dir_base} \ - --task_name ${task_name} \ - --node_prof_save_dir ${node_prof_save_dir} \ - --profiling_cmd "OUTPUT_DIR=${output_dir} bash ./examples/scripts/precheck_run_llama2.sh" \ - --static - -echo "Precheck completed" diff --git a/profiler/msprof_analyze/precheck/examples/scripts/run_precheck.sh b/profiler/msprof_analyze/precheck/examples/scripts/run_precheck.sh deleted file mode 100644 index bf5b3b89cff5e945af557ed07c997174ac19a78b..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/scripts/run_precheck.sh +++ /dev/null @@ -1,37 +0,0 @@ -#!/bin/bash - -# You should set the IP addresses of the nodes in the NODES_IP variable -# Change the IP addresses to the actual IP addresses of your nodes -NODES_IP="${NODES_IP:-192.168.0.1,192.168.0.2}" - - -# Convert comma-separated NODES_IP to an array nodes_ip -IFS=',' read -r -a nodes_ip <<< "$NODES_IP" - -timestamp=$(date +"%Y%m%d_%H%M%S") -task_name="task_demo_${timestamp}" - -echo "Starting distributed precheck with ${#nodes_ip[@]} nodes" -echo "Master node: ${nodes_ip[0]}" -echo "All nodes: ${nodes_ip[@]}" - -output_dir=./output_test - -PROFILING_CMD="[resnet]" - -# Join array elements with commas -host_ips=$(IFS=,; echo 
"${nodes_ip[*]}") - -# Run precheck with distributed configuration -msprof-analyze precheck start_all \ - --host_ips "${host_ips}" \ - --master_addr ${nodes_ip[0]} \ - --master_port 29500 \ - --nnodes ${#nodes_ip[@]} \ - --nproc_per_node 8 \ - --output_dir "${output_dir}" \ - --task_name ${task_name} \ - --profiling_cmd "${PROFILING_CMD}" \ - --static - -echo "Precheck completed" diff --git a/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_env.sh b/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_env.sh deleted file mode 100644 index 68aa4b33ddce4cfaeee0b2b5e1008901b02809e8..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_env.sh +++ /dev/null @@ -1,166 +0,0 @@ -#!/bin/bash - -# 默认值设置 -HOST_IPS=${HOST_IPS:-""} -TIMEOUT=${TIMEOUT:-5} - -# ANSI 颜色代码 -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -RED='\033[0;31m' -BLUE='\033[0;34m' -NC='\033[0m' # No Color -BOLD='\033[1m' - -# 检查必需参数 -if [ -z "$HOST_IPS" ]; then - echo -e "${RED}Error: HOST_IPS environment variable is not set${NC}" - echo -e "Usage: ${BOLD}HOST_IPS='192.168.0.1,192.168.0.2' [CHECK_CANN=1] [TIMEOUT=5] bash $0${NC}" - exit 1 -fi - -# 获取CANN信息的函数 -get_cann_info() { - # 尝试多种方式获取CANN信息 - if command -v npu-smi &>/dev/null; then - npu_info=$(npu-smi info 2>/dev/null) - driver_version=$(echo "$npu_info" | grep "Driver Version" | awk -F':' '{print $2}' | tr -d ' ') - firmware_version=$(echo "$npu_info" | grep "Firmware Version" | awk -F':' '{print $2}' | tr -d ' ') - echo "Driver:$driver_version;Firmware:$firmware_version" - else - echo "NPU-SMI Not Found" - fi -} - -# 打印标题 -echo -e "\n${BOLD}🔍 Cluster Environment Checker${NC}" -echo -e "Usage: ${BOLD}HOST_IPS='192.168.0.1,192.168.0.2' [CHECK_CANN=1] [TIMEOUT=5] bash $0${NC}" -echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}" - -# 获取本机环境信息 -echo -e "\n${BOLD}📊 Step 1: Collecting local environment info...${NC}" -echo -e "${BLUE}Detecting Python environment...${NC}" 
-LOCAL_PYTHON_PATH=$(which python3) -LOCAL_PYTHON_VERSION=$($LOCAL_PYTHON_PATH -V 2>&1) -echo -e "${BLUE}Checking installed packages...${NC}" -LOCAL_MSPROF_VERSION=$($LOCAL_PYTHON_PATH -m pip show msprof-analyze | grep Version | awk '{print $2}') -LOCAL_TORCH_VERSION=$($LOCAL_PYTHON_PATH -m pip show torch | grep Version | awk '{print $2}') -LOCAL_TORCH_NPU_VERSION=$($LOCAL_PYTHON_PATH -m pip show torch_npu | grep Version | awk '{print $2}') - -echo -e "\n${BOLD}📌 Local Environment Summary:${NC}" -echo -e " • Python Path: ${GREEN}$LOCAL_PYTHON_PATH${NC}" -echo -e " • Python Version: ${GREEN}$LOCAL_PYTHON_VERSION${NC}" -echo -e " • Msprof-analyze: ${GREEN}v$LOCAL_MSPROF_VERSION${NC}" -echo -e " • Torch: ${GREEN}v$LOCAL_TORCH_VERSION${NC}" -echo -e " • Torch_NPU: ${GREEN}v$LOCAL_TORCH_NPU_VERSION${NC}" - -# Build the remote check command -CHECK_CMD=$(cat << EOF -echo "=== Python Path Check ===" && \ -test -f $LOCAL_PYTHON_PATH && \ -echo "=== Python Version ===" && \ -$LOCAL_PYTHON_PATH -V && \ -echo "=== Msprof-analyze Version ===" && \ -$LOCAL_PYTHON_PATH -m pip show msprof-analyze | grep Version | awk '{print \$2}' && \ -echo "=== Torch Version ===" && \ -$LOCAL_PYTHON_PATH -m pip show torch | grep Version | awk '{print \$2}' && \ -echo "=== Torch_NPU Version ===" && \ -$LOCAL_PYTHON_PATH -m pip show torch_npu | grep Version | awk '{print \$2}' && \ -echo "=== TMUX Check ===" && \ -which tmux -EOF -) - -# Check each remote host -echo -e "\n${BOLD}🔄 Step 2: Checking cluster nodes...${NC}" -IFS=',' read -ra HOSTS <<< "$HOST_IPS" -total_hosts=${#HOSTS[@]} -current_host=0 -failed_hosts=() - -for host in "${HOSTS[@]}"; do - ((current_host++)) - echo -e "\n${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}" - echo -e "${BOLD}📡 Checking host [$current_host/$total_hosts]: ${YELLOW}$host${NC}" - - # Check the SSH connection - echo -e " ⏳ Testing SSH connection..." - if !
ssh -o BatchMode=yes -o ConnectTimeout=$TIMEOUT $host "exit 0" &>/dev/null; then - echo -e " ${RED}❌ SSH connection failed${NC}" - failed_hosts+=("$host [SSH Failed]") - continue - fi - echo -e " ${GREEN}✓ SSH connection successful${NC}" - - # Check the Python interpreter - echo -e " ⏳ Verifying Python interpreter..." - if ! ssh -o BatchMode=yes -o ConnectTimeout=$TIMEOUT $host "test -f $LOCAL_PYTHON_PATH" &>/dev/null; then - echo -e " ${RED}❌ Python interpreter not found at: $LOCAL_PYTHON_PATH${NC}" - failed_hosts+=("$host [Python Not Found]") - continue - fi - echo -e " ${GREEN}✓ Python interpreter verified${NC}" - - # Check the environment - echo -e " ⏳ Checking environment..." - remote_output=$(ssh -o BatchMode=yes -o ConnectTimeout=$TIMEOUT $host "$CHECK_CMD" 2>&1) - if [ $? -ne 0 ]; then - echo -e " ${RED}❌ Environment check failed${NC}" - echo -e " Error details: $remote_output" - failed_hosts+=("$host [Check Failed]") - continue - fi - - # Parse the remote output - remote_python_version=$(echo "$remote_output" | awk '/=== Python Version ===/{getline; print}') - remote_msprof_version=$(echo "$remote_output" | awk '/=== Msprof-analyze Version ===/{getline; print}') - remote_torch_version=$(echo "$remote_output" | awk '/=== Torch Version ===/{getline; print}') - remote_torch_npu_version=$(echo "$remote_output" | awk '/=== Torch_NPU Version ===/{getline; print}') - remote_tmux_path=$(echo "$remote_output" | awk '/=== TMUX Check ===/{getline; print}') - - # Check the results - errors=() - - [ "$remote_python_version" != "$LOCAL_PYTHON_VERSION" ] && \ - errors+=("Python version mismatch: Local=$LOCAL_PYTHON_VERSION Remote=$remote_python_version") - - [ "$remote_msprof_version" != "$LOCAL_MSPROF_VERSION" ] && \ - errors+=("Msprof version mismatch: Local=$LOCAL_MSPROF_VERSION Remote=$remote_msprof_version") - - [ "$remote_torch_version" != "$LOCAL_TORCH_VERSION" ] && \ - errors+=("Torch version mismatch: Local=$LOCAL_TORCH_VERSION Remote=$remote_torch_version") - - [ "$remote_torch_npu_version" != "$LOCAL_TORCH_NPU_VERSION" ]
&& \ - errors+=("Torch_NPU version mismatch: Local=$LOCAL_TORCH_NPU_VERSION Remote=$remote_torch_npu_version") - - [ -z "$remote_tmux_path" ] && \ - errors+=("TMUX not found") - - if [ ${#errors[@]} -eq 0 ]; then - echo -e " ${GREEN}✓ All environment checks passed${NC}" - else - echo -e " ${RED}❌ Environment check failed:${NC}" - for error in "${errors[@]}"; do - echo -e " • ${RED}$error${NC}" - done - failed_hosts+=("$host [Version Mismatch]") - fi -done - -# 总结报告 -echo -e "\n${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}" -echo -e "${BOLD}📋 Final Report${NC}" -if [ ${#failed_hosts[@]} -eq 0 ]; then - echo -e "${GREEN}✅ All $total_hosts hosts passed environment checks!${NC}" - exit 0 -else - echo -e "${RED}❌ Environment check failed for ${#failed_hosts[@]} out of $total_hosts hosts:${NC}" - for failed_host in "${failed_hosts[@]}"; do - echo -e " • ${RED}$failed_host${NC}" - done - echo -e "\n${YELLOW}💡 Tips:${NC}" - echo -e " • Ensure all hosts have the same Python environment" - echo -e " • Check if tmux is installed: ${BOLD}sudo apt-get install tmux${NC}" - echo -e " • Verify SSH connectivity: ${BOLD}ssh-copy-id user@host${NC}" - exit 1 -fi diff --git a/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_ssh.sh b/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_ssh.sh deleted file mode 100644 index 7489bb601ceaff09acf43aadd2b36e40a6682fb5..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/examples/scripts/test_hosts_ssh.sh +++ /dev/null @@ -1,61 +0,0 @@ -### SSH 连通性测试 -# 保存为 test_hosts_ssh.sh -#!/bin/bash - -# 默认值设置 -HOST_IPS=${HOST_IPS:-""} -TIMEOUT=${TIMEOUT:-5} - -# 检查必需参数 -if [ -z "$HOST_IPS" ]; then - echo "Error: HOST_IPS environment variable is not set" - echo "Usage: HOST_IPS='192.168.0.1,192.168.0.2' TIMEOUT=5 bash $0" - exit 1 -fi - -echo "Testing SSH connections with timeout ${TIMEOUT}s..." 
-echo "Host list: $HOST_IPS" -echo "-----------------------------------" - -# 测试每个主机的SSH连接 -failed_hosts=() -IFS=',' read -ra HOSTS <<< "$HOST_IPS" -for host in "${HOSTS[@]}"; do - echo -n "Testing SSH connection to $host... " - if ssh -o BatchMode=yes -o ConnectTimeout=$TIMEOUT $host "exit 0" &> /dev/null; then - echo "Success ✓" - else - echo "Failed ✗" - failed_hosts+=($host) - fi -done - -# 如果有失败的主机,输出设置建议 -if [ ${#failed_hosts[@]} -ne 0 ]; then - echo -e "\n❌ Some hosts are not accessible via SSH" - echo "Please run these commands to set up passwordless SSH:" - echo "-----------------------------------" - for host in "${failed_hosts[@]}"; do - echo "# 1. If ~/.ssh/id_rsa doesn't exist, generate it" - echo "[ ! -f ~/.ssh/id_rsa ] && ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa" - echo "" - echo "# 2. Copy your key to remote host" - echo "ssh-copy-id $USER@$host" - echo "" - echo "# 3. Set correct permissions" - echo "chmod 600 ~/.ssh/id_rsa" - echo "-----------------------------------" - done - exit 1 -else - echo -e "\n✅ All SSH connections successful!" 
-fi - -# 使用方法: -# ```bash -# # 方式1:直接运行(使用默认超时时间5秒) -# HOST_IPS="192.168.0.1,192.168.0.2" bash test_hosts_ssh.sh - -# # 方式2:指定超时时间 -# HOST_IPS="192.168.0.1,192.168.0.2" TIMEOUT=3 bash test_hosts_ssh.sh -# ``` diff --git a/profiler/msprof_analyze/precheck/manager/__init__.py b/profiler/msprof_analyze/precheck/manager/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/manager/args_manager.py b/profiler/msprof_analyze/precheck/manager/args_manager.py deleted file mode 100644 index 252a51bae5257e750541fde452db69e9e88eb8bb..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/manager/args_manager.py +++ /dev/null @@ -1,446 +0,0 @@ -import argparse -import ipaddress -import os -import re -import shlex -import shutil -import sys -import logging -from typing import List, Union -from collections import OrderedDict - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.common.utils import cn_now -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -class BaseArgsManager: - def __init__(self, args): - self._args = args - - def __repr__(self): - return str(self.to_dict()) - - @property - def master_addr(self): - return self._args.master_addr - - @property - def master_port(self): - return self._args.master_port - - @property - def nnodes(self): - return self._args.nnodes - - @property - def nproc_per_node(self): - return self._args.nproc_per_node - - @property - def node_prof_save_dir(self): - return self._args.node_prof_save_dir or os.path.join(self.task_output_dir, 'node_prof_save_dir') - - @property - def master_prof_gather_dir(self): - return self._args.master_prof_gather_dir or os.path.join(self.task_output_dir, 'master_prof_gather_dir') - - @property - def output_dir(self): - return self._args.output_dir - - @property - def 
task_name(self): - if self._args.task_name: - return self._args.task_name - return "task_" + cn_now().strftime("%Y%m%d-%H%M%S") - - @property - def static(self): - return self._args.static - - @property - def task_output_dir(self): - return os.path.join(self.output_dir, self.task_name) - - @property - def profiling_cmd(self): - return self._args.profiling_cmd - - @property - def prof_in_shared_storage(self): - return getattr(self._args, 'prof_in_shared_storage', False) - - @staticmethod - def escape_special_chars(text): - ESCAPE_CHARS_MAP = { - '\n': '\\n', - '\t': '\\t', - '\r': '\\r', - '\\': '\\\\', - '\"': '\\\"', - '\'': '\\\'' - } - return re.sub(r'([\n\t\r\\\'"])', lambda match: ESCAPE_CHARS_MAP[match.group()], text) - - @staticmethod - def _check_output_path_valid(output_path: str) -> Union[Exception, None]: - try: - if not os.path.exists(output_path): - PathManager.check_input_directory_path(output_path) - else: - PathManager.check_input_directory_path(output_path) - PathManager.check_path_owner_consistent(output_path) - except Exception as e: - return e - return None - - @staticmethod - def _check_ip_valid(ip: str) -> Union[Exception, None]: - try: - ipaddress.ip_address(ip) - except ValueError as e: - return e - return None - - @staticmethod - def _check_int_range( - value: int, min_value: int = Constant.ARG_MIN_INT_VALUE, max_value: int = Constant.ARG_MAX_INT_VALUE - ) -> Union[Exception, None]: - if not (min_value <= value <= max_value): - return ValueError(f"The value must be between {min_value} and {max_value}.") - return None - - @staticmethod - def _check_executable_path_valid(executable_path: str) -> Union[Exception, None]: - try: - PathManager.check_path_owner_consistent(executable_path) - if not os.path.isfile(executable_path): - raise ValueError("The path is not a valid executable file.") - if not os.access(executable_path, os.X_OK): - raise ValueError("The file at the path is not executable.") - except Exception as e: - return e - return None 
- - @staticmethod - def _check_identifier_valid(identifier: str) -> Union[Exception, None]: - pattern = r'^[a-zA-Z_][a-zA-Z0-9_-]*$' - if not re.match(pattern, identifier): - return ValueError(f"It must start with a letter or underscore, " - f"followed by any number of letters, digits, underscores, or dashes.") - return None - - @staticmethod - def _check_command_injection(cmd: str) -> Union[Exception, None]: - dangerous_chars = [';', '&&', '||', '|', '>', '<', '`', '$', '\\'] - for char in dangerous_chars: - if char in cmd: - return ValueError( - f"Command contains dangerous character '{char}'. " - "Command injection is not allowed." - ) - return None - - @staticmethod - def _check_dangerous_commands(cmd: str) -> Union[Exception, None]: - dangerous_commands = [ - 'rm', 'mv', 'cp', 'chmod', 'chown', 'dd', - 'mkfs', 'mount', 'umount', 'sudo', 'su', - 'reboot', 'shutdown', 'poweroff', 'init', - 'passwd', 'adduser', 'deluser', 'useradd', - 'userdel', 'groupadd', 'groupdel' - ] - - cmd_parts = shlex.split(cmd) - if not cmd_parts: - return ValueError("Empty command is not allowed") - - base_cmd = os.path.basename(cmd_parts[0]) - if base_cmd in dangerous_commands: - return ValueError( - f"Command '{base_cmd}' is not allowed for security reasons" - ) - return None - - @classmethod - def safe_format(cls, format_str: str, *args, max_len=Constant.ARG_MAX_LEN): - """ - Safely formats a string by truncating arguments longer than a specified maximum length and escaping special characters. - - This function is designed to create user-friendly error messages by ensuring that all arguments are displayed in a safe and concise manner. - It truncates any argument that exceeds the maximum length and appends an ellipsis to indicate the truncation. - Additionally, it escapes special characters in the arguments to prevent formatting errors or injection issues. - - Args: - format_str (str): The format string into which the arguments are inserted. 
- *args: Variable length argument list to be formatted into the format_str. - max_len (int): The maximum allowed length of any argument string after which it will be truncated. - Defaults to Constant.MAX_ARG_LEN. - - Returns: - str: A formatted string with all arguments safely inserted. - """ - - def _str(x): - x_str = str(x) - if len(x_str) > max_len: - x_str = x_str[:max_len] + "..." - return cls.escape_special_chars(x_str) - - args = [_str(arg) for arg in args] - return format_str.format(*args) - - @classmethod - def raise_error(cls, error_format_msg, *args): - """ - Raises a RuntimeError with a formatted message that includes special character escaping and length limitation. - - This method is designed to handle untrusted external parameters `*args` by ensuring that the error message is user-friendly. - It applies special character escaping and truncates arguments to a predefined maximum length to prevent formatting errors or injection issues. - - Args: - error_format_msg (str): The format string into which the arguments are inserted. - *args: Variable length argument list to be formatted into the error_format_msg. 
- """ - err_msg = cls.safe_format(error_format_msg, *args) - raise RuntimeError(err_msg) - - def to_dict(self): - """Automatically convert all properties to a dictionary.""" - properties_dict = {} - for prop in dir(self): - if isinstance(getattr(type(self), prop, None), property): - properties_dict[prop] = getattr(self, prop) - return properties_dict - - def check_args(self): - - error = self._check_ip_valid(self.master_addr) - if error: - self.raise_error('Master address {} is not valid: {}', self.master_addr, error) - - error = self._check_int_range(self.master_port, - min_value=Constant.ARG_MIN_PORT_VALUE, max_value=Constant.ARG_MAX_PORT_VALUE) - if error: - self.raise_error('Master port {} is not valid: {}', self.master_port, error) - - error = self._check_int_range(self.nnodes, min_value=1) - if error: - self.raise_error('Total number of nodes {} is not valid: {}', self.nnodes, error) - - error = self._check_int_range(self.nproc_per_node, min_value=1) - if error: - self.raise_error('Number of processes per node {} is not valid: {}', self.nproc_per_node, error) - - error = self._check_output_path_valid(self.output_dir) - if error: - self.raise_error('Output directory {} is not valid: {}', self.output_dir, error) - - error = self._check_identifier_valid(self.task_name) - if error: - self.raise_error('Task name {} is not valid: {}', self.task_name, error) - - error = self._check_output_path_valid(self.node_prof_save_dir) - if error: - self.raise_error('Node prof save directory {} is not valid: {}', self.node_prof_save_dir, error) - - error = self._check_output_path_valid(self.master_prof_gather_dir) - if error: - self.raise_error('Master prof gather directory {} is not valid: {}', self.master_prof_gather_dir, error) - - self._check_profiling_cmd_valid(self.profiling_cmd) - - def _check_profiling_cmd_valid(self, profiling_cmd: str) -> None: - if not profiling_cmd.strip(): - logger.error('Profiling command should not be empty.') - - if profiling_cmd in 
Constant.DEFAULT_PROFILING_COMMANDS: - logger.info(self.safe_format('Using default profiling command for {}', profiling_cmd)) - return - - if len(self.profiling_cmd) > Constant.ARG_MAX_LEN: - self.raise_error( - 'The profiling command is too long, it must be less than {} characters', Constant.ARG_MAX_LEN) - - error = self._check_command_injection(self.profiling_cmd) - if error: - self.raise_error('Profiling command {} is not valid: {}', self.profiling_cmd, error) - - error = self._check_dangerous_commands(self.profiling_cmd) - if error: - self.raise_error('Profiling command {} is not valid: {}', self.profiling_cmd, error) - - -class PrecheckArgsManager(BaseArgsManager): - def __init__(self, args): - super().__init__(args) - - self._args = args - self._ssh_remote_hosts = {} - self._host_ips = [] - - self.check_args() - - @property - def host_ips(self): - return self._host_ips - - @property - def host_config_file(self): - return self._args.host_config_file - - @property - def ssh_remote_hosts(self): - return self._ssh_remote_hosts - - @property - def python_path(self): - if not self._args.python_path: - return sys.executable - - if os.path.exists(self._args.python_path): - return self._args.python_path - - python_path = shutil.which(self._args.python_path) - return python_path - - @classmethod - def _check_host_ips_valid(cls, host_ips: List[str]) -> Union[Exception, None]: - if not host_ips: - return None - - for i, ip in enumerate(host_ips): - if not ipaddress.ip_address(ip): - return ValueError(f"The {i}-th host ip is not valid.") - - if len(host_ips) != len(set(host_ips)): - return ValueError("Host IPs must be unique.") - - return None - - def try_to_parse_host_config_file(self, host_config_file: str) -> Union[Exception, OrderedDict]: - if not host_config_file: - logger.info("SSH config file is not provided.") - logger.info("Use default ssh settings for all nodes: ssh_key_file, user, port = ~/.ssh/id_rsa, $USER, 22") - return {} - - if not 
os.path.isfile(host_config_file): - return FileNotFoundError(f"SSH config file {host_config_file} does not exist.") - - PathManager.check_path_readable(host_config_file) - PathManager.check_file_size(host_config_file) - - ssh_remote_hosts = [] - required_fields = ['host_ip', 'ssh_key_file', 'user', 'port'] - with open(host_config_file, 'r') as f: - header = f.readline().strip().split(',') - if any(field not in header for field in required_fields): - return ValueError(f"Host config file {host_config_file} is missing required fields: {required_fields}") - - for line in f: - values = line.strip().split(',') - if len(values) != len(required_fields): - return ValueError( - f"Host config file {host_config_file} has invalid number of fields in line: {line}") - - host_ip, ssh_key_file, user, port = values - ssh_key_file = PathManager.expanduser_for_argumentparser(ssh_key_file) - port = int(port) - - exception = None - try: - PathManager.check_path_readable(ssh_key_file) - if os.stat(ssh_key_file).st_mode & 0o777 != 0o600: - raise ValueError(f"SSH key file {ssh_key_file} must have permissions set to 600") - - exception = self._check_int_range(port, min_value=Constant.ARG_MIN_PORT_VALUE, - max_value=Constant.ARG_MAX_PORT_VALUE) \ - or self._check_identifier_valid(user) \ - or self._check_ip_valid(host_ip) - - except Exception as e: - exception = e - - if exception: - return RuntimeError( - f"Host config file {host_config_file} is not valid, invalid line: {line}, error: {exception}") - - ssh_remote_hosts.append({ - 'host': host_ip, - 'username': user, - 'key_filename': ssh_key_file, - 'port': int(port) - }) - - ssh_remote_hosts = OrderedDict({item['host']: item for item in ssh_remote_hosts}) - return ssh_remote_hosts - - def check_args(self): - super().check_args() - - error = self._check_executable_path_valid(self.python_path) - if error: - self.raise_error('Python path {} is not valid: {}', self.python_path, error) - - # Ensure either host_ips or host_config_file is 
provided - if not self.host_config_file and not self._args.host_ips: - self.raise_error('Either host config file or host ips must be provided') - - # If host_ips is provided, validate it first - if self._args.host_ips: - error = self._check_host_ips_valid(self._args.host_ips) - if error: - self.raise_error('Host ips {} is not valid: {}', self._args.host_ips, error) - - # Set the validated host_ips - self._host_ips = self._args.host_ips - - # If config file is provided, parse and validate it - if self.host_config_file: - res = self.try_to_parse_host_config_file(self.host_config_file) - if isinstance(res, Exception): - self.raise_error('Host config file {} is not valid: {}', self.host_config_file, res) - self._ssh_remote_hosts = res - config_file_ips = list(self._ssh_remote_hosts.keys()) - - # If host_ips is also provided, verify they match - if self._args.host_ips: - if not set(self._args.host_ips) == set(config_file_ips): - self.raise_error('Host ips does not match the IPs in host config file. 
Given: {}, In file: {}', - self._args.host_ips, config_file_ips) - else: - # If only config file is provided, use IPs from the config file - self._host_ips = config_file_ips - - # Validate number of nodes and master node configuration - if self.nnodes != len(self.host_ips): - self.raise_error( - 'The number of nodes {} is not equal to the number of host ips {}', - self.nnodes, len(self.host_ips)) - - if self.master_addr != self.host_ips[0]: - self.raise_error( - 'The master address {} is not the first host ip {}', - self.master_addr, self.host_ips[0]) - - -class PrecheckRunnerArgsManager(BaseArgsManager): - def __init__(self, args): - super().__init__(args) - - self._args = args - self.check_args() - - @property - def node_rank(self): - return self._args.node_rank - - def check_args(self): - super().check_args() - - error = self._check_int_range(self.node_rank, min_value=0, max_value=self.nnodes - 1) - if error: - self.raise_error('Node rank {} is not valid: {}', self.node_rank, error) diff --git a/profiler/msprof_analyze/precheck/manager/disk_manager.py b/profiler/msprof_analyze/precheck/manager/disk_manager.py deleted file mode 100644 index a497c992cbe895e2dcaf115a8ad3469a687a0759..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/manager/disk_manager.py +++ /dev/null @@ -1,24 +0,0 @@ -import os -import logging - -logger = logging.getLogger(__name__) - - -class DiskManager: - @staticmethod - def check_disk_space(input_prof_path, prof_data_size_gb): - if not os.path.exists(input_prof_path): - logger.error(f"路径不存在: {input_prof_path}") - raise FileNotFoundError(f"路径不存在: {input_prof_path}") - - if not os.access(input_prof_path, os.R_OK): - logger.error(f"无读取权限: {input_prof_path}") - raise PermissionError(f"无读取权限: {input_prof_path}") - - statvfs = os.statvfs(input_prof_path) - disk_free_gb = statvfs.f_bavail * statvfs.f_frsize / (1024 ** 3) - - if disk_free_gb - prof_data_size_gb <= 50: - logger.error(f"磁盘空间不足: {disk_free_gb:.2f}GB, 
输入数据大小: {prof_data_size_gb:.2f}GB") - raise BufferError(f"磁盘空间不足: {disk_free_gb:.2f}GB, 输入数据大小: {prof_data_size_gb:.2f}GB") - diff --git a/profiler/msprof_analyze/precheck/manager/distribute_manager.py b/profiler/msprof_analyze/precheck/manager/distribute_manager.py deleted file mode 100644 index f35fdf45c6ad6245ebf3e1225faec41c6a0382c8..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/manager/distribute_manager.py +++ /dev/null @@ -1,52 +0,0 @@ -from copy import deepcopy - - -class DistributeManager: - def __init__(self, args): - self.master_addr = args.master_addr - self.master_port = args.master_port - self.nnodes = args.nnodes - self.nproc_per_node = args.nproc_per_node - self.node_rank = args.node_rank - - self.local_rank = 0 - self.rank = self.local_rank + self.node_rank * self.nproc_per_node - - self.world_size = self.nnodes * self.nproc_per_node - self.local_world_size = self.nproc_per_node - - self.group_rank = -1 - - def __repr__(self): - """ - Custom __repr__ method to print out the object in a human-readable format - """ - return (f"DistributeManager(master_addr='{self.master_addr}', " - f"master_port='{self.master_port}', nnodes={self.nnodes}, " - f"nproc_per_node={self.nproc_per_node}, node_rank={self.node_rank}, " - f"local_rank={self.local_rank}, rank={self.rank}, " - f"world_size={self.world_size}, local_world_size={self.local_world_size}, " - f"group_rank={self.group_rank})") - - def update_local_rank(self, local_rank: int): - self.local_rank = local_rank - self.rank = self.local_rank + self.node_rank * self.nproc_per_node - return deepcopy(self) - - def get_dist_env_data(self): - self.rank = self.local_rank + self.node_rank * self.nproc_per_node - - data = { - "MASTER_ADDR": self.master_addr, - "MASTER_PORT": self.master_port, - "LOCAL_RANK": self.local_rank, - "GROUP_RANK": self.group_rank, - "NODE_RANK": self.node_rank, - "RANK": self.rank, - "WORLD_SIZE": self.world_size, - "LOCAL_WORLD_SIZE": 
self.local_world_size, - } - - for k in data: - data[k] = str(data[k]) - return data diff --git a/profiler/msprof_analyze/precheck/manager/group_manager.py b/profiler/msprof_analyze/precheck/manager/group_manager.py deleted file mode 100644 index fd492bd3d8b54612ccf8401d2e1997f5a0908081..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/manager/group_manager.py +++ /dev/null @@ -1,123 +0,0 @@ -import math -import os -import torch.distributed as dist - -from msprof_analyze.advisor.utils.utils import singleton - - -class EnvGroup: - def __init__(self, rank, local_rank, world_size, master_addr, master_port, group_rank, local_world_size): - self.rank = rank - self.local_rank = local_rank - self.world_size = world_size - self.master_addr = master_addr - self.master_port = master_port - self.group_rank = group_rank - self.local_world_size = local_world_size - self.check_all_attribute() - - def check_all_attribute(self): - if not isinstance(self.rank, int): - raise ValueError('rank must be an integer') - - if not isinstance(self.local_rank, int): - raise ValueError('local_rank must be an integer') - - if not isinstance(self.world_size, int): - raise ValueError('world_size must be an integer') - - if not isinstance(self.master_addr, str): - raise ValueError('master_addr must be an string') - - if not isinstance(self.master_port, int): - raise ValueError('master_port must be an integer') - - if not isinstance(self.group_rank, int): - raise ValueError('group_rank must be an integer') - - if not isinstance(self.local_world_size, int): - raise ValueError('local_world_size must be an integer') - - def set_env(self): - os.environ["RANK"] = str(self.rank) - os.environ["LOCAL_RANK"] = str(self.local_rank) - os.environ["WORLD_SIZE"] = str(self.world_size) - os.environ["MASTER_ADDR"] = self.master_addr - os.environ["MASTER_PORT"] = str(self.master_port) - os.environ["GROUP_RANK"] = str(self.group_rank) - os.environ["LOCAL_WORLD_SIZE"] = 
str(self.local_world_size) - - -class SubGroup: - def __init__(self, group, master_rank, ranks, file_sizes, file_hashes): - self.group = group - self.master_rank = master_rank - self.ranks = ranks - self.file_sizes = file_sizes - self.file_hashes = file_hashes - self.max_file_sizes = max(file_sizes) - self.split_file_size = None - self.splits = None - self.max_splits = None - - def split_size(self, split_file_size): - self.split_file_size = split_file_size - self.splits = [] - self.max_splits = math.ceil(self.max_file_sizes / split_file_size) - for file_size in self.file_sizes: - cur_splits = [] - for _ in range(self.max_splits): - if file_size > 0: - cur_splits.append(min(split_file_size, file_size)) - else: - cur_splits.append(0) - file_size -= split_file_size - self.splits.append(cur_splits) - - -@singleton -class GroupManager: - _initialized = False - - def __init__(self): - if not self._initialized: - self._rank = int(os.environ['RANK']) - self._local_rank = int(os.environ['LOCAL_RANK']) - self._world_size = int(os.environ['WORLD_SIZE']) - self._group_rank = int(os.environ['GROUP_RANK']) - self._rank_size = int(os.environ['LOCAL_WORLD_SIZE']) - self._local_group = None - self._node_group = None - self._sub_group_dict = {} - - def get_rank(self): - return self._rank - - def get_local_rank(self): - return self._local_rank - - def get_world_size(self): - return self._world_size - - def get_rank_size(self): - return self._rank_size - - def get_group_rank(self): - return self._group_rank - - def get_local_group(self): - if self._local_group is None: - groups = [x for x in range(self._group_rank * self._rank_size, (self._group_rank + 1) * self._rank_size)] - self._local_group = dist.new_group(ranks=groups) - return self._local_group - - def add_rank_sub_group(self, sub_group, ranks, file_sizes, file_hashes): - for rank in ranks: - self._sub_group_dict[rank] = SubGroup(group=sub_group, master_rank=ranks[0], ranks=ranks, - file_sizes=file_sizes, 
file_hashes=file_hashes) - - def get_rank_sub_group(self, rank): - if rank in self._sub_group_dict: - return self._sub_group_dict[rank] - else: - return None diff --git a/profiler/msprof_analyze/precheck/manager/task_manager.py b/profiler/msprof_analyze/precheck/manager/task_manager.py deleted file mode 100644 index d08f7afdad2c624e210e506ae5897480c0e6ced7..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/manager/task_manager.py +++ /dev/null @@ -1,79 +0,0 @@ -import os -import logging -import argparse - -from msprof_analyze.precheck.analyze.advisor_adaptor import advisor_adaptor -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -class TaskManager: - ADVISOR = 'advisor' - supported_analyzer = { - ADVISOR: advisor_adaptor, - } - - all_analyzer = list(supported_analyzer.keys()) - - @staticmethod - def add_analyzer(analyzer_name, analyzer_class): - - if analyzer not in TaskManager.supported_analyzer: - TaskManager.supported_analyzer[analyzer_name] = analyzer_class - - @staticmethod - def get_analyzer(analyzer_name): - return TaskManager.supported_analyzer.get(analyzer_name) - - @staticmethod - def get_result(analyzer_name, input_path, output): - - if analyzer_name not in TaskManager.all_analyzer: - logger.error("Error analyzer %s, supported analyzer are %s", analyzer_name, TaskManager.all_analyzer) - raise ValueError("Error analyzer %s, supported analyzer are %s", analyzer_name, TaskManager.all_analyzer) - - input_profiling_path_real = PathManager.get_realpath(input_path) - output_path_real = PathManager.get_realpath(output) - try: - analyze = TaskManager.get_analyzer(analyzer_name) - analyzer_instance = analyze() - result = analyzer_instance.analyze(input_profiling_path=input_profiling_path_real, - output_path=output_path_real) - - except Exception as e: - logger.error("%s is skipped when an exception is encountered. 
The exception is as follows: %s", - analyzer_name, e) - - -def get_args(): - parser = argparse.ArgumentParser(description="Profiler task manager") - - # Add command-line arguments - parser.add_argument('--input_profiling_path', type=str, - default=os.path.abspath("./result/"), - help="Path to the input profiling data") - parser.add_argument('--output_path', type=str, default=os.path.abspath('../result'), - help="Path to store the output results") - - return parser.parse_args() - - -if __name__ == "__main__": - try: - # Get arguments from the command line - args = get_args() - - # Use the command-line arguments or the default values - input_profiling_path = args.input_profiling_path - output_path = args.output_path - # Access all analyzers from the TaskManager - all_analyzer = TaskManager.all_analyzer - - # Loop through all analyzers and fetch the results - for analyzer in all_analyzer: - TaskManager.get_result(analyzer=analyzer, input_profiling_path=input_profiling_path, - output_path=output_path) - - except Exception as error: - logger.error("%s", error) diff --git a/profiler/msprof_analyze/precheck/requirements.txt b/profiler/msprof_analyze/precheck/requirements.txt deleted file mode 100644 index 8203bbe24f892e6d71b116939629c3582b2b1582..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/requirements.txt +++ /dev/null @@ -1,41 +0,0 @@ -absl-py==2.1.0 -attrs==24.2.0 -auto-tune -cloudpickle==3.0.0 -decorator==5.1.1 -filelock==3.15.4 -fsspec==2024.6.1 -MarkupSafe==2.1.5 -ml-dtypes==0.2.0 -mpmath==1.3.0 -networkx==3.1 -numpy==1.24.4 -psutil==6.0.0 -scipy==1.10.1 -sympy==1.13.2 -te -torch_npu==2.4.0 -tornado==6.4.1 -typing_extensions==4.12.2 - - - -## requirements for mstt advisor -click -tabulate -jinja2 -PyYAML -tqdm -prettytable -ijson -requests -xlsxwriter -SQLAlchemy -urllib3<2.0 -# bottleneck >= 1.3.6 # 注释行没有问题 -pandas - -# 如果你想要确保下载所有包的完整版本和所有的依赖项(包括子依赖), -# pip download -r requirements.txt -d pip_cache --no-deps -# 在离线环境中使用缓存安装依赖 
-# pip install --no-index --find-links=file:///path/to/pip_cache -r requirements.txt diff --git a/profiler/msprof_analyze/precheck/runner/__init__.py b/profiler/msprof_analyze/precheck/runner/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/runner/__main__.py b/profiler/msprof_analyze/precheck/runner/__main__.py deleted file mode 100644 index 8f031ae14c2b3610799e2824f4a2d7212ae19eae..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/runner/__main__.py +++ /dev/null @@ -1,160 +0,0 @@ -import subprocess -import sys -import os -import logging - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.common.logger import add_file_handler, create_logger -from msprof_analyze.precheck.common.utils import check_file_owner_and_permission, cn_now -from msprof_analyze.precheck.manager.args_manager import PrecheckRunnerArgsManager -from msprof_analyze.precheck.runner.runners import CollectorRunner, AdvisorRunner -from msprof_analyze.precheck.manager.distribute_manager import DistributeManager -from msprof_analyze.prof_common.path_manager import PathManager - -logging.basicConfig(level=Constant.LOGGING_LEVEL) -logger = create_logger("msprof_analyze.precheck", Constant.LOGGING_LEVEL, use_memory_handler=True) - - -def get_conda_envs_info(python_path=sys.executable): - """ - Get the conda environment activation command based on Python executable path. - For non-conda environments, returns source ~/.bashrc command. - - Args: - python_path (str): The path to the Python executable. - - Returns: - tuple: A tuple containing (env_name, activation_command). 
- For conda: (env_name, "source /path/to/conda/bin/activate env_name") - For non-conda: (None, "source ~/.bashrc") - """ - try: - # Check if we're in a conda environment using CONDA_PREFIX - conda_prefix = os.environ.get('CONDA_PREFIX') - if conda_prefix: - conda_env = os.path.basename(conda_prefix) - conda_base = os.path.dirname(os.path.dirname(conda_prefix)) if 'envs' in conda_prefix else conda_prefix - activate_script = os.path.join(conda_base, "bin", "activate") - - if os.path.exists(activate_script): - check_file_owner_and_permission(activate_script) - return conda_env, f"source {activate_script} {conda_env}" - - # Fallback to path-based detection - CONDA_ENV_BASE_BIAS = 4 - path_splits = python_path.rsplit(os.path.sep, CONDA_ENV_BASE_BIAS) - - if len(path_splits) == CONDA_ENV_BASE_BIAS + 1: - conda_base_path, envs_str, conda_env, _, _ = path_splits - - if envs_str == 'envs': - activate_script = os.path.join(conda_base_path, "bin", "activate") - if os.path.exists(activate_script): - check_file_owner_and_permission(activate_script) - return conda_env, f"source {activate_script} {conda_env}" - - return None, "source ~/.bashrc" - - except Exception as e: - logger.warning("Failed to get conda environment info: %s. 
Falling back to source ~/.bashrc", str(e)) - return None, "source ~/.bashrc" - - -def start_precheck_runner(args: PrecheckRunnerArgsManager): - logger.info("Starting precheck runner with arguments: %s", args) - - dist_config = DistributeManager(args) - logger.info("Command line arguments: %s", sys.argv) - logger.info("Distributed configuration: %s", dist_config) - - profiler_res_dir_base = args.node_prof_save_dir - transporter_res_dir_base = args.master_prof_gather_dir - advisor_res_dir_base = args.master_prof_gather_dir - - PathManager.make_dir_safety(profiler_res_dir_base) - PathManager.make_dir_safety(transporter_res_dir_base) - PathManager.make_dir_safety(advisor_res_dir_base) - - prof_node_res_dir = profiler_res_dir_base - logger.info("Profiler results directory: %s", prof_node_res_dir) - - # start profiling - logger.info("Starting profiler runner") - env_name, conda_activate_cmd = get_conda_envs_info() - if env_name is None: - logger.warning("No conda environment found. Using system environment.") - else: - logger.info("Using conda environment: %s", env_name) - - profiler_example_name = Constant.DEFAULT_PROFILING_COMMANDS.get(args.profiling_cmd, None) - if profiler_example_name is None: - profiling_cmd = [ - "/bin/bash", "-ic", - f"{conda_activate_cmd} && cd {os.getcwd()} && " - f"MASTER_ADDR={dist_config.master_addr} MASTER_PORT={dist_config.master_port} " - f"NNODES={dist_config.nnodes} NODE_RANK={dist_config.node_rank} " - f"NPROC_PER_NODE={dist_config.nproc_per_node} " - f"{args.profiling_cmd}" - ] - else: - profiler_example_base = os.path.join(os.path.dirname(os.path.dirname(__file__)), "examples", "profiler", ) - - profiling_cmd = [ - "/bin/bash", "-ic", - f"{conda_activate_cmd} && cd {os.getcwd()} && " - f"torchrun " - f"--master_addr={dist_config.master_addr} " - f"--master_port={dist_config.master_port} " - f"--nproc_per_node={dist_config.nproc_per_node} " - f"--nnodes={dist_config.nnodes} " - f"--node_rank={dist_config.node_rank} " - 
f"{os.path.join(profiler_example_base, 'train_with_profiler.py')} " - f"--example_name {profiler_example_name} " - f"--prof_output_dir {prof_node_res_dir}" - + (" --static" if args.static else "") - ] - - logger.info("Using custom profiling command: %s", ' '.join(profiling_cmd)) - try: - logger.info("Executing profiling command...") - subprocess.run(profiling_cmd, check=True, capture_output=False, text=True) - logger.info("Profiling command completed successfully") - except subprocess.CalledProcessError as e: - logger.error("Profiling command failed with error: %s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise - - # zip and transport to master - if args.prof_in_shared_storage: - logger.info("Skipping data collection as profiling data is in shared storage") - prof_gather_dir = prof_node_res_dir - else: - logger.info("Starting collector runner") - CollectorRunner(src_dir=prof_node_res_dir, des_dir=transporter_res_dir_base, config=dist_config).run() - prof_gather_dir = transporter_res_dir_base - - # analyse the gathered files - if dist_config.rank == 0: - logger.info("Starting advisor runner") - AdvisorRunner( - src_dir=prof_gather_dir, - des_dir=advisor_res_dir_base, - config=dist_config, - is_shared_storage=args.prof_in_shared_storage - ).run() - - logger.info("Completed precheck runner execution") - - -def main(args=None): - global logger - output_dir = os.path.join(args.output_dir, args.task_name) - PathManager.make_dir_safety(output_dir) - - timestamp = cn_now().strftime('%Y%m%d_%H%M%S') - log_file_path = os.path.join(output_dir, f'precheck_runner_{timestamp}.log') - logger = add_file_handler(logger, log_file_path) - - try: - start_precheck_runner(args) - except Exception as e: - logger.error("Precheck runner failed with error: %s", e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) diff --git a/profiler/msprof_analyze/precheck/runner/runners.py b/profiler/msprof_analyze/precheck/runner/runners.py deleted file mode 100644 index 
f46dc398a7fe1a5d02733be3428c4fb30f649f43..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/runner/runners.py +++ /dev/null @@ -1,151 +0,0 @@ -import os -import subprocess -import zipfile -import glob -import logging - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.manager.distribute_manager import DistributeManager -from msprof_analyze.precheck.tools.archive_utils import create_archive, extract_archive, ArchiveConfig, \ - compare_directory_with_archive -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -class AdvisorRunner: - def __init__(self, src_dir, des_dir, config: DistributeManager, *args, **kwargs): - self.src_dir = src_dir - self.dest_dir = des_dir - self.config = config - self.is_shared_storage = kwargs.get('is_shared_storage', False) - - logger.info('%s init, args: %s, kwargs: %s', self.__class__.__name__, args, kwargs) - self.archive_extract_dir = os.path.join(self.dest_dir, 'prof_unzipped') - - def prepare_analysis_dir(self): - """Prepare directory for analysis, either by extracting archives or using source directly""" - if self.is_shared_storage: - logger.info("Using shared storage directory directly: %s", self.src_dir) - return self.src_dir - - logger.info("Preparing analysis directory by extracting archives") - PathManager.make_dir_safety(self.archive_extract_dir) - - archives_found = False - for root, _, files in os.walk(self.src_dir): - for file in files: - if any(file.endswith(ext) for ext in ['.zip', '.tar', '.tar.gz', '.tgz', ]): - archives_found = True - archive_path = os.path.join(root, file) - logger.info("Extracting archive: %s", archive_path) - extract_archive(archive_path, self.archive_extract_dir) - - if not archives_found: - logger.info("No archives found in %s, using source directory directly", self.src_dir) - return self.src_dir - - return self.archive_extract_dir - - def run(self): - if 
self.config.node_rank == 0 and self.config.local_rank == 0: - analysis_dir = self.prepare_analysis_dir() - self.run_analyzer(analysis_dir) - - def run_analyzer(self, analysis_dir): - """Find and process ascend_pt files in the analysis directory""" - - def call_analyzer(input_profiling_path, output_path): - from msprof_analyze.precheck.manager.task_manager import TaskManager - all_analyzer = TaskManager.all_analyzer - for analyzer in all_analyzer: - TaskManager.get_result(analyzer_name=analyzer, - input_path=input_profiling_path, - output=output_path) - - ascend_pt_dirs = glob.glob(os.path.join(analysis_dir, "*_ascend_pt"), recursive=False) - - if ascend_pt_dirs: - logger.info("Found %d ascend_pt directories in %s:", len(ascend_pt_dirs), analysis_dir) - for ascend_pt_dir in ascend_pt_dirs: - logger.debug("Found ascend_pt directory: %s", ascend_pt_dir) - - call_analyzer(analysis_dir, self.dest_dir) - else: - logger.warning("No ascend_pt files found in %s", analysis_dir) - - -class CollectorRunner: - def __init__(self, src_dir, des_dir, config: DistributeManager): - self.src_dir = os.path.abspath(src_dir) - self.des_dir = os.path.abspath(des_dir) - self.config = config - - logger.info('%s init', self.__class__.__name__) - - @staticmethod - def zip_directory(src_dir): - """Zip the specified directory.""" - zip_file_path = f"{src_dir}.zip" - - logger.info('Start zipping directory %s to %s', src_dir, zip_file_path) - - # Check if zip file already exists and contents match - if os.path.exists(zip_file_path): - logger.info('Found existing zip file: %s', zip_file_path) - logger.info('Comparing contents with source directory...') - - if compare_directory_with_archive(src_dir, zip_file_path): - logger.info('Existing zip matches source - reusing zip file') - return zip_file_path - - logger.info('Existing zip differs from source - creating new zip') - - # Create new zip file - create_archive(ArchiveConfig( - src_dir=src_dir, - output_path=zip_file_path, - 
whitelist=Constant.PROFILER_FILE_PATTERNS, - use_regex=True, - regex_fullmatch=False, - )) - - logger.info('Successfully created new zip file %s', zip_file_path) - - return zip_file_path - - def run(self): - zip_file = self.zip_directory(self.src_dir) - - self.transport(zip_file) - - def transport(self, zip_file): - """Transport the zip file to the destination.""" - - def run_collector(input_file_dir, output_file_dir: str, config: DistributeManager): - args_dict = { - "input_file_dir": input_file_dir, - "output_file_dir": output_file_dir, - "nnodes": config.nnodes, - "node_rank": config.node_rank, - "master_addr": config.master_addr, - "master_port": config.master_port, - "master_rank_num": Constant.COLLECTOR_MASTER_RANK_NUM, - "split_file_size": Constant.COLLECTOR_SPLIT_FILE_SIZE, - "time_out": Constant.COLLECTOR_DEFAULT_TIMEOUT, - "log_file": None - } - - from msprof_analyze.precheck.collect.collector import Collector - Collector().run(args_dict) - - run_collector(zip_file, self.des_dir, self.config) - - if self.config.node_rank == 0 or self.config.master_addr in Constant.LOCALHOST_ADDRESSES: - mv_command = ['cp', zip_file, self.des_dir] - logger.info("[rank=%s] %s", self.config.rank, mv_command) - subprocess.run(mv_command, check=True) - else: - pass - - logger.info("[rank=%s] Successfully transferred %s to %s", self.config.rank, zip_file, self.des_dir) diff --git a/profiler/msprof_analyze/precheck/tools/__init__.py b/profiler/msprof_analyze/precheck/tools/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/profiler/msprof_analyze/precheck/tools/archive_utils.py b/profiler/msprof_analyze/precheck/tools/archive_utils.py deleted file mode 100644 index 236413a3052174d07b7463ffb0d42042b50daa59..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/tools/archive_utils.py +++ /dev/null @@ -1,274 +0,0 @@ -import glob -import os -import zipfile -import 
tarfile -import logging -import re -import fnmatch -from dataclasses import dataclass -from typing import List, Optional - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -@dataclass -class ArchiveConfig: - src_dir: str - output_path: str - use_tar: bool = False - whitelist: Optional[List[str]] = None - blacklist: Optional[List[str]] = None - use_regex: bool = False - regex_fullmatch: bool = True - - -def create_archive(archive_args: ArchiveConfig): - """ - Create a zip or tar archive from a source directory. - - The archive will contain files from the source directory that match the whitelist - patterns (if specified) and don't match the blacklist patterns (if specified). - Patterns can be either glob patterns or regular expressions based on the use_regex flag. - - For regex patterns: - - If regex_fullmatch is True, the entire path must match the pattern - - If regex_fullmatch is False, the pattern can match anywhere in the path - - For glob patterns: - - Standard glob syntax is used (*, ?, [seq], [!seq]) - - Patterns are matched against the full relative path - - Args: - archive_args: Configuration object containing: - src_dir: Source directory to archive - output_path: Output path for the archive file - use_tar: If True create tar.gz, if False create zip - whitelist: List of patterns to include - blacklist: List of patterns to exclude - use_regex: If True use regex patterns, if False use glob - regex_fullmatch: If True require full regex match - - """ - - if not os.path.exists(archive_args.src_dir): - raise ValueError(f"Source directory '{archive_args.src_dir}' does not exist") - - save_dir = os.path.dirname(archive_args.output_path) - if not os.path.exists(save_dir): - raise ValueError(f"Destination directory '{save_dir}' does not exist") - - logger.info("Creating %s archive: %s", 'tar' if archive_args.use_tar else 'zip', 
archive_args.output_path) - logger.debug("Source directory: %s", archive_args.src_dir) - - if archive_args.use_regex: - if archive_args.whitelist: - whitelist = [re.compile(pattern) for pattern in archive_args.whitelist] - else: - whitelist = None - if archive_args.blacklist: - blacklist = [re.compile(pattern) for pattern in archive_args.blacklist] - else: - blacklist = None - else: - whitelist = archive_args.whitelist - blacklist = archive_args.blacklist - - def should_include_file(relative_path): - # Define pattern matching functions - def regex_fullmatch(pattern): - return pattern.fullmatch(relative_path) - - def regex_search(pattern): - return pattern.search(relative_path) - - def glob_match(pattern): - return fnmatch.fnmatch(relative_path, pattern) - - # Choose pattern matcher based on args - if archive_args.use_regex: - if archive_args.regex_fullmatch: - pattern_matcher = regex_fullmatch - else: - pattern_matcher = regex_search - else: - pattern_matcher = glob_match - - # Check blacklist first - if blacklist and any(map(pattern_matcher, blacklist)): - return False - - # If no whitelist, include all non-blacklisted files - if not whitelist: - return True - - # Check whitelist - return any(map(pattern_matcher, whitelist)) - - # Get all files in source directory recursively - abs_files = glob.glob(os.path.join(archive_args.src_dir, '**', '*'), recursive=True) - files = [os.path.relpath(file, archive_args.src_dir) for file in abs_files] - - files_to_add = [ - file for file_abs_path, file in zip(abs_files, files) - if should_include_file(file) and os.path.isfile(file_abs_path) - ] - - logger.info("Has found %d files to add at path: %s", len(files_to_add), archive_args.src_dir) - - # Process files based on archive type (tar or zip) - def add_files_to_tar(files_to_add): - with tarfile.open(archive_args.output_path, 'w:gz') as f: - for file in files_to_add: - file_path = os.path.join(archive_args.src_dir, file) - f.add(file_path, arcname=file) - - def 
add_files_to_zip(files_to_add): - with zipfile.ZipFile(archive_args.output_path, 'w', zipfile.ZIP_DEFLATED) as f: - for file in files_to_add: - file_path = os.path.join(archive_args.src_dir, file) - f.write(file_path, arcname=file) - - if archive_args.use_tar: - add_files_to_tar(files_to_add) - else: - add_files_to_zip(files_to_add) - - logger.info("Archive created successfully: %s", archive_args.output_path) - - -def _check_safe_zip(archive_file, max_archive_ratio=None, - max_size=Constant.MAX_ARCHIVE_SIZE, - max_file_count=Constant.MAX_ARCHIVE_FILE_COUNT, - ): - PathManager.check_path_readable(archive_file) - - archive_size = os.path.getsize(archive_file) - if max_archive_ratio is not None: - max_size = max(max_size, max_archive_ratio * archive_size) - - try: - with zipfile.ZipFile(archive_file, 'r') as zip_ref: - total_size = 0 - total_file_count = 0 - for info in zip_ref.infolist(): - total_size += info.file_size - total_file_count += 1 - if total_size > max_size: - raise RuntimeError("Archive size exceeds the limit") - if total_file_count > max_file_count: - raise RuntimeError("Archive file count exceeds the limit") - except (zipfile.BadZipFile, OSError) as e: - logger.error("Error reading zip file %s: %s", archive_file, e) - raise - - -def _check_safe_tar(archive_file, max_archive_ratio=None, - max_size=Constant.MAX_ARCHIVE_SIZE, - max_file_count=Constant.MAX_ARCHIVE_FILE_COUNT, - ): - PathManager.check_path_readable(archive_file) - - archive_size = os.path.getsize(archive_file) - if max_archive_ratio is not None: - max_size = max(max_size, max_archive_ratio * archive_size) - - try: - with tarfile.open(archive_file, 'r:*') as tar_ref: - total_size = 0 - total_file_count = 0 - for member in tar_ref.getmembers(): - total_size += member.size - total_file_count += 1 - if total_size > max_size: - raise RuntimeError("Archive size exceeds the limit") - if total_file_count > max_file_count: - raise RuntimeError("Archive file count exceeds the limit") - except 
(tarfile.TarError, OSError) as e: - logger.error("Error reading tar file %s: %s", archive_file, e) - raise - - -def _unzip(zip_file, extract_dir): - """Extract contents from a zip archive""" - - _check_safe_zip(zip_file, max_archive_ratio=Constant.MAX_ARCHIVE_RATIO) - with zipfile.ZipFile(zip_file, 'r') as zip_ref: - zip_ref.extractall(extract_dir) - logger.info("Unzipped %s to %s", zip_file, extract_dir) - - -def _untar(tar_file, extract_dir): - """Extract contents from a tar/tar.gz/tgz archive""" - - _check_safe_tar(tar_file, max_archive_ratio=Constant.MAX_ARCHIVE_RATIO) - with tarfile.open(tar_file, 'r:*') as tar_ref: # Auto-detect compression type - tar_ref.extractall(extract_dir) - logger.info("Untarred %s to %s", tar_file, extract_dir) - - -def extract_archive(archive_file, extract_dir): - """Extract contents from zip or tar archive files""" - - if archive_file.endswith('.zip'): - _unzip(archive_file, extract_dir) - elif archive_file.endswith('.tar') or archive_file.endswith('.tar.gz') or archive_file.endswith('.tgz'): - _untar(archive_file, extract_dir) - else: - logger.warning("Unsupported archive type: %s", archive_file) - - -def compare_directory_with_archive(src_dir: str, zip_file_path: str) -> bool: - """ - Compare contents of source directory with existing zip file. 
- - Args: - src_dir: Source directory path - zip_file_path: Path to zip file - - Returns: - bool: True if contents match, False otherwise - """ - # Get source files info - src_files = {} - for file_path in glob.glob(os.path.join(src_dir, "**"), recursive=True): - if os.path.isfile(file_path): - rel_path = os.path.relpath(file_path, src_dir) - src_files[rel_path] = os.path.getsize(file_path) - - # Compare with zip contents - with zipfile.ZipFile(zip_file_path, 'r') as existing_zip: - zip_files = { - info.filename: info.file_size - for info in existing_zip.filelist - } - - return src_files == zip_files - - -if __name__ == '__main__': - logging.basicConfig(level=logging.INFO) - - # Example usage with fnmatch whitelist, blacklist - config = ArchiveConfig( - src_dir="profiler/msprof_analyze/precheck/runner", - output_path="profiler/msprof_analyze/precheck/runner.zip", - whitelist=[r"tools/*", r"profiler/*", r"tests/*"], # Only include files in these directories - blacklist=[r"*.pyc"], # Exclude .pyc files - use_regex=False, - ) - - create_archive(config) - - # Example usage with regex whitelist, blacklist - config = ArchiveConfig( - src_dir="profiler/msprof_analyze/precheck/runner", - output_path="profiler/msprof_analyze/precheck/runner_regex.zip", - whitelist=[r"tools/.*", r"profiler/.*", r"tests/.*"], - blacklist=[r".*\.pyc$"], - use_regex=True, - ) - - create_archive(config) diff --git a/profiler/msprof_analyze/precheck/tools/ssh_utils.py b/profiler/msprof_analyze/precheck/tools/ssh_utils.py deleted file mode 100644 index c99c828d15ecda1ff13ad3ad7a2af885431a64e7..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/precheck/tools/ssh_utils.py +++ /dev/null @@ -1,264 +0,0 @@ -import getpass -import ipaddress -import os -import logging -import re -import subprocess -import shlex -from dataclasses import dataclass -from typing import List, Union - -from msprof_analyze.precheck.common.constant import Constant -from msprof_analyze.precheck.common.utils 
import cn_now -from msprof_analyze.prof_common.path_manager import PathManager - -logger = logging.getLogger(__name__) - - -@dataclass -class SSHConfig: - host: str - username: str - key_file: str - port: int = 22 - timeout: int = 3 - - def __post_init__(self): - """ Validate all fields after initialization """ - error = _check_ip_valid(self.host) - if error: - raise RuntimeError(f"Invalid host {self.host}: {error}") - - error = _check_int_range(self.port, min_value=1, max_value=Constant.ARG_MAX_INT_VALUE) - if error: - raise RuntimeError(f"Invalid port {self.port}: {error}") - - error = _check_ssh_key_file_valid(self.key_file) - if error: - raise RuntimeError(f"Invalid SSH key file {self.key_file}: {error}") - - error = _check_identifier_valid(self.username) - if error: - raise RuntimeError(f"Invalid username {self.username}: {error}") - - error = _check_int_range(self.timeout, min_value=1) - if error: - raise RuntimeError(f"Invalid timeout {self.timeout}: {error}") - - -def _check_ip_valid(ip: str) -> Union[Exception, None]: - try: - ipaddress.ip_address(ip) - except ValueError as e: - return e - return None - - -def _check_int_range( - value: int, min_value: int = Constant.ARG_MIN_INT_VALUE, max_value: int = Constant.ARG_MAX_INT_VALUE -) -> Union[Exception, None]: - if not (min_value <= value <= max_value): - return ValueError(f"The value must be between {min_value} and {max_value}.") - return None - - -def _check_identifier_valid(identifier: str) -> Union[Exception, None]: - pattern = r'^[a-zA-Z_][a-zA-Z0-9_-]*$' - if not re.match(pattern, identifier): - return ValueError(f"It must start with a letter or underscore, " - f"followed by any number of letters, digits, underscores, or dashes.") - return None - - -def _check_ssh_key_file_valid(ssh_key_file: str) -> Union[Exception, None]: - try: - expanded_path = os.path.expanduser(ssh_key_file) - stat_info = os.stat(expanded_path) - current_uid = os.getuid() - - # check file owner - if stat_info.st_uid != 
current_uid: - return ValueError(f"SSH key file {ssh_key_file} must be owned by the current user") - # check permissions to only read and write by owner - if stat_info.st_mode & 0o777 != 0o600: - return ValueError(f"SSH key file {ssh_key_file} must have permissions set to 600") - - return None - - except FileNotFoundError: - return ValueError(f"SSH key file {ssh_key_file} does not exist") - except PermissionError: - return ValueError(f"Permission denied when accessing SSH key file {ssh_key_file}") - - -def execute_ssh_command(config: SSHConfig, command: str) -> dict: - """ - Execute a command directly on a remote host using SSH without using tmux. - - Args: - config (SSHConfig): SSH configuration - command (str): Command to run on the remote host - - Returns: - dict: Dict containing command execution status and output with keys: - - success (bool): Whether the command was executed successfully - - output (str): Output from the command execution - """ - if not isinstance(config, SSHConfig): - raise ValueError("config must be an instance of SSHConfig") - - ssh_prefix = f"ssh -o ConnectTimeout={config.timeout} -p {config.port} {config.username}@{config.host}" - if config.key_file: - ssh_prefix += f" -i {config.key_file}" - - try: - result = subprocess.run([*shlex.split(ssh_prefix), command], capture_output=True, text=True, check=True) - return { - 'success': True, - 'output': result.stdout - } - except subprocess.CalledProcessError as e: - logger.error("SSH command failed on %s: %s", config.host, e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - return { - 'success': False, - 'output': e.stderr - } - - -def execute_ssh_command_in_tmux(config: SSHConfig, session_name: str, command: str) -> dict: - """ - Connect to remote host using system ssh command, start or update tmux session and run command - - Args: - config (SSHConfig): SSH configuration - session_name (str): Base name for tmux session - command (str): Command to run in tmux session - - Returns: - dict: Dict 
containing session info with keys: - - session_name (str): Name of tmux session - - win_name (str): Name of tmux window - - attach_cmd (str): Command to attach to tmux session - """ - if not isinstance(config, SSHConfig): - raise ValueError("config must be an instance of SSHConfig") - - error = _check_identifier_valid(session_name) - if error: - raise RuntimeError(f"Invalid session name {session_name}: {error}") - - win_name = cn_now().strftime("%H%M") - attach_cmd = "" - - try: - ssh_prefix = f"ssh -o ConnectTimeout={config.timeout} -p {config.port} {config.username}@{config.host}" - if config.key_file: - ssh_prefix += f" -i {config.key_file}" - - check_cmd = f"{ssh_prefix} 'tmux list-sessions | grep -q \"^{session_name}:\" && echo exists || echo new'" - result = subprocess.run(shlex.split(check_cmd), capture_output=True, text=True) - session_status = result.stdout.strip() - - escaped_command = command.replace("'", "\\'").replace('"', '\\"') - - tmux_cmd_suffix = f"script -f /tmp/tmux_output_{win_name} -c \"{escaped_command}\"; bash -i" - if session_status == "exists": - logger.info("Session '%s' exists on %s. Creating a new window with name '%s'.", - session_name, config.host, win_name) - tmux_cmd = f"tmux new-window -t {session_name} -n '{win_name}' '{tmux_cmd_suffix}'" - else: - logger.info( - "Session '%s' does not exist on %s. Creating a new session with name '%s'. " - "Creating a new window with name '%s'.", session_name, config.host, session_name, win_name) - tmux_cmd = f"tmux new-session -d -s {session_name} -n '{win_name}' '{tmux_cmd_suffix}'" - - logger.info("Running command to start session: %s", tmux_cmd) - - result = subprocess.run(shlex.split(ssh_prefix) + [tmux_cmd], capture_output=True, text=True, check=True) - - if result.stdout.strip(): - logger.info("Output from %s:\n%s", config.host, result.stdout) - - attach_cmd = f"tmux attach -t {session_name}:{win_name}" - logger.info('Session started. 
To attach to the session, run: "%s" in terminal on %s@%s', - attach_cmd, config.username, config.host) - - except Exception as e: - logger.error("Failed to connect to %s: %s", config.host, e, exc_info=Constant.ENABLE_STACKTRACE_LOGGING) - raise RuntimeError(f"Fail to start host {config.host}") from e - - return dict( - session_name=session_name, - win_name=win_name, - attach_cmd=attach_cmd, - ) - - -def run_remote_command(hosts_info: List[dict], session_name: str = None, using_tmux: bool = True) -> List[dict]: - """ - Execute specified commands on remote hosts using SSH, optionally within a tmux session. - - This function supports executing commands directly via SSH or within a tmux session for - better management of long-running processes. - - Args: - hosts_info (list of dict): Information about the hosts on which commands will be executed. - Each dictionary should contain: - - host (str): Hostname or IP address of the remote machine. - - username (str): SSH username for the remote host. - - key_filename (str, optional): Path to the SSH private key file. Defaults to '~/.ssh/id_rsa'. - - command (str): Command to be executed on the remote host. - - port (int, optional): SSH port number. Defaults to 22. - session_name (str, optional): Name to be used for the tmux session, if using tmux. Automatically generated - if not provided. - using_tmux (bool): Whether to execute the command within a tmux session. Defaults to True. - - Returns: - list of dict: Results from each host, with each dictionary containing: - - session_name (str): Name of the tmux session (if used). - - win_name (str): Name of the tmux window (if used). - - attach_cmd (str): Command to attach to the tmux session (if used). 
- """ - user = getpass.getuser() - if session_name is None: - session_name = f"auto_{user}_{cn_now().strftime('%m%d')}" - - results = [] - - for host_info in hosts_info: - config = SSHConfig( - host=host_info["host"], - username=host_info["username"], - key_file=host_info.get("key_filename", "~/.ssh/id_rsa"), - port=host_info.get("port", 22) - ) - config.key_file = PathManager.expanduser_for_argumentparser(config.key_file) - if using_tmux: - results.append(execute_ssh_command_in_tmux(config, session_name, host_info["command"])) - else: - results.append(execute_ssh_command(config, host_info["command"])) - - return results - - -def main(): - hosts = [{ - "host": "127.0.0.1", - "username": os.getenv("USER"), - "key_filename": "~/.ssh/id_ed25519", - "command": f"echo Hello!", - "port": 22 - }] - - run_remote_command(hosts) - - -if __name__ == "__main__": - logging.basicConfig( - level=logging.DEBUG, - format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", - handlers=[ - logging.StreamHandler(), - ] - ) - main() diff --git a/profiler/msprof_analyze/prof_common/__init__.py b/profiler/msprof_analyze/prof_common/__init__.py index 8b7e7544bb1bd466a9b223cb1f706422bcab9435..c2764ec2a520567abc0c7d119b222f5fea7c3b72 100644 --- a/profiler/msprof_analyze/prof_common/__init__.py +++ b/profiler/msprof_analyze/prof_common/__init__.py @@ -14,4 +14,4 @@ # limitations under the License. 
import os import sys -sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) \ No newline at end of file +sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) diff --git a/profiler/msprof_analyze/prof_common/additional_args_manager.py b/profiler/msprof_analyze/prof_common/additional_args_manager.py index 9136b049bc087fd04e2fca9a55e93f0f59859fb0..a298025977aa91be097d9dc85281f2d6dc6bf7e4 100644 --- a/profiler/msprof_analyze/prof_common/additional_args_manager.py +++ b/profiler/msprof_analyze/prof_common/additional_args_manager.py @@ -29,7 +29,7 @@ # limitations under the License. from typing import Dict -from msprof_analyze.advisor.utils.utils import singleton +from msprof_analyze.prof_common.singleton import singleton @singleton diff --git a/profiler/msprof_analyze/prof_common/base_node.py b/profiler/msprof_analyze/prof_common/base_node.py deleted file mode 100644 index e96c5521ca11b778e277df1d17fb26a88f9f988f..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_common/base_node.py +++ /dev/null @@ -1,82 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from math import ceil -from queue import Queue - -from decimal import Decimal - -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.trace_event_bean import TraceEventBean - - -class BaseNode: - def __init__(self, event: TraceEventBean, parent_node=None): - self._event = event - self._parent_node = parent_node - self._child_nodes = [] - - @property - def parent_node(self): - return self._parent_node - - @property - def child_nodes(self): - return self._child_nodes - - @property - def name(self): - return self._event.name - - @property - def start_time(self) -> Decimal: - return self._event.start_time - - @property - def end_time(self) -> Decimal: - return self._event.end_time - - @parent_node.setter - def parent_node(self, parent_node): - self._parent_node = parent_node - - def update_child_nodes(self, node): - self._child_nodes.append(node) - - def binary_search(self, ts_time): - if not self.child_nodes: - return Constant.INVALID_RETURN - right = len(self.child_nodes) - 1 - left = 0 - while right > left: - mid = left + ceil((right - left) / 2) - if ts_time >= self.child_nodes[mid].start_time: - left = mid - else: - right = mid - 1 - if self.child_nodes[left].start_time < ts_time < self.child_nodes[left].end_time: - return self.child_nodes[left] - return Constant.INVALID_RETURN - - def find_all_child_nodes(self) -> list: - result_data = [] - node_queue = Queue() - for child_node in self.child_nodes: - node_queue.put(child_node) - while not node_queue.empty(): - tree_node = node_queue.get() - result_data.append(tree_node) - for child_node in tree_node.child_nodes: - node_queue.put(child_node) - return result_data diff --git a/profiler/msprof_analyze/prof_common/constant.py b/profiler/msprof_analyze/prof_common/constant.py index d77589dbcc0ebfe865c661c4d54292e05702bc9f..bf6175fa4572f8f8e99e419b736af3395040d340 100644 --- a/profiler/msprof_analyze/prof_common/constant.py +++ b/profiler/msprof_analyze/prof_common/constant.py @@ 
-43,6 +43,7 @@ class Constant(object): FRAMEWORK_DIR = "FRAMEWORK" CLUSTER_ANALYSIS_OUTPUT = "cluster_analysis_output" SINGLE_OUTPUT = "ASCEND_PROFILER_OUTPUT" + ANALYZE_DIR = "analyze" COMM_JSON = "communication.json" COMM_MATRIX_JSON = "communication_matrix.json" STEP_TIME_CSV = "step_trace_time.csv" @@ -61,6 +62,7 @@ class Constant(object): # communication P2P = "p2p" COLLECTIVE = "collective" + TOTAL = "total" STEP_ID = "step_id" RANK_ID = "rank_id" GROUP_NAME = "group_name" @@ -86,6 +88,7 @@ class Constant(object): ELAPSE_TIME_MS = "Elapse Time(ms)" IDLE_TIME_MS = "Idle Time(ms)" LARGE_PACKET_RATIO = "Large Packet Ratio" + TYPE = "type" # params DATA_MAP = "data_map" @@ -97,6 +100,8 @@ class Constant(object): TRANSPORT_TYPE = "Transport Type" COMM_DATA_DICT = "comm_data_dict" DATA_TYPE = "data_type" + IS_MSPROF = "is_prof" + IS_MINDSPORE = "is_mindspore" # step time RANK = "rank" @@ -114,42 +119,13 @@ class Constant(object): DB = "db" INVALID = "invalid" - # profiler db tables - TABLE_AICORE_FREQ = "AICORE_FREQ" - TABLE_CANN_API = "CANN_API" - TABLE_COMMUNICATION_OP = "COMMUNICATION_OP" - TABLE_COMMUNICATION_TASK_INFO = "COMMUNICATION_TASK_INFO" - TABLE_COMPUTE_TASK_INFO = "COMPUTE_TASK_INFO" - TABLE_CONNECTION_IDS = "CONNECTION_IDS" - TABLE_CONNECTION_CATS = "connectionCats" - TABLE_ENUM_API_TYPE = "ENUM_API_TYPE" - TABLE_ENUM_HCCL_DATA_TYPE = "ENUM_HCCL_DATA_TYPE" - TABLE_ENUM_HCCL_LINK_TYPE = "ENUM_HCCL_LINK_TYPE" - TABLE_ENUM_HCCL_RDMA_TYPE = "ENUM_HCCL_RDMA_TYPE" - TABLE_ENUM_TRANSPORT_TYPE = "ENUM_TRANSPORT_TYPE" - TABLE_ENUM_MODULE = "ENUM_MODULE" - TABLE_MSTX_EVENT_TYPE = "MSTX_EVENT_TYPE" - TABLE_HOST_INFO = "HOST_INFO" - TABLE_META_DATA = "META_DATA" - TABLE_NPU_INFO = "NPU_INFO" - TABLE_OVERLAP_ANALYSIS = "OVERLAP_ANALYSIS" - TABLE_PYTORCH_API = "PYTORCH_API" - TABLE_RANK_DEVICE_MAP = "RANK_DEVICE_MAP" - TABLE_SESSION_TIME_INFO = "SESSION_TIME_INFO" - TABLE_STATUS_INFO = "status_info" - TABLE_STEP_TIME = "STEP_TIME" - TABLE_STRING_IDS = "STRING_IDS" 
- TABLE_TASK = "TASK" - TABLE_TASK_MPU_INFO = "TASK_MPU_INFO" - - # export_type - NOTEBOOK = "notebook" - # db name DB_COMMUNICATION_ANALYZER = "analysis.db" DB_CLUSTER_COMMUNICATION_ANALYZER = "cluster_analysis.db" + DB_MS_COMMUNICATION_ANALYZER = "communication_analyzer.db" # db tables + TABLE_COMMUNICATION_GROUP = "CommunicationGroup" TABLE_COMM_ANALYZER_BANDWIDTH = "CommAnalyzerBandwidth" TABLE_COMM_ANALYZER_TIME = "CommAnalyzerTime" TABLE_COMM_ANALYZER_MATRIX = "CommAnalyzerMatrix" @@ -157,25 +133,22 @@ class Constant(object): TABLE_HOST_INFO = "HostInfo" TABLE_RANK_DEVICE_MAP = "RankDeviceMap" TABLE_CLUSTER_BASE_INFO = "ClusterBaseInfo" - TABLE_CLUSTER_TIME_SUMMARY = "ClusterTimeSummary" + TABLE_META_DATA = "META_DATA" + TABLE_COMMUNICATION_GROUP_MAPPING = "CommunicationGroupMapping" + TABLE_CLUSTER_COMMUNICATION_MATRIX = "ClusterCommAnalyzerMatrix" + TABLE_CLUSTER_COMMUNICATION_BANDWIDTH = "ClusterCommAnalyzerBandwidth" + TABLE_CLUSTER_COMMUNICATION_TIME = "ClusterCommunicationTime" # data config key CONFIG = "config" EXPER_CONFIG = "experimental_config" EXPER_EXPORT_TYPE = "_export_type" - EXPORT_TYPE = "_export_type" + PROFILER_PARAMETER = "profiler_parameters" # metadata key DISTRIBUTED_ARGS = "distributed_args" PARALLEL_GROUP_INFO = "parallel_group_info" - # parallel_info_key - GROUP_NAME = "group_name" - GLOBAL_RANKS = "global_ranks" - - # group name value - PP = "pp" - # mode ALL = "all" COMMUNICATION_TIME = "communication_time" @@ -202,10 +175,12 @@ class Constant(object): BLUE_COLOR = "00BFFF" LIGHT_BLUE_COLOR = "87CEFA" US_TO_MS = 1000 + NS_TO_US = 1000 KB_TO_MB = 1024 INVALID_VALUE = -1 MILLISECONDS_TO_SECONDS = 10 ** 3 MICROSECONDS_TO_SECONDS = 10 ** 6 + MILLISECONDS_TO_MICROSECONDS = 10 ** 3 PROFILING_TYPE = "profiling type" @@ -288,10 +263,6 @@ class Constant(object): VOID_STEP = -1 - # communication task type - NOTIFY_RECORD = "Notify_Record" - NOTIFY_WAIT = "Notify_Wait" - # advisor # timeline @@ -397,7 +368,6 @@ class Constant(object): 
PT_PROF_SUFFIX = "ascend_pt" ASCEND_PROFILER_OUTPUT = "ASCEND_PROFILER_OUTPUT" - KERNEL_DETAILS_CSV = "kernel_details.csv" CLUSTER_STEP_TIME_CSV = "cluster_step_trace_time.csv" CLUSTER_COMM_JSON = "cluster_communication.json" COMMUNICATION_JSON = "communication.json" @@ -425,7 +395,6 @@ class Constant(object): # Unit Conversion COMMUNICATION_B_TO_GB = 0.001 ** 3 US_TO_S = 0.001 ** 2 - TIME_UNIT_SCALE = 1000 WRITE_MODES = stat.S_IWUSR | stat.S_IRUSR | stat.S_IRGRP WRITE_FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC @@ -443,9 +412,9 @@ class Constant(object): OPERATOR_TYPE = 1 VIRTUAL_TYPE = 9 - # trace bar + # json trace bar NPU_BAR = "Ascend Hardware" - HCCL_BAR = "HCCL" + COMM_BAR = "Communication" OVERLAP_BAR = "Overlap Analysis" # overlap_analysis event @@ -472,9 +441,25 @@ class Constant(object): RANK_LIST = "rank_list" EXPORT_TYPE = "export_type" EXTRA_ARGS = "args" + STEP_RANGE = "step_range" + START_NS = "startNs" + END_NS = "endNs" # hccl_sum UINT32_BITS = 32 UINT32_MASK = 0xffffffff - INVALID_RANK_NUM = 4294967295 + # slow rank + MAX_DIXON_NUM = 100 + DIXON_THRESHOLD_1 = 7 + DIXON_THRESHOLD_2 = 10 + DIXON_THRESHOLD_3 = 13 + + UNKNOWN = "unknown" + + SQL_PLACEHOLDER_PATTERN = r"\?|\%s" + + # cluster_analysis_output + COMMUNICATION_GROUP_JSON = "communication_group.json" + CLUSTER_COMMUNICATION_MATRIX_JSON = "cluster_communication_matrix.json" + KEY_COMM_GROUP_PARALLEL_INFO = "comm_group_parallel_info" diff --git a/profiler/msprof_analyze/prof_common/database_service.py b/profiler/msprof_analyze/prof_common/database_service.py index 1e51b787dcb3e2911f3d0795fefd95cd34bb68af..8cd4cdd2a1f414ba6d7945a22dea8fc6c312f85e 100644 --- a/profiler/msprof_analyze/prof_common/database_service.py +++ b/profiler/msprof_analyze/prof_common/database_service.py @@ -12,18 +12,46 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+import re + import pandas as pd from msprof_analyze.prof_common.db_manager import DBManager from msprof_analyze.prof_common.logger import get_logger +from msprof_analyze.prof_common.constant import Constant logger = get_logger() class DatabaseService: - def __init__(self, db_path): + TABLE_TS_DICT = { + "TASK": "startNs", + "COMMUNICATION_OP": "startNs", + "CANN_API": "startNs", + "PYTORCH_API": "startNs", + "MSTX_EVENTS": "startNs", + "GC_RECORD": "startNs", + "ACC_PMU": "timestampNs", + "NIC": "timestampNs", + "RoCE": "timestampNs", + "LLC": "timestampNs", + "SAMPLE_PMU_TIMELINE": "timestampNs", + "NPU_MEM": "timestampNs", + "NPU_MODULE_MEM": "timestampNs", + "NPU_OP_MEM": "timestampNs", + "HBM": "timestampNs", + "DDR": "timestampNs", + "HCCS": "timestampNs", + "PCIE": "timestampNs", + "AICORE_FREQ": "timestampNs" + } + + def __init__(self, db_path, step_range): self._db_path = db_path + self._step_range = step_range self._table_info = {} + self._param = (self._step_range.get(Constant.START_NS), + self._step_range.get(Constant.END_NS)) if self._step_range else None def add_table_for_query(self, table_name: str, columns=None): if not isinstance(table_name, str): @@ -47,10 +75,26 @@ class DatabaseService: if not DBManager.judge_table_exists(cursor, table_name): logger.warning(f"This table {table_name} does not exist in this database {self._db_path}.") continue - columns_str = "*" if not columns else ",".join(columns) - query_sql = f"select {columns_str} from {table_name}" + table_columns = DBManager.get_table_columns_name(cursor, table_name) + if not columns: + columns_str = ",".join(table_columns) + else: + columns = [column for column in columns if column in table_columns] + columns_str = ",".join(columns) + if not columns_str: + logger.error(f"The fields to be queried in Table {table_name} are invalid.") + return result_data + if table_name in self.TABLE_TS_DICT and self._step_range: + where_str = f"where {self.TABLE_TS_DICT.get(table_name)} >= ? 
" \ + f"and {self.TABLE_TS_DICT.get(table_name)} <= ?" + else: + where_str = "" + query_sql = f"select {columns_str} from {table_name} {where_str}" try: - data = pd.read_sql(query_sql, conn) + if self._param is not None and re.search(Constant.SQL_PLACEHOLDER_PATTERN, query_sql): + data = pd.read_sql(query_sql, conn, params=self._param) + else: + data = pd.read_sql(query_sql, conn) result_data[table_name] = data except Exception as err: logger.error(err) diff --git a/profiler/msprof_analyze/prof_common/db_manager.py b/profiler/msprof_analyze/prof_common/db_manager.py index 8740499c27edc9562ad2861b5da8d1a21f02dd0c..1c95e2eec123cf7378c81f5cf355a1ee7bb3bf2f 100644 --- a/profiler/msprof_analyze/prof_common/db_manager.py +++ b/profiler/msprof_analyze/prof_common/db_manager.py @@ -15,6 +15,7 @@ import os import sqlite3 +from typing import List from msprof_analyze.cluster_analyse.common_func.empty_class import EmptyClass from msprof_analyze.cluster_analyse.common_func.tables_config import TablesConfig @@ -143,41 +144,6 @@ class DBManager: logger.error("conn is invalid param") return False - @staticmethod - def execute_sql(conn: any, sql: str, params: any = None) -> bool: - """ - execute sql - """ - try: - if isinstance(conn, sqlite3.Connection): - if params: - conn.cursor().execute(sql, params) - else: - conn.cursor().execute(sql) - conn.commit() - return True - except sqlite3.Error as err: - logger.error(err) - return False - logger.error("conn is invalid param") - return False - - @staticmethod - def executemany_sql(conn: any, sql: str, params: any) -> bool: - """ - execute many sql once - """ - try: - if isinstance(conn, sqlite3.Connection): - conn.cursor().executemany(sql, params) - conn.commit() - return True - except sqlite3.Error as err: - logger.error(err) - return False - logger.error("conn is invalid param") - return False - @classmethod def check_tables_in_db(cls, db_path: any, *tables: any) -> bool: if check_db_path_valid(db_path): @@ -224,6 +190,17 @@ class 
DBManager: cls.destroy_db_connect(conn, curs) return res + @classmethod + def get_table_columns_name(cls, curs: any, table: any) -> List[str]: + sql = f"PRAGMA table_info({table})" + try: + curs.execute(sql) + columns = curs.fetchall() + except sqlite3.Error as err: + logger.error(err) + return [] + return [column[1] for column in columns] + @classmethod def fetch_all_data(cls: any, curs: any, sql: str, param: tuple = None, is_dict: bool = True) -> list: """ @@ -284,21 +261,6 @@ class DBManager: cls.insert_data_into_table(conn, table_name, data) cls.destroy_db_connect(conn, curs) - @classmethod - def check_columns_exist(cls, curs: any, table_name: str, columns: set) -> any: - """ - check columns exist in table, return empty set if none of them exist, else return the set of existing columns - """ - if not isinstance(curs, sqlite3.Cursor): - return None - try: - curs.execute(f"PRAGMA table_info({table_name})") - table_columns = {col[1] for col in curs.fetchall()} - return columns & table_columns - except sqlite3.Error as err: - logger.error(err) - return None - class CustomizedDictFactory: @staticmethod diff --git a/profiler/msprof_analyze/prof_common/file_manager.py b/profiler/msprof_analyze/prof_common/file_manager.py index 7329d1d9f3cd11588bf63300e581260205b400cb..064eb3f039aea5d39dd44e8ca84aac45f68019b8 100644 --- a/profiler/msprof_analyze/prof_common/file_manager.py +++ b/profiler/msprof_analyze/prof_common/file_manager.py @@ -114,6 +114,28 @@ class FileManager: raise RuntimeError(f"Failed to read the file: {base_name}, reason is {str(e)}") from e return content + @classmethod + def create_common_file(cls, file_path: str, content: str) -> None: + base_name = os.path.basename(file_path) + PathManager.check_path_writeable(os.path.dirname(file_path)) + try: + with os.fdopen( + os.open(file_path, os.O_WRONLY | os.O_CREAT, Constant.FILE_AUTHORITY), + 'w') as file: + file.write(content) + except Exception as e: + raise RuntimeError(f"Can't create file: {base_name}") 
from e + + @classmethod + def create_csv_from_dataframe(cls, file_path: str, data, index) -> None: + base_name = os.path.basename(file_path) + PathManager.check_path_writeable(os.path.dirname(file_path)) + try: + data.to_csv(file_path, index=index) + except Exception as e: + raise RuntimeError(f"Can't create file: {base_name}") from e + os.chmod(file_path, Constant.FILE_AUTHORITY) + @classmethod def create_csv_file(cls, profiler_path: str, data: list, file_name: str, headers: list = None) -> None: if not data: diff --git a/profiler/msprof_analyze/prof_common/file_reader.py b/profiler/msprof_analyze/prof_common/file_reader.py deleted file mode 100644 index 313933ba7f9334d8ce9273824aeba565c379a1cc..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_common/file_reader.py +++ /dev/null @@ -1,86 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-import csv -import json -import logging -import os - -from msprof_analyze.prof_common.path_manager import PathManager -from msprof_analyze.prof_common.constant import Constant - - -class FileReader: - DATA_FILE_AUTHORITY = 0o640 - DATA_DIR_AUTHORITY = 0o750 - - @classmethod - def read_json_file(cls, file_path: str) -> any: - PathManager.check_path_readable(file_path) - if not os.path.isfile(file_path): - raise FileNotFoundError("File not exists.") - file_size = os.path.getsize(file_path) - if file_size <= 0: - return [] - if file_size > Constant.MAX_FILE_SIZE_5_GB: - msg = f"The file({file_path}) size exceeds the preset max value, failed to read the file." - raise RuntimeError(msg) - try: - with open(file_path, "rt") as file: - json_data = json.loads(file.read()) - except Exception as e: - msg = f"Can't read file: {file_path}" - raise RuntimeError(msg) from e - return json_data - - @classmethod - def write_json_file(cls, output_path: str, data: dict, file_name: str, format_json: bool = False) -> None: - if not data: - return - output_file = os.path.join(output_path, file_name) - PathManager.check_path_writeable(output_path) - try: - with os.fdopen( - os.open(output_file, os.O_WRONLY | os.O_CREAT, cls.DATA_FILE_AUTHORITY), 'w' - ) as file: - indent = 4 if format_json else None - file.write(json.dumps(data, indent=indent)) - except Exception as e: - raise RuntimeError(f"Can't create the file: {output_file}") from e - - @classmethod - def read_csv_file(cls, file_path: str, bean_class: any = None) -> any: - PathManager.check_path_readable(file_path) - if not os.path.isfile(file_path): - raise FileNotFoundError("File not exists.") - file_size = os.path.getsize(file_path) - if file_size <= 0: - return [] - if file_size > Constant.MAX_FILE_SIZE_5_GB: - check_msg = input( - f"The file({file_path}) size exceeds the preset max value. Continue reading the file? 
[y/n]") - if check_msg.lower() != "y": - logging.warning(f"The user choose not to read the file: %s", file_path) - return [] - result_data = [] - try: - with open(file_path, newline="") as csv_file: - reader = csv.DictReader(csv_file) - for row in reader: - row_data = bean_class(row) if bean_class else row - result_data.append(row_data) - except Exception as e: - msg = f"Failed to read the file: {file_path}" - raise RuntimeError(msg) from e - return result_data diff --git a/profiler/msprof_analyze/prof_common/kernel_bean.py b/profiler/msprof_analyze/prof_common/kernel_bean.py deleted file mode 100644 index f1c90895fc4bf78dc6b7c98bc6d7d781b7308b38..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_common/kernel_bean.py +++ /dev/null @@ -1,47 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.prof_common.utils import convert_to_decimal - - -class KernelBean: - def __init__(self, data: dict): - self._name = data.get("Name", "") - self._op_type = data.get("Type", "") - self._core_type = data.get("Accelerator Core", "") - self._input_shape = data.get("Input Shapes", "").replace("\"", "") - self._input_type = data.get("Input Data Types", "") - self._input_format = data.get("Input Formats", "") - self._duration = data.get("Duration(us)", 0) - self._ts = data.get("Start Time(us)", "") - - @property - def start_time(self): - return convert_to_decimal(self._ts) - - @property - def end_time(self): - return self.start_time + convert_to_decimal(self.dur) - - @property - def is_computing_op(self): - return self._core_type != "HCCL" - - @property - def dur(self): - return float(self._duration) - - @property - def kernel_info(self): - return [self._name, self._op_type, self._core_type, self._input_shape, self._input_type, self.dur] diff --git a/profiler/msprof_analyze/prof_common/singleton.py b/profiler/msprof_analyze/prof_common/singleton.py new file mode 100644 index 0000000000000000000000000000000000000000..70aff054cc3ecc067e41e65c6a11ece1ab7f332d --- /dev/null +++ b/profiler/msprof_analyze/prof_common/singleton.py @@ -0,0 +1,96 @@ +# Copyright (c) 2025 Huawei Technologies Co., Ltd +# All rights reserved. +# +# Licensed under the BSD 3-Clause License (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# https://opensource.org/licenses/BSD-3-Clause +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +import inspect +from functools import wraps + + +def get_class_absolute_path(cls): + module = inspect.getmodule(cls) + if module is not None: + module_path = module.__name__ + class_name = cls.__name__ + return f"{module_path}.{class_name}" + else: + return None + + +def is_static_func(function_obj): + return isinstance(function_obj, staticmethod) + + +def singleton(cls): + """ + :param cls: any class + :return: singleton handle + + When using the singleton decorator, pass collection_path='dataSet_path' as a keyword argument to get one instance + per collection path. Otherwise, a single instance per class is created. + If 'collection_path' is not given, it defaults to the absolute (module-qualified) path of the class. Instances + are cached per process id, per class and per collection path. + + _instance = {pid: {cls: {collection_path: instance}}} + """ + _instance = {} + + @wraps(cls) # use the wraps decorator + def _singleton(*args, **kw): + # support multi-process asynchronous calls: keep the singletons of different subprocesses isolated from each other + pid = os.getpid() + if pid not in _instance: + _instance[pid] = {} + + collection_path = kw.get("collection_path") + if not collection_path: + collection_path = get_class_absolute_path(cls) + if cls in _instance[pid] and collection_path in _instance[pid][cls]: + return _instance[pid][cls].get(collection_path) + if cls not in _instance[pid]: + _instance[pid][cls] = {collection_path: cls(*args, **kw)} + else: + _instance[pid][cls][collection_path] = cls(*args, **kw) + return _instance[pid][cls].get(collection_path) + + def reset_all_instances(): + """ + For unit tests only: clears all singleton instances so that test cases do not interfere with each other + """ + _instance.clear() + + # preserve the attributes of the original class + _singleton.__name__ = cls.__name__ + _singleton.__module__ = cls.__module__ + _singleton.__doc__ = cls.__doc__ + + # copy the class methods and static methods of the original class + _singleton.__dict__.update(cls.__dict__) + for base_class in inspect.getmro(cls)[::-1]: + # get all members of the class + members = inspect.getmembers(base_class) + + # keep only the function objects + function_objs = [member[1] + for member in members + if inspect.isfunction(member[1]) or inspect.ismethod(member[1]) + ] 
+ for function_obj in function_objs: + if inspect.isfunction(function_obj) and not is_static_func(function_obj): + continue + setattr(_singleton, function_obj.__name__, function_obj) + + _singleton.reset_all_instances = reset_all_instances + singleton.reset_all_instances = reset_all_instances + + return _singleton \ No newline at end of file diff --git a/profiler/msprof_analyze/prof_common/trace_event_bean.py b/profiler/msprof_analyze/prof_common/trace_event_bean.py deleted file mode 100644 index ea78b54df57f8a1d72517baf2c48748b13ab7847..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_common/trace_event_bean.py +++ /dev/null @@ -1,113 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from decimal import Decimal - -from msprof_analyze.prof_common.constant import Constant -from msprof_analyze.prof_common.utils import convert_to_decimal -from msprof_analyze.prof_common.analyze_dict import AnalyzeDict - - -class TraceEventBean(AnalyzeDict): - def __init__(self, data: dict, unique_id: str = None): - super().__init__(data) - self._id = unique_id - self._type = None - self._start_time = convert_to_decimal(self.ts) if self.ts else 0 - self._end_time = self._start_time + convert_to_decimal(self.dur) if self.dur else 0 - self._fwd_bwd_id = None - - @property - def unique_id(self): - return self._id - - @property - def start_time(self) -> Decimal: - return self._start_time - - @property - def step_id(self) -> int: - return self.name.split("#")[-1] - - @property - def end_time(self) -> Decimal: - return self._end_time - - @property - def kernel_info(self): - return [self.name, self.args.get("Task Type", ""), self.dur] - - @property - def event_type(self): - return self._type - - @property - def fwd_bwd_id(self): - return self._fwd_bwd_id - - @event_type.setter - def event_type(self, event_type): - self._type = event_type - - @fwd_bwd_id.setter - def fwd_bwd_id(self, fwd_bwd_id): - self._fwd_bwd_id = fwd_bwd_id - - def set_id(self, name_id): - self._id = name_id - - def is_cpu_op(self): - return self.cat == "cpu_op" - - def is_optimizer(self): - return self.cat == "cpu_op" and self.name.lower().startswith("optimizer") - - def is_nn_module(self): - return self.cat == "python_function" and self.name.lower().startswith("nn.module") - - def is_step(self): - return self.name.lower().startswith("profilerstep#") - - def is_torch_to_npu(self): - return self.cat == "async_npu" - - def is_fwd_bwd_flow(self): - return self.cat == "fwdbwd" - - def is_flow_start(self): - return self.ph == "s" - - def is_flow_end(self): - return self.ph == "f" - - def is_meta(self): - return self.ph == "M" - - def is_kernel_event(self, kernel_pid): - return self.ph == "X" and self.pid 
== kernel_pid - - def is_hccl_event(self, hccl_pid): - return self.ph == "X" and self.pid == hccl_pid and self.name.startswith("hcom_") - - def is_overlap_analysis_event(self, overlap_analysis_pid): - return self.ph == "X" and self.pid == overlap_analysis_pid - - def is_npu_process(self): - return self.ph == "M" and self.name == "process_name" and self.args.get("name", "") == Constant.NPU_BAR - - def is_hccl_process(self): - return self.ph == "M" and self.name == "process_name" and self.args.get("name", "") == Constant.HCCL_BAR - - def is_overlap_analysis_process(self): - return self.ph == "M" and self.name == "process_name" and self.args.get("name", "") == Constant.OVERLAP_BAR diff --git a/profiler/msprof_analyze/prof_common/tree_builder.py b/profiler/msprof_analyze/prof_common/tree_builder.py deleted file mode 100644 index 34b056e71bd9880cea0e3402da699a4fbadd150a..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_common/tree_builder.py +++ /dev/null @@ -1,37 +0,0 @@ -# Copyright (c) 2024 Huawei Technologies Co., Ltd -# All rights reserved. -# -# Licensed under the BSD 3-Clause License (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://opensource.org/licenses/BSD-3-Clause -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-from msprof_analyze.prof_common.trace_event_bean import TraceEventBean - - -class TreeBuilder: - @staticmethod - def build_tree(event_list: list, node_class: any, root_bean: any): - root_node = node_class(root_bean) - all_nodes = [root_node] + [None] * len(event_list) - event_list.sort(key=lambda x: x.start_time) - last_node = root_node - index = 1 - for event in event_list: - while last_node: - if last_node != root_node and event.start_time > last_node.end_time: - last_node = last_node.parent_node - continue - tree_node = node_class(event, last_node) - last_node.update_child_nodes(tree_node) - all_nodes[index] = tree_node - last_node = tree_node - index += 1 - break - return all_nodes diff --git a/profiler/msprof_analyze/prof_common/utils.py b/profiler/msprof_analyze/prof_common/utils.py index 284c17c86e36b8fb87d2ea73ed7e3089f44fcbb6..5357cb9665500561f24bfcb201886814f3f9d40b 100644 --- a/profiler/msprof_analyze/prof_common/utils.py +++ b/profiler/msprof_analyze/prof_common/utils.py @@ -12,14 +12,13 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+ import configparser import os from email.utils import parseaddr from typing import Dict, List from urllib.parse import urlparse -from decimal import Decimal - from msprof_analyze.prof_common.logger import get_logger from msprof_analyze.prof_common.path_manager import PathManager @@ -86,18 +85,17 @@ def convert_to_float(num): return 0 -def convert_to_decimal(data: any) -> Decimal: - try: - decimal_value = Decimal(data) - except Exception: - logger.error('Invalid profiling data which failed to convert data to decimal.') - return 0.0 - return decimal_value - - def convert_to_int(num): try: return int(num) except (ValueError, NameError): logger.error(f"Can not convert %s to int", num) return 0 + + +def compute_ratio(dividend: float, divisor: float): + if abs(divisor) < 1e-15: + return 0 + else: + return round(dividend / divisor, 4) + diff --git a/profiler/msprof_analyze/prof_exports/base_stats_export.py b/profiler/msprof_analyze/prof_exports/base_stats_export.py index 59d58bdff5485a6ace0f2c12dadbf543ecd4b978..2d17c41cb511c6e8c75c6ad89c478d20eab26600 100644 --- a/profiler/msprof_analyze/prof_exports/base_stats_export.py +++ b/profiler/msprof_analyze/prof_exports/base_stats_export.py @@ -12,6 +12,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
+import re import pandas as pd @@ -24,25 +25,33 @@ logger = get_logger() class BaseStatsExport: - def __init__(self, db_path, analysis_class): + def __init__(self, db_path, analysis_class, step_range): self._db_path = db_path self._analysis_class = analysis_class + self._step_range = step_range self._query = None - self.mode = Constant.ANALYSIS + self._param = (self._step_range.get(Constant.START_NS), + self._step_range.get(Constant.END_NS)) if self._step_range else None def get_query(self): return self._query def read_export_db(self): try: + if not self._db_path: + logger.error("db path is None.") + return None query = self.get_query() if query is None: logger.error("query is None.") return None - conn, cursor = DBManager.create_connect_db(self._db_path, self.mode) - data = pd.read_sql(query, conn) + conn, cursor = DBManager.create_connect_db(self._db_path, Constant.ANALYSIS) + if self._param is not None and re.search(Constant.SQL_PLACEHOLDER_PATTERN, query): + data = pd.read_sql(query, conn, params=self._param) + else: + data = pd.read_sql(query, conn) DBManager.destroy_db_connect(conn, cursor) return data except Exception as e: logger.error(f"File {self._db_path} read failed error: {e}") - return None \ No newline at end of file + return None diff --git a/profiler/msprof_analyze/prof_exports/cann_api_sum_export.py b/profiler/msprof_analyze/prof_exports/cann_api_sum_export.py index efdba81e94360e7f8e88801711fb2ff72fa5b47f..456aac95f07fec30f9f87a7e401d14a7edc4ea05 100644 --- a/profiler/msprof_analyze/prof_exports/cann_api_sum_export.py +++ b/profiler/msprof_analyze/prof_exports/cann_api_sum_export.py @@ -14,6 +14,7 @@ # limitations under the License. 
from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant QUERY = """ WITH @@ -31,6 +32,7 @@ WITH upper_quartile(endNs - startNs) AS upper_quartile_duration FROM CANN_API + {} GROUP BY name ), totals AS ( @@ -60,6 +62,7 @@ ORDER BY 2 DESC; class CannApiSumExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "WHERE CANN_API.startNs >= ? and CANN_API.startNs <= ?" if step_range else "" + self._query = QUERY.format(filter_statement) diff --git a/profiler/msprof_analyze/prof_exports/cluster_time_summary_export.py b/profiler/msprof_analyze/prof_exports/cluster_time_summary_export.py index 840359618f383b4606544832788b886fba6cad4b..36b86301dc73ffb8e6a206f9b61bac2b159fc747 100644 --- a/profiler/msprof_analyze/prof_exports/cluster_time_summary_export.py +++ b/profiler/msprof_analyze/prof_exports/cluster_time_summary_export.py @@ -13,89 +13,26 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-from msprof_analyze.prof_common.db_manager import DBManager from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant class CommunicationTimeExport(BaseStatsExport): QUERY = """ SELECT - rdm.rankid AS rank, - si.value AS groupName, - (co.endNs - co.startNs) / 1000.0 AS communication_time, - sii.value AS opName, - step_time.id AS step - FROM - COMMUNICATION_OP co - CROSS JOIN - RANK_DEVICE_MAP rdm - JOIN - STRING_IDS si ON co.groupName = si.id - JOIN - STRING_IDS sii ON co.opName = sii.id - LEFT JOIN STEP_TIME step_time - ON co.startNs >= step_time.startNs - AND co.endNs <= step_time.endNs - """ - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = self.QUERY - - -class MemoryAndDispatchTimeExport(BaseStatsExport): - QUERY = """ - - WITH - computing AS ( - SELECT TASK.startNs, TASK.endNs, CANN_API.startNs as apiStartNs, 0 AS type - FROM COMPUTE_TASK_INFO - JOIN TASK - ON COMPUTE_TASK_INFO.globalTaskId = TASK.globalTaskId - AND TASK.startNs != TASK.endNs - JOIN CANN_API - ON CANN_API.connectionId = TASK.connectionId - ), - communication AS ( - SELECT COMMUNICATION_OP.startNs, COMMUNICATION_OP.endNs, CANN_API.startNs as apiStartNs, 1 AS type - FROM COMMUNICATION_OP - JOIN CANN_API - ON CANN_API.connectionId = COMMUNICATION_OP.connectionId - ), - memory AS ( - SELECT TASK.startNs, TASK.endNs, TASK.startNs as apiStartNs, 4 AS type - FROM TASK - WHERE - taskType = ( - SELECT id - FROM STRING_IDS - WHERE value='MEMCPY_ASYNC' - ) - ), - overlap AS ( - SELECT startNs, endNs, apiStartNs, type - FROM computing - UNION ALL - SELECT startNs, endNs, apiStartNs, type - FROM communication - UNION ALL - SELECT startNs, endNs, apiStartNs, type - FROM memory - ) - SELECT - overlap.startNs AS start, - overlap.endNs AS end, - (overlap.startNs - overlap.apiStartNs) / 1000.0 AS dispatch, - overlap.type, - step_time.id AS step - FROM overlap - LEFT JOIN STEP_TIME 
step_time - ON overlap.apiStartNs >= step_time.startNs - AND overlap.apiStartNs <= step_time.endNs - ORDER BY overlap.startNs, overlap.endNs + RANK_DEVICE_MAP.rankId, + si_group.value AS groupName, + si_op.value AS opName, + (COMMUNICATION_OP.endNs - COMMUNICATION_OP.startNs) / 1000.0 AS communication_time + FROM COMMUNICATION_OP + CROSS JOIN RANK_DEVICE_MAP + JOIN STRING_IDS si_group ON COMMUNICATION_OP.groupName = si_group.id + JOIN STRING_IDS si_op ON COMMUNICATION_OP.opName = si_op.id + JOIN CANN_API ON CANN_API.connectionId = COMMUNICATION_OP.connectionId + {} """ - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = self.QUERY - self.mode = None + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "WHERE CANN_API.startNs >= ? and CANN_API.startNs <= ?" if step_range else "" + self._query = self.QUERY.format(filter_statement) diff --git a/profiler/msprof_analyze/prof_exports/communicaion_info_export.py b/profiler/msprof_analyze/prof_exports/communicaion_info_export.py new file mode 100644 index 0000000000000000000000000000000000000000..d08c2fbc7d064121b1300684549372e6f88f89ab --- /dev/null +++ b/profiler/msprof_analyze/prof_exports/communicaion_info_export.py @@ -0,0 +1,177 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant + +QUERY_COMMUNICATION_PTA = """ +WITH +band AS ( + SELECT + hccl_op_name, + transport_type, + JSON_OBJECT( + 'Transit Time(ms)', transit_time, + 'Transit Size(MB)', transit_size, + 'Bandwidth(GB/s)', bandwidth, + 'Large Packet Ratio', large_packet_ratio + ) AS band_dict + FROM CommAnalyzerBandwidth + WHERE transport_type IN ('SDMA', 'RDMA') +), +sdma AS (SELECT hccl_op_name, band_dict FROM band WHERE transport_type = 'SDMA'), +rdma AS (SELECT hccl_op_name, band_dict FROM band WHERE transport_type = 'RDMA') + +SELECT + time.hccl_op_name, + time.group_name, + time.start_timestamp, + time.elapse_time, + time.step, + time.type, + sdma.band_dict AS sdma_dict, + rdma.band_dict AS rdma_dict +FROM CommAnalyzerTime AS time +LEFT JOIN sdma ON time.hccl_op_name = sdma.hccl_op_name +LEFT JOIN rdma ON time.hccl_op_name = rdma.hccl_op_name +""" + +QUERY_COMMUNICATION_MINDSPORE = """ +WITH +band AS ( + SELECT + hccl_op_name, + transport_type, + JSON_OBJECT( + 'Transit Time(ms)', transit_time, + 'Transit Size(MB)', transit_size, + 'Bandwidth(GB/s)', bandwidth, + 'Large Packet Ratio', large_packet_ratio + ) AS band_dict + FROM CommAnalyzerBandwidth + WHERE transport_type IN ('SDMA', 'RDMA') +), +sdma AS (SELECT hccl_op_name, band_dict FROM band WHERE transport_type = 'SDMA'), +rdma AS (SELECT hccl_op_name, band_dict FROM band WHERE transport_type = 'RDMA') + +SELECT + time.hccl_op_name, + time.group_name, + time.start_timestamp, + time.elapse_time, + sdma.band_dict AS sdma_dict, + rdma.band_dict AS rdma_dict +FROM CommAnalyzerTime AS time +LEFT JOIN sdma ON time.hccl_op_name = sdma.hccl_op_name +LEFT JOIN rdma ON time.hccl_op_name = rdma.hccl_op_name +""" + +QUERY_CLUSTER_COMMUNICATION = """ +WITH +band AS ( + SELECT + hccl_op_name, + band_type, + JSON_OBJECT( + 'Transport Type', band_type, + 'Transit Time(ms)', transit_time, + 'Transit Size(MB)', 
transit_size, + 'Bandwidth(GB/s)', bandwidth, + 'Large Packet Ratio', large_packet_ratio + ) AS band_dict + FROM {band_table} + WHERE band_type IN ('SDMA', 'RDMA') +), +sdma AS ( + SELECT hccl_op_name, band_dict + FROM band + WHERE band_type = 'SDMA' +), +rdma AS ( + SELECT hccl_op_name, band_dict + FROM band + WHERE band_type = 'RDMA' +) + +SELECT + group_map.rank_set, + time.hccl_op_name, + time.group_name, + time.start_timestamp, + time.elapsed_time, + time.step, + time.rank_id, + sdma.band_dict AS sdma_dict, + rdma.band_dict AS rdma_dict +FROM {time_table} AS time +JOIN {group_table} AS group_map + ON time.group_name = group_map.group_name +LEFT JOIN sdma + ON time.hccl_op_name = sdma.hccl_op_name +LEFT JOIN rdma + ON time.hccl_op_name = rdma.hccl_op_name +""" + +QUERY_CLUSTER_BANDWIDTH = """ +SELECT + step, + rank_id, + band_type, + transit_time, + transit_size +FROM {band_table} +WHERE band_type IN ('SDMA', 'RDMA') +""" + +QUERY_CLUSTER_STEP_TRACE_TIME = """ +SELECT * +FROM ClusterStepTraceTime +""" + + +class CommunicationInfoExport(BaseStatsExport): + + def __init__(self, db_path, is_pta): + super().__init__(db_path, "None", {}) + self._query = QUERY_COMMUNICATION_PTA if is_pta else QUERY_COMMUNICATION_MINDSPORE + + +class ClusterAnalysisExport(BaseStatsExport): + def __init__(self, db_path, data_simplification): + super().__init__(db_path, "None", {}) + self.cluster_time_table = "ClusterCommunicationTime" if data_simplification else "ClusterCommAnalyzerTime" + self.cluster_band_table = "ClusterCommunicationBandwidth" if data_simplification \ + else "ClusterCommAnalyzerBandwidth" + self.cluster_group_table = "CommunicationGroupMapping" if data_simplification else "CommunicationGroup" + + +class ClusterStepTraceTimeExport(ClusterAnalysisExport): + def __init__(self, db_path): + super().__init__(db_path, False) + self._query = QUERY_CLUSTER_STEP_TRACE_TIME + + +class ClusterCommunicationInfoExport(ClusterAnalysisExport): + def __init__(self, db_path, 
data_simplification): + super().__init__(db_path, data_simplification) + self._query = QUERY_CLUSTER_COMMUNICATION.format(time_table=self.cluster_time_table, + band_table=self.cluster_band_table, + group_table=self.cluster_group_table) + + +class ClusterBandwidthInfoExport(ClusterAnalysisExport): + def __init__(self, db_path, data_simplification): + super().__init__(db_path, data_simplification) + self._query = QUERY_CLUSTER_BANDWIDTH.format(band_table=self.cluster_band_table) diff --git a/profiler/msprof_analyze/prof_exports/compute_op_sum_export.py b/profiler/msprof_analyze/prof_exports/compute_op_sum_export.py index ed41d128056368b7a0e35f51e78edd95ce746486..24a2cc2d990976b5fa6e4c26a8b4dd745a88a767 100644 --- a/profiler/msprof_analyze/prof_exports/compute_op_sum_export.py +++ b/profiler/msprof_analyze/prof_exports/compute_op_sum_export.py @@ -14,6 +14,7 @@ # limitations under the License. from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant QUERY = """ SELECT @@ -38,6 +39,7 @@ LEFT JOIN LEFT JOIN STRING_IDS AS INPUTSHAPES_IDS ON INPUTSHAPES_IDS.id == COMPUTE_TASK_INFO.inputShapes +{} """ QUERY_EXCLUDE_OPNAME = """ @@ -59,18 +61,21 @@ LEFT JOIN LEFT JOIN STRING_IDS AS INPUTSHAPES_IDS ON INPUTSHAPES_IDS.id == COMPUTE_TASK_INFO.inputShapes +{} """ class ComputeOpSumExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "WHERE TASK.startNs >= ? and TASK.startNs <= ?" 
if step_range else "" + self._query = QUERY.format(filter_statement) class ComputeOpSumExportExcludeOpName(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY_EXCLUDE_OPNAME \ No newline at end of file + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "WHERE TASK.startNs >= ? and TASK.startNs <= ?" if step_range else "" + self._query = QUERY_EXCLUDE_OPNAME.format(filter_statement) diff --git a/profiler/msprof_analyze/prof_exports/mstx2commop_export.py b/profiler/msprof_analyze/prof_exports/ep_load_balance_ecport.py similarity index 44% rename from profiler/msprof_analyze/prof_exports/mstx2commop_export.py rename to profiler/msprof_analyze/prof_exports/ep_load_balance_ecport.py index 8c68bb4527c363d83941289c4c9b2ae5d86aa2fd..59acd6bdde7d23ad6d645d420c3f6f9fcf7cd2b6 100644 --- a/profiler/msprof_analyze/prof_exports/mstx2commop_export.py +++ b/profiler/msprof_analyze/prof_exports/ep_load_balance_ecport.py @@ -1,4 +1,4 @@ -# Copyright (c) 2024, Huawei Technologies Co., Ltd. +# Copyright (c) 2025, Huawei Technologies Co., Ltd. # All rights reserved. # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -14,26 +14,28 @@ # limitations under the License. 
from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant -QUERY = """ -SELECT - ta.startNs, - ta.endNs, - ta.connectionId, - si.value -from - MSTX_EVENTS ms -JOIN - TASK ta - ON ms.connectionId == ta.connectionId -JOIN - STRING_IDS si - ON ms.message == si.id +GROUPED_MATMUL_QUERY = """ +SELECT + InputShapes_IDS.value AS "InputShapes" +FROM COMPUTE_TASK_INFO +JOIN TASK + ON COMPUTE_TASK_INFO.globalTaskId = TASK.globalTaskId +LEFT JOIN STRING_IDS AS InputShapes_IDS + ON InputShapes_IDS.id = COMPUTE_TASK_INFO.inputShapes +WHERE COMPUTE_TASK_INFO.opType = ( + SELECT id + FROM STRING_IDS + WHERE value = 'GroupedMatmul' +) +{} """ -class Mstx2CommopExport(BaseStatsExport): +class InputShapeExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "And TASK.startNs >= ? And TASK.endNs <= ?" if step_range else "" + self._query = GROUPED_MATMUL_QUERY.format(filter_statement) diff --git a/profiler/msprof_analyze/prof_exports/filter_db_export.py b/profiler/msprof_analyze/prof_exports/filter_db_export.py deleted file mode 100644 index 048b20a260d25ec48c17bd2dc85d70f15b177910..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_exports/filter_db_export.py +++ /dev/null @@ -1,102 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport -from msprof_analyze.prof_common.logger import get_logger - -logger = get_logger() - -FILTER_TABLES = ["MatMulV3", "MatMulV2", "GroupedMatmul", "FlashAttentionScore", "FlashAttentionScoreGrad"] -values_str = ', '.join([f"'{op_type}'" for op_type in FILTER_TABLES]) - -OP_QUERY = f""" -SELECT COMPUTE_TASK_INFO.* -FROM COMPUTE_TASK_INFO - WHERE - opType IN ( - SELECT - id - FROM - STRING_IDS - WHERE - value IN ({values_str}) - ) -""" - -TASK_QUERY = """ -SELECT TASK.* -FROM TASK -INNER JOIN COMPUTE_TASK_INFO -ON TASK.globalTaskId = COMPUTE_TASK_INFO.globalTaskId; -""" - -CANN_QUERY = """ -WITH all_connection_ids AS ( - SELECT connectionId - FROM TASK - UNION - SELECT connectionId - FROM COMMUNICATION_OP -) - -SELECT CANN_API.* -FROM CANN_API -INNER JOIN all_connection_ids -ON CANN_API.connectionId = all_connection_ids.connectionId; -""" - -PYTORCH_QUERY = """ -WITH all_connection_ids AS ( - SELECT connectionId - FROM TASK - UNION - SELECT connectionId - FROM COMMUNICATION_OP -) - -SELECT PYTORCH_API.* -FROM PYTORCH_API - -INNER JOIN all_connection_ids -ON PYTORCH_API.connectionId = all_connection_ids.connectionId; -""" - - -class OPFilter(BaseStatsExport): - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = OP_QUERY - - -class TaskFilter(BaseStatsExport): - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = TASK_QUERY - - -class CANNFilter(BaseStatsExport): - - def __init__(self, 
db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = CANN_QUERY - - -class PYTORCHFilter(BaseStatsExport): - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = PYTORCH_QUERY \ No newline at end of file diff --git a/profiler/msprof_analyze/prof_exports/hccl_sum_export.py b/profiler/msprof_analyze/prof_exports/hccl_sum_export.py index 2470e059ffcfb116f1dad657de53d5aa7ddd865b..80750ef88ddfeceb54b35f3083ce6f6a0ae318eb 100644 --- a/profiler/msprof_analyze/prof_exports/hccl_sum_export.py +++ b/profiler/msprof_analyze/prof_exports/hccl_sum_export.py @@ -14,6 +14,7 @@ # limitations under the License. from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant QUERY = """ SELECT @@ -32,11 +33,13 @@ LEFT JOIN LEFT JOIN STRING_IDS AS GROUP_NAME_IDS ON GROUP_NAME_IDS.id == COMMUNICATION_OP.groupName +{} """ class HcclSumExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_stat = "WHERE COMMUNICATION_OP.startNs >= ? and COMMUNICATION_OP.startNs <= ?" if step_range else "" + self._query = QUERY.format(filter_stat) diff --git a/profiler/msprof_analyze/prof_exports/mstx_mark_export.py b/profiler/msprof_analyze/prof_exports/mstx_event_export.py similarity index 45% rename from profiler/msprof_analyze/prof_exports/mstx_mark_export.py rename to profiler/msprof_analyze/prof_exports/mstx_event_export.py index 9b561d9f066687efa373fcbba6dcaaae2e492eff..76bf4672b616b94d639f75c89782b07f3a175dcc 100644 --- a/profiler/msprof_analyze/prof_exports/mstx_mark_export.py +++ b/profiler/msprof_analyze/prof_exports/mstx_event_export.py @@ -14,8 +14,9 @@ # limitations under the License. 
from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport +from msprof_analyze.prof_common.constant import Constant -QUERY = """ +MARK_QUERY = """ WITH FRAMEWORK_API AS ( SELECT @@ -26,6 +27,7 @@ WITH LEFT JOIN CONNECTION_IDS ON PYTORCH_API.connectionId == CONNECTION_IDS.id + {} ) SELECT MSG_IDS.value AS "msg", @@ -44,6 +46,8 @@ LEFT JOIN LEFT JOIN STRING_IDS AS MSG_IDS ON MSTX_EVENTS.message == MSG_IDS.id +WHERE + MSTX_EVENTS.eventType == 3 {} ORDER BY MSTX_EVENTS.startNs """ @@ -51,6 +55,50 @@ ORDER BY class MstxMarkExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + self._query = self.get_query_statement() + self._param = (step_range.get(Constant.START_NS), step_range.get(Constant.END_NS), + step_range.get(Constant.START_NS), + step_range.get(Constant.END_NS)) if step_range else None + + def get_query_statement(self): + if self._step_range: + filter_statement_1 = "WHERE PYTORCH_API.startNs >= ? AND PYTORCH_API.startNs <= ?" + filter_statement_2 = "AND MSTX_EVENTS.startNs >= ? AND MSTX_EVENTS.startNs <= ?" 
+ else: + filter_statement_1, filter_statement_2 = "", "" + return MARK_QUERY.format(filter_statement_1, filter_statement_2) + + +RANGE_QUERY = ''' +SELECT + MSG_IDS.value AS "msg", + MSTX_EVENTS.startNs AS "cann_start_ts", + MSTX_EVENTS.endNs AS "cann_end_ts", + TASK.startNs AS "device_start_ts", + TASK.endNs AS "device_end_ts", + MSTX_EVENTS.globalTid AS "tid" +FROM + MSTX_EVENTS +LEFT JOIN + TASK + ON MSTX_EVENTS.connectionId == TASK.connectionId +LEFT JOIN + STRING_IDS AS MSG_IDS + ON MSTX_EVENTS.message == MSG_IDS.id +WHERE + MSTX_EVENTS.eventType == 2 {} +AND + MSTX_EVENTS.connectionId != 4294967295 +ORDER BY + MSTX_EVENTS.startNs + ''' + + +class MstxRangeExport(BaseStatsExport): + + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) + filter_statement = "AND MSTX_EVENTS.startNs >= ? AND MSTX_EVENTS.startNs <= ?" if step_range else "" + self._query = RANGE_QUERY.format(filter_statement) diff --git a/profiler/msprof_analyze/prof_exports/mstx_step_export.py b/profiler/msprof_analyze/prof_exports/mstx_step_export.py index 3051a280ccb1c9eb2a83933c357948bcf59b4d1f..c8aec91b7e5ce5fb29fffebeb8668fec723e3fa8 100644 --- a/profiler/msprof_analyze/prof_exports/mstx_step_export.py +++ b/profiler/msprof_analyze/prof_exports/mstx_step_export.py @@ -29,6 +29,6 @@ ORDER BY class MstxStepExport(BaseStatsExport): - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) + def __init__(self, db_path, recipe_name, step_range): + super().__init__(db_path, recipe_name, step_range) self._query = QUERY diff --git a/profiler/msprof_analyze/prof_exports/p2p_pairing_export.py b/profiler/msprof_analyze/prof_exports/p2p_pairing_export.py deleted file mode 100644 index 2f6a73942619e1bad19eb5978893363acb6cca73..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/prof_exports/p2p_pairing_export.py +++ /dev/null @@ -1,71 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. 
-# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -from string import Template - -from msprof_analyze.cluster_analyse.common_func.table_constant import TableConstant -from msprof_analyze.prof_exports.base_stats_export import BaseStatsExport - - -QUERY = Template(""" -SELECT - co.opName AS "$opNameId", - siii.value AS "$opName", - co.startNs AS "$startTime", - co.endNs AS "$endTime", - rdm.rankId AS "$globalRank", - cti.srcRank AS "$srcRank", - cti.dstRank AS "$dstRank", - siiii.value AS "$taskType", - sii.value AS "$coGroupName", - si.value AS "$ctiGroupName" -FROM - COMMUNICATION_TASK_INFO cti - LEFT JOIN COMMUNICATION_OP co on cti.opId = co.opId - CROSS JOIN RANK_DEVICE_MAP rdm - JOIN STRING_IDS si on cti.groupName = si.id - JOIN STRING_IDS sii on co.groupName = sii.id - JOIN STRING_IDS siii on co.opName = siii.id - JOIN STRING_IDS siiii on cti.taskType = siiii.id -""") - - -class P2PPairingExport(BaseStatsExport): - - CO_OP_NAME = "opNameId" - OP_NAME = "opName" - START_TIME = "startTime" - END_TIME = "endTime" - GLOBAL_RANK = "globalRank" - SRC_RANK = "srcRank" - DST_RANK = "dstRank" - TASK_TYPE = "taskType" - CO_GROUP_NAME = "coGroupName" - CTI_GROUP_NAME = "ctiGroupName" - - - def __init__(self, db_path, recipe_name): - super().__init__(db_path, recipe_name) - self._query = QUERY.safe_substitute( - opNameId=self.CO_OP_NAME, - opName=self.OP_NAME, - startTime=self.START_TIME, - endTime=self.END_TIME, - globalRank=self.GLOBAL_RANK, - 
srcRank=self.SRC_RANK, - dstRank=self.DST_RANK, - taskType=self.TASK_TYPE, - coGroupName=self.CO_GROUP_NAME, - ctiGroupName=self.CTI_GROUP_NAME - ) diff --git a/profiler/msprof_analyze/requirements/build.txt b/profiler/msprof_analyze/requirements/build.txt index 9bb3af4b2a9cdb8401a8c9c44bc6140fc5dc80ec..3ef20e787be3bad76de0ccde4dc3e3a1dbe63efb 100644 --- a/profiler/msprof_analyze/requirements/build.txt +++ b/profiler/msprof_analyze/requirements/build.txt @@ -7,7 +7,7 @@ tqdm prettytable ijson requests -xlsxwriter +xlsxwriter>=3.0.6 sqlalchemy urllib3<2.0 numpy<=1.26.4 diff --git a/profiler/msprof_analyze/test/run_st.py b/profiler/msprof_analyze/test/run_st.py index e15bf17a2f4bdad495d2b48be7304b98b416241f..6045f1c52a2c48f306af297b3bb43d8567e0cd6d 100644 --- a/profiler/msprof_analyze/test/run_st.py +++ b/profiler/msprof_analyze/test/run_st.py @@ -20,6 +20,8 @@ import sys import threading stop_print_thread = False +# The current CI environment does not support this test case +BLACKLIST_FILES = ["test_cann_api_sum.py"] def print_stout(output): @@ -43,6 +45,10 @@ def stop_stout_threads(thread_list): def start_st_process(module_name): st_path = os.path.join(os.path.abspath(os.path.dirname(__file__)), "st", module_name) cmd = ["python3", "-m", "pytest", "-s", st_path] + for case in BLACKLIST_FILES: + ignored_case_path = os.path.join(st_path, case) + if os.path.exists(ignored_case_path): + cmd.extend(["--ignore", ignored_case_path]) process = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) stout_thread = threading.Thread(target=print_stout, args=(process.stdout,)) stout_thread.start() diff --git a/profiler/msprof_analyze/test/st/advisor/test_advisor_cmd_single_ascend_pt_compare.py b/profiler/msprof_analyze/test/st/advisor/test_advisor_cmd_single_ascend_pt_compare.py index a485d62188deaa661499b79b94c94ce990398fcd..c6eca362e09af1b009d2beee88dfb70169a6a40f 100644 --- a/profiler/msprof_analyze/test/st/advisor/test_advisor_cmd_single_ascend_pt_compare.py +++ 
b/profiler/msprof_analyze/test/st/advisor/test_advisor_cmd_single_ascend_pt_compare.py @@ -13,7 +13,6 @@ # See the License for the specific language governing permissions and # limitations under the License. import os -import subprocess import logging from unittest import TestCase @@ -23,11 +22,10 @@ from bs4 import BeautifulSoup from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.advisor.utils import get_files, execute_cmd +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestAdvisorCmdSingleAscendPtNoCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2", "n122-120-121_12321_20240911113658382_ascend_pt") @@ -53,11 +51,11 @@ class TestAdvisorCmdSingleAscendPtNoCompare(TestCase): "Kernel compare of Target and Benchmark", "Byte Alignment Analysis", "Bandwidth Contention Analysis", - "AICPU operator", - "Dynamic Shape Operator", - "FUSIBLE OPERATOR ANALYSIS", - "Affinity Apis", - "Operator Dispatch" + "AICPU Issues", + "Operator Dynamic Shape Issues", + "Fusible Operator Analysis", + "Affinity API Issues", + "Operator Dispatch Issues" ] # True presents the attr is nan @@ -231,7 +229,7 @@ class TestAdvisorCmdSingleAscendPtNoCompare(TestCase): b_names = ["Square", "Suggestion 1:", "Equal", "Suggestion 1:"] try: - df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='AICPU operator', header=0) + df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='AICPU Issues', header=0) except FileNotFoundError: logging.error("File %s not found.", str(self.RESULT_EXCEL.get("all", None))) return @@ -285,7 +283,7 @@ class TestAdvisorCmdSingleAscendPtNoCompare(TestCase): ignore_api = ["torch_npu.optim.NpuFusedAdamW", 
"torch_npu.npu_confusion_transpose"] try: - df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='Affinity Apis', header=0) + df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='Affinity API Issues', header=0) except FileNotFoundError: logging.error("File %s not found.", str(self.RESULT_EXCEL.get("all", None))) return @@ -315,7 +313,7 @@ class TestAdvisorCmdSingleAscendPtNoCompare(TestCase): t1_elapsed_time = ['58486.704798215804'] try: - df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='Operator Dispatch', header=0) + df = pd.read_excel(self.RESULT_EXCEL.get("all", None), sheet_name='Operator Dispatch Issues', header=0) except FileNotFoundError: logging.error("File %s not found.", str(self.RESULT_EXCEL.get("all", None))) return diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cann_api_sum.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cann_api_sum.py new file mode 100644 index 0000000000000000000000000000000000000000..b612967a3d6b924da28e2e35f04eb002ac52c020 --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cann_api_sum.py @@ -0,0 +1,96 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +from unittest import TestCase +import pandas as pd + +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestCannApiSum(TestCase): + """ + Test recipe: cann_api_sum + """ + TABLE_CANN_API_SUM = "CannApiSum" + TABLE_CANN_API_SUM_RANK = "CannApiSumRank" + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestCannApiSum") + COMMAND_SUCCESS = 0 + + def setup_class(self): + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "cann_api_sum", + "--output_path", self.OUTPUT_PATH, "--force"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("CannApiSum task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + self.db_path_base = os.path.join(self.CLUSTER_PATH, "cluster_analysis_output_base", + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn_base, self.cursor_base = DBManager.create_connect_db(self.db_path_base) + + + def teardown_class(self): + DBManager.destroy_db_connect(self.conn, self.cursor) + DBManager.destroy_db_connect(self.conn_base, self.cursor_base) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + def check_tables_in_db(self): + expected_tables = [ + TestCannApiSum.TABLE_CANN_API_SUM, + TestCannApiSum.TABLE_CANN_API_SUM_RANK, + ] + return DBManager.check_tables_in_db(self.db_path, *expected_tables) + + def check_cann_api_sum_columns(self): + """ + Check the column headers of CannApiSum + """ + expected_columns = ["name", "timeRatio", "totalTimeNs", "totalCount", 
"averageNs", "Q1Ns", "medNs", + "Q3Ns", "minNs", "maxNs", "stdev", "minRank", "maxRank"] + return DBManager.get_table_columns_name(self.cursor, TestCannApiSum.TABLE_CANN_API_SUM) == expected_columns + + def check_cann_api_sum_rank_columns(self): + """ + 检查CannApiSumRank的表头 + """ + expected_columns = ["index", "name", "durationRatio", "totalTimeNs", "totalCount", "averageNs", "minNs", + "Q1Ns", "medNs", "Q3Ns", "maxNs", "stdev", "rank"] + return DBManager.get_table_columns_name(self.cursor, + TestCannApiSum.TABLE_CANN_API_SUM_RANK) == expected_columns + + def test_cann_api_sum_should_run_success_when_given_cluster_data(self): + self.assertTrue(self.check_tables_in_db(), msg="DB does not exist or is missing tables.") + self.assertTrue(self.check_cann_api_sum_columns(), + msg=f"The header of {self.TABLE_CANN_API_SUM} does not meet expectations.") + self.assertTrue(self.check_cann_api_sum_rank_columns(), + msg=f"The header of {self.TABLE_CANN_API_SUM_RANK} does not meet expectations.") + + def test_cann_api_sum_data_when_given_cluster_data(self): + query = f"select * from {self.TABLE_CANN_API_SUM}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def test_cann_api_sum_rank_data_when_given_cluster_data(self): + query = f"select * from {self.TABLE_CANN_API_SUM_RANK}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) \ No newline at end of file diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_mindspore_db.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_mindspore_db.py new file mode 100644 index 0000000000000000000000000000000000000000..b19d9d50c0e2b041d9df1cd388c6e894d26d76df --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_mindspore_db.py @@ -0,0 +1,105 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. 
+# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import os +import unittest +from unittest import TestCase + +import pandas as pd + +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestClusterAnalyseMindsporeDb(TestCase): + """ + Test cluster analyse mindspore db + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_mindspore_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestClusterAnalyseMindsporeDb") + COMMAND_SUCCESS = 0 + RUN_TEST = os.path.exists(CLUSTER_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def setup_class(self): + # generate db data + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "all", + "--output_path", self.OUTPUT_PATH, "--force", "--data_simplification"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("mindspore db cluster analyse task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, "cluster_analysis_output", "cluster_analysis.db") + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def 
teardown_class(self): + # Delete db Data + DBManager.destroy_db_connect(self.conn, self.cursor) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_host_info_data(self): + query = "select hostName from HostInfo" + data = pd.read_sql(query, self.conn) + self.assertEqual(data["hostName"].tolist(), ["90-90-81-187"]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_rank_device_map_data(self): + query = "select * from RankDeviceMap" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_step_trace_time_data(self): + query = "select * from ClusterStepTraceTime" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + flag = 470398.169 in data["computing"].tolist() + self.assertTrue(flag) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_group_map_data(self): + query = "select * from CommunicationGroupMapping" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 15) + data = data[data["group_name"] == '11959133092869241673'] + self.assertEqual(data["rank_set"].tolist(), ["(0,1,2,3,4,5,6,7)"]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_matrix_data(self): + query = "SELECT * FROM ClusterCommunicationMatrix WHERE hccl_op_name = 'Total Op Info' " + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 71) + query = "SELECT transport_type, transit_size, transit_time, bandwidth FROM ClusterCommunicationMatrix WHERE " \ + "hccl_op_name='Total Op Info' and group_name='11262865095472569221' and src_rank=5 and dst_rank=1" + data = pd.read_sql(query, self.conn) + self.assertEqual(data.iloc[0].tolist(), ['HCCS', 37.748736, 
3.609372109375, 10.458532635621374]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_time_data(self): + query = "select rank_id, count(0) cnt from ClusterCommunicationTime where hccl_op_name = " \ + "'Total Op Info' group by rank_id" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + self.assertEqual(data["cnt"].tolist(), [4 for _ in range(8)]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_bandwidth_data(self): + query = "select * from ClusterCommunicationBandwidth where hccl_op_name = 'Total Op Info' and " \ + "group_name='739319275709983152' order by count" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 15) + self.assertEqual(data["count"].tolist(), [0, 0, 0, 0, 18, 24, 24, 24, 120, 120, 120, 387, 387, 387, 387]) diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_msprof_db.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_msprof_db.py new file mode 100644 index 0000000000000000000000000000000000000000..e468004ac403aaefcc3664294e15ea8a45435c3c --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_msprof_db.py @@ -0,0 +1,105 @@ +# Copyright (c) 2024-2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +import unittest +from unittest import TestCase + +import pandas as pd + +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestClusterAnalyseMsprofDb(TestCase): + """ + Test cluster analyse msprof db + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_msprof_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestClusterAnalyseMsprofDb") + COMMAND_SUCCESS = 0 + RUN_TEST = os.path.exists(CLUSTER_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def setup_class(self): + # generate db data + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "all", + "--output_path", self.OUTPUT_PATH, "--force", "--data_simplification"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("msprof db cluster analyse task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, "cluster_analysis_output", "cluster_analysis.db") + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def teardown_class(self): + # Delete db Data + DBManager.destroy_db_connect(self.conn, self.cursor) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_host_info_data(self): + query = "select hostName from HostInfo" + data = pd.read_sql(query, self.conn) + self.assertEqual(data["hostName"].tolist(), ["90-90-81-187"]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_rank_device_map_data(self): + query = "select * from RankDeviceMap" + 
data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_step_trace_time_data(self): + query = "select * from ClusterStepTraceTime" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + flag = 470398.169 in data["computing"].tolist() + self.assertTrue(flag) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_group_map_data(self): + query = "select * from CommunicationGroupMapping" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 15) + data = data[data["group_name"] == '11959133092869241673'] + self.assertEqual(data["rank_set"].tolist(), ["(0,1,2,3,4,5,6,7)"]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_matrix_data(self): + query = "SELECT * FROM ClusterCommunicationMatrix WHERE hccl_op_name = 'Total Op Info' " + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 71) + query = "SELECT transport_type, transit_size, transit_time, bandwidth FROM ClusterCommunicationMatrix WHERE " \ + "hccl_op_name='Total Op Info' and group_name='11262865095472569221' and src_rank=5 and dst_rank=1" + data = pd.read_sql(query, self.conn) + self.assertEqual(data.iloc[0].tolist(), ['HCCS', 37.748736, 3.609372109375, 10.458532635621374]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_time_data(self): + query = "select rank_id, count(0) cnt from ClusterCommunicationTime where hccl_op_name = " \ + "'Total Op Info' group by rank_id" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 8) + self.assertEqual(data["cnt"].tolist(), [4 for _ in range(8)]) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_comm_bandwidth_data(self): + query = "select * from 
ClusterCommunicationBandwidth where hccl_op_name = 'Total Op Info' and " \ + "group_name='739319275709983152' order by count" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 15) + self.assertEqual(data["count"].tolist(), [0, 0, 0, 0, 18, 24, 24, 24, 120, 120, 120, 387, 387, 387, 387]) diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db.py index bbc07adebfb5fd96da6f61cdd9a24e107b3653c8..ce1c877838a6fdaf1149c2b22fcd6c64409470a1 100644 --- a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db.py +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db.py @@ -14,7 +14,6 @@ # limitations under the License. """Test cluster analyse pytorch db""" import os - from unittest import TestCase import pandas as pd @@ -29,14 +28,13 @@ from msprof_analyze.test.st.cluster_analyse.cluster_communication_analyzer_matri from msprof_analyze.test.st.cluster_analyse.cluster_communication_analyzer_time_db \ import ClusterCommunicationAnalyzerTime from msprof_analyze.test.st.cluster_analyse.cluster_step_trace_time_db import ClusterStepTraceTimeDb +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestClusterAnalysePytorchDb(TestCase): """ Test cluster analyse pytorch db """ - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data/") CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") db_path = "" STEP_TRACE_TIME_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db", "cluster_analysis_output_text", @@ -45,20 +43,21 @@ class TestClusterAnalysePytorchDb(TestCase): "cluster_analysis_output", "cluster_communication_matrix.json") COMMUNICATION_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db", "cluster_analysis_output_text", "cluster_analysis_output", "cluster_communication.json") + OUTPUT_PATH = 
os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestClusterAnalysePytorchDb") COMMAND_SUCCESS = 0 def setup_class(self): # generate db data - PathManager.make_dir_safety(self.ST_DATA_PATH) + PathManager.make_dir_safety(self.OUTPUT_PATH) cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "all", - "--output_path", self.ST_DATA_PATH, "--force"] - if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.ST_DATA_PATH): + "--output_path", self.OUTPUT_PATH, "--force"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): self.fail("pytorch db cluster analyse task failed.") - self.db_path = os.path.join(self.ST_DATA_PATH, "cluster_analysis_output", "cluster_analysis.db") + self.db_path = os.path.join(self.OUTPUT_PATH, "cluster_analysis_output", "cluster_analysis.db") def teardown_class(self): # Delete db Data - PathManager.remove_path_safety(os.path.join(self.ST_DATA_PATH, "cluster_analysis_output")) + PathManager.remove_path_safety(self.OUTPUT_PATH) def test_msprof_analyze_text_db_trace_time_compare(self): """ @@ -70,10 +69,11 @@ class TestClusterAnalysePytorchDb(TestCase): "Cluster step trace time count wrong.") query = "SELECT * FROM ClusterStepTraceTime where type= 'rank' and [index] = 7" db_cluster_step_trace_time = select_by_query(self.db_path, query, ClusterStepTraceTimeDb) + df = df[df["Index"] == 7] text_cluster_step_trace_time = ClusterStepTraceTimeDb(*df.iloc[0]) self.assertEqual(text_cluster_step_trace_time.type, db_cluster_step_trace_time.type, "Cluster step trace time db vs text 'type' property wrong.") - self.assertEqual(text_cluster_step_trace_time.index, db_cluster_step_trace_time.index, + self.assertEqual(str(text_cluster_step_trace_time.index), str(db_cluster_step_trace_time.index), "Cluster step trace time db vs text 'index' property wrong.") self.assertEqual(round(text_cluster_step_trace_time.computing), round(db_cluster_step_trace_time.computing), "Cluster step trace time db vs 
text 'computing' property wrong.") @@ -201,4 +201,3 @@ class TestClusterAnalysePytorchDb(TestCase): self.assertEqual(round(text_cluster_communication_analyzer_time.get('Synchronization Time Ratio')), round(db_cluster_communication_analyzer_time.synchronization_time_ratio), "Cluster communication time db vs text 'Synchronization Time Ratio' property wrong.") - diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db_simplification.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db_simplification.py new file mode 100644 index 0000000000000000000000000000000000000000..a0aad4c01162783146df68913261431ee15106ed --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_db_simplification.py @@ -0,0 +1,94 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +from unittest import TestCase + +import pandas as pd + +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestClusterAnalysePytorchDbSimplification(TestCase): + """ + Test cluster analyse pytorch db in data simplification + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestClusterAnalysePytorchDbSimplification") + COMMAND_SUCCESS = 0 + + def setup_class(self): + # generate db data + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "all", + "--output_path", self.OUTPUT_PATH, "--force", "--data_simplification"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("pytorch db cluster analyse task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, "cluster_analysis_output", "cluster_analysis.db") + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + + def teardown_class(self): + # Delete db Data + DBManager.destroy_db_connect(self.conn, self.cursor) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + def test_host_info_data(self): + query = "select hostName from HostInfo" + data = pd.read_sql(query, self.conn) + self.assertEqual(data["hostName"].tolist(), ["n122-120-121"]) + + def test_rank_device_map_data(self): + query = "select * from RankDeviceMap" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 16) + + def test_step_trace_time_data(self): + query = "select * from ClusterStepTraceTime" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 16) + flag = 14945901.524 in data["computing"].tolist() + self.assertTrue(flag) + + def test_comm_group_map_data(self): + query = "select * from 
CommunicationGroupMapping" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 33) + data = data[data["group_name"] == '7519234732706649132'] + self.assertEqual(data["rank_set"].tolist(), ["(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)"]) + + def test_comm_matrix_data(self): + query = "SELECT * FROM ClusterCommunicationMatrix WHERE hccl_op_name = 'Total Op Info' " + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 232) + query = "SELECT transport_type, transit_size, transit_time, bandwidth FROM ClusterCommunicationMatrix WHERE " \ + "hccl_op_name='Total Op Info' and group_name='1046397798680881114' and src_rank=12 and dst_rank=4" + data = pd.read_sql(query, self.conn) + self.assertEqual(data.iloc[0].tolist(), ['RDMA', 59341.69862400028, 17684.277734, 3.3556190146182354]) + + def test_comm_time_data(self): + query = "select rank_id, count(0) cnt from ClusterCommunicationTime where hccl_op_name = " \ + "'Total Op Info' group by rank_id" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 16) + self.assertEqual(data["cnt"].tolist(), [4 for _ in range(16)]) + + def test_comm_bandwidth_data(self): + query = "select * from ClusterCommunicationBandwidth where hccl_op_name = 'Total Op Info' and " \ + "group_name='12703750860003234865' order by count" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 2) + self.assertEqual(data["count"].tolist(), [2, 36]) diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_text.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_text.py index d6f8d470109bf21a33bfde683bafb0f41660dcb0..2817c93eb7aa766077b24b82db6f8c32c5d5dd22 100644 --- a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_text.py +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_pytorch_text.py @@ -18,15 +18,15 @@ import logging import subprocess import pandas as pd from unittest import TestCase 
+ from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestClusterAnalyseCmdPytorchText(TestCase): """ PyTorch text type cluster data """ - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2") OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "ClusterAnalyseCmdPytorchText") @@ -139,9 +139,13 @@ class TestClusterAnalyseCmdPytorchText(TestCase): "Communication(Not Overlapped and Exclude Receive)", "Preparing"] self.assertEqual(headers, df.columns.tolist(), "PyTorch text result columns wrong.") - data_base = ["rank", "7", 14945901.573999925, 50289541.49199608, 14462809.01400388, 64752350.50599996, + data_base = ["rank", 7, 14945901.573999925, 50289541.49199608, 14462809.01400388, 64752350.50599996, 377397.078000026, 65612840.25, 0.0, 50289541.49199608, 1726054679554437.8] - self.assertEqual(data_base, df.iloc[0].loc["Type":"Preparing"].tolist(), "PyTorch text result data wrong.") + + rank_7_df = df[df["Index"] == 7] + self.assertEqual(len(rank_7_df), 1, "PyTorch text result data wrong.") + data_compare = rank_7_df.iloc[0].loc["Type":"Preparing"].values.tolist() + self.assertEqual(data_base, data_compare, "PyTorch text result data wrong.") def communication_matrix_compare(self): """ @@ -153,10 +157,11 @@ class TestClusterAnalyseCmdPytorchText(TestCase): for header in headers: result_data = result_data.get(header, {}) compare_data = [] - for data in list(result_data.values())[:12]: + result_data = {k: result_data[k] for k in sorted(result_data.keys(), reverse=True)} + for data in list(result_data.values()): compare_data.append(data.get("Bandwidth(GB/s)", -1)) - data_base = [25.0568, 641.8677, 23.4726, 23.2394, 626.9544, 24.9039, - 22.7738, 23.0614, 640.6486, 25.7812, 23.1025, 23.2896] + data_base = [641.8677, 23.4726, 23.2394, 25.0568, 22.7738, 
626.9544, 23.0614, 24.9039, + 23.2896, 23.1025, 640.6486, 25.7812, 23.1077, 22.9017, 23.2811, 629.2938] self.assertEqual(data_base, compare_data, "PyTorch text result data wrong.") def communication_compare(self): @@ -170,6 +175,7 @@ class TestClusterAnalyseCmdPytorchText(TestCase): for header in headers: result_data = result_data.get(header, {}) board_datas = [] + result_data = {k: result_data[k] for k in sorted(result_data.keys(), reverse=True)} for data in list(result_data.values())[:2]: board_datas.append(data.get("Communication Time Info", {})) compare_data = [] diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_step_id_param.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_step_id_param.py new file mode 100644 index 0000000000000000000000000000000000000000..fc4e682f7abdc4426d568c6ec0f67dd3d6a0e9b8 --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_cluster_analyse_step_id_param.py @@ -0,0 +1,90 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +from unittest import TestCase + +import pandas as pd + +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestClusterAnalyseStepIdParam(TestCase): + """ + Test cluster analyse pytorch db with step_id param + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestClusterAnalyseStepIdParam") + COMMAND_SUCCESS = 0 + + def setup_class(self): + # generate db data + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "hccl_sum", + "--output_path", self.OUTPUT_PATH, "--force", "--step_id", "5"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("pytorch db cluster analyse task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, "cluster_analysis_output", "cluster_analysis.db") + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + + def teardown_class(self): + # Delete db Data + DBManager.destroy_db_connect(self.conn, self.cursor) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + def test_all_rank_stats(self): + query = "select * from HcclAllRankStats" + data = pd.read_sql(query, self.conn) + self.assertEqual(data[data["OpType"] == "hcom_allGather_"]["Count"].tolist()[0], 12160) + self.assertEqual(data[data["OpType"] == "hcom_allReduce_"]["Count"].tolist()[0], 192) + self.assertEqual(data[data["OpType"] == "hcom_broadcast_"]["Count"].tolist()[0], 48) + self.assertEqual(data[data["OpType"] == "hcom_reduceScatter_"]["Count"].tolist()[0], 7472) + + def test_group_name_map(self): + query = "select * from HcclGroupNameMap" + data = pd.read_sql(query, self.conn) + self.assertEqual(len(data), 33) + + def test_per_rank_stats(self): + query = "select 
Rank, sum(Count) cnt from HcclPerRankStats group by Rank" + data = pd.read_sql(query, self.conn) + for rank in range(16): + self.assertEqual(data[data["Rank"] == rank]["cnt"].tolist()[0], 1242) + + def test_top_op_stats(self): + check_data = { + "hcom_allReduce__606_0_1": 7, + "hcom_allReduce__058_0_1": 15, + "hcom_allReduce__184_0_1": 11, + "hcom_allReduce__286_0_1": 4, + "hcom_allReduce__053_0_1": 9, + "hcom_allReduce__408_0_1": 5, + "hcom_allReduce__865_0_1": 0, + "hcom_allReduce__618_0_1": 12, + "hcom_allReduce__532_0_1": 3, + "hcom_allReduce__809_0_1": 1, + "hcom_allReduce__444_0_1": 8, + "hcom_allReduce__740_0_1": 13, + "hcom_allReduce__273_0_1": 2, + "hcom_allReduce__349_0_1": 6, + "hcom_allReduce__558_0_1": 14 + } + query = "select * from HcclTopOpStats" + data = pd.read_sql(query, self.conn) + for op_name, rank in check_data.items(): + self.assertEqual(data[data["OpName"] == op_name]["MinRank"].tolist()[0], rank) + self.assertEqual(data[data["OpName"] == op_name]["MaxRank"].tolist()[0], rank) diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_compute_op_sum.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_compute_op_sum.py new file mode 100644 index 0000000000000000000000000000000000000000..ec61c13159b3534e9b5a972fec9d1db85341767c --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_compute_op_sum.py @@ -0,0 +1,132 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +import os +from unittest import TestCase +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.compute_op_sum.compute_op_sum import ComputeOpSum +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestComputeOpSum(TestCase): + """ + Test recipe: compute_op_sum + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestComputeOpSum") + COMMAND_SUCCESS = 0 + + def setup_class(self): + self.db_path_base = os.path.join(self.CLUSTER_PATH, "cluster_analysis_output_base", + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn_base, self.cursor_base = DBManager.create_connect_db(self.db_path_base) + + def connect_db(self, cmd): + # Must be called manually + PathManager.make_dir_safety(self.OUTPUT_PATH) + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("ComputeOpSum task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + + def tearDown(self): + DBManager.destroy_db_connect(self.conn, self.cursor) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + def check_tables_in_db(self): + expected_tables = [ + ComputeOpSum.TABLE_ALL_RANK_STATS, + ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPNAME, + ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE, + ] + return DBManager.check_tables_in_db(self.db_path, *expected_tables) + + def check_tables_in_db_when_exclude_op_name(self): + expected_tables = [ + ComputeOpSum.TABLE_ALL_RANK_STATS, +
ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE, + ] + return DBManager.check_tables_in_db(self.db_path, *expected_tables) and not \ + DBManager.check_tables_in_db(self.db_path, ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPNAME) + + def check_compute_op_all_rank_stats_columns(self): + # Check the header of ComputeOpAllRankStats + expected_columns = ["OpType", "TaskType", "Count", "MeanNs", "StdNs", "MinNs", + "Q1Ns", "MedianNs", "Q3Ns", "MaxNs", "SumNs"] + return DBManager.get_table_columns_name(self.cursor, ComputeOpSum.TABLE_ALL_RANK_STATS) == expected_columns + + def check_compute_op_per_rank_stats_by_opname_columns(self): + # Check the header of ComputeOpPerRankStatsByOpName + expected_columns = ["OpName", "OpType", "TaskType", "InputShapes", "Count", "MeanNs", "StdNs", "MinNs", + "Q1Ns", "MedianNs", "Q3Ns", "MaxNs", "SumNs", "Rank"] + return DBManager.get_table_columns_name(self.cursor, + ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPNAME) == expected_columns + + def check_compute_op_per_rank_stats_by_optype_columns(self): + # Check the header of ComputeOpPerRankStatsByOpType + expected_columns = ["OpType", "TaskType", "Count", "MeanNs", "StdNs", "MinNs", + "Q1Ns", "MedianNs", "Q3Ns", "MaxNs", "SumNs", "Rank"] + return DBManager.get_table_columns_name(self.cursor, + ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE) == expected_columns + + def check_compute_op_all_rank_stats_data_when_given_cluster_data(self): + query = f"select * from {ComputeOpSum.TABLE_ALL_RANK_STATS}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def check_compute_op_per_rank_stats_by_opname_data_when_given_cluster_data(self): + query = f"select * from {ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPNAME}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def check_compute_op_per_rank_stats_by_optype_data_when_given_cluster_data(self): + query = f"select * from 
{ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE}" +
df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def test_compute_op_sum_should_run_success_when_given_cluster_data(self): + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "compute_op_sum", + "-o", self.OUTPUT_PATH, "--force"] + self.connect_db(cmd) + self.assertTrue(self.check_tables_in_db(), msg="DB does not exist or is missing tables.") + self.assertTrue(self.check_compute_op_all_rank_stats_columns(), + msg=f"The header of {ComputeOpSum.TABLE_ALL_RANK_STATS} does not meet expectations.") + self.assertTrue(self.check_compute_op_per_rank_stats_by_opname_columns(), + msg=f"The header of {ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPNAME} does not meet expectations.") + self.assertTrue(self.check_compute_op_per_rank_stats_by_optype_columns(), + msg=f"The header of {ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE} does not meet expectations.") + self.check_compute_op_all_rank_stats_data_when_given_cluster_data() + self.check_compute_op_per_rank_stats_by_opname_data_when_given_cluster_data() + self.check_compute_op_per_rank_stats_by_optype_data_when_given_cluster_data() + + def test_compute_op_sum_should_run_success_when_given_cluster_data_and_exclude_op_name(self): + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "compute_op_sum", + "-o", self.OUTPUT_PATH, "--exclude_op_name", "--force"] + self.connect_db(cmd) + self.assertTrue(self.check_tables_in_db_when_exclude_op_name(), msg="DB does not exist or is missing tables.") + self.assertTrue(self.check_compute_op_all_rank_stats_columns(), + msg=f"The header of {ComputeOpSum.TABLE_ALL_RANK_STATS} does not meet expectations.") + self.assertTrue(self.check_compute_op_per_rank_stats_by_optype_columns(), + msg=f"The header of {ComputeOpSum.TABLE_PER_RANK_STATS_BY_OPTYPE} does not meet expectations.") + self.check_compute_op_all_rank_stats_data_when_given_cluster_data() + 
self.check_compute_op_per_rank_stats_by_optype_data_when_given_cluster_data() \ No newline at end of file diff --git a/profiler/msprof_analyze/test/st/cluster_analyse/test_hccl_sum.py b/profiler/msprof_analyze/test/st/cluster_analyse/test_hccl_sum.py new file mode 100644 index 0000000000000000000000000000000000000000..7c11946f242fa42abe87f60d178f48969e6190bc --- /dev/null +++ b/profiler/msprof_analyze/test/st/cluster_analyse/test_hccl_sum.py @@ -0,0 +1,118 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +from unittest import TestCase +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.hccl_sum.hccl_sum import HcclSum +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.prof_common.db_manager import DBManager +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestHcclSum(TestCase): + """ + Test recipe: hccl_sum + """ + CLUSTER_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2_db") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "TestHcclSum") + COMMAND_SUCCESS = 0 + + def setup_class(self): + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "cluster", "-d", self.CLUSTER_PATH, "-m", "hccl_sum", + "--output_path", self.OUTPUT_PATH, "--force"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.fail("HcclSum task failed.") + self.db_path = os.path.join(self.OUTPUT_PATH, Constant.CLUSTER_ANALYSIS_OUTPUT, + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn, self.cursor = DBManager.create_connect_db(self.db_path) + self.db_path_base = os.path.join(self.CLUSTER_PATH, "cluster_analysis_output_base", + Constant.DB_CLUSTER_COMMUNICATION_ANALYZER) + self.conn_base, self.cursor_base = DBManager.create_connect_db(self.db_path_base) + + def teardown_class(self): + DBManager.destroy_db_connect(self.conn, self.cursor) + DBManager.destroy_db_connect(self.conn_base, self.cursor_base) + PathManager.remove_path_safety(self.OUTPUT_PATH) + + def check_tables_in_db(self): + expected_tables = [ + HcclSum.TABLE_ALL_RANK_STATS, + HcclSum.TABLE_PER_RANK_STATS, + HcclSum.TABLE_TOP_OP_STATS, + HcclSum.TABLE_GROUP_NAME_MAP + ] + return DBManager.check_tables_in_db(self.db_path, *expected_tables) + + def check_hccl_all_rank_stats_columns(self): + # Check the header of HcclAllRankStats + expected_columns = ["OpType", "Count",
"MeanNs", "StdNs", "MinNs", "Q1Ns", "MedianNs", "Q3Ns", + "MaxNs", "SumNs"] + return DBManager.get_table_columns_name(self.cursor, HcclSum.TABLE_ALL_RANK_STATS) == expected_columns + + def check_hccl_per_rank_stats_columns(self): + # Check the header of HcclPerRankStats + expected_columns = ["OpType", "Count", "MeanNs", "StdNs", "MinNs", "Q1Ns", "MedianNs", "Q3Ns", + "MaxNs", "SumNs", "Rank"] + return DBManager.get_table_columns_name(self.cursor, HcclSum.TABLE_PER_RANK_STATS) == expected_columns + + def check_hccl_top_op_stats_columns(self): + # Check the header of HcclTopOpStats + expected_columns = ["OpName", "Count", "MeanNs", "StdNs", "MinNs", "Q1Ns", "MedianNs", "Q3Ns", + "MaxNs", "SumNs", "MinRank", "MaxRank"] + return DBManager.get_table_columns_name(self.cursor, HcclSum.TABLE_TOP_OP_STATS) == expected_columns + + def check_hccl_group_name_map_columns(self): + # Check the header of HcclGroupNameMap + expected_columns = ["GroupName", "GroupId", "Ranks"] + return DBManager.get_table_columns_name(self.cursor, HcclSum.TABLE_GROUP_NAME_MAP) == expected_columns + + def test_hccl_sum_should_run_success_when_given_cluster_data(self): + self.assertTrue(self.check_tables_in_db(), msg="DB does not exist or is missing tables.") + self.assertTrue(self.check_hccl_all_rank_stats_columns(), + msg=f"The header of {HcclSum.TABLE_ALL_RANK_STATS} does not meet expectations.") + self.assertTrue(self.check_hccl_per_rank_stats_columns(), + msg=f"The header of {HcclSum.TABLE_PER_RANK_STATS} does not meet expectations.") + self.assertTrue(self.check_hccl_top_op_stats_columns(), + msg=f"The header of {HcclSum.TABLE_TOP_OP_STATS} does not meet expectations.") + self.assertTrue(self.check_hccl_group_name_map_columns(), + msg=f"The header of {HcclSum.TABLE_GROUP_NAME_MAP} does not meet expectations.") + + def test_hccl_all_rank_stats_data_when_given_cluster_data(self): + query = f"select * from {HcclSum.TABLE_ALL_RANK_STATS}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, 
self.conn_base) +
self.assertTrue(df.equals(df_base)) + + def test_hccl_per_rank_stats_data_when_given_cluster_data(self): + query = f"select * from {HcclSum.TABLE_PER_RANK_STATS}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def test_hccl_top_op_stats_data_when_given_cluster_data(self): + query = f"select * from {HcclSum.TABLE_TOP_OP_STATS}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) + + def test_hccl_group_name_map_data_when_given_cluster_data(self): + query = f"select * from {HcclSum.TABLE_GROUP_NAME_MAP}" + df = pd.read_sql(query, self.conn) + df_base = pd.read_sql(query, self.conn_base) + self.assertTrue(df.equals(df_base)) \ No newline at end of file diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_mindspore.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_mindspore.py new file mode 100644 index 0000000000000000000000000000000000000000..d40db720ec1434a32cff01469926c207ba800a6e --- /dev/null +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_mindspore.py @@ -0,0 +1,63 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import os +import unittest +from unittest import TestCase +import pandas as pd + +from msprof_analyze.prof_common.path_manager import PathManager +from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH + + +class TestCompareToolsCmdMindSpore(TestCase): + BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "ms_cluster_data_1", "ubuntu_3543034_20250228021645572_ascend_ms") + COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "ms_cluster_data_1", + "ubuntu_3543025_20250228021645573_ascend_ms") + OUTPUT_PATH = os.path.join(os.path.abspath(os.path.dirname(__file__)), "CompareToolsCmdMindSpore") + RESULT_EXCEL = "" + COMMAND_SUCCESS = 0 + RUN_TEST = os.path.exists(BASE_PROFILING_PATH) and os.path.exists(COMPARISON_PROFILING_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def setup_class(self): + PathManager.make_dir_safety(self.OUTPUT_PATH) + cmd = ["msprof-analyze", "compare", "-d", self.COMPARISON_PROFILING_PATH, "-bp", self.BASE_PROFILING_PATH, "-o", + self.OUTPUT_PATH, "--force"] + if execute_cmd(cmd) != self.COMMAND_SUCCESS or not os.path.exists(self.OUTPUT_PATH): + self.assertTrue(False, msg="enable api compare comparison task failed.") + if not check_result_file(self.OUTPUT_PATH): + self.assertTrue(False, msg="enable api compare comparison result excel is not find.") + self.result_excel = os.path.join(self.OUTPUT_PATH, check_result_file(self.OUTPUT_PATH)) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def teardown_class(self): + PathManager.remove_path_safety(self.OUTPUT_PATH) + + @unittest.skipIf(not RUN_TEST, "Skipping this test based on RUN_TEST environment variable") + def test_overall_metrics(self): + index_exp = [ + "Computing Time", "Other", "Uncovered Communication Time", "tp-0-1-2-3", "Transmit", + "hccl_world_group", "Transmit", "dp-cp-1", "Transmit", "pp-1-5", "Transmit", + 
"dp-cp-0", "Transmit", "pp-0-4", "Transmit", "Free Time", "Free", "E2E Time" + ] + diff_duration = [0.74, 0.74, -243.79, -243.59, -243.59, -0.81, -0.81, -5.37, -5.37, -712.80, -712.80, 6.71, + 6.71, 712.77, 712.77, 2.03, 2.03, -241.01] + df = pd.read_excel(self.result_excel, sheet_name="OverallMetrics", header=2) + for index, row in df.iterrows(): + self.assertEqual(index_exp[index], row["Index"].strip().split(":")[0], + msg="mindspore data compare results 'Index' column is wrong") + self.assertEqual(diff_duration[index], round(row["Diff Duration(ms)"], 2), + msg="mindspore data compare results 'Diff Duration(ms)' column is wrong") \ No newline at end of file diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_api_compare.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_api_compare.py index ad388ee6d4b1e0e0ec61166c5890f9c5125ae7df..eb493ad39a46c592dfc10b0be7160bc579c8dd54 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_api_compare.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_api_compare.py @@ -19,11 +19,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableApiCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") diff --git 
a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_communication_compare.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_communication_compare.py index 18d8a4b35f412bf2a5d81d16f846374295ae5c1e..9cf0701a9823663df492369fb0c2c64d93e68d69 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_communication_compare.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_communication_compare.py @@ -20,11 +20,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableCommunicationCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_kernel_compare.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_kernel_compare.py index 5082ee07d9df3f50110e8ed6063c60df0c6fb0ab..2e90ce68891b7a854ecd1be2b2f195a4e8dd3cad 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_kernel_compare.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_kernel_compare.py @@ -19,11 +19,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file 
+from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableKernelCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_3", "n122-122-067_12380_20240912033946038_ascend_pt") diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_memory_compare.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_memory_compare.py index 6535af01e52c1996ea0857767711854f731d11b5..d216afd9eed682131fced0a9356c9fee564da1f9 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_memory_compare.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_memory_compare.py @@ -19,11 +19,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableMemoryCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", "n122-197-168_1333345_20241105122131111_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_operator_compare.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_operator_compare.py index 
19211e9ae70bc3416aa6ced71d016374ca7a7775..203c575f90793dfd4731046a19a2241baec3f8d4 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_operator_compare.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_operator_compare.py @@ -20,11 +20,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableOperatorCompare(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", "n122-197-168_1333345_20241105122131111_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_profiling.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_profiling.py index 07fd66d517ab0784c289453f9e6aca44be27026b..6f49a112ae54df359fee6cd8d16f849738a36555 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_profiling.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_npu_enable_profiling.py @@ -20,11 +20,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpuEnableProfiling(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", 
"n122-197-168_1333345_20241105122131111_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu.py index 67be541ccc11d1e5780c4295fe7df15f22d47a13..7a6a2f423a17af3b83ebe2ca762f7a108c0ec6b6 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu.py @@ -19,11 +19,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpu(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2", "n122-120-121_12321_20240911113658382_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_2", "n122-120-121_12322_20240911113658370_ascend_pt") @@ -48,11 +47,11 @@ class TestCompareToolsCmdPytorchNpuVsNpu(TestCase): duration_exp = [ 14302.47, 1128.78, 1128.78, 10320.26, 10320.26, 2836.92, 445.59, 2391.33, 16.33, 0.18, 50636.60, 8412.29, 11.69, 8400.59, 117.70, 116.68, 1.03, 82.64, 0.16, 82.47, 99.71, 0.00, 99.71, 32759.01, 32759.01, - 17478.85, 17308.09, 170.76, 682.94, 682.94, 65622.00 + 17478.85, 17308.09, 170.76, 8313.60, 8313.60, 682.94, 682.94, 65622.00 ] diff_exp = [6.48, 4.84, 4.84, -9.23, -9.23, 10.77, -0.83, 11.60, 0.09, 0.01, 33.92, -331.06, 100.38, -431.44, 53.12, 53.13, -0.01, -1.53, 0.26, -1.79, -82.30, 0.19, -82.49, -35.74, -35.74, -0.36, -0.27, -0.09, - -48.46, -48.46, -8.05] + -431.79, -431.79, -48.46, -48.46, -8.05] df = 
pd.read_excel(self.RESULT_EXCEL, sheet_name="OverallMetrics", header=2) for index, row in df.iterrows(): self.assertEqual(duration_exp[index], round(row["Duration(ms)"], 2), diff --git a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu_step.py b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu_step.py index a6c9efb2ddc30fafba6433d05120ed80d51db110..5706eb7651bbeb948dadc4abf72cd0aa034f29cf 100644 --- a/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu_step.py +++ b/profiler/msprof_analyze/test/st/compare_tools/test_compare_tools_cmd_pytorch_npu_vs_npu_step.py @@ -20,11 +20,10 @@ import pandas as pd from msprof_analyze.prof_common.path_manager import PathManager from msprof_analyze.test.st.utils import execute_cmd, check_result_file +from msprof_analyze.test.st.utils import ST_DATA_PATH class TestCompareToolsCmdPytorchNpuVsNpu(TestCase): - ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", - "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") BASE_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", "n122-197-168_1333345_20241105122131111_ascend_pt") COMPARISON_PROFILING_PATH = os.path.join(ST_DATA_PATH, "cluster_data_4", diff --git a/profiler/msprof_analyze/test/st/utils.py b/profiler/msprof_analyze/test/st/utils.py index 434b32c43d7b3b7757f9478ce79aeea60b9c0a32..63b2ee1616190c0de056a5deb050413dded61274 100644 --- a/profiler/msprof_analyze/test/st/utils.py +++ b/profiler/msprof_analyze/test/st/utils.py @@ -19,6 +19,8 @@ import logging import sqlite3 COMMAND_SUCCESS = 0 +ST_DATA_PATH = os.getenv("MSTT_PROFILER_ST_DATA_PATH", + "/home/dcs-50/smoke_project_for_msprof_analyze/mstt_profiler/st_data") def execute_cmd(cmd): diff --git a/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_link_advice.py b/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_link_advice.py 
index 3a0b3de50fb1a9896d88a22915991274b6466992..165d514494cec9a7c91112bfb15b1cbb7c1d97ff 100644 --- a/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_link_advice.py +++ b/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_link_advice.py @@ -12,13 +12,13 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +import os import unittest from msprof_analyze.advisor.advisor_backend.cluster_advice.slow_link_advice import SlowLinkAdvice class TestSlowLinkAdvice(unittest.TestCase): - DATA = 'data' BOTTLENECK = 'bottleneck' ADVICE = 'advice' @@ -26,7 +26,8 @@ class TestSlowLinkAdvice(unittest.TestCase): @classmethod def setUpClass(cls): super().setUpClass() - cls.prof_dir = './resource/advisor' + cls.prof_dir = os.path.abspath( + os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../../../resource/advisor')) cls.expect_data = { 0: { 'RDMA time(ms)': 0, @@ -46,10 +47,10 @@ class TestSlowLinkAdvice(unittest.TestCase): } } cls.expect_bottleneck = 'SDMA bandwidth(GB/s): \n' \ - 'The average is 2.355, ' \ - 'while the maximum is 2.359GB/s and ' \ - 'the minimum is 2.352GB/s. ' \ - 'the difference is 0.007GB/s. \n' + 'The average is 2.355, ' \ + 'while the maximum is 2.359GB/s and ' \ + 'the minimum is 2.352GB/s. ' \ + 'the difference is 0.007GB/s. 
\n' def test_compute_ratio_abnormal(self): result = SlowLinkAdvice.compute_ratio(19.0, 0) diff --git a/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_rank_advice.py b/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_rank_advice.py index 8c196faba0708ad8be2f23445c645d268b16642e..5edba856e7fc1b58843667a6dd34214dcd0bde66 100644 --- a/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_rank_advice.py +++ b/profiler/msprof_analyze/test/ut/advisor/advisor_backend/cluster_advice/test_slow_rank_advice.py @@ -12,6 +12,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +import os import unittest from msprof_analyze.advisor.advisor_backend.cluster_advice.slow_rank_advice import SlowRankAdvice @@ -26,7 +27,8 @@ class TestSlowRankAdvice(unittest.TestCase): @classmethod def setUpClass(cls): super().setUpClass() - cls.prof_dir = './resource/advisor' + cls.prof_dir = os.path.abspath( + os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../../../resource/advisor')) cls.expect_data = { 0: [80309.68717187493, 683731.2897031249, 990605.1042031233, 0], 1: [80435.74650000008, 133385.97745312497, 1488610.0587500026, 0], diff --git a/profiler/msprof_analyze/test/ut/advisor/communication_advice/test_byte_alignment_analyzer.py b/profiler/msprof_analyze/test/ut/advisor/communication_advice/test_byte_alignment_analyzer.py index 9b4f02b41701d69a6cb539427df79f66bf24d143..ea8e12b8199be306eb1fc46566ee5c1f2de24b49 100644 --- a/profiler/msprof_analyze/test/ut/advisor/communication_advice/test_byte_alignment_analyzer.py +++ b/profiler/msprof_analyze/test/ut/advisor/communication_advice/test_byte_alignment_analyzer.py @@ -23,8 +23,8 @@ from msprof_analyze.advisor.common.analyzer_scopes import SupportedScopes class TestByteAlignmentAnalyzer(unittest.TestCase): - 
TMP_DIR = "./ascend_pt" - OUTPUT_DIR = "./ascend_pt/ASCEND_PROFILER_OUTPUT" + TMP_DIR = "./TestByteAlignmentAnalyzer/ascend_pt" + OUTPUT_DIR = "./TestByteAlignmentAnalyzer/ascend_pt/ASCEND_PROFILER_OUTPUT" interface = None err_interface = None diff --git a/profiler/msprof_analyze/test/ut/advisor/compute_advice/data/kernel_details.csv b/profiler/msprof_analyze/test/ut/advisor/compute_advice/data/kernel_details.csv new file mode 100644 index 0000000000000000000000000000000000000000..8a255e939ae2ff4e781c7a356b342815838e2ff3 --- /dev/null +++ b/profiler/msprof_analyze/test/ut/advisor/compute_advice/data/kernel_details.csv @@ -0,0 +1,30 @@ +Step Id,Model ID,Task ID,Stream ID,Name,Type,OP State,Accelerator Core,Start Time(us),Duration(us),Wait Time(us),Block Dim,Mix Block Dim,HF32 Eligible,Input Shapes,Input Data Types,Input Formats,Output Shapes,Output Data Types,Output Formats,Context ID,aicore_time(us),aic_total_cycles,aic_mac_time(us),aic_mac_ratio,aic_scalar_time(us),aic_scalar_ratio,aic_mte1_time(us),aic_mte1_ratio,aic_mte2_time(us),aic_mte2_ratio,aic_fixpipe_time(us),aic_fixpipe_ratio,aic_icache_miss_rate,aiv_time(us),aiv_total_cycles,aiv_vec_time(us),aiv_vec_ratio,aiv_scalar_time(us),aiv_scalar_ratio,aiv_mte2_time(us),aiv_mte2_ratio,aiv_mte3_time(us),aiv_mte3_ratio,aiv_icache_miss_rate,cube_utilization(%) +19,4294967295,61653,2,aclnnMatmul_MatMulCommon_MatMulV2,MatMulV2,dynamic,AI_CORE,"1736413971558972.912 ",185.504,1.087,16,0,NO,"""81920,4096;8192,512""",DT_BF16;DT_BF16,ND;ND,"""4096,512""",DT_BF16,ND,N/A,183.87,5295467,151.425,0.824,88.03,0.479,119.148,0.648,177.314,0.964,5.736,0.031,0.001,0,0,0,0,0,0,0,0,0,0,0,79.295 +19,4294967295,61669,2,aclnnMatmul_MatMulV3Common_MatMulV3,MatMulV3,dynamic,AI_CORE,"1736413971560588.764 ",501.17,2.2,20,0,NO,"""81920,1536;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,478.701,17233251,356.349,0.744,118.087,0.247,296.009,0.618,452.112,0.944,35.833,0.075,0.001,0,0,0,0,0,0,0,0,0,0,0,95.517 
+19,4294967295,61694,2,aclnnMatmul_MatMulCommon_MatMulV2,MatMulV2,dynamic,AI_CORE,"1736413971565213.257 ",186.823,1.178,16,0,NO,"""81920,4096;8192,512""",DT_BF16;DT_BF16,ND;ND,"""4096,512""",DT_BF16,ND,N/A,183.728,5291376,151.502,0.825,87.902,0.478,118.519,0.645,177.654,0.967,5.773,0.031,0.001,0,0,0,0,0,0,0,0,0,0,0,78.675 +19,4294967295,61710,2,aclnnMatmul_MatMulV3Common_MatMulV3,MatMulV3,dynamic,AI_CORE,"1736413971566843.489 ",516.991,2.33,20,0,NO,"""81920,1536;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,491.775,17703905,356.249,0.724,118.59,0.241,295.046,0.6,463.696,0.943,37.671,0.077,0.001,0,0,0,0,0,0,0,0,0,0,0,95.123 +19,4294967295,61735,2,aclnnMatmul_MatMulCommon_MatMulV2,MatMulV2,dynamic,AI_CORE,"1736413971571596.404 ",187.724,0.766,16,0,NO,"""81920,4096;8192,512""",DT_BF16;DT_BF16,ND;ND,"""4096,512""",DT_BF16,ND,N/A,184.904,5325221,151.489,0.819,87.893,0.475,118.63,0.642,178.815,0.967,5.77,0.031,0.001,0,0,0,0,0,0,0,0,0,0,0,78.798 +19,4294967295,61751,2,aclnnMatmul_MatMulV3Common_MatMulV3,MatMulV3,dynamic,AI_CORE,"1736413971573223.437 ",514.87,2.15,20,0,NO,"""81920,1536;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,486.931,17529512,356.117,0.731,118.847,0.244,295.529,0.607,457.002,0.939,37.938,0.078,0.001,0,0,0,0,0,0,0,0,0,0,0,94.574 +19,4294967295,61776,2,aclnnMatmul_MatMulCommon_MatMulV2,MatMulV2,dynamic,AI_CORE,"1736413971577931.851 ",190.544,1.367,16,0,NO,"""81920,4096;8192,512""",DT_BF16;DT_BF16,ND;ND,"""4096,512""",DT_BF16,ND,N/A,187.073,5387702,151.741,0.811,87.935,0.47,117.467,0.628,181.043,0.968,5.803,0.031,0.001,0,0,0,0,0,0,0,0,0,0,0,78.543 +19,4294967295,61792,2,aclnnMatmul_MatMulV3Common_MatMulV3,MatMulV3,dynamic,AI_CORE,"1736413971579566.403 ",504.071,2.28,20,0,NO,"""81920,1536;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,485.542,17479517,356.283,0.734,117.755,0.243,296.421,0.61,455.064,0.937,37.75,0.078,0.001,0,0,0,0,0,0,0,0,0,0,0,96.324 
+19,4294967295,13792,2,aclnnMatmul_MatMulV3Common_MatMulV5,MatMulV3,dynamic,AI_CORE,"1736413974248200.543 ",521.31,2.22,20,0,NO,"""8192,15365;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,499.234,17972434,356.364,0.714,117.639,0.236,295.58,0.592,471.784,0.945,35.825,0.072,0.001,0,0,0,0,0,0,0,0,0,0,0,95.765 +19,4294967295,13792,2,aclnnMatmul_MatMulV3Common_MatMulV5,MatMulV3,dynamic,AI_CORE,"1736413974248200.543 ",521.31,2.22,20,0,NO,"""8192,15365;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,499.234,17972434,356.364,0.714,117.639,0.236,295.58,0.592,471.784,0.945,35.825,0.072,0.001,0,0,0,0,0,0,0,0,0,0,0,95.765 +19,4294967295,13792,2,aclnnMatmul_MatMulV3Common_MatMulV5,MatMulV3,dynamic,AI_CORE,"1736413974248200.543 ",521.31,2.22,20,0,NO,"""8192,15365;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,499.234,17972434,356.364,0.714,117.639,0.236,295.58,0.592,471.784,0.945,35.825,0.072,0.001,0,0,0,0,0,0,0,0,0,0,0,95.765 +19,4294967295,13792,2,aclnnMatmul_MatMulV3Common_MatMulV5,MatMulV3,dynamic,AI_CORE,"1736413974248200.543 ",521.31,2.22,20,0,NO,"""8192,15365;8192,4096""",DT_BF16;DT_BF16,ND;ND,"""1536,4096""",DT_BF16,ND,N/A,499.234,17972434,356.364,0.714,117.639,0.236,295.58,0.592,471.784,0.945,35.825,0.072,0.001,0,0,0,0,0,0,0,0,0,0,0,95.765 +19,4294967295,60679,2,aclnnFlashAttentionScore_FlashAttentionScore_FlashAttentionScore,FlashAttentionScore,dynamic,MIX_AIC,"1736413971411629.128 ",410.188,1.53,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;;;;4096,4096;;;;;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;UINT8;DT_BF16;BOOL;INT64;INT64;INT64;INT64;INT64,NCL;NCL;NCL;ND;ND;ND;ND;ND;ND;ND;ND;ND,"""2,4,4096,8;2,4,4096,8;;4096,2,512""",FLOAT;FLOAT;DT_BF16;DT_BF16,ND;ND;ND;ND,0,366.147,13181275,129.055,0.352,352.275,0.962,108.364,0.296,172.86,0.872,216.141,0.59,0.003,365.782,26336326,228.687,0.625,137.979,0.377,118.603,0.324,71.448,0.195,0.013,89.263 
+19,4294967295,60707,2,aclnnFlashAttentionScore_FlashAttentionScore_FlashAttentionScore,FlashAttentionScore,dynamic,MIX_AIC,"1736413971415611.468 ",406.128,1.279,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;;;;4096,4096;;;;;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;UINT8;DT_BF16;BOOL;INT64;INT64;INT64;INT64;INT64,NCL;NCL;NCL;ND;ND;ND;ND;ND;ND;ND;ND;ND,"""2,4,4096,8;2,4,4096,8;;4096,2,512""",FLOAT;FLOAT;DT_BF16;DT_BF16,ND;ND;ND;ND,0,358.77,12915719,128.96,0.359,345.096,0.962,108.337,0.302,168.284,0.869,209.057,0.583,0.003,358.308,25798146,228.693,0.638,137.809,0.385,108.679,0.303,70.099,0.196,0.013,88.339 +19,4294967295,60735,2,aclnnFlashAttentionScore_FlashAttentionScore_FlashAttentionScore,FlashAttentionScore,dynamic,MIX_AIC,"1736413971420248.800 ",407.008,0.84,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;;;;4096,4096;;;;;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;UINT8;DT_BF16;BOOL;INT64;INT64;INT64;INT64;INT64,NCL;NCL;NCL;ND;ND;ND;ND;ND;ND;ND;ND;ND,"""2,4,4096,8;2,4,4096,8;;4096,2,512""",FLOAT;FLOAT;DT_BF16;DT_BF16,ND;ND;ND;ND,0,359.702,12949284,128.975,0.359,346.306,0.963,108.43,0.301,166.899,0.864,209.018,0.581,0.003,359.274,25867705,228.693,0.637,138.438,0.385,107.723,0.3,70.146,0.195,0.013,88.377 +19,4294967295,60763,2,aclnnFlashAttentionScore_FlashAttentionScore_FlashAttentionScore,FlashAttentionScore,dynamic,MIX_AIC,"1736413971424592.447 ",405.228,1.35,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;;;;4096,4096;;;;;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;UINT8;DT_BF16;BOOL;INT64;INT64;INT64;INT64;INT64,NCL;NCL;NCL;ND;ND;ND;ND;ND;ND;ND;ND;ND,"""2,4,4096,8;2,4,4096,8;;4096,2,512""",FLOAT;FLOAT;DT_BF16;DT_BF16,ND;ND;ND;ND,0,359.793,12952532,128.923,0.358,345.768,0.961,108.411,0.301,167.379,0.865,208.79,0.58,0.003,359.294,25869164,228.691,0.637,138.411,0.385,107.868,0.3,70.163,0.195,0.013,88.788 +19,4294967295,61655,2,aclnnFlashAttentionScoreGrad_FlashAttentionScoreGrad_FlashAttentionScoreGrad,FlashAttentionScoreGrad,dynamic,MIX_AIC,"1736413971559180.676 
",762.215,1.37,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;4096,2,512;4096,4096;2,4,4096,8;2,4,4096,8;;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;BOOL;FLOAT;FLOAT;DT_BF16;DT_BF16;INT64,NCL;NCL;NCL;NCL;ND;NCHW;NCHW;ND;NCL;ND,"""4096,2,512;4096,2,512;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16,ND;ND;ND;ND,0,755.664,27203907,344.023,0.455,592.472,0.784,266.388,0.353,397.091,0.525,589.726,0.525,0.004,755.04,54362915,318.452,0.422,184.623,0.245,206.78,0.274,152.973,0.203,0.006,99.141 +19,4294967295,61696,2,aclnnFlashAttentionScoreGrad_FlashAttentionScoreGrad_FlashAttentionScoreGrad,FlashAttentionScoreGrad,dynamic,MIX_AIC,"1736413971565420.821 ",763.215,1.189,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;4096,2,512;4096,4096;2,4,4096,8;2,4,4096,8;;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;BOOL;FLOAT;FLOAT;DT_BF16;DT_BF16;INT64,NCL;NCL;NCL;NCL;ND;NCHW;NCHW;ND;NCL;ND,"""4096,2,512;4096,2,512;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16,ND;ND;ND;ND,0,757.83,27281885,344.047,0.454,595.954,0.786,266.123,0.351,389.105,0.513,576.226,0.513,0.004,757.046,54507345,318.443,0.421,188.292,0.249,200.176,0.264,162.113,0.214,0.006,99.294 +19,4294967295,61737,2,aclnnFlashAttentionScoreGrad_FlashAttentionScoreGrad_FlashAttentionScoreGrad,FlashAttentionScoreGrad,dynamic,MIX_AIC,"1736413971571804.228 ",757.095,0.88,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;4096,2,512;4096,4096;2,4,4096,8;2,4,4096,8;;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;BOOL;FLOAT;FLOAT;DT_BF16;DT_BF16;INT64,NCL;NCL;NCL;NCL;ND;NCHW;NCHW;ND;NCL;ND,"""4096,2,512;4096,2,512;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16,ND;ND;ND;ND,0,750.605,27021778,343.983,0.458,586.708,0.782,266.304,0.355,392.522,0.523,584.432,0.523,0.004,749.913,53993736,318.436,0.425,188.508,0.251,207.668,0.277,152.634,0.204,0.006,99.143 +19,4294967295,61778,2,aclnnFlashAttentionScoreGrad_FlashAttentionScoreGrad_FlashAttentionScoreGrad,FlashAttentionScoreGrad,dynamic,MIX_AIC,"1736413971578144.095 
",755.915,1.22,20,40,NO,"""4096,2,512;4096,2,512;4096,2,512;4096,2,512;4096,4096;2,4,4096,8;2,4,4096,8;;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;BOOL;FLOAT;FLOAT;DT_BF16;DT_BF16;INT64,NCL;NCL;NCL;NCL;ND;NCHW;NCHW;ND;NCL;ND,"""4096,2,512;4096,2,512;4096,2,512;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16,ND;ND;ND;ND,0,750.152,27005467,344.115,0.459,579.317,0.772,266.08,0.355,398.019,0.531,587.37,0.531,0.004,749.348,53953058,318.444,0.425,186.908,0.249,207.068,0.276,151.329,0.202,0.006,99.238 +19,4294967295,60763,2,aclnnFlashAttentionScore_FlashAttentionScore_FlashAttentionScore_varlen,FlashAttentionScore,dynamic,MIX_AIC,"1736413971424592.447 ",405.228,1.35,20,40,NO,"""4096,2,511;4096,2,512;4096,2,512;;;;4096,4096;;;;;""",DT_BF16;DT_BF16;DT_BF16;DT_BF16;UINT8;DT_BF16;BOOL;INT64;INT64;INT64;INT64;INT64,NCL;NCL;NCL;ND;ND;ND;ND;ND;ND;ND;ND;ND,"""2,3,4096,8;2,4,4096,8;;4096,2,512""",FLOAT;FLOAT;DT_BF16;DT_BF16,ND;ND;ND;ND,0,359.793,12952532,128.923,0.358,345.768,0.961,108.411,0.301,167.379,0.465,208.79,0.58,0.003,359.294,25869164,228.691,0.637,138.411,0.385,107.868,0.3,70.163,0.195,0.013,88.788 +19,4294967295,60683,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413971412768.871 ",26.78,0.485,40,0,NO,"""512,2,4096;512,2,4096""",DT_BF16;DT_BF16,NCL;NCL,"""512,2,4096""",DT_BF16,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,24.19,1741674,5.986,0.247,1.352,0.056,20.363,0.842,3.195,0.132,0.027,0 +19,4294967295,60690,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413971414677.549 ",31.201,0.664,40,0,NO,"""512,2,4096;512,2,4096""",DT_BF16;DT_BF16,NCL;NCL,"""512,2,4096""",DT_BF16,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,28.617,2060443,5.986,0.209,1.444,0.05,25.005,0.874,3.336,0.117,0.026,0 +19,4294967295,60711,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413971416743.250 ",27.021,1.246,40,0,NO,"""512,2,4096;512,2,4096""",DT_BF16;DT_BF16,NCL;NCL,"""512,2,4096""",DT_BF16,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,24.304,1749862,5.986,0.246,1.258,0.052,20.424,0.84,3.23,0.133,0.027,0 
+19,4294967295,60718,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413971419318.962 ",25.08,0.984,40,0,NO,"""512,2,4096;512,2,4096""",DT_BF16;DT_BF16,NCL;NCL,"""512,2,4096""",DT_BF16,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,22.47,1617840,5.989,0.267,2.009,0.089,18.809,0.837,3.191,0.142,0.024,0 +19,4294967295,13907,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413974268377.206 ",1.38,31.48,1,0,NO,""";""",FLOAT;FLOAT,ND;ND,"""""",FLOAT,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,0.883,1589,0.027,0.03,0.265,0.3,0.18,0.204,0.108,0.123,0.182,0 +19,4294967295,13910,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413974268502.128 ",1.46,17.48,1,0,NO,""";""",FLOAT;FLOAT,ND;ND,"""""",FLOAT,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,0.948,1706,0.027,0.028,0.276,0.291,0.217,0.229,0.127,0.134,0.174,0 +19,4294967295,13913,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413974268605.410 ",1.5,0.09,1,0,NO,""";""",FLOAT;FLOAT,ND;ND,"""""",FLOAT,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,0.96,1728,0.027,0.028,0.268,0.28,0.221,0.23,0.132,0.137,0.145,0 +19,4294967295,13916,2,aclnnAdd_AddAiCore_Add,Add,dynamic,AI_VECTOR_CORE,"1736413974268747.953 ",1.58,28.28,1,0,NO,""";""",FLOAT;FLOAT,ND;ND,"""""",FLOAT,ND,N/A,0,0,0,0,0,0,0,0,0,0,0,0,0,1.107,1993,0.027,0.024,0.426,0.384,0.201,0.181,0.118,0.106,0.162,0 \ No newline at end of file diff --git a/profiler/msprof_analyze/test/ut/advisor/compute_advice/test_ai_core_performance_advice.py b/profiler/msprof_analyze/test/ut/advisor/compute_advice/test_ai_core_performance_advice.py new file mode 100644 index 0000000000000000000000000000000000000000..4a6e3ac292156bac316edbaf2059e7b39f8de11c --- /dev/null +++ b/profiler/msprof_analyze/test/ut/advisor/compute_advice/test_ai_core_performance_advice.py @@ -0,0 +1,85 @@ +# Copyright (c) Huawei Technologies Co., Ltd. 2025. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +import os +import shutil + +import unittest +from msprof_analyze.advisor.interface.interface import Interface +from msprof_analyze.advisor.common.analyzer_scopes import SupportedScopes + + +class TestAICorePerformanceAdvice(unittest.TestCase): + TMP_DIR = "./TestAICorePerformanceAdvice/ascend_pt" + OUTPUT_DIR = "./TestAICorePerformanceAdvice/ascend_pt/ASCEND_PROFILER_OUTPUT" + interface = None + err_interface = None + + @classmethod + def clear_htmls(cls): + current_path = os.path.dirname(os.path.abspath(__file__)) + for filename in os.listdir(current_path): + # Check whether the filename starts with "mstt" + if filename.startswith("mstt"): + # Build the full path to the file + file_path = os.path.join(current_path, filename) + # Delete the file + os.remove(file_path) + + @classmethod + def copy_kernel_details(cls, path): + # Define source and destination paths + source_csv_path = os.path.join(os.path.dirname(__file__), 'data', path) + destination_csv_path = f"{TestAICorePerformanceAdvice.OUTPUT_DIR}/kernel_details.csv" + + # Check if source CSV file exists + if not os.path.exists(source_csv_path): + raise FileNotFoundError(f"test data file not found:{source_csv_path}") + + # Ensure the output directory exists + if not os.path.exists(TestAICorePerformanceAdvice.OUTPUT_DIR): + os.makedirs(TestAICorePerformanceAdvice.OUTPUT_DIR) + + # Copy the CSV file from source to destination + shutil.copyfile(source_csv_path, destination_csv_path) + + def tearDown(self): + if os.path.exists(TestAICorePerformanceAdvice.TMP_DIR): + shutil.rmtree(TestAICorePerformanceAdvice.TMP_DIR) + self.clear_htmls() + + def setUp(self): + if
os.path.exists(TestAICorePerformanceAdvice.TMP_DIR): + shutil.rmtree(TestAICorePerformanceAdvice.TMP_DIR) + if not os.path.exists(TestAICorePerformanceAdvice.TMP_DIR): + os.makedirs(TestAICorePerformanceAdvice.TMP_DIR) + if not os.path.exists(TestAICorePerformanceAdvice.OUTPUT_DIR): + os.makedirs(TestAICorePerformanceAdvice.OUTPUT_DIR) + self.clear_htmls() + + def test_ai_core_performance_total(self): + file_path = "kernel_details.csv" + self.copy_kernel_details(file_path) + interface = Interface(profiling_path=self.TMP_DIR) + dimension = Interface.COMPUTATION + scope = SupportedScopes.AICORE_PERFORMANCE_ANALYSIS + result = interface.get_result(dimension, scope, render_html=1, output_dict=False, profiling_path=self.TMP_DIR) + self.assertLess(1, len(result.data.get("Cube算子性能分析").get("data")[0])) + self.assertLess(1, len(result.data.get("Cube算子性能分析").get("data")[1])) + self.assertLess(1, len(result.data.get("Cube算子性能分析").get("data")[2])) + self.assertLess(1, len(result.data.get("FA算子性能分析").get("data")[0])) + self.assertLess(1, len(result.data.get("FA算子性能分析").get("data")[1])) + self.assertLess(1, len(result.data.get("FA算子性能分析").get("data")[2])) + self.assertLess(1, len(result.data.get("Vector算子性能分析").get("data")[0])) + self.assertLess(1, len(result.data.get("Vector算子性能分析").get("data")[1])) + result.clear() \ No newline at end of file diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/cluster_data_preprocess/test_step_trace_time_analysis.py b/profiler/msprof_analyze/test/ut/cluster_analyse/cluster_data_preprocess/test_step_trace_time_analysis.py index bd8e2da21f7421e3686293367b02428197386f6e..067886ec20125de045e18d183d0cf31d69b3eb23 100644 --- a/profiler/msprof_analyze/test/ut/cluster_analyse/cluster_data_preprocess/test_step_trace_time_analysis.py +++ b/profiler/msprof_analyze/test/ut/cluster_analyse/cluster_data_preprocess/test_step_trace_time_analysis.py @@ -54,7 +54,7 @@ class TestStepTraceTimeAnalysis(unittest.TestCase): StepTraceTimeBean({"Step": 1, 
"time1": 10, "time2": 20}) ] } - check.communication_group = {Constant.P2P: [[0, 1]]} + check.communication_data_dict = {Constant.STAGE: [[0, 1]]} check.analyze_step_time() self.assertIn([0, 'stage', (0, 1), 10.0, 20.0], check.step_data_list) @@ -75,7 +75,7 @@ class TestStepTraceTimeAnalysis(unittest.TestCase): StepTraceTimeBean({"Step": None, "time1": 1, "time2": 1}), ], } - check.communication_group = {Constant.P2P: [[0, 1], [2, 3]]} + check.communication_data_dict = {Constant.STAGE: [[0, 1], [2, 3]]} check.analyze_step_time() self.assertIn([None, 'stage', (2, 3), 2.0, 3.0], check.step_data_list) self.assertIn([None, 'rank', 0, 1.0, 2.0], check.step_data_list) \ No newline at end of file diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/common_func/test_path_manager.py b/profiler/msprof_analyze/test/ut/cluster_analyse/common_func/test_path_manager.py index 1ffb456fde0f6bb6335a27cd559008a40a132f6b..ece8c2a0dcf3962041b88a16d88e6de47c42ed23 100644 --- a/profiler/msprof_analyze/test/ut/cluster_analyse/common_func/test_path_manager.py +++ b/profiler/msprof_analyze/test/ut/cluster_analyse/common_func/test_path_manager.py @@ -20,9 +20,9 @@ import pytest from msprof_analyze.prof_common.path_manager import PathManager -PATH_DIR = "resource" -PATH_FILE = "resource/test.csv" -PATH_TEMP = "temp" +PATH_DIR = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../../resource')) +PATH_FILE = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../../resource/test.csv')) +PATH_TEMP = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), '../../../temp')) class TestPathManager(unittest.TestCase): diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/communication_group/test_communication_group_generator.py b/profiler/msprof_analyze/test/ut/cluster_analyse/communication_group/test_communication_group_generator.py deleted file mode 100644 index 
517327b81117f100a1bf0f71edbe9ebd45ef605e..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/test/ut/cluster_analyse/communication_group/test_communication_group_generator.py +++ /dev/null @@ -1,113 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import unittest -from unittest import mock - -from msprof_analyze.cluster_analyse.communication_group.communication_group_generator import CommunicationGroupGenerator -from msprof_analyze.prof_common.constant import Constant - - -class TestCommunicationGroupGenerator(unittest.TestCase): - DIR_PATH = '' - PARAMS = { - Constant.DATA_SIMPLIFICATION: "ORIGINAL", - Constant.DATA_TYPE: Constant.TEXT - } - - def test_generate_p2p_communication_when_given_group_1p_return_1p2p(self): - check = CommunicationGroupGenerator(self.PARAMS).processor - check.collective_group_dict = { - 'group1': {0} - } - with mock.patch("msprof_analyze.prof_common.file_manager.FileManager.read_json_file", - return_value=True): - check.generate_p2p_communication_group() - ret = {0} - self.assertEqual(ret, set(check.communication_group[Constant.P2P][0])) - - def test_generate_p2p_communication_when_given_group_8p_return_correct_value(self): - check = CommunicationGroupGenerator(self.PARAMS).processor - check.collective_group_dict = { - 'group1': {1, 2, 3, 4}, - 'group2': {5, 6, 7, 8}, - } - with 
mock.patch("msprof_analyze.prof_common.file_manager.FileManager.read_json_file", - return_value=True): - check.generate_p2p_communication_group() - ret_a = {1, 2, 3, 4} - ret_b = {5, 6, 7, 8} - self.assertEqual(ret_a, set(check.communication_group[Constant.P2P][0])) - self.assertEqual(ret_b, set(check.communication_group[Constant.P2P][1])) - - def test_generate_p2p_communication_when_given_group_16p_expect_4_group(self): - check = CommunicationGroupGenerator(self.PARAMS).processor - check.collective_group_dict = { - 'group1': {0, 1}, - 'group2': {0, 2}, - 'group3': {2, 3}, - 'group4': {3, 1}, - 'group5': {4, 5}, - 'group6': {4, 6}, - 'group7': {5, 7}, - 'group8': {6, 7}, - 'group9': {8, 9}, - 'group10': {8, 10}, - 'group11': {11, 10}, - 'group12': {11, 9}, - 'group13': {12, 13}, - 'group14': {12, 14}, - 'group15': {15, 13}, - 'group16': {15, 14} - } - with mock.patch("msprof_analyze.prof_common.file_manager.FileManager.read_json_file", - return_value=True): - check.generate_p2p_communication_group() - ret_a = {0, 1, 2, 3} - ret_b = {4, 5, 6, 7} - ret_c = {8, 9, 10, 11} - ret_d = {12, 13, 14, 15} - self.assertEqual(ret_a, set(check.communication_group[Constant.P2P][0])) - self.assertEqual(ret_b, set(check.communication_group[Constant.P2P][1])) - self.assertEqual(ret_c, set(check.communication_group[Constant.P2P][2])) - self.assertEqual(ret_d, set(check.communication_group[Constant.P2P][3])) - - def test_generate_p2p_communication_group_when_given_repeat_group_expect_2_group(self): - check = CommunicationGroupGenerator(self.PARAMS).processor - check.collective_group_dict = { - 'group1': {0, 1, 2, 3}, - 'group2': {0, 1, 2, 3}, - 'group3': {0, 1, 2, 3}, - 'group4': {0, 1, 2, 3}, - 'group5': {3, 2, 4, 5}, - 'group6': {4, 5, 6, 7}, - 'group7': {4, 5, 6, 7}, - 'group8': {4, 5, 6, 7}, - 'group9': {8, 9, 11, 10}, - 'group10': {8, 9, 11, 10}, - 'group11': {11, 10, 12, 13}, - 'group12': {11, 10, 12, 13}, - 'group13': {11, 10, 12, 13}, - 'group14': {12, 13, 14, 15}, - 
'group15': {12, 13, 14, 15}, - 'group16': {12, 13, 14, 15} - } - with mock.patch("msprof_analyze.prof_common.file_manager.FileManager.read_json_file", - return_value=True): - check.generate_p2p_communication_group() - ret_a = {0, 1, 2, 3, 4, 5, 6, 7} - ret_b = {8, 9, 10, 11, 12, 13, 14, 15} - self.assertEqual(ret_a, set(check.communication_group[Constant.P2P][0])) - self.assertEqual(ret_b, set(check.communication_group[Constant.P2P][1])) diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_cluster_time_compare_summary.py b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_cluster_time_compare_summary.py deleted file mode 100644 index 9cc3dd8180851afb00c2c8fb91e2a37ffb7a0973..0000000000000000000000000000000000000000 --- a/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_cluster_time_compare_summary.py +++ /dev/null @@ -1,136 +0,0 @@ -# Copyright (c) 2025, Huawei Technologies Co., Ltd. -# All rights reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- - -import unittest -from unittest import mock -import pandas as pd - -from msprof_analyze.cluster_analyse.recipes.cluster_time_compare_summary.cluster_time_compare_summary import \ - ClusterTimeCompareSummary -from msprof_analyze.prof_common.constant import Constant - -NAMESPACE = "msprof_analyze.prof_common" - - -class TestClusterTimeCompareSummary(unittest.TestCase): - PARAMS = { - Constant.COLLECTION_PATH: "/data", - Constant.DATA_MAP: {}, - Constant.DATA_TYPE: Constant.DB, - Constant.CLUSTER_ANALYSIS_OUTPUT_PATH: "./test_cluster_time_compare_summary", - Constant.RECIPE_NAME: "ClusterTimeCompareSummary", - Constant.RECIPE_CLASS: ClusterTimeCompareSummary, - Constant.PARALLEL_MODE: Constant.CONCURRENT_MODE, - Constant.EXPORT_TYPE: Constant.DB, - ClusterTimeCompareSummary.RANK_LIST: Constant.ALL, - } - - def test_check_params_is_valid_should_return_false_when_bp_param_does_not_exist(self): - params = {} - params.update(self.PARAMS) - self.assertFalse(ClusterTimeCompareSummary(params).check_params_is_valid()) - - def test_check_params_is_valid_should_return_false_when_export_type_is_notebook(self): - params = {Constant.EXTRA_ARGS: ["--bp", "/data2"]} - params.update(self.PARAMS) - params[Constant.EXPORT_TYPE] = Constant.NOTEBOOK - self.assertFalse(ClusterTimeCompareSummary(params).check_params_is_valid()) - - def test_check_params_is_valid_should_return_false_when_base_path_is_invalid(self): - params = {Constant.EXTRA_ARGS: ["--bp", "/data2"]} - params.update(self.PARAMS) - with mock.patch(NAMESPACE + ".path_manager.PathManager.check_input_file_path", side_effect=RuntimeError): - self.assertFalse(ClusterTimeCompareSummary(params).check_params_is_valid()) - - def test_check_params_is_valid_should_return_false_when_table_cluster_time_summary_does_not_exist(self): - params = {} - params.update(self.PARAMS) - with mock.patch(NAMESPACE + ".db_manager.DBManager.check_tables_in_db", return_value=False): - 
self.assertFalse(ClusterTimeCompareSummary(params).check_params_is_valid()) - - def test_check_params_is_valid_should_return_false_when_base_table_cluster_time_summary_does_not_exist(self): - params = {Constant.EXTRA_ARGS: ["--bp", "/data2"]} - params.update(self.PARAMS) - with mock.patch(NAMESPACE + ".path_manager.PathManager.check_input_file_path"), \ - mock.patch(NAMESPACE + ".db_manager.DBManager.check_tables_in_db", side_effect=[True, False]): - self.assertFalse(ClusterTimeCompareSummary(params).check_params_is_valid()) - - def test_run_when_all_parameters_are_normal(self): - params = {Constant.EXTRA_ARGS: ["--bp", "/data2"]} - params.update(self.PARAMS) - params[Constant.EXPORT_TYPE] = "" - base_cluster_time_summary_df_dict = { - Constant.TABLE_CLUSTER_TIME_SUMMARY: pd.DataFrame( - { - "rank": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6], - "step": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1], - "computation": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "communicationNotOverlapComputation": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "communicationOverlapComputation": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "communication": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "free": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "communicationWaitStageTime": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "communicationTransmitStageTime": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "memory": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "memoryNotOverlapComputationCommunication": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], - "taskLaunchDelayAvgTime": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13] - } - ) - } - cluster_time_summary_df_dict = { - Constant.TABLE_CLUSTER_TIME_SUMMARY: pd.DataFrame( - { - "rank": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7], - "step": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1], - "computation": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - 
"communicationNotOverlapComputation": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "communicationOverlapComputation": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "communication": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "free": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "communicationWaitStageTime": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "communicationTransmitStageTime": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "memory": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "memoryNotOverlapComputationCommunication": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], - "taskLaunchDelayAvgTime": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] - } - ) - } - expected_result = pd.DataFrame({ - "rank": [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6], - "step": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1], - "computationDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], - "communicationNotOverlapComputationDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, - 1.0, 1.0, 1.0, 1.0], - "communicationOverlapComputationDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, - 1.0, 1.0, 1.0, 1.0], - "communicationDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], - "freeDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], - "communicationWaitStageTimeDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], - "communicationTransmitStageTimeDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, - 1.0, 1.0, 1.0, 1.0], - "memoryDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], - "memoryNotOverlapComputationCommunicationDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, - 1.0, 1.0, 1.0, 1.0], - "taskLaunchDelayAvgTimeDiff": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] - }) - with 
mock.patch(NAMESPACE + ".path_manager.PathManager.check_input_file_path"), \ - mock.patch(NAMESPACE + ".db_manager.DBManager.check_tables_in_db", side_effect=[True, True]), \ - mock.patch(NAMESPACE + ".database_service.DatabaseService.query_data", - side_effect=[cluster_time_summary_df_dict, base_cluster_time_summary_df_dict]): - cluster_time_compare_summary = ClusterTimeCompareSummary(params) - cluster_time_compare_summary.run() - self.assertTrue(cluster_time_compare_summary.compare_result.equals(expected_result)) - diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_ep_load_balance.py b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_ep_load_balance.py new file mode 100644 index 0000000000000000000000000000000000000000..577df7bb84193be6d6dc815145e3c2d33d34c455 --- /dev/null +++ b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_ep_load_balance.py @@ -0,0 +1,90 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import unittest +from unittest.mock import patch, MagicMock +import pandas as pd + +from msprof_analyze.prof_common.constant import Constant +from msprof_analyze.cluster_analyse.recipes.ep_load_balance.ep_load_balance import EPLoadBalance + + +class TestEPLoadBalance(unittest.TestCase): + + def setUp(self): + self.params = {} + self.ep_load_balance = EPLoadBalance(self.params) + self.mock_db_path = "mock_db_path" + self.mock_rank_id = 0 + self.mock_step_range = {Constant.START_NS: 0, Constant.END_NS: 1000} + self.mock_global_ranks = [0, 1] + + @patch("msprof_analyze.cluster_analyse.recipes.ep_load_balance.ep_load_balance.DatabaseService") + def test_mapper_func_given_valid_data_map_when_called_then_pass(self, mock_db_service): + """ + Test _mapper_func method to ensure it returns a DataFrame with correct Rank and epRanks columns + when provided with a valid data map. + """ + # Mock the DatabaseService and its methods + mock_db_instance = mock_db_service.return_value + mock_db_instance.query_data.return_value = { + "META_DATA": pd.DataFrame( + { + "name": ["parallel_group_info"], + "value": ['{"group1": {"group_name": "exp", "global_ranks": [0, 1]}}'], + } + ) + } + + # Mock the InputShapeExport + + mock_input_shape_export = MagicMock() + mock_input_shape_export.read_export_db.return_value = pd.DataFrame( + {"InputShapes": ["1,3;4,6;;;;;4", "1,3;4,6;;;;;4"]} + ) + + with patch( + "msprof_analyze.cluster_analyse.recipes.ep_load_balance.ep_load_balance.InputShapeExport", + return_value=mock_input_shape_export, + ): + data_map = { + Constant.PROFILER_DB_PATH: self.mock_db_path, + Constant.RANK_ID: self.mock_rank_id, + Constant.STEP_RANGE: self.mock_step_range, + } + result = self.ep_load_balance._mapper_func(data_map, "mock_analysis_class") + + self.assertIsNotNone(result) + self.assertEqual(result["Rank"].tolist(), [self.mock_rank_id] * 2) + self.assertEqual(result["epRanks"].tolist(), [self.mock_global_ranks] * 2) + + def 
test_reducer_func_given_dataframes_when_called_then_pass(self): + """ + Test reducer_func method to ensure it processes multiple DataFrames and generates + ep_tokens_summary and top_ep_tokens_map correctly. + """ + mock_mapper_res = [ + pd.DataFrame( + {"Rank": [0, 1], "epRanks": [[0, 1], [0, 1]], "InputShapes": ["1,3;4,6;;;;;4", "7,8;10,12;;;;4"]} + ), + pd.DataFrame( + {"Rank": [2, 3], "epRanks": [[0, 1], [0, 1]], "InputShapes": ["1,3;4,6;;;;;4", "1,3;4,6;;;;;4"]} + ), + ] + + self.ep_load_balance.reducer_func(mock_mapper_res) + + self.assertIsNotNone(self.ep_load_balance.ep_tokens_summary) + self.assertIsNotNone(self.ep_load_balance.top_ep_tokens_map) \ No newline at end of file diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_freq_analysis.py b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_freq_analysis.py new file mode 100644 index 0000000000000000000000000000000000000000..0a559b79178d03879df6901703c66c7cfcd03663 --- /dev/null +++ b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_freq_analysis.py @@ -0,0 +1,83 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +import random +import unittest + +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.freq_analysis.freq_analysis import FreqAnalysis + + +class TestFreqAnalysis(unittest.TestCase): + + freq = [1800] + free_freq = [800, 1800] + abnormal_freq = [1200, 1300, 1800] + + def test_no_error_freq(self): + params = {} + recipe = FreqAnalysis(params) + mapper_res = [(self.freq, 0)] * 10 + recipe.reducer_func(mapper_res) + self.assertEqual(recipe.free_freq_ranks, []) + self.assertEqual(recipe.abnormal_freq_ranks, []) + self.assertEqual(recipe.abnormal_freq_ranks_map, {}) + + + def test_free_rank_map(self): + params = {} + recipe = FreqAnalysis(params) + mapper_res = [ + (self.freq, 0), + (self.free_freq, 1), + (self.free_freq, 2), + (self.freq, 3) + ] + recipe.reducer_func(mapper_res) + self.assertEqual(recipe.free_freq_ranks, [1, 2]) + self.assertEqual(recipe.abnormal_freq_ranks, []) + self.assertEqual(recipe.abnormal_freq_ranks_map, {}) + + def test_abnormal_rank_map(self): + params = {} + recipe = FreqAnalysis(params) + mapper_res = [ + (self.freq, 0), + (self.abnormal_freq, 1), + (self.abnormal_freq, 2), + (self.freq, 3) + ] + + recipe.reducer_func(mapper_res) + self.assertEqual(recipe.free_freq_ranks, []) + self.assertEqual(recipe.abnormal_freq_ranks, [1, 2]) + + def test_mix_freq_case(self): + params = {} + recipe = FreqAnalysis(params) + mapper_res = [] + rank_case = [[], [], []] + random_freq = {0: self.freq, 1: self.free_freq, 2: self.abnormal_freq} + + for i in range(1000): + random_num = random.choice([0, 1, 2]) + mapper_res.append((random_freq.get(random_num, self.freq), i)) + rank_case[random_num].append(i) + + recipe.reducer_func(mapper_res) + self.assertEqual(recipe.free_freq_ranks, rank_case[1]) + self.assertEqual(recipe.abnormal_freq_ranks, rank_case[2]) diff --git a/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_slow_rank.py b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_slow_rank.py new file mode 100644 
index 0000000000000000000000000000000000000000..9fecb4f2a06eb5fcda71836fd6b8ad0f55a424ae --- /dev/null +++ b/profiler/msprof_analyze/test/ut/cluster_analyse/recipes/test_slow_rank.py @@ -0,0 +1,101 @@ +# Copyright (c) 2025, Huawei Technologies Co., Ltd. +# All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +import unittest + +import pandas as pd + +from msprof_analyze.cluster_analyse.recipes.slow_rank.slow_rank import judge_norm, judge_dixon, SlowRankVoteAnalysis + + +class TestJudgeNorm(unittest.TestCase): + def test_no_outlier(self): + data_list = [10] * 120 + res = judge_norm(data_list) + self.assertEqual(res, []) + + def test_with_outlier(self): + data_with_outlier = [10] * 120 + data_with_outlier.append(0) + res = judge_norm(data_with_outlier) + self.assertEqual(res, [120]) + + +class TestJudgeDixon(unittest.TestCase): + def test_no_outlier(self): + for i in [6, 8, 12, 30]: + data_list = [100 + j for j in range(i)] + res = judge_dixon(data_list) + self.assertEqual(res, []) + + def test_with_outlier(self): + for i in [6, 8, 12, 30]: + data_with_outlier = [100 + j for j in range(i)] + data_with_outlier.append(0) + res = judge_dixon(data_with_outlier) + self.assertEqual(res, [i]) + + +class TestVoteAnalysis(unittest.TestCase): + + @staticmethod + def init_cmm_ops_df(group_0_op_0_num, group_0_op_1_num, group_1_op_0_num): + comm_ops_df = pd.DataFrame(columns=["rankId", "groupName", "opName", "communication_times"]) + for i in 
range(group_0_op_0_num): + comm_ops_df.loc[len(comm_ops_df)] = [i, "group_0", "op_0", 0] + for i in range(group_0_op_1_num): + comm_ops_df.loc[len(comm_ops_df)] = [i, "group_0", "op_1", 0] + for i in range(group_1_op_0_num): + comm_ops_df.loc[len(comm_ops_df)] = [i, "group_1", "op_0", 0] + return comm_ops_df + + def test_grouping_ops(self): + group_0_op_0_num = 10 + group_0_op_1_num = 10 + group_1_op_0_num = 5 + comm_ops_df = self.init_cmm_ops_df(group_0_op_0_num, group_0_op_1_num, group_1_op_0_num) + analyzer = SlowRankVoteAnalysis(comm_ops_df) + res = analyzer.grouping_ops() + res = dict(res) + for key in res.keys(): + res[key] = dict(res[key]) + golden_res = { + "group_0": { + "op_0": [i for i in range(group_0_op_0_num)], + "op_1": [i + group_0_op_0_num for i in range(group_0_op_1_num)] + }, + "group_1": { + "op_0": [i + group_0_op_0_num + group_0_op_1_num for i in range(group_1_op_0_num)] + } + } + self.assertEqual(res, golden_res) + + def test_grouping_ops_with_exclude(self): + group_0_op_0_num = 10 + group_0_op_1_num = 12 + group_1_op_0_num = 5 + comm_ops_df = self.init_cmm_ops_df(group_0_op_0_num, group_0_op_1_num, group_1_op_0_num) + analyzer = SlowRankVoteAnalysis(comm_ops_df) + res = analyzer.grouping_ops() + res = dict(res) + for key in res.keys(): + res[key] = dict(res[key]) + golden_res = { + "group_1": { + "op_0": [i for i in range(group_1_op_0_num)] + } + } + self.assertEqual(res, golden_res) diff --git a/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_base_profiling_parser.py b/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_base_profiling_parser.py index de2ef0a46800bade04cf955cee68ce6f82acd274..dcdc6ba7419c93580fdc5ed91f3a4f092339a033 100644 --- a/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_base_profiling_parser.py +++ b/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_base_profiling_parser.py @@ -31,6 +31,7 @@ class ProfilingParser(BaseProfilingParser): 
self._enable_api_compare = True self._bwd_tid = 1 self._step_id = -1 + self._step_range = [] def _update_kernel_details(self): pass @@ -97,7 +98,7 @@ class TestBaseProfilingParser(unittest.TestCase): 2: {"start": MockEvent(1, 2, 12), "end": MockEvent(2, 3, 22)}, 3: {}} all_kernels = {"2-3-23": MockEvent(2, 3, 23), "2-3-21": MockEvent(2, 3, 21), "2-3-22": MockEvent(2, 3, 22)} - comm_events = [{"ph": "X", "name": "hccl_allreduce", "pid": 7, "tid": 3, "ts": 1, "dur": 2}] + comm_events = [{"ph": "X", "name": "hcom_allreduce", "pid": 7, "tid": 3, "ts": 1, "dur": 2}] task_events = [{"ph": "X", "name": "notify_wait", "pid": 7, "tid": 1, "ts": 2, "dur": 1}, {"ph": "X", "name": "notify_wait", "pid": 7, "tid": 1, "ts": 5, "dur": 1}] diff --git a/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_npu_profiling_parser.py b/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_npu_profiling_parser.py index 8f33065c16f2d7a00ab08cec6305c0f237a66252..d1a76de0a74f57cc406e5142d3dca37a7012aab3 100644 --- a/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_npu_profiling_parser.py +++ b/profiler/msprof_analyze/test/ut/compare_tools/profiling_parser/test_npu_profiling_parser.py @@ -146,9 +146,8 @@ class TestNPUProfilingParser(unittest.TestCase): patch("msprof_analyze.compare_tools.compare_backend.profiling_parser." "npu_profiling_parser.NPUProfilingParser.__init__", return_value=None), \ - patch( - "compare_backend.profiling_parser.npu_profiling_parser.BaseProfilingParser." - "_trace_event_generator", + patch("msprof_analyze.compare_tools.compare_backend.profiling_parser.npu_profiling_parser." 
+ "BaseProfilingParser._trace_event_generator", return_value=(TraceEventBean(event) for event in self.meta_events)): res = NPUProfilingParser({}, {}) res._hccl_op_tid_list = [] diff --git a/profiler/msprof_analyze/precheck/env_check/cpu_check.py b/profiler/msprof_analyze/test/ut_coverage.sh similarity index 32% rename from profiler/msprof_analyze/precheck/env_check/cpu_check.py rename to profiler/msprof_analyze/test/ut_coverage.sh index e3765c71ebc3a9ee4700fe59cfc84727c68c3417..d8b60991bfc80022caf46235e1c0ed48a8e56f76 100644 --- a/profiler/msprof_analyze/precheck/env_check/cpu_check.py +++ b/profiler/msprof_analyze/test/ut_coverage.sh @@ -1,3 +1,4 @@ +#!/bin/bash # Copyright (c) 2025, Huawei Technologies Co., Ltd. # All rights reserved. # @@ -12,14 +13,41 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -from msprof_analyze.precheck.env_check.environment_check import HardwareCheck +set -e -class CPUCheck(HardwareCheck): - CHECK_TYPE = "cpu" +# 获取脚本的绝对路径和目录 +real_path=$(readlink -f "$0") +script_dir=$(dirname "$real_path") +output_dir="${script_dir}/ut_coverage" +profiler_path=$(readlink -f "${script_dir}/../..") +msprof_analyze_path="${profiler_path}/msprof_analyze" +srccode="${msprof_analyze_path}/advisor,${msprof_analyze_path}/cli,${msprof_analyze_path}/cluster_analyse,${msprof_analyze_path}/compare_tools,${msprof_analyze_path}/prof_common,${msprof_analyze_path}/prof_exports" +test_code="${script_dir}/ut" - def __init__(self, **kwargs): - super().__init__(**kwargs) +# 更新 PYTHONPATH +export PYTHONPATH="${profiler_path}:${test_code}:${PYTHONPATH}" - def check(self): - pass +# 创建输出目录 +mkdir -p "$output_dir" +cd "$output_dir" + +# 删除旧的覆盖率文件 +rm -f .coverage + +# 运行单元测试并生成覆盖率报告 +coverage run --branch --source="${srccode}" -m pytest -s "${test_code}" --junit-xml=./final.xml +coverage xml -o coverage.xml +coverage report 
>python_coverage_report.log + +# 如果设置了 diff 参数,比较覆盖率差异 +if [[ -n "$1" && "$1" == "diff" ]]; then + target_branch=${2:-master} + diff-cover coverage.xml --compare-branch="origin/${target_branch}" --html-report inc_coverage_result.html --fail-under=80 +fi + +# 输出报告路径 +echo "Report: $output_dir" + +# 清理 .pycache 文件 +find "${script_dir}/.." -name "__pycache__" -exec rm -r {} + diff --git a/profiler/msprof_analyze/version.txt b/profiler/msprof_analyze/version.txt index 359a5b952d49f3592571e2af081510656029298e..f93ea0ca333052aa92d755e06d2a672b7f895426 100644 --- a/profiler/msprof_analyze/version.txt +++ b/profiler/msprof_analyze/version.txt @@ -1 +1 @@ -2.0.0 \ No newline at end of file +2.0.2 \ No newline at end of file diff --git a/sample/README.md b/sample/README.md index 8e555f4870d2c39fc5cabad3092d1c17f60d3dfa..15238cb9f3815d6fecb0c743e6f826d2abc2988b 100644 --- a/sample/README.md +++ b/sample/README.md @@ -8,10 +8,19 @@ 说明:该sample目录中,每个最小目录就是一个完整的样例工程。这些样例工程本身可能以为依赖的不同存在差异。 ## 依赖说明 -安装CANN包,并使能环境变量,并确保```ASCEND_HOME_PATH```生效,可以在CANN包安装目录下使能: -``` -source set_env.sh -``` +- 硬件环境请参见《[昇腾产品形态说明](https://gitee.com/link?target=https%3A%2F%2Fwww.hiascend.com%2Fdocument%2Fdetail%2Fzh%2Fcanncommercial%2F80RC22%2Fquickstart%2Fquickstart%2Fquickstart_18_0002.html)》。 +- 软件环境请参见《[CANN 软件安装指南](https://gitee.com/link?target=https%3A%2F%2Fwww.hiascend.com%2Fdocument%2Fdetail%2Fzh%2Fcanncommercial%2F80RC22%2Fsoftwareinst%2Finstg%2Finstg_0000.html%3FMode%3DPmIns%26OS%3DUbuntu%26Software%3DcannToolKit)》安装昇腾设备开发或运行环境,即toolkit软件包。 + +以上环境依赖请根据实际环境选择适配的版本。 + +### 版本配套 +| 条件 | 要求 | +|---|---| +| CANN版本 | >=8.0.RC1.alpha001 | +| 硬件要求 | Atlas 800T A2 训练服务器| + +- 支持AscendPyTorch 1.11.0或更高版本,支持的PyTorch和CANN以及PyTorch和Python软件版本配套关系请参见《[Ascend Extension for PyTorch插件](https://gitee.com/ascend/pytorch)》。 +- 
固件驱动版本与配套CANN软件支持的固件驱动版本相同,开发者可通过“[昇腾社区-固件与驱动](https://gitee.com/link?target=https%3A%2F%2Fwww.hiascend.com%2Fhardware%2Ffirmware-drivers%2Fcommunity%3Fproduct%3D2%26model%3D28%26cann%3D8.0.RC3.alpha003%26driver%3D1.0.25.alpha)”页面根据产品型号与CANN软件版本获取配套的固件与驱动。 ## 目录介绍 整体目录结构如下: @@ -91,7 +100,7 @@ mssanitizer ./*.fatbin # 默认进行memcheck检查 ``` LINK_LIBS := -L${ASCEND_HOME_PATH}/lib64 -lruntime -lascendcl -lstdc++ 修改为: - LINK_LIBS := -L${ASCEND_HOME_PATH}/lib64 -L${ASCEND_HOME_PATH}/tools/simulator/${SOC_VERSION}/lib/ -lruntime_camodel -lascendcl -lstdc++ # 需要添加libruntime_camodel的依赖路径, SOC_VERSION 使用npu-smi info查询NPU Name + LINK_LIBS := -L${ASCEND_HOME_PATH}/lib64 -L${ASCEND_HOME_PATH}/tools/simulator/${SOC_VERSION}/lib/ -lruntime_camodel -lascendcl -lstdc++ # 需要添加libruntime_camodel的依赖路径, SOC_VERSION 通过使用npu-smi info命令进行查询,获取Chip Name信息。实际配置值 为AscendChip Name,例如Chip Name取值为xxxyy,实际配置值为Ascendxxxyy。当Ascendxxxyy为代码样例路径时,需要配置ascendxxxyy。 ``` + 调试信息增强: ``` diff --git a/sample/normal_sample/cube_only/main.cpp b/sample/normal_sample/cube_only/main.cpp index f6e1e5663203cd2b07e29fbc48255f781d31120f..2e151c2ff465868412fa884730bcb8baaa372268 100644 --- a/sample/normal_sample/cube_only/main.cpp +++ b/sample/normal_sample/cube_only/main.cpp @@ -49,7 +49,6 @@ void printAclFloat16(aclFloat16 *addr) void MakeTiling(uint32_t *addr, size_t size) { - assert(sizeof(TCubeTiling) <= size); // TCubeTiling该结构体在kernel_tiling/kernel_tiling.h中的结构体定义 // tiling_api.h中本身定义的结构与kernel_tiling.h相近,通过GetTiling实现映射 // TCubeTiling定义的可读性较好,可以直接理解,但使用tiling_api可以直接使能部分默认值 @@ -99,7 +98,6 @@ void MakeTiling(uint32_t *addr, size_t size) tiling->dbL0A = 2; tiling->dbL0B = 2; tiling->dbL0C = 1; - tiling->reserved = 0; } // y = matmul(xa, xb) @@ -108,7 +106,7 @@ int32_t main(int32_t argc, char *argv[]) size_t xaSize = 512 * 1024 * sizeof(aclFloat16); size_t xbSize = 512 * 1024 * sizeof(aclFloat16); size_t ySize = 512 * 1024 * sizeof(float); - size_t tilingSize = 48 * sizeof(uint32_t); + size_t tilingSize = 
sizeof(TCubeTiling); uint32_t blockDim = 8; CHECK_ACL(aclInit(nullptr)); diff --git a/sample/normal_sample/mix/Makefile b/sample/normal_sample/mix/Makefile index e7676a20ed5037030cc15ee0f67484ff0c08f369..8f162255b04c85740779bed62eb7e4cbc0f47b9a 100644 --- a/sample/normal_sample/mix/Makefile +++ b/sample/normal_sample/mix/Makefile @@ -8,7 +8,7 @@ DAV_FLAG := --cce-aicore-arch=dav-c220 ASCENDC_INC_FLAG := -I${ASCEND_HOME_PATH}/compiler/tikcpp/tikcfw -I${ASCEND_HOME_PATH}/compiler/tikcpp/tikcfw/impl -I${ASCEND_HOME_PATH}/compiler/tikcpp/tikcfw/interface -I${ASCEND_HOME_PATH}/include # 参考device_intf.cmake的配置简化 TILING_INC_FLAG := -I${ASCEND_HOME_PATH}/compiler/tikcpp/tikcfw HOST_INC_FLAG := -I${ASCEND_HOME_PATH}/include -LINK_LIBS := -L${ASCEND_HOME_PATH}/lib64 -lruntime -lascendcl -lstdc++ -lprofapi -lmmpa -lascendalog -lregister -lerror_manager +LINK_LIBS := -L${ASCEND_HOME_PATH}/lib64 -lruntime -lascendcl -lstdc++ -lprofapi -lmmpa -lascendalog -lregister -lerror_manager -lc_sec LINK_STATIC_LIBS := ${ASCEND_HOME_PATH}/lib64/libascendc_runtime.a all: build diff --git a/sample/normal_sample/mix/main.cpp b/sample/normal_sample/mix/main.cpp index 0b68f4c546552b6af416b4898cd4a366959a1cf6..91ba79932e3e37ce84f041de2bfabe74c8c795d4 100644 --- a/sample/normal_sample/mix/main.cpp +++ b/sample/normal_sample/mix/main.cpp @@ -101,7 +101,6 @@ void MakeTiling(int32_t *addr, size_t size) tiling->dbL0A = 2; tiling->dbL0B = 2; tiling->dbL0C = 1; - tiling->reserved = 0; } int32_t main(int32_t argc, char *argv[]) diff --git "a/\345\205\254\347\275\221URL\350\257\264\346\230\216.md" "b/\345\205\254\347\275\221URL\350\257\264\346\230\216.md" index 5d77e387caf7964eb405dc2aa5b7cbb009f510cc..a31fdfe42f0c2470143dbab31db8fb967e4056b8 100644 --- "a/\345\205\254\347\275\221URL\350\257\264\346\230\216.md" +++ "b/\345\205\254\347\275\221URL\350\257\264\346\230\216.md" @@ -1,9 +1,20 @@ # 公网URL说明 -| 软件类型 | 软件名 | 路径 | 类型 | 内容 | 用途说明 | 
-|------|----------------------------------------------------|------------------------------------------|------|------------------------------------------------------------------------------------------------------------|--------------------|
-| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/googletest/repository/archive/release-1.12.1.tar.gz | Open-source software download |
-| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/sinojelly/mockcpp/repository/archive/v2.7.zip | Open-source software download |
-| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/JSON-for-Modern-CPP/repository/archive/v3.10.1.zip | Open-source software download |
-| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/openssl/repository/archive/OpenSSL_1_1_1k.tar.gz | Open-source software download |
-| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/protobuf_source/repository/archive/v3.15.0.tar.gz | Open-source software download |
+| Software Type | Software Name | Path | Type | Content | Purpose |
+| -------- | -------------------------------------------------- | ------------------------------------------------------------ | -------- | ------------------------------------------------------------ | ------------------------------------------ |
+| Open-source software | dynolog | /.gitmodules | Public address | https://github.com/facebookincubator/dynolog.git | Foundation for online monitoring |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/config/config.ini | Public address | https://www.hiascend.com/document/detail/zh/canncommercial/80RC2/devaids/auxiliarydevtool/atlasprofiling_16_0038.html | MindStudio Ascend PyTorch Profiler reference example |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/config/config.ini | Public address |
https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/advisor/doc/Samples%20of%20Fused%20Operator%20API%20Replacement.md | Reference example of Advisor optimization methods |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/config/config.ini | Public address | https://www.hiascend.com/document/detail/zh/canncommercial/80RC2/devaids/auxiliarydevtool/aoe_16_043.html | Reference example of Advisor optimization methods |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/config/config.ini | Public address | https://www.mindspore.cn/lite/docs/en/master/use/cloud_infer/converter_tool_ascend.html#aoe-auto-tuning | Reference example of Advisor optimization methods |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/config/config.ini | Public address | https://www.hiascend.com/document/detail/zh/canncommercial/700/modeldevpt/ptmigr/AImpug_000060.html | Reference example of Advisor optimization methods |
+| Open-source software | MindStudio Training Tools - msprof-analyze | /profiler/msprof_analyze/config/config.ini | Public address | https://gitee.com/ascend/mstt/tree/master/profiler/msprof_analyze | msprof-analyze tool address |
+| Open-source software | MindStudio Training Tools - msprof-analyze | /profiler/msprof_analyze/LICENSE | Public address | http://www.apache.org/licenses/LICENSE-2.0 | Open-source license address |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/rules/aicpu_rules.ymal | Public address | https://gitee.com/ascend/mstt/blob/master/profiler/msprof_analyze/advisor/doc/Samples%20of%20AI%20CPU%20Operator%20Replacement.md | AI CPU operator replacement sample |
+| Open-source software | MindStudio Training Tools - msprof-analyze advisor | /profiler/msprof_analyze/advisor/rules/environment_variable_info.yaml | Public address | https://support.huawei.com/enterprise/zh/doc/EDOC1100371278/5eeeed85?idPath=23710424 | Networking guide |
+| Open-source software | MindStudio Training Tools - msprof-analyze | /profiler/msprof_analyze/config/config.ini | Public address | pmail_mindstudio@huawei.com | Public email address |
+| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address
| https://gitee.com/mirrors/googletest/repository/archive/release-1.12.1.tar.gz | Open-source software download |
+| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/sinojelly/mockcpp/repository/archive/v2.7.zip | Open-source software download |
+| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/JSON-for-Modern-CPP/repository/archive/v3.10.1.zip | Open-source software download |
+| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/openssl/repository/archive/OpenSSL_1_1_1k.tar.gz | Open-source software download |
+| Open-source software | MindStudio Training Tools - accuracy_tools | /debug/accuracy_tools/cmake/config.ini | Public address | https://gitee.com/mirrors/protobuf_source/repository/archive/v3.15.0.tar.gz | Open-source software download |