# 点源时域麦克斯韦方程AI求解
**Repository Path**: Awedinary/time_domain_maxwell
## Basic Information
- **Project Name**: 点源时域麦克斯韦方程AI求解
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-06-14
- **Last Updated**: 2025-06-16
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Solving Partial Differential Equations with Point Source Based on Physics-Informed Neural Networks
## 1.项目简介
本项目复现了论文《Solving Partial Differential Equations with Point Source Based on Physics-Informed Neural Networks》提出的方法,该论文解决了含点源(Dirac δ函数)的偏微分方程求解难题。针对传统物理信息神经网络(PINNs)在处理点源奇异性时的失效问题,论文实现了三大核心技术:
1. **狄拉克函数连续化**:使用高斯分布近似δ函数
2. **区域自适应加权**:带下界约束的不确定性损失平衡算法
3. **多尺度SIREN网络**:结合多尺度特征提取与周期性激活函数
在二维时域Maxwell方程上验证了方法的有效性。
## 2. 背景介绍
含点源的偏微分方程是描述局域激励物理过程的核心模型。在电磁学中,Maxwell方程的点源项可表征天线辐射;在声学中,波动方程的点源对应单扬声器声场;在流体力学中,点源描述局部注液/抽液过程。这些方程的统一数学特征是在控制方程中包含狄拉克δ函数:
$$
\mathcal{N}(u) = f(x,t) + \delta\left( x - x_{0} \right)g(t).
$$
其中δ(x)为Dirac delta函数,x0为点源位置。Dirac delta函数的定义为:
$$
\delta(x) = \left\{ \begin{array}{r}
+ \infty,\ x = 0 \\
0,\ x \neq 0
\end{array} \right.\ ,\ \int_{- \infty}^{+ \infty}{\delta(x)dx = 1.}
$$
狄拉克函数具有奇异性,即无穷大值、零支撑集的特点,导致传统数值方法(如有限差分法)面临两大挑战:
1. **网格依赖性**:有限差分法(FDTD)需网格尺寸远小于点源尺度(通常Δx<0.01λ),计算成本随维度指数增长。
2. **能量守恒破坏**:有限元法在点源处需特殊基函数,否则导致数值震荡。
近年来,深度学习技术在求解偏微分方程(PDEs)中展现出巨大潜力,其中物理信息神经网络(PINNs)因其能够结合物理规律和数据驱动的方法而备受关注,PINNs通过自动微分嵌入物理约束:
$$
L_{phy} = {\frac{1}{N}\sum_{i = 1}^{N}{\parallel \mathcal{N}}\left( u_{NN}(x_{i}) \right) - f(x_{i}) \parallel}^{2}.
$$
但狄拉克函数的不可导性将导致点源邻域梯度爆炸或优化过程陷入局部极小值。因此,当PDEs中包含点源时,传统PINNs方法由于Dirac delta函数的奇异性而无法直接求解。
针对以上挑战,论文提出了三项创新,以解决点源时域麦克斯韦方程在以往求解中遇到的问题:
1. **消除奇异性**:用高斯分布近似Dirac delta函数;
2. **区域自适应加权**:提出一种带下界约束的不确定性加权算法,用于平衡点源区域与其他区域的损失项;
3. **频率自适应网络**:结合多尺度深度神经网络(multi-scale DNN)与周期性激活函数构建神经网络架构,提升物理信息神经网络(PINNs)方法的精度与收敛速度。
## 3. 问题定义
麦克斯韦方程组是一组耦合偏微分方程,与电磁学的基本物理建模相关。论文采用二维麦克斯韦方程组验证所提出方法的鲁棒性。问题定义如下:
求解的方程为二维横电波(TE)Maxwell方程组,TE波的控制方程如下:
$$
\frac{\partial E_{x}}{\partial t} = \frac{1}{\epsilon}\frac{\partial H_{z}}{\partial y},
$$
$$
\frac{\partial E_{y}}{\partial t} = - \frac{1}{\epsilon}\frac{\partial H_{z}}{\partial x},
$$
$$
\frac{\partial H_{z}}{\partial t} = - \frac{1}{\mu}\left( \frac{\partial E_{y}}{\partial x} - \frac{\partial E_{x}}{\partial y} + J \right).
$$
其中Ex和Ey表示笛卡尔坐标系中的电场分量,Hz为垂直于水平面的磁场分量。
常数μ和ε分别表示自由空间的磁导率和介电常数,取值如下:
$$
\mu\approx4\pi\times10^{-7}H/m,\epsilon\approx8.854\times10^{-12}F/m.
$$
J作为已知源函数,表征通过源节点的能量传输。在不失一般性的前提下,采用含多频分量的高斯脉冲作为源函数,其表达式为:
$$
J(x,y,t) = e^{- \left( \frac{t - d}{\tau} \right)^{2}}\delta\left( x - x_{0} \right)\delta\left( y - y_{0} \right).
$$
其中$d$为时间延迟,点源位置(x0,y0) = (0,0),τ为脉冲参数,当前取值为:
$$
τ = \frac{3.65 \times \sqrt{2.3}}{(\pi f)}
$$
特征频率f设定为1GHz.
同时,假设初始电磁场为静态场,对应的初始条件如下:
$$
E_{x}(x,y,0) = E_{y}(x,y,0) = 0,
$$
$$
H_{z}(x,y,0) = 0.
$$
为在有限真空域内唯一求解方程组,采用标准的Mur二阶吸收边界条件。对于矩形截断域,其边上、下、左、右的边界条件需满足:
$$
\frac{\partial H_{z}}{\partial x} - \frac{1}{c}\frac{\partial H_{z}}{\partial t} + \frac{c\epsilon}{2}\frac{\partial E_{x}}{\partial y} = 0,
$$
$$
\frac{\partial H_{z}}{\partial x} + \frac{1}{c}\frac{\partial H_{z}}{\partial t} - \frac{c\epsilon}{2}\frac{\partial E_{x}}{\partial y} = 0,
$$
$$
\frac{\partial H_{z}}{\partial y} - \frac{1}{c}\frac{\partial H_{z}}{\partial t} + \frac{c\epsilon}{2}\frac{\partial E_{y}}{\partial x} = 0,
$$
$$
\frac{\partial H_{z}}{\partial y} + \frac{1}{c}\frac{\partial H_{z}}{\partial t} - \frac{c\epsilon}{2}\frac{\partial E_{y}}{\partial x} = 0.
$$
其中,c表示光速。所求解问题的空间域为Ω = [-1, 1]2(单位:米),时间域T∈[0, 4]ns。
最终,论文所求解的核心问题可以表述为:构建一个从时空坐标到电磁场分量的连续映射函数:
$$
\Phi:\Omega \times \lbrack 0,T\rbrack \rightarrow \mathbb{R}^{3},
$$
$$
(x,y,t) \mapsto (E_{x}(x,y,t),E_{y}(x,y,t),H_{z}(x,y,t)\ ).
$$
## 4. 研究现状
在求解含有点源的偏微分方程方面,许多研究团队提出了解决方案。Deep Ritz方法[1]通过数学形式推导,将含有点源的偏微分方程求解转化为最小化问题,从而进行求解。但该方法使用场景受限,只能适用于可以由变分问题推导出的偏微分方程求解问题。NVIDIA SimNet方法[2]sup>提出使用变分公式求解带有点源的偏微分方程,但这种方法的性能严重依赖于测试函数的选择,计算时间会随着测试函数的数量线性增加。Zhang等[3]sup>和Moseley等[4]sup>结合了传统数值方法和PINNs,用经典数值方法模拟波动方程的早期(0 ≤ t ≤ T0 < T)过程以避免点源的产生,然后应用PINNs方法求解t = t0处具有给定初始条件的偏微分方程,但这类方法需额外标注数据。Bekele等[5]sup>通过添加标注数据改进PINNs,这种方法依赖于经典的计算方法来生成标记数据,因此在很多情况下成本高昂,甚至是不可行的。值得注意的是,现有方法均未根本解决奇异性导致的PINNs训练不收敛问题,且缺乏通用性框架。
[1] WEINAN E, YU Bing. The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems[J]. Communications in Mathematics and Statistics, 2018, 6(1): 1-12.
[2] HENNIGH Oliver, NARASIMHAN Susheela, NABIAN Mohammad Amin, et al. Nvidia simnet™: An AI-accelerated multi-physics simulation framework[C]//International Conference on Computational Science. Cham: Springer, 2021: 447-461.
[3] ZHANG Pan, HU Yanyan, JIN Yuchen, et al. A Maxwell's equations based deep learning method for time domain electromagnetic simulations[J]. IEEE Journal on Multiscale and Multiphysics Computational Techniques, 2021, 6: 35-40.
[4] MOSELEY Ben, MARKHAM Andrew, NISSEN-MEYER Tarje. Solving the wave equation with physics-informed deep learning[EB/OL]. arXiv preprint, 2020: arXiv:2006.11894.
[5] BEKELE Yared W. Physics-informed deep learning for flow and deformation in poroelastic media[EB/OL]. arXiv preprint, 2020: arXiv:2010.15426.
## 5. 论文方法
论文提出三项核心方法解决点源偏微分方程的奇异性问题。
### (1)狄拉克函数的连续化逼近
为消除Dirac delta函数的奇异性,论文采用对称单峰连续概率密度函数近似替代:
$$
\eta_{\alpha}(x) = \alpha^{- d}\eta(\frac{x - x_{0}}{\alpha}).
$$
其中α为核宽度(默认α=0.01),d为空间维数η(·)可选用高斯分布、柯西分布或拉普拉斯分布。以高斯分布为例:
$$
\eta_{\alpha}(x) = \frac{1}{\alpha\sqrt{2\pi}}e^{- \frac{{\parallel x - x_{0} \parallel}^{2}}{2\alpha^{2}}}.
$$
此变换将点源转化为连续支撑集上的平滑函数。图1展示了α的调优过程,初始设为0.01,每70k迭代减半,最终稳定在0.00125。根据图1的实验结果,论文选择α= 0.01作为核宽度,并在后续实验中进行固定。
### (2)下界约束的不确定性加权算法
由于概率密度函数在原点处高度集中,相当于α趋近于0,因此靠近原点的残差样本往往占整个残差损失的主导地位,使得网络难以学习控制方程,标准优化器可能无法收敛到解。为了解决这个问题,论文将可行区域划分为两个子域:
$$
\Omega = \Omega_{0} \cup \Omega_{1},\ \Omega_{0} \cap \Omega_{1} = \varnothing
$$
其中Ω 0表示点源临域,Ω 1覆盖补集区域,为非点源域。通过这样的分解,可以将PINNs方法中的剩余损失分为两部分,按照点源临域和非点源域进行差异化平衡:
$$
L_{r}(\theta) = \lambda_{r,0}L_{r,0}(\theta) + \lambda_{r,1}L_{r,1}(\theta)
$$
其中Lr,0(θ)和Lr,1(θ)分别对应于Ω 0和Ω 1中的剩余损失。在论文的实验中,点源的激发位置为x0,因此Ω 按照如下方式划分:
$$
\Omega_{0} = \left\{ x_{0} + x \in \Omega\ | \parallel x \parallel \leq 3\alpha \right\},\ \Omega_{1} = \Omega\backslash\Omega_{0}
$$
再考虑到PINNs方法的初始条件(IC)和边界条件(BC),总PINNs损失即可表示为以下形式:
$$
L_{total}(\theta) = \lambda_{r,0}L_{r,0}(\theta) + \lambda_{r,1}L_{r,1}(\theta) + \lambda_{ic}L_{ic}(\theta) + \lambda_{bc}L_{bc}(\theta).
$$
合适的权重向量λ = (λr,0,λr,1,λic,λbc)对于增强约束神经网络的可训练性至关重要,但通过人工调优来搜索最优权向量是不可行的,因此论文采用不确定性加权方法来解决这个问题,引入一个下界约束项ε2:
$$
L_{total}(\theta;\omega) = \sum_{i = 1}^{m}{\frac{1}{2(\epsilon^{2} + \omega_{i}^{2})}L_{i}(\theta) + \log(\epsilon^{2} + \omega_{i}^{2})}.
$$
图2展示了不同的ε取值对结果的影响,实验中将ε设为0.01。
### (3)多尺度SIREN网络(MS-SIREN)
论文针对解的多频特性(如Poisson方程的解式含无限级数),设计了如图3所示的网络架构。

多尺度SIREN (MS-SIREN)将多尺度DNN与正弦激活函数相结合,如图3所示。在连续隐藏层之间添加跳跃连接以加速训练并提高精度,这使得神经网络更容易训练,因为它们有助于避免梯度消失问题。同时MS-SIREN拥有n个子网,对应不同的缩放系数{a1,...,an},分别捕获不同频率成分。
## 6. 复现结果
### 6.1 模型训练
实验中,首先对所要求解的点源时域麦克斯韦方程进行定义。代码如下所示:
```python
# 2d TE-mode Maxwell equation with 2nd-order Mur boundary condition and static initial electromagnetic field
class Maxwell2DMur(Problem):
def __init__(self, model, config, domain_name=None, bc_name=None, ic_name=None):
super(Maxwell2DMur, self).__init__()
self.domain_name = domain_name
self.bc_name = bc_name
self.ic_name = ic_name
self.model = model
# operators
self.grad = Grad(self.model)
self.reshape = ops.Reshape()
self.cast = ops.Cast()
self.mul = ops.Mul()
self.cast = ops.Cast()
self.split = ops.Split(1, 3)
self.concat = ops.Concat(1)
# constants
self.pi = Tensor(PI, mstype.float32)
self.eps_x = Tensor(EPS, mstype.float32)
self.eps_y = Tensor(EPS, mstype.float32)
self.mu_z = Tensor(MU, mstype.float32)
self.light_speed = Tensor(LIGHT_SPEED, mstype.float32)
# gauss-type pulse source
self.src_frq = config.get("src_frq", 1e+9)
self.tau = Tensor((2.3 ** 0.5) / (PI * self.src_frq), mstype.float32)
self.amp = Tensor(1.0, mstype.float32)
self.t0 = Tensor(3.65 * self.tau, mstype.float32)
# src space
self.x0 = Tensor(config["src_pos"][0], mstype.float32)
self.y0 = Tensor(config["src_pos"][1], mstype.float32)
self.sigma = Tensor(config["src_radius"] / 4.0, mstype.float32)
self.coord_min = config["coord_min"]
self.coord_max = config["coord_max"]
input_scale = config.get("input_scale", [1.0, 1.0, 2.5e+8]) # scale of input data to improve accuracy
output_scale = config.get("output_scale", [37.67303, 37.67303, 0.1]) # scale of output data to improve accuracy
self.s_x = Tensor(input_scale[0], mstype.float32)
self.s_y = Tensor(input_scale[1], mstype.float32)
self.s_t = Tensor(input_scale[2], mstype.float32)
self.s_ex = Tensor(output_scale[0], mstype.float32)
self.s_ey = Tensor(output_scale[1], mstype.float32)
self.s_hz = Tensor(output_scale[2], mstype.float32)
def smooth_src(self, x, y, t):
"""Incentive sources and Gaussian smoothing of Dirac function"""
source = self.amp * ops.exp(- ((t - self.t0) / self.tau)**2)
gauss = 1 / (2 * self.pi * self.sigma**2) * \
ops.exp(- ((x - self.x0)**2 + (y - self.y0)**2) / (2 * (self.sigma**2)))
return self.mul(source, gauss)
@ms.jit
def governing_equation(self, *output, **kwargs):
"""maxwell equation of TE mode wave"""
u = output[0]
# input data
data = kwargs[self.domain_name]
x = self.reshape(data[:, 0], (-1, 1))
y = self.reshape(data[:, 1], (-1, 1))
t = self.reshape(data[:, 2], (-1, 1))
# get gradients
dex_dxyt = self.grad(data, None, 0, u)
_, dex_dy, dex_dt = self.split(dex_dxyt)
dey_dxyt = self.grad(data, None, 1, u)
dey_dx, _, dey_dt = self.split(dey_dxyt)
dhz_dxyt = self.grad(data, None, 2, u)
dhz_dx, dhz_dy, dhz_dt = self.split(dhz_dxyt)
# residual of each equation
loss_a1 = (self.s_hz * dhz_dy) / (self.s_ex * self.s_t * self.eps_x)
loss_a2 = dex_dt / self.s_t
loss_b1 = -(self.s_hz * dhz_dx) / (self.s_ey * self.s_t * self.eps_y)
loss_b2 = dey_dt / self.s_t
loss_c1 = (self.s_ey * dey_dx - self.s_ex * dex_dy) / (self.s_hz * self.s_t * self.mu_z)
loss_c2 = - dhz_dt / self.s_t
src = self.smooth_src(x, y, t) / (self.s_hz * self.s_t * self.mu_z)
pde_r1 = loss_a1 - loss_a2
pde_r2 = loss_b1 - loss_b2
pde_r3 = loss_c1 - loss_c2 - src
# total residual
pde_r = ops.Concat(1)((pde_r1, pde_r2, pde_r3))
return pde_r
@ms.jit
def boundary_condition(self, *output, **kwargs):
"""2nd-order mur boundary condition"""
# network input and output
u = output[0]
data = kwargs[self.bc_name]
# specify each boundary
coord_min = self.coord_min
coord_max = self.coord_max
batch_size, _ = data.shape
attr = ms_np.zeros(shape=(batch_size, 4))
attr[:, 0] = ms_np.where(ms_np.isclose(data[:, 0], coord_min[0]), 1.0, 0.0)
attr[:, 1] = ms_np.where(ms_np.isclose(data[:, 0], coord_max[0]), 1.0, 0.0)
attr[:, 2] = ms_np.where(ms_np.isclose(data[:, 1], coord_min[1]), 1.0, 0.0)
attr[:, 3] = ms_np.where(ms_np.isclose(data[:, 1], coord_max[1]), 1.0, 0.0)
# get gradients
dex_dxyt = self.grad(data, None, 0, u)
_, dex_dy, _ = self.split(dex_dxyt)
dey_dxyt = self.grad(data, None, 1, u)
dey_dx, _, _ = self.split(dey_dxyt)
dhz_dxyt = self.grad(data, None, 2, u)
dhz_dx, dhz_dy, dhz_dt = self.split(dhz_dxyt)
# residual of each boundary
bc_r1 = dhz_dx / self.s_x - dhz_dt / (self.light_speed * self.s_x) + \
self.s_ex * self.light_speed * self.eps_x / (2 * self.s_hz * self.s_x) * dex_dy # 左边界
bc_r2 = dhz_dx / self.s_x + dhz_dt / (self.light_speed * self.s_x) - \
self.s_ex * self.light_speed * self.eps_x / (2 * self.s_hz * self.s_x) * dex_dy # 右边界
bc_r3 = dhz_dy / self.s_y - dhz_dt / (self.light_speed * self.s_y) - \
self.s_ey * self.light_speed * self.eps_y / (2 * self.s_hz * self.s_y) * dey_dx # 下边界
bc_r4 = dhz_dy / self.s_y + dhz_dt / (self.light_speed * self.s_y) + \
self.s_ey * self.light_speed * self.eps_y / (2 * self.s_hz * self.s_y) * dey_dx # 上边界
bc_r_all = self.concat((bc_r1, bc_r2, bc_r3, bc_r4))
bc_r = self.mul(bc_r_all, attr)
return bc_r
@ms.jit
def initial_condition(self, *output, **kwargs):
"""initial condition: u = 0"""
u = output[0]
return u
```
训练集通过在线采样生成,利用MindElec几何模块构造圆形(源项)与矩形(总区域)几何区域,从而支持对有源与无源区域进行区分采样。之后调用GeometryWithTime将几何空间与时间域绑定,动态生成采样点集成入Dataset,从而完成训练集的构造。在PINNs中,训练集不需要外部标签数据,因为损失函数是直接基于物理方程、初始条件和边界条件构建的。
训练过程的代码如下所示:
```python
import argparse
parser = argparse.ArgumentParser(description='Physics-Driven Maxwell Solver Training')
parser.add_argument('--model_dir', '-m', type=str, default="./ckpt",
help='Directory path for saving training models')
parser.add_argument('--summary_dir', '-s', type=str, default="./vision",
help='')
args = parser.parse_args()
def load_config():
"""load config"""
with open(os.path.join(os.path.dirname(__file__), "config.json")) as f:
config = json.load(f)
return config
def train(config):
"""training process"""
if config["random_sampling"]:
elec_train_dataset = create_random_dataset(config)
else:
elec_train_dataset = create_train_dataset(config["train_data_path"])
train_dataset = elec_train_dataset.create_dataset(batch_size=config["train_batch_size"],
shuffle=True,
prebatched_data=True,
drop_remainder=True)
steps_per_epoch = len(elec_train_dataset)
print("check train dataset size: ", len(elec_train_dataset))
np.save(f"{config['summary_path']}/train_data.npy", elec_train_dataset[0])
# define network
model = MultiScaleFCCell(config["input_size"],
config["output_size"],
layers=config["layers"],
neurons=config["neurons"],
input_scale=config["input_scale"],
residual=config["residual"],
weight_init=HeUniform(negative_slope=math.sqrt(5)),
act="sin",
num_scales=config["num_scales"],
amp_factor=config["amp_factor"],
scale_factor=config["scale_factor"]
)
model.cell_list.to_float(ms.float16)
mtl = MTLWeightedLossCell(num_losses=elec_train_dataset.num_dataset)
# define problem
train_prob = {}
for dataset in elec_train_dataset.all_datasets:
train_prob[dataset.name] = Maxwell2DMur(model=model, config=config,
domain_name=dataset.name + "_points",
ic_name=dataset.name + "_points",
bc_name=dataset.name + "_points")
print("check problem: ", train_prob)
train_constraints = Constraints(elec_train_dataset, train_prob)
# optimizer
params = model.trainable_params() + mtl.trainable_params()
lr_scheduler = MultiStepLR(config["lr"], config["milestones"], config["lr_gamma"],
steps_per_epoch, config["train_epoch"])
lr = lr_scheduler.get_lr()
optim = nn.Adam(params, learning_rate=Tensor(lr))
if config["load_ckpt"]:
param_dict = load_checkpoint(config["load_ckpt_path"])
load_param_into_net(model, param_dict)
load_param_into_net(mtl, param_dict)
# define solver
solver = Solver(model,
optimizer=optim,
mode="PINNs",
train_constraints=train_constraints,
test_constraints=None,
metrics={'l2': L2(), 'distance': nn.MAE()},
loss_fn='smooth_l1_loss',
loss_scale_manager=DynamicLossScaleManager(init_loss_scale=2 ** 10, scale_window=2000),
mtl_weighted_cell=mtl,
)
loss_time_callback = LossAndTimeMonitor(steps_per_epoch)
# Init a SummaryCollector callback instance, and use it in model.train or model.eval
specified = {"collect_metric": True, "histogram_regular": "^conv1.*|^conv2.*", "collect_graph": False,
"collect_dataset_graph": True}
summary_collector = SummaryCollector(summary_dir=config["summary_path"], collect_specified_data=specified,
collect_freq=1, keep_default_action=False, collect_tensor_freq=1)
callbacks = [loss_time_callback, summary_collector]
if config.get("train_with_eval", False):
inputs, label = get_test_data(config["test_data_path"])
predict_callback = PredictCallback(model, inputs, label, config=config, visual_fn=visual_result)
callbacks += [predict_callback]
if config["save_ckpt"]:
config_ck = CheckpointConfig(save_checkpoint_steps=10,
keep_checkpoint_max=2)
ckpoint_cb = ModelCheckpoint(prefix='ckpt_maxwell_frq1e9',
directory=config["save_ckpt_path"], config=config_ck)
callbacks += [ckpoint_cb]
solver.train(config["train_epoch"], train_dataset, callbacks=callbacks, dataset_sink_mode=False)
```
### 6.2 模型测试
构造出满足(x,y,t)到(Ex,Ey,Hz)映射关系的测试集,最后对结果进行评估,代码如下所示:
```python
def evaluation(config):
"""evaluation"""
# define network
model = MultiScaleFCCell(config["input_size"],
config["output_size"],
layers=config["layers"],
neurons=config["neurons"],
input_scale=config["input_scale"],
residual=config["residual"],
act="sin",
num_scales=config["num_scales"],
amp_factor=config["amp_factor"],
scale_factor=config["scale_factor"]
)
model.to_float(ms.float16)
model.input_scale.to_float(ms.float32)
# load parameters
param_dict = load_checkpoint(config["load_ckpt_path"])
convert_ckpt_dict = {}
for _, param in model.parameters_and_names():
convert_name1 = "model.cell_list." + param.name
convert_name2 = "model.cell_list." + ".".join(param.name.split(".")[2:])
for key in [convert_name1, convert_name2]:
if key in param_dict:
convert_ckpt_dict[param.name] = param_dict[key]
load_param_into_net(model, convert_ckpt_dict)
# load test dataset
inputs, label = get_test_data(config["test_data_path"])
outputs_size = config.get("outputs_size", 3)
inputs_size = config.get("inputs_size", 3)
outputs_scale = np.array(config["output_scale"], dtype=np.float32)
batch_size = config.get("test_batch_size", 8192 * 4)
dx = inputs[0, 1, 0, 0] - inputs[0, 0, 0, 0]
dy = inputs[0, 0, 1, 1] - inputs[0, 0, 0, 1]
dt = inputs[1, 0, 0, 2] - inputs[0, 0, 0, 2]
ex_inputs = copy.deepcopy(inputs)
ey_inputs = copy.deepcopy(inputs)
hz_inputs = copy.deepcopy(inputs)
ex_inputs = ex_inputs.reshape(-1, inputs_size)
ey_inputs = ey_inputs.reshape(-1, inputs_size)
hz_inputs = hz_inputs.reshape(-1, inputs_size)
ex_inputs[:, 1] += dy / 2.0
ex_inputs[:, 2] += dt / 2.0
ey_inputs[:, 0] += dx / 2.0
ey_inputs[:, 2] += dt / 2.0
inputs_each = [ex_inputs, ey_inputs, hz_inputs]
index = 0
prediction_each = np.zeros(label.shape)
prediction_each = prediction_each.reshape((-1, outputs_size))
time_beg = time.time()
while index < len(inputs_each[0]):
index_end = min(index + batch_size, len(inputs_each[0]))
# predict each physical quantity respectively in order to keep consistent with fdtd on staggered mesh
for i in range(outputs_size):
test_batch = Tensor(inputs_each[i][index: index_end, :], ms.float32)
predict = model(test_batch)
predict = predict.asnumpy()
prediction_each[index: index_end, i] = predict[:, i] * outputs_scale[i]
index = index_end
print("==================================================================================================")
print("predict total time: {} s".format(time.time() - time_beg))
prediction = prediction_each.reshape(label.shape)
np.save(f"{config['predict_path']}/predict.npy", prediction)
vision_path = config.get("vision_path", "./vision")
visual_result(inputs, label, prediction, path=vision_path, name="predict")
# get accuracy
error = label - prediction
l2_error_ex = np.sqrt(np.sum(np.square(error[..., 0]))) / np.sqrt(np.sum(np.square(label[..., 0])))
l2_error_ey = np.sqrt(np.sum(np.square(error[..., 1]))) / np.sqrt(np.sum(np.square(label[..., 1])))
l2_error_hz = np.sqrt(np.sum(np.square(error[..., 2]))) / np.sqrt(np.sum(np.square(label[..., 2])))
print("l2_error, Ex: ", l2_error_ex, ", Ey: ", l2_error_ey, ", Hz: ", l2_error_hz)
```
经过评估,以下截取了模拟过程中时间序列为238的图片,预测结果如下所示:

统计出Ex,Ey,Hz的L2相对误差为:
| 分量 | L2误差 |
| ------------- | ------ |
| Ex | 0.070 |
| Ey | 0.093 |
| Hz | 0.015 |
| mean | 0.060 |
与原论文中平均0.021的L2相对误差相比,复现出的结果明显误差较大,推测原因为复现过程中使用FDTD方法生成的参考解与实际值存在一定的误差,导致L2相对误差增大。