Blog Notes
A Collection of Papers and Code for Deep-Learning-Based Multi-Focus Image Fusion
Papers
1. Multi-focus image fusion with a deep convolutional neural network [CNN (IF 2017)] [paper] [code]
2. Ensemble of CNN for multi-focus image fusion [ECNN (IF 2019)] [paper] [code]
3. Multilevel features convolutional neural network for multifocus image fusion [MLFCNN (TCI 2019)] [paper]
4. DRPL: Deep Regression Pair Learning for Multi-Focus Image Fusion [DRPL (TIP 2020)] [paper] [code]
5. An α-Matte Boundary Defocus Model-Based Cascaded Network for Multi-Focus Image Fusion [MMF-Net (TIP 2020)] [paper] [code]
6. Towards Reducing Severe Defocus Spread Effects for Multi-Focus Image Fusion via an Optimization Based Strategy [MFF-SSIM (TCI 2020)] [paper] [code]
7. Structural Similarity Loss for Learning to Fuse Multi-Focus Images [MFNet (Sensors 2020)] [paper]
8. Global-Feature Encoding U-Net (GEU-Net) for Multi-Focus Image Fusion [GEU-Net (TIP 2021)] [paper] [code]
9. DTMNet: A Discrete Tchebichef Moments-Based Deep Neural Network for Multi-Focus Image Fusion [DTMNet (ICCV 2021)] [paper]
10. SMFuse: Multi-Focus Image Fusion Via Self-Supervised Mask-Optimization [SMFuse (TCI 2021)] [paper] [code]
11. Depth-Distilled Multi-focus Image Fusion [D2MFIF (TMM 2021)] [paper]
12. SESF-Fuse: an unsupervised deep model for multi-focus image fusion [SESF-Fuse (NCAA 2021)] [paper] [code]
Single-Point Localization
Notes
A Deep Learning Localization Method for Measuring Abdominal Muscle Dimensions in Ultrasound Images
Multi-Modal Sound Source Localization (SSL)
Sound source localization based on multi-task learning and image translation network [paper]
$$y_m = s * h_m + n_m$$
where $y_m \in \mathbb{R}^L$ is the signal received by the $m$-th microphone ($m \in \{1, 2, \cdots, M\}$), $s$ is the source signal, $h_m$ is the impulse response from the source to the $m$-th microphone, which captures the room reverberation, and $n_m$ is the noise at that microphone. $\mathbf{y} = [y_1, y_2, \cdots, y_M]^T \in \mathbb{R}^{M \times L}$ denotes the collection of signals received by all sensors, where $L$ is the audio length.
For $N$ arrays with $K$ microphones each, i.e., $M = NK$, $\mathbf{y}$ can be arranged as a $K \times N \times L$ tensor, so the quantity studied is $\mathbf{y} \in \mathbb{R}^{K \times N \times L}$.
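As a minimal sketch of this signal model and the $K \times N \times L$ arrangement (the toy impulse responses, the noise level, and the array-major ordering of the microphones are illustrative assumptions, not from the paper):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)

N, K, L = 4, 8, 16000          # N arrays, K mics per array, L samples
M = N * K                       # total number of microphones

s = rng.standard_normal(L)      # source signal s

# y_m = s * h_m + n_m : convolve the source with each mic's impulse response h_m
# (here a toy decaying random RIR) and add noise n_m.
y = np.empty((M, L))
for m in range(M):
    h_m = rng.standard_normal(256) * np.exp(-np.arange(256) / 64.0)
    n_m = 0.01 * rng.standard_normal(L)
    y[m] = fftconvolve(s, h_m, mode="full")[:L] + n_m

# Rearrange the M x L stack into the K x N x L tensor used in the notes,
# assuming the first K rows belong to array 1, the next K to array 2, etc.
y_tensor = y.reshape(N, K, L).transpose(1, 0, 2)
print(y_tensor.shape)           # (K, N, L)
```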
Feature Extraction
The number of snapshots can be expressed with the audio length $L$, the fast Fourier transform (FFT), and the window length $F$ as $S = L/(2F)$ (only the positive half of the frequency bins is considered). Taking the short-time Fourier transform (STFT) of $\mathbf{y}$ yields $Y \in \mathbb{C}^{S \times F \times K \times N}$, where $\mathbf{y}$ is computed for a single measurement. If the dataset contains $T$ independent measurements, the total number of frames in $\mathbf{Y}$ is $C = TS$.
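A possible sketch of building this STFT feature tensor, assuming non-overlapping frames of length $2F$ (so that $F$ positive-frequency bins give $S = L/(2F)$ frames) and dropping the DC bin; the exact windowing in the paper may differ:

```python
import numpy as np

def stft_features(y_tensor, F=256):
    """y_tensor: (K, N, L) real signals -> Y: (S, F, K, N) complex STFT frames."""
    K, N, L = y_tensor.shape
    win = 2 * F                          # frame length, so S = L / (2F)
    S = L // win
    # Split each signal into S non-overlapping frames of length 2F.
    frames = y_tensor[:, :, :S * win].reshape(K, N, S, win)
    # FFT each frame and keep only the F positive-frequency bins (DC dropped).
    spec = np.fft.rfft(frames, axis=-1)[..., 1:F + 1]    # (K, N, S, F)
    return spec.transpose(2, 3, 0, 1)                    # (S, F, K, N)

Y = stft_features(np.random.randn(8, 4, 16000), F=256)
print(Y.shape)   # (31, 256, 8, 4)
```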
DOA Feature Extraction
Assuming a uniform linear array (ULA) and a wideband signal, the magnitude of the conventional beamforming output at an arbitrary angle $\theta$ is
$$P(\theta) = \left| w_\theta^H Y_f^{ns} \right| = \left| \sum_{k=1}^{K} Y_f^{ns}[k] \, e^{j(2\pi k u \sin\theta f_0 / c)} \right|$$
where $w_\theta = \left[ e^{j(2\pi \cdot 1 \cdot u \sin\theta f_0 / c)}, e^{j(2\pi \cdot 2 \cdot u \sin\theta f_0 / c)}, \cdots, e^{j(2\pi \cdot K \cdot u \sin\theta f_0 / c)} \right]^T$; $u$, $f_0$, and $c$ denote the microphone spacing, the center frequency, and the speed of sound. Each point of $P(\theta)$ measures the similarity between the received signal and the expected beamformer response under free-space propagation. $Y_f^{ns} \in \mathbb{C}^{K \times 1}$ denotes the data in $Y$ for the $n$-th array, the $s$-th snapshot, and the $f$-th frequency bin.
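A sketch of computing $P(\theta)$ for a single array, snapshot, and frequency bin on an angle grid; the values of $u$, $f_0$, $c$, and the grid are placeholders:

```python
import numpy as np

def doa_feature(Y_f_ns, u=0.05, f0=4000.0, c=343.0,
                thetas=np.linspace(-90, 90, 181)):
    """Y_f_ns: (K,) complex data of one array/snapshot/frequency -> P(theta) on a grid."""
    K = Y_f_ns.shape[0]
    k = np.arange(1, K + 1)
    P = np.empty(len(thetas))
    for i, theta in enumerate(np.deg2rad(thetas)):
        # Steering vector w_theta, k-th entry exp(j 2*pi*k*u*sin(theta)*f0 / c)
        w = np.exp(1j * 2 * np.pi * k * u * np.sin(theta) * f0 / c)
        # Magnitude of the conventional beamformer output |w^H Y|
        P[i] = np.abs(np.vdot(w, Y_f_ns))
    return P

P_theta = doa_feature(np.random.randn(8) + 1j * np.random.randn(8))
print(P_theta.shape)   # (181,)
```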
Range Feature Extraction
The range feature is distance-based. Assuming that only the phase difference induced by distance is considered, and in analogy with the angle feature extraction, the conventional beamforming output at distance $d$ is
$$P(d) = \left| w_d^H Y_k^{ns} \right| = \left| \sum_{l=1}^{F} Y_k^{ns}[l] \, e^{j(2\pi f_l d / c)} \right|$$
where $w_d = \left[ e^{j(2\pi f_1 d / c)}, e^{j(2\pi f_2 d / c)}, \cdots, e^{j(2\pi f_F d / c)} \right]^T$; $Y_k^{ns} \in \mathbb{C}^{F \times 1}$ denotes the data of the $k$-th microphone in the $n$-th array at the $s$-th snapshot, and $f_l$ is the frequency of the $l$-th frequency bin.
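Similarly, a sketch of the range feature $P(d)$ for one microphone, array, and snapshot, scanning a grid of candidate distances (the frequency grid and distance range are placeholders):

```python
import numpy as np

def range_feature(Y_k_ns, freqs, c=343.0, dists=np.linspace(1.0, 50.0, 100)):
    """Y_k_ns: (F,) complex data of one mic/array/snapshot; freqs: (F,) bin frequencies f_l."""
    P = np.empty(len(dists))
    for i, d in enumerate(dists):
        # Steering vector over frequency, l-th entry exp(j 2*pi*f_l*d / c)
        w = np.exp(1j * 2 * np.pi * freqs * d / c)
        P[i] = np.abs(np.vdot(w, Y_k_ns))   # |w_d^H Y|
    return P

F = 256
freqs = np.linspace(100.0, 4000.0, F)       # placeholder frequency grid f_l
P_d = range_feature(np.random.randn(F) + 1j * np.random.randn(F), freqs)
print(P_d.shape)   # (100,)
```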
DOA-Range Feature Extraction
For each array, the joint output at angle $\theta$ and distance $d$ is
$$P_{ns}(\theta, d) = \left| w_\theta^H Y^{ns} w_d^* \right| = \left| \sum_{k=1}^{K} \sum_{l=1}^{F} Y_{kl} \, e^{j(2\pi k u \sin\theta f_0 / c)} \, e^{j(2\pi f_l d / c)} \right|$$
where $n \in \{1, 2, \cdots, N\}$, $s \in \{1, 2, \cdots, S\}$, and $Y^{ns}[k, l] = Y_{kl}$; $P_{ns}(\theta, d)$ is the magnitude output of the conventional beamformer, and $Y_{kl}$ denotes the STFT output of the $k$-th microphone at the $l$-th frequency bin. $(\cdot)^H$, $(\cdot)^T$, and $(\cdot)^*$ denote the Hermitian (conjugate) transpose, the transpose, and the complex conjugate, respectively.
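A vectorized sketch of the joint feature $P_{ns}(\theta, d)$, evaluating $\left| w_\theta^H Y^{ns} w_d^* \right|$ on a full angle-distance grid for one array and snapshot (grid ranges and constants are placeholders):

```python
import numpy as np

def doa_range_feature(Y_ns, freqs, u=0.05, f0=4000.0, c=343.0,
                      thetas=np.linspace(-90, 90, 91),
                      dists=np.linspace(1.0, 50.0, 50)):
    """Y_ns: (K, F) complex STFT data of the n-th array at the s-th snapshot
    -> P_ns(theta, d) evaluated on an angle x distance grid."""
    K, F = Y_ns.shape
    k = np.arange(1, K + 1)
    # Angle steering matrix, row t: exp(j 2*pi*k*u*sin(theta_t)*f0 / c), shape (T, K)
    W_theta = np.exp(1j * 2 * np.pi * u * f0 / c * np.outer(np.sin(np.deg2rad(thetas)), k))
    # Range steering matrix, row i: exp(j 2*pi*f_l*d_i / c), shape (D, F)
    W_d = np.exp(1j * 2 * np.pi / c * np.outer(dists, freqs))
    # P_ns(theta, d) = |w_theta^H Y_ns w_d^*| for every (theta, d) pair.
    return np.abs(W_theta.conj() @ Y_ns @ W_d.conj().T)   # (T, D)

K, F = 8, 256
Y_ns = np.random.randn(K, F) + 1j * np.random.randn(K, F)
P = doa_range_feature(Y_ns, freqs=np.linspace(100.0, 4000.0, F))
print(P.shape)   # (91, 50)
```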
Localization Target
The distance between a predicted position $(x', y')$ and the true position $(x, y)$ is
$$d(x', y') = \sqrt{(x' - x)^2 + (y' - y)^2}$$
and its value in the target image is
$$l(x', y') = e^{-d(x', y')^2 / \sigma}$$
where $\sigma$ is a hyperparameter controlling the decay rate; when $d(x', y') = 0$, $l(x', y')$ attains its maximum value of 1.
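A small sketch of building such a target image on a pixel grid, with the grid size and $\sigma$ chosen arbitrarily:

```python
import numpy as np

def target_image(true_xy, grid_shape=(64, 64), sigma=10.0):
    """Gaussian-like localization target: value exp(-d^2 / sigma) at each grid cell."""
    H, W = grid_shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Squared distance from every pixel (x', y') to the true position (x, y).
    d2 = (xs - true_xy[0]) ** 2 + (ys - true_xy[1]) ** 2
    return np.exp(-d2 / sigma)

label = target_image(true_xy=(20, 40), grid_shape=(64, 64), sigma=10.0)
print(label.max(), label.shape)   # 1.0 at the true position, (64, 64)
```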