My impression is that this task does have a distinctive property: the input data should follow something close to a uniform distribution (the pixel coordinates form a regular grid).
This strategy is appealing because coordinate-based MLPs are amenable to gradient-based optimization and machine learning, and can be much more compact than grid-sampled representations; they have achieved state-of-the-art results across a variety of tasks.
In Section 6, we proceed with a Gaussian distribution for the higher-dimensional experiments, treating its scale as a hyperparameter tuned on a validation dataset. The mapping requires randomly sampling parameters from a Gaussian distribution, and the standard deviation of that distribution has a large effect on the results.
All tasks are evaluated with PSNR except 3D shape, which uses IoU (higher is better for all).
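For reference, the Gaussian Fourier feature mapping whose scale is being tuned here is (in the paper's notation, with each entry of B drawn i.i.d. from a Gaussian with standard deviation σ):

\gamma(\mathbf{v}) = \left[\cos(2\pi\mathbf{B}\mathbf{v}),\ \sin(2\pi\mathbf{B}\mathbf{v})\right]^{\mathsf{T}},
\qquad \mathbf{B}\in\mathbb{R}^{m\times d},\quad B_{ij}\sim\mathcal{N}(0,\sigma^{2})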
Notes on "On the Spectral Bias of Deep Neural Networks"
Why are deep neural networks not prone to overfitting? Inherent spectral bias: On the Spectral Bias of Neural Networks
GitHub code
Paper (18 Jun 2020)
Paper notes 1
Paper notes 2
video
Related links:
NeRF
https://dellaert.github.io/NeRF/
What are the highlights of ICLR 2019? We finally proved that, given enough width, a randomly initialized neural network trained with gradient descent really can fit all the data!
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
Our work is motivated by the widespread use of coordinate-based MLPs to represent a variety of visual signals, including images [38] and 3D scenes [24, 27, 32]. In particular, our analysis is intended to shed light on experimental results showing that a coordinate input mapping using sinusoids with log-spaced, axis-aligned frequencies (which they call a "positional encoding") improves the performance of coordinate-based MLPs on the tasks of novel view synthesis from 2D images [27] and protein structure modeling from cryo-electron microscopy [44]. We analyze this technique to show that it corresponds to a modification of the MLP's NTK, and we show that other, non-axis-aligned frequency distributions can outperform this positional encoding.
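Concretely, one common instantiation of this axis-aligned, log-spaced positional encoding is the one used in NeRF [27], applied to each input coordinate p:

\gamma(p) = \big(\sin(2^{0}\pi p),\ \cos(2^{0}\pi p),\ \ldots,\ \sin(2^{L-1}\pi p),\ \cos(2^{L-1}\pi p)\big)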
Earlier work in natural language processing and time-series analysis [18, 39, 42] used similar positional encodings to represent time or 1D position. In particular, Xu et al. [42] used random Fourier features (RFF) [1] to approximate stationary kernels with a sinusoidal input mapping, and proposed techniques for tuning the mapping parameters. Our work extends this by directly interpreting such a mapping as a modification of the resulting network's NTK. Additionally, we address the embedding of multidimensional coordinates, which is necessary for vision and graphics tasks.
To analyze the effect of applying a Fourier feature mapping to input coordinates before they pass through an MLP, we rely on recent theoretical work that models neural networks, in the limits of infinite width and infinitesimal learning rate, as kernel regression using the NTK [2, 5, 11, 16, 20].
In particular, we use the analyses of Lee et al. [20] and Arora et al. [2], which show that during gradient descent the network's output remains close to that of a linear dynamical system whose convergence rate is governed by the eigenvalues of the NTK matrix [2, 3, 5, 20, 43]. Analysis of the NTK's eigendecomposition shows that its eigenvalue spectrum decays rapidly as a function of frequency, which explains the widely observed "spectral bias" of deep networks toward learning low-frequency functions [3, 4, 33].
We leverage this analysis to consider the effect of adding a Fourier feature mapping before the network, and we show that such a mapping has a significant effect on the NTK's eigenvalue spectrum and on the corresponding network's convergence properties in practice.
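Stated informally (following [2, 20]): if λ_i and v_i are the eigenvalues and eigenvectors of the NTK matrix, η the learning rate, y the training targets, and u(t) the network output at step t, the training residual evolves approximately as

\mathbf{y} - \mathbf{u}(t) \approx \sum_{i} (1-\eta\lambda_{i})^{t}\,(\mathbf{v}_{i}^{\mathsf{T}}\mathbf{y})\,\mathbf{v}_{i}

so components of the target aligned with large-eigenvalue (low-frequency) eigenvectors are fit much faster than those aligned with small-eigenvalue (high-frequency) ones.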
Below is the core code from the official implementation. In this example, for each input image coordinate (x, y), the model predicts the associated color (r, g, b).
In this task, we train an MLP to regress from a 2D input pixel coordinate to the corresponding RGB value of an image. For each test image, we train the MLP on a regularly spaced grid containing 1/4 of the pixels (the x_test[::2, ::2] subsampling in the code below) and report test error on the remaining pixels. We compare input mappings on a dataset of natural images and a dataset of text images.
B_dict = {}
# Standard network - no mapping
B_dict['none'] = None
# Basic mapping
B_dict['basic'] = np.eye(2)
# Three different scales of Gaussian Fourier feature mappings
B_gauss = random.normal(rand_key, (mapping_size, 2))
for scale in [1., 10., 100.]:
    B_dict[f'gauss_{scale}'] = B_gauss * scale
# This should take about 2-3 minutes
outputs = {}
for k in tqdm(B_dict):
    outputs[k] = train_model(network_size, learning_rate, iters, B_dict[k], train_data, test_data)
init_fn, apply_fn = make_network(*network_size)
apply_fn(params, input_mapping(x, B))
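The input_mapping used above is not shown in this excerpt; in the official demo it is essentially the following sin/cos concatenation (a sketch; in the demo np is jax.numpy, and B=None falls back to raw coordinates for the 'none' baseline):

import jax.numpy as np  # the demo aliases jax.numpy as np

def input_mapping(x, B):
    # No mapping: feed the raw (x, y) coordinates to the network
    if B is None:
        return x
    # Project coordinates by B, then concatenate sin and cos features
    x_proj = (2. * np.pi * x) @ B.T
    return np.concatenate([np.sin(x_proj), np.cos(x_proj)], axis=-1)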
# JAX network definition
def make_network(num_layers, num_channels):
    layers = []
    for i in range(num_layers - 1):
        layers.append(stax.Dense(num_channels))
        layers.append(stax.Relu)
    layers.append(stax.Dense(3))
    layers.append(stax.Sigmoid)
    return stax.serial(*layers)
Pytorch implementation and comparison of Fourier Feature Networks and Sinusoidal Representation Networks
MLP(
  (layers): Sequential(
    (0): Linear(in_features=512, out_features=256, bias=True)
    (1): ReLU(inplace=True)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU(inplace=True)
    (4): Linear(in_features=256, out_features=256, bias=True)
    (5): ReLU(inplace=True)
    (6): Linear(in_features=256, out_features=3, bias=True)
  )
)
# Fourier feature mapping: project coords by B, then concatenate sin and cos
def map_x(x, B):
    xp = torch.matmul(2 * math.pi * x, B)
    return torch.cat([torch.sin(xp), torch.cos(xp)], dim=-1)

# Variant that returns only the sin features (half the output dimension)
def map_x(x, B):
    xp = torch.matmul(2 * math.pi * x, B)
    return torch.sin(xp)
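Note the design difference: the sin/cos version doubles the feature dimension relative to mapping_size and matches the paper's Gaussian mapping, while the sin-only variant keeps the output at mapping_size (presumably for the SIREN comparison mentioned above).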
====================================================
# Image taken from authors' colab demo: https://colab.research.google.com/github/tancik/fourier-feature-networks/blob/master/Demo.ipynb
import numpy as np
import matplotlib.pyplot as plt
# from tqdm.notebook import tqdm as tqdm
from tqdm import tqdm
import os, imageio
from imageio import imread,imsave
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
# Download image, take a square crop from the center
image_url = 'https://live.staticflickr.com/7492/15677707699_d9d67acf9d_b.jpg'
img = imageio.imread(image_url)[..., :3] / 255.
c = [img.shape[0]//2, img.shape[1]//2]
r = 256
img = img[c[0]-r:c[0]+r, c[1]-r:c[1]+r]
# plt.imshow(img)
# plt.show()
# Create input pixel coordinates in the unit square
coords = np.linspace(0, 1, img.shape[0], endpoint=False)
x_test = np.stack(np.meshgrid(coords, coords), -1)
test_data = [x_test, img]
train_data = [x_test[::2,::2], img[::2,::2]]
#%%
class MLP(nn.Module):
    def __init__(self, depth=4, mapping_size=512, hidden_size=256):
        super().__init__()
        layers = []
        layers.append(nn.Linear(mapping_size, hidden_size))
        layers.append(nn.ReLU(inplace=True))
        # for _ in range(depth-2):
        #     layers.append(nn.Linear(hidden_size, hidden_size))
        #     layers.append(nn.ReLU(inplace=True))
        # note: `depth` is unused while the hidden layers above stay commented out
        layers.append(nn.Linear(hidden_size, 3))
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return torch.sigmoid(self.layers(x))
#%%
xb,yb = torch.tensor(train_data[0]).reshape(-1,2),torch.tensor(train_data[1]).reshape(-1,3)
x_test,y_test = torch.tensor(test_data[0]).reshape(-1,2),torch.tensor(test_data[1]).reshape(-1,3)
xb,yb,x_test,y_test = xb.float().cuda(),yb.float().cuda(),x_test.float().cuda(),y_test.float().cuda()
#%% md
# Original Mapping
#%%
def map_x(x,B):
xp = torch.matmul(2*math.pi*x,B)
return torch.cat([torch.sin(xp),torch.cos(xp)],dim=-1)
#%%
model = MLP().cuda()
opt = torch.optim.Adam(model.parameters(),lr=1e-4)
loss = nn.MSELoss()
B = torch.randn(2, 256).cuda() * 10  # Gaussian Fourier features with scale 10
xt = map_x(xb, B)  # [65536,2] @ [2,256] -> [65536,512] after sin/cos concat
os.makedirs('gaussian', exist_ok=True)  # ensure the output directory exists
for i in tqdm(range(1000)):
    ypred = model(xt)
    l = loss(ypred, yb)
    opt.zero_grad()
    l.backward()
    opt.step()
    # Save the current prediction on the full test grid each iteration
    model.eval()
    with torch.no_grad():
        ypreds = model(map_x(x_test, B))
        ypreds = ypreds.reshape(512, 512, 3)
        imsave('gaussian/gaussian' + str(i) + '.png',
               (ypreds * 255).cpu().numpy().astype(np.uint8))
    model.train()
# Preds
model.cpu().eval()
with torch.no_grad():
    ypreds = model(map_x(x_test.cpu(), B.cpu()))
    ypreds = ypreds.reshape(512, 512, 3)
imsave('gaussian.png', (ypreds * 255).numpy().astype(np.uint8))
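The quantitative metric for this task is PSNR (see above); a minimal sketch for computing it from the tensors in this script (assumes predictions and targets in [0, 1]; strictly, the paper reports error on the held-out pixels rather than the full image):

# PSNR = -10 * log10(MSE) for signals normalized to [0, 1]
mse = torch.mean((ypreds.reshape(-1, 3) - y_test.cpu()) ** 2)
psnr = -10.0 * torch.log10(mse)
print(f'PSNR: {psnr.item():.2f} dB')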
========================================================================
# Noise-input ablation: identical to the script above (same imports, image
# download, train/test tensors, MLP class, and map_x); only the lines below
# change, replacing the mapped inputs with pure random noise.
model = MLP().cuda()
opt = torch.optim.Adam(model.parameters(),lr=1e-4)
loss = nn.MSELoss()
xb = torch.randn(65536, 2).cuda()        # pixel coordinates replaced by random noise
B = torch.randn(2, 256).cuda() * 10
xt = torch.randn(65536, 256 * 2).cuda()  # was: map_x(xb, B) -> [65536, 512]; also replaced by noise
os.makedirs('gaussian', exist_ok=True)
for i in tqdm(range(1000)):
    ypred = model(xt)
    l = loss(ypred, yb)
    opt.zero_grad()
    l.backward()
    opt.step()
    model.eval()
    with torch.no_grad():
        ypreds = model(xt)  # was: model(map_x(x_test, B))
        ypreds = ypreds.reshape(256, 256, 3)
        imsave('gaussian/gaussian' + str(i) + '.png',
               (ypreds * 255).cpu().numpy().astype(np.uint8))
    model.train()
# Variant of the MLP with a configurable output dimension
class MLP(nn.Module):
    def __init__(self, outdim=3, mapping_size=512, hidden_size=256):
        super().__init__()
        layers = []
        layers.append(nn.Linear(mapping_size, hidden_size))
        layers.append(nn.ReLU(inplace=True))
        # for _ in range(depth-2):
        #     layers.append(nn.Linear(hidden_size, hidden_size))
        #     layers.append(nn.ReLU(inplace=True))
        layers.append(nn.Linear(hidden_size, outdim))
        self.layers = nn.Sequential(*layers)

    def forward(self, x):
        return torch.sigmoid(self.layers(x))
[1] Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. NeurIPS, 2007.