[Implementing a Convolutional Neural Network in Python] The Conv2D Convolution Layer (with stride and padding)


There is no need to go over how the convolution operation itself works here. Instead, let's walk through the code step by step to see how a convolution layer is implemented.

Code source: https://github.com/eriklindernoren/ML-From-Scratch

 

Let's start with the basic building-block functions. First, determine_padding(filter_shape, output_shape="same"):

import math


def determine_padding(filter_shape, output_shape="same"):
    # No padding if output_shape == "valid"
    if output_shape == "valid":
        return (0, 0), (0, 0)
    # Pad so that the output shape is the same as input shape (given that stride=1)
    elif output_shape == "same":
        filter_height, filter_width = filter_shape

        # Derived from:
        # output_height = (height + pad_h - filter_height) / stride + 1
        # In this case output_height = height and stride = 1. This gives the
        # expression for the padding below.
        pad_h1 = int(math.floor((filter_height - 1) / 2))
        pad_h2 = int(math.ceil((filter_height - 1) / 2))
        pad_w1 = int(math.floor((filter_width - 1) / 2))
        pad_w2 = int(math.ceil((filter_width - 1) / 2))

        return (pad_h1, pad_h2), (pad_w1, pad_w2)

Explanation: the padding amounts (top, bottom, left, right) are computed from the shape of the convolution kernel and the padding mode; output_shape="valid" means no padding is added.

Notes:

  • math.floor(x) returns the largest integer less than or equal to x.
  • math.ceil(x) returns the smallest integer greater than or equal to x.

Let's plug in actual parameters and look at the output:

pad_h, pad_w = determine_padding((3, 3), output_shape="same")

Output: (1, 1), (1, 1)
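As a quick extra check (a minimal sketch, assuming the determine_padding function above is in scope), an even-sized kernel shows why both math.floor and math.ceil are needed: the padding comes out asymmetric.

pad_h, pad_w = determine_padding((2, 2), output_shape="same")
print(pad_h, pad_w)   # (0, 1) (0, 1)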

Next is the image_to_column(images, filter_shape, stride, output_shape='same') function:

import numpy as np


def image_to_column(images, filter_shape, stride, output_shape='same'):
    filter_height, filter_width = filter_shape
    pad_h, pad_w = determine_padding(filter_shape, output_shape)

    # Add padding to the image
    images_padded = np.pad(images, ((0, 0), (0, 0), pad_h, pad_w), mode='constant')

    # Calculate the indices where the dot products are to be applied between weights
    # and the image
    k, i, j = get_im2col_indices(images.shape, filter_shape, (pad_h, pad_w), stride)

    # Get content from image at those indices
    cols = images_padded[:, k, i, j]
    channels = images.shape[1]
    # Reshape content into column shape
    cols = cols.transpose(1, 2, 0).reshape(filter_height * filter_width * channels, -1)
    return cols

Explanation: the input images has shape [batch_size, channel, height, width], similar to the image format PyTorch expects. In other words, images_padded pads only the height and width dimensions. The function calls get_im2col_indices(), so let's look at what that does:

def get_im2col_indices(images_shape, filter_shape, padding, stride=1):
    # First figure out what the size of the output should be
    batch_size, channels, height, width = images_shape
    filter_height, filter_width = filter_shape
    pad_h, pad_w = padding
    out_height = int((height + np.sum(pad_h) - filter_height) / stride + 1)
    out_width = int((width + np.sum(pad_w) - filter_width) / stride + 1)

    i0 = np.repeat(np.arange(filter_height), filter_width)
    i0 = np.tile(i0, channels)
    i1 = stride * np.repeat(np.arange(out_height), out_width)
    j0 = np.tile(np.arange(filter_width), filter_height * channels)
    j1 = stride * np.tile(np.arange(out_width), out_height)
    i = i0.reshape(-1, 1) + i1.reshape(1, -1)
    j = j0.reshape(-1, 1) + j1.reshape(1, -1)
    k = np.repeat(np.arange(channels), filter_height * filter_width).reshape(-1, 1)

    return (k, i, j)

Explanation: this is hard to understand in isolation, so again let's step through it with actual parameters.

get_im2col_indices((1,3,32,32), (3,3), ((1,1),(1,1)), stride=1)

Explanation: let's look at how each variable evolves. out_width and out_height need no further explanation; they are the width and height of the feature map produced by the convolution. (The sketch after the list reproduces these values.)

  • i0: np.repeat(np.arange(3), 3) gives [0, 0, 0, 1, 1, 1, 2, 2, 2]
  • i0: np.tile([0,0,0,1,1,1,2,2,2], 3) gives [0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2], shape (27,)
  • i1: 1*np.repeat(np.arange(32), 32) gives [0,0,0,......,31,31,31], shape (1024,)
  • j0: np.tile(np.arange(3), 3*3) gives [0,1,2,0,1,2,......], shape (27,)
  • j1: 1*np.tile(np.arange(32), 32) gives [0,1,2,3,......,0,1,2,......,29,30,31], shape (1024,)
  • i: i0.reshape(-1, 1) + i1.reshape(1, -1), shape (27, 1024)
  • j: j0.reshape(-1, 1) + j1.reshape(1, -1), shape (27, 1024)
  • k: np.repeat(np.arange(3), 3*3).reshape(-1, 1), shape (27, 1)
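To verify these values yourself, here is a small sketch that reproduces the intermediate arrays for the call above (pure NumPy, using the same expressions as the repository code):

import numpy as np

# Same parameters as get_im2col_indices((1, 3, 32, 32), (3, 3), ((1, 1), (1, 1)), stride=1)
channels, filter_height, filter_width = 3, 3, 3
out_height = out_width = 32
stride = 1

i0 = np.repeat(np.arange(filter_height), filter_width)              # (9,)
i0 = np.tile(i0, channels)                                          # (27,)
i1 = stride * np.repeat(np.arange(out_height), out_width)           # (1024,)
j0 = np.tile(np.arange(filter_width), filter_height * channels)     # (27,)
j1 = stride * np.tile(np.arange(out_width), out_height)             # (1024,)

i = i0.reshape(-1, 1) + i1.reshape(1, -1)                           # (27, 1024)
j = j0.reshape(-1, 1) + j1.reshape(1, -1)                           # (27, 1024)
k = np.repeat(np.arange(channels), filter_height * filter_width).reshape(-1, 1)  # (27, 1)

print(i0.shape, i1.shape, j0.shape, j1.shape, i.shape, j.shape, k.shape)
# (27,) (1024,) (27,) (1024,) (27, 1024) (27, 1024) (27, 1)

# The first column corresponds to the top-left output position: it lists the 27
# (channel, row, col) coordinates of the 3x3 patch across the 3 channels.
print(k[:, 0], i[:, 0], j[:, 0])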

Notes:

  • numpy.pad(array, pad_width, mode, **kwargs): array is the data to be padded, pad_width specifies how much to pad on each side, and mode specifies how to pad. The default mode is 'constant', which fills with 0 unless another value is given via constant_values.
  • numpy.arange(start, stop, step, dtype=None): for example, numpy.arange(3) gives [0, 1, 2].
  • numpy.repeat(array, repeats, axis=None): for example, numpy.repeat([0, 1, 2], 3) gives [0, 0, 0, 1, 1, 1, 2, 2, 2].
  • numpy.tile(array, reps): for example, numpy.tile([0, 1, 2], 3) gives [0, 1, 2, 0, 1, 2, 0, 1, 2].
  • For more elaborate usage you still need to consult the documentation; only what is relevant to this code is listed here.

Even with these shapes it is still not that easy to follow, so let's keep going. The key point is that k indexes the channels, i indexes the height of the feature map, and j indexes its width. Convolving one channel with a 3×3 kernel touches 3×3 = 9 pixels per output position; with 3 channels that makes 9×3 = 27 pixels in total. The image is 32×32, i.e. 1024 output positions. Now go back to these three lines of code:

cols = images_padded[:, k, i, j]
channels = images.shape[1]
# Reshape content into column shape
cols = cols.transpose(1, 2, 0).reshape(filter_height * filter_width * channels, -1)

images_padded has shape (1, 3, 34, 34), so cols = images_padded[:, k, i, j] has shape (1, 27, 1024).

channels is 3.

Finally, cols = cols.transpose(1, 2, 0).reshape(3*3*3, -1) has shape (27, 1024).

When the batch size is not 1 but, say, 64, the final cols has shape (27, 1024×64) = (27, 65536).
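A quick check of these column shapes (a minimal sketch, assuming image_to_column and its helpers above are defined), for a single image and for a batch of 64:

single = np.random.randn(1, 3, 32, 32)
batch = np.random.randn(64, 3, 32, 32)

cols_single = image_to_column(single, (3, 3), stride=1, output_shape='same')
cols_batch = image_to_column(batch, (3, 3), stride=1, output_shape='same')

print(cols_single.shape)  # (27, 1024)
print(cols_batch.shape)   # (27, 65536) = (27, 1024 * 64)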

Finally, we get to the implementation of the convolution layer itself.

First there is a generic Layer base class; different layers such as convolution, pooling and batch normalization layers are implemented by inheriting from it:

class Layer(object):

    def set_input_shape(self, shape):
        """ Sets the shape that the layer expects of the input in the forward
        pass method """
        self.input_shape = shape

    def layer_name(self):
        """ The name of the layer. Used in model summary. """
        return self.__class__.__name__

    def parameters(self):
        """ The number of trainable parameters used by the layer """
        return 0

    def forward_pass(self, X, training):
        """ Propagates the signal forward in the network """
        raise NotImplementedError()

    def backward_pass(self, accum_grad):
        """ Propagates the accumulated gradient backwards in the network.
        If the layer has trainable weights then these weights are also tuned
        in this method. As input (accum_grad) it receives the gradient with
        respect to the output of the layer and returns the gradient with
        respect to the output of the previous layer. """
        raise NotImplementedError()

    def output_shape(self):
        """ The shape of the output produced by forward_pass """
        raise NotImplementedError()

The methods that every subclass must implement are declared in the base class; if a subclass does not override them, calling them raises NotImplementedError().
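As a small illustration of that contract (a hypothetical subclass for this article, not part of the repository), a do-nothing layer only needs to fill in forward_pass, backward_pass and output_shape:

class Identity(Layer):
    """ Hypothetical example: passes its input and gradient through unchanged. """

    def forward_pass(self, X, training=True):
        return X

    def backward_pass(self, accum_grad):
        return accum_grad

    def output_shape(self):
        return self.input_shape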

With the base class in place, Conv2D can be implemented:

import copy


class Conv2D(Layer):
    """A 2D Convolution Layer.

    Parameters:
    -----------
    n_filters: int
        The number of filters that will convolve over the input matrix. The number
        of channels of the output shape.
    filter_shape: tuple
        A tuple (filter_height, filter_width).
    input_shape: tuple
        The shape of the expected input of the layer. (batch_size, channels, height, width)
        Only needs to be specified for first layer in the network.
    padding: string
        Either 'same' or 'valid'. 'same' results in padding being added so that the
        output height and width matches the input height and width. For 'valid' no
        padding is added.
    stride: int
        The stride length of the filters during the convolution over the input.
    """
    def __init__(self, n_filters, filter_shape, input_shape=None, padding='same', stride=1):
        self.n_filters = n_filters
        self.filter_shape = filter_shape
        self.padding = padding
        self.stride = stride
        self.input_shape = input_shape
        self.trainable = True

    def initialize(self, optimizer):
        # Initialize the weights
        filter_height, filter_width = self.filter_shape
        channels = self.input_shape[0]
        limit = 1 / math.sqrt(np.prod(self.filter_shape))
        self.W = np.random.uniform(-limit, limit,
            size=(self.n_filters, channels, filter_height, filter_width))
        self.w0 = np.zeros((self.n_filters, 1))
        # Weight optimizers
        self.W_opt = copy.copy(optimizer)
        self.w0_opt = copy.copy(optimizer)

    def parameters(self):
        return np.prod(self.W.shape) + np.prod(self.w0.shape)

    def forward_pass(self, X, training=True):
        batch_size, channels, height, width = X.shape
        self.layer_input = X
        # Turn image shape into column shape
        # (enables dot product between input and weights)
        self.X_col = image_to_column(X, self.filter_shape, stride=self.stride, output_shape=self.padding)
        # Turn weights into column shape
        self.W_col = self.W.reshape((self.n_filters, -1))
        # Calculate output
        output = self.W_col.dot(self.X_col) + self.w0
        # Reshape into (n_filters, out_height, out_width, batch_size)
        output = output.reshape(self.output_shape() + (batch_size, ))
        # Redistribute axes so that batch size comes first
        return output.transpose(3, 0, 1, 2)

    def backward_pass(self, accum_grad):
        # Reshape accumulated gradient into column shape
        accum_grad = accum_grad.transpose(1, 2, 3, 0).reshape(self.n_filters, -1)

        if self.trainable:
            # Take dot product between column shaped accum. gradient and column shaped
            # layer input to determine the gradient at the layer with respect to layer weights
            grad_w = accum_grad.dot(self.X_col.T).reshape(self.W.shape)
            # The gradient with respect to bias terms is the sum, similarly to the Dense layer
            grad_w0 = np.sum(accum_grad, axis=1, keepdims=True)

            # Update the layer's weights
            self.W = self.W_opt.update(self.W, grad_w)
            self.w0 = self.w0_opt.update(self.w0, grad_w0)

        # Recalculate the gradient which will be propagated back to the previous layer
        accum_grad = self.W_col.T.dot(accum_grad)
        # Reshape from column shape to image shape
        accum_grad = column_to_image(accum_grad,
            self.layer_input.shape, self.filter_shape, stride=self.stride,
            output_shape=self.padding)

        return accum_grad

    def output_shape(self):
        channels, height, width = self.input_shape
        pad_h, pad_w = determine_padding(self.filter_shape, output_shape=self.padding)
        output_height = (height + np.sum(pad_h) - self.filter_shape[0]) / self.stride + 1
        output_width = (width + np.sum(pad_w) - self.filter_shape[1]) / self.stride + 1
        return self.n_filters, int(output_height), int(output_width)

Assume the input again has shape (1, 3, 32, 32) and we convolve with 16 kernels of size 3×3. Then self.W has shape (16, 3, 3, 3) and self.w0 has shape (16, 1).

self.X_col has shape (27, 1024) and self.W_col has shape (16, 27), so output = self.W_col.dot(self.X_col) + self.w0 has shape (16, 1024).
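The tail end of forward_pass then reshapes and transposes this result back into image layout. A minimal sketch of just that shape path, using random data with the shapes discussed above (batch size 1, 16 filters, 32×32 output):

batch_size, n_filters, out_h, out_w = 1, 16, 32, 32

W_col = np.random.randn(n_filters, 27)                    # (16, 27)
X_col = np.random.randn(27, out_h * out_w * batch_size)   # (27, 1024)
w0 = np.zeros((n_filters, 1))

output = W_col.dot(X_col) + w0                            # (16, 1024)
output = output.reshape((n_filters, out_h, out_w, batch_size))  # (16, 32, 32, 1)
output = output.transpose(3, 0, 1, 2)                     # (1, 16, 32, 32)
print(output.shape)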

Finally, it is used like this:

image = np.random.randint(0, 255, size=(1, 3, 32, 32)).astype(np.uint8)
input_shape = image.squeeze().shape
conv2d = Conv2D(16, (3, 3), input_shape=input_shape, padding='same', stride=1)
conv2d.initialize(None)
output = conv2d.forward_pass(image, training=True)
print(output.shape)

Output: (1, 16, 32, 32)

Counting the parameters:

print(conv2d.parameters())

Output: 448

That is, 448 = 3×3×3×16 + 16.
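The same count as a one-line check (pure arithmetic, nothing beyond the numbers above): 16 filters of size 3×3 over 3 input channels, plus one bias per filter.

n_filters, channels, fh, fw = 16, 3, 3, 3
print(fh * fw * channels * n_filters + n_filters)  # 448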

Next, a padding='valid' case:

image = np.random.randint(0, 255, size=(1, 3, 32, 32)).astype(np.uint8)
input_shape = image.squeeze().shape
conv2d = Conv2D(16, (3, 3), input_shape=input_shape, padding='valid', stride=1)
conv2d.initialize(None)
output = conv2d.forward_pass(image, training=True)
print(output.shape)
print(conv2d.parameters())

Note that the shape of cols changes, because the convolution output is now (1, 16, 30, 30): (32 + 0 − 3)/1 + 1 = 30, and 30×30 = 900 output positions.

Output:

cols shape: (27, 900)

(1,16,30,30)

448

And finally, a case with a larger stride:

image = np.random.randint(0, 255, size=(1, 3, 32, 32)).astype(np.uint8)
input_shape = image.squeeze().shape
conv2d = Conv2D(16, (3, 3), input_shape=input_shape, padding='valid', stride=2)
conv2d.initialize(None)
output = conv2d.forward_pass(image, training=True)
print(output.shape)
print(conv2d.parameters())

cols shape: (27, 225)

(1,16,15,15)

448

A few final remarks:

Formula for the number of parameters of a convolution layer: params = kernel height × kernel width × number of input channels × number of kernels + bias terms (one per kernel).

Formula for the image size after convolution:

output height = (input height + 2 × padding height − kernel height) / stride + 1

output width = (input width + 2 × padding width − kernel width) / stride + 1
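Applied to the stride-2, 'valid' case above (32×32 input, 3×3 kernel, no padding), the formula gives a non-integer value that the code truncates with int(), which is where the 15 comes from:

height, pad, filter_h, stride = 32, 0, 3, 2
out_h = (height + 2 * pad - filter_h) / stride + 1   # 15.5
print(int(out_h))                                    # 15 -- matching the (1, 16, 15, 15) output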

 

The index manipulations in get_im2col_indices() are now clear; exactly why they are arranged this way still takes some more thought to fully appreciate. The backward pass and the optimizer will be covered in a later update once I have worked through them.

 
