Initialize variables
Create your own session
Train algorithms
Implement a neural network
cd D:\software\OneDrive\桌面\吴恩达深度学习课后作业\第二部分 改善深层神经网络\第三周 TensorFlow入门
D:\software\OneDrive\桌面\吴恩达深度学习课后作业\第二部分 改善深层神经网络\第三周 TensorFlow入门
import math
import numpy as np
import h5py
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
from tf_utils import load_dataset, random_mini_batches, convert_to_one_hot, predict
%matplotlib inline
np.random.seed(1)
y_hat = tf.constant(36, name="y_hat")
y = tf.constant(39, name="y")
loss = tf.Variable((y_hat - y)**2, name="loss")  # create a variable

# Variables created in TensorFlow are not initialized automatically,
# so any graph containing tf.Variable needs an explicit initialization step.
init = tf.global_variables_initializer()  # initialize the model parameters
with tf.Session() as session:
    session.run(init)
    print(session.run(loss))
9
Writing and running a program in TensorFlow involves the following steps:
1. Create a computation graph containing tensors (variables, placeholders, ...) and operations.
2. Create a session.
3. Initialize the session.
4. Run the session to execute the graph.
a = tf.constant(1)
b = tf.constant(7)
c = tf.multiply(a, b)
print(c)  # this only prints the graph node, because no session has been created to run the graph
Tensor("Mul:0", shape=(), dtype=int32)
session = tf.Session()
print(session.run(c))
7
Note: remember to initialize your variables, create a session, and run the operations inside the session.
When you first define x, you do not have to give it a value.
A placeholder is simply a variable to which you assign data only later, when you run the session.
x = tf.placeholder(tf.int64, name="x")  # int64 is a signed 64-bit integer data type
print(session.run(3*x,feed_dict={x:4}))
session.close()
12
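A placeholder can also be given a shape and fed with an array rather than a scalar. The following is a minimal sketch; the name v and the values are illustrative only:
v = tf.placeholder(tf.float32, shape=[3], name="v")
with tf.Session() as sess:
    print(sess.run(2 * v, feed_dict={v: np.array([1.0, 2.0, 3.0])}))  # prints [2. 4. 6.]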
Compute the following equation: Y = WX + b, where W and X are random matrices and b is a random vector.
Exercise: compute WX + b, where W, X, and b are drawn from a random normal distribution; W has shape (4, 3), X has shape (3, 1), and b has shape (4, 1). For example, a constant X of shape (3, 1) can be defined as follows (the same line appears again in linear_function below):
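X = tf.constant(np.random.randn(3,1), name="X")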
You may find the following functions useful:
tf.matmul(..., ...) to do matrix multiplication
tf.add(..., ...) to do addition
np.random.randn(...) to initialize randomly
def linear_function():
    np.random.seed(1)  # with the same seed, the "random" draws are reproducible
    X = tf.constant(np.random.randn(3,1), name="X")
    W = tf.constant(np.random.randn(4,3), name="W")
    b = tf.constant(np.random.randn(4,1), name="b")
    Y = tf.add(tf.matmul(W, X), b)
    session = tf.Session()
    result = session.run(Y)
    session.close()
    return result
result = linear_function()
print( "result = " + str(result))
result = [[-2.15657382]
[ 2.95891446]
[-1.08926781]
[-0.84538042]]
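As a sanity check, here is a minimal numpy sketch (the names X_np, W_np, b_np are illustrative): drawing the same three arrays in the same order with the same seed and computing WX + b directly should reproduce the result above.
np.random.seed(1)
X_np = np.random.randn(3, 1)   # drawn in the same order as in linear_function
W_np = np.random.randn(4, 3)
b_np = np.random.randn(4, 1)
print(np.dot(W_np, X_np) + b_np)  # should match the TensorFlow result above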
def sigmoid(z):
    x = tf.placeholder(tf.float32, name="x")
    sigmoid = tf.sigmoid(x)
    with tf.Session() as session:
        result = session.run(sigmoid, feed_dict={x: z})
    return result
You will use the placeholder variable x for this exercise, and pass the input z in through a feed dictionary when you run the session. In this exercise you have to:
(i) create a placeholder x;
(ii) define the operations needed to compute the sigmoid using tf.sigmoid;
(iii) then run the session.
There are two ways to create a session:
(1)
sess = tf.Session()
result = sess.run(..., feed_dict={...})
sess.close()  # close the session
(2)
with tf.Session() as sess:
    result = sess.run(..., feed_dict={...})
print("sigmoid(0) = " + str(sigmoid(0)))
print("sigmoid(12)="+str(sigmoid(12)))
sigmoid(0) = 0.5
sigmoid(12)=0.999994
tf.nn.sigmoid_cross_entropy_with_logits(logits = ..., labels = ...)
def cost(logits, labels):
    y = tf.placeholder(tf.float32, name="y")
    z = tf.placeholder(tf.float32, name="z")
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)
    session = tf.Session()
    result = session.run(cost, feed_dict={z: logits, y: labels})
    session.close()
    return result
logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))
cost = cost(logits,np.array([0,0,1,1]))
print("Cost="+str(cost))
Cost=[ 1.00538719 1.03664076 0.41385433 0.39956617]
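To see what this op computes, here is a minimal numpy sketch (the names a and cost_manual are illustrative): for logits z and labels y it evaluates -(y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))) elementwise, which reproduces the values above.
z = sigmoid(np.array([0.2, 0.4, 0.7, 0.9]))        # the same "logits" as above
y = np.array([0, 0, 1, 1])
a = 1 / (1 + np.exp(-z))                           # sigmoid applied to the logits
cost_manual = -(y * np.log(a) + (1 - y) * np.log(1 - a))
print(cost_manual)                                 # should match the Cost printed above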
This is called one-hot encoding, because in the converted representation exactly one element of each column is "hot" (set to 1).
Exercise: implement the function below, which takes a vector of labels and the total number of classes C, and returns the one-hot encoding. Use tf.one_hot() to do this.
def one_hot_matrix(labels, C):
    C = tf.constant(C, name="C")
    one_hot_matrix = tf.one_hot(labels, C, axis=0)
    session = tf.Session()
    one_hot = session.run(one_hot_matrix)
    session.close()
    return one_hot
labels = np.array([1,2,3,0,2,1])
one_hot = one_hot_matrix(labels,C=4)
print("one_hot=\n"+str(one_hot))
one_hot=
[[ 0. 0. 0. 1. 0. 0.]
[ 1. 0. 0. 0. 0. 1.]
[ 0. 1. 0. 0. 1. 0.]
[ 0. 0. 1. 0. 0. 0.]]
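For intuition, an equivalent construction in plain numpy (the name one_hot_np is illustrative) indexes an identity matrix by the labels and transposes so that classes run along the rows, matching axis=0 above:
labels = np.array([1, 2, 3, 0, 2, 1])
one_hot_np = np.eye(4)[labels].T   # shape (4, 6), one class per row
print(one_hot_np)                  # should match the tf.one_hot output above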
Initialize vectors of ones and zeros. The function to call is tf.ones(); to initialize with zeros, use tf.zeros() instead. These functions take a shape and return an array of that shape filled with ones or zeros respectively.
Exercise: implement the function below, which takes a shape and returns an array of ones of that shape.
tf.ones(shape)
def ones(shape):
    ones = tf.ones(shape)
    session = tf.Session()
    result = session.run(ones)
    session.close()
    return result
print ("ones = " + str(ones([3]))) #将向量都初始化为1
ones = [ 1. 1. 1.]
Implementing a TensorFlow model has two parts:
Create the computation graph
Run the computation graph
Your goal is to build an algorithm capable of recognizing hand signs with high accuracy. To do so, you will build a TensorFlow model that is almost identical to the model you previously built in numpy for cat recognition (but now with a softmax output). This is a good occasion to compare the numpy implementation with the TensorFlow one.
The model is LINEAR -> RELU -> LINEAR -> RELU -> LINEAR -> SOFTMAX. The SIGMOID output layer has been converted to SOFTMAX; a SOFTMAX layer generalizes SIGMOID to more than two classes.
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
index = 1
plt.imshow(X_train_orig[index])
print("y="+str(np.squeeze(Y_train_orig[:,index])))
y=0
# Flatten the image datasets into columns of shape (num_px * num_px * 3, m)
X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0],-1).T
X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0],-1).T
# Normalize the pixel values to [0, 1]
X_train = X_train_flatten/255
X_test = X_test_flatten/255
# Convert the labels to one-hot matrices with 6 classes
Y_train = convert_to_one_hot(Y_train_orig,6)
Y_test = convert_to_one_hot(Y_test_orig,6)
print ("number of training examples = " + str(X_train.shape[1]))
print ("number of test examples = " + str(X_test.shape[1]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))
number of training examples = 1080
number of test examples = 120
X_train shape: (12288, 1080)
Y_train shape: (6, 1080)
X_test shape: (12288, 120)
Y_test shape: (6, 120)
def create_placeholders(n_x, n_y):
    # the second dimension is None so that the number of examples can vary
    X = tf.placeholder(shape=[n_x, None], dtype=tf.float32)
    Y = tf.placeholder(shape=[n_y, None], dtype=tf.float32)
    return X, Y
X, Y = create_placeholders(12288, 6)
print ("X = " + str(X))
print ("Y = " + str(Y))
X = Tensor("Placeholder:0", shape=(12288, ?), dtype=float32)
Y = Tensor("Placeholder_1:0", shape=(6, ?), dtype=float32)
Exercise: implement the function below to initialize the parameters in TensorFlow. Use Xavier initialization for the weights and zero initialization for the biases (a small numpy sketch of Xavier initialization follows the shape list below).
Initializes parameters to build a neural network with tensorflow. The shapes are:
W1 : [25, 12288]
b1 : [25, 1]
W2 : [12, 25]
b2 : [12, 1]
W3 : [6, 12]
b3 : [6, 1]
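For reference, here is a minimal numpy sketch of what Glorot/Xavier uniform initialization does (the function name xavier_uniform_np is illustrative; it assumes the common limit sqrt(6 / (fan_in + fan_out)), which is the uniform variant tf.contrib.layers.xavier_initializer uses by default):
def xavier_uniform_np(fan_out, fan_in):
    # weights drawn uniformly in [-limit, limit], limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=(fan_out, fan_in))

W1_np = xavier_uniform_np(25, 12288)   # same shape as W1 below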
def initialize_parameters():
    tf.set_random_seed(1)
    W1 = tf.get_variable("W1", [25, 12288], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b1 = tf.get_variable("b1", [25, 1], initializer=tf.zeros_initializer())
    W2 = tf.get_variable("W2", [12, 25], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b2 = tf.get_variable("b2", [12, 1], initializer=tf.zeros_initializer())
    W3 = tf.get_variable("W3", [6, 12], initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b3 = tf.get_variable("b3", [6, 1], initializer=tf.zeros_initializer())
    parameters = {
        "W1": W1,
        "b1": b1,
        "W2": W2,
        "b2": b2,
        "W3": W3,
        "b3": b3
    }
    return parameters
tf.reset_default_graph()
with tf.Session() as session:
    parameters = initialize_parameters()
    print("W1 = " + str(parameters["W1"]))
    print("b1 = " + str(parameters["b1"]))
    print("W2 = " + str(parameters["W2"]))
    print("b2 = " + str(parameters["b2"]))
W1 = <tf.Variable 'W1:0' shape=(25, 12288) dtype=float32_ref>
b1 = <tf.Variable 'b1:0' shape=(25, 1) dtype=float32_ref>
W2 = <tf.Variable 'W2:0' shape=(12, 25) dtype=float32_ref>
b2 = <tf.Variable 'b2:0' shape=(12, 1) dtype=float32_ref>
tf.add(..., ...) to do addition
tf.matmul(..., ...) to do matrix multiplication
tf.nn.relu(...) to apply the ReLU activation
def forward_propagation(X, parameters):
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    W3 = parameters["W3"]
    b3 = parameters["b3"]
    Z1 = tf.add(tf.matmul(W1, X), b1)   # Z1 = W1 X + b1
    A1 = tf.nn.relu(Z1)                 # A1 = relu(Z1)
    Z2 = tf.add(tf.matmul(W2, A1), b2)  # Z2 = W2 A1 + b2
    A2 = tf.nn.relu(Z2)                 # A2 = relu(Z2)
    Z3 = tf.add(tf.matmul(W3, A2), b3)  # Z3 = W3 A2 + b3
    return Z3
tf.reset_default_graph()
with tf.Session() as session:
    X, Y = create_placeholders(12288, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    print("Z3 = " + str(Z3))
Z3 = Tensor("Add_2:0", shape=(6, ?), dtype=float32)
tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = ..., labels = ...))
It is important to know that the "logits" and "labels" inputs of tf.nn.softmax_cross_entropy_with_logits are expected to have shape (number of examples, number of classes). This is why we transpose Z3 and Y for you.
In addition, tf.reduce_mean then averages the loss over all the examples.
def compute_cost(Z3, Y):
    logits = tf.transpose(Z3)
    labels = tf.transpose(Y)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
    return cost
tf.reset_default_graph()
with tf.Session() as session:
    X, Y = create_placeholders(12288, 6)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)
    print("cost = " + str(cost))
cost = Tensor("Mean:0", shape=(), dtype=float32)
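For intuition, here is a minimal numpy sketch of what this cost computes (the function name softmax_cross_entropy_np is illustrative): the average over the examples of -sum_c y_c * log(softmax(z)_c), with the maximum subtracted for numerical stability.
def softmax_cross_entropy_np(Z3, Y):
    # Z3 and Y have shape (n_classes, m), as produced by forward_propagation
    Z = Z3 - Z3.max(axis=0, keepdims=True)                          # numerical stability
    log_softmax = Z - np.log(np.exp(Z).sum(axis=0, keepdims=True))  # log(softmax(Z)) column-wise
    return -np.mean(np.sum(Y * log_softmax, axis=0))                # average over the m examples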
After computing the cost function, you will create an "optimizer" object. When you run the tf.session, you must call this object along with the cost. When called, it performs an optimization step on the given cost with the chosen method and learning rate.
For gradient descent the optimizer would be:
optimizer = tf.train.GradientDescentOptimizer(learning_rate = learning_rate).minimize(cost)
To perform the optimization you would do:
_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
Note: when coding, we often use _ as a "throwaway" variable for values we will not need later. Here _ takes the evaluated value of optimizer, which we do not need (while c takes the value of the cost variable).
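As a standalone sketch of this pattern (the toy variable w_toy and the quadratic cost are made up for illustration), minimizing (w - 5)^2 with gradient descent drives w towards 5:
w = tf.get_variable("w_toy", initializer=tf.constant(0.0))
toy_cost = tf.square(w - 5.0)
toy_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(toy_cost)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        _, c = sess.run([toy_optimizer, toy_cost])   # one optimization step, plus the current cost
    print(sess.run(w))                               # close to 5.0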
Exercise: build the full model by calling the functions implemented previously.
def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
          num_epochs = 1500, minibatch_size = 32, print_cost = True):
    ops.reset_default_graph()
    tf.set_random_seed(1)
    seed = 3
    (n_x, m) = X_train.shape
    n_y = Y_train.shape[0]
    costs = []
    X, Y = create_placeholders(n_x, n_y)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(num_epochs):
            epoch_cost = 0.
            num_minibatches = int(m / minibatch_size)
            seed = seed + 1
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
            for minibatch in minibatches:
                (mini_batch_X, mini_batch_Y) = minibatch
                _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: mini_batch_X, Y: mini_batch_Y})
                epoch_cost += minibatch_cost / num_minibatches
            if print_cost == True and epoch % 100 == 0:
                print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost == True and epoch % 5 == 0:
                costs.append(epoch_cost)
        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per tens)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()
        parameters = sess.run(parameters)
        print ("Parameters have been trained!")
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))  # argmax: index of the largest entry in each column
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))  # cast: convert the booleans to floats
        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
    return parameters
parameters = model(X_train, Y_train, X_test, Y_test)
Cost after epoch 0: 1.855702
Cost after epoch 100: 1.016458
Cost after epoch 200: 0.733102
Cost after epoch 300: 0.572939
Cost after epoch 400: 0.468774
Cost after epoch 500: 0.381021
Cost after epoch 600: 0.313827
Cost after epoch 700: 0.254280
Cost after epoch 800: 0.203799
Cost after epoch 900: 0.166512
Cost after epoch 1000: 0.140937
Cost after epoch 1100: 0.107750
Cost after epoch 1200: 0.086299
Cost after epoch 1300: 0.060949
Cost after epoch 1400: 0.050934
Parameters have been trained!
Train Accuracy: 0.999074
Test Accuracy: 0.725
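As a quick sanity check, here is a hedged sketch that runs the trained model on one training image; it assumes the predict helper imported from tf_utils accepts a flattened (12288, 1) image column and the trained parameters, as in the original assignment:
sample = X_train[:, 0:1]                       # one flattened, normalized image
prediction = predict(sample, parameters)       # assumed signature: predict(X, parameters)
print("predicted class: " + str(np.squeeze(prediction)))
print("true class: " + str(np.argmax(Y_train[:, 0])))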
TensorFlow is a programming framework frequently used in deep learning.
The two main object classes in TensorFlow are tensors and operators.
When you code in TensorFlow, you have to take the following steps:
1. Create a computation graph containing tensors (variables, placeholders, ...) and operations (tf.matmul, tf.add, ...)
2. Create a session
3. Initialize the session
4. Run the session to execute the graph
You can execute the graph multiple times, as you saw in model().
The backpropagation and the optimization are done automatically when running the session on the "optimizer" object.