numpy - Wrong values for partial derivatives in neural network python


I am implementing a simple neural network classifier for the iris dataset. The NN has 3 input nodes, 1 hidden layer with 2 nodes, and 3 output nodes. I have implemented everything, but the values of the partial derivatives are not calculated correctly. I have exhausted myself looking for a solution but couldn't find one. Here is my code for calculating the partial derivatives.

    def derivative_cost_function(self, x, y, thetas):
        '''
            Computes derivatives of the cost function w.r.t. the input parameters (thetas)
            for the given input and labels.

            Input:
            ------
                x: can be either a single d x n-dimensional vector or a d x n-dimensional matrix of inputs
                thetas: must be a dk x 1-dimensional vector representing the vectors of k classes
                y: must be a k x n-dimensional label vector
            Returns:
            ------
                partial_thetas: a dk x 1-dimensional vector of partial derivatives of the cost function w.r.t. the parameters.
        '''

        # forward pass
        a2, a3 = self.forward_pass(x, thetas)

        # now back-propagate

        # unroll thetas
        l1theta, l2theta = self.unroll_thetas(thetas)

        nexamples = float(x.shape[1])

        # compute delta3, l2theta
        a3 = np.array(a3)
        a2 = np.array(a2)
        y = np.array(y)

        a3 = a3.T
        delta3 = (a3 * (1 - a3)) * (((a3 - y) / ((a3) * (1 - a3))))

        l2derivatives = np.dot(delta3, a2)
        #print "layer 2 derivatives shape = ", l2derivatives.shape
        #print "layer 2 derivatives = ", l2derivatives

        # compute delta2, l1 theta
        a2 = a2.T
        dotproduct = np.dot(l2theta.T, delta3)
        delta2 = dotproduct * (a2) * (1 - a2)

        l1derivatives = np.dot(delta2[1:], x.T)
        #print "layer 1 derivatives shape = ", l1derivatives.shape
        #print "layer 1 derivatives = ", l1derivatives

        # remember to exclude the last element of delta2, representing the deltas of the bias terms...
        # i.e. delta2=delta2[:-1]

        # roll thetas into a big vector
        thetas = (self.roll_thetas(l1derivatives, l2derivatives)).reshape(thetas.shape)  # return the same shape as received

        return thetas
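For reference, the `delta3` expression in the code above simplifies algebraically, because the factor `a3 * (1 - a3)` cancels wherever `a3` is not exactly 0 or 1. The shape of the expression suggests sigmoid output units with a cross-entropy cost, but that is an assumption, since the cost function itself is not shown here. Written out, with the products and the division taken elementwise:

```latex
\delta^{(3)}
  = a^{(3)} \odot \bigl(1 - a^{(3)}\bigr) \odot
    \frac{a^{(3)} - y}{a^{(3)} \odot \bigl(1 - a^{(3)}\bigr)}
  = a^{(3)} - y
```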

Why not have a look at the implementation in https://github.com/zizhaozhang/simple_neutral_network/blob/master/nn.py

The derivatives are computed here:

    def dcostfunction(self, theta, in_dim, hidden_dim, num_labels, x, y):
        # compute gradient
        t1, t2 = self.uncat(theta, in_dim, hidden_dim)

        a1, z2, a2, z3, a3 = self._forward(x, t1, t2)  # p x s matrix

        # t1 = t1[1:, :] # remove bias term
        # t2 = t2[1:, :]
        sigma3 = -(y - a3) * self.dactivation(z3)  # not apply dsigmode here? should
        sigma2 = np.dot(t2, sigma3)
        term = np.ones((1, num_labels))
        sigma2 = sigma2 * np.concatenate((term, self.dactivation(z2)), axis=0)

        theta2_grad = np.dot(sigma3, a2.T)
        theta1_grad = np.dot(sigma2[1:, :], a1.T)

        theta1_grad = theta1_grad / num_labels
        theta2_grad = theta2_grad / num_labels

        return self.cat(theta1_grad.T, theta2_grad.T)
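A common way to find out whether backpropagated derivatives are wrong is to compare them against finite-difference estimates of the cost. The following is a minimal sketch of such a check, not the code from either snippet above: the helper names (`unroll`, `forward`, `cost`, `backprop_gradient`, `numerical_gradient`), the sigmoid activations, the cross-entropy cost, the random data, and the choice to put the bias unit in the first row of each layer are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unroll(thetas, d, h, k):
    """Split the flat parameter vector into w1 (h x (d+1)) and w2 (k x (h+1))."""
    w1 = thetas[: h * (d + 1)].reshape(h, d + 1)
    w2 = thetas[h * (d + 1):].reshape(k, h + 1)
    return w1, w2

def forward(x, w1, w2):
    """x is d x n; a bias row of ones is prepended at each layer (assumed layout)."""
    n = x.shape[1]
    a1 = np.vstack([np.ones((1, n)), x])                  # (d+1) x n
    a2 = np.vstack([np.ones((1, n)), sigmoid(w1 @ a1)])   # (h+1) x n
    a3 = sigmoid(w2 @ a2)                                 # k x n
    return a1, a2, a3

def cost(thetas, x, y, d, h, k):
    """Mean cross-entropy cost over the n examples (assumed cost function)."""
    w1, w2 = unroll(thetas, d, h, k)
    _, _, a3 = forward(x, w1, w2)
    n = x.shape[1]
    return -np.sum(y * np.log(a3) + (1 - y) * np.log(1 - a3)) / n

def backprop_gradient(thetas, x, y, d, h, k):
    """Analytic gradient of cost() w.r.t. thetas via backpropagation."""
    w1, w2 = unroll(thetas, d, h, k)
    a1, a2, a3 = forward(x, w1, w2)
    n = x.shape[1]
    delta3 = a3 - y                              # k x n, sigmoid + cross-entropy
    delta2 = (w2.T @ delta3) * a2 * (1 - a2)     # (h+1) x n
    grad_w2 = delta3 @ a2.T / n                  # k x (h+1)
    grad_w1 = delta2[1:] @ a1.T / n              # drop the bias row (first row here)
    return np.concatenate([grad_w1.ravel(), grad_w2.ravel()])

def numerical_gradient(f, thetas, eps=1e-5):
    """Central-difference estimate of the gradient of f at thetas."""
    grad = np.zeros_like(thetas)
    for i in range(thetas.size):
        step = np.zeros_like(thetas)
        step[i] = eps
        grad[i] = (f(thetas + step) - f(thetas - step)) / (2 * eps)
    return grad

# Made-up data with the layer sizes from the question: 3 inputs, 2 hidden, 3 outputs.
rng = np.random.default_rng(0)
d, h, k, n = 3, 2, 3, 5
x = rng.normal(size=(d, n))
y = np.eye(k)[:, rng.integers(0, k, size=n)]     # one-hot labels, k x n
thetas = rng.normal(scale=0.5, size=h * (d + 1) + k * (h + 1))

analytic = backprop_gradient(thetas, x, y, d, h, k)
numeric = numerical_gradient(lambda t: cost(t, x, y, d, h, k), thetas)
max_diff = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_diff)
```

If the two gradients disagree noticeably, the usual suspects are a missing division by the number of examples, a transposed matrix in one of the dot products, or dropping the wrong row of `delta2`. Which row belongs to the bias depends on where the bias unit is placed in the activations; in this sketch it is the first row.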

Hope this helps.

