Neural network output on MNIST converges to zero, regardless of input



Tags: python, neural-network | Asked Mar 7 at 12:29 by Nekkowe | Edited Mar 8 at 12:42

After reading through Iamtrask's guide on programming a simple NN in Python, I made an attempt at rewriting one as a simple class so I could pick the layer count and sizes and apply it to different problems more easily.



After some finagling, it reached a point where it does a wonderful job on the example in that tutorial, and on other simple things like converting between binary numbers and Gray code, so I figured I'd go for something less simple with the MNIST handwritten digits dataset.



Unfortunately, this is where I'm stumped. After the first couple of generations, the output layer always approaches something like



[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


that. The outputs never all fully reach zero, but the result is that no matter the input, the network ends up guessing the same digit, because one of the output nodes is just slightly further from zero than the rest. I tried adding more nodes to the two hidden layers until Python told me enough was enough, and I tried doing it with just one hidden layer, but the result never really gets any better.



At first, I figured I must've misunderstood something fundamental about backpropagation, but then why does my NN adjust perfectly fine on the simpler problems? What is it I'm missing here, and how can I fix it to reach any useful result?



Here's my code for the neural network class (72 lines):



import numpy as np

class neuralNetwork():

    def __init__(self, layer_node_counts):
        self.synapses = self.init_synapses(layer_node_counts)

    def init_synapses(self, layer_node_counts):
        last_layer_node_count = layer_node_counts[0]
        synapses = []

        # one weight matrix ("synapse") per pair of adjacent layers, entries uniform in [-1, 1)
        for current_layer_node_count in layer_node_counts[1:]:
            synapses.append(2 * np.random.random((last_layer_node_count, current_layer_node_count)) - 1)
            last_layer_node_count = current_layer_node_count
        return synapses

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_output_to_derivative(self, x):
        # x is expected to already be a sigmoid output, so this is s * (1 - s) - kind of a bell curve!
        return x * (1 - x)

    def feed_forward(self, input_):
        # forward propagation through all our synapses and layers, starting with the input array:
        layers = [np.array(input_)]

        for key, synapse in enumerate(self.synapses):
            new_layer = self.sigmoid(layers[key] @ synapse)
            layers.append(new_layer)

        return layers

    def classify(self, input_):
        resulting_layers = self.feed_forward(input_)
        # return the output layer
        return resulting_layers[-1]

    def train(self, input_, target_output):
        input_ = np.atleast_2d(input_)
        target_output = np.atleast_2d(target_output)

        layer_result_matrices = self.feed_forward(input_)
        synapse_adjustments_total = [0] * len(self.synapses)

        # how much the output layer was off the mark
        output_error = target_output - layer_result_matrices[-1]
        # how much we're letting it matter (bell curve height - depends on "confidence" of the connection)
        output_delta = output_error * self.sigmoid_output_to_derivative(layer_result_matrices[-1])

        layer_deltas = [output_delta]

        # backpropagate the deltas through the remaining layers
        for index in reversed(range(1, len(self.synapses))):
            layer_error = layer_deltas[0] @ self.synapses[index].T
            layer_delta = layer_error * self.sigmoid_output_to_derivative(layer_result_matrices[index])
            layer_deltas.insert(0, layer_delta)

        for index in range(len(self.synapses)):
            synapse_adjustments_total[index] += layer_result_matrices[index].T @ layer_deltas[index]

        for index, adjustment in enumerate(synapse_adjustments_total):
            self.synapses[index] += adjustment

        return self.synapses

    def calculate_mean_error(self, input_, target_output):
        current_output = self.classify(input_)

        error_matrix = np.abs(target_output - current_output) / len(target_output)
        mean_error = np.mean(np.abs(error_matrix))

        return mean_error
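
For reference, one way to probe the kind of output collapse described above is to look at the pre-activations going into the first sigmoid. With 784 inputs and weights drawn uniformly from [-1, 1), the weighted sums can easily have a standard deviation of 4 or more, which pushes the sigmoid outputs toward 0 or 1 and makes x * (1 - x) nearly vanish. A minimal, self-contained sketch of that effect, using random stand-in data rather than the actual MNIST images but the same initialization scheme as init_synapses:

import numpy as np

np.random.seed(0)

# Stand-in for one batch of normalized images: 64 samples, 784 values in [0, 1],
# with roughly 20% of the entries non-zero (very loosely MNIST-like).
fake_batch = np.random.random((64, 784)) * (np.random.random((64, 784)) < 0.2)

# Same initialization as init_synapses above: weights uniform in [-1, 1).
weights = 2 * np.random.random((784, 784)) - 1

pre_activation = fake_batch @ weights

print("pre-activation std:", pre_activation.std())
print("fraction with |z| > 4:", np.mean(np.abs(pre_activation) > 4))
# For |z| > 4, sigmoid(z) is already within ~0.02 of 0 or 1,
# so x * (1 - x) is close to zero and the backpropagated deltas are tiny.

Whether that is the whole story here is not certain, but saturation of this kind is the usual suspect when every sigmoid output drifts toward the same extreme.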


...and my code for training (64 lines):



# -*- coding: utf-8 -*-

import numpy as np
import nekkowe_neural_network as nnn
from mnist import MNIST

def normalize_input(images):
    return np.array(images) / (255 * 0.99 + 0.01)

def get_one_hot_by_label(label):
    return [0.99 if i == label else 0.01 for i in range(10)]

def get_label_by_one_hot(layer):
    return np.argmax(layer)

def test_accuracy(neural_network, test_images, target_labels):
    guesses = 0
    correct_guesses = 0

    normalized_input = normalize_input(test_images)
    output_layers = neural_network.classify(normalized_input)

    for i, output_layer in enumerate(output_layers):
        predicted_label = get_label_by_one_hot(output_layer)
        target_label = target_labels[i]

        guesses += 1
        correct_guesses += 1 if predicted_label == target_label else 0

    print(str(correct_guesses) + "/" + str(guesses) + " correct")


BATCH_SIZE = 64
MAX_ITERATIONS = 1000

np.random.seed(1)
neural_network = nnn.neuralNetwork([28**2, 28**2, 28**2, 10])

mndata = MNIST('MNIST')

#training_data_images, training_data_labels = mndata.load_training()
#training_data_one_hot = [get_one_hot_by_label(label) for label in training_data_labels]

testing_data_images, testing_data_labels = mndata.load_testing()
training_data = mndata.load_training_in_batches(BATCH_SIZE)

for i, batch in enumerate(training_data):
    training_data_images = np.array(batch[0])
    training_data_labels = np.array(batch[1])
    training_data_one_hot = np.array([get_one_hot_by_label(label) for label in training_data_labels])

    if i > 0:
        neural_network.train(training_data_images, training_data_one_hot)

    # Report progress at 0, 1, 10, 100, 200, 300, 400 etc. as well as the final one:
    if i % 100 == 0 or np.log10(i) % 1 == 0 or i == MAX_ITERATIONS:
        print("Batch " + str(i) + ":")
        test_accuracy(neural_network, testing_data_images, testing_data_labels)

    if i == MAX_ITERATIONS:
        print("Reached iteration limit!")
        break

print("All done!")
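
A small aside on normalize_input above: as written, the expression divides every pixel value by (255 * 0.99 + 0.01), i.e. by roughly 252.46. If the intent was the common recipe of scaling pixels into roughly [0.01, 1.0] to match the 0.01/0.99 one-hot targets, the parentheses would sit differently; a hedged variant of that intent (an assumption about what was meant, not a confirmed fix) would be:

import numpy as np

def normalize_input(images):
    # Scale raw pixel values (0-255) into roughly [0.01, 1.0],
    # mirroring the 0.01 / 0.99 targets produced by get_one_hot_by_label.
    return np.array(images) / 255 * 0.99 + 0.01

Since 252.46 and 255 are close, this is unlikely to be the main culprit on its own, but it is worth checking against the intended recipe.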









Comments:

  • Your sigmoid_derivative isn't what it should be. – cheersmate, Mar 8 at 10:55

  • @cheersmate I should definitely rephrase that function name - in the case where it's used, the input has already been passed through the regular sigmoid, so I figured doing it again would lead to the wrong result; see also here. – Nekkowe, Mar 8 at 11:13

  • @cheersmate I have edited the function name to sigmoid_output_to_derivative for clarity, accordingly. – Nekkowe, Mar 8 at 12:43
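
To make the convention from the exchange above explicit: for s = sigmoid(z), the derivative with respect to z is s * (1 - s). So applying x * (1 - x) to values that have already been passed through the sigmoid, as sigmoid_output_to_derivative does, matches the standard formula. A quick numerical check, independent of the question's code:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-5, 5, 11)
s = sigmoid(z)

analytic = s * (1 - s)                                      # derivative expressed via the sigmoid output
numeric = (sigmoid(z + 1e-6) - sigmoid(z - 1e-6)) / (2e-6)  # central finite difference

print(np.allclose(analytic, numeric, atol=1e-6))            # prints True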














