Neural network output on MNIST converges to zero, regardless of input
After reading through Iamtrask's guide on programming a simple NN in Python, I made an attempt at rewriting one as a simple class, so I could pick the layer count and sizes and apply it to different problems more easily.
After some finagling, it reached a point where it does a wonderful job on the example in that tutorial, and on other simple things like binary <-> Gray code conversion, so I figured I'd go for something less simple with the MNIST handwritten digits dataset.
Unfortunately, this is where I'm stumped. After the first couple of generations, the output layer always approaches something like
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
The outputs never all fully reach zero, but the result is that, no matter the input, the network ends up guessing the same digit, because one of the output nodes is just slightly further from zero than the rest. I tried adding more nodes to the two hidden layers until Python told me enough was enough, and tried doing it with just one hidden layer, but the result never really gets any better.
At first, I figured I must have misunderstood something fundamental about backpropagation, but then why does my NN adjust perfectly fine on the simpler problems? What am I missing here, and how can I fix it to reach any useful result?
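For what it's worth, the x*(1-x) derivative term that the training step multiplies by peaks at 0.25 for an output of 0.5 and shrinks toward zero once an output saturates near 0 or 1, so near-zero outputs also mean near-zero weight updates. A quick standalone look at just that term (an illustration of its shape, not necessarily the cause here):

import numpy as np

# sigmoid outputs ranging from saturated-low to saturated-high
outputs = np.array([0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999])

# the "bell curve" term used in backprop: the sigmoid's derivative,
# expressed in terms of its own output
print(outputs * (1 - outputs))
# approx. [0.001  0.0099  0.09  0.25  0.09  0.0099  0.001]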
Here's my code for the neural network class (72 lines):
import numpy as np

class neuralNetwork():
    def __init__(self, layer_node_counts):
        self.synapses = self.init_synapses(layer_node_counts)

    def init_synapses(self, layer_node_counts):
        # one weight matrix per pair of consecutive layers, initialized in [-1, 1)
        last_layer_node_count = layer_node_counts[0]
        synapses = []
        for current_layer_node_count in layer_node_counts[1:]:
            synapses.append(2 * np.random.random((last_layer_node_count, current_layer_node_count)) - 1)
            last_layer_node_count = current_layer_node_count
        return synapses

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_output_to_derivative(self, x):
        # kind of a bell curve! (derivative of the sigmoid, expressed in terms of its output)
        return x * (1 - x)

    def feed_forward(self, input_):
        # forward propagation through all our synapses and layers, starting with the input array:
        layers = [np.array(input_)]
        for key, synapse in enumerate(self.synapses):
            newLayer = self.sigmoid(layers[key] @ synapse)
            layers.append(newLayer)
        return layers

    def classify(self, input_):
        resulting_layers = self.feed_forward(input_)
        # return output layer(s)
        return resulting_layers[-1]

    def train(self, input_, target_output):
        input_ = np.atleast_2d(input_)
        target_output = np.atleast_2d(target_output)
        layer_result_matrices = self.feed_forward(input_)
        synapse_adjustments_total = [0] * len(self.synapses)
        # how much this layer was off the mark
        output_error = target_output - layer_result_matrices[-1]
        # how much we're letting it matter (bell curve height - depends on "confidence" of the synapse connection)
        output_delta = output_error * self.sigmoid_output_to_derivative(layer_result_matrices[-1])
        layer_deltas = [output_delta]
        # propagate the deltas backwards through the hidden layers
        for index in reversed(range(1, len(self.synapses))):
            layer_error = layer_deltas[0] @ self.synapses[index].T
            layer_delta = layer_error * self.sigmoid_output_to_derivative(layer_result_matrices[index])
            layer_deltas.insert(0, layer_delta)
        # accumulate and apply the weight adjustments
        for index in range(len(self.synapses)):
            synapse_adjustments_total[index] += layer_result_matrices[index].T @ layer_deltas[index]
        for index, adjustment in enumerate(synapse_adjustments_total):
            self.synapses[index] += adjustment
        return self.synapses

    def calculate_mean_error(self, input_, target_output):
        current_output = self.classify(input_)
        error_matrix = np.abs(target_output - current_output) / len(target_output)
        mean_error = np.mean(np.abs(error_matrix))
        return mean_error
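For reference, a minimal sketch of how the class is exercised on a toy problem (assuming the class above is saved as nekkowe_neural_network.py, as in the training script below; the toy data here is made up for illustration and mirrors the Iamtrask-style example rather than the actual Gray code script):

import numpy as np
import nekkowe_neural_network as nnn

np.random.seed(1)

# toy problem: the target is simply the first input column
toy_inputs = np.array([[0, 0, 1],
                       [0, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
toy_targets = np.array([[0], [0], [1], [1]])

# 3 inputs, one hidden layer of 4 nodes, 1 output
toy_network = nnn.neuralNetwork([3, 4, 1])

for _ in range(10000):
    toy_network.train(toy_inputs, toy_targets)

print(toy_network.classify(toy_inputs))                           # should approach [[0], [0], [1], [1]]
print(toy_network.calculate_mean_error(toy_inputs, toy_targets))  # should be small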
...and my code for training (64 lines):
# -*- coding: utf-8 -*-
import numpy as np
import nekkowe_neural_network as nnn
from mnist import MNIST

def normalize_input(images):
    return np.array(images) / (255 * 0.99 + 0.01)

def get_one_hot_by_label(label):
    return [0.99 if i == label else 0.01 for i in range(10)]

def get_label_by_one_hot(layer):
    return np.argmax(layer)

def test_accuracy(neural_network, test_images, target_labels):
    guesses = 0
    correct_guesses = 0
    normalized_input = normalize_input(test_images)
    output_layers = neural_network.classify(normalized_input)
    for i, output_layer in enumerate(output_layers):
        predicted_label = get_label_by_one_hot(output_layer)
        target_label = target_labels[i]
        guesses += 1
        correct_guesses += 1 if predicted_label == target_label else 0
    print(str(correct_guesses) + "/" + str(guesses) + " correct")

BATCH_SIZE = 64
MAX_ITERATIONS = 1000

np.random.seed(1)
neural_network = nnn.neuralNetwork([28**2, 28**2, 28**2, 10])

mndata = MNIST('MNIST')
#training_data_images, training_data_labels = mndata.load_training()
#training_data_one_hot = [get_one_hot_by_label(label) for label in training_data_labels]
testing_data_images, testing_data_labels = mndata.load_testing()
training_data = mndata.load_training_in_batches(BATCH_SIZE)

for i, batch in enumerate(training_data):
    training_data_images = np.array(batch[0])
    training_data_labels = np.array(batch[1])
    training_data_one_hot = np.array([get_one_hot_by_label(label) for label in training_data_labels])

    if i > 0:
        neural_network.train(training_data_images, training_data_one_hot)

    # Report progress at 0, 1, 10, 100, 200, 300, 400 etc. as well as the final one:
    if i % 100 == 0 or np.log10(i) % 1 == 0 or i == MAX_ITERATIONS:
        print("Batch " + str(i) + ":")
        test_accuracy(neural_network, testing_data_images, testing_data_labels)

    if i == MAX_ITERATIONS:
        print("Reached iteration limit!")
        break

print("All done!")
Tags: python, neural-network
Comments:
Your sigmoid_derivative isn't what it should be. – cheersmate, Mar 8 at 10:55
@cheersmate I should definitely rephrase that function name - in the case where it's used, the input has already been passed through the regular sigmoid, so I figured doing it again would lead to the wrong result. See also here. – Nekkowe, Mar 8 at 11:13
@cheersmate I have edited the function name to sigmoid_output_to_derivative for clarity, accordingly. – Nekkowe, Mar 8 at 12:43
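For reference on the derivative naming discussed above: for a = sigmoid(z), the derivative with respect to z is a * (1 - a), so applying x*(1-x) to values that have already been passed through the sigmoid is the standard shortcut. A small standalone check, independent of the class code:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-4, 4, 9)
a = sigmoid(z)

# analytic derivative of the sigmoid with respect to the pre-activation z
analytic = np.exp(-z) / (1 + np.exp(-z)) ** 2

# shortcut used in the class: derivative expressed via the sigmoid's output
shortcut = a * (1 - a)

print(np.allclose(analytic, shortcut))      # True
print(np.allclose(z * (1 - z), shortcut))   # False: applying it to z itself would be wrong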