How to standardize the data before feeding it into a deep neural network model


I have come across two ways of standardizing data before feeding it into a TensorFlow model.
The first way is using tf.image.per_image_standardization(). This function computes the mean and stddev of each image individually. I found this approach in the official TensorFlow ResNet CIFAR-10 tutorial:
https://github.com/tensorflow/models/tree/master/official/resnet
In the testing phase, each image is standardized individually.
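For illustration, a minimal sketch of this first approach in a tf.data input pipeline (the dummy data and shapes are assumptions, not taken from the tutorial):

    import numpy as np
    import tensorflow as tf

    # Dummy CIFAR-10-shaped data standing in for the real dataset (assumption).
    images = np.random.rand(8, 32, 32, 3).astype(np.float32)
    labels = np.random.randint(0, 10, size=(8,))

    def per_image_preprocess(image, label):
        # Each image is standardized with its own mean and stddev,
        # so no dataset-level statistics need to be stored.
        return tf.image.per_image_standardization(image), label

    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.map(per_image_preprocess).batch(4)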



The second way is computing the mean and stddev of the whole dataset per channel. I found this approach in the following DenseNet implementation:
https://github.com/taki0112/Densenet-Tensorflow
In the testing phase, the test dataset is also preprocessed as a whole batch.
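A minimal sketch of this second approach (NumPy; the dummy training set is an assumption), where the statistics are computed once over the full training set:

    import numpy as np

    # Dummy training images standing in for the real dataset (assumption);
    # shape [N, H, W, C].
    train_images = np.random.rand(1000, 32, 32, 3).astype(np.float32)

    # Per-channel mean and stddev over the whole training set, each of shape [C].
    channel_mean = train_images.mean(axis=(0, 1, 2))
    channel_std = train_images.std(axis=(0, 1, 2))

    def standardize(batch):
        # The same training-set statistics are applied to every batch,
        # regardless of which split it comes from.
        return (batch - channel_mean) / channel_std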



These two standardization methods are not equivalent.
My question is: with the second standardization method, how should a single image be preprocessed for inference? Which mean and stddev should we use? Do we need to use the mean and stddev computed on the training dataset, as is done in batch normalization?










      python tensorflow






asked Mar 8 at 2:29, edited Mar 8 at 8:36 · user1388672 (315)


1 Answer

Yes, you should use the mean and std computed on the training set.



In general, there are two approaches to normalization. Let's say we have an input X of shape [B, H, W, C]:



• The per-feature approach normalizes every pixel position of the image separately. For this, matrices of shape [H, W, C] estimating the mean and std per feature must be computed during the training phase.

• The per-channel approach normalizes every channel of the image separately. This can be done in three ways:

  • Compute the mean and std per channel across the training set.

  • Take statistics from a large collection of images and use them at evaluation time (e.g. ImageNet: 'mean': [0.485, 0.456, 0.406], 'std': [0.229, 0.224, 0.225]).

  • Normalize each channel on the fly: compute the mean and std of each example (testing phase) or each batch (training phase) and normalize each channel separately.

The majority of models use the per-channel approach, but there is no single correct answer. The important thing is to be consistent between the training and test phases, as in the sketch below. Check also here for more details.
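For instance, a minimal sketch (NumPy) of keeping a single-image inference path consistent with training, reusing the ImageNet statistics quoted above as stand-in stored values:

    import numpy as np

    # Per-channel statistics computed once on the training set and saved
    # alongside the model; the ImageNet values above serve as placeholders.
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    def preprocess_single(image):
        # image: [H, W, C] with values in [0, 1]; it is standardized with
        # the *training* statistics, never with its own.
        return (image - mean) / std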



edit: For transfer learning, the best choice is to gradually adapt to the new dataset's statistics. Hence, initialize your statistics from the old dataset and, throughout finetuning, update them with those of the new dataset. By the end of the training phase, the mean and std should have adjusted to the new dataset; one possible implementation is sketched below.
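One way to realize this gradual adaptation is an exponential moving average over per-batch statistics; a sketch under that assumption (the momentum value is arbitrary, not from the answer):

    import numpy as np

    # Initialize from the old dataset's statistics (ImageNet values as placeholders).
    running_mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    running_std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    momentum = 0.99  # assumed; higher values forget the old statistics more slowly

    def update_stats(batch):
        # batch: [B, H, W, C] from the new dataset; blend its per-channel
        # statistics into the running estimates during finetuning.
        # (A simple blend of stds; averaging variances would be an alternative.)
        global running_mean, running_std
        batch_mean = batch.mean(axis=(0, 1, 2))
        batch_std = batch.std(axis=(0, 1, 2))
        running_mean = momentum * running_mean + (1 - momentum) * batch_mean
        running_std = momentum * running_std + (1 - momentum) * batch_std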






answered Mar 8 at 11:32, edited Mar 9 at 13:45 · ntipakos (364)

























• Thanks a lot. Your reply is helpful. – user1388672, Mar 8 at 13:36

• Thanks for your reply. Suppose that I have trained a model with per-channel, whole-dataset standardization. What if I want to do transfer learning? How should I preprocess the dataset? If we use per-image standardization, there is no problem. – user1388672, Mar 9 at 2:53

• Updated the post with an explanation for finetuning. – ntipakos, Mar 9 at 13:46









