What are some situations when normalizing input data to zero mean, unit variance is not appropriate or not beneficial?

I have seen input data normalized to zero mean and unit variance many times in machine learning. Is this good practice in all cases, or are there times when it is not appropriate or not beneficial?

asked Sep 3 at 6:02

user781486

2727

machine-learning feature-scaling normalization


1 Answer

accepted

A detailed answer to the question can be found here.

[...]are there times when it is not appropriate or not beneficial?

Short answer: yes and no. Yes, in the sense that scaling can significantly change the output of, for example, clustering algorithms. No, on the other hand, if that change is exactly what you want to achieve. Or, to put it in the words of the author of the mentioned source:

Scaling features for clustering algorithms can substantially change the outcome. Imagine four clusters around the origin, each one in a different quadrant, all nicely scaled. Now, imagine the y-axis being stretched to ten times the length of the x-axis. Instead of four little quadrant-clusters, you're going to get the long squashed baguette of data chopped into four pieces along its length! (And, the important part is, you might prefer either of these!)

The take-home message is: always think carefully about what you want to achieve and what kind of data your algorithm expects. It does matter!
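To make the quoted scenario concrete, here is a minimal sketch of a smaller, two-cluster variant of it. The data set is made up for illustration, and instead of a real k-means implementation it uses an exhaustive search over all two-cluster labellings that minimizes the k-means objective (within-cluster sum of squares), which is only feasible because there are just eight points. The helper names `standardize`, `wcss`, and `best_two_clusters` are hypothetical.

```python
from itertools import product
from statistics import fmean, pstdev


def standardize(points):
    """Rescale each column to zero mean and unit (population) variance."""
    columns = list(zip(*points))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds)) for p in points]


def wcss(cluster):
    """Within-cluster sum of squared distances to the cluster centroid."""
    centroid = [fmean(c) for c in zip(*cluster)]
    return sum((v - m) ** 2 for p in cluster for v, m in zip(p, centroid))


def best_two_clusters(points):
    """Brute-force the 2-cluster labelling with the lowest total WCSS."""
    best_cost, best_labels = float("inf"), None
    # Fix the first point in cluster 0 so mirrored labellings are not repeated.
    for rest in product((0, 1), repeat=len(points) - 1):
        labels = (0,) + rest
        groups = [[p for p, l in zip(points, labels) if l == k] for k in (0, 1)]
        if not groups[1]:
            continue  # skip the labelling that leaves cluster 1 empty
        cost = wcss(groups[0]) + wcss(groups[1])
        if cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_labels


# Made-up data: two tight groups at x = -1 and x = +1, spread out along y.
nice = [(x, y) for x in (-1.0, 1.0) for y in (-1.2, -0.4, 0.4, 1.2)]
# Stretch the y-axis to ten times its length, as in the quote.
stretched = [(x, 10 * y) for x, y in nice]

raw_labels = best_two_clusters(stretched)
scaled_labels = best_two_clusters(standardize(stretched))

print(raw_labels)     # (0, 0, 1, 1, 0, 0, 1, 1): split by the sign of y
print(scaled_labels)  # (0, 0, 0, 0, 1, 1, 1, 1): split by the sign of x
```

On the stretched data the best split cuts the "baguette" across its length (low y versus high y), while after standardization the best split separates the two x groups. Neither result is wrong in itself; which one you want depends on whether the large spread in y is meaningful for your problem.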

edited Sep 5 at 17:37

answered Sep 3 at 8:40

André

4389

PCA, by the way, is one of the algorithms that generally should not be run without normalization, just to highlight the other side of the story.
– André
Sep 3 at 8:42
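As a rough illustration of the PCA remark in the comment above, the sketch below computes the first principal component of a two-feature data set directly from its 2x2 covariance matrix, once on the raw values and once after standardizing. The data and the helper names `first_component` and `zscore` are assumptions made for this example, and the closed-form eigenvector used here only applies to the two-feature case with non-zero covariance.

```python
from math import hypot, sqrt
from statistics import fmean, pstdev


def first_component(xs, ys):
    """Unit-length leading eigenvector of the 2x2 covariance matrix of (xs, ys)."""
    mx, my = fmean(xs), fmean(ys)
    a = fmean([(x - mx) ** 2 for x in xs])                    # var(x)
    c = fmean([(y - my) ** 2 for y in ys])                    # var(y)
    b = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])  # cov(x, y)
    lam = (a + c) / 2 + sqrt(((a - c) / 2) ** 2 + b ** 2)     # largest eigenvalue
    vx, vy = b, lam - a                                       # valid when b != 0
    norm = hypot(vx, vy)
    return vx / norm, vy / norm


def zscore(values):
    """Rescale values to zero mean and unit (population) variance."""
    m, s = fmean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Made-up, strongly correlated features measured in very different units.
small_units = [1.0, 2.0, 3.0, 4.0, 5.0]
large_units = [1100.0, 1900.0, 3200.0, 3900.0, 5100.0]

raw_pc = first_component(small_units, large_units)
std_pc = first_component(zscore(small_units), zscore(large_units))

print("raw:         ", raw_pc)
print("standardized:", std_pc)
```

On the raw values the first component points almost entirely along the large-unit feature, simply because its variance is numerically much bigger. After standardization both features get equal loadings of about 0.707, so the component reflects their correlation rather than their units.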
