Dropped 2 Categories in Dummy Variables (Logistic Regression)

I understand that a categorical variable with k categories should enter a model as k-1 dummy variables, with the omitted category serving as the baseline. What I do not know is how to interpret the coefficients if feature selection then removes some of those dummies. Say the variable has 5 categories: 1 is dropped as the baseline, and feature selection removes the dummies for 2 more.

Should I still interpret the remaining coefficients as usual, using the originally dropped category as the baseline?

Tags: logistic, feature-selection, categorical-encoding

edited 2 hours ago by kjetil b halvorsen
asked 3 hours ago by SuperSaiyan

2 Answers

You should think of the k-1 dummy variables as a "block": during feature selection, either they all stay in the model or they are all eliminated from it. The reason is that the k-1 dummy variables together encode the effect of the original categorical variable that spawned them.
answered 3 hours ago by Isabella Ghement
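
One way to act on the "block" idea is to test all k-1 dummies jointly instead of one at a time. The sketch below is not from the answer itself: it assumes a model whose only predictor is the 5-level categorical variable, uses made-up counts, and relies on the fact that such a model's maximum-likelihood fit reproduces each category's observed rate. The names `counts`, `block_lr_test` and `chi2_sf_even_df` are hypothetical.

```python
import math

# Hypothetical (successes, trials) per category; invented for illustration only.
counts = {"A": (30, 100), "B": (45, 100), "C": (33, 100), "D": (55, 100), "E": (28, 100)}

def binom_loglik(successes, trials, p):
    """Bernoulli log-likelihood of `successes` out of `trials` at probability p."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)

def block_lr_test(counts):
    """Likelihood-ratio statistic for all k-1 dummies versus an intercept-only model."""
    total_s = sum(s for s, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    # Intercept-only model: a single common probability for every category.
    ll_null = binom_loglik(total_s, total_n, total_s / total_n)
    # Intercept plus k-1 dummies: the fit equals each category's own rate.
    ll_full = sum(binom_loglik(s, n, s / n) for s, n in counts.values())
    return 2.0 * (ll_full - ll_null), len(counts) - 1

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    assert df % 2 == 0
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

stat, df = block_lr_test(counts)
p_value = chi2_sf_even_df(stat, df)
print(f"LR statistic = {stat:.2f} on {df} df, p = {p_value:.4g}")
```

A single p-value for the whole block answers "does this categorical variable matter?", which is the keep-all-or-drop-all decision described above. With other covariates in the model, the same comparison would need two fitted models from a regression library rather than this closed form.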

Suppose the 5 categories are A, B, C, D and E, that A is the baseline, and that the dummies for C and E were removed during variable selection.

This means the selection procedure found no statistically significant difference between C or E and A. The fitted model no longer distinguishes A, C and E, so they are in effect combined into one group, and A, C and E together serve as the baseline for B and D.
answered 3 hours ago by a_statistician
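
The pooling effect can be seen directly in the design matrix. The following is a minimal sketch, assuming only the dummies for B and D remain; the coefficient values are invented for illustration, and `design_row` and `predicted_probability` are hypothetical helper names.

```python
import math

CATEGORIES = ["A", "B", "C", "D", "E"]  # A is the original baseline
KEPT_DUMMIES = ["B", "D"]               # dummies for C and E were dropped

def design_row(category, kept=KEPT_DUMMIES):
    """Intercept plus one 0/1 indicator per retained dummy."""
    return [1] + [1 if category == level else 0 for level in kept]

def predicted_probability(category, coefs):
    """Inverse logit of the linear predictor for one category."""
    eta = sum(b * x for b, x in zip(coefs, design_row(category)))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients on the log-odds scale: intercept, B, D.
coefs = [-0.8, 0.6, 1.1]

for c in CATEGORIES:
    print(c, design_row(c), round(predicted_probability(c, coefs), 3))
```

A, C and E all produce the same row, intercept only, so the model assigns them the same predicted probability. The coefficient for B is then the log-odds difference between B and the pooled group {A, C, E}, not between B and A alone.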