Difference Between a Staging and the Production DWH

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP












3















I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.



But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?



From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.










share|improve this question






















  • Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

    – seventyeightist
    Feb 17 at 20:39











  • Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

    – STORM
    Feb 18 at 6:44















3















I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.



But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?



From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.










share|improve this question






















  • Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

    – seventyeightist
    Feb 17 at 20:39











  • Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

    – STORM
    Feb 18 at 6:44













3












3








3








I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.



But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?



From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.










share|improve this question














I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.



But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?



From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.







sql-server data-warehouse






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Feb 17 at 18:13









STORMSTORM

1184




1184












  • Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

    – seventyeightist
    Feb 17 at 20:39











  • Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

    – STORM
    Feb 18 at 6:44

















  • Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

    – seventyeightist
    Feb 17 at 20:39











  • Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

    – STORM
    Feb 18 at 6:44
















Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

– seventyeightist
Feb 17 at 20:39





Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?

– seventyeightist
Feb 17 at 20:39













Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

– STORM
Feb 18 at 6:44





Exactly. I am normally a classical software developer and nowadays more involved in bi topics.

– STORM
Feb 18 at 6:44










3 Answers
3






active

oldest

votes


















2














In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)



The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.



In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:



  • translating raw data into dimensions

  • grouping and aggregating

  • cleaning (e.g. how to deal with missing values)

Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).



In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.






share|improve this answer






























    9














    I think this is a case of the same word being used to describe two different things.



    The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.



    The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.



    Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.



    Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.






    share|improve this answer






























      0














      As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).



      I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system



      Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders



      Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.



      Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.



      This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:



      - 1. Sandbox
      - 2. int
      - 3. staging(UAT/production)
      - 4. Live (production)


      I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.






      share|improve this answer






















        Your Answer








        StackExchange.ready(function()
        var channelOptions =
        tags: "".split(" "),
        id: "182"
        ;
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function()
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled)
        StackExchange.using("snippets", function()
        createEditor();
        );

        else
        createEditor();

        );

        function createEditor()
        StackExchange.prepareEditor(
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader:
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        ,
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        );



        );













        draft saved

        draft discarded


















        StackExchange.ready(
        function ()
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229988%2fdifference-between-a-staging-and-the-production-dwh%23new-answer', 'question_page');

        );

        Post as a guest















        Required, but never shown

























        3 Answers
        3






        active

        oldest

        votes








        3 Answers
        3






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        2














        In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)



        The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.



        In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:



        • translating raw data into dimensions

        • grouping and aggregating

        • cleaning (e.g. how to deal with missing values)

        Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).



        In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.






        share|improve this answer



























          2














          In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)



          The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.



          In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:



          • translating raw data into dimensions

          • grouping and aggregating

          • cleaning (e.g. how to deal with missing values)

          Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).



          In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.






          share|improve this answer

























            2












            2








            2







            In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)



            The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.



            In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:



            • translating raw data into dimensions

            • grouping and aggregating

            • cleaning (e.g. how to deal with missing values)

            Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).



            In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.






            share|improve this answer













            In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)



            The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.



            In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:



            • translating raw data into dimensions

            • grouping and aggregating

            • cleaning (e.g. how to deal with missing values)

            Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).



            In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered Feb 17 at 20:36









            seventyeightistseventyeightist

            93948




            93948























                9














                I think this is a case of the same word being used to describe two different things.



                The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.



                The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.



                Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.



                Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.






                share|improve this answer



























                  9














                  I think this is a case of the same word being used to describe two different things.



                  The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.



                  The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.



                  Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.



                  Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.






                  share|improve this answer

























                    9












                    9








                    9







                    I think this is a case of the same word being used to describe two different things.



                    The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.



                    The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.



                    Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.



                    Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.






                    share|improve this answer













                    I think this is a case of the same word being used to describe two different things.



                    The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.



                    The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.



                    Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.



                    Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.







                    share|improve this answer












                    share|improve this answer



                    share|improve this answer










                    answered Feb 17 at 18:33









                    Mr.BrownstoneMr.Brownstone

                    9,37432342




                    9,37432342





















                        0














                        As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).



                        I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system



                        Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders



                        Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.



                        Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.



                        This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:



                        - 1. Sandbox
                        - 2. int
                        - 3. staging(UAT/production)
                        - 4. Live (production)


                        I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.






                        share|improve this answer



























                          0














                          As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).



                          I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system



                          Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders



                          Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.



                          Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.



                          This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:



                          - 1. Sandbox
                          - 2. int
                          - 3. staging(UAT/production)
                          - 4. Live (production)


                          I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.






                          share|improve this answer

























                            0












                            0








                            0







                            As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).



                            I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system



                            Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders



                            Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.



                            Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.



                            This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:



                            - 1. Sandbox
                            - 2. int
                            - 3. staging(UAT/production)
                            - 4. Live (production)


                            I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.






                            share|improve this answer













                            As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).



                            I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system



                            Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders



                            Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.



                            Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.



                            This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:



                            - 1. Sandbox
                            - 2. int
                            - 3. staging(UAT/production)
                            - 4. Live (production)


                            I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.







                            share|improve this answer












                            share|improve this answer



                            share|improve this answer










                            answered Feb 18 at 10:22









                            Harry SmithHarry Smith

                            1




                            1



























                                draft saved

                                draft discarded
















































                                Thanks for contributing an answer to Database Administrators Stack Exchange!


                                • Please be sure to answer the question. Provide details and share your research!

                                But avoid


                                • Asking for help, clarification, or responding to other answers.

                                • Making statements based on opinion; back them up with references or personal experience.

                                To learn more, see our tips on writing great answers.




                                draft saved


                                draft discarded














                                StackExchange.ready(
                                function ()
                                StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229988%2fdifference-between-a-staging-and-the-production-dwh%23new-answer', 'question_page');

                                );

                                Post as a guest















                                Required, but never shown





















































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown

































                                Required, but never shown














                                Required, but never shown












                                Required, but never shown







                                Required, but never shown






                                Popular posts from this blog

                                How to check contact read email or not when send email to Individual?

                                How many registers does an x86_64 CPU actually have?

                                Nur Jahan