Why does Solidity check only the first 4 bytes of the calldata to determine the method?

I was researching the EVM and Solidity and learned that the calldata (input data) of a call is built using ABI encoding. I know the process and don't want to elaborate on it. My question is: what is the logic or maths behind choosing the first 4 bytes of the data to identify the method? Why not 5? Why not some other number?
For example if the call data is: 0xee919d50000000000000000000000000000000000000000000000000000000000000001
Then why do we take only the first 4 bytes, i.e., ee919d50 as the method id and not more or fewer bytes?
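
For illustration, here is a minimal Python sketch of how calldata like the example above splits into a 4-byte method id and 32-byte argument words. It assumes the example is meant to be the method id `ee919d50` followed by one 32-byte argument with value 1. `split_calldata` is a hypothetical helper name, not part of any library.

```python
def split_calldata(calldata: bytes) -> tuple[bytes, list[bytes]]:
    """Split raw calldata into the 4-byte method id and 32-byte argument words."""
    if len(calldata) < 4:
        raise ValueError("calldata is too short to contain a 4-byte method id")
    selector = calldata[:4]
    body = calldata[4:]
    # Arguments follow the method id in 32-byte words.
    words = [body[i:i + 32] for i in range(0, len(body), 32)]
    return selector, words


# Assumed example: method id ee919d50 followed by a single 32-byte argument equal to 1.
example = bytes.fromhex("ee919d50") + (1).to_bytes(32, "big")
selector, words = split_calldata(example)
print(selector.hex())                    # ee919d50
print(int.from_bytes(words[0], "big"))   # 1
```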



I also read the first answer to the question "How does the EVM find the entry of a called function?", which says that if you implement your own logic, you can use the first 8 bytes of the data instead of the first four. I am interested in the actual reason for selecting the number 4.



I would appreciate it if someone could explain this or point to a resource with a detailed explanation.

solidity blockchain evm

asked Jan 23 at 8:54 by Ashish Mishra

1 Answer

As with any engineering choice, it is a trade-off.



With 4 bytes you can, in theory, address 2^32 = 4,294,967,296 different methods in a single contract.



It is a reasonable choice because, even under the (crazily pessimistic) assumption that 99,999 out of every 100,000 values were lost to collisions, roughly 43,000 unique method entries would still be left for a single contract.



That seems enough for any possible contract, especially given the contract code size limit of 24 KB or so. (Who has ever seen a smart contract with 500 or 1,000 methods? The majority have five to twenty methods and that’s all.)
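
To put rough numbers on the collision risk, the sketch below computes the birthday-style probability that at least two of n methods share a 4-byte id. It assumes the ids behave like independent, uniformly random 32-bit values, which is an idealisation. `collision_probability` is a hypothetical helper name.

```python
import math

SELECTOR_SPACE = 2 ** 32  # number of distinct 4-byte values


def collision_probability(num_methods: int, space: int = SELECTOR_SPACE) -> float:
    """Probability that at least two of num_methods uniformly random ids coincide."""
    if num_methods > space:
        return 1.0  # more methods than ids: a collision is unavoidable
    # log P(no collision) = sum of log(1 - i/space) for i = 0 .. num_methods - 1
    log_no_collision = sum(math.log1p(-i / space) for i in range(num_methods))
    return -math.expm1(log_no_collision)


for n in (5, 20, 500, 1000):
    print(n, collision_probability(n))
```

Under that assumption, 20 methods collide with a probability well below one in ten million, and even 1,000 methods collide with a probability on the order of 1 in 10,000.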



On the other hand, capping the method id at 32 bits (i.e. 4 bytes) is a reasonable choice because it lets the method lookup table be addressed efficiently on most of the CPUs presumably used to run EVM nodes today.
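
A lookup of this kind can be sketched as a table keyed by the 4-byte id read as a single 32-bit integer. This is only an illustration in Python: the handler functions and the table contents are made up, and it does not claim to show how any particular EVM client or the Solidity compiler implements dispatch.

```python
from typing import Callable, Dict


# Hypothetical handlers standing in for contract methods.
def set_value(args: bytes) -> str:
    return f"set_value({int.from_bytes(args[:32], 'big')})"


def get_value(args: bytes) -> str:
    return "get_value()"


# Table keyed by the method id as a 32-bit integer; the ids are illustrative.
DISPATCH_TABLE: Dict[int, Callable[[bytes], str]] = {
    0xEE919D50: set_value,
    0x12345678: get_value,
}


def dispatch(calldata: bytes) -> str:
    """Route calldata to a handler using its first 4 bytes as a 32-bit key."""
    if len(calldata) < 4:
        raise ValueError("calldata is too short to contain a method id")
    method_id = int.from_bytes(calldata[:4], "big")  # always fits in 32 bits
    handler = DISPATCH_TABLE.get(method_id)
    if handler is None:
        raise KeyError(f"no method with id {method_id:08x}")
    return handler(calldata[4:])


print(dispatch(bytes.fromhex("ee919d50") + (1).to_bytes(32, "big")))  # set_value(1)
```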



Let’s say that 4 bytes (i.e. 32 bits) gives the largest number of entries that is still easy to address on the majority of current CPUs.

answered Jan 23 at 9:39 by Rick Park, edited Jan 23 at 13:11