Understanding how a simple contract breaks into bytecode

I'm trying to understand how contracts look in terms of bytecode and struggling to do that just based on the yellow paper. In particular, consider the following empty contract:



pragma solidity ^0.4.17;

contract Simplest {}



Using the Remix compiler, I'm able to compile into the following bytecode:



6080604052348015600f57600080fd5b50603580601d6000396000f3006080604052600080fd00a165627a7a7230582053a24015f887e1dd9fbd5e5cadb397bb5fb34e8aab7b5782d9e28dfd4e9862810029


Which translates to the following op-codes:



PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1 ISZERO PUSH1 0xF JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH1 0x35 DUP1 PUSH1 0x1D PUSH1 0x0 CODECOPY PUSH1 0x0 RETURN STOP PUSH1 0x80 PUSH1 0x40 MSTORE PUSH1 0x0 DUP1 REVERT STOP LOG1 PUSH6 0x627A7A723058 KECCAK256 MSTORE8 LOG2 BLOCKHASH ISZERO 0xf8 DUP8 0xe1 0xdd SWAP16 0xbd 0x5e 0x5c 0xad 0xb3 SWAP8 0xbb 0x5f 0xb3 0x4e DUP11 0xab PUSH28 0x5782D9E28DFD4E986281002900000000000000000000000000000000


1) What does this represent? Is this meant to be the bytecode for the constructor function?



2) There are several sequences which don't make sense to me, in particular:



ISZERO 0xf8 DUP8 0xe1 0xdd SWAP16


Why are 0xf8, 0xe1, 0xdd following opcodes that don't take any arguments? What is their purpose?



3) Once the contract is compiled, where is the bytecode for the "functions" of the contract?



I suspect the three questions are closely linked, hence asking them together.










    The answer to this question should help you with point 2): ethereum.stackexchange.com/questions/15050/….
    – Briomkez
    Sep 13 at 9:34















remix evm bytecode






asked Sep 13 at 8:46









Peteris

1 Answer

What you've posted is the contract's creation bytecode (also called constructor or init code). When you create a contract, this code runs, handles any constructor arguments or initialization statements, and also "creates" the code of the contract.



The way this is done is through the RETURN opcode: the deployed contract code is whatever the creation code returns. So the bytecode you posted includes the constructor logic as well as the code of the deployed contract itself. It also includes a third section: some metadata (a Solidity feature). Breaking it down, here are the 3 distinct sections:



Constructor bytecode (basically a deployment script):



0x6080604052348015600f57600080fd5b50603580601d6000396000f300


This translates to the following opcodes, which I've annotated a bit:



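// Bracketed numbers are byte offsets into the bytecode (the offset of each instruction's last byte)
// Set free memory pointer (0x40) to 0x80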
[1] PUSH1 0x80
[3] PUSH1 0x40
[4] MSTORE
// Check msg.value
[5] CALLVALUE
[6] DUP1
[7] ISZERO
// If msg.value == 0, JUMP to 0x0F (15)
[9] PUSH1 0x0f
[10] JUMPI
// Otherwise, the next 3 instructions do: revert(0, 0)
[12] PUSH1 0x00
[13] DUP1
[14] REVERT
// Here's 0x0F, which has a corresponding JUMPDEST
[15] JUMPDEST
[16] POP
// Now we need the constructor to set up the deployed bytecode:
// 0x35 (53) is the length of the bytecode to return (the deployed code plus its metadata)
[18] PUSH1 0x35
[19] DUP1
// 0x1D is the location in the bytecode from which the CODECOPY starts (29)
[21] PUSH1 0x1d
// 0x00 is where the code is copied to, in memory
[23] PUSH1 0x00
[24] CODECOPY
// Now we have the bytecode of the contract in memory, starting at 0x00 (and 0x35 bytes long). We DUP1'd the length earlier, so we can just push 0x00 and RETURN. The RETURN opcode will return 0x35 bytes, starting at position 0x00 in memory.
[26] PUSH1 0x00
[27] RETURN
[28] STOP
// A final STOP at 28 (0x1C) marks the end of the constructor bytecode. It is never reached, because RETURN already halts execution. The copied code starts at 0x1D, so the next chunk of bytecode is the deployed code
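To make the split concrete, here is a minimal Python sketch (not part of the original answer) that slices the question's bytecode using the two constants the constructor pushes: 0x1D as the CODECOPY source offset and 0x35 as the length. The variable names are illustrative assumptions.

```python
# Creation bytecode copied from the question (hex, without the 0x prefix).
CREATION_HEX = (
    "6080604052348015600f57600080fd5b50603580601d6000396000f300"
    "6080604052600080fd00"
    "a165627a7a7230582053a24015f887e1dd9fbd5e5cadb397bb5fb34e8aab7b5782d9e28dfd4e9862810029"
)
creation = bytes.fromhex(CREATION_HEX)

# Operands the constructor pushes before CODECOPY and RETURN.
CODE_OFFSET = 0x1D  # where the copied region starts inside the creation code
CODE_LENGTH = 0x35  # how many bytes are copied to memory and then returned

# Everything before the offset is the constructor (deployment script).
constructor = creation[:CODE_OFFSET]
# The copied slice is what ends up stored as the contract's code.
deployed = creation[CODE_OFFSET:CODE_OFFSET + CODE_LENGTH]

print("constructor:", constructor.hex())
print("deployed:   ", deployed.hex())
print("offset + length covers the whole input:",
      CODE_OFFSET + CODE_LENGTH == len(creation))
```

In this example the two slices account for every byte of the input, which is consistent with the constructor copying "everything after itself".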


And now here's the contract bytecode (with metadata at the end):



0x6080604052600080fd00a165627a7a7230582053a24015f887e1dd9fbd5e5cadb397bb5fb34e8aab7b5782d9e28dfd4e9862810029


Again, translated to opcodes:



// Set free memory pointer (0x40) to 0x80
[30] PUSH1 0x80
[32] PUSH1 0x40
[33] MSTORE
// The next 3 opcodes simply push 0 twice, then REVERT(0, 0)
[35] PUSH1 0x00
[36] DUP1
[37] REVERT
// And, a final STOP to mark the end of contract bytecode!
[38] STOP
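// This next bit is confusing when translated directly from opcodes, because it's simply the metadata Solidity appends to the end of bytecode. It's data, not code, and is not meant to be executed, so the "instructions" below are just a disassembler's attempt to read arbitrary bytes
[39] LOG1
[46] PUSH6 0x627a7a723058
// SHA3 is the older name for opcode 0x20, which the question's listing shows as KECCAK256
[47] SHA3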
[48] MSTORE8
[49] LOG2
[50] BLOCKHASH
[51] ISZERO
[52] 'f8'(Unknown Opcode)
[53] DUP8
[54] 'e1'(Unknown Opcode)
[55] 'dd'(Unknown Opcode)
[56] SWAP16
[57] 'bd'(Unknown Opcode)
[58] '5e'(Unknown Opcode)
[59] '5c'(Unknown Opcode)
[60] 'ad'(Unknown Opcode)
[61] 'b3'(Unknown Opcode)
[62] SWAP8
[63] 'bb'(Unknown Opcode)
[64] '5f'(Unknown Opcode)
[65] 'b3'(Unknown Opcode)
[66] '4e'(Unknown Opcode)
[67] DUP11
[68] 'ab'(Unknown Opcode)
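The odd-looking output comes from how a linear disassembler works: it reads one byte as an opcode, and only PUSH1..PUSH32 consume following bytes as an operand. A rough Python sketch of that loop is below. It is not the tool used above; the opcode table is deliberately partial (only what this example needs), bytes outside the table are printed as raw hex, and offsets are the start of each instruction relative to the bytes passed in.

```python
# Partial opcode table: only the single-byte opcodes this example needs.
OPCODES = {
    0x00: "STOP", 0x15: "ISZERO", 0x20: "KECCAK256", 0x34: "CALLVALUE",
    0x39: "CODECOPY", 0x40: "BLOCKHASH", 0x50: "POP", 0x52: "MSTORE",
    0x53: "MSTORE8", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF3: "RETURN", 0xFD: "REVERT",
}


def disassemble(code: bytes):
    """Return a list of (start_offset, text) pairs for a byte string."""
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32: the next n bytes are an operand, not opcodes.
            n = op - 0x5F
            operand = code[pc + 1:pc + 1 + n]
            out.append((pc, f"PUSH{n} 0x{operand.hex()}"))
            pc += 1 + n
            continue
        if 0x80 <= op <= 0x8F:
            text = f"DUP{op - 0x7F}"
        elif 0x90 <= op <= 0x9F:
            text = f"SWAP{op - 0x8F}"
        elif 0xA0 <= op <= 0xA4:
            text = f"LOG{op - 0xA0}"
        else:
            # Anything not in the partial table is shown as a raw byte.
            text = OPCODES.get(op, f"0x{op:02x}")
        out.append((pc, text))
        pc += 1
    return out


runtime_listing = disassemble(bytes.fromhex("6080604052600080fd00"))
metadata_listing = disassemble(bytes.fromhex("a165627a7a72305820"))

for offset, text in runtime_listing + metadata_listing:
    print(offset, text)
```

Fed the first nine metadata bytes, the loop produces LOG1, PUSH6 and KECCAK256, exactly the "instructions" that looked puzzling in the question, even though those bytes were never meant as code.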


Here's more information on contract metadata: https://solidity.readthedocs.io/en/v0.4.24/metadata.html
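As a rough illustration of the layout described on that page, the Python sketch below pulls the metadata apart by hand. It assumes the format used by this compiler version: the last two bytes give the big-endian length of a CBOR payload, and the payload is a one-entry map from the text key "bzzr0" to a 32-byte Swarm hash. The parsing is hard-coded for exactly that shape and is not a general CBOR decoder.

```python
# Deployed bytecode from the answer (runtime code followed by metadata).
deployed = bytes.fromhex(
    "6080604052600080fd00"
    "a165627a7a7230582053a24015f887e1dd9fbd5e5cadb397bb5fb34e8aab7b5782d9e28dfd4e9862810029"
)

# The final two bytes hold the length of the CBOR payload before them.
cbor_len = int.from_bytes(deployed[-2:], "big")
cbor = deployed[-2 - cbor_len:-2]
runtime = deployed[:-2 - cbor_len]

# Hand-decode the specific layout: a1 | 65 "bzzr0" | 58 20 <32 bytes>.
assert cbor[0] == 0xA1          # CBOR map with one entry
key_len = cbor[1] & 0x1F        # 0x65: text string, length in the low 5 bits
key = cbor[2:2 + key_len].decode("ascii")
assert cbor[2 + key_len] == 0x58  # byte string with a one-byte length
value_len = cbor[3 + key_len]
swarm_hash = cbor[4 + key_len:4 + key_len + value_len]

print("runtime code:", runtime.hex())
print("metadata key:", key)
print("hash:", swarm_hash.hex())
```

Seen this way, the bytes that disassembled as LOG1 and PUSH6 are the map header and the ASCII key, and the run of unknown opcodes is the hash itself.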



So, to directly answer your questions:



  1. Yes, this is the contract's creation (constructor) bytecode. Since the constructor needs to return the deployed bytecode, the creation bytecode includes the deployed bytecode as well.


  2. The sequences to which you are referring are from the contract's metadata, and are not meant to be executed.


  3. The contract you compiled doesn't have any functions, so the entirety of the "runtime bytecode" is the 10-byte (7-instruction) section at the start of the deployed bytecode, which sets the free memory pointer and immediately reverts. A typical contract with functions will do something slightly different: it will grab the first 4 bytes of calldata (using CALLDATALOAD), and compare those against a series of function selectors. When it finds a match, it will JUMP to that function's position in the code. If no match is found, it executes the fallback function. If that doesn't exist, it reverts!
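The dispatch logic in point 3 can be modelled loosely in Python. This is only a sketch of the control flow, not how the EVM or the compiler implements it: the `dispatch` function, the handler table and the "revert" exception are all hypothetical, and the two selectors are the commonly seen ERC-20 ones for `transfer(address,uint256)` and `balanceOf(address)`, used purely as sample keys.

```python
def dispatch(calldata: bytes, functions: dict, fallback=None):
    """Mimic the selector check at the top of typical runtime code."""
    if len(calldata) >= 4:
        # The first 4 bytes of calldata are the function selector.
        handler = functions.get(calldata[:4])
        if handler is not None:
            # "JUMP" to the matching function with the remaining calldata.
            return handler(calldata[4:])
    if fallback is not None:
        # No selector matched: run the fallback function if one exists.
        return fallback(calldata)
    # No match and no fallback: the call reverts.
    raise RuntimeError("revert")


# Hypothetical handler table keyed by 4-byte selector.
FUNCTIONS = {
    bytes.fromhex("a9059cbb"): lambda args: "transfer",
    bytes.fromhex("70a08231"): lambda args: "balanceOf",
}

call = bytes.fromhex("a9059cbb") + bytes(64)  # selector + two zeroed arguments
print(dispatch(call, FUNCTIONS))
```

The empty contract from the question corresponds to an empty table with no fallback, which is why its runtime code goes straight to REVERT.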


Hope that cleared it up! Feel free to ask more questions!






    Thanks Alex, that's incredibly helpful!
    – Peteris
    Sep 13 at 9:48






    Are STOPs always used to mark the end of contract code? Can a JUMPI reach past a STOP, and in that case how would the EVM know which STOP designates constructor code vs. contract code?
    – Peteris
    Sep 13 at 10:17






    As far as I'm aware, one STOP marks the end of a constructor, and one marks the end of the contract bytecode (and start of the metadata). They're really just there for demarcation - you can absolutely JUMP past them. Nothing in the EVM will stop that!
    – Alexander Wade
    Sep 13 at 11:49










edited Sep 13 at 14:50









Community♦

answered Sep 13 at 9:42









Alexander Wade
