Filesystem design: necessity of inode number and table [closed]

*nix filesystems maintain an inode table at the beginning of the disk (or at some other fixed location). The table is indexed by the inode number, an integer that uniquely identifies an inode. Given the inode number, the inode can be found quickly. The inode contains pointers (addresses) to other disk blocks, which contain the actual data of the file.



I would like to know whether my approach below to get rid of the inode table and the inode number is efficient:



We still have inodes, but now they are stored in the data region of the disk, and instead of keeping track of the inode number, we just record the disk address or block number of the inode. Whenever we access a file or its inode, we use the disk address to find the inode instead of indexing into the inode table with the inode number. This saves us a layer of indirection.



What is missing in my approach? I would like to understand the rationale behind the inode table.










closed as too broad by Jeff Schaller, Stephen Harris, RalfFriedl, Romeo Ninov, lgeorget Sep 17 at 7:11


Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.














  • Do it then post benchmarks.
    – Ipor Sircer
    Sep 16 at 23:45










  • or see ZFS which afaik does not use inodes
    – thrig
    Sep 16 at 23:53














files filesystems inode














edited Sep 17 at 0:24









Goro











asked Sep 16 at 23:41









flow2k





1 Answer
If I understand you correctly, you want to replace the inode number with the block address. That means (1) one inode per block, which wastes a lot of space (an inode isn't that large), and (2) it's not that different from using an inode number: an inode has a fixed size, so a block contains a known number n of inodes. If you divide the inode number by n (which ideally is a power of two, so the division is just a shift), the quotient, added to the disk address where the inode table starts, is the block number of the inode, and the remainder is the index of the inode inside that block.
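The divide-and-add-offset step described above can be sketched in a few lines of Python. The block size, inode size and table start below are assumed values chosen for illustration, and the numbering is zero-based for simplicity; real filesystems differ in the details.

```python
# All three constants are assumptions for illustration, not values
# taken from any particular filesystem.
BLOCK_SIZE = 4096          # bytes per disk block
INODE_SIZE = 256           # bytes per on-disk inode
INODE_TABLE_START = 8      # block number where the inode table begins

INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE   # the "n" from the answer


def locate_inode(ino):
    """Map a zero-based inode number to (block number, byte offset in block)."""
    # Quotient: which block of the table; remainder: which slot in that block.
    block_in_table, index = divmod(ino, INODES_PER_BLOCK)
    return INODE_TABLE_START + block_in_table, index * INODE_SIZE


print(locate_inode(37))
```

With these assumed values a block holds 16 inodes, so inode 37 sits in the third block of the table (block 10 on disk) at slot 5. The lookup is pure arithmetic and needs no extra disk read.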



To understand the rationale behind the inode table, think about what data is stored in the inode table: It's attributes like owner, group, permissions and timestamps, and indices and indirect indices of the data blocks. You have to store those somewhere, and you can't store them together with the file data.
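As a rough illustration of what such a record holds, here is a minimal Python sketch of an inode with attributes, direct block indices and one indirect block. The field names, the number of direct pointers and the `read_pointers` helper are all hypothetical, not the layout of any real filesystem.

```python
from dataclasses import dataclass, field
from typing import Callable, List

NUM_DIRECT = 12  # assumed number of direct block pointers


@dataclass
class Inode:
    owner: int
    group: int
    mode: int          # permission bits
    mtime: int         # modification timestamp
    size: int = 0
    direct: List[int] = field(default_factory=list)  # block numbers of the first data blocks
    indirect: int = 0  # block number of a block that holds further block numbers


def lookup_block(inode: Inode, n: int, read_pointers: Callable[[int], List[int]]) -> int:
    """Return the disk block number of the file's n-th data block."""
    if n < NUM_DIRECT:
        return inode.direct[n]
    # read_pointers stands in for reading the indirect block from disk.
    return read_pointers(inode.indirect)[n - NUM_DIRECT]
```

Note that none of these fields is file content, which is why they need a home separate from the data blocks.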



So if you want to design your own filesystem, the first question you have to answer is "how do I identify the data blocks that belong to a file?" and the second question is "where do I store attributes like ownership, permissions and timestamps?". And yes, you can use different schemes for that than inodes.



Edit



As for




why not just use its address, like we do with main memory and objects therein?




As I wrote, you basically already have the block address: you just have to divide first and then add an offset. If you instead fold the offset into every inode number up front, the "inode number" becomes much larger, and you end up with a constant value in the high bits that is repeated in every inode number. This in turn makes each directory entry larger.



Don't forget that the Unix filesystem was invented when hard disk sizes were around 20 Mbytes or so. You don't want to waste space, so you pack everything densely, and you avoid redundancy. Adding an offset every time you access an inode is cheap. Storing this offset as part of every "inode number" reference is expensive.
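To put rough numbers on the space argument, the following sketch compares how many bits a reference needs under each scheme. The 20 MiB disk size follows the figure mentioned above; the inode count is an assumed value for illustration.

```python
def bits_needed(count):
    """Bits required to give each of `count` items a distinct number."""
    return max(1, (count - 1).bit_length())


DISK_BYTES = 20 * 2**20   # a 20 MiB disk
MAX_INODES = 4096         # assumed number of slots in the inode table

inode_number_bits = bits_needed(MAX_INODES)   # index into the inode table
byte_address_bits = bits_needed(DISK_BYTES)   # raw byte address on the disk

print(inode_number_bits, byte_address_bits)
```

Under these assumptions an inode number fits in 12 bits while a raw byte address needs 25, and that difference is paid again in every directory entry that refers to an inode.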



And the interesting thing is that even though the inode scheme was invented for hard disks that are small by today's standards, it scales well, and even on hard disks in the terabyte range it "just works".




























  • Correct, I proposed replacing the inode number with the block address. I understand the part where the inode is the structure that contains info on which data blocks a file has and its metadata/attributes - I'm not proposing to get rid of the inode. The point I struggled with was why we needed to have a number for each inode - why not just use its address, like we do with main memory and objects therein? If we use addresses for inodes, we don't have to bother storing them consecutively in a table at the beginning of the disk.
    – flow2k
    Sep 17 at 6:13











  • Yes, the point you're making about the size of the inode makes sense - it would be wasteful to assign one block address to each inode. Pondering this more, I think spatial locality may be another factor - we can keep the inode table, or part of it, in main memory, which can speed up operations like directory listing, e.g. ls ~.
    – flow2k
    Sep 17 at 6:16










  • Understood your edit - the offset added is the address of the inode table on disk. So another reason is that a block address takes more bits to store than the inode number. Thanks!
    – flow2k
    Sep 17 at 6:41

























edited Sep 17 at 6:24

























answered Sep 17 at 5:58









dirkt













