How can the sha1sum function give you a unique hash? [closed]

up vote
4
down vote

favorite

I read that using the sha1sum command will run the SHA-1 algorithm and give you a "unique" result, but how can that be?

sha1sum gives you a 40 character hash, like this:

e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e

This is 40 characters long, and has hexadecimal characters, so the hash can have at most 16⁴⁰ combinations.

This should mean that, for any two random files, there is a small chance that their hash sum would be exactly the same, right?

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

closed as off-topic by dr01, schily, muru, Isaac, msp9011 Aug 22 at 4:35

This question does not appear to be about Unix or Linux within the scope defined in the help center.

If this question can be reworded to fit the rules in the help center, please edit the question.

1

See also a couple of related questions on crypt.SE: Is it fair to assume that SHA1 collisions won't occur on a set of <100k strings, How secure is SHA1? What are the chances of a real exploit?, Why is SHA-1 considered broken?
â€“Â ilkkachu
Aug 21 at 9:29

8

This does not seem to be related to Unix, but to mathematics.
â€“Â Kusalananda
Aug 21 at 9:34

6

I'm voting to close this question as off-topic because, even if it mentions the sha1sum command, it has nothing to do with Unix&Linux; it's a generic question about hash functions.
â€“Â dr01
Aug 21 at 12:50

add a commentÂ |Â

up vote
4
down vote

favorite

I read that using the sha1sum command will run the SHA-1 algorithm and give you a "unique" result, but how can that be?

sha1sum gives you a 40 character hash, like this:

e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e

This is 40 characters long, and has hexadecimal characters, so the hash can have at most 16⁴⁰ combinations.

This should mean that, for any two random files, there is a small chance that their hash sum would be exactly the same, right?

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

closed as off-topic by dr01, schily, muru, Isaac, msp9011 Aug 22 at 4:35

This question does not appear to be about Unix or Linux within the scope defined in the help center.

If this question can be reworded to fit the rules in the help center, please edit the question.

1

See also a couple of related questions on crypt.SE: Is it fair to assume that SHA1 collisions won't occur on a set of <100k strings, How secure is SHA1? What are the chances of a real exploit?, Why is SHA-1 considered broken?
â€“Â ilkkachu
Aug 21 at 9:29

8

This does not seem to be related to Unix, but to mathematics.
â€“Â Kusalananda
Aug 21 at 9:34

6

I'm voting to close this question as off-topic because, even if it mentions the sha1sum command, it has nothing to do with Unix&Linux; it's a generic question about hash functions.
â€“Â dr01
Aug 21 at 12:50

add a commentÂ |Â

up vote
4
down vote

favorite

I read that using the sha1sum command will run the SHA-1 algorithm and give you a "unique" result, but how can that be?

sha1sum gives you a 40 character hash, like this:

e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e

This is 40 characters long, and has hexadecimal characters, so the hash can have at most 16⁴⁰ combinations.

This should mean that, for any two random files, there is a small chance that their hash sum would be exactly the same, right?

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

I read that using the sha1sum command will run the SHA-1 algorithm and give you a "unique" result, but how can that be?

sha1sum gives you a 40 character hash, like this:

e5fa44f2b31c1fb553b6021e7360d07d5d91ff5e

This is 40 characters long, and has hexadecimal characters, so the hash can have at most 16⁴⁰ combinations.

This should mean that, for any two random files, there is a small chance that their hash sum would be exactly the same, right?

sha1sum

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

edited Aug 21 at 9:05

Stephen Kitt

146k22320386

asked Aug 21 at 8:53

iamAguest

985

asked Aug 21 at 8:53

iamAguest

985

asked Aug 21 at 8:53

iamAguest

985

closed as off-topic by dr01, schily, muru, Isaac, msp9011 Aug 22 at 4:35

This question does not appear to be about Unix or Linux within the scope defined in the help center.

If this question can be reworded to fit the rules in the help center, please edit the question.

closed as off-topic by dr01, schily, muru, Isaac, msp9011 Aug 22 at 4:35

This question does not appear to be about Unix or Linux within the scope defined in the help center.

If this question can be reworded to fit the rules in the help center, please edit the question.

1

See also a couple of related questions on crypt.SE: Is it fair to assume that SHA1 collisions won't occur on a set of <100k strings, How secure is SHA1? What are the chances of a real exploit?, Why is SHA-1 considered broken?
â€“Â ilkkachu
Aug 21 at 9:29

8

This does not seem to be related to Unix, but to mathematics.
â€“Â Kusalananda
Aug 21 at 9:34

6

I'm voting to close this question as off-topic because, even if it mentions the sha1sum command, it has nothing to do with Unix&Linux; it's a generic question about hash functions.
â€“Â dr01
Aug 21 at 12:50

add a commentÂ |Â

1

See also a couple of related questions on crypt.SE: Is it fair to assume that SHA1 collisions won't occur on a set of <100k strings, How secure is SHA1? What are the chances of a real exploit?, Why is SHA-1 considered broken?
â€“Â ilkkachu
Aug 21 at 9:29

8

This does not seem to be related to Unix, but to mathematics.
â€“Â Kusalananda
Aug 21 at 9:34

6

I'm voting to close this question as off-topic because, even if it mentions the sha1sum command, it has nothing to do with Unix&Linux; it's a generic question about hash functions.
â€“Â dr01
Aug 21 at 12:50

See also a couple of related questions on crypt.SE: Is it fair to assume that SHA1 collisions won't occur on a set of <100k strings, How secure is SHA1? What are the chances of a real exploit?, Why is SHA-1 considered broken?
â€“Â ilkkachu
Aug 21 at 9:29

This does not seem to be related to Unix, but to mathematics.
â€“Â Kusalananda
Aug 21 at 9:34

I'm voting to close this question as off-topic because, even if it mentions the sha1sum command, it has nothing to do with Unix&Linux; it's a generic question about hash functions.
â€“Â dr01
Aug 21 at 12:50

add a commentÂ |Â

1 Answer
1

active

oldest

votes

up vote
10
down vote

accepted

A SHA-1 hash, like any hash, isnÃ¢Â€Â™t unique; in fact collisions have been found (and more importantly, crafted).

The desirable property of a hash function is that itÃ¢Â€Â™s difficult to construct two files which produce the same hash, not that the hashes produced are unique (as you mention, a SHA-1 sum has 160 bits, so there are only 2¹⁶⁰ different hashes); difficult meaning that you canÃ¢Â€Â™t find a collision faster than brute-force search. If you pick one file, another randomly-chosen file has a one-in-2¹⁶⁰ chance of having the same hash. SHA-1 is considered insecure since 2005; youÃ¢Â€Â™ll see the odds given there as one-in-2⁸⁰, thanks to the birthday attack (the odds there are those of finding two colliding files in a large haystack, not those of finding a colliding file for a specific target).

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

3

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

1

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

add a commentÂ |Â

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
10
down vote

accepted

A SHA-1 hash, like any hash, isnÃ¢Â€Â™t unique; in fact collisions have been found (and more importantly, crafted).

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

3

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

1

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

add a commentÂ |Â

up vote
10
down vote

accepted

A SHA-1 hash, like any hash, isnÃ¢Â€Â™t unique; in fact collisions have been found (and more importantly, crafted).

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

3

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

1

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

add a commentÂ |Â

up vote
10
down vote

accepted

A SHA-1 hash, like any hash, isnÃ¢Â€Â™t unique; in fact collisions have been found (and more importantly, crafted).

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

A SHA-1 hash, like any hash, isnÃ¢Â€Â™t unique; in fact collisions have been found (and more importantly, crafted).

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

edited Aug 21 at 9:44

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

answered Aug 21 at 8:55

Stephen Kitt

146k22320386

3

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

1

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

add a commentÂ |Â

3

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

1

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

Two files have a 2^-160 chance of colliding, but having 2^80 files to test against each other gives about a 50 % chance of a collision, which is why the bit-strength of a hash is usually divided by two to get the strength against collisions. (It's not a question of having a collision between the two files Stephen has, but having one within ~all files, ever.)
â€“Â ilkkachu
Aug 21 at 9:18

2^80 is about 10^24, which is about the mass of the Earth in kilograms, just to give the number some scale. Random collisions are astronomically improbable, it's the brokenness of the hash you need to care about.
â€“Â ilkkachu
Aug 21 at 9:23

Ah yes, thanks @ilkkachu, itÃ¢Â€Â™s really about the average number of operations required to find a collision (statistically), correct? So the basic odds are 1:2^160, but you donÃ¢Â€Â™t need that many operations to find a collision because the start point isnÃ¢Â€Â™t fixed. As in, if you pick one file, then another, thereÃ¢Â€Â™s a 1:2^160 chance theyÃ¢Â€Â™ll collide, but thatÃ¢Â€Â™s not the same as finding a collision in a 2^80-sized haystack.
â€“Â Stephen Kitt
Aug 21 at 9:30

@StephenKitt, yea, exactly. Collisions are the case where you don't care what the colliding files are, so the birthday attack applies, hence 2^-80. Preimage resistance is different, that's where you're trying to find another file that gives the same hash as some single particular file, so the odds are 2^-160 for each file.
â€“Â ilkkachu
Aug 21 at 9:39

add a commentÂ |Â

搜尋此網誌

mjhjmtu