Uncompressed file estimation wrong?

I had a large (~60 GB) compressed file (tar.gz).

I used split to break it into 4 parts and then cat to join them back together.

However, when I now try to estimate the size of the uncompressed file, gzip reports it as smaller than the compressed file. How is this possible?

$ gzip -l myfile.tar.gz
 compressed  uncompressed     ratio  uncompressed_name
60680003101    3985780736  -1422.4%  myfile.tar









  • Is split really relevant to this? Do you only have the problem after splitting and joining them back together?
    – Barmar
    Sep 28 at 15:43














Tags: compression, gzip, split






edited Sep 28 at 15:42 by Barmar

asked Sep 28 at 9:23 by pkaramol











1 Answer (accepted)

This is caused by the width of the field that stores the uncompressed size in gzip files: it is only 32 bits, so gzip can only record sizes below 4 GiB. Larger files are still compressed and decompressed correctly, but gzip -l reports an incorrect uncompressed size for them.
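
To see how a 32-bit field leads to the numbers in the question, here is a small Python sketch of the arithmetic. The constants are copied from the gzip -l output above; the function names are illustrative only, and the ratio formula is a simplification that ignores the few header bytes gzip subtracts. Note that the real size cannot be recovered from the listing alone: every candidate below would be stored as the same value.

```python
# Values copied from the `gzip -l` output in the question.
COMPRESSED = 60_680_003_101
REPORTED = 3_985_780_736
MODULUS = 2 ** 32  # the size field is 32 bits wide


def stored_size(true_size):
    """Return what a 32-bit size field holds for a given true size."""
    return true_size % MODULUS


def candidate_true_sizes(reported, count):
    """List the first `count` true sizes that would all be stored as `reported`."""
    return [reported + k * MODULUS for k in range(count)]


def listed_ratio(compressed, uncompressed):
    """Approximate listed ratio in percent: 100 * (1 - compressed / uncompressed)."""
    return 100.0 * (1.0 - compressed / uncompressed)


if __name__ == "__main__":
    # Every candidate wraps around to the same stored value.
    for size in candidate_true_sizes(REPORTED, 4):
        print(size, "->", stored_size(size))
    # A wrapped (too small) size makes the ratio come out negative.
    print(f"{listed_ratio(COMPRESSED, REPORTED):.1f}%")
```

The negative ratio is simply a consequence of dividing the real compressed size by the wrapped-around uncompressed size.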



Splitting the tarball and joining it back together did not cause this, and should not have affected the file. If you want to make sure, you can test the archive's integrity with gzip -tv myfile.tar.gz.
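
If you also want the real uncompressed size, one option is to decompress the whole stream and count the bytes. The following is a minimal Python sketch using the standard-library gzip module; the function name check_and_measure and the chunk size are my own choices for illustration. Reading to the end also makes the module verify the stored checksum, so damaged data should surface as an exception.

```python
import gzip


def check_and_measure(path, chunk_size=1 << 20):
    """Decompress `path` chunk by chunk and return the uncompressed byte count.

    Nothing is written to disk. Corrupt or truncated input is expected to
    raise an exception (such as OSError, EOFError or zlib.error) while reading.
    """
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # end of the compressed stream
                break
            total += len(chunk)
    return total
```

On the command line, gzip -dc myfile.tar.gz | wc -c does much the same thing. Either way the whole archive has to be read, so expect this to take a while on a 60 GB file.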



For more details, see the question "Fastest way of working out uncompressed size of large GZIPPED file" and the gzip manual, which says:




The gzip format represents the input size modulo 2³², so the uncompressed size and compression ratio are listed incorrectly for uncompressed files 4 GiB and larger.
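
The value the manual describes is the ISIZE field, which the gzip format (RFC 1952) places in the last four bytes of a gzip member as a little-endian unsigned integer. As a rough sketch, assuming a file with a single gzip member, it can be read directly in Python; read_isize is a hypothetical helper name, not part of any library.

```python
import os
import struct


def read_isize(path):
    """Return the 32-bit ISIZE value stored in the last 4 bytes of a gzip file.

    This is the uncompressed size modulo 2**32, so it is only the true size
    when the original data was smaller than 4 GiB.
    """
    with open(path, "rb") as f:
        f.seek(-4, os.SEEK_END)  # position 4 bytes before the end of the file
        (isize,) = struct.unpack("<I", f.read(4))  # little-endian unsigned 32-bit
    return isize
```

For the archive in the question this would presumably return 3985780736, the same wrapped-around number that gzip -l printed.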







  • So, actual content can still be intact, right?
    – Ruslan
    Sep 28 at 12:24










  • @Ruslan yes, the size displayed is wrong, but the contents are fine.
    – Stephen Kitt
    Sep 28 at 12:25










  • +1 I was gonna guess it was UINT32 error or something like that.
    – mathreadler
    Sep 28 at 16:11











edited Sep 28 at 9:37

answered Sep 28 at 9:28 by Stephen Kitt










