What is the quickest way to count the lines in a 4 TB file on Linux?

I have a 4 TB text file, exported from Teradata records, and I want to know how many records there are in that file.

Tags: text-processing cat wc

asked Mar 7 at 10:53 by Santosh Garole, edited Mar 7 at 10:56 by Jeff Schaller

  • Is each line a record? If yes, you can just use wc -l.

    – Panki
    Mar 7 at 10:56

  • This doesn’t answer the stated question, but the fastest way would be to ask your Teradata system.

    – Stephen Kitt
    Mar 7 at 11:08

  • If the export happened to put a comment at the top, that'd make it pretty fast to find.

    – Jeff Schaller
    Mar 7 at 11:21

  • I tried using vim -R filename; it took around 1.5 hours.

    – Santosh Garole
    Mar 8 at 7:45

2 Answers

Answer 1 (2 votes), answered Mar 7 at 10:58 by Kusalananda, edited Mar 7 at 11:09:

If this information is not already present as metadata in a separate file (or embedded in the data, or available through a query to the system that you exported the data from), and if there is no index file of some sort available, then the quickest way to count the number of lines is by running wc -l on the file.

You cannot really do it any quicker.

To count the number of records in the file, you will have to know which record separator is in use and use something like awk to count the records. Again, that is assuming this information is not already stored elsewhere as metadata, is not available through a query to the originating system, and the records are not already enumerated and sorted within the file.
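
For example (file name hypothetical), counting lines is simply:

    wc -l /path/to/export.txt

And if the records use a multi-character separator rather than newlines, here is a minimal GNU awk sketch, assuming purely for illustration a literal "##" separator (substitute whatever separator your export actually uses; multi-character RS is a GNU awk extension):

    # Count records delimited by "##" (hypothetical separator)
    awk 'BEGIN { RS = "##" } END { print NR }' /path/to/export.txt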






Answer 2 (0 votes), answered Mar 7 at 12:16 by chaos, edited Mar 7 at 12:22:

You should not use line-based utilities such as awk and sed. These utilities issue a read() system call for every line of the input file (see that answer on why this is so). If you have lots of lines, this will be a huge performance loss.

Since your file is 4 TB in size, I guess that it contains a great many lines. So even wc -l will produce a lot of read() system calls, since it reads only 16384 bytes per call (on my system). Still, this would be an improvement over awk and sed. The best method - unless you write your own program - might be just

    cat file | wc -l

This is not a useless use of cat, because cat reads chunks of 131072 bytes per read() system call (on my system); wc -l will issue more read() calls, but on the pipe rather than on the file directly. Either way, cat tries to read as much as possible per system call.
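
If you want to verify the read() sizes and counts on your own system, strace can summarise the calls (a sketch; file name hypothetical, and the numbers will differ between systems):

    # Summarise how many read() calls each approach makes
    strace -c -e trace=read wc -l /path/to/export.txt
    strace -c -e trace=read cat /path/to/export.txt > /dev/null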






  • Won't an I/O redirect be faster than cat and a pipe?

    – RoVo
    Mar 7 at 12:22

  • @RoVo Could be, have you tried it?

    – chaos
    Mar 7 at 12:25

  • Short test with 10 iterations of wc -l on a 701 MB file: wc -l file 1.7 s ;; wc -l < file 1.7 s ;; cat file | wc -l 2.6 s.

    – RoVo
    Mar 7 at 12:28
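
A sketch of how such a comparison can be reproduced (file name hypothetical; timings depend heavily on the page cache, so repeat the runs):

    time wc -l /path/to/export.txt
    time wc -l < /path/to/export.txt
    time cat /path/to/export.txt | wc -l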











