shell - extracting date & time from logs

I have web server log files that look like this:



2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:07:49:01 +0100] "GET / HTTP/1.1" 200 58266 "-" "curl/7.61.1"
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:08:49:01 +0100] "GET / HTTP/1.1" 200 58341 "-" "curl/7.61.1"
2001:67c:1220:808::93e5:8ad - - [22/Feb/2019:08:56:10 +0100] "POST /wp-cron.php?doing_wp_cron=1550822170.2184400558471679687500 HTTP/1.1" 200 3279 "https://ios-example.com/wp-cron.php?doing_wp_cron=1550822170.2184400558471679687500" "WordPress/4.9.9; https://ios-example.com"
...


I need to extract dates and times in this format 22/Feb/2019:07:49:01.



This is what I have now (shamelessly copied from this thread: extracting date field from the lines):



file="filename"
while IFS= read -r line
do
echo "`cut -d '[' -f2 $line | cut -d ' ' -f1`" # echoing now for testing purposes
done <"$file"


And this is the output when I run the script:



cut: '2001:67c:1220:80c:d4:985a:df2c:d717': Adresář nebo soubor neexistuje
cut: '[22/Feb/2019:07:49:01': Adresář nebo soubor neexistuje
cut: +0100]: Adresář nebo soubor neexistuje
cut: '"GET': Adresář nebo soubor neexistuje
cut: /: je adresářem
cut: 'HTTP/1.1"': Adresář nebo soubor neexistuje
cut: 200: Adresář nebo soubor neexistuje
cut: 58266: Adresář nebo soubor neexistuje
cut: '"-"': Adresář nebo soubor neexistuje
cut: '"curl/7.61.1"': Adresář nebo soubor neexistuje
22/Feb/2019:08:49:01
22/Feb/2019:08:56:10
22/Feb/2019:08:56:10
22/Feb/2019:09:24:33
22/Feb/2019:09:24:33
22/Feb/2019:09:43:13
22/Feb/2019:09:43:24
...


"Adresář nebo soubor neexistuje" means "No such file or directory" (literally "Directory or file does not exist"), and "je adresářem" means "Is a directory".

For a reason unknown to me, it does not work on the first line of the log file, but works fine with the rest of the file.

text-processing logs

edited Mar 16 at 18:14 by Kusalananda
asked Mar 16 at 15:12 by stitch123

2 Answers

You made multiple mistakes:

• cut takes file names as its arguments, so the contents of $line were treated as files to open
• you forgot some double quotes ( " )

So I rewrote your example with a minimal number of changes:

• the use of $( ... ) instead of ` ... `. This is more robust and can be nested.
• the use of ${VARIABLE_NAME} instead of $VARIABLE_NAME. This is more robust.

A new version of your script:

file="filename"
while IFS= read -r line
do
    EXTRACT_DATE=$( echo "${line}" | cut -d '[' -f2 | cut -d ' ' -f1 )
    echo "${EXTRACT_DATE}"
done <"$file"
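
To see why the original loop failed, it may help to count how many arguments cut receives. This is a minimal sketch: count_args is a hypothetical helper, and the sample value is the first log line from the question. Unquoted, $line is split on whitespace into twelve words, and cut tries to open each one as a file. The two bare - words are read as standard input, which inside the loop is the rest of the log file. That is most likely why the first line produced errors while the remaining lines appeared to work.

```shell
#!/bin/sh
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

line='2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:07:49:01 +0100] "GET / HTTP/1.1" 200 58266 "-" "curl/7.61.1"'

# Unquoted: the shell splits the value on whitespace before running the command.
unquoted=$(count_args $line)

# Quoted: the whole line is passed as a single argument.
quoted=$(count_args "$line")

printf 'unquoted: %s arguments\n' "$unquoted"
printf 'quoted:   %s argument\n' "$quoted"
```

Quoting alone would not have fixed the original script, since cut would then look for one file named after the whole line. The line has to reach cut on standard input, as the rewritten loop does.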






answered Mar 16 at 15:37 by EchoMike444

• This fixed the issue, thank you very much for a quick answer!
  – stitch123 Mar 16 at 15:42

• There is absolutely no difference between ${var} and $var. Both mean exactly the same thing. The important thing is the double quoting of the variable expansion. Your code uses both variations of ${var} and $var for no good reason. The only place where you need ${var} is when the expansion is part of a string and the very next character is a character that is valid in a variable name, as in "${var}x".
  – Kusalananda Mar 16 at 15:42
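
A short sketch of the point made in the comment above: braces do not change how a variable expands, and they only matter when the next character could be part of the name. The variable names and values here are illustrative.

```shell
#!/bin/sh
var='access'
unset varx   # make sure no variable named varx exists for this demonstration

# Both forms expand to the same value.
plain="$var"
braced="${var}"

# Braces are needed when the following character is valid in a variable name.
with_braces="${var}x"    # expands var, then appends a literal x
without_braces="$varx"   # expands the unset variable varx, giving an empty string

printf 'plain=%s braced=%s\n' "$plain" "$braced"
printf 'with_braces=%s without_braces=%s\n' "$with_braces" "$without_braces"
```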

















The main issue, which causes the errors, is that you are using the line read into $line as a file name for cut to read.

You are also using echo to output the result of a command substitution. This is an anti-pattern. Just run the pipeline, without echo or command substitution. It will write its result to the terminal by itself.

Here, we use printf to give cut the line read from the file:

file="filename"

while IFS= read -r line; do
    printf '%s\n' "$line" | cut -d '[' -f2 | cut -d ' ' -f1
done <"$file"
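
One reason to prefer printf over echo for passing data along, sketched here with a made-up input value: echo may treat a leading -n as an option or, in some shells, interpret backslash escapes, whereas printf '%s\n' prints its argument unchanged.

```shell
#!/bin/sh
# A made-up input that echo implementations handle differently.
line='-n'

# printf with a %s format prints the data verbatim, followed by a newline.
safe=$(printf '%s\n' "$line")

# echo may treat -n as an option and print nothing; the result depends on the shell.
risky=$(echo "$line")

printf 'printf gave: [%s]\n' "$safe"
printf 'echo gave:   [%s]\n' "$risky"
```

Web server log lines are unlikely to start with -n, so this is about habit rather than a bug in this particular script.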


The next thing to note is that the while loop is totally unnecessary. It starts two cut processes for every line in the log file, while the cut utility is perfectly capable of reading the file line by line by itself:

file="filename"

cut -d '[' -f2 "$file" | cut -d ' ' -f1
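
The loop-free pipeline can be tried on a small sample. This sketch writes two of the log lines from the question to a temporary file (the use of mktemp and the variable names are assumptions made for the example) and runs the same two cut commands on it.

```shell
#!/bin/sh
# Create a small sample log in a temporary file (two lines taken from the question).
log=$(mktemp)
cat >"$log" <<'EOF'
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:07:49:01 +0100] "GET / HTTP/1.1" 200 58266 "-" "curl/7.61.1"
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:08:49:01 +0100] "GET / HTTP/1.1" 200 58341 "-" "curl/7.61.1"
EOF

# First cut: split on [ and keep field 2, which starts with the timestamp.
# Second cut: split on spaces and keep field 1, dropping the time zone and the rest.
dates=$(cut -d '[' -f2 "$log" | cut -d ' ' -f1)

printf '%s\n' "$dates"
# prints:
# 22/Feb/2019:07:49:01
# 22/Feb/2019:08:49:01

rm -f "$log"
```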


Or, you could use GNU grep:

grep -oP '(?<=\[)[^ ]+' "$file"

(This extracts everything up to the first space after the first [, assuming the line contains no other [. The -P option needs a grep built with PCRE support.)

or standard sed,

sed 's/].*//; s/.*\[//; s/ .*//' "$file"

(This deletes everything from the first ] onward, then deletes everything up to and including the [, then chops off the first space and the rest after that)
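
The three sed substitutions can be followed one at a time. This sketch applies each step separately to the first log line from the question; the step1, step2 and step3 variable names are only for illustration.

```shell
#!/bin/sh
line='2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:07:49:01 +0100] "GET / HTTP/1.1" 200 58266 "-" "curl/7.61.1"'

# Step 1: delete from the first ] to the end of the line.
step1=$(printf '%s\n' "$line" | sed 's/].*//')

# Step 2: delete everything up to and including the [.
step2=$(printf '%s\n' "$step1" | sed 's/.*\[//')

# Step 3: delete the first space and everything after it.
step3=$(printf '%s\n' "$step2" | sed 's/ .*//')

printf 'after step 1: %s\n' "$step1"
printf 'after step 2: %s\n' "$step2"
printf 'after step 3: %s\n' "$step3"
```

Running the steps in this order matters: once everything from the ] onward is gone, only one [ is left on the line, so the greedy .*\[ cannot overshoot.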



Related:

• Why is using a shell loop to process text considered bad practice?

edited Mar 16 at 18:10
answered Mar 16 at 17:44 by Kusalananda