Why could `tail -f data_log | grep keyword` within a tmux session lead to hard disk exhaustion?

Yesterday I needed to investigate an API bug, so I logged into the log server and opened a tmux session so that I could reconnect to my work later.

I ran `tail -f data_log | grep keyword` to debug, but I couldn't work the problem out at that moment. I decided to keep the tmux session for later and closed the terminal pane.

Today my colleague told me that my tmux session, with `tail -f data_log | grep keyword` still running, had caused hard disk exhaustion on that log server. I feel ashamed and confused.

As I understand it, `tail -f` has its own stdout file descriptor and writes the newly added content of `data_log` to the terminal screen.

Can this stdout file descriptor receive an infinite amount of data?

Where does this file descriptor store this large amount of data? Is there a real file that stores it?

Does tmux have anything to do with this issue?

If tmux has nothing to do with this issue: if I opened a terminal running `tail -f my_log` and used crontab to append 1 byte to `my_log` per second, would 2 bytes be stored on my disk every second (1 for tail and 1 for the crontab task)?

edited Jun 6 at 7:45 by ctrl-alt-delor
asked Jun 6 at 3:55 by Zen

          1 Answer

It's possible that:

1. `data_log` gets a huge amount of data written into it each day.
2. It is rotated, possibly by logrotate. Rotation usually involves at least renaming the file, followed by compression and deletion of the uncompressed log.
3. `tail -f` (GNU at least, likely others as well) by default keeps reading the old file even after it has been moved or deleted. If a file is deleted while a program still holds an open file handle to it, Linux keeps the data on disk, and the space is not released until the last handle is closed.
4. This means that log rotation does not free up disk space as it should: the compressed log and the deleted but still open uncompressed log both take up space.

Do this long enough and it's possible your server could run out of space despite measures like log rotation, or attempts by others to manually delete the logs.
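
The behaviour described in point 3 can probably be observed on a scratch directory. The following is only a minimal sketch, assuming a Linux system with `/proc` and a `tail` that follows by descriptor; the names `workdir`, `log`, `tail_pid` and `held` are made up for the illustration and have nothing to do with the real log server.

```shell
#!/bin/sh
# Hypothetical demo names: workdir, log, tail_pid, held. Linux-only (uses /proc).
workdir=$(mktemp -d)
log="$workdir/data_log"
echo "first line" > "$log"

# Follow the file by descriptor, as plain `tail -f` does.
tail -f "$log" > /dev/null &
tail_pid=$!
sleep 1   # give tail time to open the file

# Simulate the "delete the uncompressed log" step of a rotation.
rm "$log"

# The name is gone, but tail's descriptor still refers to the deleted file.
held=$(ls -l "/proc/$tail_pid/fd" | grep -c "data_log (deleted)")
echo "deleted files still held open by tail: $held"

# The space can only be released once the last descriptor is closed.
kill "$tail_pid"
wait "$tail_pid" 2>/dev/null
rm -rf "$workdir"
```

On a real server, the same `/proc/<pid>/fd` listing for the forgotten `tail` process would be one way to check whether it is holding rotated logs open; killing that process closes the descriptor.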







answered Jun 6 at 4:50 by Olorin

• For GNU tail, the -F option will close the old fd and open the freshly rotated file, avoiding this issue.
  – A.B, Jun 6 at 4:55

• @A.B, I read the GNU tail manual. It says -F is the same as --follow=name --retry, and it explains --retry as "keep trying to open a file even when it is or becomes inaccessible; useful when following by name, i.e., with --follow=name". The description is confusing to me. So without -F, tail -f keeps reading the deleted file. Will logrotate open a new file with the same name? Where does the system write the newly added log entries? Can tail -f show them?
  – Zen, Jun 6 at 6:03

• Quoting man: "With --follow (-f), tail defaults to following the file descriptor, which means that even if a tail'ed file is renamed, tail will continue to track its end. This default behavior is not desirable when you really want to track the actual name of the file, not the file descriptor (e.g., log rotation). Use --follow=name in that case. That causes tail to track the named file in a way that accommodates renaming, removal and creation." -F <=> --follow=name --retry
  – A.B, Jun 6 at 6:18

• @Zen See also unix.stackexchange.com/a/291935/260978 for a nice clarification on -f vs -F.
  – Olorin, Jun 6 at 7:35
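
The -f versus -F difference discussed in these comments can be tried out with a simulated rotation. This is a rough sketch rather than a description of what logrotate does on any particular server: the rotation is imitated with `mv` followed by creating a new file under the old name, and the names `workdir`, `log`, `pid_f`, `pid_F`, `seen_f` and `seen_F` are invented for the example.

```shell
#!/bin/sh
# Hypothetical demo names throughout; the "rotation" here is just mv + recreate.
workdir=$(mktemp -d)
log="$workdir/data_log"
echo "before rotation" > "$log"

# One tail follows the descriptor (-f), the other follows the name (-F).
tail -f "$log" > "$workdir/by_descriptor.out" 2>/dev/null &
pid_f=$!
tail -F "$log" > "$workdir/by_name.out" 2>/dev/null &
pid_F=$!
sleep 1   # let both tails open the original file

# Simulated rotation: rename the old file, then write to a new file
# created under the original name.
mv "$log" "$log.1"
echo "after rotation" > "$log"
sleep 3   # allow time for tail -F to notice and reopen the name

if grep -q "after rotation" "$workdir/by_descriptor.out"; then seen_f=yes; else seen_f=no; fi
if grep -q "after rotation" "$workdir/by_name.out"; then seen_F=yes; else seen_F=no; fi
echo "tail -f saw the new file: $seen_f"
echo "tail -F saw the new file: $seen_F"

kill "$pid_f" "$pid_F"
wait "$pid_f" "$pid_F" 2>/dev/null
rm -rf "$workdir"
```

The `tail -f` process stays attached to the renamed file, so lines written to the new `data_log` never reach it, while `tail -F` reopens the name and picks them up.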












