What are the downsides of splitting /proc/pid/stat by whitespace?

What are the downsides of splitting /proc/pid/stat on Linux by whitespace? For example, using bash one can access the third column via



$ cat /proc/$$/stat
14198 (bash) S 14195 14198 14198 34816 ...
$ x=($(< /proc/$$/stat)); echo ${x[2]}
S
$


and all seems well?







asked Dec 7 '17 at 14:55 by thrig




















          2 Answers
          accepted






          If you need to even think about it, why not just read /proc/$pid/status instead? It gives much of the same information on nicely labeled lines, and escapes newlines and backslashes that appear in the process name:



          $ perl -e '$0="foo\nbar\n"; system "head -3 /proc/$$/status";'
          Name: foo\nbar\n
          Umask: 0022
          State: S (sleeping)
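
As a rough illustration of the labeled-lines approach, the sketch below looks up one field of a status-style file by its label. The function name `status_field` is made up for this example, and it assumes the "Label:" followed by whitespace and a value layout shown above. `read -r` leaves the `\n` and `\\` escapes untouched, so un-escaping is still left to the caller.

```shell
# status_field FILE LABEL  (hypothetical helper, not part of any tool)
# Print the value of the "LABEL:" line of a /proc/<pid>/status-style file.
status_field() {
    # read splits each line into the first word (the label with its colon)
    # and the remainder of the line; -r keeps backslashes literal.
    while read -r label value; do
        if [ "$label" = "$2:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$1"
    # No line carried the requested label.
    return 1
}
```

Calling `status_field "/proc/$$/status" State` prints the value of the State line for the current shell. Note that `read` trims leading and trailing blanks from the value, which may matter for unusual process names.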





          answered Dec 7 '17 at 16:31 by ilkkachu











          • that's probably easier for perl or such but likely more difficult for C to deal with ( github.com/Microsoft/ProcDump-for-Linux/issues/8 )
            – thrig
            Dec 8 '17 at 22:14










          • @thrig, eh, that code there reads /proc/$pid/stat, not status. So no wonder it has trouble. Reading a single-datum-per-line file in C is just a loop over fgets() and strcmp (for the headers). Though I do think I'll go to sleep now instead of coding the un-escaping.
            – ilkkachu
            Dec 8 '17 at 22:34
















          The chief problem is that the space character (0x20) is used as the delimiter between fields and may also appear within a field; should a local user be able to set the process name



          $ perl -e '$0="like this"; sleep 999' &
          [1] 14343
          $


          then a parse that splits on whitespace will fail



          $ x=($(< /proc/14343/stat)); echo ${x[2]}
          this)
          $


          as the command name contains a space.



          $ cat /proc/14343/stat
          14343 (like this) S 14198 14343 ...
          $


          How bad could this be? According to proc(5) the "controlling terminal of the process" is interesting



           tty_nr %d (7) The controlling terminal of the process. (The
          minor device number is contained in the combination
          of bits 31 to 20 and 7 to 0; the major device number
          is in bits 15 to 8.)


          so if a program acts on controlling terminal information that was parsed incorrectly from /proc/pid/stat because someone crafted a process name that shifts the fields, well, you may get a security vulnerability.
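
To make the bit layout quoted above concrete, here is a small sketch that splits a tty_nr value into its major and minor device numbers with shell arithmetic. The function name `decode_tty_nr` is hypothetical; the sample value 34816 is the seventh field of the bash line shown in the question.

```shell
# decode_tty_nr N  (hypothetical helper)
# Print "major minor" for a tty_nr value taken from /proc/<pid>/stat.
decode_tty_nr() {
    # Major device number: bits 15 to 8.
    major=$(( ($1 >> 8) & 0xff ))
    # Minor device number: bits 7 to 0, combined with bits 31 to 20
    # shifted down so that they sit directly above the low byte.
    minor=$(( ($1 & 0xff) | (($1 >> 12) & 0xfff00) ))
    printf '%s %s\n' "$major" "$minor"
}

decode_tty_nr 34816
# prints: 136 0
```

Major 136 falls in the range Linux assigns to Unix98 pseudo-terminal slaves, so this value should correspond to /dev/pts/0. A parser that picks up the wrong field would report a different terminal altogether.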



          The parsing is additionally complicated by the fact that a ) can be placed in the process name, though there is a 15 character limit



          $ perl -e '$0="lisp) a b c d e f g h i"; sleep 999' &
          [4] 14493
          $ cat /proc/14493/stat
          14493 (lisp) a b c d e) S 14198 14493 14198 34816 ...
          $


          Ideas to Parse this Wart of an Interface



          Since the process name can vary somewhere between the empty string and 15 bytes of almost any contents



          1234 () S ...
          4321 (xxxxxxxxxxxxxxx) S ...


          one idea would be to split on the first space to obtain the pid, then work backwards from the end of the string to find the last ); the text between the first ( and that last ) should be the process name, and everything to its right the regular fields. Unit tests for the code would be highly advisable...
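
The idea above can be sketched with plain parameter expansion, with no external tools. This is only one possible shape, not a vetted parser: the function name `parse_stat_line` and the `stat_*` variable names are invented for the example, and it assumes that nothing after the process name contains a ")".

```shell
# parse_stat_line LINE  (hypothetical helper)
# Sets stat_pid, stat_comm, stat_rest, stat_state and stat_tty_nr.
parse_stat_line() {
    # The pid is everything before the first space.
    stat_pid=${1%% *}
    # Drop everything up to and including the first "(" ...
    stat_comm=${1#*"("}
    # ... then everything from the last ")" onward; the name is what remains.
    stat_comm=${stat_comm%")"*}
    # The regular fields follow the last ") ".
    stat_rest=${1##*") "}
    # Those fields hold no spaces of their own, so word splitting is safe now.
    set -- $stat_rest
    stat_state=$1     # field 3 of the file
    stat_tty_nr=$5    # field 7 of the file
}

parse_stat_line '14493 (lisp) a b c d e) S 14198 14493 14198 34816'
printf 'pid=%s comm=%s state=%s\n' "$stat_pid" "$stat_comm" "$stat_state"
# prints: pid=14493 comm=lisp) a b c d e state=S
```

For a live process the argument could be "$(cat /proc/$$/stat)". The samples from this answer, including the empty name and the name with an embedded ")", make natural unit test cases.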







          answered Dec 7 '17 at 14:58 by thrig







          • It’s often better to parse one of the other files in /proc/pid, if all the information needed is in a single file, or race conditions aren’t an issue. (So if anyone thought of reading /proc/pid/comm to help with parsing /proc/pid/stat, no, it isn’t a good idea.)
            – Stephen Kitt
            Dec 7 '17 at 15:14











