what are the downsides of splitting /proc/pid/stat by whitespace?

What are the downsides of splitting /proc/pid/stat on Linux by whitespace? For example, using bash one can access the third field via
$ cat /proc/$$/stat
14198 (bash) S 14195 14198 14198 34816 ...
$ x=($(< /proc/$$/stat)); echo ${x[2]}
S
$
and all seems well?
linux proc
asked Dec 7 '17 at 14:55
thrig
2 Answers
Accepted answer (3 votes)
If you need to even think about it, why not just read /proc/$pid/status instead? It gives much of the same information on clearly labeled lines, and it escapes newlines and backslashes that appear in the process name:
$ perl -e '$0="foo\nbar\n"; system "head -3 /proc/$$/status";'
Name: foo\nbar\n
Umask: 0022
State: S (sleeping)
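A minimal sketch of how a script might pull one labeled value out of that format. The function name status_field is made up for illustration, and the code only assumes the "Label:<tab>value" layout shown above; it uses plain sh constructs, so it should behave the same under bash.

```shell
# status_field LABEL: print the value of the "LABEL:" line read from stdin.
# Hypothetical helper; the input is expected to look like /proc/PID/status.
status_field() {
    # Split each line at the first colon; the rest of the line is the value.
    while IFS=: read -r sf_label sf_value; do
        if [ "$sf_label" = "$1" ]; then
            # Drop the whitespace (a tab in the real file) after the colon.
            printf '%s\n' "${sf_value#"${sf_value%%[![:space:]]*}"}"
            return 0
        fi
    done
    return 1    # label not found
}

# Demo on literal text, so the result does not depend on a live process.
printf 'Name:\tfoo\\nbar\\n\nUmask:\t0022\nState:\tS (sleeping)\n' |
    status_field State    # prints: S (sleeping)
```

Against a real process this would be called as status_field State < /proc/$$/status. The Name value comes back still escaped, so a caller that wants the raw name has to undo the escaping itself.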
answered Dec 7 '17 at 16:31
ilkkachu
that's probably easier for perl or such but likely more difficult for C to deal with ( github.com/Microsoft/ProcDump-for-Linux/issues/8 )
– thrig
Dec 8 '17 at 22:14
@thrig, eh, that code there reads /proc/$pid/stat, not status. So no wonder it has trouble. Reading a single-datum-per-line file in C is just a loop over fgets() and strcmp (for the headers). Though I do think I'll go to sleep now instead of coding the un-escaping.
– ilkkachu
Dec 8 '17 at 22:34
Answer (4 votes)
The chief problem is that the space character (0x20) both delimits the fields and may also appear within a field; should a local user be able to set the process name
$ perl -e '$0="like this"; sleep 999' &
[1] 14343
$
then a parse that splits on whitespace will fail
$ x=($(< /proc/14343/stat)); echo ${x[2]}
this)
$
as the command name contains a space.
$ cat /proc/14343/stat
14343 (like this) S 14198 14343 ...
$
How bad could this be? According to proc(5), one interesting field is the "controlling terminal of the process"
tty_nr %d (7) The controlling terminal of the process. (The
minor device number is contained in the combination
of bits 31 to 20 and 7 to 0; the major device number
is in bits 15 to 8.)
so if a program parses /proc/pid/stat incorrectly because someone changed the process name, and then acts on the bogus controlling terminal value, you may get a security vulnerability.
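To make the bit layout concrete, here is a small sketch that decodes a tty_nr value into major and minor device numbers with shell arithmetic. The function name decode_tty_nr is made up for illustration, and the shifts and masks are simply my reading of the proc(5) text quoted above.

```shell
# decode_tty_nr N: print "major minor" for a tty_nr value.
# Hypothetical helper following the bit layout described in proc(5).
decode_tty_nr() {
    # major: bits 15..8
    # minor: bits 31..20 (moved down to bits 19..8) combined with bits 7..0
    echo "$(( ($1 >> 8) & 0xff )) $(( (($1 >> 12) & 0xfff00) | ($1 & 0xff) ))"
}

decode_tty_nr 34816    # the value from the sample output; prints: 136 0
```

For the 34816 seen in the sample output this gives major 136 and minor 0, which on a typical Linux system would be /dev/pts/0. A mis-split line hands some other field to this calculation, and the result then names a terminal the process never had.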
Parsing is further complicated by the fact that a ) can be placed in the process name, though the name is limited to 15 characters
$ perl -e '$0="lisp) a b c d e f g h i"; sleep 999' &
[4] 14493
$ cat /proc/14493/stat
14493 (lisp) a b c d e) S 14198 14493 14198 34816 ...
$
Ideas to Parse this Wart of an Interface
Since the process name can vary somewhere between the empty string and 15 bytes of almost any contents
1234 () S ...
4321 (xxxxxxxxxxxxxxx) S ...
one idea would be to split on the first space to obtain the pid, then search backwards from the end of the string for the last ). The text between the first ( and that last ) should be the process name, and everything to the right of it the regular space-separated fields. Unit tests for the code would be highly advisable.
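That idea can be sketched with nothing more than shell parameter expansion. The function name parse_stat and the stat_* variables are invented for this example; the field order after the name (state, ppid, pgrp, session, tty_nr) is taken from proc(5). Treat it as a starting point under those assumptions, not a vetted parser.

```shell
# parse_stat LINE: split one /proc/PID/stat line without trusting the
# spaces or parentheses inside the process name. Hypothetical helper.
# Sets: stat_pid stat_comm stat_state stat_ppid stat_pgrp stat_session stat_tty_nr
parse_stat() {
    stat_pid=${1%% *}            # text before the first space
    stat_comm=${1#*\(}           # drop everything up to the first "("
    stat_comm=${stat_comm%\)*}   # drop the last ")" and everything after it
    stat_rest=${1##*\)}          # text after the last ")": state, ppid, ...
    # The fields after the name are a letter and integers, so whitespace
    # splitting is safe from here on.
    read -r stat_state stat_ppid stat_pgrp stat_session stat_tty_nr stat_rest <<EOF
$stat_rest
EOF
}

# Demo on the awkward line from the example above (no live process needed).
parse_stat '14493 (lisp) a b c d e) S 14198 14493 14198 34816 ...'
printf 'pid=%s comm=[%s] state=%s tty_nr=%s\n' \
    "$stat_pid" "$stat_comm" "$stat_state" "$stat_tty_nr"
# prints: pid=14493 comm=[lisp) a b c d e] state=S tty_nr=34816
```

On a live system the call would look like parse_stat "$(cat /proc/$$/stat)". Searching for the last ) works because none of the fields after the name can contain a parenthesis; the empty name and the 15-byte name shown earlier make good test cases.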
answered Dec 7 '17 at 14:58
thrig
It's often better to parse one of the other files in /proc/pid, if all the information needed is in a single file, or race conditions aren't an issue. (So if anyone thought of reading /proc/pid/comm to help with parsing /proc/pid/stat, no, it isn't a good idea.)
– Stephen Kitt
Dec 7 '17 at 15:14