Why did DOS use dollar-terminated strings?
53 votes
According to a few tutorials I am seeing, DOS used dollar-terminated strings to write to the terminal. This also seems to be documented here, under INT 21h:
AH = 09h - WRITE STRING TO STANDARD OUTPUT
Entry: DS:DX -> '$'-terminated string
Return: AL = 24h
Notes: ^C/^Break are checked
You can see this in the code in this tutorial here. What was the reason for going with dollar-terminated strings rather than NUL-terminated strings, as in C?
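To make the difference concrete, here is a minimal Python sketch of the two conventions. It only models the "scan until the terminator" behaviour described in the listing above and is not an emulation of DOS; the function names `write_dollar_string` and `write_nul_string` are hypothetical, chosen for illustration.

```python
def write_dollar_string(memory: bytes, offset: int = 0) -> bytes:
    """Return the bytes a '$'-terminated print call would emit.

    Models only the scan-until-'$' rule; the terminator is not printed.
    """
    end = memory.index(b"$", offset)  # raises ValueError if no '$' is found
    return memory[offset:end]


def write_nul_string(memory: bytes, offset: int = 0) -> bytes:
    """Same idea with a NUL (0x00) terminator, as C strings use."""
    end = memory.index(b"\x00", offset)
    return memory[offset:end]


if __name__ == "__main__":
    print(write_dollar_string(b"Hello, world!\r\n$"))
    # A '$' inside the text ends the output early:
    print(write_dollar_string(b"Price: $5$"))
    # With a NUL terminator the same text prints in full:
    print(write_nul_string(b"Price: $5\x00"))
```

One practical consequence shows up in the second call: a dollar-terminated string cannot contain a dollar sign, so such text has to be printed some other way.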
ms-dos assembly
10
That comes from CP/M.
– mannaggia
Sep 19 at 17:44
2
Why is the string in function 9 terminated by a dollar sign?
– phuclv
Sep 20 at 1:53
1
Please note that all other strings in DOS (like file names) are NUL-terminated.
– Martin Rosenau
Sep 20 at 5:54
2
That is not actually true, M. Rosenau. Command tails in the PSP are terminated by Carriage Return, for starters.
– JdeBP
Sep 20 at 8:32
2
Why would he not tag MS-DOS?
– idrougge
Sep 21 at 11:28
edited Sep 19 at 17:24
asked Sep 19 at 17:19
Evan Carroll
3 Answers
Accepted answer, 83 votes
The short answer is that DOS was designed to be similar to CP/M. To quote from here:
While 8-bit programs could not run on 16-bit computers, Intel
documented how the original software developer could mechanically
translate an 8-bit program into a 16-bit program. Only the developer
of the program with possession of the source code could make this
translation. I designed DOS so the translated program would work the
same as it had with CP/M – translation compatibility. The key to
making this work was implementing the CP/M API.
Of course, this raises the question of why CP/M used the dollar sign.
This discussion says CP/M got the idea from DEC, which used the RAD50 character encoding. With only 40 characters (50 octal), you only have uppercase, digits, space, period, dollar, and percent.
Both CP/M and RT-11 are evolved from earlier DEC OS's, most notably
OS/8 (on the PDP-8) and DOS-11 (on the PDP-11). The most obvious
feature of all of these OS's is the presence of "PIP"
So DEC probably chose dollar because it didn't have many options, CP/M got it from DEC, and DOS got it from CP/M.
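For readers unfamiliar with RAD50, the following Python sketch shows how such a 40-character set packs three characters into one 16-bit word. The table order is the commonly documented PDP-11 layout and should be treated as an assumption here, as should the helper names `rad50_pack` and `rad50_unpack`. Note that code 0 is the space character, so a zero value cannot double as an end marker.

```python
# Assumed PDP-11 style table: the index of a character is its RAD50 code.
# Code 0 is space, 1-26 are A-Z, then '$', '.', '%', then the digits.
RAD50_CHARS = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_pack(text):
    """Pack text into a list of 16-bit words, three characters per word."""
    padded = text.upper()
    padded += " " * (-len(padded) % 3)  # pad with spaces to a multiple of 3
    words = []
    for i in range(0, len(padded), 3):
        c1, c2, c3 = (RAD50_CHARS.index(ch) for ch in padded[i:i + 3])
        words.append((c1 * 40 + c2) * 40 + c3)  # base-40 positional value
    return words


def rad50_unpack(words):
    """Inverse of rad50_pack; trailing padding spaces are kept."""
    chars = []
    for word in words:
        c12, c3 = divmod(word, 40)
        c1, c2 = divmod(c12, 40)
        chars.extend(RAD50_CHARS[c] for c in (c1, c2, c3))
    return "".join(chars)


if __name__ == "__main__":
    print(rad50_pack("PIP"))
    print(rad50_unpack(rad50_pack("FILE")))
```

Because 40 × 40 × 40 = 64000 fits in 16 bits, a six-character name takes exactly two words, with shorter names padded by spaces.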
4
I'm really impressed by the RAD50 information. I'll mark this answer as chosen if no one else has anything to add. It looks like you went down a good rabbit hole from DOS to CP/M to find that.
– Evan Carroll
Sep 19 at 17:57
5
That's a great discussion you linked, by the way. Tim Shoppa, who is reliable, suggested it might go back to mainframe data conventions from the 1970s. And here's an example -- en.wikipedia.org/wiki/… -- a mainframe language in which every statement line is terminated by a dollar sign. There's no way I know of to connect the dots to CP/M though. It also seems like this ancient convention may explain why a dollar sign at the end of a regular expression is an assertion of end-of-line.
– MetaEd
Sep 19 at 23:26
6
It could also explain why in BASIC a dollar sign at the end of a variable name denotes a string type -- if it was already thought of as a string terminator.
– MetaEd
Sep 19 at 23:38
6
You might also go back to TECO. TECO was a text editor originally from MIT, which made its way to DEC in the 1960s. Commands like insert and search were followed by a text string. This text string was terminated by an escape character. This escape character was echoed as a dollar sign.
– Walter Mitty
Sep 20 at 0:42
2
RAD50, at least on PDP-11 operating systems, never needed a terminator. It was used in fixed-length contexts, e.g., the name of a file in FILES-11 was 6 characters = 2 words in RAD50. If you wanted a shorter name, it would be padded with spaces. RAD50 also cropped up in symbol tables, and $ was a frequent initial character ('reserved to DEC'). Contrariwise, there was a lot of precedent in various DEC operating systems for ASCII strings terminated by a zero byte (ASCIZ). Many assemblers provided ASCIZ directives. TOPS-10 had a UUO 'OUTSTR' to print an ASCIZ string.
– dave
Sep 20 at 3:02
8 votes
So, what about the last part of the question: why not use NUL, as C does?
As shown in an earlier answer, MS-DOS was influenced by CP/M, and CP/M was influenced by DOS/11 on the PDP-11, which was an extension of OS/8 on the PDP-8.
OS/8 used a character set called DEC Radix-50.
After checking the link about the RAD50 character set, the reason OS/8 didn't use NUL to terminate strings is quite simple: code 0 (NUL) is the space character in that set.
Apart from letters and digits there were only three punctuation marks (".", "%" and "$"), so one of them had to be chosen.
The remaining question is why they chose $ and not %.
Besides that, as commented below, C did not exist when the PDP-8 was constructed, so it could not have influenced the choice of string terminator.
3
But you are also speaking as if C always existed, but it didn't really get popular outside of Bell Labs until the mid to late '70s.
– mannaggia
Sep 20 at 1:50
1
This answer fails to demonstrate a connection between paper tape and MS/PC-DOS version 1.
– JdeBP
Sep 20 at 8:28
1
The PDP-8 predates the C language by a long way - C was first targeted at the PDP-11.
– pjc50
Sep 20 at 9:36
3
Also note that null terminated strings are not C's invention ... C actually inherited that convention from the DEC assemblers that its early compilers targeted (the PDP 11 assembler, for instance, had a directive ASCIZ to produce null terminated strings), so it seems unlikely Kildall wouldn't have been exposed to the idea even though he started working on the first versions of CP/M before C was well known.
– Jules
Sep 20 at 16:18
1
Version 1 of Unix was for the PDP-7, which predated the PDP-8. The port to the PDP-11 was the motivation for C.
– Walter Mitty
Sep 20 at 17:25
6 votes
It's worth noting that CP/M was originally written in the PL/M-80 programming language, also developed by Gary Kildall (source here).
The PL/M-80 Manual on page 11 states that
The character set used in PL/M is a subset of both ASCII and EBCDIC character sets. The valid PL/M characters consist of the alphanumerics
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9
along with the special characters
= . / ( ) + - ' * , < > : ;
and the blank characters
space tab carriage-return line-feed
If a PL/M program contains any character not in this set, the compiler may treat the character as an error
That's 80 characters.
Interestingly there is no mention there of $ at all - yet on page 15 it is listed as a character that "may be freely inserted between the characters of a constant to improve readability".
I can't say whether or why this might have affected the decision to use $ as the string terminator for function 9 (print string) in CP/M, especially since page 16 is explicit that "character strings are denoted by printable ascii characters enclosed by apostrophes", which doesn't limit strings to the same character set as the code body.
I can say from personal experience that function 9 was almost never used: CP/M / CP/M-86 / MP/M / Concurrent DOS / CDOS / … assembler programmers almost invariably used subroutines containing loops that called function 2 (console output), usually with either a byte count or a null terminator.
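The two loop styles mentioned above can be sketched in Python as follows. This is an illustration of the idea only, not period code: `conout` is a hypothetical stand-in for a one-character console output call, and the console is modelled as a `bytearray` that collects what would have been printed.

```python
def conout(ch: int, console: bytearray) -> None:
    """Hypothetical stand-in for a one-character console output call."""
    console.append(ch)


def print_nul_terminated(memory: bytes, start: int, console: bytearray) -> None:
    """Emit characters one at a time until a zero byte is reached."""
    i = start
    while memory[i] != 0:
        conout(memory[i], console)
        i += 1


def print_counted(memory: bytes, start: int, count: int, console: bytearray) -> None:
    """Emit exactly `count` characters, whatever their values."""
    for i in range(start, start + count):
        conout(memory[i], console)


if __name__ == "__main__":
    out = bytearray()
    print_nul_terminated(b"Cost: $5\x00", 0, out)
    print(bytes(out))
```

Either loop can print a dollar sign, which the built-in dollar-terminated call cannot; the counted form can print any byte value at all.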
Accepted answer (83 votes): answered Sep 19 at 17:48 by Eugene Styer; edited Sep 20 at 15:49 by Brian H
Answer with 8 votes: answered Sep 19 at 20:14 by UncleBod; edited Sep 20 at 15:22
3
But you are also speaking as if C always existed, but it didn't really get popular outside of Bell Labs until the mid to late 70s.
– mannaggia
Sep 20 at 1:50
1
This answer fails to demonstrate a connection between paper tape and MS/PC-DOS version 1.
– JdeBP
Sep 20 at 8:28
1
The PDP-8 predates the C language by a long way - C was first targeted at the PDP-11.
– pjc50
Sep 20 at 9:36
3
Also note that null-terminated strings are not C's invention ... C actually inherited that convention from the DEC assemblers that its early compilers targeted (the PDP-11 assembler, for instance, had a directive ASCIZ to produce null-terminated strings), so it seems unlikely Kildall wouldn't have been exposed to the idea even though he started working on the first versions of CP/M before C was well known.
– Jules
Sep 20 at 16:18
1
Version 1 of Unix was for the PDP-7, which predated the PDP-8. The port to the PDP-11 was the motivation for C.
– Walter Mitty
Sep 20 at 17:25
It's worth noting that CP/M was originally written in the PL/M-80 programming language, which was also developed by Gary Kildall (source here).
The PL/M-80 Manual on page 11 states that
The character set used in PL/M is a subset of both ASCII and EBCDIC character sets. The valid PL/M characters consist of the alphanumerics
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9
along with the special characters
= . / ( ) + - ' * , < > : ;
and the blank characters
space tab carriage-return line-feed
If a PL/M program contains any character not in this set, the compiler may treat the character as an error
That's 80 characters.
Interestingly, there is no mention of $ there at all, yet on page 15 it is listed as a character that "may be freely inserted between the characters of a constant to improve readability".
I can't say whether or why this might have affected the decision to use $ as the string terminator for function 9 in CP/M, especially since page 16 is explicit that "character strings are denoted by printable ascii characters enclosed by apostrophes", which doesn't limit strings to the same character set as the code body.
I can say from personal experience that function 9 was almost never used. Assembler programmers on CP/M / CP/M-86 / MP/M / Concurrent DOS / CDOS / … pretty well invariably used subroutines containing loops that called function 2, usually with either a byte count or a null terminator.
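The difference between the three conventions can be sketched in Python rather than assembler. This is only an illustration: `put_char` is a hypothetical stand-in for the single-character output call (function 2 above), and the three loop functions are invented names, not anything from CP/M or DOS.

```python
def put_char(out, byte):
    """Hypothetical stand-in for the single-character console output call."""
    out.append(chr(byte))


def print_dollar_terminated(data):
    """Mimics the built-in string call: stop at the first '$'."""
    out = []
    for byte in data:
        if byte == ord("$"):
            break
        put_char(out, byte)
    return "".join(out)


def print_nul_terminated(data):
    """Hand-written loop: stop at the first zero byte."""
    out = []
    for byte in data:
        if byte == 0:
            break
        put_char(out, byte)
    return "".join(out)


def print_counted(data, count):
    """Hand-written loop: emit exactly `count` bytes, no terminator needed."""
    out = []
    for byte in data[:count]:
        put_char(out, byte)
    return "".join(out)


if __name__ == "__main__":
    print(print_dollar_terminated(b"Cost: $5$"))     # cut off at the first '$'
    print(print_nul_terminated(b"Cost: $5\x00junk"))
    print(print_counted(b"Cost: $5 extra", 8))
```

The first call shows the practical drawback: a dollar-terminated string can never contain a dollar sign, which is one plausible reason programmers preferred their own loops.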
edited Sep 20 at 13:06
answered Sep 20 at 12:57
PolicyWatcher