Why is C/C++ main argv declared as “char* argv[]” rather than just “char* argv”?
Why is argv declared as "a pointer to pointer to the first index of the array", rather than just being "a pointer to the first index of array" (char* argv)?
Why is the notion of "pointer to pointer" required here?
c++ c
4
"a pointer to pointer to the first index of the array" - That's not a correct description of char* argv[] or char**. That's a pointer to a pointer to a character; specifically the outer pointer points to the first pointer in an array, and the inner pointers point to the first characters of nul-terminated strings. There are no indices involved here.
– Sebastian Redl
Jan 20 at 7:16
12
How would you get the second argument if it was just char* argv?
– gnasher729
Jan 20 at 14:16
15
Your life will get easier when you put the space in the right place. char* argv[] puts the space in the wrong place. Say char *argv[], and now it is clear that this means "the expression *argv[n] is a variable of type char". Don't get caught up in trying to work out what's a pointer and what's a pointer to a pointer, and so on. The declaration is telling you what operations you can perform on this thing.
– Eric Lippert
Jan 20 at 17:22
1
Mentally compare char * argv[] to the similar C++ construct std::string argv[], and it might be easier to parse. ...Just don't start actually writing it that way!
– Justin Time
Jan 20 at 20:32
2
@EricLippert note that the question also includes C++, and there you can have e.g. char &func(int); which doesn't make &func(5) have type char.
– Ruslan
Jan 21 at 15:17
asked Jan 20 at 0:48 by a user
edited Jan 21 at 11:26 by Peter Mortensen
6 Answers
Argv is basically like this:
[Diagram: argv on the left, pointing to an array of pointers; each pointer leads to the text of one argument, and the array ends with a null pointer.]
On the left is the argument itself: what's actually passed as an argument to main. That contains the address of an array of pointers. Each of those points to some place in memory containing the text of the corresponding argument that was passed on the command line. Then, at the end of that array, there's guaranteed to be a null pointer.
Note that the actual storage for the individual arguments is at least potentially allocated separately for each one, so their addresses in memory might be arranged fairly randomly (but depending on how things happen to be written, they could also be in a single contiguous block of memory; you simply don't know and shouldn't care).
51
Whatever layout engine drew that diagram for you has a bug in their minimize-crossings algorithm!
– Eric Lippert
Jan 20 at 17:17
43
@EricLippert Could be intentional to emphasize that the pointees might not be contiguous nor in order.
– jamesdlin
Jan 20 at 20:55
3
I would say it's intentional
– Michael
Jan 20 at 22:36
24
It was certainly intentional--and I'd guess Eric probably figured that, but (correctly, IMO) thought the comment was funny anyway.
– Jerry Coffin
Jan 21 at 2:57
2
@JerryCoffin, one might also point out that even if the actual arguments were contiguous in memory, they can have arbitrary lengths, so one would still need distinct pointers for each of them to be able to access argv[i] without scanning through all the previous ones.
– ilkkachu
Jan 21 at 11:41
Because that's what the operating system provides :-)
Your question is a little bit of a chicken/egg inversion issue. The problem is not to choose what you want in C++, the problem is how you say in C++ what the OS is giving you.
Unix passes an array of "strings", each string being a command argument. In C/C++, a string is a "char*", so an array of strings is char* argv[], or char** argv, according to taste.
13
No, it's exactly "the problem to choose what you want in C++". Windows, for example, provides the command line as a single string, and yet C/C++ programs still receive their argv array; the runtime takes care of tokenizing the command line and building the argv array at startup.
– Joker_vD
Jan 20 at 10:46
14
@Joker_vD I think in a twisted way it is about what the OS gives you. Specifically: I guess C++ did it this way because C did it this way, and C did it this way because at the time C and Unix were so inextricably linked and Unix did it this way.
– Daniel Wagner
Jan 20 at 15:33
1
@DanielWagner: Yes, this is from C's Unix heritage. On Unix / Linux a minimal _start that calls main just needs to pass main a pointer to the existing argv array in memory; it's already in the right format. The kernel copies it from the argv argument to the execve(const char *filename, char *const argv[], char *const envp[]) system call that was made to start a new executable. (On Linux, argv (the array itself) and argc are on the stack on process entry. I assume most Unixes are the same, because that's a good place for it.)
– Peter Cordes
Jan 20 at 20:38
8
But Joker's point here is that the C / C++ standards leave it up to the implementation where the args come from; they don't have to be straight from the OS. On an OS that passes a flat string, a good C++ implementation should include tokenizing, instead of setting argc=2 and passing the whole flat string. (Following the letter of the standard is not sufficient to be useful; it intentionally leaves a lot of room for implementation choices.) Although some Windows programs will want to treat quotes specially, so real implementations do provide a way to get the flat string, too.
– Peter Cordes
Jan 20 at 20:42
1
Basile's answer is pretty much this + @Joker's correction and my comments, with more details.
– Peter Cordes
Jan 20 at 20:50
First, as a parameter declaration, char **argv is the same as char *argv[]; they both imply a pointer to (an array or set of one or more possible) pointer(s) to strings.
Next, if you only have "pointer to char" (e.g. just char *), then in order to access the nth item, you'll have to scan the first n-1 items to find the nth item's start. (And this would also impose the requirement that the strings are stored contiguously.)
With the array of pointers, you can directly index the nth item, so (while not strictly necessary, assuming the strings are contiguous) it is generally much more convenient.
To illustrate:
./program hello world
argc = 3
argv[0] --> "./program"
argv[1] --> "hello"
argv[2] --> "world"
It is possible that, in an OS-provided array of characters:
            "./program\0hello\0world\0"
argv[0]      ^
argv[1]                 ^
argv[2]                        ^
if argv were just a "pointer to char", you might see
            "./program\0hello\0world\0"
argv         ^
However (though likely by design of the OS) there is no real guarantee that the three strings "./program", "hello", and "world" are contiguous. Further, this kind of "single pointer to multiple contiguous strings" is a more unusual data type construct (for C), especially compared with an array of pointers to strings.
What if, instead of argv --> "helloworld", you have argv --> index 0 of the array (hello), just like a normal array? Why isn't this doable? Then you keep reading the array argc times. Then you pass argv itself and not a pointer to argv.
– a user
Jan 20 at 1:22
@auser, that's what argv --> "./program\0hello\0world\0" is: a pointer to the first char (i.e. the "."). If you take that pointer past the first \0, then you have a pointer to "hello", and after that to "world". After argc times (hitting \0), you're done. Sure, it can be made to work, and as I said, it is an unusual construct.
– Erik Eidt
Jan 20 at 7:16
You forgot to state that in your example argv[4] is NULL
– Basile Starynkevitch
Jan 20 at 7:50
3
There is a guarantee that (at least initially) argv[argc] == NULL. In this case that's argv[3], not argv[4].
– Miral
Jan 21 at 6:22
1
@Hill, yes, thank you as I was trying to be explicit about the null character terminators (and missed that one).
– Erik Eidt
Jan 21 at 20:44
Rather than thinking of it as "pointer to pointer", it helps to think of it as "array of strings", with [] denoting array and char* denoting string. When you run a program, you can pass it one or more command-line arguments and these are reflected in the arguments to main: argc is the count of arguments and argv lets you access individual arguments.
2
+1 This! In many languages - bash, PHP, C, C++ - argv is an array of strings. Of this you have to think when you see char ** or char *[], which is the same.
– rexkogitans
Jan 20 at 15:08
Why C/C++ main argv is declared as “char* argv[]”
A possible answer is: because the C11 standard n1570 (in §5.1.2.2.1 Program startup) and the C++11 standard n3337 (in §3.6.1 main function) require that for hosted environments (but notice that the C standard also mentions §5.1.2.1 freestanding environments).
The next question is: why did the C and C++ standards choose main to have such an int main(int argc, char **argv) signature? The explanation is largely historical: C was invented with Unix, which has a shell which does globbing before doing fork (which is a system call to create a process) and execve (which is the system call to execute a program), and that execve transmits an array of string program arguments and is related to the main of the executed program. Read more about the Unix philosophy and about ABIs.
And C++ tried hard to follow the conventions of C and be compatible with it. It could not define main to be incompatible with C traditions.
If you designed an operating system from scratch (still having a command line interface) and a programming language for it from scratch, you would be free to invent different program starting conventions. And other programming languages (e.g. Common Lisp or Ocaml or Go) have different program starting conventions.
In practice, main is invoked by some crt0 code. Notice that on Windows the globbing may be done by each program in the equivalent of crt0, and some Windows programs can start through the non-standard WinMain entry point. On Unix, globbing is done by the shell (and crt0 is adapting the ABI, and the initial call stack layout that it has specified, to the calling conventions of your C implementation).
In many cases the answer is "because it's a standard". To quote the C99 standard:
— If the value of argc is greater than zero, the array members argv[0] through
argv[argc-1] inclusive shall contain pointers to strings, which are given
implementation-defined values by the host environment prior to program startup.
Of course, before it was standardized it was already in use by K&R C in early Unix implementations, with the purpose of storing command-line parameters (something you have to care about in a Unix shell such as /bin/bash or /bin/sh but not in embedded systems). To quote the first edition of K&R's "The C Programming Language" (pg. 110):
The first (conventionally called argc) is the number of command-line arguments the program was invoked with; the second (argv) is a pointer to an array of character strings that contain the arguments, one per string.
Answer "Argv is basically like this": answered Jan 20 at 4:08 by Jerry Coffin, edited Jan 22 at 9:54
51
Whatever layout engine drew that diagram for you has a bug in their minimize-crossings algorithm!
– Eric Lippert
Jan 20 at 17:17
43
@EricLippert Could be intentional to emphasize that that the pointees might not be contiguous nor in order.
– jamesdlin
Jan 20 at 20:55
3
I would say it's intentional
– Michael
Jan 20 at 22:36
24
It was certainly intentional--and I'd guess Eric probably figured that, but (correctly, IMO) thought the comment was funny anyway.
– Jerry Coffin
Jan 21 at 2:57
2
@JerryCoffin, one might also point out that even if the actual arguments were contiguous in memory, they can have arbitrary lengths, so one would still need distinct pointers for each of them to be able to accessargv[i]
without scanning through all the previous ones.
– ilkkachu
Jan 21 at 11:41
|
show 12 more comments
51
Whatever layout engine drew that diagram for you has a bug in their minimize-crossings algorithm!
– Eric Lippert
Jan 20 at 17:17
43
@EricLippert Could be intentional to emphasize that that the pointees might not be contiguous nor in order.
– jamesdlin
Jan 20 at 20:55
3
I would say it's intentional
– Michael
Jan 20 at 22:36
24
It was certainly intentional--and I'd guess Eric probably figured that, but (correctly, IMO) thought the comment was funny anyway.
– Jerry Coffin
Jan 21 at 2:57
2
@JerryCoffin, one might also point out that even if the actual arguments were contiguous in memory, they can have arbitrary lengths, so one would still need distinct pointers for each of them to be able to accessargv[i]
without scanning through all the previous ones.
– ilkkachu
Jan 21 at 11:41
51
51
Whatever layout engine drew that diagram for you has a bug in their minimize-crossings algorithm!
– Eric Lippert
Jan 20 at 17:17
Whatever layout engine drew that diagram for you has a bug in their minimize-crossings algorithm!
– Eric Lippert
Jan 20 at 17:17
43
43
@EricLippert Could be intentional to emphasize that that the pointees might not be contiguous nor in order.
– jamesdlin
Jan 20 at 20:55
@EricLippert Could be intentional to emphasize that that the pointees might not be contiguous nor in order.
– jamesdlin
Jan 20 at 20:55
3
3
I would say it's intentional
– Michael
Jan 20 at 22:36
I would say it's intentional
– Michael
Jan 20 at 22:36
24
24
It was certainly intentional--and I'd guess Eric probably figured that, but (correctly, IMO) thought the comment was funny anyway.
– Jerry Coffin
Jan 21 at 2:57
It was certainly intentional--and I'd guess Eric probably figured that, but (correctly, IMO) thought the comment was funny anyway.
– Jerry Coffin
Jan 21 at 2:57
2
2
@JerryCoffin, one might also point out that even if the actual arguments were contiguous in memory, they can have arbitrary lengths, so one would still need distinct pointers for each of them to be able to access
argv[i]
without scanning through all the previous ones.– ilkkachu
Jan 21 at 11:41
@JerryCoffin, one might also point out that even if the actual arguments were contiguous in memory, they can have arbitrary lengths, so one would still need distinct pointers for each of them to be able to access
argv[i]
without scanning through all the previous ones.– ilkkachu
Jan 21 at 11:41
|
show 12 more comments
Because that's what the operating system provides :-)
Your question is a little bit of a chicken/egg inversion issue. The problem is not to choose what you want in C++, the problem is how you say in C++ what the OS is giving you.
Unix passes an array of "strings", each string being a command argument. In C/C++, a string is a "char*", so an array of strings is char* argv, or char** argv, according to taste.
13
No, it's exactly "the problem to choose what you want in C++". Windows, for example, provides the command line as a single string, and yet C/C++ programs still receive theirargv
array — the runtime takes care of tokenizing the command line and building theargv
array at the startup.
– Joker_vD
Jan 20 at 10:46
14
@Joker_vD I think in a twisted way it is about what the OS gives you. Specifically: I guess C++ did it this way because C did it this way, and C did it this way because at the time C and Unix were so inextricably linked and Unix did it this way.
– Daniel Wagner
Jan 20 at 15:33
1
@DanielWagner: Yes, this is from C's Unix heritage. On Unix / Linux a minimal_start
that callsmain
just needs to passmain
a pointer to the existingargv
array in memory; it's already in the right format. The kernel copies it from the argv argument to theexecve(const char *filename, char *const argv, char *const envp)
system call that was made to start a new executable. (On Linux, argv (the array itself) and argc are on the stack on process entry. I assume most Unixes are the same, because that's a good place for it.)
– Peter Cordes
Jan 20 at 20:38
8
But Joker's point here is that the C / C++ standards leave it up to the implementation where the args come from; they don't have to be straight from the OS. On an OS that passes a flat string, a good C++ implementation should include tokenizing, instead of settingargc=2
and passing the whole flat string. (Following the letter of the standard is not sufficient to be useful; it intentionally leaves a lot of room for implementation choices.) Although some Windows programs will want to treat quotes specially, so real implementations do provide a way to get the flat string, too.
– Peter Cordes
Jan 20 at 20:42
1
Basile's answer is pretty much this + @Joker's correction and my comments, with more details.
– Peter Cordes
Jan 20 at 20:50
add a comment |
Because that's what the operating system provides :-)
Your question is a little bit of a chicken/egg inversion issue. The problem is not to choose what you want in C++, the problem is how you say in C++ what the OS is giving you.
Unix passes an array of "strings", each string being a command argument. In C/C++, a string is a "char*", so an array of strings is char* argv, or char** argv, according to taste.
13
No, it's exactly "the problem to choose what you want in C++". Windows, for example, provides the command line as a single string, and yet C/C++ programs still receive theirargv
array — the runtime takes care of tokenizing the command line and building theargv
array at the startup.
– Joker_vD
Jan 20 at 10:46
14
@Joker_vD I think in a twisted way it is about what the OS gives you. Specifically: I guess C++ did it this way because C did it this way, and C did it this way because at the time C and Unix were so inextricably linked and Unix did it this way.
– Daniel Wagner
Jan 20 at 15:33
1
@DanielWagner: Yes, this is from C's Unix heritage. On Unix / Linux a minimal_start
that callsmain
just needs to passmain
a pointer to the existingargv
array in memory; it's already in the right format. The kernel copies it from the argv argument to theexecve(const char *filename, char *const argv, char *const envp)
system call that was made to start a new executable. (On Linux, argv (the array itself) and argc are on the stack on process entry. I assume most Unixes are the same, because that's a good place for it.)
– Peter Cordes
Jan 20 at 20:38
8
But Joker's point here is that the C / C++ standards leave it up to the implementation where the args come from; they don't have to be straight from the OS. On an OS that passes a flat string, a good C++ implementation should include tokenizing, instead of settingargc=2
and passing the whole flat string. (Following the letter of the standard is not sufficient to be useful; it intentionally leaves a lot of room for implementation choices.) Although some Windows programs will want to treat quotes specially, so real implementations do provide a way to get the flat string, too.
– Peter Cordes
Jan 20 at 20:42
1
Basile's answer is pretty much this + @Joker's correction and my comments, with more details.
– Peter Cordes
Jan 20 at 20:50
add a comment |
Because that's what the operating system provides :-)
Your question is a little bit of a chicken/egg inversion issue. The problem is not to choose what you want in C++; the problem is how you say in C++ what the OS is giving you.
Unix passes an array of "strings", each string being a command argument. In C/C++, a string is a "char*", so an array of strings is char* argv[], or char** argv, according to taste.
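As the comments on this answer point out, on a platform that hands the program one flat command line, the C runtime has to build that array of strings itself. Here is a rough sketch of what such a tokenizing step might look like, assuming whitespace-only splitting with no quote or escape handling (real runtimes do considerably more). The name split_command_line is made up for this illustration and is not part of any actual runtime.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: turn one flat command line into an argv-style array.
 * Splits on whitespace only; quotes and escapes are NOT handled.
 * The pointer array and the string copy live in a single allocation,
 * so the caller releases everything with one free().
 * Returns NULL if the allocation fails. */
char **split_command_line(const char *line, int *argc_out)
{
    size_t len = strlen(line);
    size_t capacity = len / 2 + 2;   /* max possible tokens + NULL slot */
    char **args = malloc(capacity * sizeof *args + len + 1);
    if (args == NULL)
        return NULL;

    /* The writable copy of the text sits right after the pointer array. */
    char *copy = (char *)(args + capacity);
    memcpy(copy, line, len + 1);

    int count = 0;
    char *p = copy;
    for (;;) {
        while (isspace((unsigned char)*p))   /* skip separators */
            p++;
        if (*p == '\0')
            break;
        args[count++] = p;                   /* start of next argument */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';                     /* terminate this argument */
    }
    args[count] = NULL;                      /* same shape as argv[argc] */
    *argc_out = count;
    return args;
}
```

The result has the same shape main receives: an array of pointers, one per argument, followed by a null pointer.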
answered Jan 20 at 4:59
passer-by
First, as a parameter declaration, char **argv is the same as char *argv[]; they both imply a pointer to (an array or set of one or more possible) pointer(s) to strings.
Next, if you only have "pointer to char" (e.g. just char *), then in order to access the nth item, you'll have to scan the first n-1 items to find the nth item's start. (And this would also impose the requirement that the strings are all stored contiguously.)
With the array of pointers, you can directly index the nth item, so (while not strictly necessary, assuming the strings are contiguous) it is generally much more convenient.
To illustrate:
./program hello world
argc = 3
argv[0] --> "./program"
argv[1] --> "hello"
argv[2] --> "world"
It is possible that, in an OS-provided array of characters:
            "./program\0hello\0world\0"
argv[0]      ^
argv[1]                 ^
argv[2]                        ^
If argv were just a "pointer to char", you might see:
            "./program\0hello\0world\0"
argv         ^
However (though likely by design of the OS) there is no real guarantee that the three strings "./program", "hello", and "world" are contiguous. Further, this kind of "single pointer to multiple contiguous strings" is a more unusual data type construct (for C), especially compared with an array of pointers to strings.
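To make the difference concrete, here is a small sketch of both lookups. It assumes the hypothetical flat layout described above, where the strings sit back to back and each ends in a NUL byte. The function names nth_in_flat and nth_in_array are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Flat layout: to reach the nth string, skip over the n strings before it.
 * Cost grows with n and with the length of the earlier strings.
 * No bounds checking: the caller must know n is a valid index. */
const char *nth_in_flat(const char *flat, int n)
{
    while (n-- > 0)
        flat += strlen(flat) + 1;   /* step past one string and its NUL */
    return flat;
}

/* Array of pointers: the same lookup is a single index operation,
 * and the strings do not need to be adjacent in memory. */
const char *nth_in_array(char *const argv[], int n)
{
    return argv[n];
}
```

Both functions return the same text for the same n. Only the second one works when the strings are scattered around memory, which is the case the pointer array is designed to allow.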
what if instead of argv --> "hello\0world\0" you have argv --> index 0 of the array (hello), just like a normal array. why isn't this doable? then you keep reading the array argc times. then you pass argv itself and not a pointer to argv.
– a user
Jan 20 at 1:22
@auser, that's what argv --> "./program\0hello\0world\0" is: a pointer to the first char (i.e. the "."). If you take that pointer past the first \0, then you have a pointer to "hello", and after that to "world". After argc times (hitting "), you're done. Sure, it can be made to work, and as I said, an unusual construct.
– Erik Eidt
Jan 20 at 7:16
You forgot to state that in your example argv[4] is NULL
– Basile Starynkevitch
Jan 20 at 7:50
3
There is a guarantee that (at least initially) argv[argc] == NULL. In this case that's argv[3], not argv[4].
– Miral
Jan 21 at 6:22
1
@Hill, yes, thank you as I was trying to be explicit about the null character terminators (and missed that one).
– Erik Eidt
Jan 21 at 20:44
edited Jan 21 at 20:45
answered Jan 20 at 1:08
Erik Eidt
Rather than thinking of it as "pointer to pointer", it helps to think of it as "array of strings", with [] denoting array and char* denoting string. When you run a program, you can pass it one or more command-line arguments and these are reflected in the arguments to main: argc is the count of arguments and argv lets you access individual arguments.
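A short sketch of that "array of strings" view, under the assumption that the caller passes an argc/argv pair of the usual shape. The helper name print_args is made up for illustration. It is deliberately declared with one spelling and defined with the other, to show that the compiler treats them as the same parameter type.

```c
#include <stdio.h>

/* Declared with the array spelling... */
int print_args(int argc, char *argv[]);

/* ...and defined with the pointer spelling. This compiles because,
 * as a function parameter, char *argv[] is adjusted to char **argv. */
int print_args(int argc, char **argv)
{
    for (int i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);   /* one string per index */
    return argc;                                 /* number of lines printed */
}
```

Calling print_args(argc, argv) from main would list each argument next to its index, with argv[0] conventionally being the program name.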
2
+1 This! In many languages - bash, PHP, C, C++ - argv is an array of strings. This is what you have to think of when you see char ** or char *[], which is the same.
– rexkogitans
Jan 20 at 15:08
answered Jan 20 at 1:06
casablanca
Why C/C++ main argv is declared as “char* argv[]”
A possible answer is: because the C11 standard n1570 (in §5.1.2.2.1 Program startup) and the C++11 standard n3337 (in §3.6.1 main function) require it for hosted environments (but notice that the C standard also mentions §5.1.2.1 freestanding environments). See also this.
The next question is why the C and C++ standards chose main to have such an int main(int argc, char**argv) signature. The explanation is largely historical: C was invented with Unix, which has a shell that does globbing before doing fork (which is a system call to create a process) and execve (which is the system call to execute a program), and that execve transmits an array of string program arguments and is related to the main of the executed program. Read more about the Unix philosophy and about ABIs.
And C++ tried hard to follow the conventions of C and be compatible with it. It could not define main to be incompatible with C traditions.
If you designed an operating system from scratch (still having a command line interface) and a programming language for it from scratch, you'd be free to invent different program starting conventions. And other programming languages (e.g. Common Lisp or OCaml or Go) have different program starting conventions.
In practice, main is invoked by some crt0 code. Notice that on Windows the globbing may be done by each program in the equivalent of crt0, and some Windows programs can start through the non-standard WinMain entry point. On Unix, globbing is done by the shell (and crt0 is adapting the ABI, and the initial call stack layout that it has specified, to the calling conventions of your C implementation).
edited Jan 21 at 2:40
answered Jan 20 at 10:42
Basile Starynkevitch
In many cases the answer is "because it's a standard". To quote the C99 standard:
— If the value of argc is greater than zero, the array members argv[0] through
argv[argc-1] inclusive shall contain pointers to strings, which are given
implementation-defined values by the host environment prior to program startup.
Of course, before it was standardized it was already in use by K&R C in early Unix implementations, with the purpose of storing command-line parameters (something you have to care about in a Unix shell such as /bin/bash or /bin/sh but not in embedded systems). To quote the first edition of K&R's "The C Programming Language" (pg. 110):
The first (conventionally called argc) is the number of command-line arguments the program was invoked with; the second (argv) is a pointer to an array of character strings that contain the arguments, one per string.
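The same section of the C standard also requires argv[argc] to be a null pointer, so the array can be walked without being told its length. A minimal sketch of that, with count_args being a name invented for this example:

```c
#include <stddef.h>

/* Count the arguments by walking the pointer array until the null
 * pointer that terminates it, without looking at argc. */
int count_args(char *const argv[])
{
    int n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Inside main, count_args(argv) should agree with argc as long as the program has not modified the array.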
edited Jan 20 at 10:55
answered Jan 20 at 10:47
Sergiy Kolodyazhnyy