Python RegEx to Catch White Space Between Array Name and Size

I am using a regular expression in a Python script to search through a file to find variable declarations. This is what I have so far:
ret1 = re.compile(r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)[\w:\[\]]*\s*;\s*$')
ret2 = ret1.match(nextLineInFile)
print("Group 2: ", ret2.group(2))#variable type
print("Group 3: ", ret2.group(3))#variable name
Later in the code, I'm using groupings to capture the variable type and the variable name. I have the following input:
long myArray1[2];
long myArray2 [2];
long long myArray3[2];
long long myArray4 [2];
My RegEx is only finding myArray1 and myArray3. I need it to find all four declarations. I've tried the following:
ret = re.compile(r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$')
This catches myArray1, myArray2, and myArray4 perfectly. But now myArray3 is coming back with a variable type of "long" and a variable name of "long". What am I doing wrong?
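The behaviour described above can be reproduced with a short script. This is only a sketch: the helper name type_and_name and the variable names original and attempt are illustrative and not part of the original code, and the patterns are written with the backslashes they are assumed to have had.

```python
import re

# Original pattern: no whitespace is allowed between the name and '['.
original = re.compile(
    r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)[\w:\[\]]*\s*;\s*$'
)
# Attempted fix: optional whitespace (\s*) added right after the name group.
attempt = re.compile(
    r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$'
)

lines = [
    "long myArray1[2];",
    "long myArray2 [2];",
    "long long myArray3[2];",
    "long long myArray4 [2];",
]


def type_and_name(pattern, line):
    """Return (type, name) from groups 2 and 3, or None when there is no match."""
    m = pattern.match(line)
    return (m.group(2), m.group(3)) if m else None


for line in lines:
    print(line, "|", type_and_name(original, line), "|", type_and_name(attempt, line))
```

With the second pattern, the lazy type group stops at the first "long", the name group takes the second "long", and the new \s* lets the trailing character class swallow "myArray3[2]", which is why that line reports "long" for both groups.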
linux python arrays python-3.x
migrated from unix.stackexchange.com Sep 25 at 9:05
This question came from our site for users of Linux, FreeBSD and other Un*x-like operating systems.
Voting to move this to Stack Overflow as it's not a Unix or Linux question.
– Nasir Riley
Sep 25 at 3:05
asked Sep 25 at 2:13
caleb_hendrix
1 Answer
After tinkering a bit with your input and current regex, I came up with the following expression:
^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s?\[\d+\]\s*;\s*$
Instead of treating arrayName[size] and arrayName [size] as two separate cases, I'm just looking for the name of the array ([\w:]+), optionally followed by a single space \s?, and then the brackets with the size \[\d+\].
You can test the expression at https://regex101.com/r/6KDHjj/1
Do not hesitate to ask if you have any questions about my expression, or if it does not work as expected.
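As a quick check outside regex101, the first expression can be run against the four sample lines from the question. This is a minimal sketch: the names array_decl, samples and results are illustrative only, and the pattern is written with the backslashes it is assumed to have had.

```python
import re

# First expression: the name may be followed by at most one space,
# then a mandatory [digits] size.
array_decl = re.compile(
    r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s?\[\d+\]\s*;\s*$'
)

samples = [
    "long myArray1[2];",
    "long myArray2 [2];",
    "long long myArray3[2];",
    "long long myArray4 [2];",
]

results = {}
for line in samples:
    m = array_decl.match(line)
    # Group 2 is the variable type, group 3 the variable name.
    results[line] = (m.group(2), m.group(3)) if m else None
    print(line, "->", results[line])
```

Because \[\d+\] is mandatory here, this pattern only matches array declarations with a numeric size; a plain declaration such as long counter; does not match.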
Another possibility I just thought of would be to make the part matching the type of the array greedy:
^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$
I just removed the question mark at the end of ([\w<>:*, ]+?) to get this behaviour.
Here is a test of this second possibility: https://regex101.com/r/WMWC93/1
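The greedy variant can be checked the same way in Python. Again, this is only a sketch: greedy_decl and parse_declaration are illustrative names, and the pattern is written with the backslashes it is assumed to have had.

```python
import re

# Second expression: the type group is greedy, so it takes as many words as it
# can and backtracks only far enough to leave one identifier for the name group.
greedy_decl = re.compile(
    r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$'
)


def parse_declaration(line):
    """Return (type, name) for a declaration line, or None if it does not match."""
    m = greedy_decl.match(line)
    if m is None:
        return None
    return m.group(2), m.group(3)


for line in (
    "long myArray1[2];",
    "long myArray2 [2];",
    "long long myArray3[2];",
    "long long myArray4 [2];",
):
    print(line, "->", parse_declaration(line))
```

Unlike the first expression, the bracket part stays optional in this one, so a plain declaration such as long counter; is still matched. Checking for None before calling group() also avoids an AttributeError on lines that are not declarations.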
I didn't try your first solution. But, your second solution worked perfectly! Thank you!
– caleb_hendrix
Sep 25 at 16:40
answered Sep 25 at 15:08
HolyDanna