Python RegEx to Catch White Space Between Array Name and Size

I am using a regular expression in a python script to search through a file to find variable declarations. This is what I have so far:



import re

ret1 = re.compile(r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)[\w:\[\]]*\s*;\s*$')

ret2 = ret1.match(nextLineInFile)

print("Group 2: ", ret2.group(2))  # variable type
print("Group 3: ", ret2.group(3))  # variable name


Later in the code, I'm using groupings to capture the variable type and the variable name. I have the following input:



long myArray1[2];
long myArray2 [2];
long long myArray3[2];
long long myArray4 [2];


My RegEx is only finding myArray1 and myArray3. I need it to find all four declarations. I've tried the following:



ret = re.compile(r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$')


This catches myArray1, myArray2, and myArray4 perfectly. But, now myArray3 is coming back with a variable type of "long" and variable name of "long". What am I doing wrong?










migrated from unix.stackexchange.com Sep 25 at 9:05


This question came from our site for users of Linux, FreeBSD and other Un*x-like operating systems.














  • Voting to move this to Stackoverflow as it's not a Unix or Linux question.
    – Nasir Riley
    Sep 25 at 3:05














linux python arrays python-3.x






asked Sep 25 at 2:13 by caleb_hendrix




1 Answer
accepted










After tinkering a bit with your input and current regex, I've come up with the following expression: ^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s?\[\d+\]\s*;\s*$



Instead of looking for arrayName[size] and arrayName [size] as two separate elements, I'm just looking for the name of the array ([\w:]+), optionally followed by a space \s?, and then the brackets with the size \[\d+\]. Note that this version only matches declarations that end in a numeric array size.



You can test the expression at https://regex101.com/r/6KDHjj/1
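As a rough sketch, the first expression can also be checked locally with Python's re module. The names ARRAY_DECL and parse below are only illustrative and are not part of the original script; the sample lines are the ones from the question.

```python
import re

# First expression from this answer, with the backslashes written out.
# It only matches declarations that end in a numeric array size such as [2].
ARRAY_DECL = re.compile(
    r'^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+?)\s+([\w:]+)\s?\[\d+\]\s*;\s*$'
)

# Sample input taken from the question.
lines = [
    "long myArray1[2];",
    "long myArray2 [2];",
    "long long myArray3[2];",
    "long long myArray4 [2];",
]


def parse(line):
    """Return (variable type, variable name), or None if the line does not match."""
    m = ARRAY_DECL.match(line)
    return (m.group(2), m.group(3)) if m else None


results = [parse(line) for line in lines]
for line, result in zip(lines, results):
    print(line, "->", result)
```

Because the size brackets are now mandatory, the lazy type group can no longer stop after the first "long" in "long long myArray3[2];": nothing else in the pattern is able to absorb the real name, so the engine has to backtrack until the type is "long long".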



Do not hesitate to ask if you have any questions about my expression, or if it does not work as expected.



Another possibility I just thought of would be to make the part matching the type of the array greedy: ^\s*(volatile|register|typedef)?\s?([\w<>:*, ]+)\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$

I just removed the question mark at the end of ([\w<>:*, ]+?) to get this behaviour.



Here is a test of this second possibility : https://regex101.com/r/WMWC93/1
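To see why the question mark matters, here is a small comparison, assuming the backslash-escaped form of the pattern from the question's second attempt. The variable names (HEAD, TAIL, lazy, greedy) are made up for this illustration.

```python
import re

# The two patterns differ only in the quantifier on the type group:
# "+?" (lazy, as in the question) versus "+" (greedy, as suggested here).
HEAD = r'^\s*(volatile|register|typedef)?\s?'
TAIL = r'\s+([\w:]+)\s*[\w:\[\]]*\s*;\s*$'
lazy = re.compile(HEAD + r'([\w<>:*, ]+?)' + TAIL)
greedy = re.compile(HEAD + r'([\w<>:*, ]+)' + TAIL)

line = "long long myArray3[2];"

# group(2, 3) returns the (type, name) pair as a tuple.
lazy_groups = lazy.match(line).group(2, 3)
greedy_groups = greedy.match(line).group(2, 3)

print("lazy:  ", lazy_groups)
print("greedy:", greedy_groups)

# The greedy version on all four lines from the question.
samples = [
    "long myArray1[2];",
    "long myArray2 [2];",
    "long long myArray3[2];",
    "long long myArray4 [2];",
]
greedy_all = [greedy.match(s).group(2, 3) for s in samples]
```

With the lazy quantifier the type group stops at the first "long", the name group takes the second "long", and the trailing \s*[\w:\[\]]* swallows " myArray3[2]". With the greedy quantifier the type group takes as much as it can and only gives back the final word, so the name ends up in group 3.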






  • I didn't try your first solution. But, your second solution worked perfectly! Thank you!
    – caleb_hendrix
    Sep 25 at 16:40










answered Sep 25 at 15:08 by HolyDanna










