Capture all numbers up to three digits [duplicate]












up vote
9
down vote

favorite













This question already has an answer here:



  • Python regular expression match whole word

    4 answers



  • Match a whole word in a string using dynamic regex

    1 answer



  • Regex matching 5-digit substrings not enclosed with digits

    2 answers



  • Regex whitespace word boundary

    2 answers



I have the following string:



1 2 134 2009


And I'd like to capture the numbers that have between 1 and 3 digits, so the result should be:



['1', '2', '134']


What I have now captures those, but also captures the "first 3" digits in strings that contain more than 3 digits. This is the current regex I have:



>>> re.findall(r'\d{1,3}', '1 2 134 2009')
['1', '2', '134', '200', '9']

# or a bit closer --

>>> re.findall(r'\d{1,3}(?!\d)', '1 2 134 2009')
['1', '2', '134', '009']


What would be the correct way to make sure that another digit doesn't immediately precede it?










python regex














edited 1 hour ago

























asked 1 hour ago









David L





marked as duplicate by Wiktor Stribiżew python
11 mins ago


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.















  • 1




    What is the logic to match 123 in ['1', '2', '123']
    – The fourth bird
    1 hour ago










  • @Thefourthbird I suppose that it would be a 'self-contained number', for example if someone looked the above string they could see that 4 numbers were contained in it. Not sure if I can give a more rigorous explanation.
    – David L
    1 hour ago






  • 1




    @Thefourthbird oh I see. Sorry that was a typo -- fixed.
    – David L
    1 hour ago













2 Answers

















up vote
11
down vote



accepted










Add word boundaries:



import re

result = re.findall(r'\b\d{1,3}\b', '1 2 134 2009')

print(result)


Output



['1', '2', '134']


From the documentation for \b:



Matches the empty string, but only at the beginning or end of a word.
A word is defined as a sequence of word characters. Note that
formally, \b is defined as the boundary between a \w and a \W
character (or vice versa), or between \w and the beginning/end of the
string. This means that r'\bfoo\b' matches 'foo', 'foo.', '(foo)',
'bar foo baz' but not 'foobar' or 'foo3'.



By default Unicode alphanumerics are the ones used in Unicode
patterns, but this can be changed by using the ASCII flag. Word
boundaries are determined by the current locale if the LOCALE flag is
used. Inside a character range, \b represents the backspace character,
for compatibility with Python’s string literals.
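To make the quoted definition concrete, here is a small sketch of what counts as a boundary around digits besides a space. The sample string is made up for illustration and is not from the question. It also shows a lookaround variant, which is a different technique from \b: it only forbids neighbouring digits, so it may be closer to what the question literally asks, depending on whether letters and underscores should block a match.

```python
import re

# Hypothetical sample input, chosen to exercise several kinds of neighbours.
text = '1 2 134 2009 x12 12.5 (77) a_5'

# \b needs a non-word character (or a string edge) on each side of the digits,
# so punctuation such as '.', '(' and ')' works as a delimiter,
# while letters and '_' do not.
with_boundaries = re.findall(r'\b\d{1,3}\b', text)
print(with_boundaries)

# Lookarounds only reject an adjacent digit, so 'x12' and 'a_5' now yield matches.
with_lookarounds = re.findall(r'(?<!\d)\d{1,3}(?!\d)', text)
print(with_lookarounds)
```

With this input the first call returns ['1', '2', '134', '12', '5', '77'] and the second returns ['1', '2', '134', '12', '12', '5', '77', '5']. Neither pattern picks up any part of 2009.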







    edited 59 mins ago

























    answered 1 hour ago









    Daniel Mesejo












    • thanks for this. For a 'word boundary', what does this include other than a space?
      – David L
      1 hour ago










    • @DavidL Updated the answer!
      – Daniel Mesejo
      59 mins ago




























    up vote
    7
    down vote













    If there are only digits separated by whitespace in your string, using re is overkill. You can simply split the string and check the length of the substrings.



    >>> numbers = '1 2 134 2009'
    >>> [n for n in numbers.split() if len(n) <= 3]
    ['1', '2', '134']















        answered 1 hour ago









        timgeb












