Collect subsequences of consecutive integers

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP












4












$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question











$endgroup$











  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    Feb 4 at 15:25










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    Feb 4 at 15:34















4












$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question











$endgroup$











  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    Feb 4 at 15:25










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    Feb 4 at 15:34













4












4








4





$begingroup$


I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.










share|improve this question











$endgroup$




I've written a function to split a list into several lists with consecutive integers, e.g.:



extract_subsequences([1,2,3,7,8,9])



[[1, 2, 3], [7, 8, 9]]




extract_subsequences([1,2,3])



[[1, 2, 3]]




extract_subsequences([1,2,3, 10])



[[1, 2, 3], [10]]




extract_subsequences([1,2, 4,5, 8,9])



[[1, 2], [4, 5], [8, 9]]




This is the code that I came up with:



import numpy as np

def extract_subsequences(seq):
def split_sequence(seq):
for i in np.arange(1, len(seq)):
if seq[i] != seq[i - 1] + 1:
break

if i < len(seq) - 1:
return seq[:i], seq[i:]
else:
if seq[-1] == seq[-2] + 1:
return seq,
else:
return seq[:-1], [seq[-1]]

res =
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res


Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.







python python-3.x programming-challenge interval






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Feb 4 at 18:54









200_success

130k16153417




130k16153417










asked Feb 4 at 14:56









C. E.C. E.

1234




1234











  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    Feb 4 at 15:25










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    Feb 4 at 15:34
















  • $begingroup$
    Are all sequences sorted?
    $endgroup$
    – Ludisposed
    Feb 4 at 15:25










  • $begingroup$
    @Ludisposed Yes
    $endgroup$
    – C. E.
    Feb 4 at 15:34















$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25




$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25












$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34




$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34










3 Answers
3






active

oldest

votes


















6












$begingroup$

Using yield is often simpler than building up a list to return. Here it would reduce




 res = 
last = seq
while len(last) > 1:
first, last = split_sequence(last)
res.append(first)

if len(last) == 1:
res.append(last)

return res



to



last = seq
while len(last) > 1:
first, last = split_sequence(last)
yield first

if len(last) == 1:
yield last


at which point you could inline split_sequence.




split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



current = 
for val in seq:
if current != and val != current[-1] + 1:
yield current
current =
current += [val]

# Test is only necessary because seq might be empty
if current != :
yield current





share|improve this answer









$endgroup$




















    9












    $begingroup$

    @Peter Taylor has a good idea, but there is more to be improved



    • You should add some tests



    • Python is often described as Batteries included



      This is a perfect example to use itertools.groupby()




    from itertools import groupby
    import doctest

    def extract_seq(seq):
    """
    Splits sequence into consecutive lists

    args:
    seq (list): A sorted sequence

    >>> extract_seq([1,2, 4,5, 8,9])
    [[1, 2], [4, 5], [8, 9]]

    >>> extract_seq([1,2,3])
    [[1, 2, 3]]

    >>> extract_seq([1,2,3,7,8,9])
    [[1, 2, 3], [7, 8, 9]]

    >>> extract_seq([1,2,3,10])
    [[1, 2, 3], [10]]
    """
    return [
    [x for _, x in g]
    for k, g in groupby(
    enumerate(seq),
    lambda i_x : i_x[0] - i_x[1]
    )
    ]

    if __name__ == "__main__":
    doctest.testmod()





    share|improve this answer











    $endgroup$




















      4












      $begingroup$

      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



      for a, b in pairwise(seq):
      if b != a + 1:
      break





      share|improve this answer









      $endgroup$








      • 2




        $begingroup$
        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
        $endgroup$
        – C. E.
        Feb 5 at 9:11










      Your Answer





      StackExchange.ifUsing("editor", function ()
      return StackExchange.using("mathjaxEditing", function ()
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
      );
      );
      , "mathjax-editing");

      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "196"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212848%2fcollect-subsequences-of-consecutive-integers%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      3 Answers
      3






      active

      oldest

      votes








      3 Answers
      3






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      6












      $begingroup$

      Using yield is often simpler than building up a list to return. Here it would reduce




       res = 
      last = seq
      while len(last) > 1:
      first, last = split_sequence(last)
      res.append(first)

      if len(last) == 1:
      res.append(last)

      return res



      to



      last = seq
      while len(last) > 1:
      first, last = split_sequence(last)
      yield first

      if len(last) == 1:
      yield last


      at which point you could inline split_sequence.




      split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



      current = 
      for val in seq:
      if current != and val != current[-1] + 1:
      yield current
      current =
      current += [val]

      # Test is only necessary because seq might be empty
      if current != :
      yield current





      share|improve this answer









      $endgroup$

















        6












        $begingroup$

        Using yield is often simpler than building up a list to return. Here it would reduce




         res = 
        last = seq
        while len(last) > 1:
        first, last = split_sequence(last)
        res.append(first)

        if len(last) == 1:
        res.append(last)

        return res



        to



        last = seq
        while len(last) > 1:
        first, last = split_sequence(last)
        yield first

        if len(last) == 1:
        yield last


        at which point you could inline split_sequence.




        split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



        current = 
        for val in seq:
        if current != and val != current[-1] + 1:
        yield current
        current =
        current += [val]

        # Test is only necessary because seq might be empty
        if current != :
        yield current





        share|improve this answer









        $endgroup$















          6












          6








          6





          $begingroup$

          Using yield is often simpler than building up a list to return. Here it would reduce




           res = 
          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          res.append(first)

          if len(last) == 1:
          res.append(last)

          return res



          to



          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          yield first

          if len(last) == 1:
          yield last


          at which point you could inline split_sequence.




          split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



          current = 
          for val in seq:
          if current != and val != current[-1] + 1:
          yield current
          current =
          current += [val]

          # Test is only necessary because seq might be empty
          if current != :
          yield current





          share|improve this answer









          $endgroup$



          Using yield is often simpler than building up a list to return. Here it would reduce




           res = 
          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          res.append(first)

          if len(last) == 1:
          res.append(last)

          return res



          to



          last = seq
          while len(last) > 1:
          first, last = split_sequence(last)
          yield first

          if len(last) == 1:
          yield last


          at which point you could inline split_sequence.




          split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as



          current = 
          for val in seq:
          if current != and val != current[-1] + 1:
          yield current
          current =
          current += [val]

          # Test is only necessary because seq might be empty
          if current != :
          yield current






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Feb 4 at 15:38









          Peter TaylorPeter Taylor

          17.3k2862




          17.3k2862























              9












              $begingroup$

              @Peter Taylor has a good idea, but there is more to be improved



              • You should add some tests



              • Python is often described as Batteries included



                This is a perfect example to use itertools.groupby()




              from itertools import groupby
              import doctest

              def extract_seq(seq):
              """
              Splits sequence into consecutive lists

              args:
              seq (list): A sorted sequence

              >>> extract_seq([1,2, 4,5, 8,9])
              [[1, 2], [4, 5], [8, 9]]

              >>> extract_seq([1,2,3])
              [[1, 2, 3]]

              >>> extract_seq([1,2,3,7,8,9])
              [[1, 2, 3], [7, 8, 9]]

              >>> extract_seq([1,2,3,10])
              [[1, 2, 3], [10]]
              """
              return [
              [x for _, x in g]
              for k, g in groupby(
              enumerate(seq),
              lambda i_x : i_x[0] - i_x[1]
              )
              ]

              if __name__ == "__main__":
              doctest.testmod()





              share|improve this answer











              $endgroup$

















                9












                $begingroup$

                @Peter Taylor has a good idea, but there is more to be improved



                • You should add some tests



                • Python is often described as Batteries included



                  This is a perfect example to use itertools.groupby()




                from itertools import groupby
                import doctest

                def extract_seq(seq):
                """
                Splits sequence into consecutive lists

                args:
                seq (list): A sorted sequence

                >>> extract_seq([1,2, 4,5, 8,9])
                [[1, 2], [4, 5], [8, 9]]

                >>> extract_seq([1,2,3])
                [[1, 2, 3]]

                >>> extract_seq([1,2,3,7,8,9])
                [[1, 2, 3], [7, 8, 9]]

                >>> extract_seq([1,2,3,10])
                [[1, 2, 3], [10]]
                """
                return [
                [x for _, x in g]
                for k, g in groupby(
                enumerate(seq),
                lambda i_x : i_x[0] - i_x[1]
                )
                ]

                if __name__ == "__main__":
                doctest.testmod()





                share|improve this answer











                $endgroup$















                  9












                  9








                  9





                  $begingroup$

                  @Peter Taylor has a good idea, but there is more to be improved



                  • You should add some tests



                  • Python is often described as Batteries included



                    This is a perfect example to use itertools.groupby()




                  from itertools import groupby
                  import doctest

                  def extract_seq(seq):
                  """
                  Splits sequence into consecutive lists

                  args:
                  seq (list): A sorted sequence

                  >>> extract_seq([1,2, 4,5, 8,9])
                  [[1, 2], [4, 5], [8, 9]]

                  >>> extract_seq([1,2,3])
                  [[1, 2, 3]]

                  >>> extract_seq([1,2,3,7,8,9])
                  [[1, 2, 3], [7, 8, 9]]

                  >>> extract_seq([1,2,3,10])
                  [[1, 2, 3], [10]]
                  """
                  return [
                  [x for _, x in g]
                  for k, g in groupby(
                  enumerate(seq),
                  lambda i_x : i_x[0] - i_x[1]
                  )
                  ]

                  if __name__ == "__main__":
                  doctest.testmod()





                  share|improve this answer











                  $endgroup$



                  @Peter Taylor has a good idea, but there is more to be improved



                  • You should add some tests



                  • Python is often described as Batteries included



                    This is a perfect example to use itertools.groupby()




                  from itertools import groupby
                  import doctest

                  def extract_seq(seq):
                  """
                  Splits sequence into consecutive lists

                  args:
                  seq (list): A sorted sequence

                  >>> extract_seq([1,2, 4,5, 8,9])
                  [[1, 2], [4, 5], [8, 9]]

                  >>> extract_seq([1,2,3])
                  [[1, 2, 3]]

                  >>> extract_seq([1,2,3,7,8,9])
                  [[1, 2, 3], [7, 8, 9]]

                  >>> extract_seq([1,2,3,10])
                  [[1, 2, 3], [10]]
                  """
                  return [
                  [x for _, x in g]
                  for k, g in groupby(
                  enumerate(seq),
                  lambda i_x : i_x[0] - i_x[1]
                  )
                  ]

                  if __name__ == "__main__":
                  doctest.testmod()






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Feb 5 at 8:31

























                  answered Feb 4 at 15:52









                  LudisposedLudisposed

                  8,11222161




                  8,11222161





















                      4












                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$








                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        Feb 5 at 9:11















                      4












                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$








                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        Feb 5 at 9:11













                      4












                      4








                      4





                      $begingroup$

                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break





                      share|improve this answer









                      $endgroup$



                      One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):



                      for a, b in pairwise(seq):
                      if b != a + 1:
                      break






                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered Feb 4 at 22:54









                      Solomon UckoSolomon Ucko

                      1,1521415




                      1,1521415







                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        Feb 5 at 9:11












                      • 2




                        $begingroup$
                        Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                        $endgroup$
                        – C. E.
                        Feb 5 at 9:11







                      2




                      2




                      $begingroup$
                      Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                      $endgroup$
                      – C. E.
                      Feb 5 at 9:11




                      $begingroup$
                      Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
                      $endgroup$
                      – C. E.
                      Feb 5 at 9:11

















                      draft saved

                      draft discarded
















































                      Thanks for contributing an answer to Code Review Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid


                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.

                      Use MathJax to format equations. MathJax reference.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212848%2fcollect-subsequences-of-consecutive-integers%23new-answer', 'question_page');

                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown






                      Popular posts from this blog

                      How to check contact read email or not when send email to Individual?

                      Displaying single band from multi-band raster using QGIS

                      How many registers does an x86_64 CPU actually have?