Collect subsequences of consecutive integers

I've written a function to split a list into several lists with consecutive integers, e.g.:

extract_subsequences([1,2,3,7,8,9])

[[1, 2, 3], [7, 8, 9]]

extract_subsequences([1,2,3])

[[1, 2, 3]]

extract_subsequences([1,2,3, 10])

[[1, 2, 3], [10]]

extract_subsequences([1,2, 4,5, 8,9])

[[1, 2], [4, 5], [8, 9]]

This is the code that I came up with:

import numpy as np

def extract_subsequences(seq):
 def split_sequence(seq):
 for i in np.arange(1, len(seq)):
 if seq[i] != seq[i - 1] + 1:
 break

 if i < len(seq) - 1:
 return seq[:i], seq[i:]
 else:
 if seq[-1] == seq[-2] + 1:
 return seq, 
 else:
 return seq[:-1], [seq[-1]]

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

Can someone help me improve this code? It's kind of difficult to read with all the indices and special cases, and for the same reasons it was also quite difficult to write. I would very much appreciate a better way of thinking about this problem.

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25

$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34

add a comment |

I've written a function to split a list into several lists with consecutive integers, e.g.:

extract_subsequences([1,2,3,7,8,9])

[[1, 2, 3], [7, 8, 9]]

extract_subsequences([1,2,3])

[[1, 2, 3]]

extract_subsequences([1,2,3, 10])

[[1, 2, 3], [10]]

extract_subsequences([1,2, 4,5, 8,9])

[[1, 2], [4, 5], [8, 9]]

This is the code that I came up with:

import numpy as np

def extract_subsequences(seq):
 def split_sequence(seq):
 for i in np.arange(1, len(seq)):
 if seq[i] != seq[i - 1] + 1:
 break

 if i < len(seq) - 1:
 return seq[:i], seq[i:]
 else:
 if seq[-1] == seq[-2] + 1:
 return seq, 
 else:
 return seq[:-1], [seq[-1]]

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25

$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34

add a comment |

I've written a function to split a list into several lists with consecutive integers, e.g.:

extract_subsequences([1,2,3,7,8,9])

[[1, 2, 3], [7, 8, 9]]

extract_subsequences([1,2,3])

[[1, 2, 3]]

extract_subsequences([1,2,3, 10])

[[1, 2, 3], [10]]

extract_subsequences([1,2, 4,5, 8,9])

[[1, 2], [4, 5], [8, 9]]

This is the code that I came up with:

import numpy as np

def extract_subsequences(seq):
 def split_sequence(seq):
 for i in np.arange(1, len(seq)):
 if seq[i] != seq[i - 1] + 1:
 break

 if i < len(seq) - 1:
 return seq[:i], seq[i:]
 else:
 if seq[-1] == seq[-2] + 1:
 return seq, 
 else:
 return seq[:-1], [seq[-1]]

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

I've written a function to split a list into several lists with consecutive integers, e.g.:

extract_subsequences([1,2,3,7,8,9])

[[1, 2, 3], [7, 8, 9]]

extract_subsequences([1,2,3])

[[1, 2, 3]]

extract_subsequences([1,2,3, 10])

[[1, 2, 3], [10]]

extract_subsequences([1,2, 4,5, 8,9])

[[1, 2], [4, 5], [8, 9]]

This is the code that I came up with:

import numpy as np

def extract_subsequences(seq):
 def split_sequence(seq):
 for i in np.arange(1, len(seq)):
 if seq[i] != seq[i - 1] + 1:
 break

 if i < len(seq) - 1:
 return seq[:i], seq[i:]
 else:
 if seq[-1] == seq[-2] + 1:
 return seq, 
 else:
 return seq[:-1], [seq[-1]]

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

python python-3.x programming-challenge interval

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

edited Feb 4 at 18:54

200_success

130k16153417

edited Feb 4 at 18:54

200_success

130k16153417

edited Feb 4 at 18:54

200_success

130k16153417

asked Feb 4 at 14:56

C. E.

1234

asked Feb 4 at 14:56

C. E.

1234

asked Feb 4 at 14:56

C. E.

1234

$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25

$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34

add a comment |

$begingroup$
Are all sequences sorted?
$endgroup$
– Ludisposed
Feb 4 at 15:25

$begingroup$
@Ludisposed Yes
$endgroup$
– C. E.
Feb 4 at 15:34

Are all sequences sorted?

– Ludisposed
Feb 4 at 15:25

@Ludisposed Yes

– C. E.
Feb 4 at 15:34

add a comment |

3 Answers
3

active

oldest

votes

Using yield is often simpler than building up a list to return. Here it would reduce

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

last = seq
while len(last) > 1:
 first, last = split_sequence(last)
 yield first

if len(last) == 1:
 yield last

at which point you could inline split_sequence.

split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as

current = 
for val in seq:
 if current != and val != current[-1] + 1:
 yield current
 current = 
 current += [val]

# Test is only necessary because seq might be empty
if current != :
 yield current

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

add a comment |

@Peter Taylor has a good idea, but there is more to be improved

You should add some tests

Python is often described as Batteries included

This is a perfect example to use itertools.groupby()

from itertools import groupby
import doctest

def extract_seq(seq):
 """
 Splits sequence into consecutive lists

 args:
 seq (list): A sorted sequence

 >>> extract_seq([1,2, 4,5, 8,9])
 [[1, 2], [4, 5], [8, 9]]

 >>> extract_seq([1,2,3])
 [[1, 2, 3]]

 >>> extract_seq([1,2,3,7,8,9])
 [[1, 2, 3], [7, 8, 9]]

 >>> extract_seq([1,2,3,10])
 [[1, 2, 3], [10]]
 """
 return [
 [x for _, x in g]
 for k, g in groupby(
 enumerate(seq), 
 lambda i_x : i_x[0] - i_x[1]
 )
 ]

if __name__ == "__main__":
 doctest.testmod()

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

add a comment |

One more thing that no one has explicitly pointed out yet, because it is made irrelevant: there's no need to import numpy just to iterate over numpy.arange. Just use range instead: for i in range(1, len(seq)):. Or even better, use the itertools recipe pairwise (available with more-itertools):

for a, b in pairwise(seq):
 if b != a + 1:
 break

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

2

$begingroup$
Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
$endgroup$
– C. E.
Feb 5 at 9:11

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f212848%2fcollect-subsequences-of-consecutive-integers%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

Using yield is often simpler than building up a list to return. Here it would reduce

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

last = seq
while len(last) > 1:
 first, last = split_sequence(last)
 yield first

if len(last) == 1:
 yield last

at which point you could inline split_sequence.

split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as

current = 
for val in seq:
 if current != and val != current[-1] + 1:
 yield current
 current = 
 current += [val]

# Test is only necessary because seq might be empty
if current != :
 yield current

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

add a comment |

Using yield is often simpler than building up a list to return. Here it would reduce

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

last = seq
while len(last) > 1:
 first, last = split_sequence(last)
 yield first

if len(last) == 1:
 yield last

at which point you could inline split_sequence.

split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as

current = 
for val in seq:
 if current != and val != current[-1] + 1:
 yield current
 current = 
 current += [val]

# Test is only necessary because seq might be empty
if current != :
 yield current

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

add a comment |

Using yield is often simpler than building up a list to return. Here it would reduce

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

last = seq
while len(last) > 1:
 first, last = split_sequence(last)
 yield first

if len(last) == 1:
 yield last

at which point you could inline split_sequence.

split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as

current = 
for val in seq:
 if current != and val != current[-1] + 1:
 yield current
 current = 
 current += [val]

# Test is only necessary because seq might be empty
if current != :
 yield current

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

Using yield is often simpler than building up a list to return. Here it would reduce

 res = 
 last = seq
 while len(last) > 1:
 first, last = split_sequence(last)
 res.append(first)

 if len(last) == 1:
 res.append(last)

 return res

last = seq
while len(last) > 1:
 first, last = split_sequence(last)
 yield first

if len(last) == 1:
 yield last

at which point you could inline split_sequence.

split_sequence seems unnecessarily complicated to me. The entire extract_subsequences could be written with only one special case as

current = 
for val in seq:
 if current != and val != current[-1] + 1:
 yield current
 current = 
 current += [val]

# Test is only necessary because seq might be empty
if current != :
 yield current

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

answered Feb 4 at 15:38

Peter Taylor

17.3k2862

add a comment |

@Peter Taylor has a good idea, but there is more to be improved

You should add some tests

Python is often described as Batteries included

This is a perfect example to use itertools.groupby()

from itertools import groupby
import doctest

def extract_seq(seq):
 """
 Splits sequence into consecutive lists

 args:
 seq (list): A sorted sequence

 >>> extract_seq([1,2, 4,5, 8,9])
 [[1, 2], [4, 5], [8, 9]]

 >>> extract_seq([1,2,3])
 [[1, 2, 3]]

 >>> extract_seq([1,2,3,7,8,9])
 [[1, 2, 3], [7, 8, 9]]

 >>> extract_seq([1,2,3,10])
 [[1, 2, 3], [10]]
 """
 return [
 [x for _, x in g]
 for k, g in groupby(
 enumerate(seq), 
 lambda i_x : i_x[0] - i_x[1]
 )
 ]

if __name__ == "__main__":
 doctest.testmod()

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

add a comment |

@Peter Taylor has a good idea, but there is more to be improved

You should add some tests

Python is often described as Batteries included

This is a perfect example to use itertools.groupby()

from itertools import groupby
import doctest

def extract_seq(seq):
 """
 Splits sequence into consecutive lists

 args:
 seq (list): A sorted sequence

 >>> extract_seq([1,2, 4,5, 8,9])
 [[1, 2], [4, 5], [8, 9]]

 >>> extract_seq([1,2,3])
 [[1, 2, 3]]

 >>> extract_seq([1,2,3,7,8,9])
 [[1, 2, 3], [7, 8, 9]]

 >>> extract_seq([1,2,3,10])
 [[1, 2, 3], [10]]
 """
 return [
 [x for _, x in g]
 for k, g in groupby(
 enumerate(seq), 
 lambda i_x : i_x[0] - i_x[1]
 )
 ]

if __name__ == "__main__":
 doctest.testmod()

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

add a comment |

@Peter Taylor has a good idea, but there is more to be improved

You should add some tests

Python is often described as Batteries included

This is a perfect example to use itertools.groupby()

from itertools import groupby
import doctest

def extract_seq(seq):
 """
 Splits sequence into consecutive lists

 args:
 seq (list): A sorted sequence

 >>> extract_seq([1,2, 4,5, 8,9])
 [[1, 2], [4, 5], [8, 9]]

 >>> extract_seq([1,2,3])
 [[1, 2, 3]]

 >>> extract_seq([1,2,3,7,8,9])
 [[1, 2, 3], [7, 8, 9]]

 >>> extract_seq([1,2,3,10])
 [[1, 2, 3], [10]]
 """
 return [
 [x for _, x in g]
 for k, g in groupby(
 enumerate(seq), 
 lambda i_x : i_x[0] - i_x[1]
 )
 ]

if __name__ == "__main__":
 doctest.testmod()

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

@Peter Taylor has a good idea, but there is more to be improved

You should add some tests

Python is often described as Batteries included

This is a perfect example to use itertools.groupby()

from itertools import groupby
import doctest

def extract_seq(seq):
 """
 Splits sequence into consecutive lists

 args:
 seq (list): A sorted sequence

 >>> extract_seq([1,2, 4,5, 8,9])
 [[1, 2], [4, 5], [8, 9]]

 >>> extract_seq([1,2,3])
 [[1, 2, 3]]

 >>> extract_seq([1,2,3,7,8,9])
 [[1, 2, 3], [7, 8, 9]]

 >>> extract_seq([1,2,3,10])
 [[1, 2, 3], [10]]
 """
 return [
 [x for _, x in g]
 for k, g in groupby(
 enumerate(seq), 
 lambda i_x : i_x[0] - i_x[1]
 )
 ]

if __name__ == "__main__":
 doctest.testmod()

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

edited Feb 5 at 8:31

answered Feb 4 at 15:52

Ludisposed

8,11222161

answered Feb 4 at 15:52

Ludisposed

8,11222161

answered Feb 4 at 15:52

Ludisposed

8,11222161

add a comment |

for a, b in pairwise(seq):
 if b != a + 1:
 break

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

2

$begingroup$
Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
$endgroup$
– C. E.
Feb 5 at 9:11

add a comment |

for a, b in pairwise(seq):
 if b != a + 1:
 break

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

2

$begingroup$
Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
$endgroup$
– C. E.
Feb 5 at 9:11

add a comment |

for a, b in pairwise(seq):
 if b != a + 1:
 break

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

for a, b in pairwise(seq):
 if b != a + 1:
 break

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

answered Feb 4 at 22:54

Solomon Ucko

1,1521415

2

$begingroup$
Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
$endgroup$
– C. E.
Feb 5 at 9:11

add a comment |

2

$begingroup$
Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.
$endgroup$
– C. E.
Feb 5 at 9:11

Thank you for pointing me to more-itertools! It has a function called consecutive_groups which does that which is needed.

– C. E.
Feb 5 at 9:11

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

搜尋此網誌

mjhjmtu