How to sort a file by duration column?

up vote
2
down vote

How can I sort a file containing the durations below? (s = second, m = minute, h = hour, d = day)

1s
2s
1h
2h
1m
2m
2s
1d
1m

edited Oct 15 '17 at 18:58

GAD3R

22.7k154895

asked Oct 15 '17 at 9:22

mert inan

1525

3

Are there always only whole numbers, nothing like 1h30m40s or 1.30h?
– jimmij
Oct 15 '17 at 10:02

5 Answers
up vote
5
down vote

accepted

awk '{ unitvalue=$1; };
 /s/ { m=1 }; /m/ { m=60 }; /h/ { m=3600 }; /d/ { m=86400 };
 { sub("[smhd]","",unitvalue); unitvalue=unitvalue*m;
 print unitvalue " " $1; }' input |
 sort -n | awk '{ print $2 }'
1s
2s
2s
1m
1m
2m
1h
2h
1d

answered Oct 15 '17 at 11:03

Hauke Laging

53.6k1282130


up vote
4
down vote

First version - FPAT is used

gawk '
BEGIN { FPAT = "[0-9]+|[smhd]"; }

/s/ { factor = 1 }
/m/ { factor = 60 }
/h/ { factor = 3600 }
/d/ { factor = 86400 }

{ print $1 * factor, $0; }
' input.txt | sort -n | awk '{print $2}'

FPAT - A regular expression describing the contents of the fields in a record. When set, gawk parses the input into fields, where the fields match the regular expression, instead of using the value of the FS variable as the field separator.

Second version

I was surprised to discover that it also works without FPAT.
This is due to awk's number conversion mechanism, described in How awk Converts Between Strings and Numbers, namely:

A string is converted to a number by interpreting any numeric prefix of the string as numerals: "2.5" converts to 2.5, "1e3" converts to 1,000, and "25fix" has a numeric value of 25. Strings that can't be interpreted as valid numbers convert to zero.

gawk '
/s/ { factor = 1 }
/m/ { factor = 60 }
/h/ { factor = 3600 }
/d/ { factor = 86400 }

{ print $0 * factor, $0; }
' input.txt | sort -n | awk '{print $2}'
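
The numeric-prefix rule quoted above can be mimicked outside awk. The following Python sketch is only an approximation of that behaviour: the helper names `numeric_prefix` and `to_seconds` are made up for illustration, and the regex covers plain decimal and exponent forms only. It shows why multiplying a string such as 122s by a factor still yields a usable sort key.

```python
import re

# Leading decimal number with optional fraction and exponent; an
# approximation of what awk accepts as a numeric prefix (assumption).
_PREFIX = re.compile(r'\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?')

# Seconds per unit letter, as in the answers above.
FACTORS = {'s': 1, 'm': 60, 'h': 3600, 'd': 86400}

def numeric_prefix(text):
    """Return the value of the leading number in text, or 0 if there is none."""
    match = _PREFIX.match(text)
    return float(match.group(0)) if match else 0.0

def to_seconds(duration):
    """Convert a single-unit duration such as '122s' or '2m' to seconds."""
    return numeric_prefix(duration) * FACTORS[duration[-1]]

if __name__ == '__main__':
    lines = ['1s', '122s', '1h', '2h', '1m', '2m', '2s', '1d', '1m']
    print(sorted(lines, key=to_seconds))
```

Unlike the awk pipeline, the sort key is computed in memory, so no decorate and undecorate steps are needed.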

Input (changed a little bit)

1s
122s
1h
2h
1m
2m
2s
1d
1m

Output

Note: 122 seconds is more than 2 minutes, so it is sorted after 2m.

1s
2s
1m
1m
2m
122s
1h
2h
1d

edited Oct 15 '17 at 19:34

answered Oct 15 '17 at 14:44

MiniMax

2,706719

1

+1 I like the clever use of FPAT. This could easily be expanded to accept and handle time values like 1d3h10m40s.
– David Foerster
Oct 15 '17 at 16:44

@DavidFoerster I looked at your awk answer and discovered an interesting fact: strings like 1s, 3d, 4m are converted to integers by awk itself, without problems. So they can be used in math operations directly, without splitting by regex. I have added a second version of the solution and an explanation of this behaviour too.
– MiniMax
Oct 15 '17 at 19:42


up vote
2
down vote

If you only have times in the format of your question:

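sort -k 1.2,1.2r -k 1.1,1.1 <file>

Where <file> is the file your data resides in. This command sorts on the second character (the unit letter) in reverse order and then on the first character (the digit) in ascending order. This works because the alphabetical order of the unit letters (d < h < m < s) happens to be exactly the reverse of the order we want (second < minute < hour < day), so reversing the unit key puts seconds first and days last. It assumes every duration is a single digit followed by a single unit letter.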
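
A minimal Python sketch of the same two-key idea, assuming every line is exactly one digit followed by one unit letter (the function name is made up for illustration). Alphabetically the unit letters run d < h < m < s, the reverse of the duration order, so the sketch negates the unit's code point to put seconds first.

```python
def unit_then_digit(duration):
    """Sort key: unit letter in reverse alphabetical order, then the digit."""
    digit, unit = duration[0], duration[1]
    # Negating the code point reverses the alphabetical order: s, m, h, d.
    return (-ord(unit), digit)

if __name__ == '__main__':
    lines = ['1s', '2s', '1h', '2h', '1m', '2m', '2s', '1d', '1m']
    print(sorted(lines, key=unit_then_digit))
```

As with the sort command, this breaks as soon as a value has more than one digit, for example 122s.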

answered Oct 15 '17 at 10:58

PawkyPenguin

696110

1

Just like the other (now deleted) answer, this assumes durations are single-digit...
– don_crissti
Oct 15 '17 at 10:59

@don_crissti I answered the question because in the worst case I can just delete and in the best case this is exactly what he was looking for. I thought this was a better approach than waiting for an edit of the question (which potentially takes a long time, so by then the question might be lost).
– PawkyPenguin
Oct 15 '17 at 11:32


up vote
2
down vote

This is an extension of MiniMax's answer that can handle a broader range of duration values, such as 1d3h10m40s.

GNU Awk program (stored in parse-times.awk for the sake of this answer):

#!/usr/bin/gawk -f
BEGIN {
    FPAT = "[0-9]+[dhms]";
    duration["s"] = 1;
    duration["m"] = 60;
    duration["h"] = duration["m"] * 60;
    duration["d"] = duration["h"] * 24;
}

{
    t = 0;
    for (i = 1; i <= NF; i++)
        t += $i * duration[substr($i, length($i))];
    print(t, $0);
}

Invocation:

gawk -f parse-times.awk input.txt | sort -n -k 1,1 | cut -d ' ' -f 2

answered Oct 15 '17 at 17:04

David Foerster

917616


up vote
1
down vote

Solution in Python 3:

#!/usr/bin/python3
import re, fileinput

class RegexMatchIterator:
    def __init__(self, regex, string, error_on_incomplete=False):
        self.regex = regex
        self.string = string
        self.error_on_incomplete = error_on_incomplete
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        match = self.regex.match(self.string, self.pos)
        if match is not None:
            if match.end() > self.pos:
                self.pos = match.end()
                return match
            else:
                fmt = '{0!s} returns an empty match at position {1:d} for "{3!r}"'

        elif self.error_on_incomplete and self.pos < len(self.string):
            if isinstance(self.error_on_incomplete, str):
                fmt = self.error_on_incomplete
            else:
                fmt = "{0!s} didn't match the suffix {3!r} at position {1:d} of {2!r}"

        else:
            raise StopIteration(self.pos)

        raise ValueError(fmt.format(
            self.regex, self.pos, self.string, self.string[self.pos:]))


DURATION_SUFFIXES = { 's': 1, 'm': 60, 'h': 3600, 'd': 24*3600 }
DURATION_PATTERN = re.compile(
    r'(\d+)(' + '|'.join(map(re.escape, DURATION_SUFFIXES.keys())) + ')')

def parse_duration(s):
    return sum(
        int(m.group(1)) * DURATION_SUFFIXES[m.group(2)]
        for m in RegexMatchIterator(DURATION_PATTERN, s,
            'Illegal duration string {3!r} at position {1:d}'))


if __name__ == '__main__':
    with fileinput.input() as f:
        result = sorted((l.rstrip('\n') for l in f), key=parse_duration)
    for item in result:
        print(item)

As you can see, I spent about ⅔ of the line count on a useful iterator over regex.match() results, because regex.finditer() doesn't tie matches to the beginning of the current region and there are no other suitable ways to iterate over match results. *grrr*
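
For comparison, here is a shorter sketch that avoids the custom iterator: it validates the whole string once with `re.fullmatch` and then extracts the components with `re.findall`. This is an alternative under the same assumptions (whole numbers, units s/m/h/d), not the approach used above, and the names `SUFFIXES`, `COMPONENT` and `WHOLE` are illustrative. One difference to note: it rejects an empty line instead of treating it as zero seconds.

```python
import re

SUFFIXES = {'s': 1, 'm': 60, 'h': 3600, 'd': 24 * 3600}
COMPONENT = r'(\d+)([smhd])'
# The whole string must consist of one or more number+unit components.
WHOLE = re.compile(r'(?:\d+[smhd])+')

def parse_duration(text):
    """Return the total seconds in a duration such as '1d3h10m40s'."""
    if not WHOLE.fullmatch(text):
        raise ValueError('Illegal duration string {!r}'.format(text))
    return sum(int(n) * SUFFIXES[u] for n, u in re.findall(COMPONENT, text))

if __name__ == '__main__':
    lines = ['2m', '1d3h10m40s', '122s', '1h', '90m']
    print(sorted(lines, key=parse_duration))
```

Because the full-string check runs first, `re.findall` can never silently skip over unparseable text between components.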

edited Oct 15 '17 at 19:08

answered Oct 15 '17 at 18:56

David Foerster

917616

