Sort a file while grouping indented lines with their parent (multiple level)
All levels should be sorted alphabetically, but each line must stay with its parent.

File example:

    first
        apple
        orange
            train
            car
        kiwi
    third
        orange
        apple
            plane
    second
        lemon

Expected result:

    first
        apple
        kiwi
        orange
            car
            train
    second
        lemon
    third
        apple
            plane
        orange

I have used the following command, but it works only if the tree in the file has only two levels:

    sed '/^[^[:blank:]]/h;//!G;s/\(.*\)\n\(.*\)/\2\x02\1/' infile | sort | sed 's/.*\x02//'

How can I sort all the levels correctly?

Thanks in advance.

Tags: text-processing, sort
asked Jul 3 at 16:59 by nick10; edited Jul 4 at 16:25 by Isaac
Comments:

Please format your input content properly (as it actually looks): copy and paste it, then apply the code sample formatting to the selected fragment. (RomanPerekhrest, Jul 3 at 17:04)

Could the file have more than 3 levels? (RomanPerekhrest, Jul 3 at 17:23)

4 levels are possible. (nick10, Jul 3 at 17:31)

Are there spaces before the first, second (1st level) values? (RomanPerekhrest, Jul 3 at 17:33)

No spaces before the first-level values. (nick10, Jul 3 at 17:35)
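To see why the sed command in the question stops at two levels, here is a minimal Python sketch of the same decorate, sort, undecorate idea. It assumes plain code-point ordering (as `sort` gives in the C locale); the function names and the small sample are made up for illustration. Every indented line is prefixed with the last top-level line only, so a third-level line loses track of its real parent.

```python
def decorate_with_top_parent(lines):
    """Mimic the sed step: prefix each indented line with the last top-level line."""
    parent = ""
    out = []
    for line in lines:
        if line[:1].isspace():
            out.append(parent + "\x02" + line)   # "parent\x02child"
        else:
            parent = line                        # remember the top-level line
            out.append(line)
    return out


def two_level_sort(lines):
    """Sort the decorated lines, then drop everything up to the \\x02 marker."""
    decorated = sorted(decorate_with_top_parent(lines))
    return [d.split("\x02")[-1] for d in decorated]


sample = [
    "first",
    "    orange",
    "        train",
    "        car",
    "    apple",
]

for line in two_level_sort(sample):
    print(line)
# first
#         car
#         train
#     apple
#     orange
```

The third-level lines "car" and "train" sort ahead of "apple" and are no longer below "orange", because their sort key contains only "first" and not "orange".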
4 Answers
Extended Python solution:

Sample infile contents (4 levels):

    first
        apple
        orange
            train
            car
                truck
                automobile
        kiwi
    third
        orange
        apple
            plane
    second
        lemon

sort_hierarchy.py script:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

    import sys
    import re

    with open(sys.argv[1], 'rt') as f:
        pat = re.compile(r'^\s+')    # leading whitespace = indentation
        paths = []                   # one dot-joined path per input line
        prev_offset = 0

        for line in f:
            offset = pat.match(line)
            item = line.strip()

            if not offset:
                # top-level line: the path is the item itself
                offset = 0
                paths.append(item)
            else:
                offset = offset.span()[1]
                if offset > prev_offset:
                    # deeper than the previous line: child of the previous path
                    paths.append(paths[-1] + '.' + item)
                else:
                    # same level or shallower: cut trailing components first
                    cut_pos = -prev_offset//offset
                    paths.append('.'.join(paths[-1].split('.')[:cut_pos]) + '.' + item)

            prev_offset = offset

        paths.sort()
        # turn every "ancestor." prefix back into four spaces of indentation
        sub_pat = re.compile(r'[^.]+\.')
        for i in paths:
            print(sub_pat.sub(' ' * 4, i))

Usage:

    python sort_hierarchy.py path/to/infile

The output:

    first
        apple
        kiwi
        orange
            car
                automobile
                truck
            train
    second
        lemon
    third
        apple
            plane
        orange
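A variation on the same path idea, as a minimal sketch rather than a replacement for the script above: it keeps the ancestors of the current line on a stack instead of deriving the cut position from the ratio of offsets, so it depends neither on the indent width nor on the number of levels. It also sorts on tuples of names, which avoids any clash between the '.' separator and characters inside an item. The function name `sort_indented` and the inline sample are assumptions made for illustration.

```python
def sort_indented(lines):
    """Sort an indented outline, keeping every line below its parent."""
    stack = []   # (indent, name) for each ancestor of the current line
    keyed = []   # (path tuple, original line)
    for line in lines:
        name = line.strip()
        if not name:
            continue                          # skip blank lines
        indent = len(line) - len(line.lstrip())
        # Drop entries that are not strictly shallower than this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, name))
        key = tuple(n for _, n in stack)      # e.g. ("first", "orange", "car")
        keyed.append((key, line.rstrip("\n")))
    # A parent's tuple is a prefix of its children's, so it sorts first.
    return [line for _, line in sorted(keyed)]


text = """\
first
    apple
    orange
        train
        car
    kiwi
third
    orange
    apple
        plane
second
    lemon
"""

for line in sort_indented(text.splitlines()):
    print(line)
```

With the example file from the question, this prints the expected result shown there.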
answered Jul 3 at 22:00 by RomanPerekhrest (accepted answer)
Comment: @nick10, learn about "What should I do when someone answers my question?" (RomanPerekhrest, Jul 4 at 10:50)
Awk solution:

Sample infile contents (4 levels):

    first
        apple
        orange
            train
            car
                truck
                automobile
        kiwi
    third
        orange
        apple
            plane
    second
        lemon

    awk '{
        offset = gsub(/ /, "")        # number of spaces removed = indentation
        if (offset == 0) { items[NR] = $1 }
        else if (offset > prev_ofst) { items[NR] = items[NR-1] "." $1 }
        else {
            # same level or shallower: strip trailing components from the previous path
            n = int(prev_ofst / offset)
            if (n * offset < prev_ofst) n++    # round up, as the floor division in the Python version does
            prev_item = items[NR-1]
            gsub("(\\.[^.]+){" n "}$", "", prev_item)
            items[NR] = prev_item "." $1
        }
        prev_ofst = offset
    }
    END {
        asort(items)                  # asort() is a gawk extension
        for (i = 1; i <= NR; i++) {
            gsub(/[^.]+\./, "    ", items[i])
            print items[i]
        }
    }' infile

The output:

    first
        apple
        kiwi
        orange
            car
                automobile
                truck
            train
    second
        lemon
    third
        apple
            plane
        orange
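The dynamic regex passed to gsub() removes trailing path components before the new item is appended. As a small worked example of that single step, here is the same pattern in Python; the helper name `strip_components` is hypothetical and only serves the illustration.

```python
import re


def strip_components(path, n):
    """Remove the last n dot-separated components from a dot-joined path."""
    return re.sub(r'(\.[^.]+){%d}$' % n, '', path)


# Going from "car" (8 spaces) back to "kiwi" (4 spaces): strip two components.
parent = strip_components('first.orange.car', 2)
print(parent)                                   # first
print(parent + '.' + 'kiwi')                    # first.kiwi

# Staying on the same level: strip one component.
print(strip_components('first.orange.car', 1))  # first.orange
```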
answered Jul 3 at 22:06 by RomanPerekhrest
Works for any depth (the input is assumed to be tab-indented):

    #!/usr/bin/python3
    lines = open('test_file').read().splitlines()

    def yield_sorted_lines(lines):
        sorter = []    # current path: one entry per level
        for l in lines:
            fields = l.split('\t')    # leading tabs yield empty fields
            n = len(fields)           # n - 1 is the depth of this line
            sorter = sorter[:n-1] + fields[n-1:]
            yield sorter, l

    prefixed_lines = yield_sorted_lines(lines)
    sorted_lines = sorted(prefixed_lines, key=lambda x: x[0])
    for x, y in sorted_lines:
        print(y)

Or as a pipeline:

    awk -F'\t' '{a[NF]=$NF; for (i=1; i<=NF; ++i) printf "%s%s", a[i], i==NF? "\n": "\t"}' file |
    sort | awk -F'\t' -vOFS='\t' '{for (i=1; i<NF; ++i) $i=""; print}'
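The pipeline above is a decorate, sort, undecorate chain. To make the intermediate form visible, here is a rough Python sketch of the two awk stages. It assumes tab-indented input and plain code-point ordering; the names `decorate` and `undecorate` and the tiny sample are made up for illustration.

```python
def decorate(lines):
    """First stage: expand each tab-indented line to its full tab-joined path."""
    path = {}   # level number -> most recent name seen at that level
    out = []
    for line in lines:
        fields = line.split("\t")
        nf = len(fields)                 # same role as NF in awk
        path[nf] = fields[-1]
        out.append("\t".join(path.get(i, "") for i in range(1, nf + 1)))
    return out


def undecorate(lines):
    """Last stage: blank every field except the last, keeping the tabs."""
    out = []
    for line in lines:
        fields = line.split("\t")
        out.append("\t" * (len(fields) - 1) + fields[-1])
    return out


sample = ["b", "\ty", "\tx", "\t\tk", "a", "\tz"]

decorated = decorate(sample)
print(decorated)
# ['b', 'b\ty', 'b\tx', 'b\tx\tk', 'a', 'a\tz']

print(undecorate(sorted(decorated)))
# ['a', '\tz', 'b', '\tx', '\t\tk', '\ty']
```

Because every decorated line starts with the full path of its ancestors, a plain sort keeps each subtree together.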
answered Jul 4 at 11:40 by iruvar; edited Jul 4 at 15:36
    sed '/^ /{H;$!d};x;1d;s/\n/\x7/g' | sort | tr \\a \\n

The /continuation/{H;$!d};x;1d (or /firstline/! etc.) is a slurp: it falls through only when it has a complete line gaggle in the buffer.

If you might get a single-line gaggle at the end, add ${p;x;/\n/d} to do the double-pump needed for that.
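The "slurp" can be pictured as follows: each top-level line and its indented continuation lines are collected into one record, the records are sorted as units, and then they are split back into lines. Here is a minimal Python sketch of that idea, with a hypothetical function name. Like the sed pipeline, it only reorders whole top-level groups; the lines inside each group keep their original order.

```python
def sort_top_level_groups(lines):
    """Group each top-level line with its indented lines, then sort the groups."""
    groups = []
    for line in lines:
        if line.startswith(" ") and groups:
            groups[-1].append(line)      # continuation line: same gaggle
        else:
            groups.append([line])        # top-level line: start a new gaggle
    # Join with BEL, as the sed command does, so each group compares as one line.
    groups.sort(key=lambda g: "\a".join(g))
    return [line for g in groups for line in g]


sample = ["third", "    orange", "    apple", "first", "    kiwi", "second"]
for line in sort_top_level_groups(sample):
    print(line)
# first
#     kiwi
# second
# third
#     orange
#     apple
```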
answered Jul 4 at 19:40 by jthill