Shell script sort


I'm trying to sort a small file in which some entries contain two words, but I want those two words to sort as one entry.

For example, consider this small list:

 peter barker painter
 carl baker cook
 joshua carpenter

These are all names and occupations. Now say I want to use sort to sort these entries.

The problem is that sort uses whitespace to separate fields, so if I run sort -k 1n I'll sort by first name.

But I want to sort by full name and then have the option to sort by occupation as well. As you can see, some entries don't have a full name: joshua only has his first name and his occupation. So for him I want to sort only by first name, but for the others by full name.

Can this be achieved?

asked Nov 23 at 15:42 by mrmagin

edited Nov 23 at 15:58 by Kusalananda

linux text-processing sort


1 Answer

accepted

Assuming that it is only ever going to be the surname that is missing (and not the first name) and that the words in the file do not include spaces (which would make it extremely difficult), first get the data into tab-delimited format with the missing surnames replaced by empty fields:

$ awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file
peter	barker	painter
carl	baker	cook
joshua		carpenter

The awk script detects lines that contain two or three fields. Lines that already have three fields are simply rewritten as three tab-delimited fields. For lines that originally contained only two fields, the second field is moved into the third field and the second field is left empty.
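The same awk program can be laid out one rule per line, which makes the two cases easier to see. This is only a sketch: the function name to_tsv is made up here for illustration, and it reads the data on standard input rather than from a file.

```shell
#!/bin/sh
# to_tsv is a hypothetical helper name, not part of the answer itself.
to_tsv() {
    awk -v OFS='\t' '
        NF == 3 { $1 = $1 }            # assigning a field makes awk rebuild the line with OFS
        NF == 2 { $3 = $2; $2 = "" }   # occupation moves to field 3, surname becomes empty
        { print }                      # print every line, modified or not
    '
}

# Sample data from the question, fed on standard input.
printf 'peter barker painter\ncarl baker cook\njoshua carpenter\n' | to_tsv
```

Lines with any other number of fields pass through unchanged, so this still relies on the assumption that each entry has two or three words.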

Then sort the data with tabs as delimiters:

$ awk -v OFS='\t' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t $'\t' -k1,2 -k3
carl	baker	cook
joshua		carpenter
peter	barker	painter

The sorting done here is on the full name (fields one and two) and then by occupation. It is assumed that you are using a shell like bash that understands $'\t' as a tab character.
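If the shell at hand does not understand $'\t', one possible workaround is to store a tab in a variable with printf and pass that to sort. A minimal sketch, where the variable names tab, data and by_name are chosen only for this example and the data is already in the tab-delimited form produced by the awk step:

```shell
#!/bin/sh
# Portable way to obtain a literal tab character in a POSIX shell.
tab=$(printf '\t')

# Sample data from the question, already tab-delimited (empty surname for joshua).
data=$(printf 'peter\tbarker\tpainter\ncarl\tbaker\tcook\njoshua\t\tcarpenter')

# Sort by full name (fields 1-2), then by occupation (field 3 onwards).
by_name=$(printf '%s\n' "$data" | sort -t "$tab" -k1,2 -k3)

printf '%s\n' "$by_name"
```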

Instead of tabs, you may use any other character that does not interfere with the data (here :):

$ awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t ':' -k1,2 -k3
carl:baker:cook
joshua::carpenter
peter:barker:painter

Then replace the chosen delimiter character by passing the result through tr (here replacing with tabs, because it looks nice):

$ awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <file | sort -t ':' -k1,2 -k3 | tr ':' '\t'
carl	baker	cook
joshua		carpenter
peter	barker	painter
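Since the question also asks for the option to sort by occupation, the pipeline above could be wrapped in a small function that switches the sort keys. This is a sketch under assumptions: the function name sort_people and the mode names "name" and "job" are invented for this example, and ':' is assumed not to occur in the data.

```shell
#!/bin/sh
# sort_people is a hypothetical wrapper; "name" and "job" are made-up mode names.
sort_people() {
    mode=$1 file=$2
    case $mode in
        name) set -- -k1,2 -k3 ;;      # full name first, then occupation
        job)  set -- -k3,3 -k1,2 ;;    # occupation first, then full name
        *)    echo "usage: sort_people name|job file" >&2; return 1 ;;
    esac
    # Normalise to three ':'-delimited fields, sort, then show with tabs.
    awk -v OFS=':' 'NF == 3 { $1 = $1 } NF == 2 { $3 = $2; $2 = "" } { print }' <"$file" |
        sort -t ':' "$@" |
        tr ':' '\t'
}
```

Calling sort_people job file would then list the entries by occupation, while sort_people name file gives the same ordering as the pipeline above.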

answered Nov 23 at 15:50 by Kusalananda

edited Nov 23 at 15:57
