How to print an nth column in a file using awk?

I have n files (call them input1, input2, and so on) with similar data and I wish to make a new file (call it out) that contains the 2nd column of these files. If I use

awk '{print $2}' input{1..n} >> out

then I get a single column with all the entries from the 2nd column of the input files. What can I do to have different columns for different files, as in $1 in out = $2 of input1, $2 in out = $2 of input2, $3 in out = $2 of input3,....., $n in out = $2 of inputn?

edited Feb 19 at 14:02

asked Feb 19 at 13:30

Hitanshu Sachania

425

Are all files of the same length, i.e. do they all always have the same number of rows, and is this number known in advance?

– Kusalananda
Feb 19 at 14:23

@Kusalananda yes they're all of the same length and the #rows and #columns are known.

– Hitanshu Sachania
Feb 19 at 14:26

text-processing awk


4 Answers

You could do the whole thing in a BEGIN statement using getline

awk '
  BEGIN {
    while (1) {
      line = sep = ""
      for (i = 1; i < ARGC; i++) {
        if ((getline < ARGV[i]) <= 0) exit
        line = line sep $2
        sep = OFS
      }
      print line
    }
  }' input{1..n} > out
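To try the getline loop on something concrete, here is a minimal sketch. The file names in1, in2 and in3 and their contents are made up for illustration; the awk program itself is the same loop as in this answer, with one comment added.

```shell
# Hypothetical sample data: three files with two rows each.
tmp=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmp/in1"
printf 'c 3\nd 4\n' > "$tmp/in2"
printf 'e 5\nf 6\n' > "$tmp/in3"

# Read one record from each file in turn and join the second fields with OFS.
awk '
  BEGIN {
    while (1) {
      line = sep = ""
      for (i = 1; i < ARGC; i++) {
        # getline returns 0 at end of file and a negative value on error
        if ((getline < ARGV[i]) <= 0) exit
        line = line sep $2
        sep = OFS
      }
      print line
    }
  }' "$tmp/in1" "$tmp/in2" "$tmp/in3" > "$tmp/out"

cat "$tmp/out"
```

With this sample data, out should contain the two lines "1 3 5" and "2 4 6", one column per input file.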

edited Feb 19 at 14:17

answered Feb 19 at 13:39

Stéphane Chazelas

310k57584945

Is there any way we can use ORS? I was trying that but it doesn't work with multiple files.

– Prvt_Yadv
Feb 19 at 14:04

@PRY, there was a missing loop. It should be fixed now.

– Stéphane Chazelas
Feb 19 at 14:04

Sorry, I was asking in general manner, not related to your answer.

– Prvt_Yadv
Feb 19 at 14:05

@HitanshuSachania, getline retrieves one record from the given input file and returns 0 upon end-of-file or a negative number upon error, in which case we exit. So we stop as soon as we've reached the end of any of the input files, and the number of records in the output file is that of the input file with the fewest records.

– Stéphane Chazelas
Feb 19 at 14:51

@HitanshuSachania, no, it builds one line of output (in line) at a time, as opposed to storing all the lines of output in an array and printing them at the end. We could skip storing the line in line and print the fields as they come, but that would cause a problem for the last record if not all input files have the same number of records.

– Stéphane Chazelas
Feb 19 at 15:48


You could construct a paste command to put all the second columns together:

cmd="paste"
for x in input{1..n}; do
    # \$2 is escaped so the shell does not expand it inside the double quotes
    cmd="$cmd <(awk '{print \$2}' $x)"
done
echo "$cmd"
eval "$cmd"
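If building a command string for eval feels fragile, one variation is to write each second column to its own temporary file and hand those files to paste. This is a sketch of that alternative, not what the answer itself does; the file names in1, in2, in3, the .col2 suffix and the sample contents are all made up for illustration. It avoids process substitution, so it should also run in a plain POSIX sh.

```shell
# Hypothetical sample data: three files with two rows each.
tmp=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmp/in1"
printf 'c 3\nd 4\n' > "$tmp/in2"
printf 'e 5\nf 6\n' > "$tmp/in3"

# Write each file's second column to its own temporary file.
for f in "$tmp/in1" "$tmp/in2" "$tmp/in3"; do
    awk '{ print $2 }' "$f" > "$f.col2"
done

# paste joins the files side by side, separated by tabs by default.
paste "$tmp/in1.col2" "$tmp/in2.col2" "$tmp/in3.col2" > "$tmp/out"

cat "$tmp/out"
```

The trade-off is extra temporary files on disk in exchange for not re-parsing file names through eval.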

answered Feb 19 at 14:05

NickD

1,7181314


Using this post as reference:

awk '{ a[FNR] = a[FNR] " " $2 } END { for (i = 1; i <= FNR; i++) print a[i] }' input{1..n}

The array a holds one output line per row number, built up from the different files.

FNR is the number of records read so far from the current input file; it restarts at the beginning of each file.

END { for (i = 1; i <= FNR; i++) print a[i] }

prints the contents of array a once all input files have been read.
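A runnable sketch of the array approach follows. It makes two small changes that are not in the answer: it only inserts the separator when a row already has content, which avoids a leading space, and it tracks the largest FNR seen in a variable called max (a name chosen here) instead of relying on the FNR of the last file. The file names and contents are made up for illustration.

```shell
# Hypothetical sample data: three files with two rows each.
tmp=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmp/in1"
printf 'c 3\nd 4\n' > "$tmp/in2"
printf 'e 5\nf 6\n' > "$tmp/in3"

awk '
  {
    # Append this file'"'"'s second field to the output row numbered FNR.
    if (FNR in a) a[FNR] = a[FNR] OFS $2
    else          a[FNR] = $2
  }
  FNR > max { max = FNR }          # remember the longest file seen so far
  END { for (i = 1; i <= max; i++) print a[i] }
' "$tmp/in1" "$tmp/in2" "$tmp/in3" > "$tmp/out"

cat "$tmp/out"
```

As with the original one-liner, the whole output is held in memory until the END block runs.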

edited Feb 19 at 14:22

answered Feb 19 at 14:04

Emilio Galarraga

55439

What is the purpose of the double quotes before $2 and how does the array a store just the 2nd column?

– Hitanshu Sachania
Feb 19 at 14:14

Note that it stores the whole output in memory before starting to print it.

– Stéphane Chazelas
Feb 19 at 14:16

I changed the start of the loop from 0 to 1.

– Emilio Galarraga
Feb 19 at 14:23

The double quotes before $2 define the delimiter between columns in the output file, and the script appends the value of $2 to the a[FNR] element of the array.

– Emilio Galarraga
Feb 19 at 14:27

Use a[FNR]=!a[FNR]?$2:a[FNR]" "$2 to eliminate the white space in front of each line...

– RoVo
Feb 19 at 14:35


I would use the pr tool, which is designed to columnize data:

awk '{print $2}' input{1..n} | pr -t --columns=n > out

This assumes each file has the same number of lines.
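Here is a small sketch of the pr approach on made-up data (the file names in1, in2, in3 and their contents are hypothetical). It uses -3, which as far as I know is the POSIX spelling of --columns=3. Since pr paginates its input, this sketch assumes files short enough to fit on a single page; for longer files the page length (-l) is worth checking against your pr implementation.

```shell
# Hypothetical sample data: three files with two rows each.
tmp=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmp/in1"
printf 'c 3\nd 4\n' > "$tmp/in2"
printf 'e 5\nf 6\n' > "$tmp/in3"

# Six second-column values go in; pr fills three columns top to bottom,
# so each input file ends up in its own column.
awk '{ print $2 }' "$tmp/in1" "$tmp/in2" "$tmp/in3" | pr -t -3 > "$tmp/out"

cat "$tmp/out"
```

The columns are padded to a fixed width by pr, so downstream tools that expect a single-space or tab delimiter may need the whitespace normalized first.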

answered Feb 19 at 14:48

glenn jackman

52.3k572113

The best answer for this scenario.

– Rakesh Sharma
Feb 20 at 1:00
