How to print an nth column in a file using awk?
I have n files (call them input1, input2, and so on) with similar data, and I wish to make a new file (call it out) that contains the 2nd column of each of these files. If I use
    awk '{print $2}' input{1..n} >> out
then I get a single column with all the entries from the 2nd column of the input files stacked one after another. What can I do to get a separate column for each file, as in $1 in out = $2 of input1, $2 in out = $2 of input2, $3 in out = $2 of input3, ..., $n in out = $2 of inputn?
text-processing awk
Are all files of the same length, i.e. do they all always have the same number of rows, and is this number known in advance?
– Kusalananda
Feb 19 at 14:23
@Kusalananda yes they're all of the same length and the #rows and #columns are known.
– Hitanshu Sachania
Feb 19 at 14:26
asked Feb 19 at 13:30, edited Feb 19 at 14:02, by Hitanshu Sachania
4 Answers
You could do the whole thing in a BEGIN statement using getline:
    awk '
      BEGIN {
        while (1) {
          line = sep = ""
          for (i = 1; i < ARGC; i++) {
            if ((getline < ARGV[i]) <= 0) exit
            line = line sep $2
            sep = OFS
          }
          print line
        }
      }' input{1..n} > out
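To see what the loop does on concrete data, here is a small sketch run against three made-up files. The scratch directory, the file contents and the `result` variable are assumptions for illustration only; the third file is deliberately one row short to show where the loop stops.

```shell
#!/bin/sh
# Hypothetical sample data: three small two-column files in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 a\n2 b\n3 c\n' > "$tmpdir/input1"
printf '1 d\n2 e\n3 f\n' > "$tmpdir/input2"
printf '1 g\n2 h\n'      > "$tmpdir/input3"   # deliberately one row short

# Read one record from each file in turn and join the second fields with OFS.
result=$(awk '
  BEGIN {
    while (1) {
      line = sep = ""
      for (i = 1; i < ARGC; i++) {
        # getline returns 1 on success, 0 at end-of-file, -1 on error.
        if ((getline < ARGV[i]) <= 0) exit
        line = line sep $2
        sep = OFS
      }
      print line
    }
  }' "$tmpdir/input1" "$tmpdir/input2" "$tmpdir/input3")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With this data the script prints two rows, `a d g` and `b e h`. The third row is never printed because input3 runs out first and the `exit` fires before `print line` is reached.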
Is there any way we can use ORS? I was trying that but it doesn't work with multiple files.
– Prvt_Yadv
Feb 19 at 14:04
@PRY, there was a missing loop. It should be fixed now.
– Stéphane Chazelas
Feb 19 at 14:04
Sorry, I was asking in a general manner, not related to your answer.
– Prvt_Yadv
Feb 19 at 14:05
@HitanshuSachania, getline retrieves one record from the given input file and returns 0 upon end-of-file or a negative number upon error, in which case we exit. So we stop as soon as we've reached the end of any of the input files, and the number of records in the output file is that of the input file with the fewest records.
– Stéphane Chazelas
Feb 19 at 14:51
@HitanshuSachania, no, it builds one line of output (in line) at a time, as opposed to storing all the lines of output in an array and printing them at the end. We could skip storing the line in line and print the fields as they come, but that would cause a problem for the last record if not all input files have the same number of records.
– Stéphane Chazelas
Feb 19 at 15:48
You could construct a paste command to put all the second columns together:
    cmd="paste"
    for x in input{1..n}; do
        cmd="$cmd <(awk '{print \$2}' $x)"
    done
    echo "$cmd"
    eval "$cmd"
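If you would rather avoid eval and process substitution, the same idea can be sketched with temporary files. This is a swapped-in variant, not the command built above: each second column is written to its own file and paste joins those files. The scratch directory, the sample data, the `.col2` suffix and the `result` variable are all made up for the example.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 a\n2 b\n' > "$tmpdir/input1"
printf '1 c\n2 d\n' > "$tmpdir/input2"
printf '1 e\n2 f\n' > "$tmpdir/input3"

# Extract column 2 of every input into its own temporary file.
# The glob is expanded once, before the loop starts, so the new
# .col2 files are not picked up as inputs.
for f in "$tmpdir"/input*; do
  awk '{ print $2 }' "$f" > "$f.col2"
done

# Let paste glue the temporary files together side by side
# (tab-separated, which is the paste default).
result=$(paste "$tmpdir"/input*.col2)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Note that a glob sorts names as text, so with ten or more files input10 would come before input2; in that case list the files explicitly in the order you want.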
Using this post as reference:
    awk '{a[FNR] = a[FNR]" " $2} END {for(i=1;i<=FNR;i++) print a[i]}' input{1..n}
a: an array that holds each output line, built up from the different files.
FNR: the number of records read so far in the current input file; it is reset at the beginning of each file.
END {for(i=1;i<=FNR;i++) print a[i]}: prints the contents of array a once all input has been read.
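As a worked example, here is a sketch of the array approach on two made-up files. It differs from the one-liner in two labeled ways: the separator is added only when a[FNR] already exists, so no line starts with a space, and a hypothetical `max` variable tracks the longest file instead of relying on the last file's FNR. The scratch directory, the sample data and `result` are assumptions too.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 a\n2 b\n' > "$tmpdir/input1"
printf '1 c\n2 d\n' > "$tmpdir/input2"

result=$(awk '
  # FNR restarts at 1 for each file, so a[FNR] collects row FNR of every file.
  {
    if (FNR in a) a[FNR] = a[FNR] OFS $2   # later files: append with a separator
    else          a[FNR] = $2              # first file: start the row
  }
  # Remember the longest file so no row is dropped if the lengths differ.
  FNR > max { max = FNR }
  END { for (i = 1; i <= max; i++) print a[i] }
' "$tmpdir/input1" "$tmpdir/input2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

This prints `a c` and `b d`. The whole output is held in memory until END, so for very large inputs a streaming approach may be preferable.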
What is the purpose of the double quotes before $2, and how does the array a store just the 2nd column?
– Hitanshu Sachania
Feb 19 at 14:14
Note that it stores the whole output in memory before starting to print it.
– Stéphane Chazelas
Feb 19 at 14:16
I changed the start of the loop from 0 to 1.
– Emilio Galarraga
Feb 19 at 14:23
The double quotes before $2 define the delimiter between columns in the output file, and the script appends the value of $2 to the a[FNR] element of the array.
– Emilio Galarraga
Feb 19 at 14:27
Use a[FNR]=!a[FNR]?$2:a[FNR]" "$2 to eliminate the white space in front of each line...
– RoVo
Feb 19 at 14:35
I would use the pr tool, which is designed to columnize data:
    awk '{print $2}' input{1..n} | pr -t --columns=n > out
This assumes each file has the same number of lines.
The best answer for this scenario.
– Rakesh Sharma
Feb 20 at 1:00
getline answer: answered Feb 19 at 13:39, edited Feb 19 at 14:17, by Stéphane Chazelas
paste answer: answered Feb 19 at 14:05, by NickD
array answer: answered Feb 19 at 14:04, edited Feb 19 at 14:22, by Emilio Galarraga
pr answer: answered Feb 19 at 14:48, by glenn jackman