Find and print a particular line from a list of filenames and line numbers
I have a file, input.txt, that contains many filenames in the format FILENAME_DATE_LINENUMBER.
The filename itself contains precisely 5 underscores.
FILE_NAME_1.DAT_20180123_4
FILE_NAME_2.DAT_20180123_5
FILE_NAME_3.DAT_20180123_6
FILE_NAME_4.DAT_20180123_7
All of the files are in subdirectories of the directory that contains input.txt. I want to parse input.txt, iterate through each filename, and print the FILENAME and the line at the specified line number (read from the file FILENAME) to output.txt.
I am a beginner at shell scripting. I understand that sed or awk will be used, and that a command like the one below can do the job:
awk 'FNR==LINENUMBER {print FILENAME, $0}' *.txt >output.txt
But how can I iterate through input.txt, find each FILENAME, and write the line at LINENUMBER from that file to output.txt?
The FILENAME specified in input.txt can be in any one of the subdirectories of the directory where input.txt is located. Only one file with a given FILENAME exists, and it is in a subdirectory exactly one level below the location of input.txt.
DIR
├── input.txt
├── DIR1
│   └── FILE_NAME_1.DAT
├── DIR2
│   └── FILE_NAME_2.DAT
└── DIR3
    └── FILE_NAME_3.DAT
In output.txt it should be printed as:
FILENAME
LINE( Extracted from FILENAME present in input.txt )
shell-script text-processing awk
edited Jan 23 at 19:44
asked Jan 23 at 17:36 by shubham deodia
With such questions you should always give example output. You want FILE1.DAT_20180123_4 transformed into FILE1.DAT 4?
– Hauke Laging, Jan 23 at 17:43
@shubhamdeodia, your output should be FILENAME LINE(under number) - update your question
– RomanPerekhrest, Jan 23 at 18:13
@RomanPerekhrest, thank you. I have updated.
– shubham deodia, Jan 23 at 18:19
@shubhamdeodia, ok, can we assume that all files FILE*.DAT... are in the same directory as input.txt?
– RomanPerekhrest, Jan 23 at 18:35
@RomanPerekhrest, no, they are in the subdirectories (one level down) from where input.txt is, and we don't even know the names of the subdirectories.
– shubham deodia, Jan 23 at 18:40
4 Answers
#!/bin/bash
do_one() {
  # two args: $1=filename_no_dir $2=line_number
  # Find the single filename (stored as the glob pattern */NAME)
  eval file=*"/$1"
  echo "$1"
  # $. == line number
  # $file is left unquoted on purpose so that the stored glob expands here
  perl -ne 'chomp; $.=='"$2"' and print "LINE($_)\n"' $file
}
export -f do_one
# Generate some test data
parallel 'mkdir DIR{}; seq 100 110 >DIR{}/FILE_NAME_{}.DAT' ::: {1..4}
# Test input.txt
cat <<EOF |
FILE_NAME_1.DAT_20180123_4
FILE_NAME_2.DAT_20180123_5
FILE_NAME_3.DAT_20180123_6
FILE_NAME_4.DAT_20180123_7
EOF
  # Remove _YYYYMMDD.* to get filename, and .*_ to get line number
  parallel do_one '{= s/_201\d\d\d\d\d.*// =}' '{= s/.*_// =}'
Output:
FILE_NAME_1.DAT
LINE(103)
FILE_NAME_2.DAT
LINE(104)
FILE_NAME_3.DAT
LINE(105)
FILE_NAME_4.DAT
LINE(106)
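As an illustration only (not part of the answer above), the "find the single filename" step could also be done without eval by expanding the glob into an array and checking that exactly one file matched. The helper names find_one and print_line are hypothetical, and the sketch assumes bash and that it is run from the directory holding input.txt:

```shell
#!/bin/bash
# Hypothetical helper: resolve NAME to exactly one file located in an
# immediate subdirectory of the current directory. The body runs in a
# subshell so that nullglob does not leak into the caller.
find_one() (
  shopt -s nullglob              # an unmatched glob expands to nothing
  matches=( */"$1" )
  if [ "${#matches[@]}" -ne 1 ]; then
    echo "expected exactly one match for $1, found ${#matches[@]}" >&2
    exit 1
  fi
  printf '%s\n' "${matches[0]}"
)

# Hypothetical helper: print line N of FILE in the LINE(...) form shown above.
print_line() {
  awk -v n="$2" 'FNR == n { print "LINE(" $0 ")"; exit }' "$1"
}
```

A call such as file=$(find_one FILE_NAME_1.DAT) && print_line "$file" 4 would then report a missing or ambiguous filename instead of silently passing an unexpanded pattern to the next command.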
answered Jan 31 at 15:55 by Ole Tange (accepted answer)
:> awk -F_ '{ print $1; print $3; }' inputfile
FILE1.DAT
4
FILE2.DAT
5
FILE3.DAT
6
FILE4.DAT
7
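The command above takes fields 1 and 3, which fits names without underscores such as FILE1.DAT. A possible variation, sketched here under the assumption that the date and line number are always the last two underscore-separated fields, counts from the end of the line instead. The function name split_entries is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: read NAME_DATE_LINENUMBER entries on standard input
# and print NAME and LINENUMBER on separate lines. Fields are counted from
# the end, so underscores inside NAME are preserved.
split_entries() {
  awk -F_ 'NF >= 3 {
    linenum = $NF                      # last field: the line number
    name = $1
    for (i = 2; i <= NF - 2; i++)      # rejoin everything before the date
      name = name "_" $i
    print name
    print linenum
  }'
}
```

Run as split_entries < input.txt; entries with fewer than three fields are skipped rather than guessed at.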
answered Jan 23 at 17:51 by Hauke Laging
What if FILE2.DAT is in a nested directory below where inputfile.txt is? I mean we don't know the relative path to FILE2.DAT from inputfile, and we don't even have the exact name of the subdirectory.
– shubham deodia, Jan 23 at 17:57
@shubhamdeodia Are you complaining to me about your input data...?
– Hauke Laging, Jan 23 at 18:02
I am sorry, my question was a little confusing; I have corrected it. I want to find the FILENAME in input.txt, then find the line number in FILENAME and print it to output.txt.
– shubham deodia, Jan 23 at 18:08
If I'm understanding you correctly,
while IFS=_ read -r filename unuseddate linenum
do
  printf "%s\n" "$filename"
  sed -n "${linenum}{p;q}" */"$filename"
done < input.txt > output.txt
This reads one line at a time from input.txt, splitting the line into 3 parts at the underscores. It prints the filename, then runs a sed command that prints nothing by default (-n) and, on the specified line number, prints the line and quits that invocation of sed. The file should be in one of the immediate subdirectories of the current directory.
All of the output is then redirected to output.txt.
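To make the role of the braces in that sed expression concrete, here is a small sketch with two hypothetical functions. It assumes a sed that accepts several commands on one line separated by semicolons, which GNU and BSD sed both do:

```shell
#!/bin/bash
# Hypothetical helper: print only line N of standard input, then stop.
# The braces group p (print) and q (quit) under the single address N.
nth_line() {
  case $1 in
    ''|*[!0-9]*|0) echo "nth_line: N must be a positive integer" >&2; return 2 ;;
  esac
  sed -n "${1}{p;q;}"
}

# Ungrouped variant, for comparison only: here q has no address, so sed
# quits after the first input line and prints nothing whenever N is above 1.
nth_line_ungrouped() {
  sed -n "${1}p;q"
}
```

Feeding the numbers 100 to 110 into nth_line 4 prints 103, while the ungrouped variant prints nothing for the same input.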
edited Jan 23 at 18:52
answered Jan 23 at 17:54 by Jeff Schaller
Hi, is there a way we can tell the script to use the last occurrence of the underscore for the line number instead of the 3rd occurrence? The filename may contain more than 3 underscores.
– shubham deodia, Jan 23 at 19:09
Not this script, because I based it on your sample input. You should really post representative data in your question in order to get usable answers.
– Jeff Schaller, Jan 23 at 19:15
Yes, you are right, this script works perfectly with 3 underscores. Thank you.
– shubham deodia, Jan 23 at 19:18
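For readers with the same "last underscore" requirement, one possible approach is bash parameter expansion, which can strip text from the end of a string. The following is a sketch rather than a tested drop-in replacement; extract_lines is a hypothetical name, and it assumes every entry ends in _DATE_LINENUMBER with a numeric LINENUMBER:

```shell
#!/bin/bash
# Hypothetical helper: for each NAME_DATE_LINENUMBER entry in the list file
# given as $1, print NAME and then the requested line of the matching file
# found one directory level below the current directory.
extract_lines() {
  local entry linenum rest filename file
  while IFS= read -r entry; do
    [ -n "$entry" ] || continue        # skip blank lines
    linenum=${entry##*_}               # text after the last underscore
    rest=${entry%_*}                   # drop the trailing _LINENUMBER
    filename=${rest%_*}                # drop the trailing _DATE
    printf '%s\n' "$filename"
    for file in */"$filename"; do
      [ -f "$file" ] || continue       # the glob matched nothing
      sed -n "${linenum}{p;q;}" "$file"
    done
  done < "$1"
}
```

It would be called as extract_lines input.txt > output.txt from the directory that holds input.txt. Because the split works from the right, the number of underscores inside the filename does not matter.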
Complex solution with GNU parallel + find + awk:
Let's say each input file has content similar to the following:
cat DIR1/FILE1.DAT_20180123_4
FILE1 a
FILE1 b
FILE1 c
FILE1 d
FILE1 e
FILE1 f
FILE1 g
So, by the above scheme, the 2nd line in file FILE2.DAT_20180123_5 would be FILE2 b, and the 7th line in file FILE4.DAT_20180123_7 would be FILE4 g.
Assume the input.txt file is the same as in the question.
The job:
find . -type f -regextype posix-egrep -regex ".*/($(paste -s -d'|' input.txt))" \
| parallel -j0 "awk -v n={=s/.*_//=} -v fn={/} 'NR==n{ print fn,\$0; exit }' {}" > output.txt
The final output.txt contents:
$ cat output.txt
FILE4.DAT_20180123_7 FILE4 g
FILE3.DAT_20180123_6 FILE3 f
FILE1.DAT_20180123_4 FILE1 d
FILE2.DAT_20180123_5 FILE2 e
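Where GNU parallel is not available, a similar result can be sketched with a plain loop around find. This is an illustration under different assumptions from the answer above: the files on disk are named as in the question (FILE_NAME_1.DAT, without the date and line-number suffix), and find supports -mindepth and -maxdepth (GNU and BSD find do; they are not in POSIX). The function name lookup_lines is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print "NAME LINE" for each NAME_DATE_LINENUMBER entry
# in the list file $1, searching exactly one directory level below ROOT
# ($2, defaulting to the current directory).
lookup_lines() {
  local list=$1 root=${2:-.} entry n name
  while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    n=${entry##*_}                     # trailing line number
    name=${entry%_*_*}                 # strip the trailing _DATE_LINENUMBER
    find "$root" -mindepth 2 -maxdepth 2 -type f -name "$name" \
      -exec awk -v n="$n" -v fn="$name" 'FNR == n { print fn, $0; exit }' {} \;
  done < "$list"
}
```

Unlike the parallel version, this processes the entries one at a time, so the output follows the order of input.txt.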
answered Jan 23 at 19:32 by RomanPerekhrest
Hi, but the filename itself has some underscores, to be specific 5 underscores.
– shubham deodia, Jan 23 at 19:35
@shubhamdeodia, elaborate your file name convention: should all the files end with .DAT only?
– RomanPerekhrest, Jan 23 at 19:40
Yes, they all end with .DAT, but I can't install GNU parallel in my Linux distro. I don't have the privileges.
– shubham deodia, Jan 23 at 19:41
@shubhamdeodia, the main problem is in your input.txt - it should have included full filenames, not basenames. So reconsider your input.txt scheme and your life will be easier.
– RomanPerekhrest, Jan 23 at 19:49
@shubhamdeodia Are the reasons covered on oletange.blogspot.dk/2013/04/why-not-install-gnu-parallel.html? If not, please elaborate.
– Ole Tange, Jan 31 at 15:39