Check target file for duplicate entry before copying data from source file [duplicate]

This question already has an answer here:

Is there a tool to get the lines in one file that are not in another?

6 answers

I am trying to copy lines from source.txt to target.txt. I would like this bash script to check target.txt for a duplicate of each line before copying it.

source.txt contains:
a$$a$$a
b**b**
c%%cc%%
d##d##d##
e^^e^^e^^

target.txt contains:

a$$a$$a
ee$$ee$$
ff__ff__
gg@@gg@@
zzxxzzxx
bb..bb..bb
e^^e^^e^^
hh;;hh;;hh

In this scenario, I expect that only 3 entries will be copied to target.txt, which are:

b**b**
c%%cc%%
d##d##d##

My test code is:

#!/bin/bash
echo "started"
programpath=/home/mysite/www/copyfiles

var str input ; cat "$programpath/source.txt" > $input 
var str target ; cat "$programpath/target.txt" > $target 

cat $input >> $target

uniq -u "$target"

echo "finished"
 exit 1
fi

edited Oct 15 '17 at 5:41 by αғsнιη

asked Oct 15 '17 at 3:59 by danone

marked as duplicate by don_crissti, Rui F Ribeiro, Jeff Schaller, Timothy Martin, G-Man Jan 18 at 6:20

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

1 Answer

accepted

Why use bash? The grep command can do the job cleanly.

grep -Fxvf target.txt source.txt #>> target.txt

This prints the lines that exist only in source.txt. To append them to target.txt, remove the # in front of >> target.txt.

You may also need to de-duplicate source.txt first, so that entries repeated within source.txt are not appended more than once (the awk command further down handles this as well). Note that sort -u also reorders the lines.
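As a quick check, here is a minimal sketch that runs the command against the sample data from the question. The scratch directory and the variable name new_lines are assumptions made for illustration; they are not part of the original script.

```shell
#!/bin/bash
# Work in a throwaway directory (an assumption for this sketch)
# so that no real files are modified.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question. The quoted 'EOF' stops the
# shell from expanding characters such as $$.
cat > source.txt <<'EOF'
a$$a$$a
b**b**
c%%cc%%
d##d##d##
e^^e^^e^^
EOF

cat > target.txt <<'EOF'
a$$a$$a
ee$$ee$$
ff__ff__
gg@@gg@@
zzxxzzxx
bb..bb..bb
e^^e^^e^^
hh;;hh;;hh
EOF

# Collect the lines of source.txt that have no exact match in target.txt.
new_lines=$(grep -Fxvf target.txt source.txt)

# Append only when there is something to add, so that an empty
# result does not write a blank line to target.txt.
if [ -n "$new_lines" ]; then
    printf '%s\n' "$new_lines" >> target.txt
fi

printf '%s\n' "$new_lines"
```

With this data the three lines b**b**, c%%cc%% and d##d##d## are printed and appended, which matches what the question expects. Capturing the result in a variable first is only one possible design; it keeps the read of target.txt and the write to it clearly separated.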

grep -Fxvf target.txt <(sort -u source.txt) #>> target.txt

The -F option tells grep to treat each pattern as a fixed string instead of a regex.

The -x option tells grep that a pattern must match the whole line.

The -v option inverts the match; without it, grep would output the lines that exist in both files.

The -f option tells grep to read the patterns from a file, which is target.txt here.
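To see why -F and -x matter, here is a small sketch with made-up data (the file contents and variable names are assumptions for illustration, not taken from the question). The target holds the single line a.c, and the source holds three lines that differ only in how they relate to it.

```shell
#!/bin/bash
# Hypothetical test data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a.c' > "$workdir/target.txt"
printf '%s\n' 'abc' 'a.c' 'xa.cx' > "$workdir/source.txt"

# -F and -x together: only the exact line "a.c" counts as already present.
exact=$(grep -Fxvf "$workdir/target.txt" "$workdir/source.txt")

# Without -F, "a.c" is a regex whose "." matches any character,
# so "abc" is wrongly treated as a duplicate as well.
no_fixed=$(grep -xvf "$workdir/target.txt" "$workdir/source.txt")

# Without -x, a substring match is enough,
# so "xa.cx" is wrongly treated as a duplicate as well.
no_whole=$(grep -Fvf "$workdir/target.txt" "$workdir/source.txt")

printf 'with -F -x: %s\n' "$(echo $exact)"
printf 'without -F: %s\n' "$no_fixed"
printf 'without -x: %s\n' "$no_whole"
```

Only the first variant keeps both abc and xa.cx. This matters for the data in the question, whose lines contain characters such as $, *, ^ and . that have a special meaning in a regex.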

Or you could use awk instead.

awk 'NR==FNR{seen[$0]=1;next} !seen[$0]++' target.txt source.txt #>> target.txt

While awk reads the first file (NR==FNR), every line of target.txt is stored in the array called seen, keyed by the whole line (seen[$0]), and next skips to the next input line.

With !seen[$0]++ we print each line from source.txt that is not yet in the array. The ++ also records source.txt lines in the array, which prevents printing duplicate lines if they exist in source.txt.
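The following sketch shows the awk approach on a small made-up data set in which source.txt repeats a line (the file contents and the variable name new_lines are assumptions for illustration).

```shell
#!/bin/bash
# Hypothetical test data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'a' 'x' > "$workdir/target.txt"
printf '%s\n' 'b' 'a' 'b' 'c' > "$workdir/source.txt"

# First file (NR==FNR): remember every line of target.txt.
# Second file: print a line only the first time it is seen.
new_lines=$(awk 'NR==FNR { seen[$0] = 1; next } !seen[$0]++' \
    "$workdir/target.txt" "$workdir/source.txt")

printf '%s\n' "$new_lines"
```

Here a is skipped because it is already in target.txt, and the second b is skipped because the first b was recorded when it was printed, so only b and c remain. Unlike the sort -u variant, this keeps the original order of source.txt. One caveat of the NR==FNR idiom: if target.txt is empty, the condition is also true for every line of source.txt, so nothing is printed.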

edited Oct 19 '17 at 4:29

answered Oct 15 '17 at 4:11 by αғsнιη
