Is it a bug for join with -t\t?

I have a problem with the command join. "The default join field is the first, delimited by whitespace" (cited from join --help). However, my tab-delimited files have a field that contains whole sentences, so I want to join the two files using -t\t (I also tried -t "\t", which reported errors under Cygwin but not under CentOS). Unexpectedly, the command printed the fields on two consecutive lines. I have already processed both files with dos2unix and sort.
An example of the output follows. The 1st and 3rd lines are from file1, and the 2nd and 4th lines are from file2. The 1st and 2nd lines should appear on the same line. However, if -t\t is used, they appear on two consecutive lines (as below); without -t, they appear on the same line.
LM00089 0.6281 0 Q27888 L-lactate dehydrogenase
LM00089 gi|2497622|sp|Q27888|LDH_CAEEL 0.6281 0.422
LM00136 0.3219 0.376741 O62619 Pyruvate kinase
LM00136 gi|27923979|sp|O62619|KPYK_DROME 0.3219 0.111
I want to know whether this is a bug or whether I made a mistake.
shell quoting
If you are using bash, you may try join -t $'\t'
– enzotib
Sep 1 '12 at 15:52
It is not copy-paste safe, but you can type '<ctrl-v><tab>' to get a tab character on the bash command line.
– Cougar
Sep 1 '12 at 17:30
Thank you, enzotib and Cougar. Both solutions work well.
– Dejian
Sep 2 '12 at 4:55
edited Feb 18 '14 at 16:14
erch
asked Sep 1 '12 at 15:37
Dejian
1 Answer
-t\t passes t as the separator: an unquoted backslash always takes the next character literally (except when the next character is a newline). -t "\t" passes \t (a backslash followed by t) as the separator; different versions of join may behave differently when you pass multiple characters.
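As a quick check of the quoting rules above, the following sketch captures what the shell actually hands to a command for each form. The variable names are only illustrative, and the last line uses printf to produce a real tab for comparison; it should behave the same in any POSIX shell.

```shell
# Capture what each quoting form turns into after the shell has parsed it.
unquoted=$(printf '%s' \t)     # backslash is removed: the argument is the letter t
dquoted=$(printf '%s' "\t")    # backslash is kept: two characters, \ and t
tab=$(printf '\t')             # printf expands \t in its format string: one real tab

# Report the length of each argument; none of the first two is a tab.
printf 'unquoted: %s character(s)\n' "${#unquoted}"
printf 'double-quoted: %s character(s)\n' "${#dquoted}"
printf 'printf tab: %s character(s)\n' "${#tab}"
```

The first two lengths are 1 and 2, which is why join sees either the letter t or a two-character string instead of a tab.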
To pass a tab from bash, use -t $'\t'. The $'…' syntax mimics the feature of C and many other languages where \ followed by a letter designates a control character, and \ can also be followed by octal digits.
Another way is to put a literal tab in your script (between single or double quotes). This isn't very readable.
If you need portability to all POSIX shells such as dash, use
tab=$(printf '\t')
join -t "$tab" …
or directly join -t "$(printf '\t')" ….
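To see the portable form end to end, here is a minimal sketch that joins two small tab-delimited files. The file names and sample rows are hypothetical, loosely modelled on the output in the question, and the temporary directory comes from mktemp -d, which is widely available but not strictly POSIX.

```shell
# Hypothetical sample data: fields are separated by real tabs,
# and both files are already sorted on the first field.
tmpdir=$(mktemp -d)
printf 'LM00089\t0.6281\tL-lactate dehydrogenase\n' >  "$tmpdir/file1"
printf 'LM00136\t0.3219\tPyruvate kinase\n'         >> "$tmpdir/file1"
printf 'LM00089\t0.422\n'                           >  "$tmpdir/file2"
printf 'LM00136\t0.111\n'                           >> "$tmpdir/file2"

# Build a real tab once and pass it, quoted, as the field separator.
tab=$(printf '\t')
joined=$(join -t "$tab" "$tmpdir/file1" "$tmpdir/file2")

# Each key now appears on a single line, with the description field intact.
printf '%s\n' "$joined"

rm -r "$tmpdir"
```

Because the separator is a tab, the spaces inside "L-lactate dehydrogenase" and "Pyruvate kinase" no longer split those fields.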
The variable can be avoided with join -t "$(printf '\t')"
– agc
Nov 18 at 20:44
edited Nov 19 at 11:26
answered Sep 1 '12 at 23:31
Gilles