Is it a bug for join with -t\t?

I have a problem with the join command. "The default join field is the first, delimited by whitespace" (quoted from join --help). However, my tab-delimited files contain a field that holds whole sentences, so I want to join the two files using -t\t (I also tried -t "\t", which reported an error under Cygwin but not under CentOS). Unexpectedly, the command printed the matching records on two consecutive lines instead of one. I have already processed both files with dos2unix and sort.



An example of the output follows. The 1st and 3rd lines come from file1, and the 2nd and 4th lines come from file2. The 1st and 2nd lines should be joined into a single line. However, with -t\t they appear on two consecutive lines (as below); without -t, they appear on the same line.



LM00089 0.6281 0 Q27888 L-lactate dehydrogenase
LM00089 gi|2497622|sp|Q27888|LDH_CAEEL 0.6281 0.422
LM00136 0.3219 0.376741 O62619 Pyruvate kinase
LM00136 gi|27923979|sp|O62619|KPYK_DROME 0.3219 0.111


I want to know whether this is a bug or whether I made a mistake.










  • If you are using bash, you may try join -t $'\t'
    – enzotib
    Sep 1 '12 at 15:52

  • This is not copy-paste safe, but you can type '<ctrl-v><tab>' to get a tab character on the bash command line.
    – Cougar
    Sep 1 '12 at 17:30

  • Thank you, enzotib and Cougar. Both solutions work well.
    – Dejian
    Sep 2 '12 at 4:55














shell quoting






edited Feb 18 '14 at 16:14
erch

asked Sep 1 '12 at 15:37
Dejian







1 Answer
2 votes, accepted










-t\t passes t as the separator: an unquoted backslash always makes the shell take the next character literally (except when the next character is a newline). -t "\t" passes \t (a backslash followed by the letter t) as the separator; different versions of join may behave differently when you pass multiple characters.
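As a quick check of the quoting rules described above, the following sketch captures the argument exactly as a command would receive it. It uses printf with %s (which does not interpret backslash escapes in its arguments) instead of join; the variable names are only illustrative.

```sh
# Capture each argument exactly as the shell would hand it to a command.
unquoted=$(printf '%s' -t\t)   # unquoted backslash is removed: the argument is -tt
quoted=$(printf '%s' "\t")     # inside double quotes the backslash is kept before t

printf 'unquoted: [%s] (%d characters)\n' "$unquoted" "${#unquoted}"
printf 'quoted:   [%s] (%d characters)\n' "$quoted" "${#quoted}"
```

Neither form contains a tab character: the first is the option -t followed by the letter t, and the second is a two-character string.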



To pass a tab from bash, use -t $'\t'. The $'…' syntax mimics the feature of C and many other languages where \ followed by a letter designates a control character, and where \ can also be followed by octal digits.
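A small sketch of the $'…' syntax, assuming the shell is bash (plain POSIX shells such as dash may not support this quoting form). It compares the letter escape with the equivalent octal escape for the tab character; the variable names are illustrative.

```bash
# Requires bash: $'…' is not understood by every sh implementation.
tab=$'\t'        # backslash-letter escape for a tab
octal=$'\011'    # the same character written with octal digits

# Both forms should yield one identical character.
if [ "$tab" = "$octal" ]; then
    echo "same character, length ${#tab}"
fi
```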



Another way is to put a literal tab in your script (between single or double quotes). This isn't very readable.



If you need portability to all POSIX shells such as dash, use



tab=$(printf '\t')
join -t "$tab" …


or directly join -t "$(printf '\t')" ….
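To see the portable form at work, here is a minimal sketch on made-up data. The two files, their contents, and the temporary directory are assumptions for illustration, not the asker's real files, and mktemp -d is assumed to be available (it is common but not required by POSIX).

```sh
# Hypothetical sample data: two tab-separated files, already sorted on field 1.
dir=$(mktemp -d)
printf 'LM00089\tL-lactate dehydrogenase\nLM00136\tPyruvate kinase\n' > "$dir/file1"
printf 'LM00089\t0.422\nLM00136\t0.111\n' > "$dir/file2"

# Build a real tab character, then use it as the field separator.
tab=$(printf '\t')
joined=$(join -t "$tab" "$dir/file1" "$dir/file2")
printf '%s\n' "$joined"

rm -r "$dir"
```

With a real tab as the separator, the field containing spaces stays intact and each key should produce a single output line.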






  • The variable can be avoided with join -t "$(printf '\t')"
    – agc
    Nov 18 at 20:44










edited Nov 19 at 11:26

answered Sep 1 '12 at 23:31
Gilles










