Listed Frequency of Different Strings in a Particular Column

I need to figure out how many times each distinct string shows up in column 4.

This is my data:

25 48656721 48656734 FAM132B ENSCAFT00000019683 4 0.51
X 53969937 53969950 FAM155B ENSCAFT00000026508 5 0.57
3 42203721 42203906 FAM169B ENSCAFT00000017307 5 0.54
36 28947780 28947831 FAM171B ENSCAFT00000046981 5 0.51
10 45080519 45080773 FAM171B ENSCAFT00000003744 9 -0.53
3 61627122 61627446 FAM193A ENSCAFT00000023571 13 0.64
3 61626373 61626466 FAM193A ENSCAFT00000023571 6 0.51
15 55348822 55349196 FAM193A ENSCAFT00000045012 5 0.52

This is a portion of my data. So, I'd want the output to be:

1 FAM132B
1 FAM155B
1 FAM169B
2 FAM171B
3 FAM193A

And so on for the rest of my data. What's a command that would work?










  • Are you looking for a dynamic count on all data? I.e., you need to know how many times each occurrence of an entry appears, and you don't know how many different types of entries there may be, or what those entries may be? Or do you have a set number of potential entries that you are aware of, and want a count of those known entries?
    – Gravy
    Sep 23 '15 at 17:55

  • I only see two "FAM193A" in your sample data? And, do you care if the output is sorted by column 4?
    – Jeff Schaller
    Sep 23 '15 at 17:57

  • @Gravy My data consists of 2066 lines. Above I just have 8 sample lines.
    – Justin
    Sep 23 '15 at 18:09

  • @JeffSchaller You're absolutely right! That was a mistake on my part. I've edited it now. Thanks! And yes, I would like it sorted by column 4.
    – Justin
    Sep 23 '15 at 18:09

  • @Justin use sort -k. From the sort man page: "-k, --key=POS1[,POS2] start a key at POS1 (origin 1), end it at POS2 (default end of line). See POS syntax below"
    – vfbsilva
    Sep 23 '15 at 18:16














shell command-line text-processing uniq






asked Sep 23 '15 at 17:47 by Justin
edited Nov 17 at 20:23 by Rui F Ribeiro


3 Answers

2 votes, accepted

One simple solution is to use awk to pull out column 4, uniq -c to count the values, and then sort to put the result in order by the second column (the old column 4 data):

awk '{print $4}' < data | uniq -c | sort -k2

On your (updated) sample input, this gives:

1 FAM132B
1 FAM155B
1 FAM169B
2 FAM171B
3 FAM193A
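
A minimal sketch of this pipeline, run against the eight sample rows from the question. The function name `count_col4` is made up for illustration. Because `uniq -c` only merges identical lines that are adjacent, the sketch sorts before counting, so it does not depend on equal column 4 values already being grouped together as they happen to be in the sample. The trailing `awk` only strips the left padding that `uniq -c` puts in front of each count.

```shell
#!/bin/sh
# Hypothetical helper: read whitespace-separated rows on stdin and print
# "count value" for every distinct value in column 4, ordered by value.
count_col4() {
    awk '{ print $4 }' |        # keep only column 4
        LC_ALL=C sort |         # bring identical values onto adjacent lines
        uniq -c |               # collapse each run into "count value"
        awk '{ print $1, $2 }'  # drop the padding in front of the count
}

# Sample rows copied from the question.
count_col4 <<'EOF'
25 48656721 48656734 FAM132B ENSCAFT00000019683 4 0.51
X 53969937 53969950 FAM155B ENSCAFT00000026508 5 0.57
3 42203721 42203906 FAM169B ENSCAFT00000017307 5 0.54
36 28947780 28947831 FAM171B ENSCAFT00000046981 5 0.51
10 45080519 45080773 FAM171B ENSCAFT00000003744 9 -0.53
3 61627122 61627446 FAM193A ENSCAFT00000023571 13 0.64
3 61626373 61626466 FAM193A ENSCAFT00000023571 6 0.51
15 55348822 55349196 FAM193A ENSCAFT00000045012 5 0.52
EOF
```

On the sample rows this should print the five lines shown above, from `1 FAM132B` to `3 FAM193A`.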





answered Sep 23 '15 at 18:22 by Jeff Schaller











  • Ooo, this one works great as well! Thank you for that and your explanation!
    – Justin
    Sep 23 '15 at 18:26

  • @Jeff Schaller I messed up here, can you give me a hand? I was using: sort -k4 | awk '{print $4, $3,$2,$1}' | uniq -c. How did you get the first field as a counter?
    – vfbsilva
    Sep 23 '15 at 18:28

  • @vfbsilva you are including columns 3, 2, and 1 in your awk output when you should not; doing so changes uniq's input (and thus its output)
    – Jeff Schaller
    Sep 23 '15 at 18:38




























1 vote

Use awk:

awk '{a[$4]++} END{for(s in a){print a[s]" "s}}' file

  • {a[$4]++} increments the array element indexed by the value of the 4th column. Once awk has gone through the file, the array holds a counter for every distinct value of the 4th column.

  • END{...} marks a block of code that runs after awk has read the whole file.

    • for(s in a) loops over the array's indexes...

    • {print a[s]" "s} ...and prints each counter followed by its index.

The output (awk does not guarantee the order in which for(s in a) visits the indexes):

1 FAM169B
3 FAM193A
1 FAM132B
1 FAM155B
2 FAM171B
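
If a stable order is wanted, the awk tally can be piped through sort. The following is only a sketch: the function name `count_col4_awk` and the array name `count` are made up for illustration, and the rows are a reordered subset of the question's sample, chosen to show that the input does not need to be pre-sorted.

```shell
#!/bin/sh
# Hypothetical helper: tally column 4 in a single awk pass over stdin, then
# sort the result so the output order no longer depends on the awk version.
count_col4_awk() {
    awk '
        { count[$4]++ }              # one counter per distinct column 4 value
        END {
            for (s in count)         # iteration order is unspecified
                print count[s], s    # the comma inserts OFS (a space by default)
        }
    ' | LC_ALL=C sort -k2,2          # order the lines by the string in field 2
}

# A reordered subset of the sample rows from the question.
count_col4_awk <<'EOF'
3 61627122 61627446 FAM193A ENSCAFT00000023571 13 0.64
36 28947780 28947831 FAM171B ENSCAFT00000046981 5 0.51
3 61626373 61626466 FAM193A ENSCAFT00000023571 6 0.51
10 45080519 45080773 FAM171B ENSCAFT00000003744 9 -0.53
15 55348822 55349196 FAM193A ENSCAFT00000045012 5 0.52
EOF
# prints:
# 2 FAM171B
# 3 FAM193A
```

Unlike the uniq-based pipelines, this holds one counter per distinct value in memory and needs no sort before counting, only the optional one afterwards.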





answered Sep 23 '15 at 18:15 by chaos











  • print a[s], s looks better
    – Costas
    Sep 23 '15 at 19:57


























0 votes

Assuming the delimiter is a single space:

cut -d' ' -f4 infile | sort | uniq -c

Note that uniq only collapses adjacent matching lines, so you need to sort first. For example, with this input:

FAM193A
FAM155B
FAM169B
FAM171B
FAM132B
FAM193A
FAM132A
FAM132B
FAM155B
FAM169B
FAM171B
FAM171A
FAM193A
FAM132A

using sort | uniq -c produces:

2 FAM132A
2 FAM132B
2 FAM155B
2 FAM169B
1 FAM171A
2 FAM171B
3 FAM193A

while uniq -c | sort -k2 produces:

1 FAM132A
1 FAM132A
1 FAM132B
1 FAM132B
1 FAM155B
1 FAM155B
1 FAM169B
1 FAM169B
1 FAM171A
1 FAM171B
1 FAM171B
1 FAM193A
1 FAM193A
1 FAM193A
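
The difference is easy to reproduce on a smaller scale. The sketch below uses a shortened, made-up list of four values built from the same gene names, and a hypothetical helper `trim_counts` that strips the left padding `uniq -c` adds to each count.

```shell
#!/bin/sh
# Hypothetical helper: turn the padded output of uniq -c into "count value".
trim_counts() {
    awk '{ print $1, $2 }'
}

# Four values in unsorted order; FAM193A occurs twice, but not on adjacent lines.
values='FAM193A
FAM155B
FAM193A
FAM132B'

echo "uniq -c alone:"
printf '%s\n' "$values" | uniq -c | trim_counts
# 1 FAM193A
# 1 FAM155B
# 1 FAM193A
# 1 FAM132B

echo "sort | uniq -c:"
printf '%s\n' "$values" | LC_ALL=C sort | uniq -c | trim_counts
# 1 FAM132B
# 1 FAM155B
# 2 FAM193A
```

If the columns may be separated by tabs or by runs of spaces rather than exactly one space, `cut -d' ' -f4` will not pick the intended field; `awk '{print $4}'` splits on any run of blanks and can be used in its place.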





answered Sep 23 '15 at 18:55 by don_crissti (community wiki)




























             
