Abbreviate multiple column names in linux, keeping the last field

I have a file in which every column header is a full path name. I want to abbreviate each column header from something that looks like:

/mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample2.so.rg.mk.bam

to:

sample1 sample2

How do I do this in Linux? My files have anywhere from 46 to 100+ columns, so manually editing the column names is not an option. My desired column names are each 7 characters long, as above.

Thanks

The header has the filenames. Each column header/name is

/mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam

where I just want it to be

sample1

To clarify, this is one text file with 46 columns. Each column header or name appears as the lengthy string above, and I want to truncate each header to the 7-character version, e.g. 'sample1' ... 'sampl46'.

Desired example file (with data under each column header):

sample1 sample2 sample3 sample4 sample5 ...

  • The header has the filenames, or the data?
    – Jeff Schaller
    Feb 6 at 11:28

  • What does a sample header look like, given that there are many columns? What's the delimiter?
    – Jeff Schaller
    Feb 6 at 11:29

  • Is this something you can use?
    basename /mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam | sed 's/\.[[:alnum:]]\+//g'
    This prints sample1.
    – eblock
    Feb 6 at 11:39

  • @eblock, I don't think I can use the sed command above because it handles a single column. The file has approximately 46 columns, and I'm looking for a command or script that will abbreviate all the column headers at once. But thanks.
    – Cece
    Feb 6 at 12:41

  • Is there an input example of ...sample46... that gets abbreviated to sampl46 to get it down to 7 characters, or does the input start with ...sampl46...?
    – Jeff Schaller
    Feb 6 at 13:58

linux text-processing






asked Feb 6 at 10:50 by Cece, edited Feb 6 at 18:47 by agc

4 Answers
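Assuming the unwanted suffix is always ".so.rg.mk.bam", GNU sed's e (evaluate) flag can be used to run basename on just the first line of filename and replace that line with the command's output. Because basename -a prints one name per line, paste joins the names back into a single space-separated header:

sed -i '1s/.*/basename -a -s .so.rg.mk.bam & | paste -sd" " -/e' filename

For non-GNU seds, which lack the e flag, head and command substitution can be used instead:

sed -i '1s/.*/'"$(basename -a -s .so.rg.mk.bam $(head -n 1 filename) | paste -sd' ' -)"'/' filename

--

Note: To see the results without changing the file, try it without the -i first. BSD sed requires an explicit backup suffix after -i, for example -i ''.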
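
If you would rather not edit the file in place at all, the same idea can be wrapped in a small function that prints the result to standard output. This is only a sketch under a few assumptions: the function name shorten_header is made up for illustration, the header is space-separated, the paths contain no spaces or glob characters, and basename supports -a and -s (GNU coreutils and the BSDs do).

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not from the answer):
# print FILE with its header paths shortened, leaving FILE untouched.
# Usage: shorten_header FILE [SUFFIX]
shorten_header() {
    file=$1
    suffix=${2:-.so.rg.mk.bam}
    # Header: strip directories and the suffix, then rejoin with spaces.
    # $(head ...) is unquoted on purpose so it splits into one path per word.
    basename -a -s "$suffix" $(head -n 1 "$file") | paste -sd ' ' -
    # Data: copy every line after the header unchanged.
    tail -n +2 "$file"
}
```

Redirect the output to a new file, for example shorten_header filename > filename.short, and compare the two before replacing anything.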






I would write a short program that copies the original file into a new file with the short names. Keeping the original file gives you a backup should something go wrong. Exactly what you write depends on the language you are comfortable with. This may be your shell, such as Bash, or any of a number of languages such as Java, C, Perl, Python, etc.

Here is some pseudocode:

old is the original file and new is the new file
create new
begin a loop to read each line in old
    read line from old
    delete all characters from line up to and including the last "/"
    delete all characters from line after the first 7
    // this is what you want to save unless it conflicts with a previously saved line
    determine if you have a conflict
    if there is a conflict
        add a number to the end of line to make it unique
    save line to new
end of loop
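
One way the pseudocode might look in practice is sketched below, using awk from a shell function. It makes assumptions the pseudocode does not state: the names to shorten are the space-separated fields of the first line rather than separate lines, and abbreviate_header is a made-up name. Note that appending a counter on a conflict makes the name longer than 7 characters.

```shell
#!/bin/sh
# Hypothetical sketch of the pseudocode, applied per header field.
# Usage: abbreviate_header OLD NEW   (OLD is never modified)
abbreviate_header() {
    awk '
    NR == 1 {
        for (i = 1; i <= NF; i++) {
            name = $i
            sub(/.*\//, "", name)       # drop everything through the last "/"
            name = substr(name, 1, 7)   # keep the first 7 characters
            # On a conflict, append a counter until the name is unused.
            candidate = name
            n = 1
            while (candidate in seen) {
                n++
                candidate = name n
            }
            seen[candidate] = 1
            $i = candidate              # awk rejoins the fields with spaces
        }
    }
    { print }                           # rewritten header, then the data lines
    ' "$1" > "$2"
}
```

With headers ending in sample46 and sample47, both truncate to sample4, so the second one becomes sample42. If that is not acceptable, a digit-preserving rule is needed instead of a plain cut at 7 characters.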





  • The pseudocode I provided got thrown into a single paragraph. I hope you can make sense of it.
    – Ed Roberts
    Feb 6 at 14:19

Let's say I have a file with 4 columns and two lines:

host:~ # cat file2
/mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample2.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample3.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample4.so.rg.mk.bam
abc def ghi jkl

This command worked for me (not very elegant, but still):

host:~ # sed -i -e '1s/^\///' -e '1s/[[:alnum:]]\+\///g' -e '1s/\.[[:alnum:]]\+//g' -e '1s/\///g' file2
host:~ # cat file2
sample1 sample2 sample3 sample4
abc def ghi jkl

The 1 address limits every substitution to the header line, so data values that contain dots or slashes are left alone. The \+ quantifier is a GNU sed extension.

I'm sure there's a more efficient way, but you could give it a try.
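
As one possibly simpler variation, the four substitutions can be folded into a single extended-regex substitution. This is a sketch, not a drop-in replacement: short_header_sed is a made-up name, and it assumes space-separated fields, names that end at the first dot, and a sed that accepts -E (GNU and BSD sed both do).

```shell
#!/bin/sh
# Hypothetical helper: shorten the header paths with one substitution and
# print the result to standard output (the input file is not modified).
short_header_sed() {
    # Per field: [^ ]*/ consumes the directories, ([^ ./]+) captures the
    # name up to the first dot, and [^ ]* consumes the trailing extensions.
    sed -E '1s#[^ ]*/([^ ./]+)[^ ]*#\1#g' "$1"
}
```

Using # as the delimiter avoids having to escape the slashes, which is what makes the chained version hard to read.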






You could use awk to process the headers. The following awk script rewrites only the first line (NR == 1) and passes the remaining lines through unchanged. It loops through all of the fields in that line, one at a time. For each field, it performs the following steps:

1. Find the first instance of the text /sample and trim the text up to that (and through the /).
2. In the remainder, find the first period and trim off everything from the period onwards.
3. If the remainder is longer than 7 characters, shorten the "sample" prefix as much as needed. The number of prefix characters to keep works out to "6 plus the position of the first digit minus the overall length".
4. Once we're done processing this field, print it followed by a space.
5. Once we're done looping through all of the fields, print a newline.

Note that this leaves you with a trailing space at the end of the line.

The awk script:

NR == 1 {
    for (i = 1; i <= NF; i++) {
        tail = substr($i, 1 + match($i, "/sample"))   # delete up to the first instance of "/sample"
        tail = substr(tail, 1, index(tail, ".") - 1)  # find, then stop short of, the first period
        if (length(tail) > 7) {                       # if it's too long
            match(tail, "[0-9]")                      # find the first digit
            # trim the beginning down, then append the number
            tail = substr(tail, 1, 6 + RSTART - length(tail)) substr(tail, RSTART)
        }
        printf "%s ", tail
    }
    print ""
}
NR > 1 { print }                                      # copy the data lines unchanged

On a sample input of:

/mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample47.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample4631.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample1234567.so.rg.mk.bam

The sample output is:

sample1 sampl47 sam4631 1234567
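
If the trailing space is a problem, one option is to assign each shortened name back to its field and let awk rejoin the line. The sketch below keeps the same trimming rules; the function name shorten_awk is made up, and it assumes every header field contains /sample and at least one period, since a field without a period would come out empty.

```shell
#!/bin/sh
# Hypothetical variant of the awk script above: same trimming rules, but each
# shortened name is stored back in its field, so awk rejoins them with OFS
# (a single space by default) and no trailing space is produced.
shorten_awk() {
    awk '
    NR == 1 {
        for (i = 1; i <= NF; i++) {
            tail = substr($i, 1 + match($i, "/sample"))   # text after the slash of "/sample"
            tail = substr(tail, 1, index(tail, ".") - 1)  # cut at the first period
            if (length(tail) > 7) {
                match(tail, "[0-9]")                      # RSTART = position of first digit
                tail = substr(tail, 1, 6 + RSTART - length(tail)) substr(tail, RSTART)
            }
            $i = tail
        }
    }
    { print }                                             # header and data lines
    ' "$1"
}
```

Run it as shorten_awk file > file.new. Rebuilding the header this way also collapses any runs of whitespace between header fields into single spaces, which may matter if the data lines are tab-separated.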





Answer using sed with basename: answered Feb 6 at 18:57 by agc, edited Feb 8 at 19:16

Pseudocode answer: answered Feb 6 at 14:16 by Ed Roberts, edited Feb 6 at 14:21

Answer using chained sed substitutions: answered Feb 6 at 14:44 by eblock

                        0














                        You could use awk to process the headers. The following awk script works only on the first line (NR==1). It loops though all of the fields in that line, one at a time. For each field, it performs the following steps:



                        1. Find the first instance of the text /sample and trim the text up to that (and through the /).

                        2. Find the first instance in the remainder of a period and trim off the portion from the period onwards.

                        3. If the remainder is too long, then trim the sample text down as much as needed. The equation how much of it to keep turns out to be "6 plus the position of the first digit minus the overall length".

                        4. Once we're done processing this field, print it out with a trailing space.

                        5. Once we're done looping through all of the fields, print a newline.

                        Note that this leaves you with a trailing space at the end of the line.



                        The awk script:



                        NR == 1 
                        for(i=1; i <= NF; i++)
                        tail=substr($i, 1 + match($i, "/sample")) # delete up to the first instance of "/sample"
                        tail=substr(tail, 1, index(tail, ".") - 1) # find, then stop short of, the first period
                        if (length(tail) > 7) # if it's too long
                        match(tail, "[0-9]") # find the first digit
                        # trim the beginning down, then append the number
                        tail=substr(tail, 1, 6 + RSTART - length(tail))substr(tail, RSTART)

                        printf tail" "

                        print ""



                        On a sample input of:



                        /mydir/cat/dog/hen/test/block/sample1.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample47.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample4631.so.rg.mk.bam /mydir/cat/dog/hen/test/block/sample1234567.so.rg.mk.bam 


                        The sample output is:



                        sample1 sampl47 sam4631 1234567





                            answered Feb 23 at 2:25









                            Jeff Schaller


























