How can I standardize the phone numbers in a text file?

I periodically receive a text file with phone numbers formatted in wildly different ways: ##########, ###-###-####, (###) ###-####, etc. Usually there are ten digits, but I've seen +1 (###) ###-####.

Eventually the file gets imported into a database, but for reasons I won't go into, it'd be handy for the phone numbers to have a standard format, (###) ###-####.

The only constant is that the phone numbers always fall between the second and third tab character on each line.

Is there a way to do this from the command line?







  • The phone number system you mention in your question was introduced by Germany in the 1950s and soon withdrawn because it was not flexible enough. The US later imported that scheme, and unless you only need to deal with US phone numbers, you need to be aware of much longer phone numbers. IIRC, current international rules require that the country code (1 in the case of the US, but up to three digits in general) plus the rest of the phone number may be up to 16 digits. BTW: the area code could be from one to five digits.
    – schily
    May 4 at 10:23











  • To standardize, one would do well to use a standard. An appropriate standard for telephone numbers is E.164 from the International Telecommunication Union.
    – JdeBP
    May 4 at 10:38










  • This particular system uses only US phone numbers, and the "standard" is for human readability.
    – Chuck
    May 4 at 17:14














asked May 3 at 23:47 by Chuck











3 Answers
accepted










This should cover you as long as the file is as you have described. The command preserves information before and after the phone number and formats it in the way that you asked for. If the output looks good, add the -i option to sed to edit it in place or provide it with output redirection using > output_file at the end.



sed -E "s/(.*\t.*\t)\+?1?[[:space:]]?\(?([0-9]{3})\)?.*([0-9]{3}).*([0-9]{4})(.*)/\1(\2) \3-\4\5/g" filename


I tested it on a file containing this text:



 jfk 902-765-9292 hat jump cat
jk 902 819 2244 hat jump cat
98 902 823-4456 hat jump cat
78h +1 075 242 1566 hat jump cat
jklj kjlj +1 075-242-1566 hat jump cat
jk jkj +1 (075) 242-1566 hat jump cat
kj (204) 799-9810 hat jump cat
kj 89 (204)-799-9810 hat jump cat


The output was:



 jfk (902) 765-9292 hat jump cat
jk (902) 819-2244 hat jump cat
98 (902) 823-4456 hat jump cat
78h (075) 242-1566 hat jump cat
jklj kjlj (075) 242-1566 hat jump cat
jk jkj (075) 242-1566 hat jump cat
kj (204) 799-9810 hat jump cat
kj 89 (204) 799-9810 hat jump cat
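
A minimal sketch of a more conservative variant, assuming the phone number is the whole third tab-separated field and only US numbers occur. It anchors on the first two tabs instead of relying on .* and leaves untouched any field that does not look like a ten-digit number grouped 3-3-4 (after an optional leading 1). The function name normalize_field3 and the helper variables tab and nd are illustrative, not part of the command above.

```shell
# Reformat only the third tab-separated field; reads stdin, writes stdout.
normalize_field3() {
  tab=$(printf '\t')      # literal tab, so the regex does not depend on \t support in sed
  nd="[^${tab}0-9]*"      # a run of characters that are neither digits nor tabs
  # Groups: 1 = first two fields with their tabs, 3/4/5 = the digit groups,
  # 6 = the tab (or end of line) that closes the third field.
  sed -E "s/^(([^${tab}]*${tab}){2})${nd}1?${nd}([0-9]{3})${nd}([0-9]{3})${nd}([0-9]{4})${nd}(${tab}|\$)/\1(\3) \4-\5\6/"
}
```

Usage might look like normalize_field3 < input.txt > output.txt, where both file names are placeholders.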





You can construct a regular expression that matches any of the formats and captures the digits, then re-substitute them in your desired format.

For example, to match and capture a sequence of three decimal digits optionally surrounded by parentheses with an Extended Regular Expression (ERE), you can write \(?([0-9]{3})\)? while [- ]? matches an optional hyphen or space. Building up in this way,

\(?([0-9]{3})\)?[- ]?([0-9]{3})[- ]?([0-9]{4})

will match 3 digits optionally parenthesized, optionally followed by a hyphen or space, then 3 more digits optionally followed by a hyphen or space, followed by 4 digits.

Applying the expression in a sed substitution:

$ cat <<EOF | sed -E 's/\(?([0-9]{3})\)?[- ]?([0-9]{3})[- ]?([0-9]{4})/(\1) \2-\3/g'
I periodically receive a text file with phone numbers formatted
in wildly different ways: 123 456-7890, 123 456-7890, 123 456-7890,
etc. Usually there's ten digits, but I've seen +1 555 456-7890.
EOF
I periodically receive a text file with phone numbers formatted
in wildly different ways: (123) 456-7890, (123) 456-7890, (123) 456-7890,
etc. Usually there's ten digits, but I've seen +1 (555) 456-7890.
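
The same building-up can be sketched with shell variables, one per piece of the expression. This version also adds an optional country-code piece so that a leading +1 or 1 is dropped rather than kept. That extra piece, the variable names, and the function name format_phones are assumptions for illustration, not part of the expression above.

```shell
# Named pieces of the ERE (single-quoted so the backslashes reach sed unchanged).
cc='(\+?1[- ]?)?'          # optional country code "+1" or "1" (group 1, discarded)
area='\(?([0-9]{3})\)?'    # area code, optionally parenthesized (group 2)
sep='[- ]?'                # optional hyphen or space
exch='([0-9]{3})'          # next three digits (group 3)
last='([0-9]{4})'          # final four digits (group 4)

# Reads text on stdin and rewrites every match as (###) ###-####.
format_phones() {
  sed -E "s/${cc}${area}${sep}${exch}${sep}${last}/(\2) \3-\4/g"
}
```

Like the expression it is built from, this sketch is not tied to a particular field, so it rewrites any matching digit run on the line.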





You need to match the field and then re-format it; here's an awk script that looks for three variations and re-formats them (before default-printing the reconstituted line):

$3 ~ /^[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/ {
  $3 = "(" substr($3, 1, 3) ") " substr($3, 4, 3) "-" substr($3, 7, 4)
}

$3 ~ /^[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$/ {
  $3 = "(" substr($3, 1, 3) ") " substr($3, 5, 3) "-" substr($3, 9, 4)
}

$3 ~ /^\+1 \([0-9][0-9][0-9]\) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$/ {
  $3 = "(" substr($3, 5, 3) ") " substr($3, 10, 3) "-" substr($3, 14, 4)
}

1

Save that to a file, perhaps phone.awk, then call it with: awk -F $'\t' -v OFS=$'\t' -f phone.awk < input. Setting OFS keeps the rebuilt line tab-separated, because awk rejoins the fields with OFS (a single space by default) once $3 has been assigned.
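
As a hedged alternative to listing each variation, the field can be reduced to its digits first and reformatted only when exactly ten remain. This is a sketch assuming US numbers in the third tab-separated field; the function name normalize_phones_awk and the warning text are made up for illustration.

```shell
# Reads tab-separated lines (stdin or files) and rewrites field 3 as (###) ###-####.
normalize_phones_awk() {
  awk '
    BEGIN { FS = OFS = "\t" }        # read and write tab-separated fields
    NF >= 3 {
      digits = $3
      gsub(/[^0-9]/, "", digits)     # keep only the digits
      if (length(digits) == 11 && substr(digits, 1, 1) == "1")
        digits = substr(digits, 2)   # drop the leading US country code
      if (length(digits) == 10)
        $3 = "(" substr(digits, 1, 3) ") " substr(digits, 4, 3) "-" substr(digits, 7, 4)
      else
        print "line " NR ": left unchanged: " $3 > "/dev/stderr"
    }
    { print }
  ' "$@"
}
```

Fields that do not reduce to ten digits are printed unchanged and reported on standard error, so odd entries can be reviewed by hand before the database import.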






First answer (accepted): answered May 4 at 1:30 by Jeff H., edited May 4 at 4:12






















Second answer: answered May 4 at 1:18 by steeldriver




















Third answer: answered May 4 at 2:05 by Jeff Schaller






















                                 
