How can I increase the performance of the code below?













Below is my script, which has a lot of performance issues:



#!/usr/bin/ksh
while read i
do
x=`echo $i |cut -d"|" -f2`
rem=`expr $x % 62`
echo "reminder is " $rem
quo=`expr $x / 62`
echo "quotiont is " $quo

grp_rem=" "
if [[ ${#quo} -ge 2 ]]
then
while [ $quo -ge 62 ]
do
sub_rem=`expr $quo % 62`
quo=`expr $quo / 62`
grp_rem=`echo $sub_rem" "$grp_rem`
done
fi
echo $i"|"$quo" "$grp_rem" "$rem >> base62_while.out
done < base62_while.txt


Is there any way I can improve the performance of the above script?



sample input:



1|5147634738948389685


sample output



1|5147634738948389685|6 8 16 13 46 17 20 35 9 49 43
































  • It's a ksh script. Does an optimised alternative have to be ksh, or would a different shell (such as bash) be acceptable?
    – roaima
    Aug 16 at 18:26

  • No, I am not using AIX; ksh and bash are both fine.
    – siva krishna
    Aug 16 at 18:28

  • Related: Why is using a shell loop to process text considered bad practice?
    – Stéphane Chazelas
    Aug 16 at 19:17

  • @StéphaneChazelas The script I optimized seems to be quite fast; this may be a counterexample to your claim.
    – Isaac
    Aug 17 at 21:52




















linux ksh














edited Aug 20 at 6:25
muru

asked Aug 16 at 17:44
siva krishna





















7 Answers

















Answer (3 votes, accepted)










This should be considerably faster:

#!/usr/bin/ksh
#
# Split each line on '|' into the record number (n) and the decimal value (x)
while IFS='|' read n x
do
    # bc prints the base 62 digits as zero-padded decimal pairs;
    # sed strips the padding zeros and the leading space
    base62="$(echo "obase=62; $x" | bc | sed -re 's/ 0/ /g' -e 's/^ //')"
    printf "%d|%s|%s\n" "$n" "$x" "$base62"
done <base62_while.txt >>base62_while.out


The base62 line uses bc to convert the decimal source number into its base 62 equivalent. bc outputs two-digit decimal pairs, from which we strip any leading zero (e.g. 02 is rewritten as 2, but 45 is left unchanged).



Input



1|5147634738948389685


Output



1|5147634738948389685|6 8 16 13 46 17 20 35 9 49 43
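To make the cleanup step concrete, here is a minimal sketch that applies the same two sed expressions to a hard-coded string. The sample string is an assumption: it mimics the space-separated, zero-padded shape that bc prints for obase=62 (bc itself is not run here), and strip_padding is a hypothetical helper name, not part of the answer.

```shell
#!/bin/sh
# strip_padding (hypothetical helper): drop the zero padding of each
# two-digit group and the leading space, as the sed call in the answer does.
strip_padding() {
    printf '%s\n' "$1" | sed -e 's/ 0/ /g' -e 's/^ //'
}

# Assumed sample in the shape of bc's obase=62 output (hard-coded)
raw=' 06 08 16 13 46 17 20 35 09 49 43'
strip_padding "$raw"    # prints: 6 8 16 13 46 17 20 35 9 49 43
```

A digit whose value is zero survives the cleanup: the group " 00" becomes " 0", not an empty field.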





















  • I like this - especially the clever use of bc's default way of handling obase > 16.
    – steeldriver
    Aug 17 at 1:56

  • How do I specify which field base62 should be applied to? Suppose I have 5 fields and base62 should be applied to the 3rd field.
    – siva krishna
    Aug 17 at 13:31

  • Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop.
    – roaima
    Aug 17 at 13:39

  • @roaima I got this. Actually, I want to assign static alpha values (alpha characters) to the remainder, sub-remainder and quotient; I will post it here in the next comment. One more thing: I want to remove the spaces from the output, so it should be 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43.
    – siva krishna
    Aug 17 at 13:46

  • if rem=1 then rem=a; if rem=2 then rem=b; if rem=3 then rem=c, and so on with a-z and A-Z for quo, rem and sub_rem, so that I can reduce the output length. I am unable to add the complete static values as the list is too long.
    – siva krishna
    Aug 17 at 14:03
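The mapping asked about in the last two comments could be sketched as a lookup into a 62-character string rather than a chain of if tests. The alphabet order below (0-9, then a-z, then A-Z) and the function name to_base62_string are assumptions for illustration only; the comments describe a different, incomplete mapping (1 to a, 2 to b, ...), so the string would need adjusting to match the intended one. The sketch relies on substring expansion and 64-bit arithmetic as found in ksh93 and bash.

```shell
#!/bin/bash
# Assumed digit alphabet: value 0 maps to "0", 10 to "a", 36 to "A", 61 to "Z".
alphabet=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

# to_base62_string (hypothetical name): print a non-negative decimal integer
# as a compact base-62 string using the alphabet above.
# Note: x, rem and out are plain global variables in this sketch.
to_base62_string() {
    x=$1
    out=
    while :; do
        rem=$(( x % 62 ))
        out=${alphabet:rem:1}$out      # prepend the character for this digit
        x=$(( x / 62 ))
        [ "$x" -gt 0 ] || break
    done
    printf '%s\n' "$out"
}

to_base62_string 5147634738948389685   # prints: 68gdKhkz9NH
```

With single characters per digit the separating spaces are no longer needed, which also avoids the ambiguity of joining multi-digit decimal values such as 6 8 16 into 6816.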

















Answer (5 votes)













You don't need to call out to any external tools: ksh can do arithmetic. I'm also using an array to store the remainders.



#!/usr/bin/ksh
div=62
while IFS='|' read -r n x; do
    rem=$(( x % div ))
    quo=$(( x / div ))
    echo "reminder is $rem" >&2
    echo "quotiont is $quo" >&2

    # collect the remainders, most significant first
    remainders=( $rem )
    while (( quo >= div )); do
        sub_rem=$(( quo % div ))
        quo=$(( quo / div ))
        echo "reminder is $sub_rem" >&2
        echo "quotiont is $quo" >&2
        remainders=( $sub_rem "${remainders[@]}" )
    done
    echo "$n|$x|$quo ${remainders[*]}"

    # rebuild the number from its digits to check the conversion
    x=$quo
    for r in "${remainders[@]}"; do
        x=$(( x * div + r ))
    done
    echo Verification: $x
done <<END
1|5147634738948389685
END



























  • When I pass a file name instead of 1|5147634738948389685, it is not reading anything. How do I pass a file name to this script?
    – siva krishna
    Aug 20 at 15:15

  • Change the heredoc <<END ... END to a simple redirection < filename
    – glenn jackman
    Aug 20 at 20:31
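The suggestion above, replacing the here-document with a redirection, might look like the following minimal sketch. The function name convert_file and passing the file name as its first argument are assumptions for illustration; the loop is a condensed variant of the arithmetic in the answer, with the diagnostic output and the verification step left out.

```shell
#!/bin/sh
# convert_file (hypothetical name): read "n|x" records from the file named
# in $1 and print "n|x|<base-62 digits>" for each record.
convert_file() {
    while IFS='|' read -r n x; do
        digits=$(( x % 62 ))
        quo=$(( x / 62 ))
        while [ "$quo" -gt 0 ]; do
            digits="$(( quo % 62 )) $digits"   # prepend the next digit
            quo=$(( quo / 62 ))
        done
        printf '%s|%s|%s\n' "$n" "$x" "$digits"
    done < "$1"     # redirection from a file instead of <<END ... END
}

# Usage, assuming the input file from the question exists:
#   convert_file base62_while.txt >> base62_while.out
```

Redirecting once after done keeps the file open for the whole loop, so nothing is reopened per record.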

















Answer (3 votes)













There are several things that could be done (and speed gained). Timings are for 1000 numbers, in seconds:

  • original script: 35.023
  • replace all the expr commands with arithmetic expansions, e.g. $((x%62)): 14.473
  • convert grp_rem=`echo $sub_rem" "$grp_rem` to grp_rem="$sub_rem $grp_rem": 3.131
  • avoid the use of cut (set IFS='|'; set -f; and use shell splitting with set -- $1): 0.454
    • or use IFS='|' read a x <<<"$i" (though <<< creates a temp file)
    • and, since one read is already being used, do the splitting in that read instead
  • reduce to only one loop (remove the if) and remove the trailing space at the end: 0.207
  • make the loop tighter by joining both $((...)) expressions: 0.113
    Shell total: ~300 times faster than the original 35.023 seconds. This is probably the best that can be done with a shell script.
  • change to awk: 0.123
    awk total: ~280 times faster than the original.

Resulting script:



#!/usr/bin/ksh
while IFS='|' read a b               # read both values split on '|'
do
    x=$b                             # set value of x (quotient)
    grp_rem=""                       # clear value of group
    while (( rem=x%62 , x/=62 ))     # do both math expressions
    do
        grp_rem="$rem $grp_rem"      # concatenate resulting values
    done
    grp_rem=${grp_rem%?}             # remove one character (a trailing space)
    echo "$a|$b|$rem $grp_rem"
done < base62_while.txt >> base62_while.out
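The step in the list above that replaces cut with shell word splitting can be sketched as follows. The helper name second_field and the result variable field are hypothetical; the sketch sets IFS to '|', disables globbing with set -f, and lets set -- split the line into positional parameters, returning the result in a variable so that no command substitution (and no extra process) is needed.

```shell
#!/bin/sh
# second_field (hypothetical name): store the second '|'-separated field
# of $1 in the variable "field" without starting an external cut process.
second_field() {
    old_ifs=$IFS
    IFS='|'            # split on '|' only
    set -f             # no pathname expansion on the unquoted expansion below
    set -- $1          # the fields become $1, $2, ...
    set +f             # assumes globbing was enabled before the call
    IFS=$old_ifs
    field=$2
}

second_field '1|5147634738948389685'
printf '%s\n' "$field"    # prints: 5147634738948389685
```

In the resulting script this step disappears entirely, because IFS='|' read a b already does the splitting as part of reading the line.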


An equivalent awk script. I don't know if this is the fastest awk script possible, but it works fine, and it is faster than the shell for more than 10k lines.
Note: This uses GNU awk with the -M option (arbitrary precision), which is a must for processing numbers on the order of 19 digits like the one you presented. It could process even longer numbers; I did not check how long, but I am pretty sure that the limit is pretty high. :-) Note that awk must have been compiled with that option included (check with awk 'BEGIN{ print( PROCINFO["gmp_version"], PROCINFO["prec_max"]) }').



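awk -MF'|' '{
    x=$2; grp_rem="";
    while(x>0) {
        rem=x%62;
        x=int(x/62);
        grp_rem=rem" "grp_rem
    }
    # The printf line is cut off after "%-22s" in the source; the rest of the
    # format string and its arguments are an assumed reconstruction.
    printf("%-22s|%s\n", $0, grp_rem)
}' <base62_while.txt >>base62_while.out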























  • On a 100000 line file, I find that that gawk solution is still 3 times as fast even though it supports arbitrary precision. You can make the ksh one faster by taking the output redirection out of the loop.
    – Stéphane Chazelas
    Aug 18 at 8:02

  • This answer already says "Faster than the shell for more than 10k lines", so yes, that is known. But which is faster for a few values? Please answer this, @StéphaneChazelas.
    – Isaac
    Aug 19 at 12:54

















Answer (2 votes)













After playing for a bit with the Math::Base::Convert perl module, I came up with



perl -F'\|' -MMath::Base::Convert -lne '
  BEGIN {
    $bc = new Math::Base::Convert(dec,b62);
    # create a mapping from internal symbol set to desired decimal representation
    $syms = $bc->b62;
    @h{@$syms} = (0..61);
  }
  print join "|", @F[0..1], (join " ", map $h{$_}, split //, $bc->cnv($F[1]))
' base62_while.txt


There may be faster perl alternatives, as discussed in "Base conversion", although I'm not sure whether they have the same flexibility to manipulate the output mapping.



































Answer (2 votes)













With dc:



sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' base62_while.txt | dc > base62_while.out


Or bc (note that historical implementations of bc are actually wrappers around dc):



sed 's/.*|\(.*\)/"&|";\1/;1s/^/obase=62;/' base62_while.txt | bc > base62_while.out


Note that dc and bc wrap long lines of output. With the GNU implementations, you can set the DC_LINE_LENGTH and BC_LINE_LENGTH environment variables to 0 to avoid it.



$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00
$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | DC_LINE_LENGTH=0 dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
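To see what dc actually receives, it can help to run only the sed stage. This sketch wraps the sed expression of the dc variant in a hypothetical function, make_dc_program, and feeds it two sample lines (the second one is made up for illustration).

```shell
#!/bin/sh
# make_dc_program (hypothetical name): turn "n|x" lines on stdin into dc commands.
#   [text]P   prints the text without a trailing newline
#   <x>p      prints the number x in the current output radix
#   62o       (prepended to line 1 only) sets the output radix to 62
make_dc_program() {
    sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/'
}

printf '%s\n' '1|5147634738948389685' '2|62' | make_dc_program
# prints:
#   62o[1|5147634738948389685|]P5147634738948389685p
#   [2|62|]P62p
```

The whole input becomes one dc program, so a single dc process handles every line instead of one process per record.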


































Answer (1 vote)













You can do some optimizations.



Change



grp_rem=`echo $sub_rem" "$grp_rem`


to



grp_rem="$sub_rem $grp_rem"


Change



x=`echo $i |cut -d"|" -f2`


to



x="${i#*|}"


You probably also want to change



if [[ ${#quo} -ge 2 ]]


to



if [[ $quo -ge 62 ]]


Reducing the number of subshells a little will help. If you want more speed, use a language like C.


























  • How do I select field 2 with x="${i#*|}"?
    – siva krishna
    Aug 16 at 18:25

  • It will strip everything at the front of the value that matches the pattern *|, in other words everything up to and including the |.
    – RalfFriedl
    Aug 16 at 18:35
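A short sketch of the expansion described in the comment above, together with its counterpart for the first field. The variable names record, first and second are illustrative only.

```shell
#!/bin/sh
record='1|5147634738948389685'

second=${record#*|}    # remove the shortest prefix matching "*|": keeps the text after the first "|"
first=${record%%|*}    # remove the longest suffix matching "|*": keeps the text before the first "|"

printf 'first=%s second=%s\n' "$first" "$second"
# prints: first=1 second=5147634738948389685
```

Both expansions are done by the shell itself, so no echo or cut process is started for each input line.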

















Answer (1 vote)













The shell is slow: use a different language. If we compare the original KSH script (modified to use stdin and stdout), something very similar to steeldriver's Perl code (a script instead of a one-liner that shows similar speeds to glenn jackman's native KSH version), and a LISP implementation with 10,000 lines of input on a CentOS 7 test system:



      base62.ksh 93.29s user 143.48s system 109% cpu 3:36.73 total
      base62.perl 1.32s user 0.00s system 99% cpu 1.326 total
      base62.sbcl 0.22s user 0.03s system 99% cpu 0.243 total


      Obviously the original code quickly becomes impractical as the input lines increase, as will scripting languages compared to LISP with significant amounts of input. The base62.sbcl time is from a tail call recursive implementation:



#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun divvy-r (n b l)
  (if (< n b) (cons (truncate n) l)
      (let ((rem (truncate (mod n b))) (quo (/ n b)))
        (divvy-r quo b (cons rem l)))))
(defun divvy (n b)
  (let ((rem (mod n b)) (quo (/ n b)))
    (if (< quo 2)
        (list (truncate quo) (truncate rem))
        (divvy-r n b nil))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (let ((out (divvy n 62)))
      (format t "~a|~{~a~^ ~}~&" line out))))


      Reading "Common Lisp: A Gentle Introduction to Symbolic Computation" and doing all the exercises therein is how I learned this. Slightly faster (and ever so succinct) is a do* implementation based on glenn jackman's KSH code:



#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun remainders (n base)
  (do* ((rem (mod n base) (mod quo base))
        (quo (/ n base) (/ quo base))
        (out (cons (truncate rem) nil) (cons (truncate rem) out)))
       ((< quo base) (cons (truncate quo) out))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (format t "~a|~{~a~^ ~}~&" line (remainders n 62))))



























  • It's ironic that that script actually needs a shell to run. So, the shell is not slow as long as you're using it the right way: as a command line interpreter to run the right command for the task (as opposed to dozens of invocations of ill-fitted tools for each line of the input in the OP's attempt).
    – Stéphane Chazelas
    Aug 19 at 21:57

  • SBCL can compile the script to a ~39 megabyte binary, which shaves ~0.01 seconds off the execution time by avoiding the shell exec. Otherwise, LISP implementations are often an awkward fit for the unix shell environment...
    – thrig
    Aug 19 at 22:58




















Accepted answer: answered Aug 16 at 20:03 by roaima







      • 1




        I like this - especially the clever use of bc's default way of handling obase > 16
        – steeldriver
        Aug 17 at 1:56










      • how do I pass which for which field base62 has to be done Suppose I have 5 fileds, base62 should I apply on 3field
        – siva krishna
        Aug 17 at 13:31











      • Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop
        – roaima
        Aug 17 at 13:39











      • @roaima I got this. Actually I want to want to assign a static alpha values to the reminder, sub reminder and quo with alpha characters, I will post it here in next comment. and one more thing I want to replace the spaces for output. like it should be 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43
        – siva krishna
        Aug 17 at 13:46











      • if rem=1 then rem=a;if rem=2 then rem=b;if rem=3 then rem=c like this a-z and A-Z for quo and rem and sub_rem so that I can reduce the output length. I am unable to add complete static values as it too long.
        – siva krishna
        Aug 17 at 14:03












      • 1




        I like this - especially the clever use of bc's default way of handling obase > 16
        – steeldriver
        Aug 17 at 1:56










      • how do I pass which for which field base62 has to be done Suppose I have 5 fileds, base62 should I apply on 3field
        – siva krishna
        Aug 17 at 13:31











      • Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop
        – roaima
        Aug 17 at 13:39











      • @roaima I got this. Actually I want to want to assign a static alpha values to the reminder, sub reminder and quo with alpha characters, I will post it here in next comment. and one more thing I want to replace the spaces for output. like it should be 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43
        – siva krishna
        Aug 17 at 13:46











      • if rem=1 then rem=a;if rem=2 then rem=b;if rem=3 then rem=c like this a-z and A-Z for quo and rem and sub_rem so that I can reduce the output length. I am unable to add complete static values as it too long.
        – siva krishna
        Aug 17 at 14:03







      1




      1




      I like this - especially the clever use of bc's default way of handling obase > 16
      – steeldriver
      Aug 17 at 1:56




      I like this - especially the clever use of bc's default way of handling obase > 16
      – steeldriver
      Aug 17 at 1:56












      how do I pass which for which field base62 has to be done Suppose I have 5 fileds, base62 should I apply on 3field
      – siva krishna
      Aug 17 at 13:31





      how do I pass which for which field base62 has to be done Suppose I have 5 fileds, base62 should I apply on 3field
      – siva krishna
      Aug 17 at 13:31













      Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop
      – roaima
      Aug 17 at 13:39





      Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop
      – roaima
      Aug 17 at 13:39













      @roaima I got this. Actually I want to want to assign a static alpha values to the reminder, sub reminder and quo with alpha characters, I will post it here in next comment. and one more thing I want to replace the spaces for output. like it should be 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43
      – siva krishna
      Aug 17 at 13:46





      @roaima I got this. Actually I want to want to assign a static alpha values to the reminder, sub reminder and quo with alpha characters, I will post it here in next comment. and one more thing I want to replace the spaces for output. like it should be 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43
      – siva krishna
      Aug 17 at 13:46













      if rem=1 then rem=a;if rem=2 then rem=b;if rem=3 then rem=c like this a-z and A-Z for quo and rem and sub_rem so that I can reduce the output length. I am unable to add complete static values as it too long.
      – siva krishna
      Aug 17 at 14:03




      if rem=1 then rem=a;if rem=2 then rem=b;if rem=3 then rem=c like this a-z and A-Z for quo and rem and sub_rem so that I can reduce the output length. I am unable to add complete static values as it too long.
      – siva krishna
      Aug 17 at 14:03












      up vote
      5
      down vote













      You don't need to call out to any external tools: ksh can do arithmetic. I'm also using an array to store the remainders



      #!/usr/bin/ksh
      div=62
      while IFS='|' read -r n x; do
      rem=$(( x % div ))
      quo=$(( x / div ))
      echo "reminder is $rem" >&2
      echo "quotiont is $quo" >&2

      remainders=( $rem )
      while (( quo >= div )); do
      sub_rem=$(( quo % 62 ))
      quo=$(( quo / 62 ))
      echo "reminder is $sub_rem" >&2
      echo "quotiont is $quo" >&2
      remainders=( $sub_rem "$remainders[@]" )
      done
      echo "$n|$x|$quo $remainders[*]"

      x=$quo
      for r in "$remainders[@]"; do
      x=$(( x * div + r ))
      done
      echo Verification: $x
      done <<END
      1|5147634738948389685
      END





      share|improve this answer






















      • When I am passing file name instead of 1|5147634738948389685 it is reading anything. How do I pass a file name in this script
        – siva krishna
        Aug 20 at 15:15










      • change the heredoc <<END ... END to a simple redirection < filename
        – glenn jackman
        Aug 20 at 20:31






















      edited Aug 16 at 19:04

























      answered Aug 16 at 18:42









      glenn jackman



































      There are several things that could be done to gain speed (times are for 1000 input numbers):



      • original script: 35.023 sec

      • replace all the expr commands with arithmetic expansions such as $((x%62)): 14.473 sec

      • convert grp_rem=`echo $sub_rem" "$grp_rem` to grp_rem="$sub_rem $grp_rem": 3.131 sec

      • avoid the use of cut (set IFS='|'; set -f; and use shell splitting with set -- $1)

        • or use IFS='|' read a x <<<"$i" (though <<< creates a temp file)

        • and, as one read is already being used, replace that read: 0.454 sec

      • reduce to only one loop (remove the if) and remove the trailing space at the end: 0.207 sec

      • make the loop tighter by joining both $((...)) expressions: 0.113 sec

        ---- shell: ~300 times faster than the original 35.023 seconds.

        ++++ This is probably the best that can be done with a shell script.

      • change to awk: 0.123 sec

        ---- awk: ~280 times faster in total
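To make the first step of that list concrete, here is a small side-by-side sketch of the two styles. The variable names are chosen for the example only; both forms compute the same values, but the first one starts an external process for every operation.

```shell
x=5147634738948389685

# Slow: each backtick substitution forks a subshell and runs the external expr.
rem_expr=`expr $x % 62`
quo_expr=`expr $x / 62`

# Fast: arithmetic expansion is evaluated by the shell itself, no fork or exec.
rem_arith=$(( x % 62 ))
quo_arith=$(( x / 62 ))

echo "expr:  $quo_expr remainder $rem_expr"
echo "arith: $quo_arith remainder $rem_arith"
```

Inside a loop over thousands of lines the process creation dominates the run time, which is consistent with the drop from 35 to 14 seconds reported above.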

      Resulting script:



      #!/usr/bin/ksh
      while IFS='|' read a b                # read both values split on '|'
      do
          x=$b                              # set value of x (quotient)
          grp_rem=""                        # clear value of group
          while (( rem=x%62 , x/=62 ))      # do both math expressions; stop when x reaches 0
          do
              grp_rem="$rem $grp_rem"       # concatenate resulting values
          done
          grp_rem=${grp_rem%?}              # remove one character (a space)
          echo "$a|$b|$rem $grp_rem"
      done < base62_while.txt >> base62_while.out


      An equivalent awk script. I don't know if this is the fastest awk script possible, but it works fine. It is faster than the shell for more than 10k lines.
      Note: This is using GNU awk with the -M option (arbitrary precision), which is a must to process numbers in the order of the 19 digits that you presented. It could process even longer numbers; I did not check how long, but I am pretty sure that the limit is pretty high. :-) Note that awk must have been compiled with that option included (check with awk 'BEGIN { print( PROCINFO["gmp_version"], PROCINFO["prec_max"]) }')



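      awk -MF'|' '{ x=$2; grp_rem="";
          while(x>0){
              rem=x%62;
              x=(x-rem)/62;        # exact division, so x stays an integer under -M
              grp_rem=rem" "grp_rem
          }
          printf("%s|%s|%s\n", $1, $2, grp_rem)   # output layout assumed: id|number|digits
      }' <base62_while.txt >>base62_while.out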























      • On a 100000-line file, I find that the gawk solution is still 3 times as fast, even though it supports arbitrary precision. You can make the ksh one faster by taking the output redirection out of the loop.
        – Stéphane Chazelas
        Aug 18 at 8:02
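The point about the output redirection can be sketched with two small functions that copy a file line by line. The names copy_slow and copy_fast are invented for the illustration; the only difference between them is where the >> sits.

```shell
# Redirection inside the loop: the output file is opened and closed once per line.
copy_slow() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$2"
    done < "$1"
}

# Redirection on the loop itself: the output file is opened once for all lines.
copy_fast() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done < "$1" >> "$2"
}
```

Both produce the same file; the second simply avoids one open and close per input line, which adds up on large inputs.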










      • This answer already says "Faster than the shell for more than 10k lines". So, yes, that is known. But what is faster for a few values? Please answer this, @StéphaneChazelas
        – Isaac
        Aug 19 at 12:54






















      edited Aug 19 at 13:28

























      answered Aug 17 at 21:07









      Isaac































      After playing for a bit with the Math::Base::Convert perl module, I came up with



      perl -F'\|' -MMath::Base::Convert -lne '
        BEGIN {
          $bc = new Math::Base::Convert(dec,b62);
          # create a mapping from internal symbol set to desired decimal representation
          $syms = $bc->b62;
          @h{@$syms} = (0..61);
        }
        print join "|", @F[0..1], (join " ", map { $h{$_} } split //, $bc->cnv($F[1]))
      ' base62_while.txt


      There may be faster perl alternatives, as discussed in "Base conversion", although I'm not sure if they have the same flexibility to manipulate the output mapping.








































          edited Aug 17 at 20:40

























          answered Aug 17 at 2:07









          steeldriver


































              With dc:



              sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' base62_while.txt | dc > base62_while.out


              Or bc (note that historical implementations of bc are actually wrappers around dc):



              sed 's/.*|\(.*\)/"&|";\1/;1s/^/obase=62;/' base62_while.txt | bc > base62_while.out


              Note that dc and bc wrap long lines of output. With the GNU implementations, you can set the DC_LINE_LENGTH and BC_LINE_LENGTH environment variables to 0 to avoid it.



              $ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | dc
              1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\
              00
              $ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | DC_LINE_LENGTH=0 dc
              1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
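The digit groups printed this way are zero-padded and preceded by a space, which differs slightly from the sample output in the question. If that matters, a small post-processing step can normalize an unwrapped line. This is a sketch with a made-up function name, normalize_line, and it assumes ksh93 or bash for the 10# base prefix.

```shell
# normalize_line 'id|number| 06 08 43': print 'id|number|6 8 43'.
# nl_prefix, nl_out and nl_d are plain global variables.
normalize_line() {
    nl_prefix=${1%|*}          # everything before the last |
    nl_out=""
    for nl_d in ${1##*|}; do   # unquoted on purpose: split the digit groups on spaces
        # 10# forces base 10, so 08 and 09 are not rejected as invalid octal
        nl_out="$nl_out${nl_out:+ }$(( 10#$nl_d ))"
    done
    printf '%s|%s\n' "$nl_prefix" "$nl_out"
}

normalize_line '1|5147634738948389685| 06 08 16 13 46 17 20 35 09 49 43'
```

For large inputs the same clean-up is probably better done in a single sed or awk pass over the whole output rather than line by line in the shell.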







































                  edited Aug 18 at 8:21

























                  answered Aug 18 at 7:54









                  Stéphane Chazelas


































                      You can do some optimizations.



                      Change



                      grp_rem=`echo $sub_rem" "$grp_rem`


                      to



                      grp_rem="$sub_rem $grp_rem"


                      Change



                      x=`echo $i |cut -d"|" -f2`


                      to



                      x="${i#*|}"


                      You probably also want to change



                      if [[ ${#quo} -ge 2 ]]


                      to



                      if [[ $quo -ge 62 ]]


                      Reducing the number of subshells a little will help. If you want more speed, use a language like C.


























                      • How do I select field 2 with x="${i#*|}"?
                        – siva krishna
                        Aug 16 at 18:25










                      • It will strip everything at the front of the value that matches the pattern *|, in other words everything up to and including the |.
                        – RalfFriedl
                        Aug 16 at 18:35
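A short illustration of that prefix stripping, together with the matching suffix form for the first field. The variable names are only for the example.

```shell
i='1|5147634738948389685'

x=${i#*|}     # drop the shortest leading match of *|  -> second field
n=${i%%|*}    # drop the longest trailing match of |*  -> first field

echo "$n"
echo "$x"
```

With more than two fields, ${i#*|} keeps everything after the first |, so it selects "field 2" only when the line has exactly two fields, as in the sample input.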
























                      answered Aug 16 at 18:19









                      RalfFriedl



































                      The shell is slow: use a different language. Here is a comparison, with 10,000 lines of input on a CentOS 7 test system, of the original KSH script (modified to use stdin and stdout), something very similar to steeldriver's Perl code (a script instead of a one-liner, which shows similar speeds to glenn jackman's native KSH version), and a LISP implementation:



                      base62.ksh 93.29s user 143.48s system 109% cpu 3:36.73 total
                      base62.perl 1.32s user 0.00s system 99% cpu 1.326 total
                      base62.sbcl 0.22s user 0.03s system 99% cpu 0.243 total


                      Obviously the original code quickly becomes impractical as the input lines increase, as will scripting languages compared to LISP with significant amounts of input. The base62.sbcl time is from a tail call recursive implementation:



                      #|
                      eval 'exec sbcl --script "$0" ${1+"$@"}'
                      |#
                      (defun divvy-r (n b l)
                        (if (< n b) (cons (truncate n) l)
                            (let ((rem (truncate (mod n b))) (quo (/ n b)))
                              (divvy-r quo b (cons rem l)))))
                      (defun divvy (n b)
                        (let ((rem (mod n b)) (quo (/ n b)))
                          (if (< quo 2)
                              (list (truncate quo) (truncate rem))
                              (divvy-r n b nil))))
                      (loop for line = (read-line *standard-input* nil) while line do
                        (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
                          (let ((out (divvy n 62)))
                            (format t "~a|~{~a~^ ~}~&" line out))))


                      Reading "Common Lisp: A Gentle Introduction to Symbolic Computation" and doing all the exercises therein is how I learned this. Slightly faster (and ever so succinct) is a do* implementation based on glenn jackman's KSH code:



                      #|
                      eval 'exec sbcl --script "$0" ${1+"$@"}'
                      |#
                      (defun remainders (n base)
                        (do* ((rem (mod n base) (mod quo base))
                              (quo (/ n base) (/ quo base))
                              (out (cons (truncate rem) nil) (cons (truncate rem) out)))
                             ((< quo base) (cons (truncate quo) out))))
                      (loop for line = (read-line *standard-input* nil) while line do
                        (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
                          (format t "~a|~{~a~^ ~}~&" line (remainders n 62))))
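For comparison, the same repeated-division loop can be written with shell built-ins only. This is a rough sketch, not glenn jackman's actual code: the function names `base62_digits` and `convert_stream` are made up for illustration, and it assumes the shell does 64-bit signed arithmetic, so input values must stay below 2^63. Unlike the original script, a value below 62 is printed as a single digit without a leading 0.

```shell
# Sketch only: function and variable names are illustrative.
# Sets $digits to the base-62 digits of $1, most significant first.
base62_digits() {
    n=$1
    digits=$(( n % 62 ))
    n=$(( n / 62 ))
    while [ "$n" -gt 0 ]; do
        digits="$(( n % 62 )) $digits"
        n=$(( n / 62 ))
    done
}

# Reads "id|number" lines on stdin; writes "id|number|digits" on stdout.
convert_stream() {
    while IFS= read -r line; do
        base62_digits "${line#*|}"      # strip through the first "|"
        printf '%s|%s\n' "$line" "$digits"
    done
}
```

It could be run as `convert_stream < base62_while.txt > base62_while.out`. The point of the sketch is that nothing inside the loop starts an external process (`expr`, `cut`, `echo` in backticks), which is where the original script spends most of its time.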



























                      • It's ironic that that script actually needs a shell to run. So, the shell is not slow as long as you're using it the right way: as a command line interpreter to run the right command for the task (as opposed to dozens of invocations of ill-fitted tools for each line of the input in the OP's attempt).
                        – Stéphane Chazelas
                        Aug 19 at 21:57










                      • SBCL can compile the script to a ~39 megabyte binary which shaves ~0.01 seconds off the execution time avoiding the shell exec. Otherwise, LISP implementations are often an awkward fit for the unix shell environment...
                        – thrig
                        Aug 19 at 22:58






















                      edited Aug 20 at 1:19

























                      answered Aug 19 at 21:22









                      thrig





























                       
