How can I increase the performance of the code below?

Below is my script, which has a lot of performance issues:
#!/usr/bin/ksh
while read i
do
    x=`echo $i |cut -d"|" -f2`
    rem=`expr $x % 62`
    echo "reminder is " $rem
    quo=`expr $x / 62`
    echo "quotiont is " $quo
    grp_rem=" "
    if [[ ${#quo} -ge 2 ]]
    then
        while [ $quo -ge 62 ]
        do
            sub_rem=`expr $quo % 62`
            quo=`expr $quo / 62`
            grp_rem=`echo $sub_rem" "$grp_rem`
        done
    fi
    echo $i"|"$quo" "$grp_rem" "$rem >> base62_while.out
done < base62_while.txt
Is there any way I can improve the performance of the above script?
Sample input:
1|5147634738948389685
Sample output:
1|5147634738948389685|6 8 16 13 46 17 20 35 9 49 43
linux ksh
It's a ksh script. Does an optimised alternative have to be ksh or would a different shell (such as bash) be acceptable?
– roaima
Aug 16 at 18:26
No, I am not using AIX; ksh and bash are both fine.
– siva krishna
Aug 16 at 18:28
Related: Why is using a shell loop to process text considered bad practice?
– Stéphane Chazelas
Aug 16 at 19:17
@StéphaneChazelas The script I optimized seems to be quite fast; this may be a counterexample to your claim.
– Isaac
Aug 17 at 21:52
edited Aug 20 at 6:25 by muru
asked Aug 16 at 17:44 by siva krishna
7 Answers
Score: 3 (accepted)
This should be considerably faster:
#!/usr/bin/ksh
#
while IFS='|' read n x
do
    base62="$(echo "obase=62; $x" | bc | sed -re 's/ 0/ /g' -e 's/^ //')"
    printf "%d|%s|%s\n" $n "$x" "$base62"
done <base62_while.txt >>base62_while.out
The base62 line uses bc to convert the decimal source number into its base 62 equivalent. For an output base above 16, bc prints each digit as a zero-padded two-digit decimal number, from which we strip any leading zero (i.e. 02 is rewritten as 2, but 45 is left unchanged).
Input
1|5147634738948389685
Output
1|5147634738948389685|6 8 16 13 46 17 20 35 9 49 43
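The zero-stripping step can be tried on its own, without bc installed. The following is only a sketch: strip_padding is a hypothetical helper name, and the input string imitates the space-separated, zero-padded groups described above. It uses plain sed -e, since neither expression needs the extended syntax that -r enables.

```sh
# strip_padding STRING: remove the zero padding from two-digit groups.
# Hypothetical helper, not part of the answer above.
strip_padding() {
    # ' 0' -> ' ' turns " 06" into " 6" (and " 00" into " 0");
    # the second expression drops the single leading space.
    printf '%s\n' "$1" | sed -e 's/ 0/ /g' -e 's/^ //'
}

strip_padding ' 06 08 16 13 46 17 20 35 09 49 43'
# prints: 6 8 16 13 46 17 20 35 9 49 43
```

A digit of zero survives as a single 0, because the space is consumed by the first match and the second 0 no longer has a space in front of it.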
I like this - especially the clever use of bc's default way of handling obase > 16
– steeldriver
Aug 17 at 1:56
How do I specify which field base62 should be applied to? Suppose I have 5 fields and base62 should be applied to the 3rd field.
– siva krishna
Aug 17 at 13:31
Provide a proper example data file (in your question) and I can amend the code here for you. But to answer your specific enquiry, the read n x reads two values separated by the | character. You could have read a b x c d if you wanted, and then write them out with the printf at the bottom of the loop.
– roaima
Aug 17 at 13:39
@roaima I got this. Actually, I want to assign static alphabetic values to the remainder, sub-remainder and quotient; I will post the mapping here in the next comment. One more thing: I want to remove the spaces from the output, so that it reads 6816134617203594943 instead of 6 8 16 13 46 17 20 35 9 49 43.
– siva krishna
Aug 17 at 13:46
if rem=1 then rem=a; if rem=2 then rem=b; if rem=3 then rem=c, and so on through a-z and A-Z, for quo, rem and sub_rem, so that I can reduce the output length. I am unable to post the complete list of static values as it is too long.
– siva krishna
Aug 17 at 14:03
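One possible way to get the compact, space-free output asked about in the comments is to map each base-62 digit to a single character. This is a sketch under stated assumptions: the symbol order in ALPHABET (0-9, then a-z, then A-Z) is my own choice, since the thread never fixes one; encode_base62 is a hypothetical name; and the ${var:offset:length} expansion needs ksh93 or bash rather than a strictly POSIX sh.

```sh
# Assumed symbol order: 0-9, a-z, A-Z (62 characters in total).
ALPHABET=0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

# encode_base62 N: print the non-negative integer N as base-62 text.
# N must fit the shell's integer arithmetic (64-bit in current bash).
encode_base62() {
    n=$1
    out=
    while :; do
        # Prepend the character for the current least significant digit.
        out=${ALPHABET:$(( n % 62 )):1}$out
        n=$(( n / 62 ))
        [ "$n" -gt 0 ] || break
    done
    printf '%s\n' "$out"
}

encode_base62 5147634738948389685
# prints: 68gdKhkz9NH
```

Using one character per digit keeps the result unambiguous. Simply deleting the spaces from 6 8 16 13 ... would not, because 6816 could then be read as 6 8 16 or as 68 16.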
Score: 5
You don't need to call out to any external tools: ksh can do arithmetic. I'm also using an array to store the remainders.
#!/usr/bin/ksh
div=62
while IFS='|' read -r n x; do
    rem=$(( x % div ))
    quo=$(( x / div ))
    echo "reminder is $rem" >&2
    echo "quotiont is $quo" >&2
    remainders=( $rem )
    while (( quo >= div )); do
        sub_rem=$(( quo % 62 ))
        quo=$(( quo / 62 ))
        echo "reminder is $sub_rem" >&2
        echo "quotiont is $quo" >&2
        remainders=( $sub_rem "${remainders[@]}" )
    done
    echo "$n|$x|$quo ${remainders[*]}"
    x=$quo
    for r in "${remainders[@]}"; do
        x=$(( x * div + r ))
    done
    echo Verification: $x
done <<END
1|5147634738948389685
END
When I pass a file name instead of 1|5147634738948389685, it is not reading anything. How do I pass a file name to this script?
– siva krishna
Aug 20 at 15:15
Change the heredoc <<END ... END to a simple redirection < filename
– glenn jackman
Aug 20 at 20:31
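The redirection suggested in that comment can be sketched as a function that takes the file name as its argument. This is a simplified variant, not the answer's exact script: convert_file is a hypothetical name, the debug output and the array are left out, and the digits are collected in a plain string.

```sh
# convert_file FILE: read "id|number" lines from FILE and print
# "id|number|digits", where digits are the base-62 digits in decimal.
convert_file() {
    while IFS='|' read -r n x; do
        digits=$(( x % 62 ))          # least significant digit
        quo=$(( x / 62 ))
        while [ "$quo" -gt 0 ]; do
            digits="$(( quo % 62 )) $digits"
            quo=$(( quo / 62 ))
        done
        printf '%s|%s|%s\n' "$n" "$x" "$digits"
    done < "$1"                       # redirection replaces the heredoc
}

# Usage (file names taken from the question):
#   convert_file base62_while.txt > base62_while.out
```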
Score: 3
There are several things that could be done (and speed gained):
- original script on 1000 numbers: 35.023 sec
- replace all the expr commands with arithmetic expansions $((x%62)): 14.473 sec
- convert grp_rem=`echo $sub_rem" "$grp_rem` to grp_rem="$sub_rem $grp_rem": 3.131 sec
- avoid the use of cut (set IFS to '|', run set -f, and let the shell split with set -- $1): 0.454 sec
  - or use IFS='|' read a x <<<"$i" (though <<< creates a temp file)
  - and, as one read is already being used, replace that read.
- reduce to only one loop (remove the if) and remove the trailing space at the end: 0.207 sec
- make the loop tighter by joining both $((...)): 0.113 sec
---- shell: ~300 times faster than the original 35.023 seconds.
++++ This is probably the best that can be done with a shell script.
- change to awk: 0.123 sec
---- awk: ~280 times faster in total
Resulting script:
#!/usr/bin/ksh
while IFS='|' read a b                  # read both values split on '|'
do
    x=$b                                # set value of x (quotient)
    grp_rem=""                          # clear value of group
    while (( rem=x%62 , x/=62 ))        # do both math expressions.
    do
        grp_rem="$rem $grp_rem"         # concatenate resulting values
    done
    grp_rem=${grp_rem%?}                # remove one character (a space)
    echo "$a|$b|$rem $grp_rem"
done < base62_while.txt >> base62_while.out
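When a line is already held in a variable, the cut-free splitting mentioned in the list of steps can be sketched as below. The names split_fields, f1, f2 and old_ifs are hypothetical, and the sketch assumes IFS was set (not unset) beforehand and that globbing was enabled before the call.

```sh
# split_fields LINE: split LINE on '|' with the shell's own word splitting
# and leave the first two pieces in f1 and f2.
split_fields() {
    old_ifs=$IFS
    set -f              # no globbing, so * or ? in the data stay literal
    IFS='|'
    set -- $1           # unquoted on purpose: split on IFS
    IFS=$old_ifs
    set +f              # assumes globbing was on before the call
    f1=$1
    f2=$2
}

split_fields '1|5147634738948389685'
echo "$f2"
# prints: 5147634738948389685
```

No external process is started here, which is where the saving over cut comes from.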
An awk script equivalent. I don't know if this is the fastest awk script possible, but it works fine. It is faster than the shell for more than 10k lines.
Note: This is using GNU awk with the -M option (arbitrary precision), which is a must to process numbers on the order of 19 digits like the one you presented. It could process even longer numbers; I did not check how long, but I am pretty sure that the limit is pretty high. :-) Note that awk must have been compiled with that option included (check with awk 'BEGIN{ print( PROCINFO["gmp_version"], PROCINFO["prec_max"]) }')
awk -MF'|' '{ x=$2; grp_rem="";
              while(x>0){
                  rem=x%62;
                  x=int(x/62);
                  grp_rem=rem" "grp_rem
              }
              # The printf below is cut off in the source after "%-22s";
              # the rest of the format and its arguments are an assumed reconstruction.
              printf("%-22s|%s\n", $0, grp_rem)
            }
           ' <base62_while.txt >>base62_while.out
On a 100000 line file, I find that the gawk solution is still 3 times as fast even though it supports arbitrary precision. You can make the ksh one faster by taking the output redirection out of the loop.
– Stéphane Chazelas
Aug 18 at 8:02
This answer already says "Faster than the shell for more than 10k lines", so yes, that is known. But which is faster for a few values? Please answer this, @StéphaneChazelas.
– Isaac
Aug 19 at 12:54
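The point about taking the output redirection out of the loop can be illustrated with two loops that produce the same file. This is only a sketch: append_per_line and append_once are hypothetical names, and the loop body just copies lines so that the difference in redirection is the only variable.

```sh
# append_per_line IN OUT: the output file is opened and closed once per line.
append_per_line() {
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$2"
    done < "$1"
}

# append_once IN OUT: the output file is opened once for the whole loop.
append_once() {
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done < "$1" >> "$2"
}

# Usage:
#   append_once base62_while.txt base62_while.out
```

The second form does one open for the whole run instead of one per input line, which is why it tends to be cheaper on large inputs.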
Score: 2
After playing for a bit with the Math::Base::Convert perl module I came up with
perl -F'\|' -MMath::Base::Convert -lne '
  BEGIN {
    $bc = new Math::Base::Convert(dec,b62);
    # create a mapping from internal symbol set to desired decimal representation
    $syms = $bc->b62;
    @h{@$syms} = (0..61);
  }
  print join "|", @F[0..1], (join " ", map { $h{$_} } split //, $bc->cnv($F[1]))
' base62_while.txt
There may be faster perl alternatives, as discussed here: Base conversion, although I'm not sure if they have the same flexibility to manipulate the output mapping.
Score: 2
With dc:
sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' base62_while.txt | dc > base62_while.out
Or bc (note that historical implementations of bc are actually wrappers around dc):
sed 's/.*|\(.*\)/"&|";\1/;1s/^/obase=62;/' base62_while.txt | bc > base62_while.out
Note that dc and bc wrap long lines of output. With the GNU implementations, you can set the DC_LINE_LENGTH and BC_LINE_LENGTH environment variables to 0 to avoid it.
$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\
 00
$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | DC_LINE_LENGTH=0 dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Score: 1
You can do some optimizations.
Change
grp_rem=`echo $sub_rem" "$grp_rem`
to
grp_rem="$sub_rem $grp_rem"
Change
x=`echo $i |cut -d"|" -f2`
to
x="${i#*|}"
You probably also want to change
if [[ ${#quo} -ge 2 ]]
to
if [[ $quo -ge 62 ]]
Reducing the number of subshells a little will help. If you want more speed, use a language like C.
How do I select field 2 with x="${i#*|}"?
– siva krishna
Aug 16 at 18:25
It will strip everything at the front of the value that matches the pattern *|, in other words everything up to and including the |.
– RalfFriedl
Aug 16 at 18:35
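The parameter expansion explained in that comment can be tried directly on the sample line from the question. This is a small sketch; the variable id and the use of %% for the first field are my additions for illustration.

```sh
i='1|5147634738948389685'

x=${i#*|}      # remove the shortest leading match of "*|": keeps field 2
id=${i%%|*}    # remove the longest trailing match of "|*": keeps field 1

echo "$id"
echo "$x"
# prints:
# 1
# 5147634738948389685
```

With more than two fields, ${i#*|} keeps everything after the first |, so it returns field 2 together with all later fields rather than field 2 alone.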
Score: 1
The shell is slow: use a different language. If we compare the original KSH script (modified to use stdin and stdout), something very similar to steeldriver's Perl code (a script instead of a one-liner, which shows similar speeds to glenn jackman's native KSH version), and a LISP implementation, with 10,000 lines of input on a CentOS 7 test system:
base62.ksh   93.29s user 143.48s system 109% cpu 3:36.73 total
base62.perl   1.32s user   0.00s system  99% cpu 1.326 total
base62.sbcl   0.22s user   0.03s system  99% cpu 0.243 total
Obviously the original code quickly becomes impractical as the number of input lines increases, as will scripting languages compared to LISP with significant amounts of input. The base62.sbcl time is from a tail call recursive implementation:
#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun divvy-r (n b l)
  (if (< n b) (cons (truncate n) l)
      (let ((rem (truncate (mod n b))) (quo (/ n b)))
        (divvy-r quo b (cons rem l)))))
(defun divvy (n b)
  (let ((rem (mod n b)) (quo (/ n b)))
    (if (< quo 2)
        (list (truncate quo) (truncate rem))
        (divvy-r n b nil))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (let ((out (divvy n 62)))
      (format t "~a|~{~a~^ ~}~&" line out))))
Reading "Common Lisp: A Gentle Introduction to Symbolic Computation" and doing all the exercises therein is how I learned this. Slightly faster (and ever so succinct) is a do* implementation based on glenn jackman's KSH code:
#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun remainders (n base)
  (do* ((rem (mod n base) (mod quo base))
        (quo (/ n base) (/ quo base))
        (out (cons (truncate rem) nil) (cons (truncate rem) out)))
       ((< quo base) (cons (truncate quo) out))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (format t "~a|~{~a~^ ~}~&" line (remainders n 62))))
It's ironic that that script actually needs a shell to run. So, the shell is not slow as long as you're using it the right way: as a command line interpreter to run the right command for the task (as opposed to dozens of invocations of ill-fitted tools for each line of the input, as in the OP's attempt).
– Stéphane Chazelas
Aug 19 at 21:57
SBCL can compile the script to a ~39 megabyte binary, which shaves ~0.01 seconds off the execution time by avoiding the shell exec. Otherwise, LISP implementations are often an awkward fit for the unix shell environment...
– thrig
Aug 19 at 22:58
Accepted answer (bc with obase=62): answered Aug 16 at 20:03 by roaima
Answer using ksh arithmetic and an array of remainders: answered Aug 16 at 18:42 by glenn jackman, edited Aug 16 at 19:04
When I am passing file name instead of 1|5147634738948389685 it is reading anything. How do I pass a file name in this script
â siva krishna
Aug 20 at 15:15
change the heredoc<<END ... ENDto a simple redirection< filename
â glenn jackman
Aug 20 at 20:31
add a comment |Â
When I am passing file name instead of 1|5147634738948389685 it is reading anything. How do I pass a file name in this script
â siva krishna
Aug 20 at 15:15
change the heredoc<<END ... ENDto a simple redirection< filename
â glenn jackman
Aug 20 at 20:31
When I am passing file name instead of 1|5147634738948389685 it is reading anything. How do I pass a file name in this script
â siva krishna
Aug 20 at 15:15
When I am passing file name instead of 1|5147634738948389685 it is reading anything. How do I pass a file name in this script
â siva krishna
Aug 20 at 15:15
change the heredoc
<<END ... END to a simple redirection < filenameâ glenn jackman
Aug 20 at 20:31
change the heredoc
<<END ... END to a simple redirection < filenameâ glenn jackman
Aug 20 at 20:31
add a comment |Â
up vote
3
down vote
There are several things that could be done (and speed gained):
- original on 1000 numbers
35.023 sec - replace all the expr commands with arithmetic expansions $((x%62))
14.473 - convert
grp_rem=`echo $sub_rem" "$grp_rem`togrp_rem="$sub_rem $grp_rem"
3.131 - avoid the use of cut (
set IFS='|'; set -f; and use shell split withset -- $1)- or use
IFS='|' read a x <<<"$i"(though<<<creates a temp file) - and as one read is already being used, replace that read.
0.454
- or use
- reduce to only one loop (remove the if) and remove trailing space at the end
0.207 - Make the loop tighter Join both
$((...))
0.113
---- shell: a change of ~300 times faster than 35.023 seconds.
++++ This is probably the best that can be done with a shell script. - change to awk
0.123
---- awk: a total change of ~280 times faster
Resulting script:
#!/usr/bin/ksh
while IFS='|' read a b # read both values split on '|'
do
x=$b # set value of x (quotient)
grp_rem="" # clear value of group
while (( rem=x%62 , x/=62 )) # do both math expressions.
do
grp_rem="$rem $grp_rem" # concatenate resulting values
done
grp_rem=$grp_rem%? # remove one character (an space)
echo "$a|$b|$rem $grp_rem"
done < base62_while.txt >> base62_while.out
An awk script equivalent. I don't know if this is the faster awk script possible, but works fine. Faster than the shell for more than 10k lines.
Note: This is using GNU awk with the option of -M (arbitrary precision) which is a must to process numbers in the order of 19 digits that you presented. It could process even longer numbers, I did not check how long, but I am pretty sure that the limit is pretty high. :-) Note that awk must have been compiled with that option included (check with awk 'BEGIN print( PROCINFO["gmp_version"], PROCINFO["prec_max"]) ')
awk -MF'|' ' x=$2; grp_rem="";
while(x>0)
rem=x%62;
x=int(x/62);
grp_rem=rem" "grp_rem
printf("%-22s
' <base62_while.txt >>base62_while.out
1
On a 100000 line file, I find that that gawk solution is still 3 times as fast even though it supports arbitrary precision. You can make the ksh one faster by taking the output redirection out of the loop.
â Stéphane Chazelas
Aug 18 at 8:02
This answer already has this Faster than the shell for more than 10k lines. So, yes, that is known. But what is faster for a few values? Please answer this @StéphaneChazelas
â Isaac
Aug 19 at 12:54
edited Aug 19 at 13:28
answered Aug 17 at 21:07
Isaac
up vote
2
down vote
After playing for a bit with the Math::Base::Convert Perl module, I came up with
perl -F'\|' -MMath::Base::Convert -lane '
  BEGIN {
    $bc = new Math::Base::Convert(dec,b62);
    # create a mapping from internal symbol set to desired decimal representation
    $syms = $bc->b62;
    @h{@$syms} = (0..61);
  }
  print join "|", @F[0..1], (join " ", map { $h{$_} } split //, $bc->cnv($F[1]))
' base62_while.txt
There may be faster Perl alternatives, as discussed in "Base conversion", although I'm not sure whether they have the same flexibility to manipulate the output mapping.
edited Aug 17 at 20:40
answered Aug 17 at 2:07
steeldriver
up vote
2
down vote
With dc:
sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' base62_while.txt | dc > base62_while.out
Or bc (note that historical implementations of bc are actually wrappers around dc):
sed 's/.*|\(.*\)/"&|";\1/;1s/^/obase=62;/' base62_while.txt | bc > base62_while.out
Note that dc and bc wrap long lines of output. With the GNU implementations, you can set the DC_LINE_LENGTH and BC_LINE_LENGTH environment variables to 0 to avoid it.
$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00\
 00
$ echo '1|167883826163764944817996215305490271305728' | sed 's/.*|\(.*\)/[&|]P\1p/;1s/^/62o/' | DC_LINE_LENGTH=0 dc
1|167883826163764944817996215305490271305728| 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
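To make the sed step easier to follow, here is a small sketch that only generates the bc program and leaves running bc to the caller. The function name make_bc_program is hypothetical; the sed expression is the bc variant above, and the input is assumed to be "id|number" lines.

```shell
# Hypothetical helper: turn "id|number" lines on stdin into a bc program.
# Each line becomes a quoted string (printed verbatim by bc) followed by the
# number itself; the first line is prefixed with obase=62 so that bc prints
# every number in base 62.
make_bc_program() {
    sed 's/.*|\(.*\)/"&|";\1/;1s/^/obase=62;/'
}
```

For the input line 1|5147634738948389685 this emits obase=62;"1|5147634738948389685|";5147634738948389685. With GNU bc the whole pipeline would then be make_bc_program < base62_while.txt | BC_LINE_LENGTH=0 bc > base62_while.out.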
edited Aug 18 at 8:21
answered Aug 18 at 7:54
Stéphane Chazelas
up vote
1
down vote
You can do some optimizations.
Change
grp_rem=`echo $sub_rem" "$grp_rem`
to
grp_rem="$sub_rem $grp_rem"
Change
x=`echo $i |cut -d"|" -f2`
to
x="${i#*|}"
You probably also want to change
if [[ ${#quo} -ge 2 ]]
to
if [[ $quo -ge 62 ]]
Reducing the number of subshells a little will help. If you want more speed, use a language like C.
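As a minimal sketch of the parameter-expansion replacement for cut (the function name split_fields and the variable names id and num are hypothetical, and the line is assumed to contain exactly one | separator):

```shell
# Hypothetical helper: split "id|number" into the global variables id and num
# without starting cut or any other external command.
split_fields() {
    line=$1
    id=${line%%|*}    # remove the longest suffix that starts at the first |
    num=${line#*|}    # remove the shortest prefix that ends at the first |
}
```

After split_fields '1|5147634738948389685', id holds 1 and num holds 5147634738948389685.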
How do I select field 2 from x="${i#*|}"?
– siva krishna
Aug 16 at 18:25
It will strip everything at the front of the value that matches the pattern *|, in other words everything up to and including the |.
– RalfFriedl
Aug 16 at 18:35
answered Aug 16 at 18:19
RalfFriedl
up vote
1
down vote
The shell is slow: use a different language. Compare the original KSH script (modified to use stdin and stdout), something very similar to steeldriver's Perl code (as a script instead of a one-liner; it shows speeds similar to glenn jackman's native KSH version), and a LISP implementation, each run on 10,000 lines of input on a CentOS 7 test system:
base62.ksh 93.29s user 143.48s system 109% cpu 3:36.73 total
base62.perl 1.32s user 0.00s system 99% cpu 1.326 total
base62.sbcl 0.22s user 0.03s system 99% cpu 0.243 total
Obviously the original code quickly becomes impractical as the input lines increase, as will scripting languages compared to LISP with significant amounts of input. The base62.sbcl time is from a tail call recursive implementation:
#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun divvy-r (n b l)
  (if (< n b) (cons (truncate n) l)
      (let ((rem (truncate (mod n b))) (quo (/ n b)))
        (divvy-r quo b (cons rem l)))))
(defun divvy (n b)
  (let ((rem (mod n b)) (quo (/ n b)))
    (if (< quo 2)
        (list (truncate quo) (truncate rem))
        (divvy-r n b nil))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (let ((out (divvy n 62)))
      (format t "~a|~{~a~^ ~}~&" line out))))
Reading "Common Lisp: A Gentle Introduction to Symbolic Computation" and doing all the exercises therein is how I learned this. Slightly faster (and ever so succinct) is a do* implementation based on glenn jackman's KSH code:
#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(defun remainders (n base)
  (do* ((rem (mod n base) (mod quo base))
        (quo (/ n base) (/ quo base))
        (out (cons (truncate rem) nil) (cons (truncate rem) out)))
       ((< quo base) (cons (truncate quo) out))))
(loop for line = (read-line *standard-input* nil) while line do
  (let ((n (parse-integer (subseq line (1+ (position #\| line))))))
    (format t "~a|~{~a~^ ~}~&" line (remainders n 62))))
It's ironic that that script actually needs a shell to run. So, the shell is not slow as long as you're using it the right way: as a command line interpreter to run the right command for the task (as opposed to dozens of invocations of ill-fitted tools for each line of the input in the OP's attempt).
– Stéphane Chazelas
Aug 19 at 21:57
SBCL can compile the script to a ~39 megabyte binary, which shaves ~0.01 seconds off the execution time by avoiding the shell exec. Otherwise, LISP implementations are often an awkward fit for the unix shell environment...
– thrig
Aug 19 at 22:58
edited Aug 20 at 1:19
answered Aug 19 at 21:22
thrig