Convert key=value blocks to CSV
I am trying to transpose the contents of one file into another.

Input file Test.txt:

    HLRSN = 3
    IMSI = 404212109727229
    KIVALUE = A24AD11812232B47688ADBF15CE05CA9
    K4SNO = 1
    CARDTYPE = SIM
    ALG = COMP128_3

    HLRSN = 3
    IMSI = 404212109727230
    KIVALUE = A24AD11812232B47688ADBF15CE05CB8
    K4SNO = 1
    CARDTYPE = SIM
    ALG = COMP128_3

    HLRSN = 3
    IMSI = 404212109727231
    KIVALUE = A24AD11812232B47688ADBF15CE05CD6
    K4SNO = 1
    CARDTYPE = SIM
    ALG = COMP128_3

Output needed in another text file:

    3,404212109727229,A24AD11812232B47688ADBF15CE05CA9,1,SIM,COMP128_3
    3,404212109727230,A24AD11812232B47688ADBF15CE05CB8,1,SIM,COMP128_3
    3,404212109727231,A24AD11812232B47688ADBF15CE05CD6,1,SIM,COMP128_3

Tags: shell text-processing csv
asked Aug 1 '14 at 7:04 by user79374; edited Nov 25 at 14:36 by Rui F Ribeiro
Are the fields HLRSN, IMSI, KIVALUE, etc. always present in each input block and in the same order each time? – Gilles Aug 1 '14 at 21:15
5 Answers
4 votes (accepted)
A bash solution:

    declare -a out
    EOF=false
    IFS=$'='

    until $EOF; do
        read -r skip val || EOF=true
        if [ ! -z "$val" ]
        then
            # strip all whitespace from the value and append it to the current row
            out+=("${val//[[:space:]]/}")
        else
            # blank line or end of file: join the collected values with commas
            tmp="${out[@]}"
            printf '%s\n' "${tmp// /,}"
            out=()
        fi
    done < file

How does this work:

- Declare the array out to hold the output line, set the variable EOF to keep track of end of file, and set IFS as the input field separator for read.
- Until we reach end of file, read each line of the file and store everything after the first = in the variable val.
- if [ ! -z "$val" ]: check whether the length of $val is non-zero; if so, remove the whitespace from $val and push it onto the array out.
- If the length of $val is zero, meaning we hit a blank line or end of file, assign all elements of the array out to the variable tmp, then replace every space in tmp with a comma, our chosen output field separator.
- Reset out to an empty array for the next block.

Another solution, more concise, uses perl:

    $ perl -F'=' -anle '
        BEGIN { $, = "," }
        push @out,$F[-1] if @F;
        print @{[map {s/\s// && $_} @out]} and @out = ()
            if /^$/ or eof;
    ' file
    3,404212109727229,A24AD11812232B47688ADBF15CE05CA9,1,SIM,COMP128_3
    3,404212109727230,A24AD11812232B47688ADBF15CE05CB8,1,SIM,COMP128_3
    3,404212109727231,A24AD11812232B47688ADBF15CE05CD6,1,SIM,COMP128_3
answered Aug 1 '14 at 7:48 by cuonglm; edited Dec 16 '14 at 10:30
You can avoid the EOF variable by using while read -r skip val || [ ! -z "$skip" ] – l0b0 Aug 1 '14 at 11:30
@l0b0: Good point! Thanks. – cuonglm Aug 1 '14 at 11:38
Is skip a variable or a modifier to the read command? – Tulains Córdova Aug 1 '14 at 19:19
@user1598390: It is a variable. – cuonglm Aug 2 '14 at 3:42
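The loop form suggested in l0b0's comment can be sketched as follows. This is only an illustration, not the code from the answer: the function name kv_to_csv is made up, the row is accumulated in a plain string instead of an array so that the loop also runs in a POSIX sh, and it assumes that any whitespace around the = consists of spaces (not tabs) and that values contain no commas. Because the loop now stops as soon as read reaches end of file with nothing left to process, the last row has to be flushed after the loop.

```shell
# kv_to_csv (hypothetical name): read "KEY = value" blocks on stdin,
# write one comma-separated row per block on stdout.
kv_to_csv() {
    row=''
    # IFS=' =' makes read split at the "=" together with the spaces around it.
    # "|| [ -n "$key" ]" still processes a final line that lacks a trailing newline.
    while IFS=' =' read -r key val || [ -n "$key" ]; do
        if [ -n "$val" ]; then
            # append the value, adding a comma only if the row is not empty
            row=${row:+$row,}$val
        elif [ -n "$row" ]; then
            # blank line: the block is complete
            printf '%s\n' "$row"
            row=''
        fi
    done
    # flush the last block, which is not followed by a blank line
    if [ -n "$row" ]; then
        printf '%s\n' "$row"
    fi
}
```

It could then be called as kv_to_csv < Test.txt > output.csv.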
9 votes
Simply:

    awk -v RS= -v OFS=, '{print $3,$6,$9,$12,$15,$18}'

An empty record separator (RS=) enables paragraph mode, in which records are separated by sequences of empty lines. Inside a record, the default field separator applies (fields are separated by blanks), so in each record the fields we are interested in are the 3rd, 6th, 9th...

We change the output field separator to a comma (OFS=,) and print the fields we're interested in.
answered Aug 1 '14 at 9:16 by Stéphane Chazelas; edited Nov 15 '16 at 15:41
Great solution! If it happens to be a variable number of lines, you can make it more general with awk -v RS= -v OFS=, '{for (i=3; i<=NF; i+=3) printf "%s%s", $i, (i==NF?"\n":OFS)}' file - that is, to print every third field. – fedorqui Aug 1 '14 at 9:45
I love how the line awk -v RS= -v OFS=, '{print $3,$6,$9,$12,$15,$18}' is prefixed with the adverb "simply". – dotancohen Aug 1 '14 at 14:21
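As a small sketch of the generalisation from fedorqui's comment, the command can be wrapped in a shell function and its output redirected to the file the question asks for. The function name every_third_field is invented for this example, and the end-of-row test is written as i + 3 > NF rather than i == NF so that a row still ends with a newline if a block's field count is not a multiple of three. As with the answer itself, it assumes blank-separated "KEY = value" lines whose values contain no blanks.

```shell
# every_third_field (hypothetical name): paragraph-mode awk that prints
# fields 3, 6, 9, ... of each block, joined with commas.
every_third_field() {
    awk -v RS= -v OFS=, '{
        for (i = 3; i <= NF; i += 3)
            # end the row after the last value of the block
            printf "%s%s", $i, (i + 3 > NF ? "\n" : OFS)
    }' "$@"
}
```

For the question's data this would be run as every_third_field Test.txt > output.csv.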
3 votes
Save the following to a file (e.g. split.awk):

    BEGIN {
        RS="\n\n";
        FS="\n";
        ORS=",";
    }
    {
        for (i=1;i<=NF;i++) {
            split($i, sf, "= ")
            print sf[2]
        }
        printf "\n"
    }

Then run:

    awk -f split.awk Test.txt

Or run the whole command as one line:

    awk 'BEGIN {RS="\n\n";FS="\n";ORS=","} {for(i=1;i<=NF;i++) {split($i, sf, "= "); print sf[2]} printf "\n"}' Test.txt

It works as follows:

The BEGIN block runs once at the start and sets the record separator (RS) to two newlines and the field separator (FS) to a single newline. The output record separator (ORS) is set to a comma. It then loops through each field in the record (NF is the number of fields in the current record) and splits it on "= ". It then prints the right-hand side of this split with a comma after each (the ORS). After each record it prints a newline to give you the CSV format.
answered Aug 1 '14 at 8:27 by garethTheRed
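Because ORS is a comma, every value printed by the script above is followed by a comma, including the last one on each line. A possible variant is sketched below; it is not the answer's own code. It builds each row explicitly and uses paragraph mode (RS set to the empty string) instead of a two-character RS, whose behaviour POSIX leaves unspecified. The function name split_blocks is made up for the example, and it keeps the answer's assumption that each line contains "= " before the value.

```shell
# split_blocks (hypothetical name): one CSV row per blank-line-separated block.
split_blocks() {
    awk 'BEGIN { RS = ""; FS = "\n" }   # paragraph mode, one field per line
    {
        row = ""
        for (i = 1; i <= NF; i++) {
            split($i, sf, "= ")          # sf[2] is the text after "= "
            sep = (i > 1 ? "," : "")     # no comma before the first value
            row = row sep sf[2]
        }
        print row                        # default ORS: newline, no trailing comma
    }' "$@"
}
```

Usage would be split_blocks Test.txt > output.csv.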
1 vote
Based on the other answers... I needed a slight modification, because I needed the column headers in the final CSV file. This is my awk solution:

    awk -v RS= -v FS='\n' '
    NR==1 {
        # first record only: print the key names as the header row
        # (splitting on / *= */ keeps trailing blanks out of the names)
        for (i=1; i<=NF; i++) {
            split($i,line,/ *= */);
            printf "%s%s", line[1], (i==NF ? "\n" : ",");
        }
    }
    {
        # every record: print the values
        for (i=1; i<=NF; i++) {
            split($i,line,/ *= */);
            printf "%s%s", line[2], (i==NF ? "\n" : ",");
        }
    }'

As in the other answers, this reads an entire block (up to a blank line) as one record whose fields are its lines, then splits each field again with split(). The only difference is that for the first record it first prints the part before the '=' of each line; the second for-loop then prints the right-hand side for all records.
answered Dec 3 '16 at 16:16 by Dweia
0 votes
If all the blocks have exactly the same format (same field names, in the same order), then you can use awk in “paragraph mode” and print the desired fields of each block. If there are always spaces around the equal signs and the values never contain spaces, you can rely on whitespace-separated fields.

    awk -v RS= -v OFS=',' '{print $3, $6, $9, $12, $15, $18}'

If you can rely on the order and presence of fields but not on the whitespace, then you'll need a bit of parsing to split at the equal signs.

    awk -v RS= -F '\n' '{
        for (i = 1; i <= NF; i++) {
            sub(/[^=]*= */, "", $i);
            printf "%s%s", $i, (i==NF ? "\n" : ",");
        }
    }'

Here's a Perl method:

    perl -000 -ne '
        $, = ","; $\ = "\n";
        @kv = split /\n| *= */;
        print @kv[grep {$_ % 2} 0..$#kv];
    '

If the fields in the blocks can come out of order and you always want a specific order as output, you'll need to store the fields and print them out in the right order at the end of each paragraph. In awk, this is easier to do in line mode than in paragraph mode.

    awk -v OFS=',' '
        match($0, / *= */) {a[substr($0,1,RSTART-1)] = substr($0,RSTART+RLENGTH)}
        /^$/ {print a["HLRSN"], a["IMSI"], a["KIVALUE"], a["K4SNO"], a["CARDTYPE"], a["ALG"]; split("", a)}
        END {print a["HLRSN"], a["IMSI"], a["KIVALUE"], a["K4SNO"], a["CARDTYPE"], a["ALG"]}
    '

This one is a one-liner in Perl.

    perl -000 -F'/\n|\s*=\s*/' -ane '%F = @F; $\ = "\n"; $, = ","; print @F{qw(HLRSN IMSI KIVALUE K4SNO CARDTYPE ALG)}'
answered Aug 2 '14 at 0:28 by Gilles
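The line-mode approach for out-of-order fields can be sketched with the column list passed in as a parameter rather than written out twice. This is an illustrative variant, not the answer's code: the function name kv_blocks_to_csv, the awk variable cols and the helper emit are all invented here, and the sketch assumes that key names contain no commas. The seen flag is an addition so that a trailing blank line or repeated blank lines do not produce empty rows.

```shell
# kv_blocks_to_csv (hypothetical name)
# usage: kv_blocks_to_csv 'COL1,COL2,...' [file...]
kv_blocks_to_csv() {
    cols=$1
    shift
    awk -v cols="$cols" '
        BEGIN { n = split(cols, name, ",") }
        # print the stored values in the requested order, then reset
        function emit(   i, row) {
            if (!seen) return
            row = a[name[1]]
            for (i = 2; i <= n; i++) row = row "," a[name[i]]
            print row
            split("", a)     # clear the array for the next block
            seen = 0
        }
        # "KEY = value" line: remember the value under its key
        match($0, / *= */) {
            a[substr($0, 1, RSTART - 1)] = substr($0, RSTART + RLENGTH)
            seen = 1
        }
        /^$/ { emit() }      # a blank line ends the block
        END  { emit() }      # the last block may not be followed by one
    ' "$@"
}
```

For the question's data the call would be kv_blocks_to_csv 'HLRSN,IMSI,KIVALUE,K4SNO,CARDTYPE,ALG' Test.txt > output.csv.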