How to parse the file
Clash Royale CLAN TAG#URR8PPP
up vote
-1
down vote
favorite
I have the below file1.txt. What I want to do is to take the value1 until value7 and output it in one row. The value will be scanned between the word "Start" and "End". In case the label/value is missing, the output will show "NA"
Please see the wanted output.txt below.
In short, I want to copy the values between Start and End and output in one line. If value label doesn't exist , the value will show NA. And continously scan the value for another record (Start till End) until enf og the file1.txt.
file1.txt
Start
label1 label2 label3 label4
value1 value2 value3 value4
label5
value5
label6 label7
value6 value7
End
Start
label1 label2 label4
valueA valueB valueD
label5
valueE
label6
valueF
End
Start
.
.
.
End
output.txt
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
text-processing
add a comment |Â
up vote
-1
down vote
favorite
I have the below file1.txt. What I want to do is to take the value1 until value7 and output it in one row. The value will be scanned between the word "Start" and "End". In case the label/value is missing, the output will show "NA"
Please see the wanted output.txt below.
In short, I want to copy the values between Start and End and output in one line. If value label doesn't exist , the value will show NA. And continously scan the value for another record (Start till End) until enf og the file1.txt.
file1.txt
Start
label1 label2 label3 label4
value1 value2 value3 value4
label5
value5
label6 label7
value6 value7
End
Start
label1 label2 label4
valueA valueB valueD
label5
valueE
label6
valueF
End
Start
.
.
.
End
output.txt
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
text-processing
Did you try anything ?
â Kiwy
May 9 at 14:09
add a comment |Â
up vote
-1
down vote
favorite
up vote
-1
down vote
favorite
I have the below file1.txt. What I want to do is to take the value1 until value7 and output it in one row. The value will be scanned between the word "Start" and "End". In case the label/value is missing, the output will show "NA"
Please see the wanted output.txt below.
In short, I want to copy the values between Start and End and output in one line. If value label doesn't exist , the value will show NA. And continously scan the value for another record (Start till End) until enf og the file1.txt.
file1.txt
Start
label1 label2 label3 label4
value1 value2 value3 value4
label5
value5
label6 label7
value6 value7
End
Start
label1 label2 label4
valueA valueB valueD
label5
valueE
label6
valueF
End
Start
.
.
.
End
output.txt
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
text-processing
I have the below file1.txt. What I want to do is to take the value1 until value7 and output it in one row. The value will be scanned between the word "Start" and "End". In case the label/value is missing, the output will show "NA"
Please see the wanted output.txt below.
In short, I want to copy the values between Start and End and output in one line. If value label doesn't exist , the value will show NA. And continously scan the value for another record (Start till End) until enf og the file1.txt.
file1.txt
Start
label1 label2 label3 label4
value1 value2 value3 value4
label5
value5
label6 label7
value6 value7
End
Start
label1 label2 label4
valueA valueB valueD
label5
valueE
label6
valueF
End
Start
.
.
.
End
output.txt
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
text-processing
edited May 9 at 13:41
Jeff Schaller
31.1k846105
31.1k846105
asked May 9 at 13:39
user290080
1
1
Did you try anything ?
â Kiwy
May 9 at 14:09
add a comment |Â
Did you try anything ?
â Kiwy
May 9 at 14:09
Did you try anything ?
â Kiwy
May 9 at 14:09
Did you try anything ?
â Kiwy
May 9 at 14:09
add a comment |Â
1 Answer
1
active
oldest
votes
up vote
0
down vote
This Python script should do what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""parse.py
Parses a custom-format data-file.
Processes the file first and then prints the results.
"""
import sys
# Read the data from the file
file = open(sys.argv[1], 'r')
# Initialize a dictionary to collect the values for each label
labels =
# Initialize a stack to keep track of block state
stack =
# Initialize a counter to count the number of blocks
block = 0
# Process the file
line = file.readline()
while line:
# Remove white-space
line = line.strip()
# The stack should be empty when we start a new block
if line.lower() == "start":
if stack:
raise Exception("Invalid File Format: Bad Start")
else:
stack.append(line)
# Otherwise the bottom of the stack should be a "Start"
# When we reach the end of a block we empty the stack
# end increment the block counter
elif line.lower() == "end":
if stack[0].lower() != "start":
raise Exception("Invalid File Format: Bad End")
else:
block += 1
stack =
# Other lines should come in consecutive label/value pairs
# i.e. a value row should follow a label row
elif line:
# If there are an odd number of data rows in the stack then
# the current row should be a value row - check that it matches
# the corresponding label row
if len(stack[1:])%2==1:
_labels = stack[-1].split()
_values = line.split()
# Verify that the label row and value row have the same number
# of columns
if len(_labels) == len(_values):
stack.append(line)
for label, value in zip(_labels, _values):
# Add new labels to the labels dictionary
if label not in labels:
labels[label] =
"cols": len(label)
# Add the value for the current block
labels[label][block] = value
# Keep track of the longest value for each label
# so we can format the output later
if len(value) > labels[label]["cols"]:
labels[label]["cols"] = len(value)
else:
raise Exception("Invalid File Format: Label/Value Mismatch")
# If there are an even number of data rows in the stack then
# the current row should be a label row - append it to the stack
else:
stack.append(line)
# Read the next line
line = file.readline()
# Construct the header row
header = ""
for label in labels:
cols = labels[label]["cols"]
header += "0: <width".format(label, width=cols+1)
# Construct the data rows
rows =
for i in range(0, block):
row = ""
for label in labels:
cols = labels[label]["cols"]
row += "0: <width".format(labels[label].get(i, "NA"), width=cols+1)
rows.append(row)
# Print the results
print(header)
for row in rows:
print(row)
You can run it like this:
python parse.py file1.txt
It produces the following output on your example data:
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
add a comment |Â
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
This Python script should do what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""parse.py
Parses a custom-format data-file.
Processes the file first and then prints the results.
"""
import sys
# Read the data from the file
file = open(sys.argv[1], 'r')
# Initialize a dictionary to collect the values for each label
labels =
# Initialize a stack to keep track of block state
stack =
# Initialize a counter to count the number of blocks
block = 0
# Process the file
line = file.readline()
while line:
# Remove white-space
line = line.strip()
# The stack should be empty when we start a new block
if line.lower() == "start":
if stack:
raise Exception("Invalid File Format: Bad Start")
else:
stack.append(line)
# Otherwise the bottom of the stack should be a "Start"
# When we reach the end of a block we empty the stack
# end increment the block counter
elif line.lower() == "end":
if stack[0].lower() != "start":
raise Exception("Invalid File Format: Bad End")
else:
block += 1
stack =
# Other lines should come in consecutive label/value pairs
# i.e. a value row should follow a label row
elif line:
# If there are an odd number of data rows in the stack then
# the current row should be a value row - check that it matches
# the corresponding label row
if len(stack[1:])%2==1:
_labels = stack[-1].split()
_values = line.split()
# Verify that the label row and value row have the same number
# of columns
if len(_labels) == len(_values):
stack.append(line)
for label, value in zip(_labels, _values):
# Add new labels to the labels dictionary
if label not in labels:
labels[label] =
"cols": len(label)
# Add the value for the current block
labels[label][block] = value
# Keep track of the longest value for each label
# so we can format the output later
if len(value) > labels[label]["cols"]:
labels[label]["cols"] = len(value)
else:
raise Exception("Invalid File Format: Label/Value Mismatch")
# If there are an even number of data rows in the stack then
# the current row should be a label row - append it to the stack
else:
stack.append(line)
# Read the next line
line = file.readline()
# Construct the header row
header = ""
for label in labels:
cols = labels[label]["cols"]
header += "0: <width".format(label, width=cols+1)
# Construct the data rows
rows =
for i in range(0, block):
row = ""
for label in labels:
cols = labels[label]["cols"]
row += "0: <width".format(labels[label].get(i, "NA"), width=cols+1)
rows.append(row)
# Print the results
print(header)
for row in rows:
print(row)
You can run it like this:
python parse.py file1.txt
It produces the following output on your example data:
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
add a comment |Â
up vote
0
down vote
This Python script should do what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""parse.py
Parses a custom-format data-file.
Processes the file first and then prints the results.
"""
import sys
# Read the data from the file
file = open(sys.argv[1], 'r')
# Initialize a dictionary to collect the values for each label
labels =
# Initialize a stack to keep track of block state
stack =
# Initialize a counter to count the number of blocks
block = 0
# Process the file
line = file.readline()
while line:
# Remove white-space
line = line.strip()
# The stack should be empty when we start a new block
if line.lower() == "start":
if stack:
raise Exception("Invalid File Format: Bad Start")
else:
stack.append(line)
# Otherwise the bottom of the stack should be a "Start"
# When we reach the end of a block we empty the stack
# end increment the block counter
elif line.lower() == "end":
if stack[0].lower() != "start":
raise Exception("Invalid File Format: Bad End")
else:
block += 1
stack =
# Other lines should come in consecutive label/value pairs
# i.e. a value row should follow a label row
elif line:
# If there are an odd number of data rows in the stack then
# the current row should be a value row - check that it matches
# the corresponding label row
if len(stack[1:])%2==1:
_labels = stack[-1].split()
_values = line.split()
# Verify that the label row and value row have the same number
# of columns
if len(_labels) == len(_values):
stack.append(line)
for label, value in zip(_labels, _values):
# Add new labels to the labels dictionary
if label not in labels:
labels[label] =
"cols": len(label)
# Add the value for the current block
labels[label][block] = value
# Keep track of the longest value for each label
# so we can format the output later
if len(value) > labels[label]["cols"]:
labels[label]["cols"] = len(value)
else:
raise Exception("Invalid File Format: Label/Value Mismatch")
# If there are an even number of data rows in the stack then
# the current row should be a label row - append it to the stack
else:
stack.append(line)
# Read the next line
line = file.readline()
# Construct the header row
header = ""
for label in labels:
cols = labels[label]["cols"]
header += "0: <width".format(label, width=cols+1)
# Construct the data rows
rows =
for i in range(0, block):
row = ""
for label in labels:
cols = labels[label]["cols"]
row += "0: <width".format(labels[label].get(i, "NA"), width=cols+1)
rows.append(row)
# Print the results
print(header)
for row in rows:
print(row)
You can run it like this:
python parse.py file1.txt
It produces the following output on your example data:
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
add a comment |Â
up vote
0
down vote
up vote
0
down vote
This Python script should do what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""parse.py
Parses a custom-format data-file.
Processes the file first and then prints the results.
"""
import sys
# Read the data from the file
file = open(sys.argv[1], 'r')
# Initialize a dictionary to collect the values for each label
labels =
# Initialize a stack to keep track of block state
stack =
# Initialize a counter to count the number of blocks
block = 0
# Process the file
line = file.readline()
while line:
# Remove white-space
line = line.strip()
# The stack should be empty when we start a new block
if line.lower() == "start":
if stack:
raise Exception("Invalid File Format: Bad Start")
else:
stack.append(line)
# Otherwise the bottom of the stack should be a "Start"
# When we reach the end of a block we empty the stack
# end increment the block counter
elif line.lower() == "end":
if stack[0].lower() != "start":
raise Exception("Invalid File Format: Bad End")
else:
block += 1
stack =
# Other lines should come in consecutive label/value pairs
# i.e. a value row should follow a label row
elif line:
# If there are an odd number of data rows in the stack then
# the current row should be a value row - check that it matches
# the corresponding label row
if len(stack[1:])%2==1:
_labels = stack[-1].split()
_values = line.split()
# Verify that the label row and value row have the same number
# of columns
if len(_labels) == len(_values):
stack.append(line)
for label, value in zip(_labels, _values):
# Add new labels to the labels dictionary
if label not in labels:
labels[label] =
"cols": len(label)
# Add the value for the current block
labels[label][block] = value
# Keep track of the longest value for each label
# so we can format the output later
if len(value) > labels[label]["cols"]:
labels[label]["cols"] = len(value)
else:
raise Exception("Invalid File Format: Label/Value Mismatch")
# If there are an even number of data rows in the stack then
# the current row should be a label row - append it to the stack
else:
stack.append(line)
# Read the next line
line = file.readline()
# Construct the header row
header = ""
for label in labels:
cols = labels[label]["cols"]
header += "0: <width".format(label, width=cols+1)
# Construct the data rows
rows =
for i in range(0, block):
row = ""
for label in labels:
cols = labels[label]["cols"]
row += "0: <width".format(labels[label].get(i, "NA"), width=cols+1)
rows.append(row)
# Print the results
print(header)
for row in rows:
print(row)
You can run it like this:
python parse.py file1.txt
It produces the following output on your example data:
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
This Python script should do what you want:
#!/usr/bin/env python
# -*- encoding: ascii -*-
"""parse.py
Parses a custom-format data-file.
Processes the file first and then prints the results.
"""
import sys
# Read the data from the file
file = open(sys.argv[1], 'r')
# Initialize a dictionary to collect the values for each label
labels =
# Initialize a stack to keep track of block state
stack =
# Initialize a counter to count the number of blocks
block = 0
# Process the file
line = file.readline()
while line:
# Remove white-space
line = line.strip()
# The stack should be empty when we start a new block
if line.lower() == "start":
if stack:
raise Exception("Invalid File Format: Bad Start")
else:
stack.append(line)
# Otherwise the bottom of the stack should be a "Start"
# When we reach the end of a block we empty the stack
# end increment the block counter
elif line.lower() == "end":
if stack[0].lower() != "start":
raise Exception("Invalid File Format: Bad End")
else:
block += 1
stack =
# Other lines should come in consecutive label/value pairs
# i.e. a value row should follow a label row
elif line:
# If there are an odd number of data rows in the stack then
# the current row should be a value row - check that it matches
# the corresponding label row
if len(stack[1:])%2==1:
_labels = stack[-1].split()
_values = line.split()
# Verify that the label row and value row have the same number
# of columns
if len(_labels) == len(_values):
stack.append(line)
for label, value in zip(_labels, _values):
# Add new labels to the labels dictionary
if label not in labels:
labels[label] =
"cols": len(label)
# Add the value for the current block
labels[label][block] = value
# Keep track of the longest value for each label
# so we can format the output later
if len(value) > labels[label]["cols"]:
labels[label]["cols"] = len(value)
else:
raise Exception("Invalid File Format: Label/Value Mismatch")
# If there are an even number of data rows in the stack then
# the current row should be a label row - append it to the stack
else:
stack.append(line)
# Read the next line
line = file.readline()
# Construct the header row
header = ""
for label in labels:
cols = labels[label]["cols"]
header += "0: <width".format(label, width=cols+1)
# Construct the data rows
rows =
for i in range(0, block):
row = ""
for label in labels:
cols = labels[label]["cols"]
row += "0: <width".format(labels[label].get(i, "NA"), width=cols+1)
rows.append(row)
# Print the results
print(header)
for row in rows:
print(row)
You can run it like this:
python parse.py file1.txt
It produces the following output on your example data:
label1 label2 label3 label4 label5 label6 label7
value1 value2 value3 value4 value5 value6 value7
valueA valueB NA valueD valueE valueF NA
answered May 10 at 0:01
igal
4,785930
4,785930
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f442771%2fhow-to-parse-the-file%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Did you try anything ?
â Kiwy
May 9 at 14:09