How could I (painlessly) split or reverse “Last, First” within a record in Miller?
I have a tab-delimited file where one of the columns is in the format "LastName, FirstName". What I want to do is split that record out into two separate columns, `last` and `first`, use `cut` or some other verb(s) on that, and output the result to JSON.
I should add that I'm not married to JSON, and I know how to use other tools like `jq`, but it would be nice to get it in that format in one step.
The syntax for the `nest` verb looks like it requires memorizing a lot of frankly non-memorable options, so I figured that there would be a simple DSL operation to do this job. Maybe that's not the case?
Here's what I've tried. (Let's just forget about the extra space that's attached to `Firstname` right now, OK? I would use `strip` or `ssub` or something to get rid of that later.)
    echo -e "last_first\nLastName, Firstname" \
      | mlr --t2j put '$o=splitnv($last_first,",")'
    # result:
    # { "last_first": "LastName, Firstname", "o": "(error)" }
    # expected something like:
    # { "last_first": "LastName, Firstname", "o": { 1: "LastName", 2: "Firstname" } }
    #
    # or:
    # { "last_first": "LastName, Firstname", "o": [ "LastName", "Firstname" ] }
Why `(error)`? Is it not reasonable that assigning to `$o` as above would assign a new column `o` to the result of `splitnv`?
Here's something else I tried that didn't work like I would've expected either:
    echo -e "last_first\nLastName, Firstname" \
      | mlr -T nest --explode --values --across-fields --nested-fs , -f last_first
    # result (no delimiter here, just one field, confirmed w/ 'cat -A')
    # last_first
    # LastName, Firstname
    # expected:
    # last_first_1<tab>last_first_2
    # LastName<tab> Firstname
Edit: The problem with the command above is that I should've used `--tsv`, not `-T`, which is a synonym for `--nidx --fs tab` (numerically-indexed columns). The catch is that Miller doesn't produce an error message when it's obviously wrong to ask for named columns in that case, which might be a mis-feature; see issue #233.
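For illustration, the difference between the two flags can be seen by dumping the same input as JSON under each one. This is a minimal sketch, assuming `mlr` is on the PATH; the helper name `show_keys` is made up for this example.

```shell
# Hypothetical helper: print the sample input as JSON so the field names
# Miller assigned under a given input format become visible.
show_keys() {
  # $1 is the Miller format flag to try, e.g. -T or --tsv
  printf 'last_first\nLastName, Firstname\n' | mlr "$1" --ojson cat
}
```

Calling `show_keys -T` should show the header text as the value of a field named `1`, while `show_keys --tsv` should show a single field named `last_first`. That would explain why `nest -f last_first` had nothing to match under `-T`.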
Any insight would be appreciated.
text-processing csv json miller
asked Mar 7 at 10:52 by TheDudeAbides (edited Mar 8 at 1:58)
Could you add a sample input and a sample output?
– aborruso
Mar 7 at 11:57
2 Answers
I'm not sure I understand your request.
If I run
    echo -e "last_first\nLastName, Firstname" |
      mlr --t2j --jlistwrap --jvstack nest --explode --values --across-fields --nested-fs "," -f last_first \
      then clean-whitespace
I get
    [
    {
      "last_first_1": "LastName",
      "last_first_2": "Firstname"
    }
    ]
And if I run
    echo -e "last_first\nLastName, Firstname" |
      mlr --tsv nest --explode --values --across-fields --nested-fs "," -f last_first \
      then clean-whitespace
I get
    last_first_1 last_first_2
    LastName Firstname
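To get the `last` and `first` column names asked for in the question, one option is to append a `rename` step to the same chain and hide the flags behind a small shell function. This is only a sketch: the function name `split_last_first` and the output column names are assumptions made for illustration, and it assumes `mlr` is on the PATH.

```shell
# Hypothetical wrapper: split a "Last, First" column into `last` and `first`.
# Reads TSV with a header line on stdin and writes JSON to stdout.
split_last_first() {
  # $1 is the name of the column to split (defaults to last_first)
  col="${1:-last_first}"
  mlr --t2j nest --explode --values --across-fields --nested-fs ',' -f "$col" \
    then clean-whitespace \
    then rename "${col}_1,last,${col}_2,first"
}
```

It could be used as `printf 'last_first\nLastName, Firstname\n' | split_last_first`, which keeps the long `nest` option list in one place instead of retyping it.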
answered Mar 7 at 18:22 by aborruso (edited Mar 7 at 18:30)
Grazie Signore! I didn't even realize there was a `clean-whitespace` verb. My question was written quickly before I forgot everything, and needs some serious editing. I'll fix up the question, then mark this as the accepted solution, because I think the entirety of my problem was using `-T` (only good for files with no header line) instead of `--tsv`, which does what I'd expect.
– TheDudeAbides
Mar 7 at 23:02
Here's how to switch `LastName, FirstName` to be `FirstName LastName` using DSL expressions:
    echo -e "last_first\nLastName, Firstname\nAnotherLast, AnotherFirst" \
      | mlr --t2j \
        put -q 'o=splitnv($last_first,",");
                first_last=strip(o[2]) . " " . o[1];
                emit first_last'
    # result:
    # { "first_last": "Firstname LastName" }
    # { "first_last": "AnotherFirst AnotherLast" }
I think the fact that the `emit` seems to be required(?) was the key part that I didn't understand before.
It is, sadly, not much easier than using the `nest` verb and all its required flags.
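A variant that may avoid `emit` is to assign the individual pieces to new fields with a plain `put` (no `-q`). The `(error)` in the question appears to come from assigning the whole map returned by `splitnv` to a single field, whereas assigning its scalar elements should work. This is a sketch under that assumption; the function name `split_with_dsl` and the `last`/`first` field names are made up for illustration, and it assumes `mlr` is on the PATH.

```shell
# Hypothetical wrapper: split the last_first column using DSL functions only.
# Reads TSV with a header line on stdin and writes JSON to stdout.
split_with_dsl() {
  mlr --t2j put '
    o = splitnv($last_first, ",");  # map keyed 1, 2, ... holding the comma-separated pieces
    $last  = strip(o[1]);           # assign each scalar piece to its own field
    $first = strip(o[2]);
    unset $last_first               # drop the original combined column
  '
}
```

For example, `printf 'last_first\nLastName, Firstname\n' | split_with_dsl` should produce one record with `last` and `first` fields.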
answered Mar 8 at 2:08 by TheDudeAbides
An alternative is to use `reorder`: `echo -e "last_first\nLastName, Firstname" | mlr --t2j --jlistwrap --jvstack nest --explode --values --across-fields --nested-fs "," -f last_first then clean-whitespace then reorder -f last_first_2,last_first_1`
– aborruso
Mar 8 at 7:42