Groupby and append lists and strings
I am trying to group-by the values in my "value_1" column. But my last column is made up of lists. When I try to group-by using my "value_1" column, the column made up of lists disappears.
Dataframe:
value_1:   value_2:          value_3:         list:
american   california, nyc   walmart, kmart   [supermarket, connivence]
canadian   toronto           dunkinDonuts     [coffee]
american   texas                              [state]
canadian                     walmart          [supermarket]
...        ...               ...              ...
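For anyone who wants to reproduce the problem, the frame could be built roughly like this. This is only a sketch: it assumes the blank cells are empty strings (they could just as well be NaN), and only the four rows shown are included.

```python
import pandas as pd

# Assumption: blank cells in the table are empty strings, not NaN.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})
print(df)
```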
My expected output is:
value_1:   value_2:                 value_3:                list:
american   california, nyc, texas   walmart, kmart          [supermarket, connivence, state]
canadian   toronto                  dunkinDonuts, walmart   [coffee, supermarket]
Thanks!
python pandas
edited Mar 3 at 16:02
yatu
asked Mar 1 at 12:07
user11076352
2 Answers
Build a dictionary dynamically that maps every column except list and value_1 to a join function, and map list to a lambda function whose list comprehension flattens the nested lists:
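# Join the non-missing strings of a group with ', '
f1 = lambda x: ', '.join(x.dropna())
# Alternative: join only the string values
# f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
# Flatten the lists of a group into a single list
f2 = lambda x: [z for y in x for z in y]
# Map every column except value_1 and list to f1, and list to f2
d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
d['list'] = f2
df = df.groupby('value_1', as_index=False).agg(d)
print(df)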
    value_1                 value_2                value_3                              list
0  american  california, nyc, texas         walmart, kmart  [supermarket, connivence, state]
1  canadian                 toronto  dunkinDonuts, walmart             [coffee, supermarket]
Explanation:
f1 and f2 are lambda functions.
The first version of f1 removes missing values (if any exist) and joins the remaining strings with the separator:
f1 = lambda x: ', '.join(x.dropna())
The second version keeps only string values (this omits missing values, because NaN is a float, not a string) and joins them with the separator:
f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
The third version keeps every value except empty strings and joins them with the separator:
f1 = lambda x: ', '.join([y for y in x if y != ''])
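Which variant fits depends on how the blanks are stored. The sketch below compares them on two small hypothetical Series, one with NaN and one with an empty string. The names join_dropna, join_str_only, join_non_empty and join_clean are made up for illustration, and join_clean (which combines both filters) is an addition that is not part of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: one blank stored as NaN, one stored as ''.
s_nan = pd.Series(["toronto", np.nan, "texas"], dtype=object)
s_empty = pd.Series(["toronto", "", "texas"], dtype=object)

join_dropna = lambda x: ", ".join(x.dropna())
join_str_only = lambda x: ", ".join([y for y in x if isinstance(y, str)])
join_non_empty = lambda x: ", ".join([y for y in x if y != ""])
# Assumed extension: filter out both NaN and empty strings.
join_clean = lambda x: ", ".join([y for y in x if isinstance(y, str) and y != ""])

print(repr(join_dropna(s_nan)))       # 'toronto, texas'
print(repr(join_dropna(s_empty)))     # 'toronto, , texas' (empty string is kept)
print(repr(join_str_only(s_nan)))     # 'toronto, texas'
print(repr(join_non_empty(s_empty)))  # 'toronto, texas'
print(repr(join_clean(pd.Series(["toronto", np.nan, "", "texas"], dtype=object))))
```

Note that the empty-string filter on its own would fail on s_nan, because str.join cannot join the float NaN.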
The function f2 flattens the lists, because the values collected for each group form nested lists like [['a','b'], ['c']]:
f2 = lambda x: [z for y in x for z in y]
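Put together, the approach might look like the following self-contained sketch. It assumes the blank cells are empty strings, so it uses a join that filters out both non-strings and empty strings. The names join_clean, flatten and agg_map are hypothetical.

```python
import pandas as pd

# Assumption: blank cells in the question's table are empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

# Join only non-empty strings; flatten the per-group lists into one list.
join_clean = lambda x: ", ".join([y for y in x if isinstance(y, str) and y != ""])
flatten = lambda x: [z for y in x for z in y]

# Every column except the key and the list column gets the join function.
agg_map = dict.fromkeys(df.columns.difference(["value_1", "list"]), join_clean)
agg_map["list"] = flatten

out = df.groupby("value_1", as_index=False).agg(agg_map)
print(out)
```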
edited Mar 1 at 13:12
answered Mar 1 at 12:15
jezrael
You could group by value_1 and aggregate the columns containing strings with the following function:
def str_cat(x):
    return x.str.cat(sep=', ')
Then use GroupBy.sum to concatenate the lists in the list column:
import numpy as np

# Treat empty strings as missing so that str.cat skips them
df.replace('', np.nan).groupby('value_1').agg({'list': 'sum', 'value_2': str_cat,
                                               'value_3': str_cat})
                                      list                 value_2                value_3
value_1
american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart
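A self-contained version of this approach could look like the sketch below. It assumes the blank cells are empty strings and converts them to NaN first, since Series.str.cat leaves out missing values by default. Summing works on the list column because + concatenates Python lists.

```python
import numpy as np
import pandas as pd

# Assumption: blank cells in the question's table are empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def str_cat(x):
    # Concatenate the group's strings; missing values are skipped.
    return x.str.cat(sep=", ")

out = (
    df.replace("", np.nan)  # empty strings become missing values
      .groupby("value_1")
      .agg({"list": "sum", "value_2": str_cat, "value_3": str_cat})
)
print(out)
```

For groups with many long lists, repeated concatenation through sum can get slow, in which case a flattening list comprehension may be the better choice.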
edited Mar 3 at 15:53
answered Mar 1 at 12:14
yatu