Merge two DataFrames based on columns and values of a specific column with Pandas in Python 3.x

up vote
6
down vote

favorite

Hello i have a problem which i am not able to implement a solution on.
I have following two DataFrames:

>>> df1
A B date
1 1 01-2016
2 1 02-2017
1 2 03-2017
2 2 04-2020

>>> df2
A B 01-2016 02-2017 03-2017 04.2020
1 1 0.10 0.22 0.55 0.77
2 1 0.20 0.12 0.99 0.125
1 2 0.13 0.15 0.15 0.245
2 2 0.33 0.1 0.888 0.64

What i want is following DataFrame:

>>> df3
A B date value
1 1 01-2016 0.10
2 1 02-2017 0.12
1 2 03-2017 0.15
2 2 04-2020 0.64

I already tried following:

 summarize_dates = self.summarize_specific_column(data=df1, column='date')

 for date in summarize_dates:
 left_on = np.append(left_on, date)
 right_on = np.append(right_on, merge_columns.upper())
 result = pd.merge(left=df2, right=df1,
 left_on=left_on, right_on=right_on,
 how='right')
 print(result)

This does not work. Can you help me and suggest a more comfortable implementation? Manyy thanks in advance!

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

1

The easiest way to do this would be to melt df2 and then do a left join on df1.
â€“Â CJ59
3 hours ago

1

Thanks for your answer. But what do you mean with melt the Frame? Edit: Was already answered. :)
â€“Â Michael Gann
3 hours ago

add a commentÂ |Â

up vote
6
down vote

favorite

Hello i have a problem which i am not able to implement a solution on.
I have following two DataFrames:

>>> df1
A B date
1 1 01-2016
2 1 02-2017
1 2 03-2017
2 2 04-2020

>>> df2
A B 01-2016 02-2017 03-2017 04.2020
1 1 0.10 0.22 0.55 0.77
2 1 0.20 0.12 0.99 0.125
1 2 0.13 0.15 0.15 0.245
2 2 0.33 0.1 0.888 0.64

What i want is following DataFrame:

>>> df3
A B date value
1 1 01-2016 0.10
2 1 02-2017 0.12
1 2 03-2017 0.15
2 2 04-2020 0.64

I already tried following:

 summarize_dates = self.summarize_specific_column(data=df1, column='date')

 for date in summarize_dates:
 left_on = np.append(left_on, date)
 right_on = np.append(right_on, merge_columns.upper())
 result = pd.merge(left=df2, right=df1,
 left_on=left_on, right_on=right_on,
 how='right')
 print(result)

This does not work. Can you help me and suggest a more comfortable implementation? Manyy thanks in advance!

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

1

The easiest way to do this would be to melt df2 and then do a left join on df1.
â€“Â CJ59
3 hours ago

1

Thanks for your answer. But what do you mean with melt the Frame? Edit: Was already answered. :)
â€“Â Michael Gann
3 hours ago

add a commentÂ |Â

up vote
6
down vote

favorite

Hello i have a problem which i am not able to implement a solution on.
I have following two DataFrames:

>>> df1
A B date
1 1 01-2016
2 1 02-2017
1 2 03-2017
2 2 04-2020

>>> df2
A B 01-2016 02-2017 03-2017 04.2020
1 1 0.10 0.22 0.55 0.77
2 1 0.20 0.12 0.99 0.125
1 2 0.13 0.15 0.15 0.245
2 2 0.33 0.1 0.888 0.64

What i want is following DataFrame:

>>> df3
A B date value
1 1 01-2016 0.10
2 1 02-2017 0.12
1 2 03-2017 0.15
2 2 04-2020 0.64

I already tried following:

 summarize_dates = self.summarize_specific_column(data=df1, column='date')

 for date in summarize_dates:
 left_on = np.append(left_on, date)
 right_on = np.append(right_on, merge_columns.upper())
 result = pd.merge(left=df2, right=df1,
 left_on=left_on, right_on=right_on,
 how='right')
 print(result)

This does not work. Can you help me and suggest a more comfortable implementation? Manyy thanks in advance!

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

Hello i have a problem which i am not able to implement a solution on.
I have following two DataFrames:

>>> df1
A B date
1 1 01-2016
2 1 02-2017
1 2 03-2017
2 2 04-2020

>>> df2
A B 01-2016 02-2017 03-2017 04.2020
1 1 0.10 0.22 0.55 0.77
2 1 0.20 0.12 0.99 0.125
1 2 0.13 0.15 0.15 0.245
2 2 0.33 0.1 0.888 0.64

What i want is following DataFrame:

>>> df3
A B date value
1 1 01-2016 0.10
2 1 02-2017 0.12
1 2 03-2017 0.15
2 2 04-2020 0.64

I already tried following:

 summarize_dates = self.summarize_specific_column(data=df1, column='date')

 for date in summarize_dates:
 left_on = np.append(left_on, date)
 right_on = np.append(right_on, merge_columns.upper())
 result = pd.merge(left=df2, right=df1,
 left_on=left_on, right_on=right_on,
 how='right')
 print(result)

This does not work. Can you help me and suggest a more comfortable implementation? Manyy thanks in advance!

python pandas csv merge

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

edited 3 hours ago

asked 3 hours ago

Michael Gann

343

asked 3 hours ago

Michael Gann

343

asked 3 hours ago

Michael Gann

343

1

The easiest way to do this would be to melt df2 and then do a left join on df1.
â€“Â CJ59
3 hours ago

1

Thanks for your answer. But what do you mean with melt the Frame? Edit: Was already answered. :)
â€“Â Michael Gann
3 hours ago

add a commentÂ |Â

1

The easiest way to do this would be to melt df2 and then do a left join on df1.
â€“Â CJ59
3 hours ago

1

Thanks for your answer. But what do you mean with melt the Frame? Edit: Was already answered. :)
â€“Â Michael Gann
3 hours ago

The easiest way to do this would be to melt df2 and then do a left join on df1.
â€“Â CJ59
3 hours ago

Thanks for your answer. But what do you mean with melt the Frame? Edit: Was already answered. :)
â€“Â Michael Gann
3 hours ago

add a commentÂ |Â

2 Answers
2

active

oldest

votes

up vote
9
down vote

You can melt df2 and then merge using the default 'inner' merge

df3 = df1.merge(df2.melt(id_vars = ['A', 'B'], var_name='date'))

 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

1

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

1

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
3
down vote

Using lookup

df1['value']=df2.set_index(['A','B']).lookup(df1.set_index(['A','B']).index,df1.date)
df1
Out[228]: 
 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

answered 3 hours ago

W-B

90.5k72754

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53214900%2fmerge-two-dataframes-based-on-columns-and-values-of-a-specific-column-with-panda%23new-answer', 'question_page');

);

Post as a guest

Name

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
9
down vote

You can melt df2 and then merge using the default 'inner' merge

df3 = df1.merge(df2.melt(id_vars = ['A', 'B'], var_name='date'))

 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

1

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

1

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
9
down vote

You can melt df2 and then merge using the default 'inner' merge

df3 = df1.merge(df2.melt(id_vars = ['A', 'B'], var_name='date'))

 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

1

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

1

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
9
down vote

You can melt df2 and then merge using the default 'inner' merge

df3 = df1.merge(df2.melt(id_vars = ['A', 'B'], var_name='date'))

 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

You can melt df2 and then merge using the default 'inner' merge

df3 = df1.merge(df2.melt(id_vars = ['A', 'B'], var_name='date'))

 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

edited 3 hours ago

answered 3 hours ago

Vaishali

16.2k3927

answered 3 hours ago

Vaishali

16.2k3927

answered 3 hours ago

Vaishali

16.2k3927

1

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

1

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

1

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

1

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

Thank you i will try that out!
â€“Â Michael Gann
3 hours ago

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
3
down vote

Using lookup

df1['value']=df2.set_index(['A','B']).lookup(df1.set_index(['A','B']).index,df1.date)
df1
Out[228]: 
 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

answered 3 hours ago

W-B

90.5k72754

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
3
down vote

Using lookup

df1['value']=df2.set_index(['A','B']).lookup(df1.set_index(['A','B']).index,df1.date)
df1
Out[228]: 
 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

answered 3 hours ago

W-B

90.5k72754

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

up vote
3
down vote

Using lookup

df1['value']=df2.set_index(['A','B']).lookup(df1.set_index(['A','B']).index,df1.date)
df1
Out[228]: 
 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

answered 3 hours ago

W-B

90.5k72754

Using lookup

df1['value']=df2.set_index(['A','B']).lookup(df1.set_index(['A','B']).index,df1.date)
df1
Out[228]: 
 A B date value
0 1 1 01-2016 0.10
1 2 1 02-2017 0.12
2 1 2 03-2017 0.15
3 2 2 04-2020 0.64

answered 3 hours ago

W-B

90.5k72754

answered 3 hours ago

W-B

90.5k72754

answered 3 hours ago

W-B

90.5k72754

answered 3 hours ago

W-B

90.5k72754

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

Works fine! Thank you very much!
â€“Â Michael Gann
2 hours ago

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

mjhjmtu