How do I operate on a DataFrame with a Series for every column
Clash Royale CLAN TAG#URR8PPP
up vote
7
down vote
favorite
Objective and Motivation
I've seen this kind of question several times over and have seen many other questions that involve some element of this. Most recently, I had to spend a bit of time explaining this concept in comments while looking for an appropriate canonical Q&A. I did not find one and so I thought I'd write one.
This question usually arises with respect to a specific operation but equally applies to most arithmetic operations.
- How do I subtract a
Series
from every column in aDataFrame
? - How do I add a
Series
from every column in aDataFrame
? - How do I multiply a
Series
from every column in aDataFrame
? - How do I divide a
Series
from every column in aDataFrame
?
The Question
Given a Series
s
and DataFrame
df
. How do I operate on each column of df
with s
?
df = pd.DataFrame(
[[1, 2, 3], [4, 5, 6]],
index=[0, 1],
columns=['a', 'b', 'c']
)
s = pd.Series([3, 14], index=[0, 1])
When I attempt to add them, I get all np.nan
df + s
a b c 0 1
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
What I thought I should get is
a b c
0 4 5 6
1 18 19 20
python pandas
add a comment |Â
up vote
7
down vote
favorite
Objective and Motivation
I've seen this kind of question several times over and have seen many other questions that involve some element of this. Most recently, I had to spend a bit of time explaining this concept in comments while looking for an appropriate canonical Q&A. I did not find one and so I thought I'd write one.
This question usually arises with respect to a specific operation but equally applies to most arithmetic operations.
- How do I subtract a
Series
from every column in aDataFrame
? - How do I add a
Series
from every column in aDataFrame
? - How do I multiply a
Series
from every column in aDataFrame
? - How do I divide a
Series
from every column in aDataFrame
?
The Question
Given a Series
s
and DataFrame
df
. How do I operate on each column of df
with s
?
df = pd.DataFrame(
[[1, 2, 3], [4, 5, 6]],
index=[0, 1],
columns=['a', 'b', 'c']
)
s = pd.Series([3, 14], index=[0, 1])
When I attempt to add them, I get all np.nan
df + s
a b c 0 1
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
What I thought I should get is
a b c
0 4 5 6
1 18 19 20
python pandas
add a comment |Â
up vote
7
down vote
favorite
up vote
7
down vote
favorite
Objective and Motivation
I've seen this kind of question several times over and have seen many other questions that involve some element of this. Most recently, I had to spend a bit of time explaining this concept in comments while looking for an appropriate canonical Q&A. I did not find one and so I thought I'd write one.
This question usually arises with respect to a specific operation but equally applies to most arithmetic operations.
- How do I subtract a
Series
from every column in aDataFrame
? - How do I add a
Series
from every column in aDataFrame
? - How do I multiply a
Series
from every column in aDataFrame
? - How do I divide a
Series
from every column in aDataFrame
?
The Question
Given a Series
s
and DataFrame
df
. How do I operate on each column of df
with s
?
df = pd.DataFrame(
[[1, 2, 3], [4, 5, 6]],
index=[0, 1],
columns=['a', 'b', 'c']
)
s = pd.Series([3, 14], index=[0, 1])
When I attempt to add them, I get all np.nan
df + s
a b c 0 1
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
What I thought I should get is
a b c
0 4 5 6
1 18 19 20
python pandas
Objective and Motivation
I've seen this kind of question several times over and have seen many other questions that involve some element of this. Most recently, I had to spend a bit of time explaining this concept in comments while looking for an appropriate canonical Q&A. I did not find one and so I thought I'd write one.
This question usually arises with respect to a specific operation but equally applies to most arithmetic operations.
- How do I subtract a
Series
from every column in aDataFrame
? - How do I add a
Series
from every column in aDataFrame
? - How do I multiply a
Series
from every column in aDataFrame
? - How do I divide a
Series
from every column in aDataFrame
?
The Question
Given a Series
s
and DataFrame
df
. How do I operate on each column of df
with s
?
df = pd.DataFrame(
[[1, 2, 3], [4, 5, 6]],
index=[0, 1],
columns=['a', 'b', 'c']
)
s = pd.Series([3, 14], index=[0, 1])
When I attempt to add them, I get all np.nan
df + s
a b c 0 1
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
What I thought I should get is
a b c
0 4 5 6
1 18 19 20
python pandas
python pandas
asked 2 hours ago
piRSquared
147k21130264
147k21130264
add a comment |Â
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
8
down vote
Please bear the preamble. It's important to address some higher level concepts first. Since my motivation is to share knowledge and teach, I wanted to make this as clear as possible.
It is helpful to create a mental model of what Series
and DataFrame
objects are.
Anatomy of a Series
A Series
should be thought of as an enhanced dictionary. This isn't always a perfect analogy, but we'll start here. Also, there are other analogies that you can make but I am targetting a dictionary in order to demonstrate the purpose of this post.
index
These are the keys that we can reference to get at the corresponding values. When the elements of the index are unique, the comparison to a dictionary becomes very close.
values
These are the corresponding values that are keyed by the index.
Anatomy of a DataFrame
A DataFrame
should be thought of as a dictionary of Series
or a Series
of Series
. In this case the keys are the column names and the values are the columns themselves as Series
objects. Each Series
agrees to share the same index
which is the index of the DataFrame
.
columns
These are the keys that we can reference to get at the corresponding Series
.
index
This the the index that all of the Series
values agree to share.
Note: RE: columns
and index
objects
They are the same kind of things. A DataFrame
s index
can be used as another DataFrame
s columns
. In fact, this happens when you do df.T
to get a transpose.
values
This is a 2 dimensional array that contains the data in a DataFrame
. The reality is that values
is NOT what is stored inside the DataFrame
object. (Well sometimes it is, but I'm not about to try to describe the block mananager). The point is, it is better to think of this as access to a 2 dimensional array of the data.
Define Sample Data
These are sample pandas.Index
objects that can be used as the index
of a Series
or DataFrame
or can be used as the columns
of a DataFrame
idx_lower = pd.Index([*'abcde'], name='lower')
idx_range = pd.RangeIndex(5, name='range')
These are sample pandas.Series
objects that use the pandas.Index
objects above
s0 = pd.Series(range(10, 15), idx_lower)
s1 = pd.Series(range(30, 40, 2), idx_lower)
s2 = pd.Series(range(50, 10, -8), idx_range)
These are sample pandas.DataFrame
objects that use the pandas.Index
objects above
df0 = pd.DataFrame(100, index=idx_range, columns=idx_lower)
df1 = pd.DataFrame(
np.arange(np.product(df0.shape)).reshape(df0.shape),
index=idx_range, columns=idx_lower
)
Series
on Series
When operating on two Series
, the alignment is obvious. You align the index
of one Series
with the index
of the other.
s1 + s0
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
Which is the same as when I randomly shuffle one before I operate. The indices will sitll align.
s1 + s0.sample(frac=1)
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
And is NOT the case when instead I operate with the values of the shuffled Series
. In this case, Pandas doesn't have the index
to align with and therefore operates from a positions.
s1 + s0.sample(frac=1).values
lower
a 42
b 42
c 47
d 50
e 49
dtype: int64
Add a scalar
s1 + 1
lower
a 31
b 33
c 35
d 37
e 39
dtype: int64
DataFrame
on DataFrame
Similar is true when operating between two DataFrame
s
The alignment is obvious and does what we think it should do
df0 + df1
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Shuffle second DataFrame
on both axes. The index
and columns
will still align and give us the same thing.
df0 + df1.sample(frac=1).sample(frac=1, axis=1)
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Same shuffling but add the array and not the DataFrame
. No longer aligned and will get different results.
df0 + df1.sample(frac=1).sample(frac=1, axis=1).values
lower a b c d e
range
0 123 124 121 122 120
1 118 119 116 117 115
2 108 109 106 107 105
3 103 104 101 102 100
4 113 114 111 112 110
Add 1 dimensional array. Will align with columns and broadcast across rows.
df0 + [*range(2, df0.shape[1] + 2)]
lower a b c d e
range
0 102 103 104 105 106
1 102 103 104 105 106
2 102 103 104 105 106
3 102 103 104 105 106
4 102 103 104 105 106
Add a scalar. Nothing to align with so broadcasts to everything
df0 + 1
lower a b c d e
range
0 101 101 101 101 101
1 101 101 101 101 101
2 101 101 101 101 101
3 101 101 101 101 101
4 101 101 101 101 101
DataFrame
on Series
If DataFrame
s are to be though of as dictionaries of Series
and Series
are to be thought of as dictionaries of values, then it is natural that when operating between a DataFrame
and Series
that they should be aligned by their "keys".
s0:
lower a b c d e
10 11 12 13 14
df0:
lower a b c d e
range
0 100 100 100 100 100
1 100 100 100 100 100
2 100 100 100 100 100
3 100 100 100 100 100
4 100 100 100 100 100
And when we operate, the 10
in s0['a']
gets added to the entire column of df0['a']
df0 + s0
lower a b c d e
range
0 110 111 112 113 114
1 110 111 112 113 114
2 110 111 112 113 114
3 110 111 112 113 114
4 110 111 112 113 114
Heart of the issue and point of the post
What about if I want s2
and df0
?
s2: df0:
| lower a b c d e
range | range
0 50 | 0 100 100 100 100 100
1 42 | 1 100 100 100 100 100
2 34 | 2 100 100 100 100 100
3 26 | 3 100 100 100 100 100
4 18 | 4 100 100 100 100 100
When I operate, I get the all np.nan
as cited in the question
df0 + s2
a b c d e 0 1 2 3 4
range
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
This does not produce what we wanted. Because Pandas is aligning the index
of s2
with the columns
of df0
. The columns
of the result includes a union of the index
of s2
and the columns
of df0
.
We could fake it out with tricky transposition
(df0.T + s2).T
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
But it turns out Pandas has a better solution. There are operation methods that allow us to pass an axis
argument to specify the axis to align with.
-
sub
+
add
*
mul
/
div
**
pow
And so the answer is simply
df0.add(s2, axis='index')
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Turns out axis='index'
is synomynous with axis=0
.
As is axis='columns'
synomynous with axis=1
df0.add(s2, axis=0)
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Rest of the operations
df0.sub(s2, axis=0)
lower a b c d e
range
0 50 50 50 50 50
1 58 58 58 58 58
2 66 66 66 66 66
3 74 74 74 74 74
4 82 82 82 82 82
df0.mul(s2, axis=0)
lower a b c d e
range
0 5000 5000 5000 5000 5000
1 4200 4200 4200 4200 4200
2 3400 3400 3400 3400 3400
3 2600 2600 2600 2600 2600
4 1800 1800 1800 1800 1800
df0.div(s2, axis=0)
lower a b c d e
range
0 2.000000 2.000000 2.000000 2.000000 2.000000
1 2.380952 2.380952 2.380952 2.380952 2.380952
2 2.941176 2.941176 2.941176 2.941176 2.941176
3 3.846154 3.846154 3.846154 3.846154 3.846154
4 5.555556 5.555556 5.555556 5.555556 5.555556
df0.pow(1 / s2, axis=0)
lower a b c d e
range
0 1.096478 1.096478 1.096478 1.096478 1.096478
1 1.115884 1.115884 1.115884 1.115884 1.115884
2 1.145048 1.145048 1.145048 1.145048 1.145048
3 1.193777 1.193777 1.193777 1.193777 1.193777
4 1.291550 1.291550 1.291550 1.291550 1.291550
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
add a comment |Â
up vote
2
down vote
I prefer the method mentioned by @piSquared (i.e. df.add(s, axis=0)), but another method uses apply
together with lambda
to perform an action on each column in the dataframe:
>>>> df.apply(lambda col: col + s)
a b c
0 4 5 6
1 18 19 20
To apply the lambda function to the rows, use axis=1
:
>>> df.T.apply(lambda row: row + s, axis=1)
0 1
a 4 18
b 5 19
c 6 20
This method could be useful when the transformation is more complex, e.g.:
df.apply(lambda col: 0.5 * col ** 2 + 2 * s - 3)
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
8
down vote
Please bear the preamble. It's important to address some higher level concepts first. Since my motivation is to share knowledge and teach, I wanted to make this as clear as possible.
It is helpful to create a mental model of what Series
and DataFrame
objects are.
Anatomy of a Series
A Series
should be thought of as an enhanced dictionary. This isn't always a perfect analogy, but we'll start here. Also, there are other analogies that you can make but I am targetting a dictionary in order to demonstrate the purpose of this post.
index
These are the keys that we can reference to get at the corresponding values. When the elements of the index are unique, the comparison to a dictionary becomes very close.
values
These are the corresponding values that are keyed by the index.
Anatomy of a DataFrame
A DataFrame
should be thought of as a dictionary of Series
or a Series
of Series
. In this case the keys are the column names and the values are the columns themselves as Series
objects. Each Series
agrees to share the same index
which is the index of the DataFrame
.
columns
These are the keys that we can reference to get at the corresponding Series
.
index
This the the index that all of the Series
values agree to share.
Note: RE: columns
and index
objects
They are the same kind of things. A DataFrame
s index
can be used as another DataFrame
s columns
. In fact, this happens when you do df.T
to get a transpose.
values
This is a 2 dimensional array that contains the data in a DataFrame
. The reality is that values
is NOT what is stored inside the DataFrame
object. (Well sometimes it is, but I'm not about to try to describe the block mananager). The point is, it is better to think of this as access to a 2 dimensional array of the data.
Define Sample Data
These are sample pandas.Index
objects that can be used as the index
of a Series
or DataFrame
or can be used as the columns
of a DataFrame
idx_lower = pd.Index([*'abcde'], name='lower')
idx_range = pd.RangeIndex(5, name='range')
These are sample pandas.Series
objects that use the pandas.Index
objects above
s0 = pd.Series(range(10, 15), idx_lower)
s1 = pd.Series(range(30, 40, 2), idx_lower)
s2 = pd.Series(range(50, 10, -8), idx_range)
These are sample pandas.DataFrame
objects that use the pandas.Index
objects above
df0 = pd.DataFrame(100, index=idx_range, columns=idx_lower)
df1 = pd.DataFrame(
np.arange(np.product(df0.shape)).reshape(df0.shape),
index=idx_range, columns=idx_lower
)
Series
on Series
When operating on two Series
, the alignment is obvious. You align the index
of one Series
with the index
of the other.
s1 + s0
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
Which is the same as when I randomly shuffle one before I operate. The indices will sitll align.
s1 + s0.sample(frac=1)
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
And is NOT the case when instead I operate with the values of the shuffled Series
. In this case, Pandas doesn't have the index
to align with and therefore operates from a positions.
s1 + s0.sample(frac=1).values
lower
a 42
b 42
c 47
d 50
e 49
dtype: int64
Add a scalar
s1 + 1
lower
a 31
b 33
c 35
d 37
e 39
dtype: int64
DataFrame
on DataFrame
Similar is true when operating between two DataFrame
s
The alignment is obvious and does what we think it should do
df0 + df1
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Shuffle second DataFrame
on both axes. The index
and columns
will still align and give us the same thing.
df0 + df1.sample(frac=1).sample(frac=1, axis=1)
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Same shuffling but add the array and not the DataFrame
. No longer aligned and will get different results.
df0 + df1.sample(frac=1).sample(frac=1, axis=1).values
lower a b c d e
range
0 123 124 121 122 120
1 118 119 116 117 115
2 108 109 106 107 105
3 103 104 101 102 100
4 113 114 111 112 110
Add 1 dimensional array. Will align with columns and broadcast across rows.
df0 + [*range(2, df0.shape[1] + 2)]
lower a b c d e
range
0 102 103 104 105 106
1 102 103 104 105 106
2 102 103 104 105 106
3 102 103 104 105 106
4 102 103 104 105 106
Add a scalar. Nothing to align with so broadcasts to everything
df0 + 1
lower a b c d e
range
0 101 101 101 101 101
1 101 101 101 101 101
2 101 101 101 101 101
3 101 101 101 101 101
4 101 101 101 101 101
DataFrame
on Series
If DataFrame
s are to be though of as dictionaries of Series
and Series
are to be thought of as dictionaries of values, then it is natural that when operating between a DataFrame
and Series
that they should be aligned by their "keys".
s0:
lower a b c d e
10 11 12 13 14
df0:
lower a b c d e
range
0 100 100 100 100 100
1 100 100 100 100 100
2 100 100 100 100 100
3 100 100 100 100 100
4 100 100 100 100 100
And when we operate, the 10
in s0['a']
gets added to the entire column of df0['a']
df0 + s0
lower a b c d e
range
0 110 111 112 113 114
1 110 111 112 113 114
2 110 111 112 113 114
3 110 111 112 113 114
4 110 111 112 113 114
Heart of the issue and point of the post
What about if I want s2
and df0
?
s2: df0:
| lower a b c d e
range | range
0 50 | 0 100 100 100 100 100
1 42 | 1 100 100 100 100 100
2 34 | 2 100 100 100 100 100
3 26 | 3 100 100 100 100 100
4 18 | 4 100 100 100 100 100
When I operate, I get the all np.nan
as cited in the question
df0 + s2
a b c d e 0 1 2 3 4
range
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
This does not produce what we wanted. Because Pandas is aligning the index
of s2
with the columns
of df0
. The columns
of the result includes a union of the index
of s2
and the columns
of df0
.
We could fake it out with tricky transposition
(df0.T + s2).T
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
But it turns out Pandas has a better solution. There are operation methods that allow us to pass an axis
argument to specify the axis to align with.
-
sub
+
add
*
mul
/
div
**
pow
And so the answer is simply
df0.add(s2, axis='index')
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Turns out axis='index'
is synomynous with axis=0
.
As is axis='columns'
synomynous with axis=1
df0.add(s2, axis=0)
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Rest of the operations
df0.sub(s2, axis=0)
lower a b c d e
range
0 50 50 50 50 50
1 58 58 58 58 58
2 66 66 66 66 66
3 74 74 74 74 74
4 82 82 82 82 82
df0.mul(s2, axis=0)
lower a b c d e
range
0 5000 5000 5000 5000 5000
1 4200 4200 4200 4200 4200
2 3400 3400 3400 3400 3400
3 2600 2600 2600 2600 2600
4 1800 1800 1800 1800 1800
df0.div(s2, axis=0)
lower a b c d e
range
0 2.000000 2.000000 2.000000 2.000000 2.000000
1 2.380952 2.380952 2.380952 2.380952 2.380952
2 2.941176 2.941176 2.941176 2.941176 2.941176
3 3.846154 3.846154 3.846154 3.846154 3.846154
4 5.555556 5.555556 5.555556 5.555556 5.555556
df0.pow(1 / s2, axis=0)
lower a b c d e
range
0 1.096478 1.096478 1.096478 1.096478 1.096478
1 1.115884 1.115884 1.115884 1.115884 1.115884
2 1.145048 1.145048 1.145048 1.145048 1.145048
3 1.193777 1.193777 1.193777 1.193777 1.193777
4 1.291550 1.291550 1.291550 1.291550 1.291550
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
add a comment |Â
up vote
8
down vote
Please bear the preamble. It's important to address some higher level concepts first. Since my motivation is to share knowledge and teach, I wanted to make this as clear as possible.
It is helpful to create a mental model of what Series
and DataFrame
objects are.
Anatomy of a Series
A Series
should be thought of as an enhanced dictionary. This isn't always a perfect analogy, but we'll start here. Also, there are other analogies that you can make but I am targetting a dictionary in order to demonstrate the purpose of this post.
index
These are the keys that we can reference to get at the corresponding values. When the elements of the index are unique, the comparison to a dictionary becomes very close.
values
These are the corresponding values that are keyed by the index.
Anatomy of a DataFrame
A DataFrame
should be thought of as a dictionary of Series
or a Series
of Series
. In this case the keys are the column names and the values are the columns themselves as Series
objects. Each Series
agrees to share the same index
which is the index of the DataFrame
.
columns
These are the keys that we can reference to get at the corresponding Series
.
index
This the the index that all of the Series
values agree to share.
Note: RE: columns
and index
objects
They are the same kind of things. A DataFrame
s index
can be used as another DataFrame
s columns
. In fact, this happens when you do df.T
to get a transpose.
values
This is a 2 dimensional array that contains the data in a DataFrame
. The reality is that values
is NOT what is stored inside the DataFrame
object. (Well sometimes it is, but I'm not about to try to describe the block mananager). The point is, it is better to think of this as access to a 2 dimensional array of the data.
Define Sample Data
These are sample pandas.Index
objects that can be used as the index
of a Series
or DataFrame
or can be used as the columns
of a DataFrame
idx_lower = pd.Index([*'abcde'], name='lower')
idx_range = pd.RangeIndex(5, name='range')
These are sample pandas.Series
objects that use the pandas.Index
objects above
s0 = pd.Series(range(10, 15), idx_lower)
s1 = pd.Series(range(30, 40, 2), idx_lower)
s2 = pd.Series(range(50, 10, -8), idx_range)
These are sample pandas.DataFrame
objects that use the pandas.Index
objects above
df0 = pd.DataFrame(100, index=idx_range, columns=idx_lower)
df1 = pd.DataFrame(
np.arange(np.product(df0.shape)).reshape(df0.shape),
index=idx_range, columns=idx_lower
)
Series
on Series
When operating on two Series
, the alignment is obvious. You align the index
of one Series
with the index
of the other.
s1 + s0
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
Which is the same as when I randomly shuffle one before I operate. The indices will sitll align.
s1 + s0.sample(frac=1)
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
And is NOT the case when instead I operate with the values of the shuffled Series
. In this case, Pandas doesn't have the index
to align with and therefore operates from a positions.
s1 + s0.sample(frac=1).values
lower
a 42
b 42
c 47
d 50
e 49
dtype: int64
Add a scalar
s1 + 1
lower
a 31
b 33
c 35
d 37
e 39
dtype: int64
DataFrame
on DataFrame
Similar is true when operating between two DataFrame
s
The alignment is obvious and does what we think it should do
df0 + df1
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Shuffle second DataFrame
on both axes. The index
and columns
will still align and give us the same thing.
df0 + df1.sample(frac=1).sample(frac=1, axis=1)
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Same shuffling but add the array and not the DataFrame
. No longer aligned and will get different results.
df0 + df1.sample(frac=1).sample(frac=1, axis=1).values
lower a b c d e
range
0 123 124 121 122 120
1 118 119 116 117 115
2 108 109 106 107 105
3 103 104 101 102 100
4 113 114 111 112 110
Add 1 dimensional array. Will align with columns and broadcast across rows.
df0 + [*range(2, df0.shape[1] + 2)]
lower a b c d e
range
0 102 103 104 105 106
1 102 103 104 105 106
2 102 103 104 105 106
3 102 103 104 105 106
4 102 103 104 105 106
Add a scalar. Nothing to align with so broadcasts to everything
df0 + 1
lower a b c d e
range
0 101 101 101 101 101
1 101 101 101 101 101
2 101 101 101 101 101
3 101 101 101 101 101
4 101 101 101 101 101
DataFrame
on Series
If DataFrame
s are to be though of as dictionaries of Series
and Series
are to be thought of as dictionaries of values, then it is natural that when operating between a DataFrame
and Series
that they should be aligned by their "keys".
s0:
lower a b c d e
10 11 12 13 14
df0:
lower a b c d e
range
0 100 100 100 100 100
1 100 100 100 100 100
2 100 100 100 100 100
3 100 100 100 100 100
4 100 100 100 100 100
And when we operate, the 10
in s0['a']
gets added to the entire column of df0['a']
df0 + s0
lower a b c d e
range
0 110 111 112 113 114
1 110 111 112 113 114
2 110 111 112 113 114
3 110 111 112 113 114
4 110 111 112 113 114
Heart of the issue and point of the post
What about if I want s2
and df0
?
s2: df0:
| lower a b c d e
range | range
0 50 | 0 100 100 100 100 100
1 42 | 1 100 100 100 100 100
2 34 | 2 100 100 100 100 100
3 26 | 3 100 100 100 100 100
4 18 | 4 100 100 100 100 100
When I operate, I get the all np.nan
as cited in the question
df0 + s2
a b c d e 0 1 2 3 4
range
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
This does not produce what we wanted. Because Pandas is aligning the index
of s2
with the columns
of df0
. The columns
of the result includes a union of the index
of s2
and the columns
of df0
.
We could fake it out with tricky transposition
(df0.T + s2).T
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
But it turns out Pandas has a better solution. There are operation methods that allow us to pass an axis
argument to specify the axis to align with.
-
sub
+
add
*
mul
/
div
**
pow
And so the answer is simply
df0.add(s2, axis='index')
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Turns out axis='index'
is synomynous with axis=0
.
As is axis='columns'
synomynous with axis=1
df0.add(s2, axis=0)
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Rest of the operations
df0.sub(s2, axis=0)
lower a b c d e
range
0 50 50 50 50 50
1 58 58 58 58 58
2 66 66 66 66 66
3 74 74 74 74 74
4 82 82 82 82 82
df0.mul(s2, axis=0)
lower a b c d e
range
0 5000 5000 5000 5000 5000
1 4200 4200 4200 4200 4200
2 3400 3400 3400 3400 3400
3 2600 2600 2600 2600 2600
4 1800 1800 1800 1800 1800
df0.div(s2, axis=0)
lower a b c d e
range
0 2.000000 2.000000 2.000000 2.000000 2.000000
1 2.380952 2.380952 2.380952 2.380952 2.380952
2 2.941176 2.941176 2.941176 2.941176 2.941176
3 3.846154 3.846154 3.846154 3.846154 3.846154
4 5.555556 5.555556 5.555556 5.555556 5.555556
df0.pow(1 / s2, axis=0)
lower a b c d e
range
0 1.096478 1.096478 1.096478 1.096478 1.096478
1 1.115884 1.115884 1.115884 1.115884 1.115884
2 1.145048 1.145048 1.145048 1.145048 1.145048
3 1.193777 1.193777 1.193777 1.193777 1.193777
4 1.291550 1.291550 1.291550 1.291550 1.291550
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
add a comment |Â
up vote
8
down vote
up vote
8
down vote
Please bear the preamble. It's important to address some higher level concepts first. Since my motivation is to share knowledge and teach, I wanted to make this as clear as possible.
It is helpful to create a mental model of what Series
and DataFrame
objects are.
Anatomy of a Series
A Series
should be thought of as an enhanced dictionary. This isn't always a perfect analogy, but we'll start here. Also, there are other analogies that you can make but I am targetting a dictionary in order to demonstrate the purpose of this post.
index
These are the keys that we can reference to get at the corresponding values. When the elements of the index are unique, the comparison to a dictionary becomes very close.
values
These are the corresponding values that are keyed by the index.
Anatomy of a DataFrame
A DataFrame
should be thought of as a dictionary of Series
or a Series
of Series
. In this case the keys are the column names and the values are the columns themselves as Series
objects. Each Series
agrees to share the same index
which is the index of the DataFrame
.
columns
These are the keys that we can reference to get at the corresponding Series
.
index
This the the index that all of the Series
values agree to share.
Note: RE: columns
and index
objects
They are the same kind of things. A DataFrame
s index
can be used as another DataFrame
s columns
. In fact, this happens when you do df.T
to get a transpose.
values
This is a 2 dimensional array that contains the data in a DataFrame
. The reality is that values
is NOT what is stored inside the DataFrame
object. (Well sometimes it is, but I'm not about to try to describe the block mananager). The point is, it is better to think of this as access to a 2 dimensional array of the data.
Define Sample Data
These are sample pandas.Index
objects that can be used as the index
of a Series
or DataFrame
or can be used as the columns
of a DataFrame
idx_lower = pd.Index([*'abcde'], name='lower')
idx_range = pd.RangeIndex(5, name='range')
These are sample pandas.Series
objects that use the pandas.Index
objects above
s0 = pd.Series(range(10, 15), idx_lower)
s1 = pd.Series(range(30, 40, 2), idx_lower)
s2 = pd.Series(range(50, 10, -8), idx_range)
These are sample pandas.DataFrame
objects that use the pandas.Index
objects above
df0 = pd.DataFrame(100, index=idx_range, columns=idx_lower)
df1 = pd.DataFrame(
np.arange(np.product(df0.shape)).reshape(df0.shape),
index=idx_range, columns=idx_lower
)
Series
on Series
When operating on two Series
, the alignment is obvious. You align the index
of one Series
with the index
of the other.
s1 + s0
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
Which is the same as when I randomly shuffle one before I operate. The indices will sitll align.
s1 + s0.sample(frac=1)
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
And is NOT the case when instead I operate with the values of the shuffled Series
. In this case, Pandas doesn't have the index
to align with and therefore operates from a positions.
s1 + s0.sample(frac=1).values
lower
a 42
b 42
c 47
d 50
e 49
dtype: int64
Add a scalar
s1 + 1
lower
a 31
b 33
c 35
d 37
e 39
dtype: int64
DataFrame
on DataFrame
Similar is true when operating between two DataFrame
s
The alignment is obvious and does what we think it should do
df0 + df1
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Shuffle second DataFrame
on both axes. The index
and columns
will still align and give us the same thing.
df0 + df1.sample(frac=1).sample(frac=1, axis=1)
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Same shuffling but add the array and not the DataFrame
. No longer aligned and will get different results.
df0 + df1.sample(frac=1).sample(frac=1, axis=1).values
lower a b c d e
range
0 123 124 121 122 120
1 118 119 116 117 115
2 108 109 106 107 105
3 103 104 101 102 100
4 113 114 111 112 110
Add 1 dimensional array. Will align with columns and broadcast across rows.
df0 + [*range(2, df0.shape[1] + 2)]
lower a b c d e
range
0 102 103 104 105 106
1 102 103 104 105 106
2 102 103 104 105 106
3 102 103 104 105 106
4 102 103 104 105 106
Add a scalar. Nothing to align with so broadcasts to everything
df0 + 1
lower a b c d e
range
0 101 101 101 101 101
1 101 101 101 101 101
2 101 101 101 101 101
3 101 101 101 101 101
4 101 101 101 101 101
DataFrame
on Series
If DataFrame
s are to be though of as dictionaries of Series
and Series
are to be thought of as dictionaries of values, then it is natural that when operating between a DataFrame
and Series
that they should be aligned by their "keys".
s0:
lower a b c d e
10 11 12 13 14
df0:
lower a b c d e
range
0 100 100 100 100 100
1 100 100 100 100 100
2 100 100 100 100 100
3 100 100 100 100 100
4 100 100 100 100 100
And when we operate, the 10
in s0['a']
gets added to the entire column of df0['a']
df0 + s0
lower a b c d e
range
0 110 111 112 113 114
1 110 111 112 113 114
2 110 111 112 113 114
3 110 111 112 113 114
4 110 111 112 113 114
Heart of the issue and point of the post
What about if I want s2
and df0
?
s2: df0:
| lower a b c d e
range | range
0 50 | 0 100 100 100 100 100
1 42 | 1 100 100 100 100 100
2 34 | 2 100 100 100 100 100
3 26 | 3 100 100 100 100 100
4 18 | 4 100 100 100 100 100
When I operate, I get the all np.nan
as cited in the question
df0 + s2
a b c d e 0 1 2 3 4
range
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
This does not produce what we wanted. Because Pandas is aligning the index
of s2
with the columns
of df0
. The columns
of the result includes a union of the index
of s2
and the columns
of df0
.
We could fake it out with tricky transposition
(df0.T + s2).T
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
But it turns out Pandas has a better solution. There are operation methods that allow us to pass an axis
argument to specify the axis to align with.
-
sub
+
add
*
mul
/
div
**
pow
And so the answer is simply
df0.add(s2, axis='index')
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Turns out axis='index'
is synomynous with axis=0
.
As is axis='columns'
synomynous with axis=1
df0.add(s2, axis=0)
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Rest of the operations
df0.sub(s2, axis=0)
lower a b c d e
range
0 50 50 50 50 50
1 58 58 58 58 58
2 66 66 66 66 66
3 74 74 74 74 74
4 82 82 82 82 82
df0.mul(s2, axis=0)
lower a b c d e
range
0 5000 5000 5000 5000 5000
1 4200 4200 4200 4200 4200
2 3400 3400 3400 3400 3400
3 2600 2600 2600 2600 2600
4 1800 1800 1800 1800 1800
df0.div(s2, axis=0)
lower a b c d e
range
0 2.000000 2.000000 2.000000 2.000000 2.000000
1 2.380952 2.380952 2.380952 2.380952 2.380952
2 2.941176 2.941176 2.941176 2.941176 2.941176
3 3.846154 3.846154 3.846154 3.846154 3.846154
4 5.555556 5.555556 5.555556 5.555556 5.555556
df0.pow(1 / s2, axis=0)
lower a b c d e
range
0 1.096478 1.096478 1.096478 1.096478 1.096478
1 1.115884 1.115884 1.115884 1.115884 1.115884
2 1.145048 1.145048 1.145048 1.145048 1.145048
3 1.193777 1.193777 1.193777 1.193777 1.193777
4 1.291550 1.291550 1.291550 1.291550 1.291550
Please bear the preamble. It's important to address some higher level concepts first. Since my motivation is to share knowledge and teach, I wanted to make this as clear as possible.
It is helpful to create a mental model of what Series
and DataFrame
objects are.
Anatomy of a Series
A Series
should be thought of as an enhanced dictionary. This isn't always a perfect analogy, but we'll start here. Also, there are other analogies that you can make but I am targetting a dictionary in order to demonstrate the purpose of this post.
index
These are the keys that we can reference to get at the corresponding values. When the elements of the index are unique, the comparison to a dictionary becomes very close.
values
These are the corresponding values that are keyed by the index.
Anatomy of a DataFrame
A DataFrame
should be thought of as a dictionary of Series
or a Series
of Series
. In this case the keys are the column names and the values are the columns themselves as Series
objects. Each Series
agrees to share the same index
which is the index of the DataFrame
.
columns
These are the keys that we can reference to get at the corresponding Series
.
index
This the the index that all of the Series
values agree to share.
Note: RE: columns
and index
objects
They are the same kind of things. A DataFrame
s index
can be used as another DataFrame
s columns
. In fact, this happens when you do df.T
to get a transpose.
values
This is a 2 dimensional array that contains the data in a DataFrame
. The reality is that values
is NOT what is stored inside the DataFrame
object. (Well sometimes it is, but I'm not about to try to describe the block mananager). The point is, it is better to think of this as access to a 2 dimensional array of the data.
Define Sample Data
These are sample pandas.Index
objects that can be used as the index
of a Series
or DataFrame
or can be used as the columns
of a DataFrame
idx_lower = pd.Index([*'abcde'], name='lower')
idx_range = pd.RangeIndex(5, name='range')
These are sample pandas.Series
objects that use the pandas.Index
objects above
s0 = pd.Series(range(10, 15), idx_lower)
s1 = pd.Series(range(30, 40, 2), idx_lower)
s2 = pd.Series(range(50, 10, -8), idx_range)
These are sample pandas.DataFrame
objects that use the pandas.Index
objects above
df0 = pd.DataFrame(100, index=idx_range, columns=idx_lower)
df1 = pd.DataFrame(
np.arange(np.product(df0.shape)).reshape(df0.shape),
index=idx_range, columns=idx_lower
)
Series
on Series
When operating on two Series
, the alignment is obvious. You align the index
of one Series
with the index
of the other.
s1 + s0
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
Which is the same as when I randomly shuffle one before I operate. The indices will sitll align.
s1 + s0.sample(frac=1)
lower
a 40
b 43
c 46
d 49
e 52
dtype: int64
And is NOT the case when instead I operate with the values of the shuffled Series
. In this case, Pandas doesn't have the index
to align with and therefore operates from a positions.
s1 + s0.sample(frac=1).values
lower
a 42
b 42
c 47
d 50
e 49
dtype: int64
Add a scalar
s1 + 1
lower
a 31
b 33
c 35
d 37
e 39
dtype: int64
DataFrame
on DataFrame
Similar is true when operating between two DataFrame
s
The alignment is obvious and does what we think it should do
df0 + df1
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Shuffle second DataFrame
on both axes. The index
and columns
will still align and give us the same thing.
df0 + df1.sample(frac=1).sample(frac=1, axis=1)
lower a b c d e
range
0 100 101 102 103 104
1 105 106 107 108 109
2 110 111 112 113 114
3 115 116 117 118 119
4 120 121 122 123 124
Same shuffling but add the array and not the DataFrame
. No longer aligned and will get different results.
df0 + df1.sample(frac=1).sample(frac=1, axis=1).values
lower a b c d e
range
0 123 124 121 122 120
1 118 119 116 117 115
2 108 109 106 107 105
3 103 104 101 102 100
4 113 114 111 112 110
Add 1 dimensional array. Will align with columns and broadcast across rows.
df0 + [*range(2, df0.shape[1] + 2)]
lower a b c d e
range
0 102 103 104 105 106
1 102 103 104 105 106
2 102 103 104 105 106
3 102 103 104 105 106
4 102 103 104 105 106
Add a scalar. Nothing to align with so broadcasts to everything
df0 + 1
lower a b c d e
range
0 101 101 101 101 101
1 101 101 101 101 101
2 101 101 101 101 101
3 101 101 101 101 101
4 101 101 101 101 101
DataFrame
on Series
If DataFrame
s are to be though of as dictionaries of Series
and Series
are to be thought of as dictionaries of values, then it is natural that when operating between a DataFrame
and Series
that they should be aligned by their "keys".
s0:
lower a b c d e
10 11 12 13 14
df0:
lower a b c d e
range
0 100 100 100 100 100
1 100 100 100 100 100
2 100 100 100 100 100
3 100 100 100 100 100
4 100 100 100 100 100
And when we operate, the 10
in s0['a']
gets added to the entire column of df0['a']
df0 + s0
lower a b c d e
range
0 110 111 112 113 114
1 110 111 112 113 114
2 110 111 112 113 114
3 110 111 112 113 114
4 110 111 112 113 114
Heart of the issue and point of the post
What about if I want s2
and df0
?
s2: df0:
| lower a b c d e
range | range
0 50 | 0 100 100 100 100 100
1 42 | 1 100 100 100 100 100
2 34 | 2 100 100 100 100 100
3 26 | 3 100 100 100 100 100
4 18 | 4 100 100 100 100 100
When I operate, I get the all np.nan
as cited in the question
df0 + s2
a b c d e 0 1 2 3 4
range
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
This does not produce what we wanted. Because Pandas is aligning the index
of s2
with the columns
of df0
. The columns
of the result includes a union of the index
of s2
and the columns
of df0
.
We could fake it out with tricky transposition
(df0.T + s2).T
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
But it turns out Pandas has a better solution. There are operation methods that allow us to pass an axis
argument to specify the axis to align with.
-
sub
+
add
*
mul
/
div
**
pow
And so the answer is simply
df0.add(s2, axis='index')
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Turns out axis='index'
is synomynous with axis=0
.
As is axis='columns'
synomynous with axis=1
df0.add(s2, axis=0)
lower a b c d e
range
0 150 150 150 150 150
1 142 142 142 142 142
2 134 134 134 134 134
3 126 126 126 126 126
4 118 118 118 118 118
Rest of the operations
df0.sub(s2, axis=0)
lower a b c d e
range
0 50 50 50 50 50
1 58 58 58 58 58
2 66 66 66 66 66
3 74 74 74 74 74
4 82 82 82 82 82
df0.mul(s2, axis=0)
lower a b c d e
range
0 5000 5000 5000 5000 5000
1 4200 4200 4200 4200 4200
2 3400 3400 3400 3400 3400
3 2600 2600 2600 2600 2600
4 1800 1800 1800 1800 1800
df0.div(s2, axis=0)
lower a b c d e
range
0 2.000000 2.000000 2.000000 2.000000 2.000000
1 2.380952 2.380952 2.380952 2.380952 2.380952
2 2.941176 2.941176 2.941176 2.941176 2.941176
3 3.846154 3.846154 3.846154 3.846154 3.846154
4 5.555556 5.555556 5.555556 5.555556 5.555556
df0.pow(1 / s2, axis=0)
lower a b c d e
range
0 1.096478 1.096478 1.096478 1.096478 1.096478
1 1.115884 1.115884 1.115884 1.115884 1.115884
2 1.145048 1.145048 1.145048 1.145048 1.145048
3 1.193777 1.193777 1.193777 1.193777 1.193777
4 1.291550 1.291550 1.291550 1.291550 1.291550
edited 6 mins ago
answered 2 hours ago
piRSquared
147k21130264
147k21130264
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
add a comment |Â
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
1
1
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
Another good resource for me to mark dup for future questions . :-)
â W-B
8 mins ago
add a comment |Â
up vote
2
down vote
I prefer the method mentioned by @piSquared (i.e. df.add(s, axis=0)), but another method uses apply
together with lambda
to perform an action on each column in the dataframe:
>>>> df.apply(lambda col: col + s)
a b c
0 4 5 6
1 18 19 20
To apply the lambda function to the rows, use axis=1
:
>>> df.T.apply(lambda row: row + s, axis=1)
0 1
a 4 18
b 5 19
c 6 20
This method could be useful when the transformation is more complex, e.g.:
df.apply(lambda col: 0.5 * col ** 2 + 2 * s - 3)
add a comment |Â
up vote
2
down vote
I prefer the method mentioned by @piSquared (i.e. df.add(s, axis=0)), but another method uses apply
together with lambda
to perform an action on each column in the dataframe:
>>>> df.apply(lambda col: col + s)
a b c
0 4 5 6
1 18 19 20
To apply the lambda function to the rows, use axis=1
:
>>> df.T.apply(lambda row: row + s, axis=1)
0 1
a 4 18
b 5 19
c 6 20
This method could be useful when the transformation is more complex, e.g.:
df.apply(lambda col: 0.5 * col ** 2 + 2 * s - 3)
add a comment |Â
up vote
2
down vote
up vote
2
down vote
I prefer the method mentioned by @piSquared (i.e. df.add(s, axis=0)), but another method uses apply
together with lambda
to perform an action on each column in the dataframe:
>>>> df.apply(lambda col: col + s)
a b c
0 4 5 6
1 18 19 20
To apply the lambda function to the rows, use axis=1
:
>>> df.T.apply(lambda row: row + s, axis=1)
0 1
a 4 18
b 5 19
c 6 20
This method could be useful when the transformation is more complex, e.g.:
df.apply(lambda col: 0.5 * col ** 2 + 2 * s - 3)
I prefer the method mentioned by @piSquared (i.e. df.add(s, axis=0)), but another method uses apply
together with lambda
to perform an action on each column in the dataframe:
>>>> df.apply(lambda col: col + s)
a b c
0 4 5 6
1 18 19 20
To apply the lambda function to the rows, use axis=1
:
>>> df.T.apply(lambda row: row + s, axis=1)
0 1
a 4 18
b 5 19
c 6 20
This method could be useful when the transformation is more complex, e.g.:
df.apply(lambda col: 0.5 * col ** 2 + 2 * s - 3)
edited 2 hours ago
answered 2 hours ago
Alexander
51k1082117
51k1082117
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53217607%2fhow-do-i-operate-on-a-dataframe-with-a-series-for-every-column%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password