Python pandas DataFrame.apply modifies both original and duplicate DataFrames

Question

I am having trouble modifying a duplicated pandas DataFrame without the changes also showing up in the original DataFrame.

Here is an example. Let's say I create an arbitrary DataFrame from a list of dictionaries:

In [66]: from pandas import DataFrame  # import assumed; not shown in the original session

In [67]: d = [{'a':3, 'b':5}, {'a':1, 'b':1}]

In [68]: d = DataFrame(d)

In [69]: d

Out[69]: 
   a  b
0  3  5
1  1  1

Then I assign the DataFrame 'd' to the variable 'e' and apply some arbitrary math to column 'a' using apply:

In [70]: e = d

In [71]: e['a'] = e['a'].apply(lambda x: x + 1)

The problem is that the function passed to apply seems to affect both the duplicate DataFrame 'e' and the original DataFrame 'd', which I cannot figure out for the life of me:

In [72]: e # duplicate DataFrame
Out[72]: 
   a  b
0  4  5
1  2  1

In [73]: d # original DataFrame, notice the alterations to frame 'e' were also applied
Out[73]:  
   a  b
0  4  5
1  2  1

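I have searched the pandas documentation and Google for the reason, but to no avail. I cannot understand what is happening here.

I also tried doing the math with a list comprehension (e.g., e['a'] = [i + 1 for i in e['a']]), with the same result. Is there something about pandas DataFrames that I am missing?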

python pandas

MikeGruz 01 . '12 4:40

BrenBarn · Accepted Answer · 2012-06-01T05:13:47+0000

This is not a pandas thing; it is how Python works:

>>> a = [1,2,3]
>>> b = a
>>> b[0] = 'WHOA!'
>>> a
['WHOA!', 2, 3]

If you want a copy of the DataFrame, use e = d.copy().
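
A minimal sketch of the d.copy() approach, applied to the DataFrame from the question. It assumes pandas is imported as pd, which is a convention chosen here rather than something the original session shows. With its default deep=True, copy() gives e its own data, so the change to column 'a' should stay in e:

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame([{'a': 3, 'b': 5}, {'a': 1, 'b': 1}])

# copy() returns a new DataFrame with its own data (deep=True is the default).
e = d.copy()
e['a'] = e['a'].apply(lambda x: x + 1)

print(e['a'].tolist())  # [4, 2]
print(d['a'].tolist())  # [3, 1], the original is unchanged
print(e is d)           # False, two distinct objects
```

By contrast, after a plain e = d, the expression e is d evaluates to True, which is why both names showed the modified column.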

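Note: assignment to a bare name never copies anything; it only binds the name to an existing object. Any other kind of assignment (e.g., a[1] = x or a.foo = bar) operates on the object that a refers to.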
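
To make the difference between binding a name and modifying an object concrete, here is a small sketch with plain lists. The names b and c and the use of list(a) for an explicit shallow copy are illustrative choices, not part of the original answer:

```python
a = [1, 2, 3]

b = a             # binds a second name to the same list; nothing is copied
print(b is a)     # True

b[0] = 'WHOA!'    # item assignment modifies the shared list object
print(a)          # ['WHOA!', 2, 3]

b = [7, 8, 9]     # rebinding the bare name b does not touch the old list
print(a)          # ['WHOA!', 2, 3]
print(b is a)     # False

c = list(a)       # an explicit (shallow) copy is a separate object
c[1] = 'changed'
print(a)          # ['WHOA!', 2, 3]
```

The is operator checks object identity, so it is a quick way to tell whether two names refer to the same object or to separate copies.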
