I'm having a small problem duplicating a pandas DataFrame so that changes made to the duplicate do not also show up in the original DataFrame.
Here is an example. Let's say I create an arbitrary DataFrame from a list of dictionaries:
In [67]: d = [{'a':3, 'b':5}, {'a':1, 'b':1}]
In [68]: d = DataFrame(d)
In [69]: d
Out[69]:
   a  b
0  3  5
1  1  1
Then I assign the 'd' dataframe to the variable 'e' and apply some arbitrary math to the column 'a' using apply:
In [70]: e = d
In [71]: e['a'] = e['a'].apply(lambda x: x + 1)
The problem is that the applied function seems to affect both the duplicate DataFrame 'e' and the original DataFrame 'd', which I cannot for the life of me figure out:
In [72]: e # duplicate DataFrame
Out[72]:
   a  b
0  4  5
1  2  1
In [73]: d # original DataFrame, notice the alterations to frame 'e' were also applied
Out[73]:
   a  b
0  4  5
1  2  1
I've searched the pandas documentation and Google for the reason why this happens, but to no avail. I can't understand what is going on here.
I also tried a list comprehension (i.e., e['a'] = [i + 1 for i in e['a']]), with the same result. Is this behavior specific to pandas DataFrames?
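
For anyone reproducing this, here is a minimal sketch of what appears to be going on. It assumes the cause is ordinary Python name binding (e = d gives the same object a second name rather than creating a new DataFrame) and that DataFrame.copy() is the intended way to get an independent duplicate. The variable 'f' is only introduced for illustration and is not part of the session above.

```python
import pandas as pd

d = pd.DataFrame([{'a': 3, 'b': 5}, {'a': 1, 'b': 1}])

# Plain assignment does not duplicate anything: 'e' and 'd' are
# two names bound to the same DataFrame object.
e = d
print(e is d)  # True

# copy() returns a new DataFrame with its own data (deep copy by default),
# so modifying 'f' leaves 'd' untouched. 'f' is a hypothetical name.
f = d.copy()
f['a'] = f['a'].apply(lambda x: x + 1)

print(d['a'].tolist())  # [3, 1]
print(f['a'].tolist())  # [4, 2]
```

If that assumption holds, replacing In [70] with e = d.copy() should leave 'd' unchanged after the apply step.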