I have been trying to resolve this issue for a few days and have not received an answer. My earlier question was messy, so let me rewrite it to show what I want.
So far, I think the problem is related to loading the data from a CSV file. I ran another test: the code that loads the data from the CSV still does not work, but the same logic works when I write the data out by hand as a DataFrame. This is the code:
import pandas as pd
import numpy as np
data = pd.read_csv('dffile.csv', index_col=0)
df=data[['AreaNo.','ID']]
is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
Even=df[is_even]
Odd=df[~is_even]
print (Even)
print (Odd)
This is what is inside the CSV:
print (df)
print (data)
>>>
AreaNo. ID
Data
1 25th 676
2 3rd 378
3 California 4740
4 Geary 3445
5 Turk 2801A
6 California 4726
7 Idaho 6239B
8 Idaho 6239.5
9 27th 558
10 29th 584
11 27th 557
12 21st 571 1/2
13 30th 524
14 27th 524
15 Alaska 258
16 Alaska 740
17 27th 645
18 27th 684
[18 rows x 2 columns]
Unit Abbr Transaction AreaNo. ID
Data
1 HSBC 1 25th 676
2 Wells NaN 3rd 378
3 Apt NaN California 4740
4 MTE 204 Geary 3445
5 FPC 202 Turk 2801A
6 HSBC NaN California 4726
7 Wells 5 Idaho 6239B
8 Apt 3 Idaho 6239.5
9 ETF NaN 27th 558
10 BAC NaN 29th 584
11 Wells NaN 27th 557
12 Apt NaN 21st 571 1/2
13 ETF G1 30th 524
14 Wells NaN 27th 524
15 Wells NaN Alaska 258
16 Apt NaN Alaska 740
17 ETF NaN 27th 645
18 Bac 3 27th 684
[18 rows x 4 columns]
>>> df.dtypes
AreaNo. object
ID object
dtype: object
Here is the output of df.ID.str.extract('([0-9]+).*'):
>>> df.ID.str.extract('([0-9]+).*')
Data
1 378
2 4740
3 3445
4 2801
5 4726
6 6239
7 6239
8 558
9 584
10 557
11 571
12 524
13 524
14 258
15 740
16 645
17 684
18 NaN
Name: ID, dtype: object
Here is the error from the interpreter:
Traceback (most recent call last):
File "<string>", line 420, in run_nodebug
File "C:\Users\0\Desktop\python\performance.py", line 16, in <module>
is_even = df['ID'].str.extract('([0-9]+).*').astype(int) % 2 == 0
File "C:\Python33\lib\site-packages\pandas\core\generic.py", line 2018, in astype
dtype, copy=copy, raise_on_error=raise_on_error)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2416, in astype
return self.apply('astype', *args, **kwargs)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 2375, in apply
applied = getattr(blk, f)(*args, **kwargs)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 427, in astype
values=values)
File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 444, in _astype
values = com._astype_nansafe(self.values, dtype, copy=True)
File "C:\Python33\lib\site-packages\pandas\core\common.py", line 2222, in _astype_nansafe
return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
File "lib.pyx", line 733, in pandas.lib.astype_intsafe (pandas\lib.c:12697)
File "util.pxd", line 59, in util.set_value_at (pandas\lib.c:49357)
ValueError: cannot convert float NaN to integer
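The traceback shows astype(int) failing because the extracted column contains a NaN. As a possible workaround, here is a minimal sketch of a NaN-tolerant version of the even/odd split. It is not a confirmed fix for the CSV case: the sample frame is hypothetical (four ID values are copied from the data above, and the missing ID in the last row is invented to trigger the NaN), the name Unparsed is my own, and it assumes a pandas version where str.extract accepts expand=False and pd.to_numeric accepts errors='coerce'.

```python
import pandas as pd

# Hypothetical stand-in for the two columns read from dffile.csv.
# The last row has a missing ID on purpose, to reproduce the NaN.
df = pd.DataFrame({
    'AreaNo.': ['25th', 'Turk', 'Idaho', '21st', 'Alaska'],
    'ID': ['676', '2801A', '6239.5', '571 1/2', None],
})

# expand=False keeps the result a Series instead of a one-column DataFrame.
digits = df['ID'].str.extract(r'(\d+)', expand=False)

# Convert to numbers; anything that cannot be parsed becomes NaN
# instead of raising, so no astype(int) is needed.
num = pd.to_numeric(digits, errors='coerce')

has_number = num.notna()
is_even = has_number & (num % 2 == 0)
is_odd = has_number & (num % 2 == 1)

Even = df[is_even]
Odd = df[is_odd]
Unparsed = df[~has_number]   # rows where no leading number was found

print(Even)
print(Odd)
print(Unparsed)
```

Keeping the unparsed rows in a separate frame makes it easy to see which IDs produced the NaN in the first place, which may help narrow down whether the CSV loading is at fault.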
Here is the code I wrote earlier that works:
import pandas as pd
df=pd.DataFrame({'ID': ['10A','6.5', '4 A', '3 1/2'], 'Name': ['J','K','L','M']})
def ExtractU(df):
    is_even = df['ID'].str.extract('(\d+).*').astype(int) % 2 == 0
    Even=df[is_even]
    Odd=df[~is_even]
    return Even
print (ExtractU(df))
ID Name
0 10A J
1 6.5 K
2 4 A L
[3 rows x 2 columns]
>>> df.dtypes
ID object
Name object
dtype: object
What did I do wrong when loading the data? Why does this not work? The data types are the same in both tests. How can I fix the code so that it works with the CSV? This started happening when I switched to Python 3.