Filter a NumPy ndarray (matrix) according to column values

This question is about filtering a NumPy ndarray according to the values in some of its columns.

I have a fairly large ndarray (300000, 50) and I am filtering it according to the values in some specific columns. The array has named dtypes, so I can access each column by name.

The first column is called "category_code", and I need to filter the matrix to return only the rows whose category_code is one of ("A", "B", "C").

The result should be another NumPy ndarray whose columns are still accessible by their dtype names.

Here is what I am doing now:

import numpy

# build a boolean mask row by row, then index the array with it
index = numpy.asarray([row['category_code'] in ('A', 'B', 'C') for row in data])
filtered_data = data[index]

A list comprehension, for example:

rows = [row for row in data if row['category_code'] in ('A', 'B', 'C')]
filtered_data = numpy.asarray(rows)

will not work because the dtypes I originally had are no longer accessible in the result.

Is there a better / more Pythonic way of achieving the same result? Something that could look like this:

filtered_data = data.where({'category_code': ('A', 'B', 'C')})


You can use the NumPy-based library Pandas, which has a more generally useful implementation of ndarrays:

>>> # import the library
>>> import pandas as PD

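Create some sample data as a Python dictionary whose keys are the column names and whose values are the column values as Python lists, one key/value pair per column: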

>>> data = {'category_code': ['D', 'A', 'B', 'C', 'D', 'A', 'C', 'A'], 
            'value':[4, 2, 6, 3, 8, 4, 3, 9]}

>>> # convert to a Pandas 'DataFrame'
>>> D = PD.DataFrame(data)

To keep only the rows whose category_code is either B or C, there are conceptually two steps:

>>> # step 1: create the index 
>>> idx = (D.category_code == 'B') | (D.category_code == 'C')

>>> # then filter the data against that index:
>>> D.loc[idx]  # older pandas used D.ix[idx]; .ix was removed in pandas 1.0

  category_code  value
2             B      6
3             C      3
6             C      3
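The question asks for three categories rather than two, and chaining `|` comparisons gets long as the list grows. As a minimal sketch, assuming the same sample data as above, `Series.isin` builds the same kind of boolean index from a list of wanted values. The names `frame`, `wanted`, `mask` and `filtered` are illustrative only.

```python
import pandas as pd

# Same sample data as in the answer above.
data = {'category_code': ['D', 'A', 'B', 'C', 'D', 'A', 'C', 'A'],
        'value': [4, 2, 6, 3, 8, 4, 3, 9]}
frame = pd.DataFrame(data)

# Boolean Series: True where category_code is one of the wanted values.
wanted = ['A', 'B', 'C']
mask = frame['category_code'].isin(wanted)

# Keep only the matching rows; column names are preserved.
filtered = frame.loc[mask]
print(filtered)
```

The result is still a DataFrame, so columns remain accessible by name, for example `filtered['value']`.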

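Note the difference between indexing in Pandas and in NumPy, the library on which Pandas is built. In NumPy (that is, if D were a plain two-dimensional array), you would place the index inside the brackets, using "," to separate the dimensions and ":" to indicate that you want all of the values (columns) in the other dimension:

>>> D[idx, :]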

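In Pandas, you use the data frame's loc indexer (older versions used ix, which has since been removed) and place the index inside the brackets:

>>> D.loc[idx]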
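For comparison, the same filter can be sketched in plain NumPy, which keeps the result as a structured ndarray with its dtype names intact. This is only a sketch under assumptions: the two-field dtype below is a hypothetical stand-in for the 50-column array in the question, and `numpy.isin` requires NumPy 1.13 or later.

```python
import numpy as np

# Hypothetical two-field dtype standing in for the question's 50 named columns.
dtype = [('category_code', 'U1'), ('value', 'i8')]
data = np.array([('D', 4), ('A', 2), ('B', 6), ('C', 3),
                 ('D', 8), ('A', 4), ('C', 3), ('A', 9)], dtype=dtype)

# Vectorized membership test on one named column; gives a boolean mask.
mask = np.isin(data['category_code'], ['A', 'B', 'C'])

# Boolean indexing keeps the structured dtype, so fields stay accessible by name.
filtered_data = data[mask]

print(filtered_data.dtype.names)
print(filtered_data['value'])
```

Because the mask is computed on the whole column at once, this avoids the Python-level loop over 300000 rows used in the question.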

