python - numpy recarray from CSV dtype has many columns but shape says just one row, why is that?
My CSV has a mix of string and numeric columns. numpy.recfromcsv
accurately inferred them (woo-hoo), giving a dtype of
dtype=[('null', 'S7'), ('00', '<f8'), ('nsubj', 'S20'), ('g', 'S1'), ...
So it is a mix of strings and numbers, as you can see. numpy.shape(csv)
gives me
(133433,)
which confuses me, since the dtype implied it was column aware. Furthermore, indexing by row works intuitively:
csv[1] > ('def', 0.0, 'prep_to', 'g', 'query_w', 'indef', 0.0, ...
I also get the error
cannot perform reduce with flexible type
on operations like .all(), even when using a numeric column. I'm not sure whether I'm working with a table-like entity (two dimensions) or just one list of something. Why is the dtype inconsistent with the shape?
A recarray is an array of records. Each record can have multiple fields. A record is sort of like a struct in C.
If the shape of the recarray is (133433,)
then the recarray is a 1-dimensional array of records.
The fields of the recarray may be accessed by name-based indexing. For example, csv['nsubj']
is essentially equivalent to
np.array([record['nsubj'] for record in csv])
This special name-based indexing supports the illusion that a 1-dimensional recarray is a 2-dimensional array: csv[intval]
selects rows, csv[fieldname]
selects "columns". However, under the hood and strictly speaking, if the shape is (133433,)
then it is 1-dimensional.
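As a rough illustration of the point above, here is a minimal sketch using a tiny made-up CSV. The column names and values are hypothetical, and np.genfromtxt is used as a stand-in for recfromcsv; it returns a plain structured array rather than a recarray, but indexing by integer and by field name behaves the same way.

```python
import io
import numpy as np

# Hypothetical CSV contents; column names and values are made up for illustration.
text = "label,score,tag\nabc,0.0,g\ndef,1.5,g\nghi,2.0,h\n"

# dtype=None asks genfromtxt to infer a type per column; names=True reads the header row.
data = np.genfromtxt(io.StringIO(text), delimiter=",", names=True,
                     dtype=None, encoding="utf-8")

print(data.dtype.names)  # field names taken from the header
print(data.shape)        # a single axis: one entry per record
print(data.ndim)

row = data[1]            # an integer index selects one record
scores = data["score"]   # a field name selects a plain 1-D float array

# Field access matches collecting that field record by record.
by_hand = np.array([record["score"] for record in data])
print(np.array_equal(scores, by_hand))

# Reductions such as .all() work once a single numeric field is selected.
print(scores.all())
```

The fields live in the dtype, not in the shape, so the shape only counts records.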
Note that not all recarrays are 1-dimensional. It is possible to have a higher-dimensional recarray:
In [142]: arr = np.zeros((3,2), dtype=[('foo', 'int'), ('bar', 'float')])

In [143]: arr
Out[143]:
array([[(0, 0.0), (0, 0.0)],
       [(0, 0.0), (0, 0.0)],
       [(0, 0.0), (0, 0.0)]],
      dtype=[('foo', '<i8'), ('bar', '<f8')])

In [144]: arr.shape
Out[144]: (3, 2)
This is a 2-dimensional array whose elements are records.
Here are the bar field values in the arr[:, 0] slice:
In [148]: arr[:, 0]['bar']
Out[148]: array([ 0.,  0.,  0.])
Here are all the bar field values in the 2D array:
In [151]: arr['bar']
Out[151]:
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])

In [160]: arr['bar'].all()
Out[160]: False
Note that an alternative to using recarrays is Pandas DataFrames. There are a lot more methods available for manipulating DataFrames than recarrays. You might find them more convenient.
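For comparison, a minimal sketch of the DataFrame route, assuming pandas is installed and again using a small made-up CSV (the column names are hypothetical). Unlike a 1-dimensional record array, a DataFrame reports rows and columns in its shape.

```python
import io
import pandas as pd

# Hypothetical CSV contents; column names and values are made up for illustration.
text = "label,score,tag\nabc,0.0,g\ndef,1.5,g\nghi,2.0,h\n"

df = pd.read_csv(io.StringIO(text))

print(df.shape)           # (rows, columns)
print(df.dtypes)          # one inferred dtype per column
print(df["score"].all())  # reduction over a single numeric column

# Converting back gives a 1-D record array with one entry per row.
rec = df.to_records(index=False)
print(rec.shape)
```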