pandas.crosstab() computes a simple cross-tabulation of two (or more) factors. By default, it produces a frequency table of the factors; if an array of values and an aggregation function are passed, it aggregates those values instead.
Syntax: pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)
Arguments :
- index : array-like, Series, or list of arrays/Series, Values to group by in the rows.
- columns : array-like, Series, or list of arrays/Series, Values to group by in the columns.
- values : array-like, optional, array of values to aggregate according to the factors. Requires `aggfunc` be specified.
- rownames : sequence, default None, If passed, must match number of row arrays passed.
- colnames : sequence, default None, If passed, must match number of column arrays passed.
- aggfunc : function, optional, If specified, requires `values` be specified as well.
- margins : bool, default False, Add row/column margins (subtotals).
- margins_name : str, default 'All', Name of the row/column that will contain the totals when margins is True.
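- dropna : bool, default True, Do not include columns whose entries are all NaN.
- normalize : bool, {'all', 'index', 'columns'}, or {0, 1}, default False, Normalize by dividing the values by their sum: over the whole table ('all' or True), over each row ('index'), or over each column ('columns').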
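To make the interaction between `values`, `aggfunc`, and `margins` more concrete, here is a minimal sketch. The `region`, `product`, and `amount` Series are hypothetical data invented for illustration; only `pandas.crosstab` and its documented parameters come from the text above.

```python
import pandas as pd

# Hypothetical sales records, invented purely for illustration.
region = pd.Series(["north", "north", "south", "south", "south"], name="region")
product = pd.Series(["pen", "ink", "pen", "pen", "ink"], name="product")
amount = pd.Series([10, 20, 30, 40, 50])

# Default behaviour: count how often each (region, product) pair occurs.
counts = pd.crosstab(region, product)

# With values and aggfunc: sum the amounts within each pair instead of counting.
totals = pd.crosstab(region, product, values=amount, aggfunc="sum")

# margins=True appends an "All" row and an "All" column holding the subtotals.
with_margins = pd.crosstab(region, product, margins=True)

print(counts)
print(totals)
print(with_margins)
```

Passing the aggregation as the string "sum" is one option among several; any function accepted by pandas aggregation should work in its place.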
The examples below show the method in use :
Example 1 :
Python3
# importing packages
import pandas
import numpy

# creating some data
a = numpy.array(["foo", "foo", "foo", "foo",
                 "bar", "bar", "bar", "bar",
                 "foo", "foo", "foo"],
                dtype=object)
b = numpy.array(["one", "one", "one", "two",
                 "one", "one", "one", "two",
                 "two", "two", "one"],
                dtype=object)
c = numpy.array(["dull", "dull", "shiny",
                 "dull", "dull", "shiny",
                 "shiny", "dull", "shiny",
                 "shiny", "shiny"],
                dtype=object)

# form the cross tab
pandas.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
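Output : a table of counts with one row per value of a (bar, foo) and one column per (b, c) pair. For example, the cell for a = foo, b = one, c = dull is 2, and the cell for a = bar, b = two, c = shiny is 0.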
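The syntax line also lists a `normalize` parameter, which turns counts into proportions. As a rough sketch, the snippet below reuses the `a` and `b` arrays from Example 1 (dropping `c` to keep the table small) and normalizes over each row with `normalize="index"`. The variable names `counts` and `row_shares` are chosen here for illustration only.

```python
import numpy
import pandas

# Same a and b arrays as in Example 1.
a = numpy.array(["foo", "foo", "foo", "foo",
                 "bar", "bar", "bar", "bar",
                 "foo", "foo", "foo"],
                dtype=object)
b = numpy.array(["one", "one", "one", "two",
                 "one", "one", "one", "two",
                 "two", "two", "one"],
                dtype=object)

# Raw counts of a against b.
counts = pandas.crosstab(a, b, rownames=["a"], colnames=["b"])

# normalize="index" divides every row by its own total, so each row sums to 1.
row_shares = pandas.crosstab(a, b, rownames=["a"], colnames=["b"],
                             normalize="index")

print(counts)
print(row_shares)
```

Here the bar row holds 3 "one" and 1 "two" entries, so its row shares come out as 0.75 and 0.25.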
Example 2 :
Python3
# importing package
import pandas

# create some data
foo = pandas.Categorical(['a', 'b'],
                         categories=['a', 'b', 'c'])
bar = pandas.Categorical(['d', 'e'],
                         categories=['d', 'e', 'f'])

# form crosstab with dropna=True (default)
pandas.crosstab(foo, bar)

# form crosstab with dropna=False
pandas.crosstab(foo, bar, dropna=False)
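Output : with dropna=True (the default), the table contains only the observed categories (rows a and b, columns d and e). With dropna=False, the unobserved categories c and f are also included, filled with zero counts.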