python - pandas matplotlib .plot(kind='hist') vs .plot(kind='bar') issue
I have a pandas DataFrame named firstperiod with a column named megaball. The values in megaball range from 1 to 25, and this line of code:
print(firstperiod.megaball.value_counts().sort_index())
gives me this, which is what I want to see (the number of occurrences per possible value):
1     12
2      4
3      9
4      4
5      3
6      6
7      5
8      8
9      7
10    10
11     6
12     5
13     3
14     5
15     6
16     8
17    15
18     7
19     8
20     5
21     8
22     7
23     1
24    11
25     9

firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
plt.show()
This shows me the bar chart fine, with x-axis values from 1 to 25 and y-axis values up to 15.
But for some reason, when I want a histogram instead of a bar chart (and change the parameter value to kind='hist'), it gives me a totally incorrect chart that differs from the earlier one. Why is that, and how do I fix it to get a histogram?
firstperiod.megaball.value_counts().sort_index().plot(kind='hist')
plt.show()
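That's because a "hist" plot does not plot the data as given: it first estimates the empirical distribution of the raw data and then plots the result. That is, "hist" bins the data, counts the instances per bin, and plots those counts, so there is no need to call value_counts() ourselves. In your second snippet the histogram is computed over the 25 counts themselves, so it shows how many values occurred a given number of times rather than how often each value occurred.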
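To make the difference concrete, here is a minimal sketch that reproduces the binning step with np.histogram instead of drawing anything. The counts are copied from the question; the raw series is a stand-in rebuilt from those counts (the real draw order is unknown), and the names occurrences, counts and raw are only illustrative.

```python
import numpy as np
import pandas as pd

# Occurrences per megaball value 1..25, copied from the question's output.
occurrences = [12, 4, 9, 4, 3, 6, 5, 8, 7, 10, 6, 5, 3,
               5, 6, 8, 15, 7, 8, 5, 8, 7, 1, 11, 9]
counts = pd.Series(occurrences, index=range(1, 26))

# Stand-in for firstperiod.megaball: each value repeated as often as it occurred.
raw = pd.Series(np.repeat(counts.index.to_numpy(), counts.to_numpy()))

# kind='hist' bins whatever series it receives (pandas defaults to 10 bins).
hist_of_counts, _ = np.histogram(counts.to_numpy(), bins=10)  # bins the 25 counts
hist_of_raw, _ = np.histogram(raw.to_numpy(), bins=10)        # bins the actual values

print(hist_of_counts.sum())  # 25: one entry per distinct megaball value
print(hist_of_raw.sum())     # 172: one entry per draw
```

The first histogram only ever sees 25 numbers, which is why the chart in the question looked unrelated to the bar chart.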
Therefore, the equivalent of:
firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
should be (apart from the default binning):
firstperiod.megaball.plot(kind='hist')
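One caveat: the histogram uses 10 bins by default, so with 25 distinct integer values several values end up sharing a bin and the bars will not match the bar chart one to one. A possible way around this, sketched below, is to pass explicit bin edges so each integer gets its own bin. The data here is randomly generated as a hypothetical stand-in for firstperiod.megaball, and the names megaball, edges, heights and counts are only illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; remove for a window
import matplotlib.pyplot as plt

# Hypothetical stand-in for firstperiod.megaball: integers from 1 to 25.
rng = np.random.default_rng(0)
megaball = pd.Series(rng.integers(1, 26, size=175), name="megaball")

# Edges at 0.5, 1.5, ..., 25.5 give one bin per integer value.
edges = np.arange(0.5, 26.5, 1.0)
ax = megaball.plot(kind="hist", bins=edges)

# Read the bar heights back from the drawn rectangles.
heights = [int(p.get_height()) for p in ax.patches]

# Counts per value, with zeros filled in for values that never occur.
counts = megaball.value_counts().reindex(range(1, 26), fill_value=0)

print(heights == counts.tolist())
```

With an interactive backend, calling plt.show() after the plot line displays the figure, and the bars should then line up with those of the value_counts() bar chart.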