Python - pandas/matplotlib: .plot(kind='hist') vs .plot(kind='bar') issue
I have a pandas DataFrame named firstperiod with a column named megaball. The values in megaball range from 1 to 25, and this line of code:
print(firstperiod.megaball.value_counts().sort_index())
gives me this, which is what I want to see (the number of occurrences per possible value):
1     12
2      4
3      9
4      4
5      3
6      6
7      5
8      8
9      7
10    10
11     6
12     5
13     3
14     5
15     6
16     8
17    15
18     7
19     8
20     5
21     8
22     7
23     1
24    11
25     9

firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
plt.show()

^ This shows me the bar chart just fine, with x-axis values up to 25 and y-axis values up to 15.
But for some reason, when I want a histogram instead of a bar chart (and change the value of the kind= parameter), it gives me a totally incorrect and different chart, with values that do not match the earlier one. Why is that, and how do I fix it to get a histogram?
firstperiod.megaball.value_counts().sort_index().plot(kind='hist')
plt.show()
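That's because the "hist" plot does not plot the data as given; it first estimates the empirical distribution of the raw data and then plots the result. That is, "hist" is going to bin the data, count the instances per bin, and plot that, so there is no need to do value_counts() ourselves. Calling it on the output of value_counts() produces a histogram of the counts themselves, not of the megaball values.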
Therefore, the equivalent of:
firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
should be:
firstperiod.megaball.plot(kind='hist')
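One caveat worth checking: pandas' histogram uses 10 bins by default, so the plain kind='hist' call will not give one bar per value for a column that takes 25 distinct values. The sketch below is not from the original answer. It rebuilds a stand-in megaball column from the counts printed in the question (an assumption, since the real data is not shown) and uses numpy.histogram to show numerically what each approach bins. The names counts, megaball, per_value, and edges are illustrative only.

```python
import numpy as np
import pandas as pd

# Occurrence counts copied from the question (value -> number of occurrences).
counts = {1: 12, 2: 4, 3: 9, 4: 4, 5: 3, 6: 6, 7: 5, 8: 8, 9: 7, 10: 10,
          11: 6, 12: 5, 13: 3, 14: 5, 15: 6, 16: 8, 17: 15, 18: 7, 19: 8,
          20: 5, 21: 8, 22: 7, 23: 1, 24: 11, 25: 9}

# Assumption: stand-in for firstperiod.megaball, rebuilt by repeating each
# value as many times as it occurs. The real column order is unknown.
megaball = pd.Series(
    np.repeat(list(counts.keys()), list(counts.values())), name="megaball"
)

# This is what the bar chart in the question plots: one count per value.
per_value = megaball.value_counts().sort_index()

# Histogramming the value_counts() output bins the 25 counts themselves
# (10 equal-width bins, matching the pandas default), not the raw values.
hist_of_counts, _ = np.histogram(per_value.to_numpy(), bins=10)

# Histogramming the raw column with one unit-wide bin per value
# (edges 0.5, 1.5, ..., 25.5) reproduces the per-value counts.
edges = np.arange(0.5, 26.5, 1.0)
hist_of_raw, _ = np.histogram(megaball.to_numpy(), bins=edges)

print(hist_of_counts)
print(hist_of_raw)

# Plotting equivalent (requires matplotlib):
#   megaball.plot(kind='hist', bins=edges)
```

With explicit bin edges like these, the histogram bars should line up with the bars of the value_counts() bar chart; with the default bins, several neighbouring values are merged into each bar.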