python - pandas matplotlib .plot(kind='hist') vs .plot(kind='bar') issue
I have a pandas DataFrame named firstperiod with a column named megaball. The values in megaball range from 1 to 25, and this line of code:
print(firstperiod.megaball.value_counts().sort_index())
gives me this, which is what I want to see (the number of occurrences per possible value):
1     12
2      4
3      9
4      4
5      3
6      6
7      5
8      8
9      7
10    10
11     6
12     5
13     3
14     5
15     6
16     8
17    15
18     7
19     8
20     5
21     8
22     7
23     1
24    11
25     9
firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
plt.show()
^ This shows me the bar chart fine, with x-axis values from 1 to 25 and y-axis values up to 15.
But for some reason, when I want a histogram instead of a bar chart (and change the kind= parameter value), it gives me a totally incorrect chart that differs from the values shown earlier. Why is that, and how do I fix it to get a histogram?
firstperiod.megaball.value_counts().sort_index().plot(kind='hist')
plt.show()
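That's because the "hist" plot does not plot the data as given; it first estimates the empirical distribution of the raw data and then plots the result. That is, "hist" bins the data, counts the instances per bin, and plots those counts, so there is no need to call value_counts() ourselves. Applied to the output of value_counts(), "hist" ends up binning the counts themselves (how many values occurred 1 time, 3 times, and so on), which is why the chart looks completely different.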
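Therefore, the equivalent of:
firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
should be:
firstperiod.megaball.plot(kind='hist')
Note that "hist" uses 10 bins by default, so to get one bar per value (as in the bar chart) you would also pass bins, for example bins=25 or explicit bin edges.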
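To make the difference concrete, here is a minimal sketch. It assumes a stand-in firstperiod DataFrame rebuilt from the counts printed in the question (the real data is not available), and it uses explicit bin edges at 0.5, 1.5, ..., 25.5, which is one possible choice and not something the original answer specifies. The Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to view plots with plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for firstperiod: raw draws rebuilt from the counts in the question.
counts = [12, 4, 9, 4, 3, 6, 5, 8, 7, 10, 6, 5, 3,
          5, 6, 8, 15, 7, 8, 5, 8, 7, 1, 11, 9]
firstperiod = pd.DataFrame({"megaball": np.repeat(np.arange(1, 26), counts)})

# Occurrences per value, as in the question.
per_value = firstperiod.megaball.value_counts().sort_index()

# One bin per integer value: edges at 0.5, 1.5, ..., 25.5 (assumed choice).
edges = np.arange(0.5, 26.5, 1.0)

# Histogram of the RAW column: the binning does the counting for us.
fig, ax = plt.subplots()
firstperiod.megaball.plot(kind="hist", bins=edges, ax=ax)

# The same binning done by hand, to compare against value_counts().
hist_counts, _ = np.histogram(firstperiod.megaball, bins=edges)

# Histogram of the COUNTS (what the question's code plotted):
# this bins the 25 count values themselves, not the megaball values.
counts_of_counts, _ = np.histogram(per_value, bins=10)
```

With these bin edges, hist_counts matches per_value entry for entry, so the histogram of the raw column reproduces the bar chart. counts_of_counts, by contrast, only sums to 25 because it describes how the 25 per-value counts are distributed.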