python - pandas matplotlib .plot(kind='hist') vs .plot(kind='bar') issue


I have a pandas DataFrame named firstperiod with a column named megaball. The values in megaball range from 1 to 25, and this line of code:

print(firstperiod.megaball.value_counts().sort_index())

gives me this, which is what I want to see (the number of occurrences per possible value):

1     12
2      4
3      9
4      4
5      3
6      6
7      5
8      8
9      7
10    10
11     6
12     5
13     3
14     5
15     6
16     8
17    15
18     7
19     8
20     5
21     8
22     7
23     1
24    11
25     9

firstperiod.megaball.value_counts().sort_index().plot(kind='bar')
plt.show()

^This shows me the bar chart fine, with x-axis values from 1 to 25 and y-axis values up to 15.

But for some reason, when I want a histogram instead of a bar chart (and change the value of the kind= parameter), it gives me a totally incorrect chart whose values differ from the earlier bar chart. Why is that, and how do I fix it to get a histogram?

firstperiod.megaball.value_counts().sort_index().plot(kind='hist')
plt.show()

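That's because the "hist" plot does not plot the data as given: it first estimates the empirical distribution of the raw data and then plots the result. That is, "hist" is going to bin the data, count the instances per bin, and plot that, so there is no need to call value_counts() ourselves. In your code, the histogram is computed over the counts themselves (12, 4, 9, ...) rather than over the megaball values, which is why the chart looks so different.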
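To make the difference concrete, here is a minimal sketch of the binning step, using numpy.histogram as a stand-in for what the "hist" plot computes internally. The megaball series below is made-up toy data, not the values from the question, and the bin edges are chosen by hand so that each integer gets its own bin.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for firstperiod.megaball.
megaball = pd.Series([1, 1, 2, 2, 3, 3, 3, 4])

# Occurrences per value: 1 -> 2, 2 -> 2, 3 -> 3, 4 -> 1
counts = megaball.value_counts().sort_index()

# One bin per integer: edges at 0.5, 1.5, 2.5, 3.5, 4.5
edges = np.arange(0.5, 5.5, 1.0)

# Binning the raw column reproduces the value counts.
raw_hist, _ = np.histogram(megaball.to_numpy(), bins=edges)

# Binning the counts answers a different question:
# "how many values occurred once, twice, three times, ...?"
count_hist, _ = np.histogram(counts.to_numpy(), bins=edges)

print(raw_hist)    # histogram of the raw values
print(count_hist)  # histogram of the counts
```

The first array matches value_counts(), while the second is the "histogram of a histogram" that the code in the question ends up plotting.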

Therefore, the equivalent of:

firstperiod.megaball.value_counts().sort_index().plot(kind='bar') 

should be:

firstperiod.megaball.plot(kind='hist') 
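One caveat: pandas uses bins=10 by default for "hist", so with 25 possible values several of them share a bin and the bars will not match the bar chart one for one. A possible way to get one bar per value is to pass explicit bin edges, as in the sketch below. The firstperiod DataFrame here is a randomly generated stand-in, and the Agg backend line is only there so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for firstperiod: random integers from 1 to 25.
rng = np.random.default_rng(0)
firstperiod = pd.DataFrame({"megaball": rng.integers(1, 26, size=200)})

# One bin per integer value: edges at 0.5, 1.5, ..., 25.5
edges = np.arange(0.5, 26.5, 1.0)
ax = firstperiod.megaball.plot(kind="hist", bins=edges)

# Compare the bar heights with value_counts(), filling in 0 for
# any value that never occurs in the sample.
heights = [int(p.get_height()) for p in ax.patches]
expected = (
    firstperiod.megaball.value_counts()
    .reindex(range(1, 26), fill_value=0)
    .tolist()
)
print(heights == expected)

# In an interactive session, call plt.show() here to display the chart.
```

With these edges the histogram bars and the value_counts() bar chart carry the same heights, so the two plots differ only in styling.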
