How to create Histograms

Hello all!

I´ve searched thru old posts on how to create Histograms but found nothing.

Any help would be highly appreciated.

Thank you in advance,
Marcus Uerlings


mauer (BOB member since 2007-08-23)

Which type of histogram are you looking at ? … can you post a real world example.
For a normal Bucket Vs Frequency histogram , i would do the following.
Lets say I have to count the population distribution in the country and bucket it by Age Groups.
I would create these variables :

  1. Max Age in all population (e.g 150) Min Age in all population e.g 0
  2. Bucket Width (i.e age range = 10)
  3. of Buckets = (Max-Min)/Bucket width (e.g 15)

  4. Bucket = Floor(Age/# of Buckets)… if age= 63 , Bucket = Floor(63/15) >> 4

So now if i plot Bucket Vs Count(population ) as a Bar Chart , I would get a Histogram.

It would help if you could post a snapshot of what you are trying to do.


Shobhit_Acharya (BOB member since 2005-08-11)

Thanks for your prompt reply!

That´s ok for the basics, but my problem is how can I make the bucket number change from 1 to “n”.

Maybe this sounds silly to you but I´m kind of a “newbie” using this tool…

The main values (min, max, # of buckets and width) are easy to evaluate.
I´ve calculated the number of buckets using # of buckets = log2(# of individuals)+1

The problem I face is making bucket number changing from 1 to “n”

Let´s say I have a number of customers that buy stuff (fuel for instance).
For each customer I have a credit card ticket each time he/she buys gas
Minimum bill is, for instance, $10 and maximum equals to $120; The number of buys is about 100.000

I´d like to create a table like this, and then an histogram

Bucket Range Count
1 10-16 2
2 16-22 2
3 22-28 4
. . .
. . .
19 118-124 2

Since the number of buckets varies depending on the # of individuals, how can I do this.
There is no “for each” or while structure… :slight_smile:

Marcus


mauer (BOB member since 2007-08-23)

Hi,

Have you found the solution? or someone else?
We would like also to make a histogram.

Thanks
Ségolène


Segolene :fr: (BOB member since 2008-09-24)

My solution still holds good in this case too (with a few adjustments). If you need a variable Bucket , feed the Bucket object via a UserResponse. In this way you would be able to control the Span or range of a Histogram Bar. I will write my solution again using your example :

  1. Max $ for all Customers (e.g 120) Min $ for all Customers e.g 5
  2. Bucket Width (i.e $ range = 10) – fed via User Prompt
  3. of Buckets = (Max-Min)/Bucket width (= 110/5 => 22)

  4. Bucket# = Round(($Spent-Min$Spent ) /Bucket Range),0)… if $Spent= 63 , Bucket# = Floor(63-10/5) >> 10

Attatched : Excel Example to show the above
HISTOGRAM.xls (23.0 KB)


Shobhit_Acharya (BOB member since 2005-08-11)

Hi Shobhit Acharya!

Thank you for your answer. It’s helped me!
I was a little bite confused with the axis.

For the one, who are interested, here are information about the variables on the axis and a summary of the procedure:

  1. Max $ for all Customers (e.g 120)
  2. Min $ for all Customers (e.g 5)
  3. Bucket Width (e.g. $ range = 10)
  4. of Buckets = (Max-Min)/BucketWidth (e.g = 110/10 => 11)

  5. Bucket# = Round(($Spent-Min$Spent)/BucketWidth),0) (e.g if $Spent=63, Bucket# = Floor((63-5)/10)=> 10)
  6. On the Y axis: # of Customers: Count([Customer])ForEach([Bucket])
  7. On the X axis: Bucket*BucketWidh

Notice that it is the same procedure, if you have also negative values.

Have a nice week
Ségolène


Segolene :fr: (BOB member since 2008-09-24)

These formulas helped a lot, but my issue is making this work implementation wise in WEBI. I can calculate a bucket for each of the records my query returns, and can calculate the count of those instances, but I am not able to group by my bucket. What did you do specifically to make the report work correctly in WEBI. I can make the histogram shown in the Excel attachment…


jake.church (BOB member since 2008-10-15)

Hi!

I’m having the same issue. My x-axis is not working properly, cause the bucket is repeating and it´s a measure and should be a dimension. I’ve converted the bucket value to a bucket string (indeed I’ve created another variable for this using the formatnumber formula), but it´s still not working because the bucket value is repeating, once it´s value is dynamically calculated according to the customer. If I delete the customer column from the table/plot, it crashes because it´s not possible to calculate the customer bucket if the customer is not given.


hugodasilvadasilva (BOB member since 2016-07-27)