SQL Server - cumulative sum on overlapping data - getting date that sum reaches a given value -


in our company, our clients perform various activities log in different tables - interview attendance, course attendance, , other general activities. have database view unions data of these tables giving activityview looks this. can see activities overlap - example while attending interview, client may have been performing cv update activity.

+----------------------+---------------+---------------------+-------------------+ | activity_client_id   | activity_type | activity_start_date | activity_end_date | +----------------------+---------------+---------------------+-------------------+ |                  112 | interview     | 2015-06-01 09:00    | 2015-06-01 11:00  | |                  112 | cv updating   | 2015-06-01 09:30    | 2015-06-01 11:30  | |                  112 | course        | 2015-06-02 09:00    | 2015-06-02 16:00  | |                  112 | interview     | 2015-06-03 09:00    | 2015-06-03 10:00  | +----------------------+---------------+---------------------+-------------------+ 

each client has "sign date", recorded on client table, when joined our programme. here our sample client:

+-----------+---------------------+ | client_id | client_sign_up_date | +-----------+---------------------+ |       112 | 2015-05-20          | +-----------+---------------------+ 

i need create report show following columns:

+-----------+---------------------+--------------------------------------------+ | client_id | client_sign_up_date | date_client_completed_5_hours_of_activity | +-----------+---------------------+--------------------------------------------+ 

we need report in order see how effective our programme is. important aim of programme every client complete @ least 5 hours of activity possible. report tell how long sign take each client achieve figure.

what makes trickier when calculate 5 hours of total activity, must discount overlapping activities:

in sample data above client attended interview between 09:00 , 11:00.
on same day performed cv updating activity 09:30 11:30. our calculation, give them total activity day of 2.5 hours (150 minutes) - count 30 minutes of cv updating interview overlaps 11:00.

so report our sample client give following result:

+-----------+---------------------+--------------------------------------------+ | client_id | client_sign_up_date | date_client_completed_5_hours_of_activity | +-----------+---------------------+--------------------------------------------+ |       112 | 2015-05-20          | 2015-06-02                                 | +-----------+---------------------+--------------------------------------------+ 

so question how can create report using select statement ? can work out how writing stored procedure loop through view , write result report table. prefer avoid stored procedure , have select statement give me report on fly.

i using sql server 2005.

this 1 way it:

;with ctern (    select activity_client_id, activity_type,           activity_start_date, activity_end_date,           row_number() on (partition activity_client_id                               order activity_start_date) rn    activities ),    ctediff (    select c1.activity_client_id, c1.activity_type,           x.activity_start_date, c1.activity_end_date,           datediff(mi, x.activity_start_date, c1.activity_end_date) diff,           row_number() on (partition c1.activity_client_id                               order x.activity_start_date) seq    ctern c1    left join ctern c2 on c1.rn = c2.rn + 1    cross apply (select case                            when c1.activity_start_date < c2.activity_end_date                              c2.activity_end_date                           else c1.activity_start_date                        end) x(activity_start_date)     ) select top 1 client_id, client_sign_up_date, activity_start_date,               hoursofactivicty                ctediff c1 inner join clients c2 on c1.activity_client_id = c2.client_id                      cross apply (select sum(diff) / 60.0              ctediff c3              c3.seq <= c1.seq) x(hoursofactivicty) hoursofactivicty >= 5 order seq 

common table expressions , row_number() introduced sql server 2005, above query should work version.

demo here

the first cte, i.e. ctern, produces following output:

client_id   activity_type   start_date          end_date          rn 112         interview       2015-06-01 09:00    2015-06-01 11:00  1 112         cv updating     2015-06-01 09:30    2015-06-01 11:30  2 112         course          2015-06-02 09:00    2015-06-02 16:00  3 112         interview       2015-06-03 09:00    2015-06-03 10:00  4 

the second cte, i.e. ctediff, uses above table expression in order calculate time difference each record, taking consideration overlapps previous record:

client_id activity_type start_date       end_date         diff  seq 112       interview     2015-06-01 09:00 2015-06-01 11:00 120   1 112       cv updating   2015-06-01 11:00 2015-06-01 11:30 30    2 112       course        2015-06-02 09:00 2015-06-02 16:00 420   3 112       interview     2015-06-03 09:00 2015-06-03 10:00 60    4 

the final query calculates cumulative sum of time difference , selects first record exceeds 5 hours of activity.

the above query work simple interval overlaps, i.e. when end date of activity overlaps start date of next activity.


Comments

Popular posts from this blog

Magento/PHP - Get phones on all members in a customer group -

php - .htaccess mod_rewrite for dynamic url which has domain names -

Website Login Issue developed in magento -