Need SQL help to Get Repeat Call Records

You can help me write SQL to get the output or let me know how it can be done at Webi (4.1)

I am using Teradata 14. I have data in following format:
I have first two columns and I need to get third column using SQL/Recursive SQL.
The first two columns show mobile number and their call time to call center. I want to flag each call whether it is a repetitive call or not.
If the call from the same number is within 48 hours of first non repeated call , then I would flag it as repeated.
I have shown 17 rows below, in reality I have 3 millon rows to process. Could you please help.

Mobile Number Call Time Repeat/Non Repeat
A 2016/01/01 06:15:53 0
B 2016/01/01 18:29:29 0
B 2016/01/01 18:33:45 1
B 2016/01/01 19:14:02 1
B 2016/01/01 20:28:38 1
B 2016/01/05 20:51:48 0
B 2016/01/05 20:53:19 1
B 2016/01/10 20:54:28 0
C 2016/01/01 22:45:41 0
D 2016/01/02 17:33:12 0
E 2016/01/02 00:15:08 0
F 2016/01/01 19:51:09 0
F 2016/01/01 20:06:06 1
F 2016/01/01 21:24:47 1
F 2016/01/02 09:21:08 1
F 2016/01/03 20:21:08 0
F 2016/01/05 09:21:08 1

Note: The 0 for F on 3rd Jan is because it is NOT withing 48 hours of the previous 0 which is for 2016/01/01 19:51:09
The repeated calls are ignored as if they don’t exist.

Here is script to create the above sample data:
CREATE TABLE TestTable
(
Mobile VARCHAR(10),
CallTime TIMESTAMP
);

INSERT INTO TestTable VALUES (‘A’,TO_TIMESTAMP( ‘2016-01-01 06:15:53’, ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-01 18:29:29’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-01 18:33:45’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-01 19:14:02’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-01 20:28:38’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-05 20:51:48’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-05 20:53:19’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘B’,TO_TIMESTAMP( ‘2016-01-10 20:54:28’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘C’,TO_TIMESTAMP( ‘2016-01-01 22:45:41’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘D’,TO_TIMESTAMP( ‘2016-01-02 17:33:12’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘E’,TO_TIMESTAMP( ‘2016-01-02 00:15:08’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-01 19:51:09’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-01 20:06:06’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-01 21:24:47’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-02 09:21:08’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-03 20:21:08’ , ‘YYYY-MM-DD HH24:MI:SS’));
INSERT INTO TestTable VALUES (‘F’,TO_TIMESTAMP( ‘2016-01-05 09:21:08’ , ‘YYYY-MM-DD HH24:MI:SS’));


Raghvendra Deshpande :india: (BOB member since 2008-05-26)

This works for Oracle, not sure about Teradata’s lag/lead functionality. Could be cleaned up some too I’m sure.

  SELECT mobile,
         calltime,
         /*
         days_since_min_calltime,
         prev_row_diff,
         days_since_min_calltime - prev_row_diff
            AS min_ctime_minus_prev_row_diff,
         init_call,
         */
         CASE
            WHEN init_call = 0
            THEN
               0
            WHEN prev_row_diff > 2
            THEN
               0
            WHEN     days_since_min_calltime > 2
                 AND (days_since_min_calltime - prev_row_diff) < 2
            THEN
               0
            ELSE
               1
         END
            AS repeater
    FROM (  SELECT a.mobile,
                   a.calltime,
                     CAST (a.calltime AS DATE)
                   - CAST (b.days_since_min_calltime AS DATE)
                      AS days_since_min_calltime,
                   CASE
                      WHEN a.calltime <> b.days_since_min_calltime
                      THEN
                           CAST (a.calltime AS DATE)
                         - LAG (CAST (a.calltime AS DATE))
                              OVER (ORDER BY a.mobile, a.calltime)
                      ELSE
                         0
                   END
                      AS prev_row_diff,
                   CASE
                      WHEN a.calltime = b.days_since_min_calltime THEN 0
                      ELSE 1
                   END
                      AS init_call
              FROM testtable a
                   JOIN (  SELECT mobile, MIN (calltime) AS days_since_min_calltime
                             FROM testtable
                         GROUP BY mobile) b
                      ON (a.mobile = b.mobile)
          ORDER BY 1, 2)
ORDER BY 1, 2

thesnow :us: (BOB member since 2011-08-10)

Thanks @thesnow,
I am not sure how to convert this syntax into Teradata (Actually I am feeling a bit lazy doing that :wink: )
But I have solved this problem now. I have two solutions for this problem.
Solution 1:
I added four columns to the table, Rank, Time_Gap, Rank2 and Rank3.


Initially the table will be like this:

Insert Into  TestTable
SEL A.Mobile ,A.CALL_START_DATE_TIME,'Prepaid' AS call_category
,ROW_NUMBER() OVER(PARTITION BY Mobile , CALL_CATEGORY ORDER BY  CALL_START_DATE_TIME) , 0, 0, 0
FROM  BaseTable

Then:
UPDATE TestTable  a 
	SET time_gap = call_start_date_time - (SEL call_start_date_time FROM TestTable  WHERE Mobile = a.Mobile AND Rank= 1  AND call_category= a.call_category) HOUR(4) ;

Then:
UPDATE TestTable  a 
	SET  Rank2 = time_gap/48;

Then:
UPDATE TestTable    A 
SET RANK3 =  CASE WHEN CALL_START_DATE_TIME  - (SEL MIN(CALL_START_DATE_TIME)  FROM TestTable    
WHERE Mobile = A.Mobile AND CALL_CATEGORY = A.CALL_CATEGORY AND RANK2 = A.RANK2-1) MINUTE(4) < 2880 THEN  A.RANK2-1 ELSE A.RANK2 END;

Then I will run following update n number of times (where n= maximum of rank 2 in previous update)
UPDATE TestTable      A 
SET RANK3 =  CASE WHEN CALL_START_DATE_TIME  - (SEL MIN(CALL_START_DATE_TIME)  FROM TestTable     
WHERE Mobile = A.Mobile AND CALL_CATEGORY = A.CALL_CATEGORY AND RANK3 = A.RANK3-1) MINUTE(4) < 2880 THEN A.RANK3-1 	ELSE A.RANK3 END;

Finally, I will use the rank3 to find out how many new calls and how many repeated calls occurred which fulfils my requirement as I have to count the total calls and the repeated calls.

This is the most efficient solution which I put in the procedure and got the result. I can process tens of millions of rows in few minutes.

Solution 2:
I wrote procedure with below syntax. It is giving me all the new calls and excluding repeated calls which fulfills my requirement.
But performance is horrible as it processes one row at a time and runs select/insert statements accordingly.
So I have used solution 1.


FOR	MainLoop AS Dt_Cur1 CURSOR FOR
SEL DISTINCT Call_Category, mobile
FROM TestTable
                                               
DO
SET I=0;                
 
FOR Loop1 AS Dt_Cur CURSOR FOR
SEL *
FROM TestTable
WHERE Call_Category=:MainLoop.Call_Category 
AND mobile=:MainLoop.mobile
ORDER BY CALL_START_DATE_TIME ASC
 
DO
IF I=0 THEN
INSERT INTO  AgentTest
SELECT Call_Category,
mobile, MIN(CALL_START_DATE_TIME)  OVER (PARTITION BY Call_Category,	mobile)  AS CALL_START_DATE_TIME, MAX(CALL_START_DATE_TIME) OVER (PARTITION BY Call_Category,		mobile)  AS EndDate , :I
FROM TestTable
WHERE 
Call_Category=:Loop1.Call_Category 	AND mobile=:Loop1.mobile 	AND
CALL_START_DATE_TIME BETWEEN  :Loop1.CALL_START_DATE_TIME 	AND :Loop1.CALL_START_DATE_TIME +INTERVAL '48' HOUR 
GROUP BY 1, 2,CALL_START_DATE_TIME,5;
 
ELSE  
IF Loop1.CALL_START_DATE_TIME > ( SELECT	MAX( EndDate) FROM	AgentTest WHERE	Call_Category=:Loop1.Call_Category 	AND mobile=:Loop1.mobile   ) 
THEN
INSERT INTO AgentTest
SELECT Call_Category,mobile, MIN(CALL_START_DATE_TIME)  OVER (PARTITION BY Call_Category,mobile)  AS CALL_START_DATE_TIME, MAX(CALL_START_DATE_TIME) OVER (PARTITION BY Call_Category,		mobile)  AS EndDate , :I
FROM TestTable
WHERE 
Call_Category=:Loop1.Call_Category 	AND mobile=:Loop1.mobile 	AND
CALL_START_DATE_TIME BETWEEN  :Loop1.CALL_START_DATE_TIME 	AND :Loop1.CALL_START_DATE_TIME +INTERVAL '48' HOUR 
GROUP BY 1,		2,CALL_START_DATE_TIME,5;
                                                                                                
END IF;
 
END IF;
 
SET I=I+1;

END FOR;
 
END FOR;


Raghvendra Deshpande :india: (BOB member since 2008-05-26)