Guest Blog: Keeping track of SQL Server Agent job durations

Oopps! Upgrade your browser pretty please. Oopps! Upgrade your browser pretty please.

About me My name is Mikael Cedersved, also (un)known as FrozenFishes. I’m a 26 year old DBA and BI developer from Sweden, working in the health care industry. I’ve been working with SQL server for 4 years. This is actually my first sql server related blog post, and I hope there’s several more to come. I’m about to launch my own blog at http://frozenfishes.wordpress.com, so stay tuned on that! You may also follow me on twitter @frozenfishes. First of all I want to thank Steinar Andersen and the people at SQL Service for letting me write a blog post here. And thanks again for putting Sweden on the SQL Server world map!

Recently I started thinking about how to monitor over all performance for our data warehouse and mostly the impacts of different changes to the environment as a whole and over time. Both inside SQL server, but also at HW/OS level. E.g if you’re switching over to another SAN solution, making changes to the memory configuration or having table compression turned on. I figured one way to do this is to keep track of sql server agent job durations, since this is often not only used to execute all the SSIS packages for ETL, but also for a lot of cmdexec tasks doing various things in the environment such as processing a large number of files in some way etc. You’ll then want to know how big the time saving (or loss) was in the big picture.

This could also be highly interesting numbers to look at if you’re bound to specific maintenance time windows, when it’s acceptable that your data is unavailable or inconsistent, and you’re having trouble fitting everything in there. Then you want to know which job durations are stable and which are not, and which jobs take the longest time to finish when it matters, so that you can focus troubleshooting and performance optimization to these bad guys.  Of course this is not very specific, and will barely provide any help at all when it comes to troubleshooting. It’s also more of a measure for what’s going on in means of modifications to these jobs, than performance in the environment, if your agent jobs changes often and a lot. In my own case we have quite stable jobs that, if changed it’s not supposed to affect these durations very much. Although this might give you a hint of when a problem occurs and you need to look at the specifics.  In a perfect world the only one making changes to these agent jobs would be the DBA, so that he/she always know what’s going on in there. At least I would like it to be like this. Although I’m pretty certain this is not true in many cases.

This might raise an alert to you as a DBA, when a developer has made significant changes to a job – so that you’ll get the opportunity to check it out and see how it copes with what you know is the fastest and best way of doing these things and make sure it doesn’t affect job durations to the point that it will interfere with other things such as backups or different maintenance tasks.  There’s some time calculation to be done here.

The following query will return one row per job, and information about the most recent run. If there’s jobs that never has been run this will be returned with ‘1970-01-01’ as RunDate. I use this query in a agent job to insert the results into a table, so that I can look at the results and variations over time with the help of a SSRS report. I scheduled this to run once a day.

SELECT CAST(convert(char(8), CASE a.last_run_date WHEN 0 THEN ‘19700101’ ELSE a.last_run_date END) AS date) as RunDate,            CASE LEN(a.last_run_time)

WHEN 5 THEN ‘0’ + LEFT(convert(nvarchar(6),a.last_run_time), 1) + ‘:’ + (SUBSTRING(convert(nvarchar(6), a.last_run_time), 2, 2)+ ‘:’ + RIGHT(convert(nvarchar(6), a.last_run_time), 2))

WHEN 6 THEN LEFT(convert(nvarchar(6), a.last_run_time), 2) + ‘:’ +(SUBSTRING(convert(nvarchar(6), a.last_run_time), 3, 2) + ‘:’ +RIGHT(convert(nvarchar(6), a.last_run_time), 2))

ELSE ’23:59:59′            END AS RunTime,                b.name as JobName,

(select COUNT(step_id) from [msdb].[dbo].[sysjobsteps] where job_id =a.job_id) as NbrOfSteps,

CASE a.last_run_outcome WHEN 0 THEN ‘#FAILED#’

WHEN 1 THEN ‘Succeeded’WHEN 3 THEN ‘#STOPPED#’ ELSE ‘N/A’ END AS Outcome,

CASE LEN(a.last_run_duration)

WHEN 1 THEN a.last_run_duration

WHEN 2 THEN a.last_run_duration

WHEN 3 THEN (convert(decimal, LEFT(a.last_run_duration, 1)) * 60) +RIGHT(a.last_run_duration, 2)

WHEN 4 THEN (convert(decimal, LEFT(a.last_run_duration, 2)) * 60) +RIGHT(a.last_run_duration, 2)

WHEN 5 THEN (convert(decimal, LEFT(a.last_run_duration, 1) * 60) * 60) +(convert(decimal, SUBSTRING(CAST(a.last_run_duration AS nvarchar(5)), 2, 2)) * 60) +RIGHT(a.last_run_duration, 2)

WHEN 6 THEN (convert(decimal, LEFT(a.last_run_duration, 2) * 60) * 60) +(convert(decimal, SUBSTRING(CAST(a.last_run_duration AS nvarchar(6)), 3, 2)) * 60) +RIGHT(a.last_run_duration, 2)

END AS DurationInSecs,

CASE LEN(a.last_run_duration)

WHEN 1 THEN cast(convert(decimal, a.last_run_duration) /60 AS decimal(6,2))

WHEN 2 THEN cast(convert(decimal, a.last_run_duration) /60 AS decimal(6,2))

WHEN 3 THEN cast(LEFT(a.last_run_duration, 1) + (convert(decimal,RIGHT(a.last_run_duration, 2)) /60) AS decimal(6,2))

WHEN 4 THEN cast(LEFT(a.last_run_duration, 2) + (convert(decimal,RIGHT(a.last_run_duration, 2)) /60) AS decimal(6,2))

WHEN 5 THEN cast((convert(decimal, (LEFT(a.last_run_duration, 1)) * 60) +SUBSTRING(CAST(a.last_run_duration AS nvarchar(5)), 2, 2) + (convert(decimal,RIGHT(a.last_run_duration, 2)) /60)) AS decimal(6,2))

WHEN 6 THEN cast((convert(decimal, (LEFT(a.last_run_duration, 2)) * 60) +SUBSTRING(CAST(a.last_run_duration as nvarchar(6)), 3, 2) + (convert(decimal,RIGHT(a.last_run_duration, 2)) /60)) AS decimal(6,2))

END AS DurationInMins,

CASE LEN(a.last_run_duration)

WHEN 1 THEN cast(convert(decimal, a.last_run_duration / 60) /60 AS decimal(6, 2))

WHEN 2 THEN cast(convert(decimal, a.last_run_duration / 60) /60 AS decimal(6, 2))

WHEN 3 THEN cast((convert(decimal, LEFT(a.last_run_duration, 1)) /60) +(convert(decimal, RIGHT(a.last_run_duration, 2)) / 60) /60 AS decimal(6,2))

WHEN 4 THEN cast((convert(decimal, LEFT(a.last_run_duration, 2)) /60) +(convert(decimal, RIGHT(a.last_run_duration, 2)) / 60) /60 AS decimal(6,2))

WHEN 5 THEN cast(LEFT(a.last_run_duration, 1) + (convert(decimal,SUBSTRING(CAST(a.last_run_duration AS nvarchar(5)), 2, 2)) /60) + (convert(decimal,(RIGHT(a.last_run_duration, 2)) / 60) /60) AS decimal(6,2))

WHEN 6 THEN cast(LEFT(a.last_run_duration, 2) + (convert(decimal,SUBSTRING(CAST(a.last_run_duration AS nvarchar(6)), 3, 2)) /60) + (convert(decimal,(RIGHT(a.last_run_duration, 2)) / 60) /60) AS decimal(6,2))

ELSE 0            END AS DurationInHrs

FROM [msdb].[dbo].[sysjobservers] a   inner join [msdb].[dbo].[sysjobs] b on a.job_id = b.job_id   order by a.last_run_date, a.last_run_time