killing a job

hi,

I ran a job from the Administrator. Since it had been running for so long, I tried to kill it, but instead I deleted its log from the Administrator.

Now I am unable to run that job again, because it looks like the previous run is still going in the back end. Do you have any clue how to kill that previous run now? :hb:

Any input is highly appreciated…

Reg,
Rani


rani_cts (BOB member since 2008-06-05)

What message pops up when you try running again?


ganeshxp :us: (BOB member since 2008-07-17)

The job just hangs… it had been running for almost 13 hrs…


rani_cts (BOB member since 2008-06-05)

I don't think the issue is caused by the previously started job.

The issue is with the job itself. How big is the processing in that job?

A simple test is to replicate the job and run the copy, just to satisfy yourself. I believe the problem is in the DF.


ganeshxp :us: (BOB member since 2008-07-17)

I have the same problem. The job started last night, and 12 hours later it is still running.

I tried to run it again and it just hangs at the same place. I ran it in dev and everything runs smoothly. I've looked all around and never found a way to manually kill a job. There is no log in Data Services I could right-click. I tried restarting the job server and the Data Services service, and removing the job from the scheduled jobs (I read this one somewhere); nothing works.
There's got to be an easy way… Can someone help me?


Derf :canada: (BOB member since 2011-05-16)

Go to the Management Console and see if you can abort it. There is an option for that there.


ganeshxp :us: (BOB member since 2008-07-17)

Just did. The message:

Failed to stopped the job-JOB_SALES_SUMMIT in Repository BOE_DI_PROD2. Error message: BOE_DI_PROD2 (BODI-3016184)

It didn't work; the job is still running. :hb:


Derf :canada: (BOB member since 2011-05-16)

Folks, I ran into this issue by accident too. It looked crazy when I saw it. I have no idea how a job can keep running if the service itself is stopped. I am not able to see anything for it in Task Manager. :cuss:

Any solutions to this, gurus?


ganeshxp :us: (BOB member since 2008-07-17)

Let’s talk about the relationship between the processes a bit.

AL_JobService: Is the service main handler. So when you go to Windows Services and stop the DI Service, you stop that process. Very lightweight.

AL_JobServer: Started and killed by the AL_JobService, acts as the network listener, is asked to start al_engine processes, read log files, execute schedules. Lightweight

al_engine: The process for jobs and dataflows. All the processing happens there.

So when you stop the service, you shut down the AL_JobService, and it in turn brings down the AL_JobServers. Any al_engine processes already running are not killed; they continue running. If you really want to get rid of a job/dataflow, kill the al_engine process via the OS.
(Note: occasionally al_engine processes are started for other things, like usage dependency calculation or ATL export. Most importantly, when using Realtime Services, each realtime service is one al_engine, and you don't want to kill that.)
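To make the "kill the al_engine process via the OS" step concrete, here is a minimal sketch for a Windows job server. It only parses the output of the standard `tasklist` command and builds a `taskkill` command; it does not kill anything by itself. The sample output and PIDs are made up for illustration, and picking the right PID (and not a realtime service) is still up to you.

```python
import csv
import io

# Windows command that lists only al_engine.exe processes as headerless CSV.
TASKLIST_CMD = ["tasklist", "/FI", "IMAGENAME eq al_engine.exe", "/FO", "CSV", "/NH"]


def parse_engine_pids(tasklist_csv):
    """Return the PIDs of al_engine.exe rows in `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Real rows have 5 fields: image name, PID, session name, session#, memory.
        # The "INFO: No tasks are running..." line does not, so it is skipped.
        if len(row) == 5 and row[0].lower() == "al_engine.exe" and row[1].isdigit():
            pids.append(int(row[1]))
    return pids


def taskkill_command(pid):
    """Build (but do not run) the command that force-kills one process."""
    return ["taskkill", "/PID", str(pid), "/F"]


# Hypothetical sample output, for illustration only.
sample = (
    '"al_engine.exe","4120","Services","0","215,300 K"\n'
    '"al_engine.exe","5388","Services","0","98,764 K"\n'
)
pids = parse_engine_pids(sample)
print(pids)
print(taskkill_command(pids[0]))
```

On the job server itself, the input could come from `subprocess.run(TASKLIST_CMD, capture_output=True, text=True).stdout`, and the chosen command could be passed to `subprocess.run` once you are sure which process belongs to the stuck job.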


Werner Daehn :de: (BOB member since 2004-12-17)

Well, that is the same impression I had in mind too.

But what happened confused me. The machine is pretty much a brand-new system and I was still building things on it, so no other jobs were running or scheduled. Practically a blank server.
This job started and got stuck at a DF. Nothing else was running on that machine, and I was its only owner, so no one else even knew the server details.
The interesting thing is that I could not see any al_engine process for it, so I was not able to kill the job.

Like Derf, I feel there should be some simpler way to stop it?

As further evidence, the END TIME in Management Console was never populated. And the best part: after 3 days the job came back to me saying it had crashed with a CORE DUMP error.

That is what happened.


ganeshxp :us: (BOB member since 2008-07-17)

hi,

When we hit the above issue, we just went into the database and killed the sessions that were running. We did this in DEV; I know we can't do that in prod.


joe1234 (BOB member since 2011-04-06)

Well, I know it sounds silly, but when I tried to “abort”, it gave me this error. Then someone came and told me to click “delete” instead, and the job disappeared. I don't know if it got killed; I just know that it's no longer stuck pending without a finish time. I hope it did the trick. Thanks to you all for the help.


Derf :canada: (BOB member since 2011-05-16)

Just some info on this issue:

  • The ‘delete’ button in the job status view only deletes the entry from the log; it does not kill the job. Use the abort button to kill the job.

  • I have found that in some versions of BODS there is a bug whereby the status remains "running", but if you open the error log there is an error and the job has actually aborted.

  • If there's no error in the error log, the only sure way to know whether the job is still running is to check for an al_engine process in Task Manager on the job server. If there is no al_engine process, the job is not running. The reverse is not always true, however: if you have other jobs or real-time jobs running, those will have al_engine processes too.
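The three points above boil down to a small decision procedure. Here is a rough sketch of it; the function name, parameters, and result strings are my own labels for illustration, not anything that exists in BODS.

```python
def diagnose(error_log_has_error, engine_count, other_engine_users):
    """Classify a job whose status still shows 'running'.

    error_log_has_error: True if the job's error log contains an error.
    engine_count:        number of al_engine processes on the job server.
    other_engine_users:  True if other jobs or real-time jobs are running.
    """
    if error_log_has_error:
        # Status bug: the job likely aborted but the status was never updated.
        return "likely aborted"
    if engine_count == 0:
        # No al_engine at all means the job cannot be running.
        return "not running"
    if other_engine_users:
        # Some al_engine exists, but it may belong to another job.
        return "inconclusive"
    return "probably still running"


print(diagnose(False, 0, False))
print(diagnose(False, 2, True))
```

The "inconclusive" case is the one where you need more than a process count, for example the start time or command line of each al_engine.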


ClintL :south_africa: (BOB member since 2011-01-06)

Hi Ganesh,

We are facing the same issue. The end time is never populated for real-time jobs.
Did you find a solution for this issue?

Thanks,
Anand.S


subana87 (BOB member since 2012-11-02)

Well, I found a tool called Process Explorer. It is a free download, basically a click-and-run replacement for Task Manager, and it shows the time at which each process was started!

I always use it to pick out and kill the right AL_ENGINE.exe… It works well even on my PROD…

But make sure you get internal approval to use this application!!!


ganeshxp :us: (BOB member since 2008-07-17)

It is a great tool because it also shows the process hierarchy and the command line of each process. From that you can work out which repo the job was started for.
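Once you have the start time and command line of each al_engine, narrowing down the stuck one is a simple filter. A small sketch, assuming you have already collected the details by hand or with some other tool; the record layout, the PIDs, and the command-line strings are placeholders (not real al_engine arguments), and matching the repo by substring is only my assumption.

```python
from datetime import datetime, timedelta


def long_running(processes, now, max_age, repo_hint=""):
    """Return PIDs of processes older than max_age whose command line
    contains repo_hint (an empty hint matches every command line)."""
    return [
        p["pid"]
        for p in processes
        if now - p["started"] > max_age and repo_hint in p["cmdline"]
    ]


# Hypothetical records, e.g. copied by hand from Process Explorer.
processes = [
    {"pid": 4120, "started": datetime(2011, 5, 16, 1, 0),
     "cmdline": "al_engine.exe <args mentioning REPO_A>"},
    {"pid": 5388, "started": datetime(2011, 5, 16, 12, 30),
     "cmdline": "al_engine.exe <args mentioning REPO_B>"},
]
now = datetime(2011, 5, 16, 13, 0)
print(long_running(processes, now, timedelta(hours=6), repo_hint="REPO_A"))
```

Anything this returns is only a candidate; check it against realtime services and other legitimate long runners before killing it.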


Johannes Vink :netherlands: (BOB member since 2012-03-20)

This issue bugged us a few times in our QA and PROD environments. We tried killing the AL_* processes and the end date still would not update.

Anyway, re-syncing the code from a lower environment seemed to fix it.


s_mareddy :us: (BOB member since 2012-09-06)