A user was having trouble with a trigger set up to identify a process that has been running too long, resulting in either false positives or no action taken. Suggestions were made to set up a scheduled trigger with invoke-cuquery and possibly use invoke-cuaction with Remote PowerShell for the killing of the process. A suggestion was also made to open a support ticket. Another user also highlighted the filter on Start Time in the GUI as not being useful for the things they needed it for.
Read the entire ‘Troubleshooting Trigger Issues for Process Identification’ thread below:
I have been working on a trigger that seems like it should be pretty simple, but doesn’t work as expected…looking for ideas about what I can do differently to make it functional.
The situation I am working to resolve is that admins leave a process running that can eventually cause issues for the Citrix users on the server if the process is left to run too long. I have set up an advanced trigger:
Record Type: Process
Filters:
• Machine is citrixservername
• And Name is ProcessName.exe
• And User is * (this is mostly just so I get the user name in the email report to myself)
Include processes that were just discovered: Checked
Minimum Duration in new state: 14 hours
Properly scoped
Actions:
• Run a script that kills the process
• Send an email via SMTP server
Wait at least 5 minutes before repeating
I’m seeing two different issues:
- I get false positive results. The administrator (including myself, so I know it’s true) has closed the process long before the 14-hour mark, but the email and script are still run against a non-existent process.
- Much more common, the process is running for much longer than the 14 hour mark and nothing happens to it ever.
If I lower the Minimum Duration in new state value to like 5 minutes for testing it always seems to run perfectly normal.
I have tried adding different values to the filter including CPU, Memory or PID just to see if I could get it to keep track of the process. I assume that CU just can’t keep track of the state of the process for the length of time needed for issue 2, but issue 1 is pretty odd too as the process shouldn’t have a state for that long.
Is there anything I can do to make this work better?
You’ll probably want to open a support ticket for this. However, a few pointers:
Long "minimum duration in new state" values can be tricky, especially with volatile metrics, e.g. something like CPU equal to or greater than 90. Even a single time that CPU goes below 90 would reset your timer.
However, since all fields are static, this doesn’t seem to be the problem here.
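The reset behaviour described for volatile metrics can be sketched roughly as follows. This is only an illustration of a generic "minimum duration in state" timer, not ControlUp's actual implementation; the function name, sample data, and thresholds are all made up for the example.

```python
from datetime import datetime, timedelta


def first_fire_time(samples, threshold, min_duration):
    """Return when a duration-based trigger would first fire, or None.

    samples: chronological list of (timestamp, value) pairs (hypothetical data).
    The timer starts when value >= threshold and resets whenever it drops below.
    """
    entered = None  # time the record last entered the matching state
    for ts, value in samples:
        if value >= threshold:
            if entered is None:
                entered = ts
            if ts - entered >= min_duration:
                return ts
        else:
            entered = None  # a single sample below the threshold resets the timer
    return None


start = datetime(2024, 1, 1, 8, 0)
# One sample per minute for ten minutes, CPU constantly at 95.
steady = [(start + timedelta(minutes=m), 95) for m in range(11)]
# Same series, but with a single dip to 80 at minute 5.
dipped = list(steady)
dipped[5] = (dipped[5][0], 80)

fires_steady = first_fire_time(steady, 90, timedelta(minutes=8))
fires_dipped = first_fire_time(dipped, 90, timedelta(minutes=8))
```

With the steady series the timer reaches the 8-minute mark, while the single dip restarts the count and the trigger never fires within the window. Over 14 hours there are many more opportunities for such a reset.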
Static fields
Fields that are set-and-forget can be tricky because, well, they don’t change. One value might be in the desired state but the other may not.
Sometimes we work around this by adding a volatile metric, as the volatile metric forces re-evaluation of the trigger.
A first step might be to run invoke-cuquery to see if there are any records that currently match your desired state.
@member Thanks for the info. The metrics I used are pretty decent overall and shouldn’t trigger a change in the timer.
I’m looking in to the invoke-cuquery now. Thanks for that idea as well. Have to pick up the syntax and find the right query, but I think I can get there.
@member The sessions are idle, but not disconnected. A quick look at performing the actions on an idle session didn’t really do what I wanted easily, as there is nothing that would directly identify the process in the session. It seems like I would have to run a script to find the process in idle sessions, which would cause many more hits. If I can get the process trigger to work, it’s a much cleaner method in my mind. I do see what you are saying though.
If you export the trigger, it’ll give you the internal filter logic, which should translate pretty well to invoke-cuquery, especially the internal names for columns.
That’s cool and not something I even knew I could do.
I have the query showing the results I expect. I have two of the processes for myself that I know have been running for longer than 14 hours.
One was started before I saved the trigger. The other was started shortly after saving the trigger.
Looks like you did everything right. Reach out to support. They’ll likely want to get an escalation engineer involved to see what the status of the trigger is for those records. I don’t have any tools to get you that, as it’ll likely require memory dumps.
Thanks Dennis. I will do that.
Thinking about it some more (haven’t talked with support yet) I think I can see how different things might get confused, depending on how the processes are tracked for the trigger over the long term.
We run the process on multiple servers throughout the day, and if something in the table/tracking confuses the processes because they are mostly the same, but some key maybe gets changed, I can see the potential for both false positives and missed triggers. It totally is going to depend on the tracking over the 14 hours.
Hi @member, I am not sure the “Minimum Duration in new state” is a good fit for your use case (e.g. killing a process that runs for more than 14 hours). I would suggest another approach:
- Configure a scheduled trigger that will run every X minutes or X hours
- Run a script that uses invoke-cuquery to get the relevant Process table data, including the process start time
- Using the output, you should be able to detect processes that meet the criteria (running for more than X hours)
- Export the relevant machine names / PIDs to a file
- Now you can either run something like PSExec to remotely terminate the detected processes, or possibly use another Machine-level scheduled trigger that reviews the list of target machines / PIDs to kill and terminates the process locally
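The detection and export steps above can be sketched as follows. In practice this would be a PowerShell script fed by invoke-cuquery; Python is used here only to illustrate the logic. The record shape (`machine`, `pid`, `start_time`) and the helper names are hypothetical, and the real column names would come from the exported trigger or the query output.

```python
import csv
import io
from datetime import datetime, timedelta


def find_long_running(records, now, max_age):
    """Return (machine, pid) pairs for processes started more than max_age ago.

    records: list of dicts with hypothetical keys 'machine', 'pid', 'start_time'.
    """
    cutoff = now - max_age  # relative cutoff, recomputed on every scheduled run
    return [(r["machine"], r["pid"]) for r in records if r["start_time"] < cutoff]


def write_targets(targets, handle):
    """Write the machine / PID list as CSV for a follow-up kill step."""
    writer = csv.writer(handle)
    writer.writerow(["machine", "pid"])
    writer.writerows(targets)


# Made-up sample data standing in for the query output.
now = datetime(2024, 1, 2, 9, 0)
records = [
    {"machine": "citrix01", "pid": 4120, "start_time": datetime(2024, 1, 1, 17, 0)},  # 16 h old
    {"machine": "citrix02", "pid": 988, "start_time": datetime(2024, 1, 2, 1, 0)},    # 8 h old
]

targets = find_long_running(records, now, timedelta(hours=14))

buffer = io.StringIO()  # stands in for the export file
write_targets(targets, buffer)
```

Because the cutoff is derived from the current time on each run, nothing has to be tracked across the 14 hours; each scheduled run decides purely from the process start time.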
Maybe our PS team can engage with you guys to help with building this if needed. @member – pls keep me honest here…
^ yeah that would work. 8.8 will allow ControlUp to kill the process as well without psexec using invoke-cuaction 😉
I like this idea now that I understand the invoke-cuquery command. I was thinking I didn’t want a scheduled task running on all of the servers just for this, but this makes good sense. I really appreciate the idea here. Looking forward to the invoke-cuaction now as I haven’t looked at it at all.
Thanks for the ideas!
You should be able to run it as a Remote PowerShell command as long as it is enabled in your environment and the account has permissions on the target machine.
This is just some bit of feedback…the filters on Start Time in the GUI are not very useful for the things that I always seem to want to use them for. Unless I’m not understanding how they should work, setting a static value for Start Time comparison isn’t useful. If these could be compared to current time, then I could see use, but a static date doesn’t seem to have much use case when talking processes and time.