Process is hanging under high CPU load even if its priority is set to the maximum
Linux - Enterprise: This forum is for all items relating to using Linux in the Enterprise.
Even with the high priority of app1 and app2, when I run a dd command that consumes 100% of its core, these two applications hang. Shouldn't the dd command free the CPU for these applications when it's running on the same core as them?!
P.S.: app1 and app2 are applications that interact with the SCTP stack, and response time is critical for them.
I think there will always be some interruptions of app1 & app2, even if they are relatively smaller/less frequent at the highest priority.
If you don't want any interruption, a variation of realtime Linux http://www.linuxfordevices.com/c/a/L...ference-Guide/ could be more suitable than a (more general-purpose) Red Hat box.
Of course, if you can manage it, it could be useful to dedicate two of your cores to app1 & app2 and use the other two for most of the other apps, like dd.
Thanks timmeke. There's a procedure to isolate cores on RHEL (http://kbase.redhat.com/faq/docs/DOC-15596), and when I start my applications I'll force them to run on the isolated processors.
AFAIK, irqbalance is responsible for distributing jobs across the CPU cores; can't I configure this process to never run a job on my applications' core (in case I don't want to isolate it totally)?
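As a sketch of forcing a process onto a chosen core at startup (here `sleep` stands in for app1, and core 0 is a placeholder for one of the isolated cores, e.g. one reserved with the isolcpus= boot option from the RHEL procedure):

```shell
# Launch a stand-in process pinned to core 0 (placeholder for an isolated core)
taskset -c 0 sleep 5 &
pid=$!
# Show the current CPU affinity of the pinned process
taskset -cp "$pid"
kill "$pid"
```

The same `taskset -c <core> <command>` invocation works for any application you want to confine at launch.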
I can understand the reluctance to completely disable irqbalance, as proposed in the procedure, and the preference to somehow "reconfigure" it instead.
Do keep in mind that irqbalance does not balance jobs/processes, only the handling of hardware interrupts by the different cores. Assigning your jobs to the cores (and keeping other jobs from running on the same core) is just part of the story.
From a quick search, there doesn't seem to be an option to configure irqbalance; it only seems to be designed for a sophisticated "equal load" balance. The exception is disabling it and setting the IRQ affinity to CPUs yourself, as outlined in the procedure. But that's a question better suited to the hardware forum...
It's probably either irqbalance or manual settings (in /proc/...), not both.
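For illustration, the manual route looks roughly like this (the IRQ number and mask below are placeholders; writing to smp_affinity requires root and assumes irqbalance has been stopped first):

```shell
# Inspect the current per-IRQ CPU masks (readable without root)
grep . /proc/irq/*/smp_affinity | head -5
# As root, with irqbalance stopped, steer a given IRQ to CPU0 only:
#   echo 1 > /proc/irq/19/smp_affinity   # bitmask 0x1 = CPU0; IRQ 19 is a placeholder
```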
As I understood your goal, you may want to dedicate cores not just for running app1 and/or app2, but maybe even for handling the device interrupts that feed the SCTP stack (which in turn is polled by your apps). If you limit these interrupts to just one core, and keep other interrupts away from that core, it should help you process them in a more timely fashion.
The question remains which cores to pick; this is architecture-related (as the 4 cores are not always working completely independently of each other).
So, in practice, I'd recommend:
1- Just to be safe, keep a bootable CD (Live CD, ...) on stand-by in case things do get messed up.
2- Think about which cores to dedicate and how, then try the posted procedure.
3- If you run into trouble with the IRQs, post to the hardware forum here on LQ.
In fact that's what I'm doing right now on a test server.
Quote:
From a quick search, there doesn't seem to be an option to configure irqbalance; it only seems to be designed for a sophisticated "equal load" balance. The exception is disabling it and setting the IRQ affinity to CPUs yourself, as outlined in the procedure. But that's a question better suited to the hardware forum...
On the following page, they say it's possible to disable irqbalance for specific isolated CPUs:
Quote:
As I understood your goal, you may want to dedicate cores not just for running app1 and/or app2, but maybe even for handling the device interrupts that feed the SCTP stack (which in turn is polled by your apps). If you limit these interrupts to just one core, and keep other interrupts away from that core, it should help you process them in a more timely fashion.
How can I specify a core for handling only SCTP interrupts?
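One hedged sketch, on the assumption that "SCTP interrupts" really means the interrupts of the NIC the SCTP traffic arrives on: find that NIC's IRQ in /proc/interrupts, then bind it to a core via smp_affinity (the interface name eth0, IRQ 24, and mask 0x4 below are all placeholders):

```shell
# List IRQ numbers and per-CPU interrupt counts; look for your NIC's row
head -5 /proc/interrupts
# e.g. to find the NIC's line:   grep eth0 /proc/interrupts
# Then, as root, bind that IRQ (say 24) to CPU2 only (bitmask 0x4):
#   echo 4 > /proc/irq/24/smp_affinity
```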
Quote:
The question remains which cores to pick; this is architecture-related (as the 4 cores are not always working completely independently of each other).
My server has four quad-core CPUs, so I'll isolate the last four cores.
I've seen that article before, and I don't like it. cgroups (a.k.a. cpusets) are a better option, IMHO.
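For completeness, a root-only sketch of the cpuset approach mentioned here, using the legacy cpuset filesystem (the mount point, set name, core numbers, and memory node are all placeholders):

```shell
# Create and populate a cpuset holding cores 2-3 (all values are examples)
mkdir -p /dev/cpuset
mount -t cpuset cpuset /dev/cpuset
mkdir /dev/cpuset/rt
echo 2-3 > /dev/cpuset/rt/cpus    # restrict the set to cores 2 and 3
echo 0   > /dev/cpuset/rt/mems    # memory node 0
echo $$  > /dev/cpuset/rt/tasks   # move this shell (and its children) into the set
```

Processes started from that shell inherit the cpuset, so app1/app2 launched afterwards stay on cores 2-3.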
Let's see the result of this - "grep -i ^processor /proc/cpuinfo"
Maybe you can tell me a little more about what your apps will be doing on the hard disk? Do they impose a heavy IO load as well?
The applications read from an SCTP socket, parse the received information, return results, and write to log files.
The traffic they receive is huge and they do impose a heavy IO load, but from the output of "top", the total CPU load on their own processor does not exceed 40% and the iowait percentage is 0%, as seen below:
One more piece of info: if I run the shell script below, which consumes a lot of CPU but does no IO at all, my applications keep working fine. So the problem is encountered only when there is heavy IO activity on the machine.
Code:
#!/bin/bash
# Busy loop: burns CPU cycles without doing any IO
while true
do
    i=0
    if [ 3 -ge 2 ]
    then
        let i++
    fi
    if [ 2 -le 3 ]
    then
        let i--
    fi
done
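As an aside, the iowait figure that "top" reports can be cross-checked straight from /proc/stat; on the aggregate "cpu" line the fields after the label are user, nice, system, idle, iowait, ...:

```shell
# Print the aggregate iowait counter (field 6 of the "cpu" line, in jiffies)
awk '/^cpu / {print "aggregate iowait jiffies:", $6}' /proc/stat
```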
In this case, will my issue be solved by isolating the CPUs? Any other suggestions? Thanks.
As IO seems to be the bottleneck, try sorting that out first (as this will have the biggest impact).
A suggestion would be to use the (somewhat old-style) ramdisk to store the log files.
You'll need to figure out a way to sync them to the hard drive occasionally.
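A non-root sketch of that idea, using /dev/shm (a tmpfs mount present on most Linux systems) for the live logs and a periodic copy, e.g. from cron, for persistence; all paths are illustrative:

```shell
#!/bin/sh
# Keep logs in RAM-backed tmpfs, sync them to disk periodically
LOGDIR=/dev/shm/app-logs        # fast, RAM-backed log directory (placeholder)
DEST=/tmp/app-logs-persist      # persistent copy on disk (placeholder)
mkdir -p "$LOGDIR" "$DEST"
echo "sample log line" >> "$LOGDIR/app1.log"
# This copy is what you'd run periodically (e.g. */5 * * * * from cron):
cp -a "$LOGDIR/." "$DEST/"
```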
Once IO has improved, the bottleneck may shift to the CPU, in which case the isolation may become necessary to improve further.
Bottom line: don't write it off just yet. It doesn't look as promising now, but it may still come in handy later.
I/O isn't necessarily the problem. Uninterruptible sleep is generally thought to be caused by (disk) I/O, but not necessarily; it's just an attribute of a process. And as stated, the %wa is zero, which means no tasks are waiting to use the (any) CPU whilst I/O is outstanding. It could be hard to track anyway with that many online CPUs.
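To see which tasks, if any, are sitting in uninterruptible sleep at a given moment, one way is to filter the "D" state out of ps (the wchan column shows the kernel symbol each task is blocked in):

```shell
# List tasks in uninterruptible sleep ("D" in the STAT column),
# keeping the header row for readability
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'
```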
In this case I'd say poor code - presumably one of the applications under discussion or a device driver.
kjournald and pdflush are kernel threads - I wouldn't expect them to be in "D" state under a heavy I/O load. The fact that there are so many pdflush processes might indicate the (disk) I/O is very bursty. pdflush is spawned as needed to write the data to disk (after a sync, say), and I would expect the extra instances to go away after a period of I/O inactivity.
I would guess the SCTP driver is holding up the apps whilst decrypting (or whatever), then dumping a heap of I/O, then doing it all again.
For single-threaded code with that many CPUs, I can't see that binding processes to CPUs is going to help at all.
Just guessing of course.
Last edited by syg00; 06-22-2010 at 07:13 AM.
Reason: s/klogd/kjournald/