Weird kernel behaviour on my Lenovo: has anyone else seen symptoms like this before?
LFS 12 uses the 6.7.4 kernel. I've built plenty of kernels in my time but this one showed the strangest behaviour I've ever seen. I'm doing a bisection to see where this behaviour starts (currently it's somewhere between 6.3.4 and 6.3.7) but I'm floating an initial query here because I'd like to know if anyone has ever observed similar symptoms.
What happens is that the bootloader reports that the kernel has loaded, and then the screen freezes and the computer produces a rhythmic buzzing sound, quite different from the speaker bleeps that you get with POST errors. There are three medium-length buzzes followed by a long one, over and over. It has the feel of a diagnostic code but I haven't found any reference to coded buzz sounds anywhere. There are no visible kernel messages so nothing to get a handle on.
I'm unaware of any beep mechanism in the kernel, but I haven't read every line.
More likely it is being produced by the platform firmware (BIOS), so it depends on your hardware and what firmware is installed on it. There are other possibilities as well, such as this gem: https://docs.kernel.org/6.8/sound/hd...k-pc-beep.html
Please post your hardware and firmware makes and versions.
For old IBM hardware (Lenovo, these days I guess) 1 long, 3 short means unable to initialize video. For AMI it means bad memory.
Kernel 6.3.6 boots, so the problem lies between 6.3.6 and 6.3.7. I'll need to git clone that branch and switch to git bisect for further info.
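For readers who have not bisected before, the idea behind git bisect can be sketched as a binary search over the commits between a known-good and a known-bad point. This is only a toy model: the commit names, the culprit position and the boots_badly function are invented for illustration, and real kernel history is not always a straight line.

```python
from typing import Callable, List, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is later than mid
    return commits[hi]


# Hypothetical linear history: c0 is the good release, c9 the bad one.
commits = [f"c{i}" for i in range(10)]
culprit_index = 6  # pretend the regression arrived with c6
tested: List[str] = []


def boots_badly(commit: str) -> bool:
    """Stand-in for 'build this kernel, boot it, see whether it hangs'."""
    tested.append(commit)
    return commits.index(commit) >= culprit_index


result = first_bad(commits, boots_badly)
print(result, "found after", len(tested), "test boots")
```

Each test boot halves the remaining range, which is why even a few hundred commits between two point releases need fewer than ten builds.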
The noises are not beeps. They are low-frequency buzzes, and they occur well after POST. The actual boot process (POST -> bootloader menu -> kernel load) goes smoothly.
The computer is a Lenovo Thinkstation, basically a laptop in a tower case. It has an external power unit, so no internal fan. The UEFI is American Megatrends. Kernel says
I could do with a bit of advice as to the precise command I need to do the clone. I want to clone the branch that has the commits entered between 6.3.6 and 6.3.7, but I first need to know which is the relevant tag. There's plenty of literature online about cloning isolated git branches by using tags as identifiers, so I know the syntax of the command I shall need to enter, but I haven't found anything yet that tells me what the tags mean! For example, does the tag "v6.3.6" mean version 6.3.6 and forward (in which case this is the one I want) or does it mean the twig that ends with version 6.3.6 (in which case the one I actually need is "v6.3.7")?
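On the tag question: in git a tag names exactly one commit, the release point, together with the history leading up to it. It does not mean "this version and forward". The toy model below tries to show that, and what a range between two tags selects. The short commit names are invented; real ones are 40-character hashes.

```python
from typing import List

# Toy model: a stable branch as an ordered list of commits, oldest first.
# Commit names are invented for illustration only.
history = ["a1", "a2", "a3", "b1", "b2", "b3", "b4"]

# A tag is a name for one commit (the release point), not a range.
tags = {"v6.3.6": "a3", "v6.3.7": "b4"}


def commits_in_range(good_tag: str, bad_tag: str) -> List[str]:
    """Model of the range good_tag..bad_tag on a linear history:
    everything after the good tag's commit, up to and including
    the bad tag's commit."""
    start = history.index(tags[good_tag])
    end = history.index(tags[bad_tag])
    return history[start + 1 : end + 1]


candidates = commits_in_range("v6.3.6", "v6.3.7")
print(candidates)
```

In this model the history needed for the bisection is the one that ends at "v6.3.7", because that history also contains the "v6.3.6" commit. In a real clone, "git log --oneline v6.3.6..v6.3.7" lists the same kind of range, and "git bisect start v6.3.7 v6.3.6" (bad first, then good) begins the search over it.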
I did a quick fractional clone (just 2%) of the v6.3.6 branch last night to get an idea of the size of the task, and worked out that the full amount of data I shall need to download is about 4 GB. This is more than my whole monthly allowance of peak time downloads! Fortunately my off-peak allowance (midnight to 8 am) is 30 GB per month.
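The extrapolation above is simple proportion, sketched below. The post does not give the size of the 2% sample, so the 80 MB figure is a hypothetical value back-calculated from the 4 GB estimate. Only the 30 GB off-peak allowance comes from the post.

```python
def estimated_total_mb(sample_mb: float, sample_percent: float) -> float:
    """Scale a partial download up to the full size by simple proportion."""
    return sample_mb * 100 / sample_percent


SAMPLE_MB = 80  # hypothetical: implied by "2%" and "about 4 GB"
OFF_PEAK_ALLOWANCE_MB = 30 * 1000  # 30 GB per month, as stated in the post

total_mb = estimated_total_mb(SAMPLE_MB, 2)
fits_off_peak = total_mb <= OFF_PEAK_ALLOWANCE_MB
print(f"estimated total: {total_mb:.0f} MB, fits off-peak: {fits_off_peak}")
```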
This is clearly a job for the at daemon which I've never used before, but the syntax of the at command seems simple enough. I should be able to put it into the queue in the evening to run at 12:15am the next day. I often used to put on overnight batch jobs like this at work but I've never done one since I retired back in the nineties.
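As a rough sketch of the scheduling step, the snippet below works out the next 12:15am after a given moment and formats it as the [[CC]YY]MMDDhhmm timestamp accepted by "at -t". The queuing time used here is made up. The plain form "at 00:15 tomorrow" should do the same job when queued in the evening.

```python
from datetime import datetime, timedelta


def next_run_time(now: datetime, hour: int = 0, minute: int = 15) -> datetime:
    """Return the next occurrence of hour:minute strictly after 'now'."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # already past today, so use tomorrow
    return candidate


def at_timestamp(when: datetime) -> str:
    """Format a datetime as CCYYMMDDhhmm for use with 'at -t'."""
    return when.strftime("%Y%m%d%H%M")


evening = datetime(2024, 5, 13, 20, 30)  # hypothetical moment of queuing
run_at = next_run_time(evening)
print("at -t", at_timestamp(run_at))
```

The job itself can be fed to at on standard input or from a script file with -f. A script name such as clone-job.sh would be the reader's own choice.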
Any help, especially with the tag interpretation, will be welcome. Once I have a local clone, I think I can do the actual bisection on my own. In fact I've done it once before and even wrote a blog on it.
It turns out that the halt and the buzzing are two different things: commits around 6.3.7 refuse to boot but don't buzz. The buzz appears later and is caused by something else.
According to git, the commit that causes the halt is this:
Code:
7511a699c2265790ccaf3f3c1c57545405627075 is the first bad commit
commit 7511a699c2265790ccaf3f3c1c57545405627075
Author: Lino Sanfilippo <l.sanfilippo@kunbus.com>
Date:   Thu Nov 24 14:55:34 2022 +0100

    tpm, tpm_tis: Request threaded interrupt handler

    commit 0c7e66e5fd69bf21034c9a9b081d7de7c3eb2cea upstream.

    The TIS interrupt handler at least has to read and write the interrupt
    status register. In case of SPI both operations result in a call to
    tpm_tis_spi_transfer() which uses the bus_lock_mutex of the spi device
    and thus must only be called from a sleepable context.

    To ensure this request a threaded interrupt handler.

    Signed-off-by: Lino Sanfilippo <l.sanfilippo@kunbus.com>
    Tested-by: Michael Niewöhner <linux@mniewoehner.de>
    Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
    Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
    Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 drivers/char/tpm/tpm_tis_core.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
But I've forgotten what I need to do next. How do I find the actual bad code? Or should I now hand this off to the kernel devs?
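One possible starting point is to pull the useful identifiers out of the bisect report: the stable-tree commit, the upstream commit it was backported from, and the file it touched. The sketch below does that with regular expressions on an abbreviated copy of the report above. The patterns are written for this particular output and may need adjusting for other reports.

```python
import re

# Abbreviated copy of the bisect report quoted above.
report = """\
7511a699c2265790ccaf3f3c1c57545405627075 is the first bad commit
commit 7511a699c2265790ccaf3f3c1c57545405627075
    tpm, tpm_tis: Request threaded interrupt handler
    commit 0c7e66e5fd69bf21034c9a9b081d7de7c3eb2cea upstream.
 drivers/char/tpm/tpm_tis_core.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
"""

# The commit in the stable tree that bisect blamed.
stable_id = re.search(
    r"^([0-9a-f]{40}) is the first bad commit$", report, re.M
).group(1)

# The mainline commit that the stable commit was backported from.
upstream_id = re.search(
    r"^\s*commit ([0-9a-f]{40}) upstream\.$", report, re.M
).group(1)

# Files listed in the diffstat lines ("path | count +++--").
touched = re.findall(r"^\s*(\S+)\s+\|\s+\d+", report, re.M)

print("stable:  ", stable_id)
print("upstream:", upstream_id)
print("files:   ", touched)
```

In a local clone, "git show" followed by the stable ID displays the actual change. A common check is to revert that one commit on a test branch with "git revert" and see whether the kernel boots again. The upstream ID is the one to quote when looking for later fixes in mainline or when reporting the regression.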
Have you looked at this document: https://docs.kernel.org/admin-guide/...gressions.html? I guess the kernel developers are not interested in fixing bugs in kernel 6.3 or 6.7, so it could be a good idea to see if the latest mainline kernel still has the bug.
Thank you very much for that patch. I now have a lot to do. I want to build a kernel with your patch and test it. I also want to build an unpatched kernel with the tpm options deactivated (since I'm not using secure boot) and see what difference that makes. But what I most want to know right now is where you found that patch, so I can make a note of how to do that for the future.
The answer to your question is that I first noticed this problem after building the current LFS, which uses linux-6.7.4. My other systems are Slackware and antiX, which both use old stable kernels (antiX is on 6.1, I think, and Slackware-15 is still in major release 5), so you could say that I've been protected!
I used the web interface https://git.kernel.org/pub/scm/linux...able/linux.git. In the upper right corner, change 'master' to linux-6.3.y and click 'switch'. Then there is a menu under the penguin. Click 'tree'. Browse to drivers/char/tpm/tpm_tis_core.c, as that was mentioned in your post. Then click 'log'. There is a list of commits to that file, and the second one is 'tpm, tpm_tis: Request threaded interrupt handler' as in your post. Click that. The third line has a link '(patch)' and that gives the patch in mailbox format. I guess you could also search for it using the commit ID. Or use your git clone locally. Anyway, the URL seems to contain the pathname of the file and the commit ID, so it would be simple to just feed them to a wget command when needed.
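A minimal sketch of assembling such a URL from the file path and commit ID is below. The "/patch/" plus path plus "?id=" layout is an assumption about how the web interface builds its '(patch)' links and should be checked against a real link before relying on it. The repository address is a placeholder, not the real one.

```python
from urllib.parse import quote, urlencode


def patch_url(repo_url: str, path: str, commit_id: str) -> str:
    """Build a patch-download URL from a repository URL, a file path
    and a commit ID. The URL layout is an assumed one (see the text)."""
    base = repo_url.rstrip("/")
    query = urlencode({"id": commit_id})
    return f"{base}/patch/{quote(path)}?{query}"


# Placeholder repository URL; substitute the real web interface address.
REPO = "https://example.invalid/linux.git"

url = patch_url(
    REPO,
    "drivers/char/tpm/tpm_tis_core.c",
    "7511a699c2265790ccaf3f3c1c57545405627075",
)
print(url)
```

The printed URL could then be handed to wget. With a local clone, "git format-patch -1" followed by the commit ID produces the same patch in mailbox format without any download.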
Last edited by Petri Kaukasoina; 05-14-2024 at 01:52 AM.
Brilliant! With your permission, I'd like to put some of that into my blog on how to bisect a kernel. I wrote it many years ago after a problem with a kernel that crashed on my previous computer and I was intending to use it as a guide to this new problem, but I was really shocked at how sketchily I'd written it. It was a thoroughly careless piece of writing, not up to my usual standards. I'm expanding it now with everything I have learned from this problem.
Update: This is not going to be as simple as I thought. I downloaded the current LFS kernel (6.7.4) and took a look at the tpm driver file that git had flagged up as a problem. I found that there has been a huge amount of development of this code since the 6.3 series, making it quite impossible to do the simple correction I was hoping for. So I decided to approach the problem from a slightly different angle and test how this recommended kernel would behave if the tpm driver simply wasn't built at all. And whaddyaknow! When I tried to boot it, it just sat and buzzed at me as before.
I think it is going to take me weeks to get to the bottom of this, especially with my restricted download capacity (though I do at least know now that I can clone small git branches using an overnight at job without crashing through my allocation limit). And I have plenty of time ahead of me.
Of course there is another darker possibility, that I'm no longer capable of configuring my own kernel properly. After all, I shall be 79 this year and it's actually a very long time since I did a kernel configuration. On the old drive I didn't have to, because I had two LFS partitions and built each new LFS out of the previous one. When it came to building the kernel, I just copied over the old config file. I've been doing that for years. But my new drive is smaller and has only one LFS partition, hence the need to configure from scratch. So I am going to do a reality check:
I have installed in Slackware-15 a recent kernel image and its modules from Slackware-current. This kernel is guaranteed to be configured properly! I will make an initrd for it using Patrick's wonderful script and set it up in elilo.conf as an alternative Slack boot. If it boots successfully, then it's my configuration which is at fault. If it halts and buzzes, then it's the kernel.
In that case, at the rate Slackware moves, I shall have at least another year to try and fix things.