First thread

Previous thread summary

  • The Nvidia driver wouldn’t load on the older kernel because of something called a “shim layer” which has to be compiled separately for each kernel using something called dkms. There is apparently a way to do the dkms thing again in a way that would let the Nvidia driver load on the older kernel, but I’m not really feeling up to it. The only things I’m missing out on without the Nvidia driver are RVC and the TV HDMI, anyways.
  • I failed to switch to the Nouveau graphics driver because of some error involving dependencies, but was advised against doing this sort of switching around, anyways.
  • I managed to successfully locate the failed checksums in debsums, but this ended up being an apparent red herring as the files did not appear to be corrupted but failed the checksums for other reasons. Given which files failed the checksums, the running theory is that Linux Mint maybe ships Ubuntu packages but overwrites specific files to e.g. get rid of material that infringes on Ubuntu’s copyright/trademark (delenda est).
  • I’ve been given instructions on how to regenerate initramfs on my most up-to-date kernel any time I want, but am I up to that? …Maybe. After I finish my show on Blorptube later today, maybe. It probably isn’t that risky to regenerate initramfs but it still feels risky.
  • People are still saying that my OS is “cooked” and I should probably just do a fresh install, but I’m not giving up the ship just yet. If I am going to go through the rigmarole of reinstalling everything then I also floated the idea of doing this “distro hopping” thing I’ve heard so much about.

Most notable development: My computer’s Internet cut off again last night for the first time since I switched to the older kernel, which suggests that my trouble had nothing to do with either the kernel version or my graphics driver. Just as the last few times this has happened, I got a kernel panic when I tried to shut my computer down after the Internet trouble happened.

I ran sudo journalctl -b -1 -k to get the kernel logs from last night’s incident, and the kernel logs had a lot of talk about things like:

  • pcieport device error
  • pcieport ACSViol
  • pcieport broken device, retraining non-functional downstream link
  • PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
  • pcieport AER: subordinate device reset failed
  • pcieport AER: device recovery failed
  • pcieport DPC: containment event, status:0x1f01 source:0x0000
  • pcieport DPC: unmasked uncorrectable error detected
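
For what it's worth, the status:0x1f01 value in the DPC line can be picked apart bit by bit. Below is a minimal sketch of that, assuming the bit layout of the PCIe DPC Status register as I recall it from the Linux kernel's pci_regs.h (the PCI_EXP_DPC_STATUS_* masks). The function name and dictionary keys are made up for illustration, and the masks are worth checking against the header before trusting the result.

```python
# Bit masks for the PCIe DPC Status register, as recalled from the Linux
# kernel's pci_regs.h. ASSUMPTION: verify against the header before relying on them.
DPC_TRIGGER = 0x0001             # containment was triggered
DPC_TRIGGER_REASON = 0x0006      # why it was triggered (2-bit field)
DPC_INTERRUPT = 0x0008           # interrupt status
DPC_RP_BUSY = 0x0010             # root port busy
DPC_TRIGGER_REASON_EXT = 0x0060  # extension of the reason field
DPC_RP_PIO_FEP = 0x1F00          # RP PIO first error pointer

TRIGGER_REASONS = {
    0: "unmasked uncorrectable error",
    1: "ERR_NONFATAL received",
    2: "ERR_FATAL received",
    3: "see trigger reason extension",
}


def decode_dpc_status(status: int) -> dict:
    """Split a raw DPC status word into named fields (hypothetical helper)."""
    return {
        "triggered": bool(status & DPC_TRIGGER),
        "reason": TRIGGER_REASONS[(status & DPC_TRIGGER_REASON) >> 1],
        "interrupt_pending": bool(status & DPC_INTERRUPT),
        "root_port_busy": bool(status & DPC_RP_BUSY),
        "reason_extension": (status & DPC_TRIGGER_REASON_EXT) >> 5,
        "rp_pio_first_error_pointer": (status & DPC_RP_PIO_FEP) >> 8,
    }


if __name__ == "__main__":
    # 0x1f01 is the status value quoted in the log message above.
    for name, value in decode_dpc_status(0x1F01).items():
        print(f"{name}: {value}")
```

Under that layout, the reason field comes out as the "unmasked uncorrectable error" case, which at least agrees with the last pcieport message in the list.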

I also saw a lot of talk about:

  • mt7921e AER: can’t recover (no error_detected callback)
  • mt7921e not ready 1023ms after DPC; waiting
  • […]
  • mt7921e not ready 65535ms after DPC; waiting

The aforementioned messages about PCIe and mt7921e basically repeated in the same order several times, though I haven’t necessarily written them in order here. There was also a single message amidst the PCIe and mt7921e messages that said: workqueue: delayed_fput hogged CPU for >10000us 11 times, consider switching to WQ_UNBOUND
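
Since the same handful of messages repeat many times, a quick tally can make the pattern easier to see than scrolling. This is only a rough sketch: the sample lines are fragments copied from the messages quoted above (not complete journalctl output), and the group labels and the summarize function are my own invention.

```python
import re
from collections import Counter

# Keywords taken from the messages quoted in the post; the labels are made up.
PATTERNS = {
    "pcie_port": re.compile(r"\bpcieport\b"),
    "wifi_driver": re.compile(r"\bmt7921e\b"),
    "dpc": re.compile(r"\bDPC\b"),
    "aer": re.compile(r"\bAER\b"),
}
# Pulls the wait time out of "not ready <N>ms after DPC; waiting".
WAIT_RE = re.compile(r"not ready (\d+)ms after DPC")


def summarize(lines):
    """Count how many lines mention each keyword and collect DPC wait times."""
    counts = Counter()
    waits = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
        match = WAIT_RE.search(line)
        if match:
            waits.append(int(match.group(1)))
    return counts, waits


# Fragments quoted in the post, used here only as sample input.
SAMPLE = [
    "pcieport DPC: containment event, status:0x1f01 source:0x0000",
    "pcieport AER: device recovery failed",
    "mt7921e AER: can't recover (no error_detected callback)",
    "mt7921e not ready 1023ms after DPC; waiting",
    "mt7921e not ready 65535ms after DPC; waiting",
]

if __name__ == "__main__":
    counts, waits = summarize(SAMPLE)
    for label, n in counts.most_common():
        print(f"{label}: {n}")
    print("wait times (ms):", waits)
```

To run it on a real log, one option would be to save the output of the journalctl command to a text file and pass the file's lines to summarize instead of SAMPLE.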


Some magic words here seem to be AER for “Advanced Error Reporting”, ACSViol for “Access Control Services violation”, and DPC for “Downstream Port Containment”. I’m still not quite sure what the message with the “fput” and “WQ” stuff was about, something about the computer procrastinating on something, I guess…?

PCIe is a “high-speed standard used to connect hardware components inside computers […] commonly used to connect graphics cards, sound cards, Wi-Fi and Ethernet adapters, and storage devices such as solid-state drives and hard disk drives.” — back when I used Windows, this laptop used to have the sound constantly fail, but that doesn’t seem relevant.

mt7921e is apparently a driver for MediaTek’s MT7921 or MT7922 wireless network adapters.

So basically… The wi-fi adapter or its driver is shitting itself for some reason, and the thingamabob that’s supposed to connect my computer’s internal devices to each other notices this, tries to fix the wi-fi thingy, fails, then cuts the device off, which manifests to me as the Internet no longer working. Then the computer rinses and repeats but never gets anywhere, leading it to get stuck during shutdown.

Does that sound like a reasonable explanation? And if it is… Well, what am I supposed to do about it?

  • trompete [he/him]@hexbear.net
    25 days ago

    The workqueue message is probably not important. It’s likely caused by the other problem. It’s a very generic sort of advice to the programmer who scheduled a job for later: this job is taking a long time and they should do something about that.

    • Erika3sis [she/her, xe/xem]@hexbear.netOP
      24 days ago

      So when you said in the previous thread to disable battery while the laptop is unplugged, did you literally mean that my laptop must be unplugged, and while it is unplugged, I need to go to the BIOS and disable battery there, such that the computer will presumably immediately shut itself off once I save the setting, until I plug it back in and push the power button again? And then I can go and enable the battery again?

      • trompete [he/him]@hexbear.net
        24 days ago

        Yes. I think mine shut down immediately without saving, but I don’t remember exactly now. Wait a minute just to be sure it’s all powered down. I did not need to re-enable the battery, just plugging it in and starting it enabled the battery again.

        • Erika3sis [she/her, xe/xem]@hexbear.netOP
          24 days ago

          Computer froze and I got the blinkenlichten moments after I wrote the previous comment, but not before I managed to type dmesg and take a picture of the screen with my phone. There’s a lot of mt7921e “driver own failed” + “timeout for driver own” – I’m pretty sure these were the messages I saw that one time the computer wouldn’t reboot/start up back in the first thread, when it was just a black screen slowly getting filled with text.

          The last thing that dmesg spat out was wlp3s0 “driver requested disconnection from AP”

          But yeah, a bit concerning that it happened immediately after it had just happened, literally moments after I’d rebooted it was already broken again.

        • Erika3sis [she/her, xe/xem]@hexbear.netOP
          24 days ago

          Alas, I could not find the settings in question in UEFI. Apparently there is no way to do this on my kind of laptop, which seems a bit ridiculous to me.

  • doodoo_wizard@lemmy.ml
    24 days ago

    The wireless card is not seated properly. Open the laptop, remove the wireless card and reinsert it.

      • doodoo_wizard@lemmy.ml
        24 days ago

        Journalctl is telling you the kernel is having a problem establishing and maintaining a PCIe link to the wireless card, and the card has had a driver in the kernel for a long time, so the driver itself is less likely to be at fault.

        On the off chance I’m wrong, yanking the card, turning on the computer, and seeing whether all the errors and bad behavior go away, then turning it off and reinserting the card, will give you proof positive that you’re barking up the right tree with the wireless card. And if it fixes the problem for good, then I was right and that’s an L, but at least your wireless is working…

  • dead [he/him]@hexbear.net
    24 days ago

    It would have been helpful if you had mentioned in your first post that it was wifi failing. Wifi drivers are notoriously bad on GNU/Linux. I always plug in a wire.

    If it is only your wifi driver giving you problems, you could consider getting a USB wifi device which has an Open Source driver. I have a USB wifi antenna with a FOSS driver. Swapping the internal card is less straightforward, because on many laptops the BIOS restricts which wifi cards you can use, reportedly because of FCC rules.

    Your kernel is probably not “old” if you are using the latest version of Mint. Nvidia compiles binaries for the current versions of Ubuntu and Debian. It is more likely that you installed an older version of the driver. DKMS is used because the Nvidia driver is built outside the kernel tree and has to be rebuilt for each kernel you run. The driver is proprietary and Nvidia doesn’t want its code mixed with the kernel code, since distributing GPL code combined with proprietary code would violate the GPL.

    Or it is possible that the DKMS build failed. Maybe you don’t have the Linux headers installed.
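
    One rough way to check for the headers is to look for the kernel build directory that DKMS compiles against. The sketch below assumes the usual Debian/Ubuntu layout, where the headers package provides /lib/modules/<kernel release>/build; the function names are made up, and a missing directory only hints at missing headers rather than proving it.

```python
import platform
from pathlib import Path


def kernel_build_dir(release=None, modules_root="/lib/modules"):
    """Path where out-of-tree module builds usually look for kernel headers.

    ASSUMPTION: the conventional /lib/modules/<release>/build layout.
    """
    release = release or platform.release()
    return Path(modules_root) / release / "build"


def headers_present(release=None, modules_root="/lib/modules"):
    """True if the build directory exists (a dangling symlink counts as missing)."""
    return kernel_build_dir(release, modules_root).is_dir()


if __name__ == "__main__":
    release = platform.release()
    status = "found" if headers_present(release) else "NOT found"
    print(f"kernel {release}: build directory {status} at {kernel_build_dir(release)}")
```

    Passing a release string explicitly lets you check a kernel other than the one currently running, such as the older one you booted into.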

    Important: you should not install system-level software by any means other than the package manager. Do not use driver install scripts. Do not use install scripts. Do not run install executables that you downloaded from the web.

    Read this. It’s a guide on how to install software on Debian/Ubuntu/Mint without breaking your OS. https://wiki.debian.org/DontBreakDebian

    Nvidia has a repository of drivers for Ubuntu and Debian. This kinda breaks one of the rules of DontBreakDebian, but I think it’s the safest way to install the drivers. I have the 580 Nvidia driver installed on my Debian 13 system and it works fine for playing games. I tried upgrading to the 590 driver and that was buggy, so I went back to 580.

    https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/ubuntu.html

    https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/debian.html

    If you’re on a laptop, you don’t plan to play games, and you have an Intel processor, you can use the Intel graphics driver, which is Open Source. You might have to turn off Nvidia graphics in the BIOS.

    Something you should also know is that Linux has a manual command. You just type into the terminal: man [program name] . It will give you a user manual on how to use whatever program you have installed.

    When you want to install software on your system, the first thing you should do is search the distro repository and see if you can install it from the package manager. If the repository doesn’t have the software you want, or doesn’t have the version you need, you should search flatpak/flathub for the software. If the software you want is not available in flatpak, you should try to compile it in a local user environment. If the software cannot be compiled in a local environment, you should try to compile it as a package.