I recently upgraded my gaming computer from Fedora 38 to 40 and it didn't go as planned. Fixing the partially completed upgrade was painful, but luckily I was able to finish it. Here are the notes I took on the process I used to rescue my system.
More regular programming will resume eventually. I have another Fedora-related bugbear to write about (related to Asahi Linux)... but that's for another day.
Disclaimer #
This is a very specific set of steps that I took to fix my system. I can't guarantee that it will work for you, but it might be worth a shot if you're in a similar situation. Your mileage may vary. If you're in doubt, I would ask for help on the Fedora forums.
Waiting for an upgrade to complete #
First, I booted the system and ran dnf upgrade to see if there were any packages that needed to be updated. There were (the machine hadn't been powered on since February), so I installed those and rebooted. So far, so good.
Next, I began the actual upgrade using the GNOME Software application, as recommended in the Fedora documentation. Everything downloaded successfully and I was able to reboot the machine. The upgrade started but did not progress much beyond 52% completion. I let it sit there for 45 minutes. After about 10 more minutes of waiting, I decided to power off the machine and reboot, since the upgrade appeared stuck with no output or progress.
Big mistake.
The reboot #
After rebooting, the system was in a bad state. The system booted with a Fedora 38 kernel. I could not log into the desktop. When I switched to a shell with Ctrl+Alt+F2, the system claimed to be on Fedora Linux 40. I was able to log in as my user on the shell successfully, which told me that there was a chance to recover the system, or at the very least, back up my data before a reinstall.
So we have a system in a partially upgraded state: some chunk of the system is from Fedora 38, some other chunk is from Fedora 40. Let's try to fix it.
Fixing the partially upgraded system #
My first step was to try the obvious thing and run dnf distro-sync --releasever=40. This command should sync the system to the Fedora 40 release. Unfortunately, it failed with a deceptively simple error:
Traceback (most recent call last):
  File "/usr/bin/dnf", line 57, in <module>
    from dnf.cli import main
ModuleNotFoundError: No module named 'dnf'
Oh dear. Our package manager is broken. Are we screwed?
A Python scavenger hunt #
I guessed that in the intervening period, the system Python version had been upgraded. (It turns out that Fedora 39 upgraded to Python 3.12, up from Python 3.11 in Fedora 38. But it was late into the night and looking over changelogs was the last thing I wanted to do.) I opened a Python shell and confirmed that the system Python version was indeed 3.12.
Now I just needed to make sure the dnf module actually existed somewhere; otherwise I would have to manually reinstall dnf via rpm before making any progress. While there's a more elegant, Pythonic way of doing it, I examined /usr/lib/python3.12/site-packages and confirmed that the dnf module was absent, then tried /usr/lib/python3.11/site-packages and discovered that the module existed there. I didn't know this at the time, but dnf had not yet been upgraded when I rebooted the system. From there, I made my first leap of faith: I patched /usr/bin/dnf to invoke python3.11 instead of python.
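A minimal sketch of the two steps above, written as shell functions. The names find_module and repoint_shebang are hypothetical, as is the .bak backup; none of them are part of dnf or Fedora. The sketch assumes the script starts with a #! line and that replacing the whole line is acceptable (any interpreter flags on that line are dropped), so it is worth checking the first line with head -n 1 before patching anything.

```shell
# Hypothetical helper: list every site-packages directory under a root
# (default /usr/lib) that contains the named Python module.
find_module() {
    module=$1
    root=${2:-/usr/lib}
    for dir in "$root"/python3*/site-packages/"$module"; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Hypothetical helper: point a script's shebang at a specific interpreter.
repoint_shebang() {
    script=$1
    interpreter=$2

    # Refuse to touch files that do not start with a shebang line.
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!'*) ;;
        *) echo "no shebang in $script" >&2; return 1 ;;
    esac

    # Keep a backup so the change can be reverted after the upgrade.
    cp -p "$script" "$script.bak" || return 1

    # Rewrite only line 1, reading from the backup and writing in place
    # so the file keeps its permissions.
    sed "1s|^#!.*|#!$interpreter|" "$script.bak" > "$script"
}
```

On a system like the one described here, this would be called as find_module dnf followed by repoint_shebang /usr/bin/dnf /usr/bin/python3.11, run as root. Copying the .bak file back restores the original interpreter line.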
Now we can run dnf again. This time, the command runs successfully. Our system is far from fixed, but we now have a working package manager, which we will need to fix the rest of the system.
Fooling the system into thinking it's Fedora 40 #
Now that dnf
works, we can attempt to fix the system. That is where I found a really helpful forum post from a user experiencing a rather similar issue to me, except I had the dnf
hoop to jump through.
First, we need to reinstall the fedora-release package using the "correct" version. This package contains the release information for the system. Since that information is probably wrong on a half-upgraded system, fixing it first matters for everything that follows. We can do this with the following command:
dnf --releasever=40 reinstall fedora-release-\*
Now, dnf should believe that we are "supposed" to be on Fedora 40. I'm still using --releasever=40 to ensure that we're installing the Fedora 40 packages, but it should not be necessary once you run this command.
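One way to check which release the system now reports, sketched under the assumption that /etc/os-release is present and uses the usual KEY=value format. The function name os_release_version is made up for this example.

```shell
# Hypothetical helper: print VERSION_ID from an os-release style file.
# Pass a full path; with no argument it reads /etc/os-release.
os_release_version() {
    file=${1:-/etc/os-release}
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( . "$file" && printf '%s\n' "$VERSION_ID" )
}

# Assumed usage on the affected system:
#   os_release_version      # hoping for 40 after the reinstall
#   rpm -E %fedora          # another common way to ask the same question
```

If this still prints 38 after the reinstall, keeping --releasever=40 on every subsequent dnf command is the safer choice.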
Note that you will need internet access for this to work. Luckily, the system was connected via Ethernet, so I didn't have to worry about configuring that myself, presumably because NetworkManager was working correctly. (If you are on Wi-Fi, you might have to configure the network manually.)
The strategy #
The system has duplicate packages (Fedora 38 packages that were upgraded to Fedora 40 but never removed, which dnf hasn't cleaned up) and packages that simply haven't been upgraded to Fedora 40. Our goal is to remove all the duplicate Fedora 38 packages, finish the upgrade using dnf distro-sync, and then reboot into a hopefully-working Fedora 40 system.
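To get a rough idea of how many duplicates there are before removing anything, the installed package list can be scanned for name and architecture pairs that appear more than once. This is only an approximation with a hypothetical helper name, find_duplicates; dnf has its own query for this, dnf repoquery --duplicates, which is the better source of truth when dnf is working.

```shell
# Hypothetical helper: read one name.arch per line on stdin and print
# each pair that occurs more than once.
find_duplicates() {
    # Kernel packages and GPG keys are normally installed in several
    # versions side by side, so they are filtered out of the report.
    LC_ALL=C sort | uniq -d | grep -v -e '^kernel' -e '^gpg-pubkey' || true
}

# Assumed usage on the affected system:
#   rpm -qa --qf '%{NAME}.%{ARCH}\n' | find_duplicates
```

An empty result after the cleanup suggests that no package is still installed in both its Fedora 38 and Fedora 40 versions.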
Removing duplicate packages #
First, let's remove all the duplicate packages. We can do this with the following command:
dnf remove --duplicates --releasever=40
In theory, this command should remove all the Fedora 38 packages and install the Fedora 40 equivalents. However, it didn't work for me, initially due to multi-arch conflicts. At this point, I wanted to prioritize safety over speed, so I opted to resolve each conflict by hand so that I knew exactly what I was changing, instead of specifying any dnf options to go "nuclear" (like permitting broken dependencies).
- I was concerned about the large number of packages that I needed to download, but since dnf was installing Fedora 40 packages in exchange, it was probably safe to proceed.
- After the download, dnf's transaction test failed with a large volume of conflicts caused by multi-arch packages (this was an x86_64 system with i686 packages installed). I elected to remove all the i686 packages from the system using dnf remove \*.i686, temporarily losing the ability to run 32-bit applications on the system. We can reinstall those later.
- Next, I had conflicts related to the KDE Frameworks 5 to 6 upgrade. I removed all the kf5-* packages from the system in order to proceed, since I don't use KDE on this machine.
- There was a conflict caused by mozjs78 (the SpiderMonkey engine version bundled with Firefox 78) being missing in Fedora 40. I removed the mozjs78 package from the system. Nothing depended on it in my case.
- Finally, there was a conflict caused by the cups-filters package being partially split into a -driverless package in Fedora 40. I removed the cups-filters package from the system. I don't print on this machine, so it was reasonably safe to remove.
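Since the i686 packages are meant to be reinstalled later, it may help to record their names before removing them. A small sketch, with packages_for_arch as a hypothetical helper and i686-packages.txt as an arbitrary file name:

```shell
# Hypothetical helper: read "name arch" pairs on stdin and print
# name.arch for every package built for the requested architecture.
packages_for_arch() {
    arch=$1
    awk -v arch="$arch" '$2 == arch { print $1 "." $2 }'
}

# Assumed usage on the affected system, before removing anything:
#   rpm -qa --qf '%{NAME} %{ARCH}\n' | packages_for_arch i686 > i686-packages.txt
# and later, once the upgrade is done and the list has been reviewed:
#   dnf install $(cat i686-packages.txt)
```

Some names in the saved list may no longer exist in Fedora 40, so the list is a starting point for review rather than something to feed back to dnf blindly.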
After all these steps, I was able to run the dnf remove --duplicates --releasever=40 command successfully. This command removed all the duplicate Fedora 38 packages from the system, and installed the Fedora 40 equivalents.
The distro-sync and solving the mystery of my long upgrade #
Now that we have removed all the duplicate packages, we can run the dnf distro-sync --releasever=40 command. This command will upgrade all the packages that have yet to be upgraded to Fedora 40. It did eventually run successfully for me.
But it was taking quite a while for the upgrade to complete. In particular, it was stuck on running a script for smartmontools-selinux. After a few minutes, I opened up another shell and ran htop.
Turns out, 100% of a CPU core was being pegged by restorecon. That meant the system was relabeling the entire filesystem. Patience paid off: after another 30 minutes, the relabeling process completed and the rest of the upgrade proceeded smoothly. I did wonder whether this was why the original upgrade had appeared stuck, since there was essentially no feedback for a lengthy period of time.
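Without htop, the same check can be approximated with ps. The sketch below filters ps output down to a single command name; rows_for_command is a hypothetical helper, and the column list is just one reasonable choice. Note that the pcpu column from ps is averaged over the lifetime of the process rather than being a live reading like the one htop shows.

```shell
# Hypothetical helper: filter ps output (header on line 1) down to the
# rows whose last column, the command name, equals the given name.
rows_for_command() {
    name=$1
    awk -v name="$name" 'NR > 1 && $NF == name'
}

# Assumed usage from a second virtual console while the upgrade runs:
#   ps -eo pid,pcpu,etime,comm | rows_for_command restorecon
```

A restorecon row with a high CPU figure and a growing elapsed time suggests the upgrade is still relabeling files and not actually hung.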
Once the command completed, I rebooted the system, hoping for a working system.
Almost there #
The reboot was successful, but the dracut initramfs warned me that Fedora 38 was end-of-life, a sign that the system was still not fully upgraded. I had a fully working system at this point, but I still wanted to fix this issue.
I had to find the latest kernel-core version (ref) and reinstall it with dnf reinstall kernel-core. After this, I rebooted the system and saw that the new kernel was present and ready to be booted into.
My nightmare is over. I have a fully-upgraded Fedora 40 system.
So, what did I learn? #
- Don't interrupt a system upgrade. This is the most important lesson. Interrupting an upgrade can leave you with a hosed system, which might be difficult to recover.
- I was lucky that someone had a similar issue to mine, but the fact that I had to rely on a random internet forum post to fix my system was unnerving. However, I understood what the commands were doing and was willing to accept the risks.
- Linux can be resilient. The system actually booted to multi-user and I had working internet access, making the system recoverable.
- But it is also not anti-fragile. It is laughably easy to screw up a Linux system and put it in a broken or unbootable state. I was lucky that I was able to recover the system. If things had shaken out differently, I would have had to resort to attempting to fix the system via a chroot (there was also an Ubuntu install on a separate partition on the system) or reinstalling the system from scratch via a live USB.