Sunday, May 15, 2022

Installing Linux on a Dell 8940

This post probably won't interest anyone who normally reads this blog. If I still had a tech blog, I'd post it there.

TL;DR:

Dell seems to have made this deliberately difficult.

You really need to understand UEFI boot partitions to make this work. You can probably convert the built-in SSD to AHCI mode using Windows (Google it), especially if you want a dual boot system.

That conversion is the key. I recommend googling the instructions and using Windows to do it first.

I did it the hard way. I learned a lot, but it wasn't fun.

Personal Background

I started using Linux in 1995. I vividly remember installing Slackware from a big stack of floppy disks onto a laptop in my grandmother's living room. I used a physical book as a guide, because I did not have internet access. (Sidenote: when I did get internet access, gopher was still more popular and usable than the recent "world wide web" fad).

I'd decided back then that Windows was a shoddy product, Bill Gates is a terrible person, and Microsoft is evil enough that I never want anything to do with them. I haven't seen anything since then to change my mind about any of that (though I was starting to think that Bill had changed his ways after he retired, until his divorce doubled down on it all).

Sometime in November, we had a power outage. It did something that fried the firmware on my main computer's motherboard.

If you don't care about the personal details, skip to the next section (Dealing with Windows).

When I started this, I didn't realize it was going to be "Choose Your Own Adventure."

I built that computer from parts, while I was living in Dallas. Laura had caught my eye, moved to Austin, and then connected me with one of her friends who got me a job that saved me from starvation.

The job was a terrible fit, and it didn't last long, but it paid well enough that I had nearly $1000 to spare to build that PC. And the job was a nice bridge to following Laura to Austin to get married and have all sorts of wonderful changes in my life. So it was a positive experience.

Building the PC was not. It was awful. I kept slicing myself open on all the sharp edges of the case. My eyes were starting to get old enough that I needed glasses, so all the fine detail stuff was blurry, but I didn't realize it yet. I had her promise me that, next time, I'd just buy one.

So, after 7 or 8 years, that computer got fried. It had had problems in the meantime. I'd done things like adding another hard drive, replacing the power supply, and beefing up the RAM. Overall, I'm really happy with how well it held up.

Chip Shortages

Once it was time to replace that computer, I discovered that it is no longer economically viable to build your own.

That used to be a good way to get exactly what you wanted, assuming you had a clue what you wanted, and you had the knowledge to put it together. It wouldn't necessarily be cheaper, but you could probably pick middle-of-the-road components that you could replace easily so it would last longer.

Right now, that isn't really an option.

Thanks to COVID, there's a world-wide shortage of a lot of things. Maybe especially computer chips.

Every system that I tried to design wound up costing at least $3500.

In a lot of ways, this means that Moore's law has at least sort-of failed for consumers. For something like 75 years, you were consistently able to purchase a better, faster computer for less money than you'd paid for the last one.

I remember reading articles a few years ago speculating that it's over: logic circuits can't get any smaller, because they're running into the limits imposed by the uncertainty of quantum mechanics.

I doubt that anyone who wrote those articles could have visualized where we are today.

Either way, every option I looked at was far more expensive than last time.

Luckily, I was shopping right around the Black Friday time-frame. Or maybe Cyber Monday. One of those disgusting shopping days where merchants drop their prices ridiculously to suck you into buying crap you don't need because it's super cheap.

I try to avoid those sorts of sales at all costs. But I felt like I actually needed this one.

And Dell had a deal that looked amazing.

I still have a semi-viable laptop (also a Dell...we found it on a clearance rack a few years ago, and it's been fine), but it isn't something I can use long-term. I could have tried to wait out the global chip shortage, but it isn't getting better in the near future.

So I bought the stupid Dell.

Wait

And then I waited.

It was supposed to get delivered around the end of December. About that time, I got a notification that let me know it had been delayed by a couple of months. Did I still want it?

I spent a lot of time considering that point. In the end, I could not find a decent video card that sold as cheaply as this entire computer.

Most of the alternatives I found would take at least another 2 months to get a decent video card.

And I'd already waited about that long.

So I told them to go ahead and ship it whenever they got one.

Lo and behold, it landed on my doorstep just a couple of days later.

Angels Sang


Dealing with Windows

If you've ever dealt with computer boot issues, you know that they all start the same way. Turn the computer on. When some logo flashes, you press some magic key. It takes you to the low-level BIOS management interface that lets you do things like specify that you want to boot from a LiveCD (well, DVD, now...and even those are getting more rare).

That was the first problem with this computer. It went straight to the Windows logo and then booted into a preliminary "Agree to these license agreements" screen so it can finalize the Windows installation.

This is where I ran into the next problem.

They present the Dell EULA right next to the Windows EULA. You have to accept them both.

Since I bought a Dell, I was fine with agreeing to that one. Since I'd reluctantly also paid for Windows, and have absolutely no use for it, I did not agree to it.

I suspect that my life would have been much easier if I had. I could have booted it up as-designed, configured Windows with a local user account, made the registry adjustments about the way the drive gets read at the firmware level, rebooted, and then installed Linux. At least, I think that would have been the easy approach. A few weeks later, I'm skeptical that it would have turned out any easier.

I could be totally wrong about the easy approach working. If you want to run Linux on this system, you may have to do it the hard way. But you might save yourself a lot of time and pain if you at least try the easy way first.

I spent a lot of time booting the system up, trying to find a way to get to the Dell logo instead of going straight into Windows.

I finally called tech support. They couldn't help me, insisting that this isn't a hardware problem. They offered to sell me the option to talk to their advanced tech support (which charges around $100). At this point, I was angry enough about their stupid design that I just wanted to send it back and start over. The tech forwarded me to customer service. CS forwarded me to some sort of RMA department.

Magical Breakthrough

Before I got a Return Merchandise Authorization, that tech had me:

  1. Turn the computer off
  2. Unplug both the power and monitor cables
  3. Hold the power button for 20-30 seconds
  4. Plug it back in
  5. Boot it back up

In their computer guide, this is the process to drain the "flea" power out of the system so it's safe to work with the internal electronic components.

I don't have any idea why this was the magical incantation to get into the BIOS setup. Maybe they ship it with some capacitor that's charged up to bypass the actual boot pieces and go directly into Windows?

Whatever the reason was, after I did this, I saw the Dell logo when I booted, and I was able to get into the BIOS setup and tell it that I want it to boot from its built-in DVD.

That let me boot into Linux! Finally!

Life seemed good. I played around with the LiveCD a bit to be sure that everything works fine. There was one problem, where the xfce4-screensaver made the UI look unresponsive. I was able to switch virtual terminals and kill the process to fix that problem. I can't remember now whether I had to use sudo or not. If not, that seems like a major security hole. I just made it a point to disable it. (I'm a big fan of xscreensaver, but I haven't gotten around to making the switch).

I couldn't see anything that looked like the actual hard drive under /dev.

I saw a few things that looked close. When I tried running fdisk on them, they failed for various reasons.

The problem was that the system couldn't see the actual disk (which, in this case, is /dev/nvme0n1).
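For anyone retracing this from a live session, here is a minimal sketch of how to check whether the kernel exposes an NVMe disk at all. It assumes the usual Linux naming, where a whole NVMe disk shows up as something like nvme0n1. The function name and the optional directory argument are made up for illustration and are not part of any standard tool.

```shell
# List whole-disk NVMe block devices (for example nvme0n1) found in a
# device directory. Partitions (nvme0n1p1) are filtered out, and controller
# nodes (nvme0) never match the glob. The directory argument is an
# illustrative addition so the function can be tried against a scratch
# directory; it defaults to /dev.
list_nvme_disks() {
    dev_dir="${1:-/dev}"
    for path in "$dev_dir"/nvme*n*; do
        [ -e "$path" ] || continue          # the glob matched nothing
        name="${path##*/}"
        case "$name" in
            *p[0-9]*) continue ;;           # a partition, e.g. nvme0n1p2
        esac
        printf '%s\n' "$name"
    done
}

# Usage on a live system: list_nvme_disks
```

If this prints nothing, the kernel has not exposed any NVMe namespace, which matches what I was seeing.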

When I finally decided to pull the trigger and do the install, it failed with an error. I forgot to write it down. Googling for the problem, I think it may have been "This computer uses Intel RST (Rapid Storage Technology). You need to turn off RST before installing Ubuntu. For instructions, open this page: help.ubuntu.com/rst."

This basically means that the drive has been configured in Intel's broken RST mode, which sort of halfway mimics a RAID. It's a ridiculous thing to do when you only have a single drive. Apparently it makes it easier to set up at the factory, for when they do ship machines with multiple drives, or something like that. The mode itself is apparently broken enough that the Linux kernel developers have refused to allow patches that support it into the kernel. This proceeded to make my life really difficult.
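A rough way to spot this before running the installer is to look at how the storage controller identifies itself on the PCI bus. The sketch below reads lspci output and looks for the word "RAID". That match is a heuristic and an assumption on my part: I never recorded what lspci printed on this machine, and the function name is made up.

```shell
# Read `lspci` output on stdin and report whether any controller line
# mentions RAID. Prints "raid" or "no-raid-seen". Matching on the word
# "RAID" is a rough heuristic, not an official Intel or Ubuntu check.
storage_mode() {
    if grep -qi 'raid'; then
        echo "raid"
    else
        echo "no-raid-seen"
    fi
}

# Usage on a live system: lspci | storage_mode
```

A result of "raid" together with no NVMe disk under /dev is a strong hint that the firmware is still in RAID On mode.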

Dealing with RST

RST has been a problem for a long time:

And also: RST is deliberately not supported.

tobestool, March 27, 2012 at 9:38 pm:

As far as I know, RST isn’t directly supported under Linux. As I believe that some of the processing is done in the Windows driver (much like many of the “Fakeraid” systems), I doubt it will be easy to support under Linux (though no doubt possible if it becomes popular enough).

Question: The bootup message is that no bootable partition is found. Somehow, it managed to find the bootable USB key when I was doing the Linux install, so that works. Looking at the partitions with fdisk shows no Windows residue. Going into advanced mode gives a warning message about first partition not starting on a physical secor boundary!?!? If nothing else, is it possible to edit UEFI to pick up the first Linux partition an boot off of it?

It probably never will be. There's a person at my day job who spends hackathons optimizing Linux kernel internals. He was surprised that anyone is still using RST, because it's such garbage.

The top answer I found on google (many apologies, I've lost the source):

fdisk doesn't read GPT-partitioned drives correctly and typically gives that error message. You may need to use parted or gparted to partition this drive and start over. Lots of discussions about this error and Linux on the net so Google...

I did find lots of discussions, but nothing in the way of resolution.

Update BIOS

This part didn't take a lot of effort. Go back into the BIOS config. Go to the System Config and the SATA Configuration tab. Switch it from "RAID On (Rapid Restore Tech)" to AHCI.

But then it wouldn't boot at all. It just acted like there were no drives in the system.

I completely disabled the option to boot from the hard disk. This caused the system to really freak out. It led to a scary screen with lots of red, insisting that I need to contact Dell tech support immediately.

I tried disabling secure boot. That didn't help, so I turned it back on.

At this point I was desperate enough to try accepting the Windows EULA so I could change the boot mode. But it wouldn't boot either. It recommended downloading an update.

Fiddle with Hardware

I removed the SSD. The Live DVD booted without a glitch.

I added a couple of hard drives from my dead computer. I was all set up to hate the case also, but I was actually really impressed with the way they've engineered that part.

That was also fine.

But I really want to use the SSD for the Operating System. And I was really nervous about installing a new OS on top of either of the OSes I already had installed on those drives. Assuming that I could have made that work at all.

So I bought an external enclosure for that SSD so I could manage it from my laptop. They're cheap and easy.

At some point, I got onto the Dell Community forums to ask for help/advice. They pointed me to a bunch of the official Dell documents that basically all agree this should just work. I'm pretty sure those documents pre-date their decision to switch RST mode on by default.

I deleted the Windows partition completely. There's also a UEFI partition. I didn't touch that one.
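With the SSD in an external enclosure, it helps to know which partition is the UEFI one before deleting anything. GPT marks an EFI System Partition with a fixed type GUID, so a sketch like this can pick it out of lsblk output. The device name /dev/sdX is a placeholder and the function name is made up.

```shell
# GPT partition type GUID that marks an EFI System Partition.
ESP_GUID="c12a7328-f81f-11d2-ba4b-00a0c93ec93b"

# Read "NAME PARTTYPE" pairs on stdin, as printed by
#   lsblk -rno NAME,PARTTYPE /dev/sdX      (sdX is a placeholder)
# and print the names whose type GUID is the ESP GUID.
find_esp() {
    while read -r name parttype; do
        # Normalise the GUID to lower case before comparing.
        guid="$(printf '%s' "$parttype" | tr 'A-F' 'a-f')"
        [ "$guid" = "$ESP_GUID" ] && printf '%s\n' "$name"
    done
    return 0
}
```

The whole-disk row has an empty type column, so it never matches.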

Then I reinstalled it into my new desktop.

It still refused to boot.

I tried turning secure boot mode back off. Nope.

Moved the SSD back to my laptop and, out of desperation, deleted the UEFI partition. That didn't help.

So I installed Linux from my laptop. As many times as I've done this install, this variation was pretty terrifying for me. I've never deliberately installed onto another disk. My terror was that I'd type something wrong and overwrite my existing system. It went as flawlessly as I could hope.
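One way to take some of the terror out of that step is to compare the install target against the disk that holds the running system before touching anything. This is only a sketch under simple assumptions: plain /dev/sdX, /dev/nvme and /dev/mmcblk names, no LVM or encrypted mappings. Both function names are hypothetical.

```shell
# Strip a partition suffix from a device path:
#   /dev/sda2      -> /dev/sda
#   /dev/nvme0n1p3 -> /dev/nvme0n1
# Whole-disk paths are returned unchanged.
parent_disk() {
    case "$1" in
        /dev/nvme*n*p[0-9]*|/dev/mmcblk*p[0-9]*)
            printf '%s\n' "${1%p[0-9]*}" ;;
        /dev/nvme*|/dev/mmcblk*)
            printf '%s\n' "$1" ;;
        *)
            printf '%s\n' "$1" | sed 's/[0-9]*$//' ;;
    esac
}

# Succeed (exit 0) only when the install target ($1) is on a different
# disk from the device holding the running root filesystem ($2).
safe_target() {
    target_disk="$(parent_disk "$1")"
    root_disk="$(parent_disk "$2")"
    [ "$target_disk" != "$root_disk" ]
}

# Usage: safe_target /dev/sdX "$(findmnt -no SOURCE /)" || echo "refusing"
```

It is no substitute for reading the installer's disk list twice, but it catches the obvious typo.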

Supposedly, you can boot from a USB drive. So I tried a couple of variations of that. I started with the SSD in its external drive enclosure. I also tried a thumb drive that I've used for temporary boots for years. None of them worked. Dell recently disabled support for USB 1 drives, which is probably the problem here.

I also tried with Secure Boot mode off, just to be sure.

I reinstalled the SSD. Nope. 

I tried re-enabling "RAID On" mode. Nope.

Breakthrough

But now, in the BIOS, I could see the Linux /boot partition. I couldn't find anything bootable, but I could browse it.

My theory is that, no matter what, the system was relying on that disk's UEFI partition. As long as the disk was physically in the system, it was looking for that. At first, that was configured to boot from the Windows partition.

When I disabled RST mode in the BIOS, it could no longer read that partition at all (because it had been created in RST mode). Instead of doing the sensible thing and booting from the disk I had told it to use, it broke.

This was a terrible design decision, and I really hope that enough people scream to Dell about it to convince them to change. Then again, the Linux community is small enough that I doubt they'd notice.

Back to the external drive enclosure and my laptop yet again.

Google led me to this command:

> grub-install --target=x86_64-efi --efi-directory=/mnt/esp --bootloader-id=GRUB --boot-directory=/mnt/root/boot --removable

(I had the drive's partitions mounted under /mnt.)

That added a bootable piece to the UEFI partition.
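The command only makes sense with the right mounts in place: the UEFI partition at /mnt/esp and the new system's root filesystem at /mnt/root. Here is a small sketch that prints the command for a given mount directory instead of running it, so it can be checked first. Note the underscore in x86_64-efi, which is how GRUB spells that target. The function name and the partition names in the comments are made up for illustration.

```shell
# Print (do not run) the grub-install command for an ESP and a root
# filesystem mounted under a common directory, laid out as <dir>/esp and
# <dir>/root. Run the printed command as root once the paths look right.
grub_install_cmd() {
    mnt="${1:-/mnt}"
    printf 'grub-install --target=x86_64-efi --efi-directory=%s/esp --bootloader-id=GRUB --boot-directory=%s/root/boot --removable\n' "$mnt" "$mnt"
}

# Mounts this assumes (sdX1 and sdX2 are placeholders, not real names):
#   mount /dev/sdX2 /mnt/root
#   mount /dev/sdX1 /mnt/esp
# Usage: grub_install_cmd /mnt
```

The --removable flag is what makes GRUB write to the fallback loader path rather than relying on a boot entry stored in the laptop's own firmware.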

After I moved the drive back into my desktop, I was able to
  • go back into the BIOS
  • go to the "Boot Sequence"
  • select that partition
  • choose /EFI/Boot/BootX64.efi
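Before moving the drive back, it is worth confirming that the fallback loader really landed on the UEFI partition. A minimal sketch, assuming the partition is mounted somewhere like /mnt/esp; the function name is made up. The match ignores letter case because the partition is FAT, and tools disagree about capitalisation.

```shell
# Print the path of the fallback loader (EFI/BOOT/BOOTX64.EFI in any
# letter case) under a mounted ESP. Returns non-zero if it is missing.
fallback_loader() {
    for path in "$1"/[Ee][Ff][Ii]/[Bb][Oo][Oo][Tt]/[Bb][Oo][Oo][Tt][Xx]64.[Ee][Ff][Ii]; do
        if [ -f "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Usage: fallback_loader /mnt/esp || echo "no fallback loader found"
```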

Final Steps

That let me get to a GRUB menu.

I chose Ubuntu.

It dropped me into recovery mode: an ash shell inside the "initramfs recovery" environment. This is back into the realm of things I don't really know much about.

It didn't see the SSD.
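From a recovery shell like that, the kernel's own list of block devices is in /proc/partitions. The sketch below assumes the shell has awk and grep available (BusyBox builds usually do, but I did not check on this machine). The function names are made up.

```shell
# Print the device names listed in a /proc/partitions style file.
# The header row and the blank line after it are skipped.
known_block_devices() {
    awk 'NF == 4 && $1 != "major" { print $4 }' "${1:-/proc/partitions}"
}

# Succeed when at least one listed device name starts with "nvme".
sees_nvme() {
    known_block_devices "$@" | grep -q '^nvme'
}

# Usage: sees_nvme || echo "kernel sees no NVMe disk"
```

If no NVMe name shows up here, the problem is below Linux, in the firmware's storage mode, and not in GRUB or the install itself.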

I went back into the BIOS and switched it back to AHCI mode.

And everything has worked flawlessly ever since.

Conclusion

I've always been extremely happy with Dell, before this.

Will I ever buy another Dell product after this experience? I really don't know. It took me weeks to figure out the magical incantation that worked.

Apparently a lot of this pain was caused by pressure from Microsoft. Dell caving in to that pressure doesn't make them look any better. And, really, their tech support people should know far more about these details than I do now.

Particularly those little details about things like the way the flea power is configured, the way UEFI works, and how to switch the disk out of RST mode.

We're experiencing a world-wide chip shortage. The pandemic broke the supply chain. Almost all electronics get manufactured in China, which is still locking down pretty hard. The Suez Canal got blocked for weeks. Ports are working 24/7 to unload the arriving ships. We're experiencing grocery shortages again (baby formula is a major problem: one manufacturer recalled a bunch due to possible bacterial infection. The FDA told them to send it back, because starvation is more dangerous).

Dell has a much bigger buffer than most computer manufacturers. This computer was basically half the price I would have paid for a similar system from a company that actually does support Linux. That's really messed up: you have to pay a lot more for something that's free. And even then, I'm not sure how long I would have been forced to wait for a decent graphics card. (Thanks to crypto-currency, there's a chronic shortage of those).

I just don't know. My Dell laptop from 2013 or 2014 installed Linux without a problem. I dual-booted Windows 10 on there for a while, until I got fed up with it forcing me to reboot so often. It still works great.

Linux works great on this desktop. I've been running it for about 3 months now without any problems.

But that initial installation was so painful. This was much worse than the first time I installed Slackware.

Or even Gentoo.

I really miss running Gentoo, but I just don't have the time it takes.
