Unable to boot after update - flashing cursor only

I just did a sudo swupd update to get the latest version of the OS - 32610. I believe I was on 32580 prior to that. Just rebooted and now the machine just sits at a flashing cursor in the upper left hand corner of the screen. I am unable to type any text.

I have a LUKS-encrypted partition for CL that includes my home directory as well. Running on an ASUS ZenBook UX430U with integrated Intel graphics, no other graphics card.

Normally my machine boots straight to the CL partition and immediately asks me for the LUKS password at a prompt before proceeding.

Windows 10 is installed on a separate partition, which I boot using only the BIOS boot menu; there is no GRUB or any other boot manager involved. Windows still boots fine.

Just wondering how I begin going about troubleshooting.

Update:
I just used a live USB of Clear Linux. After booting it I was able to open GNOME Disks, unlock my LUKS partition with my password and mount it. All of my files are present. I was also able to mount the EFI partition and edit loader.conf. I added timeout 10 to show the boot menu.
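For anyone following along, the loader.conf edit could be sketched roughly as below. This assumes the EFI partition is mounted at /mnt/boot (a hypothetical mount point) and that the file lives at loader/loader.conf, as is usual for systemd-boot. set_timeout is an illustrative helper, not part of any tool.

```shell
#!/bin/sh
# Illustrative helper (not part of any tool): set the boot menu timeout
# in a systemd-boot style loader.conf.
set_timeout() {
    conf="$1"
    secs="$2"
    tmp="$conf.tmp"
    # Keep every line except an existing "timeout" setting, if the file exists.
    if [ -f "$conf" ]; then
        grep -v '^timeout[[:space:]]' "$conf" > "$tmp" || true
    else
        : > "$tmp"
    fi
    # Append the new timeout so the boot menu is shown for $secs seconds.
    echo "timeout $secs" >> "$tmp"
    mv "$tmp" "$conf"
}

# Assumed path: EFI partition mounted at /mnt/boot. Uncomment to apply.
# set_timeout /mnt/boot/loader/loader.conf 10
```

Replacing any existing timeout line rather than blindly appending avoids ending up with two conflicting settings in the file.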

I then tried booting both kernel entries with the same result as before.
I then tried pressing ‘e’ on one of the entries and edited the options line, deleting all options except for the uuids to mount. The boot process proceeds and waits for 90s to mount the uuids listed. This fails after that time period. I took a photo of the screen with my phone and have attached it.

Any help would be appreciated.

Can confirm this issue after updating my X260 today… just updated it as I wanted to use it for a video chat later this evening and it’s my only device with a webcam… guess I’ll have to find a backup solution now.

I have been researching this a bit. I am not adept at the command line, but I am wondering if there is a way to boot from a live USB, unlock the LUKS partition and mount it, mount the boot partition, and then use chroot to run from those partitions. From there I could use swupd to downgrade to a previous release. I am just theorizing and don’t have the expertise yet to do this. I am at work now and will research this a bit later.

As a temporary solution, you could boot from a live USB of any working distro and do the video chat from there.

If you boot from the live image and mount your real rootfs at /mnt and your real boot partition at /mnt/boot, then running swupd repair -m $OLD_VERSION --force --path=/mnt should take it back to the version specified.
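Spelled out, those steps might look like the sketch below. The device names /dev/sdX1 and /dev/sdX2 and the mapper name clr_root are placeholders (check the real ones with lsblk), and OLD_VERSION defaults to 32580 only because that is the version reported as working earlier in this thread. The run and rollback helpers are illustrative, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Placeholder device names: check yours with lsblk before running anything.
LUKS_DEV="${LUKS_DEV:-/dev/sdX2}"     # assumption: the encrypted root partition
EFI_DEV="${EFI_DEV:-/dev/sdX1}"       # assumption: the EFI/boot partition
OLD_VERSION="${OLD_VERSION:-32580}"   # version reported as working in this thread

# Print each command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

rollback() {
    # Unlock the LUKS container (prompts for the passphrase when executed).
    run cryptsetup open "$LUKS_DEV" clr_root
    # Mount the real rootfs and the real boot partition under /mnt.
    run mount /dev/mapper/clr_root /mnt
    run mount "$EFI_DEV" /mnt/boot
    # The repair command quoted above, pointed at the mounted system.
    run swupd repair -m "$OLD_VERSION" --force --path=/mnt
}

rollback
```

With the defaults this is a dry run that just lists the four commands. To actually perform the rollback from the live image, it would be run as root with DRY_RUN=0 and the real device names set.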

Just to confirm, are you using LUKS as well?

yes, single boot in my case

Was it the native and lts kernels that you tried? Ping @miguelinux

Thank you very much for this. I will try this in a few hours once I am done work.

thanks, I will give that a try!

I think I only had the native kernels installed, not lts.

same here. all stock, no special / lts kernels
probably been at least a month since I last booted the device up…

editing the post since I can’t post any more replies…

I think the kernel in 32580 was the same as the older kernel option I tried unsuccessfully from the boot menu in this release (32610). The kernel in 32580 was fine.

Good to know, thanks.

Okay, looks like we found the problem. Hopefully the next release will have a fix but we are still working on it.

That’s great to hear! I will roll back tonight and disable automatic updates. Any idea which version was the last one not to be affected? Just wondering what to roll back to. I had skipped several versions when updating.

I’ve tried several kernels and the issue is there with all of them; it is not specific to the -native kernel.

Wooha…fsck!!!

This is yet another example of a rolling release ending in disaster, with no self-resolving recovery mechanism in place for mission-critical systems. Has nobody considered it important enough to develop a solution for such disasters? IMO I’d have implemented a solution before deploying a rolling release strategy; it seems no one else in the Linux community has bothered either.

Ideally, under no circumstances should a machine need to be rebooted for software updates to be applied. A solution must be found to ensure this isn’t required so that production isn’t interrupted by a reboot.

Under no circumstances should a machine, if rebooted, fall into a non-bootable state like this, ever. This is unacceptable in any production environment, and we should be asking whether the typical rolling release solutions we have implemented are suitable for a production environment. If yes, then solutions need to be in place so that this failed-to-boot situation never happens.

Why can’t the system put itself into a pseudo-sandboxed in-memory disk image, test whether it runs, and then apply service restarts to the live system, all without a reboot?

IDK the answers, but maybe swupd could record the current system state before updating and put that data in a database, along with some automated instructions on how to reinstate it.

If the machine fails to boot after the update, it should automatically drop into emergency mode, where a script consults the swupd state database mentioned above, restores the previous known bootable state, and finally reboots the machine. After logging in, the user would then see an alert at the SSH command prompt:

WARNING, swupd release # failed, restored previous release# configuration, we’ve automatically sent a failed update/reboot log to developers.

It shouldn’t be dumped in the user’s lap, requiring them to manually issue swupd repair commands.

Given the circumstances of this thread, one question is: when the machine fails to boot and is dumped into this emergency mode, is the network stack available so that we can SSH into the machine?

If not, how is one expected to recover a remote bare-metal Clear Linux machine without physical access (no ability to use a USB live boot) and with no ability to SSH into the machine to run the swupd repair command? How would we even see, remotely, that the machine is stuck in this emergency mode? (Maybe this is possible on some machines with IPMI, but it would take hours to set up and resolve before the machine is back online.)

This situation is exactly why I’m shit scared to update, let alone have swupd on auto, and it’s not an isolated case. A colleague’s openSUSE Tumbleweed machine auto-updated itself the other week and it too failed to boot because of some GRUB boot loader issue. This is not cool.

As part of our release process, we do update systems and validate that they boot with the new version. LUKS encryption of the rootfs is currently not part of that test scope (though we are looking to add it). For issues involving reboot, a kernel update is the single most likely point of failure (especially considering it can be related to a particular user’s hardware). For that case we do keep a last working kernel as a fallback option.