May 16, 2023  Source: Xinhua News Agency Weibo (新华社微博)

Symptoms:

A machine has two data disks mounted (/dev/sdb and /dev/sdc):

[root@localhost ~]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                    189M     0  189M   0% /dev
tmpfs                       203M     0  203M   0% /dev/shm
tmpfs                       203M  628K  203M   1% /run
tmpfs                       203M     0  203M   0% /sys/fs/cgroup
/dev/mapper/openeuler-root   17G  8.8G  7.0G  56% /
tmpfs                       203M  4.0K  203M   1% /tmp
/dev/sda1                   976M  120M  790M  14% /boot
tmpfs                        41M     0   41M   0% /run/user/989
tmpfs                        41M     0   41M   0% /run/user/0
/dev/sdb                    9.8G   37M  9.3G   1% /test
/dev/sdc                    9.8G   37M  9.3G   1% /test1
[root@localhost test1]# ll /test
total 24K
drwx------ 1 root root    4 Dec  5 17:13 nginx
drwx------ 2 root root 4.0K Dec  5 17:13 data
drwx------ 2 root root  16K Dec  5 16:35 lost+found
[root@localhost test1]# ll /test1
total 24K
drwx------ 1 root root    4 Dec  5 17:13 tomcat
drwx------ 2 root root 4.0K Dec  5 17:13 data2
drwx------ 2 root root  16K Dec  5 16:36 lost+found
[root@localhost ~]# tail -2 /etc/fstab
/dev/sdb /test ext4     defaults 0 0
/dev/sdc /test1 ext4    defaults 0 0

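Because sdb had never been used and all the data lived on sdc, the sdb disk was removed. After the next reboot, the machine took more than a minute before it accepted connections. /var/log/messages shows a timeout: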

[root@localhost ~]# cat /var/log/messages |grep -C30 'Timed'
Dec  5 17:19:59 localhost kernel: [   56.571447] hv_balloon: Max. dynamic memory size: 1024 MB
Dec  5 17:20:40 localhost systemd[1]: dev-sdc.device: Job dev-sdc.device/start timed out.
Dec  5 17:20:40 localhost systemd[1]: Timed out waiting for device /dev/sdc.
Dec  5 17:20:40 localhost systemd[1]: Dependency failed for /test1.
Dec  5 17:20:40 localhost systemd[1]: Dependency failed for Local File Systems.
Dec  5 17:20:40 localhost systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
Dec  5 17:20:40 localhost systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
Dec  5 17:20:40 localhost systemd[1]: local-fs.target: Failed to enqueue OnFailure= job, ignoring: Unit emergency.service has a bad unit file setting.
Dec  5 17:20:40 localhost systemd[1]: test1.mount: Job test1.mount/start failed with result 'dependency'.
Dec  5 17:20:40 localhost systemd[1]: dev-sdc.device: Job dev-sdc.device/start failed with result 'timeout'.
Dec  5 17:20:40 localhost systemd[1]: Starting Restore /run/initramfs on shutdown...
Dec  5 17:20:40 localhost systemd[1]: Condition check resulted in Rebuild Dynamic Linker Cache being skipped.
Dec  5 17:20:40 localhost systemd[1]: Condition check resulted in Store a System Token in an EFI Variable being skipped.
Dec  5 17:20:40 localhost systemd[1]: Condition check resulted in Rebuild Journal Catalog being skipped.
Dec  5 17:20:40 localhost systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.
Dec  5 17:20:40 localhost systemd[1]: Starting Create Volatile Files and Directories...
Dec  5 17:20:40 localhost systemd[1]: Condition check resulted in Update is Completed being skipped.

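Possible cause:

Removing the sdb disk was the only recent change, so the removal is the likely cause.

Troubleshooting:

Check whether the disk data and device names look normal: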

[root@localhost ~]# df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                    189M     0  189M   0% /dev
tmpfs                       203M     0  203M   0% /dev/shm
tmpfs                       203M  632K  203M   1% /run
tmpfs                       203M     0  203M   0% /sys/fs/cgroup
/dev/mapper/openeuler-root   17G  8.9G  7.0G  56% /
tmpfs                       203M  4.0K  203M   1% /tmp
/dev/sdb                    9.8G   37M  9.3G   1% /test
/dev/sda1                   976M  120M  790M  14% /boot
tmpfs                        41M     0   41M   0% /run/user/989
tmpfs                        41M     0   41M   0% /run/user/0
[root@localhost ~]# ll /test
total 24K
drwx------ 1 root root    4 Dec  5 17:13 tomcat
drwx------ 2 root root 4.0K Dec  5 17:13 data2
drwx------ 2 root root  16K Dec  5 16:36 lost+found

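The data is intact, but the device name and mount point have changed: the disk that used to be /dev/sdc is now /dev/sdb, and it is mounted on /test instead of /test1.

Check how the boot-time mounts are configured: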

[root@localhost ~]# tail -2 /etc/fstab
/dev/sdb /test ext4    defaults 0 0
/dev/sdc /test1 ext4    defaults 0 0

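The fstab entries refer to the disks by kernel device name, not by UUID. After the unused data disk sdb was removed, the remaining disk (formerly sdc) was enumerated as sdb on the next boot and was mounted on /test by the first entry. The second entry still points at /dev/sdc, which no longer exists, so systemd waited for that device until the job timed out, and that wait is what delayed the boot.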
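
To spot entries that are exposed to this kind of renaming, a small helper along the following lines could list every fstab line whose first field is a kernel-assigned name. This is only a sketch: the function name `list_name_based_entries` is made up for illustration, and the pattern assumes the common /dev/sdX, /dev/vdX and /dev/nvme... naming schemes.

```shell
# Hypothetical helper: list fstab entries that use a kernel-assigned device name.
# Usage: list_name_based_entries [fstab-path]   (defaults to /etc/fstab)
list_name_based_entries() {
    fstab="${1:-/etc/fstab}"
    # Skip comment lines; print the device and mount point of entries whose
    # first field looks like /dev/sdX, /dev/vdX or /dev/nvme...
    # (assumed naming schemes; device-mapper paths are deliberately ignored).
    awk '!/^[[:space:]]*#/ && $1 ~ /^\/dev\/(sd|vd|nvme)/ { print $1, $2 }' "$fstab"
}
```

Run against the fstab shown above, `list_name_based_entries /etc/fstab` would print the /dev/sdb and /dev/sdc entries, which are the two lines worth converting to UUIDs.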

Solution:

Look up the disk's UUID and switch the fstab entry to mount by UUID:

[root@localhost ~]# blkid
/dev/mapper/openeuler-root: UUID="e45aaa82-334c-4460-a96d-e0f229e1019a" BLOCK_SIZE="4096" TYPE="ext4"
/dev/sda2: UUID="VIRBUS-atLc-4Rd6-rsVk-0ZAO-1Gi3-zaCqgZ" TYPE="LVM2_member" PARTUUID="d944d5f4-02"
/dev/mapper/openeuler-swap: UUID="3a2e33ca-eda1-473d-aa59-eed4e6e0d533" TYPE="swap"
/dev/sdb: UUID="d7770de4-6932-413a-b3ab-5b4e0174dd59" BLOCK_SIZE="4096" TYPE="ext4"
/dev/sda1: UUID="541688f6-935c-4ffa-b86b-2c062cbc6c08" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="d944d5f4-01"
[root@localhost ~]# sed -i 's#/dev/sdb#UUID=d7770de4-6932-413a-b3ab-5b4e0174dd59#g' /etc/fstab 
[root@localhost ~]# tail -2 /etc/fstab
UUID=d7770de4-6932-413a-b3ab-5b4e0174dd59 /test ext4    defaults 0 0
/dev/sdc /test1 ext4    defaults 0 0
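
Then delete the /dev/sdc line, because that disk is now named /dev/sdb. Note that the UUID entry mounts the disk on /test; change the mount point to /test1 if the data should stay at its original path.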
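
The sed command above replaces every occurrence of /dev/sdb in the file, which would also rewrite a partition such as /dev/sdb1 if one were listed. A slightly more defensive variant, sketched below, only touches the first field and keeps a backup. The function name `fstab_use_uuid` and the `.bak` suffix are illustrative choices, and `sed -i` assumes GNU sed.

```shell
# Hypothetical helper: replace a device name in the first fstab field with UUID=<uuid>.
# Usage: fstab_use_uuid <fstab-path> <device> <uuid>
fstab_use_uuid() {
    file="$1"; dev="$2"; uuid="$3"
    # Keep a copy so the change can be rolled back (suffix is an assumption).
    cp -p "$file" "$file.bak"
    # Match the device only at the start of a line and only when followed by
    # whitespace, so /dev/sdb does not also match /dev/sdb1.
    sed -i "s#^${dev}\([[:space:]]\)#UUID=${uuid}\1#" "$file"
}
```

On the machine described here it might be called as `fstab_use_uuid /etc/fstab /dev/sdb "$(blkid -s UUID -o value /dev/sdb)"`, which lets blkid supply the UUID instead of copying it by hand.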

Reboot the system.

The timeout no longer appears. Fixed!
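
To catch a stale entry before a reboot rather than after it, a check along these lines could confirm that every UUID referenced in fstab is actually present. It is a sketch under a few assumptions: the function name `check_fstab_uuids` is hypothetical, UUIDs are written unquoted as `UUID=<value>`, and udev populates /dev/disk/by-uuid on the system.

```shell
# Hypothetical helper: report fstab entries whose UUID has no matching
# link under /dev/disk/by-uuid.
# Usage: check_fstab_uuids [fstab-path] [by-uuid-dir]
check_fstab_uuids() {
    fstab="${1:-/etc/fstab}"
    dir="${2:-/dev/disk/by-uuid}"
    status=0
    # Collect the first field of every non-comment line that starts with UUID=.
    for spec in $(awk '!/^[[:space:]]*#/ && $1 ~ /^UUID=/ { print $1 }' "$fstab"); do
        uuid="${spec#UUID=}"
        if [ ! -e "$dir/$uuid" ]; then
            echo "missing: $spec"
            status=1
        fi
    done
    # Non-zero exit status means at least one UUID was not found.
    return "$status"
}
```

Running `check_fstab_uuids` after editing /etc/fstab prints nothing and returns 0 when all UUIDs resolve. Where the installed util-linux provides it, `findmnt --verify` is a broader check of the same file.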