最近,我的磁盘空间不太够用了。
anduin@ms-server:~$ cd /swarm-vol/
anduin@ms-server:/swarm-vol$ df . -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/nvme2n1 ext4 7.0T 6.1T 559G 92% /swarm-vol
anduin@ms-server:/swarm-vol$ cd /swarm-vol/nextcloud/
anduin@ms-server:/swarm-vol/nextcloud$ df . -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/nvme0n1 ext4 916G 554G 316G 64% /swarm-vol/nextcloud
anduin@ms-server:/swarm-vol/nextcloud$ sudo fdisk -l
Disk /dev/nvme1n1: 447.13 GiB, 480103981056 bytes, 937703088 sectors
Disk model: INTEL SSDPED1D480GA
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 75C97A6C-09A4-4375-8260-7A950D36C1B4
Device Start End Sectors Size Type
/dev/nvme1n1p1 2048 1050623 1048576 512M EFI System
/dev/nvme1n1p2 1050624 937701375 936650752 446.6G Linux filesystem
Disk /dev/nvme2n1: 6.99 TiB, 7681501126656 bytes, 1875366486 sectors
Disk model: WUS4BB076D7P3E3
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk /dev/nvme0n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P3PSSD8
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
anduin@ms-server:/swarm-vol/nextcloud$ cd /dev/disk/by-uuid/
anduin@ms-server:/dev/disk/by-uuid$ ls -ashl
total 0
0 drwxr-xr-x 2 root root 140 Jan 17 15:21 .
0 drwxr-xr-x 7 root root 140 Dec 28 05:45 ..
0 lrwxrwxrwx 1 root root 13 Jan 14 14:00 0377361e-2a7b-4024-a681-ea135c092cce -> ../../nvme0n1
0 lrwxrwxrwx 1 root root 13 Dec 28 05:45 49fd5e45-6074-4370-a95f-c4404920aff5 -> ../../nvme2n1
0 lrwxrwxrwx 1 root root 15 Dec 28 05:45 9C58-514E -> ../../nvme1n1p1
0 lrwxrwxrwx 1 root root 15 Dec 28 05:45 b91352af-9477-4684-8d08-2a45c39bec98 -> ../../nvme1n1p2
anduin@ms-server:/dev/disk/by-uuid$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=b91352af-9477-4684-8d08-2a45c39bec98 / ext4 errors=remount-ro 0 1
UUID=9C58-514E /boot/efi vfat umask=0077 0 1
/dev/disk/by-uuid/49fd5e45-6074-4370-a95f-c4404920aff5 /swarm-vol ext4 defaults,noatime,nofail 0 0
/dev/disk/by-uuid/0377361e-2a7b-4024-a681-ea135c092cce /swarm-vol/nextcloud ext4 defaults,noatime,nofail 0 0
/swapfile none swap sw 0 0
由上面的信息,不难判断出:
我的系统盘是 b91352af-9477-4684-8d08-2a45c39bec98,当然这和我们要调查的内容没什么关系。
我的数据都放在了 /swarm-vol 这个目录,它背后的磁盘是 49fd5e45-6074-4370-a95f-c4404920aff5(即 nvme2n1)。
即使我用了点奇技淫巧,把 /swarm-vol 下的子目录 nextcloud 暂时挪到了 0377361e-2a7b-4024-a681-ea135c092cce(nvme0n1)上,空间还是濒临耗尽。
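顺便,为了把 UUID、设备名和挂载点对应起来,也可以用下面的命令快速核对(仅作示意):
# 一次性查看 设备 / 文件系统 / UUID / 挂载点 的对应关系
lsblk -o NAME,FSTYPE,SIZE,UUID,MOUNTPOINT
# 或者只看某个挂载点背后的设备
findmnt /swarm-vol
findmnt /swarm-vol/nextcloud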
但是,幸运的是,我购买了一个全新的大而慢的机械硬盘:
Disk /dev/sda: 58.21 TiB, 64003468427264 bytes, 125006774272 sectors
Disk model: RAID5
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
为了测试它,我暂时挂载到了这里:
/dev/sda /mnt/temp_big ext4 defaults,noatime,nofail 0 0
接下来,我认为我需要开始设计我的迁移改造计划。
为了既能发挥我过去的 49fd5e45-6074-4370-a95f-c4404920aff5(也就是 nvme2n1,也就是 /swarm-vol 背后那块盘)快的固态特性,又能发挥 /dev/sda 大的优点,我计划这样设计:
使用 bcache,让 /dev/sda 作为真正的存储设备(后端),再让 49fd5e45-6074-4370-a95f-c4404920aff5 作为缓存盘,同时开启写缓存和读缓存,这样我就拥有又大又快的存储了。
考虑到我的缓存盘非常大(从上面的信息可以得出,它足足有 6.99 TiB 对吧?),我相信我可以设置非常激进的写缓存和读缓存。而且我的缓存盘非常可靠,几乎不会损坏,我也不担心短暂的数据丢失。我又不是银行,存的也都是电影之类的东西。
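按这个设计,迁移完成后的块设备层级大致会是下面这样(纯属示意,尺寸与名称以实机为准):
# 预期 lsblk 会显示 sda 与 nvme2n1 都挂在同一个 bcache0 之下(示意):
# sda          58.2T
# └─bcache0    58.2T  /swarm-vol   <- 数据真正落在 sda 上
# nvme2n1       7.0T
# └─bcache0    58.2T  /swarm-vol   <- nvme2n1 作为缓存附加在同一个 bcache 设备上
lsblk -o NAME,SIZE,MOUNTPOINT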
接下来,为了方便迁移,我开始设计我的迁移计划:
阶段概要
第一阶段 - 双数据阶段
将 sda 格式化清空,作为 bcache 的后端。此时 nvme2n1 继续承载业务数据,不移除它。然后使用 rsync 将业务数据拷贝到 sda 中。
第二阶段 - 暂停业务阶段
将业务暂停,然后我最后运行一次 rsync。这次 rsync 应该会跑得很快,因为只需同步增量差异。此时此刻,nvme2n1(ext4)的数据,和 sda(bcache 的后端)的数据就完全相同了。
第三阶段 - 重构存储阶段
将 nvme2n1 格式化。然后让它作为 bcache 的缓存端。再将得到的 bcache 虚拟盘,挂载到 /swarm-vol,实现业务无感。然后重启业务。
注意:我没有任何额外的新空间可以用于备份!所以我的命令必须一次成功!一旦失败我们将万劫不复!
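鉴于没有任何备份空间,我会在动手前先做一轮只读的预检,确认没有认错盘(以下仅作示意):
# 核对每块盘的型号和大小,确认 /dev/sda 确实是那块新的机械阵列
lsblk -o NAME,MODEL,SIZE,FSTYPE,MOUNTPOINT
# 之前测试时 sda 临时挂在 /mnt/temp_big,清盘前先卸载,并记得删掉 /etc/fstab 里那行临时挂载
sudo umount /mnt/temp_big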
第一阶段
接下来,我要开始第一阶段的迁移了。我第一阶段计划这么做:
目标
- 使用 make-bcache 将 /dev/sda 建立为 bcache 的后端(backing device)。
- 先不动现有 /dev/nvme2n1(现挂载于 /swarm-vol)上的业务数据,让业务继续运行。
- 在格式化出的 /dev/bcache0 上创建一个文件系统(例如 ext4),然后将现有数据从 /swarm-vol 同步到这个新地方。
- 这是“第一阶段”,意在让 /dev/sda 上也有一份业务数据拷贝,从而腾出后续的操作空间。
结果
- 最终会拥有两份数据:
- 原始:/swarm-vol(在 /dev/nvme2n1 上)
- 新的:/mnt/bcache(对应 /dev/bcache0,后端实际上是 /dev/sda)
- 业务不中断
我可以让服务继续使用 /swarm-vol,只要我在第一阶段只做数据拷贝、而不改动 /swarm-vol 自身。 在第一阶段结束后,等我准备好,可以进入“第二阶段”短暂停机做增量 rsync 以及最终切换。
# 安装 bcache-tools
sudo apt install bcache-tools
# 仅示例,注意操作前先确认 /dev/sda 确实空置
# (在 fdisk 交互式命令中,删除旧分区、新建分区)
sudo fdisk /dev/sda
# 使用 wipefs 清除 sda 上的所有签名
sudo wipefs -a /dev/sda
# 创建 bcache 后端
sudo make-bcache -B /dev/sda
# 如果在 fdisk 里没有找到 /dev/bcache0,可以尝试
# 重新加载内核模块:
sudo modprobe bcache
# 如果还是没有,尝试手工注册设备
# (注意:sudo echo ... > 的写法中,重定向不会被提权,这里改用 tee)
echo /dev/sda | sudo tee /sys/fs/bcache/register
# 确认后端创建成功
# UUID: d5a45ab0-60b2-4f3a-8cf1-4d4ca97c018c
# Set UUID: 01442457-240d-4bf4-8140-b7a647659beb
# version: 1
# block_size: 1
# data_offset: 16
# 确认 /dev/bcache0 已经出现
ls -ashl /dev/bcache0
# 格式化后端
sudo mkfs.ext4 /dev/bcache0
# 创建挂载点
sudo mkdir /mnt/bcache
# 挂载 bcache 后端
sudo mount /dev/bcache0 /mnt/bcache
# 确认挂载成功
cd /mnt/bcache
# 确认挂载成功
df . -Th
# (确认挂载成功后,开始 rsync)
sudo rsync -Aavx --update --delete /swarm-vol/ /mnt/bcache/
# 单独同步 nextcloud 目录(它是独立的挂载点,上面 rsync 的 -x 不会跨越挂载点)
sudo rsync -Aavx --update --delete /swarm-vol/nextcloud/ /mnt/bcache/nextcloud/
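第一次全量 rsync 之后,可以先粗略核对两边的数据量级,再用 --dry-run 看看剩余差异(示意;业务仍在写入,存在少量差异是正常的):
# 对比两边的占用空间,数量级应当接近
df -Th /swarm-vol /swarm-vol/nextcloud /mnt/bcache
# 用 --dry-run(-n)预演一次增量同步,只看会发生什么,不实际写入
sudo rsync -Aavxn --update --delete /swarm-vol/ /mnt/bcache/ | tail -n 20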
第二阶段 - 暂停业务并做最终同步
在这一阶段,我将:
- 暂停业务,使其不再写入 /swarm-vol(也就是旧的 nvme2n1)。
- 做最后一次增量 rsync,保证 /dev/bcache0(后端 sda)上的数据与旧数据完全一致。
- 卸载旧的 /swarm-vol,改为将 /dev/bcache0 挂载到 /swarm-vol,这样就完成了切换。
示例脚本(在生产环境中,请根据自己实际服务的暂停方式作相应调整):
# 1) 暂停业务
echo "停止相关业务/服务 (示例:docker-compose 或 systemctl stop 等)"
docker-compose down
sudo reboot # 重启服务器,确保业务不再写入
# 2) 做最后一次增量同步
sudo rsync -Aavx --update --delete /swarm-vol/ /mnt/bcache/
sudo rsync -Aavx --update --delete /swarm-vol/nextcloud/ /mnt/bcache/nextcloud/
# 3) 切换挂载点
sudo umount /swarm-vol
echo "将 bcache0 挂载为新的 /swarm-vol..."
sudo mount /dev/bcache0 /swarm-vol
echo "检查挂载..."
df -Th /swarm-vol
echo "请人工确认 /swarm-vol 中的数据完整性;若无误,可以继续。"
在执行完成后,/swarm-vol 已经切换到基于 /dev/bcache0(后端是 /dev/sda)的存储,业务就可以使用这套新存储。此时 nvme2n1 上的原有 ext4 数据已不再对外提供服务,但仍在物理上保留(尚未被清空)。
第三阶段 - 将原 nvme2n1 作为 bcache 缓存设备
在这一阶段,我将:
- 确认 /swarm-vol 已经切换成功、业务运行正常且数据安全无误。
- 清空并格式化原本的 nvme2n1,作为 bcache 缓存盘。
- 将缓存盘附加到已经存在的 bcache 后端(即 /dev/sda)上,使两者变为真正的“大容量 + SSD 缓存”组合。
- 根据需求,启用写回缓存(writeback)等激进模式。
示例脚本:
# 1) 确认当前 /swarm-vol 已经是 /dev/bcache0,且业务正常
# (需人工自行验证,确认数据已在 /dev/sda + /dev/bcache0 上)
# 此时可以停一下业务,或保持低负载也行,避免写入影响。
# 2) 清空 nvme2n1 (原来的 /swarm-vol) 注意,这将销毁原数据!
echo "准备清空 /dev/nvme2n1..."
sudo umount /dev/nvme2n1 || true # 若已经卸载则会报错,忽略即可
sudo wipefs -a /dev/nvme2n1
# 3) 将 nvme2n1 作为缓存盘初始化
echo "对 /dev/nvme2n1 执行 make-bcache -C(cache)..."
# 在这个例子里,默认的 block 大小是 512B、bucket 大小是 128kB。
# block 的大小应该与后端设备的 sector 大小匹配(通常是 512 或者 4k);
# bucket 的大小应该与缓存设备的擦除块大小匹配(以减少写入放大)。
# 例如,如果是一个 4k sector 的 HDD 和一个擦除块大小是 2MB 的 SSD 搭配,命令就应该是这样的:
# sudo make-bcache --block 4k --bucket 2M -C /dev/nvme2n1
# 如果你需要查看 /dev/sda (也就是后端)的 block size,可以使用 fdisk -l /dev/sda 等命令。
# 如果你需要查看 /dev/nvme2n1 的擦除块大小,可以使用 nvme id-ns /dev/nvme2n1 等命令。一般是 4M
sudo make-bcache --block 512 --bucket 4M -C /dev/nvme2n1
echo "检查生成的缓存盘信息..."
sudo bcache-super-show /dev/nvme2n1 | grep -E "cset.uuid|dev.uuid"
# 假设输出中 cset.uuid (或 dev.uuid) 为 11111111-2222-3333-4444-555555555555
# (这里仅演示,我需要看实际输出)
CACHE_UUID="(此处填上实际的 cset.uuid)"
# 4) 将缓存设备附加到现有的 /dev/bcache0(后端 /dev/sda)
# /dev/bcache0 的 sysfs 路径可通过 ls /sys/block/bcache0/bcache 等命令确认
echo "附加缓存到现有 bcache 后端..."
echo "$CACHE_UUID" | sudo tee /sys/block/bcache0/bcache/attach
# 如果我看到 echo: write error: Invalid argument,通常是 block size 不匹配等问题
# 如果成功,则 /sys/block/bcache0/bcache/cache_mode 等节点应该出现
# 5) 为 bcache0 启用写回缓存模式(可选)
echo "启用写回 (writeback) 缓存模式..."
echo writeback | sudo tee /sys/block/bcache0/bcache/cache_mode
# 可选:关闭顺序IO绕过等更激进的做法
# echo 0 | sudo tee /sys/block/bcache0/bcache/sequential_cutoff
# echo 0 | sudo tee /sys/block/bcache0/bcache/writeback_percent
# 6) 确认缓存已生效
echo "确认 /dev/bcache0 依旧正常挂载在 /swarm-vol,并检查 sysfs 等信息:"
mount | grep /swarm-vol
ls -l /sys/block/bcache0/bcache
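顺带一提:cache_mode 会写进 bcache 的超级块,重启后依然生效;但 sequential_cutoff、writeback_percent 这类 sysfs 参数重启后会恢复默认值。如果希望持久化,一种做法是加一个开机执行的小单元(下面只是示意,bcache-tune.service 这个名字是我自拟的):
# /etc/systemd/system/bcache-tune.service(示意)
# [Unit]
# Description=Apply bcache sysfs tuning
# After=local-fs.target
#
# [Service]
# Type=oneshot
# ExecStart=/bin/sh -c 'echo 0 > /sys/block/bcache0/bcache/sequential_cutoff'
#
# [Install]
# WantedBy=multi-user.target
#
# 然后启用它:
# sudo systemctl daemon-reload
# sudo systemctl enable bcache-tune.service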
至此,我已经完成了将旧的 nvme2n1 转变为 bcache 缓存设备的操作,并和 /dev/sda 组合为统一的逻辑设备 /dev/bcache0。接下来的要点包括:
- 开机自动挂载
  - 通常推荐在 /etc/fstab 中写入对 /dev/bcache0 的挂载。
  - 同时需要注意在 initramfs 阶段加载 bcache 模块,或者确保 bcache-tools 的 udev 规则可以自动将 cache attach 到 backing device(以免重启后没了 /dev/bcache0)。在 Ubuntu 下,一般可通过 sudo update-initramfs -u 并检查 /lib/udev/rules.d/69-bcache.rules 等来确认,可参考下面的检查命令。
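下面是一组用来确认模块与 udev 规则就绪的命令(仅作示意;其中写入 /etc/modules 是我补充的常见做法,并非上文脚本的一部分):
# 确认 bcache 模块已经加载
lsmod | grep bcache
# 确认 udev 规则存在(bcache-tools 自带)
ls -l /lib/udev/rules.d/69-bcache.rules
# 让 bcache 模块开机自动加载,并重建 initramfs
echo bcache | sudo tee -a /etc/modules
sudo update-initramfs -u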
在 /etc/fstab 中添加:
# 删除旧的 /swarm-vol 挂载
# /dev/disk/by-uuid/49fd5e45-6074-4370-a95f-c4404920aff5 /swarm-vol ext4 defaults,noatime,nofail 0 0
# 然后添加新的 /swarm-vol 挂载
/dev/bcache0 /swarm-vol ext4 defaults,noatime,nofail 0 0
- 确认写回模式的风险
  - 写回模式(writeback)可以大幅提高速度,但在缓存盘掉电或故障时会丢失尚未写入后端的脏数据。既然我提到 SSD 质量较好,且并不特别在意短期丢失风险,可以大胆使用。
- 调优与监控
  - 适当调节 writeback_percent、sequential_cutoff 等 sysfs 参数可以获得性能与风险的平衡。
  - 还可以用 dstat -D nvme2n1,sda 或者 iostat -xm 1 来观察实际读写流量和缓存命中情况(也可以直接读 sysfs 统计,见下面的示例)。
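直接读 sysfs 统计的小例子(这些节点在后文的实际输出里也能看到):
# 直接读取 bcache 的状态与统计
cat /sys/block/bcache0/bcache/state                            # clean / dirty 等
cat /sys/block/bcache0/bcache/dirty_data                       # 尚未刷回后端的脏数据量
cat /sys/block/bcache0/bcache/cache/stats_day/cache_hit_ratio  # 最近一天的缓存命中率(百分比)
# 持续观察脏数据回刷进度
watch -n 5 cat /sys/block/bcache0/bcache/dirty_data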
完成后,我就拥有一个**后端极大(/dev/sda)+ 前端极快(/dev/nvme2n1 作为缓存)**的综合存储系统,挂载于 /swarm-vol。这样就达到了我预想的“又大又快”的目的。
使用下面的命令检查其状态:
anduin@ms-server:/sys/block/bcache0/bcache$ ls
attach dirty_data sequential_cutoff stripe_size writeback_rate_fp_term_low
backing_dev_name io_disable state writeback_consider_fragment writeback_rate_fp_term_mid
backing_dev_uuid io_error_limit stats_day writeback_delay writeback_rate_i_term_inverse
cache io_errors stats_five_minute writeback_metadata writeback_rate_minimum
cache_mode label stats_hour writeback_percent writeback_rate_p_term_inverse
clear_stats partial_stripes_expensive stats_total writeback_rate writeback_rate_update_seconds
detach readahead_cache_policy stop writeback_rate_debug writeback_running
dev running stop_when_cache_set_failed writeback_rate_fp_term_high
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./running
1
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./state
dirty
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./dirty_data
775.9M
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./writeback_running
1
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./backing_dev_name
sda
anduin@ms-server:/sys/block/bcache0/bcache$ cat ./cache_mode
writethrough [writeback] writearound none
anduin@ms-server:/sys/block/bcache0/bcache$ cd ./cache
anduin@ms-server:/sys/block/bcache0/bcache/cache$ ls
average_key_size bucket_size congested flash_vol_create journal_delay_ms stats_hour tree_depth
bdev0 cache0 congested_read_threshold_us internal root_usage_percent stats_total unregister
block_size cache_available_percent congested_write_threshold_us io_error_halflife stats_day stop
btree_cache_size clear_stats errors io_error_limit stats_five_minute synchronous
anduin@ms-server:/sys/block/bcache0/bcache/cache$ cat ./errors
[unregister] panic
anduin@ms-server:/sys/block/bcache0/bcache/cache$ cat ./bucket_size
512.0k
anduin@ms-server:/sys/block/bcache0/bcache/cache$ cat ./block_size
0.5k
anduin@ms-server:/sys/block/bcache0/bcache/cache$ cd ./stats_day/
anduin@ms-server:/sys/block/bcache0/bcache/cache/stats_day$ ls
bypassed cache_bypass_hits cache_bypass_misses cache_hit_ratio cache_hits cache_miss_collisions cache_misses
anduin@ms-server:/sys/block/bcache0/bcache/cache/stats_day$ cat ./cache_hit_ratio
4
anduin@ms-server:/sys/block/bcache0/bcache/cache/stats_day$ cat ./cache_hits
11611
anduin@ms-server:/sys/block/bcache0/bcache/cache/stats_day$ cat ./cache_misses
269927
anduin@ms-server:/sys/block/bcache0/bcache/cache/stats_day$ cd /swarm-vol/
anduin@ms-server:/swarm-vol$ df . -Th
Filesystem Type Size Used Avail Use% Mounted on
/dev/bcache0 ext4 58T 6.7T 49T 13% /swarm-vol
If unable to run wipefs on a device due to "Device or resource busy" error
The error "Device or resource busy" indicates that the device /dev/nvme1n1 is currently in use, preventing wipefs from accessing it. To resolve this, you need to ensure that no processes or mount points are actively using the device. Here are some steps to identify and resolve the issue:
1. Check if the device is mounted
Run:
mount | grep /dev/nvme1n1
If it is mounted, unmount it:
sudo umount /dev/nvme1n1
2. Check for active partitions
If any partitions on /dev/nvme1n1 are in use, they need to be unmounted:
lsblk
Unmount active partitions:
sudo umount /dev/nvme1n1pX # Replace "X" with the partition number
4. Check for bcache association
The presence of bcache0 suggests bcache is in use. Verify:
sudo bcache-super-show /dev/nvme1n1
If it is associated, unregister it:
echo 1 | sudo tee /sys/block/bcacheX/bcache/stop # Replace "bcacheX" appropriately
Clear the bcache superblock:
sudo wipefs -a /dev/nvme1n1
5. Check for swap usage
If the device is used as swap:
cat /proc/swaps
sudo swapoff /dev/nvme1n1
6. Check for open file descriptors
List processes using the device:
sudo lsof | grep /dev/nvme1n1
Kill the processes if necessary:
sudo kill -9 <PID>
7. Retry wipefs
Once the device is no longer in use, retry:
sudo wipefs -a /dev/nvme1n1
If the bcache device is not showing up in fdisk
First, try some clean-up:
echo $cset_uuid | sudo tee /sys/fs/bcache/pendings_cleanup
echo $backing_uuid | sudo tee /sys/fs/bcache/pendings_cleanup
Use bcache-super-show to get the uuids.
Then try again to register:
echo $cset_uuid | sudo tee /sys/fs/bcache/register
echo $backing_uuid | sudo tee /sys/fs/bcache/register
The cache UUID should exist in /sys/fs/bcache if the cache device is successfully registered.
If bcache-super-show says that the backing device's dev.data.cache_state is clean and the cset.uuid consists only of zeros, the bcache device is in an invalid state and must be recreated. [source]
However, if clean, you could try force-starting the backing device without a cache device:
echo 1 | sudo tee /sys/class/block/$dev/bcache/running
Eject cache
I used bcache only in a writethrough configuration, and IIRC even then bcache doesn't like it at all if the cache device vanishes while the machine is running. Expect the bcache device to stall completely if that happens.
I haven't tried to remove the cache device while the machine is powered down, so I can't say anything about that. I do think though that bcache is still pretty touchy, so I'd recommend that you try that with a VM or a physical test machine first.
To safely remove the cache device, you can detach the cache set from the bcache device:
echo <cache-set-uuid> > /sys/block/bcache0/bcache/detach
To determine the necessary cache set UUID, look in /sys/fs/bcache/:
host ~ # ll /sys/fs/bcache/
total 0
drwxr-xr-x 7 root root 0 Feb 19 00:11 eb99feda-fac7-43dc-b89d-18765e9febb6
--w------- 1 root root 4096 Feb 19 00:11 register
--w------- 1 root root 4096 Feb 7 07:17 register_quiet
So for example in this case, run:
echo eb99feda-fac7-43dc-b89d-18765e9febb6 > /sys/block/bcache0/bcache/detach
The state file should say no cache after that:
host ~ # cat /sys/block/bcache0/bcache/state
no cache
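One extra note (general bcache behaviour, not part of the quoted answer above): if the cache is in writeback mode, it is safer to switch to writethrough first and wait until all dirty data has been flushed before detaching:
# Switch to writethrough so no new dirty data is produced
echo writethrough | sudo tee /sys/block/bcache0/bcache/cache_mode
# Wait until the state reports "clean" (or dirty_data drops to 0)
cat /sys/block/bcache0/bcache/state
cat /sys/block/bcache0/bcache/dirty_data
# Only then detach the cache set
echo <cache-set-uuid> | sudo tee /sys/block/bcache0/bcache/detach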