[WP7607] Data deletion after power loss

Hi, I’m using the WP7607 in a monitoring application where I control some machinery and track its status, and I’ve installed all the tools and scripts that let me do this in /home/root. It worked fine for a couple of weeks, until a series of power losses wiped everything I had put in /home and /etc. I ran some tests and found that both reside in /mnt/flash, and by putting a WP through a series of power losses I was able to reproduce the issue:

ima: setting up IMA subsystem...
ima: feature not supported
The proc node does not exist
mount root fs from partition (rootfs|system)
UBI device number 0, total 120 LEBs (30474240 bytes, 29.0 MiB), available 0 LEBs (0 bytes), LEB size 253952 bytes (248.0 KiB)
dm-verity key not installed, authentication disabled
Non-Secure version.
rootfs roothash set at compile time: 083b306ba8b4678734d2a68e980c583ffda32ac7e5c2c8d8579aa446abe1fa91
rootfs: dev '/dev/ubiblock0_0' 'squashfs'
mount /dev/mapper/rt
rootfs: mounting took 0ms
init started: BusyBox v1.27.2 (2020-10-22 09:40:46 UTC)
rcS: Executing mount_essential_fs...
rcS: Executing simple_network...
rcS: Executing run_S_scripts...
S02mount_early: Executing mount_early_pseudo...
S02mount_early: Mounting SMACK security pseudo fs...
S02mount_early: Executing mount_early_other...
S02mount_early: Executing mount_early_create_dirs...
S02mount_early: Executing mount_early_set_timezone...
S02mount_early: Using timezone Universal...
S02mount_early: Executing yaffs2_kern_supported_init...
S02mount_early: Executing mount_early_user_start...
S02mount_early: RO rootfs fudge allowed.
S02mount_early: User is forcing userapp file system to be ubifs.
S02mount_early: Trying to mount UBIFS on /mnt/flash using [usrquota,grpquota,rw] mount options...
[    6.410107] SWI ssmem_alloc_entry_get: ssmem region 20 not allocated
[    6.415421] SWI ssmem_get: ssmem_get: region 20 not exists
UBI device number 3, total 524 LEBs (133070848 bytes, 126.9 MiB), available 5 LEBs (1269760 bytes, 1.2 MiB), LEB size 253952 bytes (248.0 KiB)
[    7.542181] UBIFS error (ubi3:0 pid 331): grab_empty_leb: could not find an empty LEB
[    7.548982] (pid 331) start dumping LEB properties
[    7.553795] (pid 331) Lprops statistics: empty_lebs 0, idx_lebs  10
[    7.560005]  taken_empty_lebs 0, total_free 700416, total_dirty 66986680
[    7.566746]  total_used 49903240, total_dark 3727360, total_dead 0
[    7.573145] LEB 10      free 0        dirty 251264   used 2688     free + dirty 251264   dark 8192 dead 0    nodes fit 59  flags 0x1  (dirty)
[    7.585629] LEB 11      free 0        dirty 251752   used 2200     free + dirty 251752   dark 8192 dead 0    nodes fit 59  flags 0x1  (dirty)
[    7.598279] LEB 12      free 0        dirty 245800   used 8152     free + dirty 245800   dark 8192 dead 0    nodes fit 57  flags 0x1  (dirty)
[    7.610874] LEB 13      free 0        dirty 125488   used 128464   free + dirty 125488   dark 8192 dead 0    nodes fit 29  flags 0x1  (dirty)
[    7.623547] LEB 14      free 0        dirty 249576   used 4376     free + dirty 249576   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
#... THERE ARE A LOT OF THOSE ...#
[   13.574855] LEB 470     free 0        dirty 99024    used 154928   free + dirty 99024    dark 8192 dead 0    nodes fit 23  flags 0x0  (not categorized)
[   13.588395] LEB 471     free 0        dirty 249128   used 4824     free + dirty 249128   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
[   13.601071] LEB 472     free 0        dirty 114472   used 139480   free + dirty 114472   dark 8192 dead 0    nodes fit 26  flags 0x0  (not categorized)
[   13.614611] LEB 473     free 0        dirty 249208   used 4744     free + dirty 249208   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
[   13.627285] LEB 474     free 0        dirty 119832   used 134120   free + dirty 119832   dark 8192 dead 0    nodes fit 28  flags 0x0  (not categorized)
[   13.640825] (pid 331) finish dumping LEB properties
[   13.645689] (pid 331) Budgeting info: data budget sum 0, total budget sum 0
[   13.652674]  budg_data_growth 0, budg_dd_growth 0, budg_idx_growth 0
[   13.658962]  min_idx_lebs 8, old_idx_sz 510584, uncommitted_idx 25536
[   13.665431]  page_budget 4144, inode_budget 160, dent_budget 312
[   13.671415]  nospace 0, nospace_rp 0
[   13.674935]  dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
[   13.680620]  freeable_cnt 0, calc_idx_sz 497344, idx_gc_cnt 0
[   13.686304]  dirty_pg_cnt 0, dirty_zn_cnt 133, clean_zn_cnt 6
[   13.692077]  gc_lnum -1, ihead_lnum 71
[   13.695769]  jhead 0 (GC)     LEB 128
[   13.699159]  jhead 1 (base)   LEB 74
[   13.702670]  jhead 2 (data)   LEB 145
[   13.706188]  bud LEB 18
[   13.708614]  bud LEB 25
[   13.711087]  bud LEB 26
[   13.713479]  bud LEB 33
[   13.715904]  bud LEB 48
[   13.718339]  bud LEB 56
[   13.720811]  bud LEB 72
[   13.723203]  bud LEB 74
[   13.725629]  bud LEB 83
[   13.728063]  bud LEB 114
[   13.730619]  bud LEB 128
[   13.733092]  bud LEB 134
[   13.735616]  bud LEB 145
[   13.738131]  bud LEB 147
[   13.740704]  bud LEB 162
[   13.743168]  commit state 0
[   13.745947] Budgeting predictions:
[   13.749331]  available: 61393272, outstanding 0, free 57995120
mount: mounting /dev/ubi3_0 on /mnt/flash failed: No space left on device
S02mount_early: Unable to mount /dev/ubiblock3_0 onto /mnt/flash.
S02mount_early: Trying to mount UBIFS on /mnt/flash using [rw] mount options...
UBI device number 3, total 524 LEBs (133070848 bytes, 126.9 MiB), available 5 LEBs (1269760 bytes, 1.2 MiB), LEB size 253952 bytes (248.0 KiB)
[   15.302980] UBIFS error (ubi3:0 pid 376): grab_empty_leb: could not find an empty LEB
[   15.309782] (pid 376) start dumping LEB properties
[   15.314891] (pid 376) Lprops statistics: empty_lebs 0, idx_lebs  10
[   15.320844]  taken_empty_lebs 0, total_free 700416, total_dirty 66986680
[   15.327489]  total_used 49903240, total_dark 3727360, total_dead 0
[   15.333688] LEB 10      free 0        dirty 251264   used 2688     free + dirty 251264   dark 8192 dead 0    nodes fit 59  flags 0x1  (dirty)
[   15.346325] LEB 11      free 0        dirty 251752   used 2200     free + dirty 251752   dark 8192 dead 0    nodes fit 59  flags 0x1  (dirty)
[   15.358999] LEB 12      free 0        dirty 245800   used 8152     free + dirty 245800   dark 8192 dead 0    nodes fit 57  flags 0x1  (dirty)
[   15.371673] LEB 13      free 0        dirty 125488   used 128464   free + dirty 125488   dark 8192 dead 0    nodes fit 29  flags 0x1  (dirty)
[   15.384346] LEB 14      free 0        dirty 249576   used 4376     free + dirty 249576   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
#... THERE ARE A LOT OF THOSE ...#
[   21.324368] LEB 470     free 0        dirty 99024    used 154928   free + dirty 99024    dark 8192 dead 0    nodes fit 23  flags 0x0  (not categorized)
[   21.337908] LEB 471     free 0        dirty 249128   used 4824     free + dirty 249128   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
[   21.350642] LEB 472     free 0        dirty 114472   used 139480   free + dirty 114472   dark 8192 dead 0    nodes fit 26  flags 0x0  (not categorized)
[   21.364125] LEB 473     free 0        dirty 249208   used 4744     free + dirty 249208   dark 8192 dead 0    nodes fit 58  flags 0x1  (dirty)
[   21.376802] LEB 474     free 0        dirty 119832   used 134120   free + dirty 119832   dark 8192 dead 0    nodes fit 28  flags 0x0  (not categorized)
[   21.390334] (pid 376) finish dumping LEB properties
[   21.395199] (pid 376) Budgeting info: data budget sum 0, total budget sum 0
[   21.402182]  budg_data_growth 0, budg_dd_growth 0, budg_idx_growth 0
[   21.408479]  min_idx_lebs 8, old_idx_sz 510584, uncommitted_idx 25536
[   21.414940]  page_budget 4144, inode_budget 160, dent_budget 312
[   21.420932]  nospace 0, nospace_rp 0
[   21.424452]  dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
[   21.430137]  freeable_cnt 0, calc_idx_sz 497344, idx_gc_cnt 0
[   21.435820]  dirty_pg_cnt 0, dirty_zn_cnt 133, clean_zn_cnt 6
[   21.441594]  gc_lnum -1, ihead_lnum 71
[   21.445285]  jhead 0 (GC)     LEB 128
[   21.448667]  jhead 1 (base)   LEB 74
[   21.452177]  jhead 2 (data)   LEB 145
[   21.455696]  bud LEB 18
[   21.458130]  bud LEB 25
[   21.460603]  bud LEB 26
[   21.462994]  bud LEB 33
[   21.465420]  bud LEB 48
[   21.467854]  bud LEB 56
[   21.470330]  bud LEB 72
[   21.472712]  bud LEB 74
[   21.475147]  bud LEB 83
[   21.477572]  bud LEB 114
[   21.480124]  bud LEB 128
[   21.482605]  bud LEB 134
[   21.485130]  bud LEB 145
[   21.487645]  bud LEB 147
[   21.490196]  bud LEB 162
[   21.492678]  commit state 0
[   21.495458] Budgeting predictions:
[   21.498840]  available: 61393272, outstanding 0, free 57995120
mount: mounting /dev/ubi3_0 on /mnt/flash failed: No space left on device
S02mount_early: Unable to mount /dev/ubiblock3_0 onto /mnt/flash.
ubiformat: mtd16 (nand), size 137363456 bytes (131.0 MiB), 524 eraseblocks of 262144 bytes (256.0 KiB), min. I/O size 4096 bytes
libscan: scanning eraseblock 523 -- 100 % complete
ubiformat: 524 eraseblocks have valid erase counter, mean value is 14
ubiformat: formatting eraseblock 523 -- 100 % complete
UBI device number 3, total 524 LEBs (133070848 bytes, 126.9 MiB), available 480 LEBs (121896960 bytes, 116.2 MiB), LEB size 253952 bytes (248.0 KiB)
S02mount_early: Making single volume, size 115MiB on UBI device number 3...
Volume ID 0, size 475 LEBs (120627200 bytes, 115.0 MiB), LEB size 253952 bytes (248.0 KiB), dynamic, name "userapp_vol0", alignment 1
S02mount_early: Trying to mount UBIFS on /mnt/flash using [usrquota,grpquota,rw] mount options...
UBI device number 3, total 524 LEBs (133070848 bytes, 126.9 MiB), available 5 LEBs (1269760 bytes, 1.2 MiB), LEB size 253952 bytes (248.0 KiB)
S02mount_early: Performing quota check on file system mounted at /mnt/flash
S02mount_early: UBIFS volume successfully mounted on /mnt/flash
S02mount_early: Executing mount_early_legato_start...
mount Legato from partition lefwkro
UBI device number 2, total 35 LEBs (8888320 bytes, 8.4 MiB), available 0 LEBs (0 bytes), LEB size 253952 bytes (248.0 KiB)
dm-verity key not installed, authentication disabled
Non-Secure version.
S02mount_early: SQUASHFS successfully mounted on /mnt/legato

It seems that the userapp partition gets corrupted and S02mount_early simply wipes it. Even when I stopped it from wiping the partition (by modifying the script in a custom Yocto build), I wasn’t able to restore it by hand.
I’ve also tried making the whole OS read-only, which I managed to do after a lot of headaches (OverlayFS was very stubborn and a lot of patches were missing from the source), but that makes Legato unusable. Then I found out about Legato RO, and after another round of headaches and nonsense (like the Yocto recipe not accepting READ_ONLY=1 during make, and figuring out that Legato has to be built using the Yocto source or else it won’t flash) it still doesn’t work. Honestly, I have been banging my head against a wall this whole time. Does anyone have a solution or fix that can get this problem sorted? I need full module functionality and resistance to power loss.
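For anyone reading the dump above, the useful numbers are in the "Lprops statistics" lines: no empty LEBs, but a large amount of dirty space. Below is a minimal sketch for pulling those counters out of a saved boot log. It assumes the console output was captured to a file (the name boot.log is hypothetical) and that awk is available, either on the module or on a host PC. lprops_summary is an illustrative helper, not part of any Sierra Wireless tooling.

```shell
#!/bin/sh
# Summarise the first UBIFS "Lprops statistics" dump found in a boot log
# read from stdin. Field names are the ones the kernel prints in the dump.
lprops_summary() {
    awk '
    {
        # Scan each line for "name value" pairs of the counters we want.
        for (i = 1; i < NF; i++) {
            key = $i
            if (key != "empty_lebs" && key != "total_free" &&
                key != "total_dirty" && key != "total_used")
                continue
            if (key in seen)        # keep only the first dump
                continue
            val = $(i + 1)
            sub(/,$/, "", val)      # drop the trailing comma
            seen[key] = val
        }
    }
    END {
        total = seen["total_free"] + seen["total_dirty"] + seen["total_used"]
        pct = (total > 0) ? 100 * seen["total_dirty"] / total : 0
        printf "empty_lebs=%s total_free=%s total_dirty=%s total_used=%s dirty_pct=%.0f\n",
            seen["empty_lebs"], seen["total_free"],
            seen["total_dirty"], seen["total_used"], pct
    }'
}

# Usage (boot.log is a hypothetical capture of the console output):
#   lprops_summary < boot.log
```

Fed the first dump from the log above, this reports empty_lebs=0 with roughly 57% of the accounted space dirty. That matches the grab_empty_leb error: the mount fails because UBIFS cannot find an empty LEB, not because the volume is full of live data.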

How about putting your required files and tools in the Yocto image?

I have tried, I should have mentioned that before. It is quite a complex tool and requires far more disk space than I can squeeze out of the image. When I tried to fit it, I had to set UBI_ROOTFS_SIZE = "55MiB" and remove Python, and even then it refused to flash, so I settled for putting it in /home, since that has about 100 MB of space.

You can try AT!PARTITION=? and see if you can resize the Yocto area:

at!entercnd="****"
OK
at!partition=?
To print partition table: AT!PARTITION?
To modify partition sizes: AT!PARTITION=<name>,<size0>[,<size1>,...]
  <name> - name of the first partition to modify its size, can be blank
  <size0> - new size in KB of the first partition
  <size1>,... - new sizes of next partitions
  for example: AT!PARTITION=0:boot,2560
  List of partitions whose size can be modified:

None/Not Allowed

OK
at!partition?
            PART    BLOCK     SIZE         
            NAME   OFFSET     (KB)         
           0:SBL 00000000     2560
         0:MIBIB 0000000A     2560
        0:BACKUP 00000014     6656
      0:SECURITY 0000002E     1024
       0:PERSIST 00000032     2048
          0:EFS2 0000003A    17920
       0:SWIFOTA 00000080    81152
            0:TZ 000001BD     1536
        0:DEVCFG 000001C3      768
           0:RPM 000001C6      768
         0:modem 000001C9    32768
         0:aboot 00000249     1024
          0:boot 0000024D    15360
        0:system 00000289    30720
       0:LEFWKRO 00000301     8960
         0:SWIRW 00000324    25600
       0:USERAPP 00000388   134144
      0:RESERVED 00000594    55808
        0:SLOT_2 0000066E    34304
        0:SLOT_1 000006F4    34304
        0:SLOT_0 0000077A    34304

OK

It seems that I can’t modify it this way. Anything else I could try?
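As a cross-check between the partition table and the boot log, the USERAPP size can be converted from KB to bytes and eraseblocks. This is only a small arithmetic sketch: the helper names are made up, and the 256 KiB eraseblock size is taken from the ubiformat line in the log.

```shell
#!/bin/sh
# AT!PARTITION? prints sizes in KB; ubiformat prints bytes and eraseblocks.
kb_to_bytes() { echo $(( $1 * 1024 )); }

# Eraseblock size of 262144 bytes (256 KiB), as reported by ubiformat for mtd16.
bytes_to_eraseblocks() { echo $(( $1 / 262144 )); }

# Usage:
#   kb_to_bytes 134144                               # USERAPP size from the table
#   bytes_to_eraseblocks "$(kb_to_bytes 134144)"
```

The 134144 KB of 0:USERAPP comes out as 137363456 bytes, or 524 eraseblocks, which is exactly what ubiformat reports for mtd16. So USERAPP is the partition that S02mount_early reformats.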

You need to contact your distributor to get a tool to modify the partitions.

Otherwise, you need to prevent power loss.

I will try to contact my local distributor. Unfortunately, it is pretty hard to avoid power loss in my case, which is why I made this post. If someone else is reading this, even years later, I’d be happy to hear that someone has solved it.

How about setting AT!PCVOLTLIMITS to prevent a sudden power cut?

How about an SD card or USB thumb drive to store your big files?

I’ve looked into PCVOLTLIMITS. It could be useful if it put the board in a safe state before complete power loss, but as far as I have read it only puts the WP in ULPM, and I can’t get it to work with my current power supply. As for the SD card, it is already being used for all the logs that the program saves, and there is still a possibility that it gets corrupted, so we are back at the starting point. The only USB port that the WP has is already being used for the connection with the machines. Adding a hub might be possible, but that makes the connection way too fiddly and I still want to fit within the same size constraints… It seems I won’t be able to do this without either adding more memory to restore the system from, or racing the capacitors to put the system in a safe state, so a compromise must be made. I’m still wondering about Legato Read Only. I’m pretty sure that Legato is the culprit of these corruptions, since I’ve disabled its startup script and the WP is still alive (but without an internet connection or GPIO). If only there was a fix for that, I think the corruption issue might be resolved this way.

Got a solution for this problem! After I saw this post: Free (small) partition for storing serial number? - #15 by sierra-wireless, I thought about formatting the userapp MTD16 partition and splitting it into two UBI volumes. The first gets a quarter of the space and is the original userapp (the one that gets mounted on /mnt/flash); the second gets all the remaining space and is mounted RO somewhere else, where I put the scripts and the machine monitoring setup. Since doing this I’ve yet to have a problem, and in case something happens to /mnt/flash, I’ve got a restore procedure that fixes it. Pretty elegant, and it was very easy to do (another 20 lines in S02mount_early). Can’t give much more detail though.
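Since the actual changes to S02mount_early were not posted, here is only a rough sketch of what a two-volume split could look like, not the poster’s script. The device numbers (mtd16, ubi3), the volume name userapp_vol0, the 253952-byte LEB size and the 475-LEB budget are taken from the boot log above. The second volume name (tools_vol0), its mount point (/mnt/tools) and all function names are assumptions made up for illustration. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: split the userapp MTD partition into two UBI volumes.
MTD_NUM=16                  # from the log: ubiformat: mtd16
UBI_NUM=3                   # from the log: UBI device number 3
LEB_SIZE=253952             # from the log: LEB size 253952 bytes
RW_VOL=userapp_vol0         # original volume name from the log
RO_VOL=tools_vol0           # ASSUMED name for the read-only volume
RO_MNT=/mnt/tools           # ASSUMED mount point
DRY_RUN=${DRY_RUN:-1}       # 1 = print commands, anything else = execute

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# A quarter of the LEB budget goes to the read-write volume.
rw_lebs() { echo $(( $1 / 4 )); }

# $1 = number of LEBs to share between both volumes (475 in the log).
split_userapp() {
    split_total=$1
    split_rw=$(rw_lebs "$split_total")
    split_ro=$(( split_total - split_rw ))
    run ubidetach /dev/ubi_ctrl -d "$UBI_NUM" || true   # may not be attached
    run ubiformat "/dev/mtd$MTD_NUM" -y
    run ubiattach /dev/ubi_ctrl -m "$MTD_NUM" -d "$UBI_NUM"
    run ubimkvol "/dev/ubi$UBI_NUM" -N "$RW_VOL" -s $(( split_rw * LEB_SIZE ))
    run ubimkvol "/dev/ubi$UBI_NUM" -N "$RO_VOL" -s $(( split_ro * LEB_SIZE ))
}

# Normal boot: userapp read-write as before, the tools volume read-only.
mount_volumes() {
    run mount -t ubifs -o rw "ubi$UBI_NUM:$RW_VOL" /mnt/flash
    run mount -t ubifs -o ro "ubi$UBI_NUM:$RO_VOL" "$RO_MNT"
}

# Usage (dry run, prints the commands for a 475-LEB budget):
#   split_userapp 475
#   mount_volumes
```

With 475 LEBs this gives 118 LEBs to userapp_vol0 and 357 to the second volume. Two things the sketch leaves out: the second volume has to be mounted read-write once to copy the tools onto it before it can be mounted RO on every boot, and the restore procedure for /mnt/flash mentioned above is not shown because its details were not shared.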