[Fx30-3G] UBIFS error, USER1 partition turning RO

Hello,

Over the last three months, the USER1 partition has become read-only on three of our devices.

Here are some traces from logread:

        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.031411] msm_nand_read_oob: 13bc1000 800 0 failed -117, corrected 2
        Jul 30 03:39:31 Fx30-DrBox user.notice kernel: [732621.031441] ubi2: fixable bit-flip detected at PEB 994
        Jul 30 03:39:31 Fx30-DrBox user.notice kernel: [732621.031441] ubi2: schedule PEB 994 for scrubbing
        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.031685] UBIFS error (ubi2:0 pid 1700): ubifs_check_node: bad magic 0xcdd342ff, expected 0x6101831
        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.031716] UBIFS error (ubi2:0 pid 1700): ubifs_check_node: bad node at LEB 1031:0
        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.031716] Not a node, first 24 bytes:
        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.031899] 00000000: ff 42 d3 cd b6 6f d9 0e bc ef 47 00 00 00 00 00 48 00 00 00 02 01 00 00                          .B...o....G.....H.......
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.031930] CPU: 0 PID: 1700 Comm: moduleDNP3 Not tainted 3.14.29ltsi-a00e464379_0c71c3069c #2
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.032021] [<c001583c>] (unwind_backtrace) from [<c00128f0>] (show_stack+0x20/0x24)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034371] [<c00128f0>] (show_stack) from [<c0617284>] (dump_stack+0x20/0x28)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034829] [<c0617284>] (dump_stack) from [<c025a958>] (ubifs_check_node+0x27c/0x2b8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034890] [<c025a958>] (ubifs_check_node) from [<c025c05c>] (ubifs_read_node+0x1f0/0x2f0)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034921] [<c025c05c>] (ubifs_read_node) from [<c025c490>] (ubifs_read_node_wbuf+0x334/0x35c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034951] [<c025c490>] (ubifs_read_node_wbuf) from [<c027b40c>] (ubifs_tnc_read_node+0x5c/0x1f0)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.034982] [<c027b40c>] (ubifs_tnc_read_node) from [<c025d66c>] (matches_name+0x50/0xdc)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.035470] [<c025d66c>] (matches_name) from [<c025d740>] (resolve_collision+0x48/0x2d8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.035501] [<c025d740>] (resolve_collision) from [<c0260948>] (ubifs_tnc_remove_nm+0xf8/0x1cc)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.035562] [<c0260948>] (ubifs_tnc_remove_nm) from [<c024d4a4>] (ubifs_jnl_update+0x43c/0x5d8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036019] [<c024d4a4>] (ubifs_jnl_update) from [<c025281c>] (ubifs_unlink+0x218/0x2c8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036080] [<c025281c>] (ubifs_unlink) from [<c0154128>] (vfs_unlink+0xe4/0x17c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036111] [<c0154128>] (vfs_unlink) from [<c029eaf4>] (call_unlink+0x9c/0x170)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036386] [<c029eaf4>] (call_unlink) from [<c029f840>] (vfsub_unlink+0x3c/0x68)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036691] [<c029f840>] (vfsub_unlink) from [<c02b8f34>] (aufs_unlink+0x1f0/0x34c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036935] [<c02b8f34>] (aufs_unlink) from [<c0154128>] (vfs_unlink+0xe4/0x17c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.036965] [<c0154128>] (vfs_unlink) from [<c01542a4>] (do_unlinkat+0xe4/0x1b8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.037149] [<c01542a4>] (do_unlinkat) from [<c0154f44>] (SyS_unlink+0x20/0x24)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.037210] [<c0154f44>] (SyS_unlink) from [<c000e740>] (ret_fast_syscall+0x0/0x30)
        Jul 30 03:39:31 Fx30-DrBox user.err kernel: [732621.037240] UBIFS error (ubi2:0 pid 1700): ubifs_read_node: expected node type 2
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.037484] UBIFS warning (ubi2:0 pid 1700): ubifs_ro_mode: switched to read-only mode, error -117
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.037515] CPU: 0 PID: 1700 Comm: moduleDNP3 Not tainted 3.14.29ltsi-a00e464379_0c71c3069c #2
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038461] [<c001583c>] (unwind_backtrace) from [<c00128f0>] (show_stack+0x20/0x24)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038522] [<c00128f0>] (show_stack) from [<c0617284>] (dump_stack+0x20/0x28)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038553] [<c0617284>] (dump_stack) from [<c025a050>] (ubifs_ro_mode+0x84/0x94)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038583] [<c025a050>] (ubifs_ro_mode) from [<c024d600>] (ubifs_jnl_update+0x598/0x5d8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038614] [<c024d600>] (ubifs_jnl_update) from [<c025281c>] (ubifs_unlink+0x218/0x2c8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.038644] [<c025281c>] (ubifs_unlink) from [<c0154128>] (vfs_unlink+0xe4/0x17c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040262] [<c0154128>] (vfs_unlink) from [<c029eaf4>] (call_unlink+0x9c/0x170)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040292] [<c029eaf4>] (call_unlink) from [<c029f840>] (vfsub_unlink+0x3c/0x68)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040323] [<c029f840>] (vfsub_unlink) from [<c02b8f34>] (aufs_unlink+0x1f0/0x34c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040353] [<c02b8f34>] (aufs_unlink) from [<c0154128>] (vfs_unlink+0xe4/0x17c)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040384] [<c0154128>] (vfs_unlink) from [<c01542a4>] (do_unlinkat+0xe4/0x1b8)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.040414] [<c01542a4>] (do_unlinkat) from [<c0154f44>] (SyS_unlink+0x20/0x24)
        Jul 30 03:39:31 Fx30-DrBox user.warn kernel: [732621.041544] [<c0154f44>] (SyS_unlink) from [<c000e740>] (ret_fast_syscall+0x0/0x30)
        Jul 30 03:39:31 Fx30-DrBox user.notice kernel: [732621.099715] ubi2: scrubbed PEB 994 (LEB 0:1031), data moved to PEB 992
        Jul 30 03:39:32 Fx30-DrBox user.err kernel: [732621.794664] UBIFS error (ubi2:0 pid 14532): do_commit: commit failed, error -30

Information from the “ubinfo” command:

        root@Fx30# ubinfo  -a
                UBI version:                    1
                Count of UBI devices:           3
                UBI control device major/minor: 10:35
                Present UBI devices:            ubi0, ubi1, ubi2

                ubi0
                Volumes count:                           2
                Logical eraseblock size:                 126976 bytes, 124.0 KiB
                Total amount of logical eraseblocks:     394 (50028544 bytes, 47.7 MiB)
                Amount of available logical eraseblocks: 0 (0 bytes)
                Maximum count of volumes                 128
                Count of bad physical eraseblocks:       0
                Count of reserved physical eraseblocks:  58
                Current maximum erase counter value:     1
                Minimum input/output unit size:          2048 bytes
                Character device major/minor:            245:0
                Present volumes:                         0, 1

                Volume ID:   0 (on ubi0)
                Type:        dynamic
                Alignment:   1
                Size:        265 LEBs (33648640 bytes, 32.1 MiB)
                State:       OK
                Name:        rootfs
                Character device major/minor: 245:1
                -----------------------------------
                Volume ID:   1 (on ubi0)
                Type:        dynamic
                Alignment:   1
                Size:        67 LEBs (8507392 bytes, 8.1 MiB)
                State:       OK
                Name:        scratch
                Character device major/minor: 245:2

                ===================================

                ubi1
                Volumes count:                           1
                Logical eraseblock size:                 126976 bytes, 124.0 KiB
                Total amount of logical eraseblocks:     316 (40124416 bytes, 38.3 MiB)
                Amount of available logical eraseblocks: 0 (0 bytes)
                Maximum count of volumes                 128
                Count of bad physical eraseblocks:       0
                Count of reserved physical eraseblocks:  80
                Current maximum erase counter value:     1
                Minimum input/output unit size:          2048 bytes
                Character device major/minor:            243:0
                Present volumes:                         0

                Volume ID:   0 (on ubi1)
                Type:        dynamic
                Alignment:   1
                Size:        232 LEBs (29458432 bytes, 28.1 MiB)
                State:       OK
                Name:        legato
                Character device major/minor: 243:1

                ===================================

                ubi2
                Volumes count:                           1
                Logical eraseblock size:                 126976 bytes, 124.0 KiB
                Total amount of logical eraseblocks:     1116 (141705216 bytes, 135.1 MiB)
                Amount of available logical eraseblocks: 0 (0 bytes)
                Maximum count of volumes                 128
                Count of bad physical eraseblocks:       0
                Count of reserved physical eraseblocks:  80
                Current maximum erase counter value:     259
                Minimum input/output unit size:          2048 bytes
                Character device major/minor:            244:0
                Present volumes:                         0

                Volume ID:   0 (on ubi2)
                Type:        dynamic
                Alignment:   1
                Size:        1032 LEBs (131039232 bytes, 125.0 MiB)
                State:       OK
                Name:        user1_vol0
                Character device major/minor: 244:1

We write to the file system regularly (sometimes every second). Given the UBIFS file system and the rated endurance of the flash (50,000 write cycles minimum), how can we estimate the life span of the flash before errors start to appear?
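A minimal sketch of the kind of estimate we are after is shown below; it is a rough approximation, not a vendor-approved method. It assumes near-perfect wear levelling by UBI and extrapolates linearly. The 90 days in service, the 2048 bytes written per second, and the write amplification factor of 2 are hypothetical placeholders, not measured values. The 128 KiB physical eraseblock size is inferred from the 124 KiB LEB size plus two 2048-byte pages of UBI headers.

```python
SECONDS_PER_DAY = 86_400


def remaining_days_from_erase_counter(endurance_cycles, max_erase_count, days_in_service):
    """Extrapolate linearly from the highest erase counter reported by ubinfo."""
    if max_erase_count <= 0 or days_in_service <= 0:
        raise ValueError("max_erase_count and days_in_service must be positive")
    erases_per_day = max_erase_count / days_in_service
    return (endurance_cycles - max_erase_count) / erases_per_day


def lifetime_days_from_write_rate(endurance_cycles, peb_count, peb_bytes,
                                  bytes_per_second, write_amplification):
    """Estimate total lifetime from the application write rate.

    Assumes every physical eraseblock is worn evenly, so the flash can absorb
    endurance_cycles * peb_count * peb_bytes bytes in total.
    """
    if bytes_per_second <= 0 or write_amplification <= 0:
        raise ValueError("bytes_per_second and write_amplification must be positive")
    total_writable_bytes = endurance_cycles * peb_count * peb_bytes
    flash_bytes_per_day = bytes_per_second * write_amplification * SECONDS_PER_DAY
    return total_writable_bytes / flash_bytes_per_day


if __name__ == "__main__":
    # From the post: 50,000 cycles endurance, max erase counter 259 on ubi2.
    # ASSUMPTION: 90 days in service (hypothetical, not measured).
    days_left = remaining_days_from_erase_counter(50_000, 259, 90)
    print(f"Erase-counter estimate: {days_left / 365:.1f} years remaining")

    # From the post: 1116 eraseblocks on ubi2.
    # ASSUMPTIONS: 128 KiB PEBs, one 2048-byte page written per second,
    # write amplification of 2 (all hypothetical).
    total_days = lifetime_days_from_write_rate(50_000, 1116, 128 * 1024, 2048, 2)
    print(f"Write-rate estimate: {total_days / 365:.1f} years total")
```

Under these assumptions both estimates come out at several decades. The maximum erase counter on ubi2 is only 259, far below the rated 50,000 cycles, so plain wear-out may not be what triggers the errors above.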

Our problem may be related to the post "'/home/root' user partition became 'read-only' suddenly" (#9 by jyijyi).

In our case, the problem occurred after several days or months of operation. Rebooting the system restores the partition to its normal state.

Please let me know if you have any ideas on how to avoid this problem in the future.

Regards, Christian.