File System Frequently Becomes Read-Only
Error:
[root@server.example.com ~]# cd /STORLOGS
[root@server.example.com STORLOGS]# touch ram
touch: cannot touch `ram': Read-only file system
[root@server.example.com STORLOGS]#
Cause:
1. The path to the device is failing.
Aug 21 06:00:03 server.example.com kernel: connection1:0: detected conn error (1020)
Aug 21 06:00:04 server.example.com iscsid: Kernel reported iSCSI connection 1:0 error (1020) state (3)
Aug 21 06:02:03 server.example.com kernel: device-mapper: multipath: Failing path 8:112.
2. I/O errors are detected on the disk.
Aug 21 06:02:03 server.example.com kernel: JBD2: I/O error detected when updating journal superblock for dm-5-8.
Aug 21 06:02:03 server.example.com kernel: EXT4-fs error (device dm-5) in ext4_create: Journal has aborted
Aug 21 06:02:03 server.example.com kernel: EXT4-fs (dm-5): previous I/O error to superblock detected
3. The device lost all paths, so the file system was inaccessible for that period.
Aug 21 06:02:03 server.example.com multipathd: 8:112: mark as failed
Aug 21 06:02:03 server.example.com multipathd: LOGS01: remaining active paths: 0
Aug 21 06:02:03 server.example.com multipathd: LOGS01: sdh - directio checker reports path is down
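4. The path recovered after about 26 seconds (06:02:03 to 06:02:29 in the log below), but the journal had already been aborted, so the kernel remounted the file system read-only. Administrator intervention is required.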
Aug 21 06:02:29 server.example.com multipathd: BCK01: remaining active paths: 1
Aug 21 06:02:29 server.example.com multipathd: LOGS01: sdh - directio checker reports path is up
Aug 21 06:02:29 server.example.com multipathd: 8:112: reinstated
Aug 21 06:02:29 server.example.com multipathd: LOGS01: remaining active paths: 1
Aug 21 06:06:52 server.example.com kernel: EXT4-fs error (device dm-5): ext4_journal_start_sb: Detected aborted journal
Aug 21 06:06:52 server.example.com kernel: EXT4-fs (dm-5): Remounting filesystem read-only
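The sequence above can be pulled out of a large log automatically. The following is a minimal sketch, assuming standard syslog-format lines like the ones shown; the function names `storage_events` and `path_outages` are hypothetical helpers, not part of any package. The first filters the relevant iSCSI, multipath, and ext4 messages; the second reports how long each multipath map stayed at zero active paths.

```shell
#!/bin/sh
# Hypothetical helpers; both read syslog-format text on stdin.

# Keep only the lines that trace a path failure through to the
# read-only remount (iSCSI, multipath, JBD2 and ext4 messages).
storage_events() {
    grep -E 'iscsid|multipathd|device-mapper: multipath|JBD2|EXT4-fs|detected conn error'
}

# For each multipath map, print the time between a
# "remaining active paths: 0" line and the next non-zero count.
# Assumption: both lines fall on the same day (no midnight rollover).
path_outages() {
    awk '
    /multipathd.*remaining active paths:/ {
        # Field 3 is the HH:MM:SS timestamp.
        split($3, t, ":")
        secs = t[1] * 3600 + t[2] * 60 + t[3]
        # The map name is the field just before "remaining".
        for (i = 1; i <= NF; i++) if ($i == "remaining") map = $(i - 1)
        sub(/:$/, "", map)
        n = $NF + 0
        if (n == 0) { down[map] = secs; isdown[map] = 1 }
        else if (isdown[map]) {
            printf "%s: no paths for %d s\n", map, secs - down[map]
            isdown[map] = 0
        }
    }'
}
```

Typical use would be `storage_events < /var/log/messages` or `path_outages < /var/log/messages`. The log path is an assumption and differs between distributions.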
Solution:
1. Check at the storage end why it is losing its connection to the system so frequently.
2. There is only one path to this device. This LUN should have multiple paths, so that a single path failure can be handled without an outage.
LOGS01 (36486fd23a24c7b8031c674a448e800e2) dm-4 MSFT,STORSIMPLE 8100
size=500G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 10:0:0:7 sdh 8:112 active ready running
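To spot single-path maps like this one across a host, the path lines in `multipath -ll` output can be counted per map. This is a rough sketch that assumes the output layout shown above (a header line containing the `dm-N` name, followed by path lines of the form `H:C:T:L sdX major:minor ...`); `count_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `multipath -ll` output on stdin and
# prints "<map> <number of paths>" for each map, in input order.
count_paths() {
    awk '
    # Header line, e.g. "LOGS01 (<wwid>) dm-4 MSFT,STORSIMPLE 8100"
    / dm-[0-9]+ / { map = $1; count[map] = 0; order[++n] = map; next }
    # Path line, e.g. "`- 10:0:0:7 sdh 8:112 active ready running"
    /[0-9]+:[0-9]+:[0-9]+:[0-9]+ +sd[a-z]+ / && map != "" { count[map]++ }
    END { for (i = 1; i <= n; i++) printf "%s %d\n", order[i], count[order[i]] }'
}
```

Run as root with `multipath -ll | count_paths`. Any map that reports 1 has no redundancy, so a single path failure takes the device offline.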
Workaround:
[root@server.example.com dev]# umount /STORLOGS
umount: /STORLOGS: device is busy
umount: /STORLOGS: device is busy
[root@server.example.com dev]# fuser -ck /STORLOGS
/STORLOGS: 8629c
[root@server.example.com dev]# umount /STORLOGS
[root@server.example.com dev]# mount /STORLOGS
[root@server.example.com dev]# cd /STORLOGS
[root@server.example.com STORLOGS]# touch ram
[root@server.example.com STORLOGS]# rm ram
rm: remove regular empty file `ram'? y
[root@server.example.com STORLOGS]#
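Instead of testing with `touch`, the mount state can be read from `/proc/mounts`, which is convenient for a monitoring check. The sketch below assumes the standard `/proc/mounts` field layout (device, mount point, type, options); `is_readonly` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given mount point is mounted
# read-only, 1 otherwise (including when it is not mounted at all).
# Reads /proc/mounts-format text on stdin.
is_readonly() {
    awk -v mp="$1" '
    $2 == mp {
        # Field 4 is the comma-separated option list; the last
        # matching entry wins, as it is the mount currently on top.
        n = split($4, opt, ",")
        ro = 0
        for (i = 1; i <= n; i++) if (opt[i] == "ro") ro = 1
    }
    END { exit (ro ? 0 : 1) }'
}
```

For example: `is_readonly /STORLOGS < /proc/mounts && echo "/STORLOGS is read-only"`. The session above remounts directly. Because the journal was aborted, it is also common practice to run `e2fsck` on the unmounted device before mounting it again, although that step is not part of the original session.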