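Adjusting the RAID Controller Cache Policy

The RAID controller's cache policy can be changed from No Write Cache if bad BBU to Write Cache OK if bad BBU, so that the cache stays enabled while the battery goes through a charge/discharge cycle and I/O performance is preserved. This carries a risk of data loss, however, so evaluate it carefully before making the change.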

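Detailed explanation

  The RAID controller in a server comes with a rechargeable battery (the BBU, battery backup unit). The battery slowly self-discharges even when it is not in use. Once its charge drops to a certain level, the RAID controller runs a "discharge" to drain the remaining charge and then a "charge" to refill it. This protects the battery and helps guarantee the availability of the RAID controller.

  By default, when the battery charge falls below a threshold, the RAID firmware treats the battery as unavailable and, to keep data safe, disables the RAID cache. The default is reasonable in itself, but once the cache is disabled the I/O capability of the RAID drops sharply. A charge/discharge cycle (discharge, then charge) can typically last several hours. For I/O-intensive applications the resulting slowdown can be fatal: I/O latency rises, queues back up, and the whole system may slow down or even go down.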

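There are two ways to address this problem:

Note: the operations below apply to servers with LSI-based MegaRAID controllers.

  • Method 1: Check the battery status to keep track of its charge/discharge cycles; manual cycles can also be scheduled at planned times.

Check the battery's charge/discharge (learn) cycle schedule: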

MegaCli -AdpBbuCmd -getBbuProperties -aALL|egrep 'Period|Next'

Sample output:

    Auto Learn Period: 27 Days
    Next Learn time: Tue Sep 18 05:52:27 2018
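To keep track of the schedule from a script, the two fields in the sample output can be pulled out with awk. The following is only a minimal sketch: the helper name learn_schedule is hypothetical (it is not part of MegaCli), and it assumes the output keeps the Auto Learn Period and Next Learn time labels shown in the sample.

```shell
# learn_schedule: hypothetical helper, reads BBU properties text on stdin
# and prints the learn period and the next learn time as key=value lines.
learn_schedule() {
    awk '
        /Auto Learn Period/ { sub(/^[^:]*:[ \t]*/, ""); period = $0 }
        /Next Learn time/   { sub(/^[^:]*:[ \t]*/, ""); next_time = $0 }
        END { printf "period=%s\nnext=%s\n", period, next_time }
    '
}

# Demo on the sample output from this article (no RAID controller needed).
sample='    Auto Learn Period: 27 Days
    Next Learn time: Tue Sep 18 05:52:27 2018'
printf '%s\n' "$sample" | learn_schedule
```

On a real server the function would presumably be fed from the command itself, for example MegaCli -AdpBbuCmd -getBbuProperties -aALL | learn_schedule.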

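Force a manual charge/discharge (learn) cycle:

MegaCli -AdpBbuCmd -BbuLearn -a0

  • Method 2: Change the RAID controller policy so that the cache is not disabled during a charge/discharge cycle.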

Check the RAID controller's current cache policy:

MegaCli -LDGetProp -Cache -LAll -aAll

Sample output:

    Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 4(target id: 4): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 5(target id: 5): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 6(target id: 6): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 7(target id: 7): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 8(target id: 8): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 9(target id: 9): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 10(target id: 10): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU

    Exit Code: 0x00

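Note: this server has 11 VDs, so 11 lines are shown. The cache policy is No Write Cache if bad BBU, which means the write cache is disabled while the battery is charging or discharging.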
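With many VDs it can be easier to list only those still on the No Write Cache if bad BBU policy. The sketch below assumes the line format of the sample output above; the function name vds_without_bad_bbu_cache is made up for illustration and is not a MegaCli feature.

```shell
# vds_without_bad_bbu_cache: hypothetical helper, reads the output of
# "MegaCli -LDGetProp -Cache -LAll -aAll" on stdin and prints the number
# of every VD whose policy is "No Write Cache if bad BBU".
vds_without_bad_bbu_cache() {
    sed -n 's/^ *Adapter [0-9]*-VD \([0-9]*\)(target id: [0-9]*): Cache Policy:.*No Write Cache if bad BBU.*$/\1/p'
}

# Demo on two lines in the format shown in this article.
sample='Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU'
printf '%s\n' "$sample" | vds_without_bad_bbu_cache
```

An empty result would suggest that every VD already keeps its write cache enabled when the BBU is reported as bad.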

Adjust the cache policy so that the write cache is not disabled during a charge/discharge cycle:

MegaCli -LDSetProp CachedBadBBU -lall -a0

Sample output:

    Set Write Cache OK if bad BBU on Adapter 0, VD 0 (target id: 0) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 1 (target id: 1) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 2 (target id: 2) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 3 (target id: 3) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 4 (target id: 4) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 5 (target id: 5) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 6 (target id: 6) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 7 (target id: 7) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 8 (target id: 8) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 9 (target id: 9) success
    Set Write Cache OK if bad BBU on Adapter 0, VD 10 (target id: 10) success

Confirm the result by checking the RAID controller's current cache policy again:

    Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 4(target id: 4): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 5(target id: 5): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 6(target id: 6): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 7(target id: 7): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 8(target id: 8): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 9(target id: 9): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 10(target id: 10): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU

    Exit Code: 0x00

Note:

To restore the original cache policy, run the following command:

MegaCli -LDSetProp NoCachedBadBBU -lall -a0

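Worked example

Our ELK machines all use 2 disks in RAID 1 as the system disk, plus 10 data disks each configured as a single-disk RAID 0. Since the cache policy of all VDs was set to CachedBadBBU earlier, we now turn CachedBadBBU off for the system disk to keep its data safe.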

    # Set the cache policy of VD 0, which holds the system disk, to NoCachedBadBBU
    MegaCli -LDSetProp NoCachedBadBBU -l0 -a0

    Set No Write Cache if bad BBU on Adapter 0, VD 0 (target id: 0) success

    Exit Code: 0x00

    # Check VD 0, which holds the system disk
    MegaCli -LDGetProp -Cache -L0 -aAll

    Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU

    Exit Code: 0x00

    # Check all VDs
    MegaCli -LDGetProp -Cache -LAll -aAll

    Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
    Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 2(target id: 2): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 3(target id: 3): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 4(target id: 4): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 5(target id: 5): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 6(target id: 6): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 7(target id: 7): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 8(target id: 8): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 9(target id: 9): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU
    Adapter 0-VD 10(target id: 10): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU

    Exit Code: 0x00
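The intended end state here (VD 0 protected, every other VD keeping its write cache) can be checked mechanically. The following is a rough sketch under that assumption only; check_policy_layout is a hypothetical name, and the rule "VD 0 is the system disk" comes from this particular server layout, not from MegaCli.

```shell
# check_policy_layout: hypothetical helper, reads the output of
# "MegaCli -LDGetProp -Cache -LAll -aAll" on stdin.
# Expects VD 0 to be "No Write Cache if bad BBU" and all other VDs to be
# "Write Cache OK if bad BBU"; prints each mismatch and returns non-zero.
check_policy_layout() {
    awk '
        /Cache Policy:/ {
            # Extract the VD number from "Adapter 0-VD 3(target id: 3): ..."
            vd = $0
            sub(/^.*-VD /, "", vd)
            sub(/\(.*$/, "", vd)
            protected = ($0 ~ /No Write Cache if bad BBU/) ? 1 : 0
            want      = (vd + 0 == 0) ? 1 : 0
            if (protected != want) {
                printf "VD %s: unexpected policy\n", vd
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Demo on two lines matching the layout described in this example.
sample='Adapter 0-VD 0(target id: 0): Cache Policy:WriteBack, ReadAhead, Cached, No Write Cache if bad BBU
Adapter 0-VD 1(target id: 1): Cache Policy:WriteBack, ReadAhead, Cached, Write Cache OK if bad BBU'
printf '%s\n' "$sample" | check_policy_layout && echo "layout OK"
```

A check like this could be run after a firmware update or controller replacement, when policies may have been reset.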

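I/O scheduler

The current default is cfq, a middle-of-the-road algorithm. For SSDs it can be changed to noop; for spinning disks, benchmark other schedulers such as deadline against each application's workload. For databases and similar applications, deadline is recommended to avoid request starvation.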
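On Linux the scheduler of a block device is exposed in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. The snippet below is a small sketch for reading that value; active_scheduler is a hypothetical helper name, and sda in the usage note is only a placeholder device.

```shell
# active_scheduler: hypothetical helper, reads the contents of a
# queue/scheduler file on stdin and prints the bracketed (active) entry.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Demo on a line in the usual sysfs format.
echo 'noop deadline [cfq]' | active_scheduler
```

On a real machine this would be used as active_scheduler < /sys/block/sda/queue/scheduler, and the scheduler can be switched at runtime with echo deadline > /sys/block/sda/queue/scheduler (as root; the change does not persist across reboots).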

File system journal

The file system journal is enabled by default and can be left unchanged for now.

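Disk mount options

To improve disk I/O performance, consider changing the mount options to async,noatime,data=writeback,barrier=0,nobh

Parameter meanings:

  • async: I/O to the file system is done asynchronously (this is the normal default).
  • noatime: inode access times are not updated on reads.
  • data=writeback: journaling mode in which only metadata is journaled and data ordering is not preserved; fast, but stale data may show up in files after a crash.
  • barrier=0: write barriers are disabled; generally considered safe only when the write cache is battery-backed.
  • nobh: do not attach buffer heads to data pages.

Note that data=writeback, barrier=0 and nobh are documented as ext3/ext4 options, while the data disks in the example below are XFS, so check the mount documentation for your file system to confirm which of these options actually take effect.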

Worked example

Adjust the mount options of the data disks on the ELK server

    [root@ELK-133-10 ~]# mount|grep data
    /dev/sdc1 on /data1 type xfs (rw,noatime,nodiratime)
    /dev/sdd1 on /data2 type xfs (rw,noatime,nodiratime)
    /dev/sde1 on /data3 type xfs (rw,noatime,nodiratime)
    /dev/sdf1 on /data4 type xfs (rw,noatime,nodiratime)
    /dev/sdg1 on /data5 type xfs (rw,noatime,nodiratime)
    /dev/sdb1 on /data6 type xfs (rw,noatime,nodiratime,barrier=1)
    [root@ELK-133-10 ~]#

    # Generate the remount commands
    [root@ELK-133-10 ~]# mount|grep data|awk '{print "mount "$1" "$3" -o remount,rw,noatime,data=writeback,barrier=0,nobh"}'
    mount /dev/sdc1 /data1 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    mount /dev/sdd1 /data2 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    mount /dev/sde1 /data3 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    mount /dev/sdf1 /data4 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    mount /dev/sdg1 /data5 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    mount /dev/sdb1 /data6 -o remount,rw,noatime,data=writeback,barrier=0,nobh
    [root@ELK-133-10 ~]#

    # Run the remount commands
    [root@ELK-133-10 ~]# mount|grep data|awk '{print "mount "$1" "$3" -o remount,rw,noatime,data=writeback,barrier=0,nobh"}'|bash
    [root@ELK-133-10 ~]#

    # Confirm the remount result
    [root@ELK-133-10 ~]# mount|grep data
    /dev/sdc1 on /data1 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    /dev/sdd1 on /data2 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    /dev/sde1 on /data3 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    /dev/sdf1 on /data4 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    /dev/sdg1 on /data5 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    /dev/sdb1 on /data6 type xfs (rw,noatime,data=writeback,barrier=0,nobh)
    [root@ELK-133-10 ~]#
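One caveat with the pipeline above is that grep data matches any line containing "data", including mount options such as data=writeback on unrelated file systems. A slightly stricter variant might match on the mount point instead. The sketch below is only an illustration: gen_remount is a hypothetical helper, and the /dataN mount point pattern is an assumption taken from this server's layout.

```shell
# gen_remount: hypothetical helper, reads `mount` output on stdin and
# prints one remount command per mount point named /data<number>.
# $1 is the comma-separated option list to apply.
gen_remount() {
    awk -v opts="$1" '
        $2 == "on" && $3 ~ /^\/data[0-9]+$/ {
            printf "mount %s %s -o remount,%s\n", $1, $3, opts
        }
    '
}

# Demo on lines in the format shown in this article; the root file system
# line is skipped even though its options contain the word "data".
sample='/dev/sda1 on / type ext4 (rw,data=ordered)
/dev/sdc1 on /data1 type xfs (rw,noatime,nodiratime)'
printf '%s\n' "$sample" | gen_remount 'rw,noatime'
```

On a real server the input would come from mount | gen_remount '<options>'; reviewing the generated commands before piping them to bash keeps the change easy to audit.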

Performance comparison

(to be added)

Reposted from: https://www.cnblogs.com/thatsit/p/ci-panio-xing-neng-you-huashi-jian.html

01-04 22:58