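Circuit breaker
When calling a service through Hystrix, failures are inevitable, for example when the remote service is unavailable. Continuing to call it in that situation is pointless. Hystrix can be configured with a circuit breaker: once errors reach a certain level, it automatically invokes the fallback method instead of the run method.
Configuration
The following parameters can be set through the CommandPropertiesDefaults in the Command's constructor: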
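circuitBreakerRequestVolumeThreshold; // minimum number of requests within the time window (10 s by default) before the circuit can trip; default 20
circuitBreakerSleepWindowInMilliseconds; // how long to wait after the circuit trips before allowing a retry; default 5000 ms (5 s)
circuitBreakerEnabled; // whether the circuit breaker is enabled; default true
circuitBreakerErrorThresholdPercentage; // error percentage within the time window that trips the circuit; default 50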
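As a rough illustration of where these settings go, the sketch below sets all four of them in a command's constructor. It assumes the Hystrix library is on the classpath. The class name RemoteCallCommand, the group key "RemoteGroup", and the bodies of run and getFallback are made up for this example, and the values are simply the defaults listed above.

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;

// Hypothetical command, used only to show where the circuit breaker settings are passed in.
public class RemoteCallCommand extends HystrixCommand<String> {

    public RemoteCallCommand() {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteGroup"))
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        .withCircuitBreakerEnabled(true)
                        .withCircuitBreakerRequestVolumeThreshold(20)
                        .withCircuitBreakerSleepWindowInMilliseconds(5000)
                        .withCircuitBreakerErrorThresholdPercentage(50)));
    }

    @Override
    protected String run() {
        // Placeholder for the real remote call.
        return "ok";
    }

    @Override
    protected String getFallback() {
        // Returned when run fails or the circuit is open.
        return "fallback";
    }
}
```

Calling new RemoteCallCommand().execute() then returns the result of run, or the fallback value when run fails or the circuit is open.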
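From these settings the behavior can be roughly inferred: within a time window, once the number of requests reaches the volume threshold and the error rate reaches the error threshold, the circuit breaker calls the fallback directly. After the sleep window it lets a trial request through. If that request succeeds, the circuit closes and the statistics window starts over; if it fails, the fallback keeps being called, and the cycle repeats.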
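To make the two thresholds concrete, here is a small worked sketch of the trip decision using the default values. TripRule and shouldTrip are hypothetical names invented for this example and are not part of Hystrix; the sketch only mirrors the two checks described above.

```java
// Hypothetical helper, not Hystrix code: mirrors the volume check and the error-rate check.
final class TripRule {

    static boolean shouldTrip(long totalRequests, long errorCount,
                              int requestVolumeThreshold, int errorThresholdPercentage) {
        // Too little traffic in the window to judge (also guards the division below).
        if (totalRequests <= 0 || totalRequests < requestVolumeThreshold) {
            return false;
        }
        int errorPercentage = (int) (errorCount * 100 / totalRequests);
        return errorPercentage >= errorThresholdPercentage;
    }

    public static void main(String[] args) {
        System.out.println(shouldTrip(19, 19, 20, 50)); // false: only 19 requests, below the volume threshold
        System.out.println(shouldTrip(20, 9, 20, 50));  // false: 45% errors
        System.out.println(shouldTrip(20, 10, 20, 50)); // true: 50% errors
    }
}
```

Note that 19 failures out of 19 requests do not trip the circuit, because the volume threshold is checked first.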
Implementation
Hystrix's default circuit breaker implementation is HystrixCircuitBreakerImpl.
// Core logic: decides whether a request may proceed. Returning false sends the call straight to the fallback; returning true lets run execute.
@Override
public boolean allowRequest() {
    // Is the circuit forced open? Controlled by the circuitBreakerForceOpen property
    if (properties.circuitBreakerForceOpen().get()) {
        // properties have asked us to force the circuit open so we will allow NO requests
        return false;
    }
    // Is the circuit forced closed? Controlled by the circuitBreakerForceClosed property
    if (properties.circuitBreakerForceClosed().get()) {
        // we still want to allow isOpen() to perform its calculations so we simulate normal behavior
        isOpen();
        // properties have asked us to ignore errors so we will ignore the results of isOpen and just allow all traffic through
        return true;
    }
    // Combine isOpen and allowSingleTest: the former checks request volume and error rate, the latter lets a single trial request through once the sleep window has elapsed
    return !isOpen() || allowSingleTest();
}
// Determines whether the circuit has tripped (is open)
@Override
public boolean isOpen() {
    if (circuitOpen.get()) {
        // if we're open we immediately return true and don't bother attempting to 'close' ourself as that is left to allowSingleTest and a subsequent successful test to close
        return true;
    }
    // healthCounts is backed by a subject that records request statistics; under the hood it is built on RxJava
    // we're closed, so let's see if errors have made us so we should trip the circuit open
    HealthCounts health = metrics.getHealthCounts();
    // Check whether the request volume threshold has been reached
    // check if we are past the statisticalWindowVolumeThreshold
    if (health.getTotalRequests() < properties.circuitBreakerRequestVolumeThreshold().get()) {
        // we are not past the minimum volume threshold for the statisticalWindow so we'll return false immediately and not calculate anything
        return false;
    }
    // Check whether the error-rate threshold has been reached
    if (health.getErrorPercentage() < properties.circuitBreakerErrorThresholdPercentage().get()) {
        return false;
    } else {
        // Open the circuit, using CAS to handle contention between threads
        // our failure rate is too high, trip the circuit
        if (circuitOpen.compareAndSet(false, true)) {
            // if the previousValue was false then we want to set the currentTime
            circuitOpenedOrLastTestedTime.set(System.currentTimeMillis());
            return true;
        } else {
            // How could previousValue be true? If another thread was going through this code at the same time a race-condition could have
            // caused another thread to set it to true already even though we were in the process of doing the same
            // In this case, we know the circuit is open, so let the other thread set the currentTime and report back that the circuit is open
            return true;
        }
    }
}
// Once the circuit has been open for a while, a retry is needed; allowRequest relies on the short-circuit behavior of || so this time check only runs when the circuit is open
public boolean allowSingleTest() {
    // Read the time at which the circuit was opened or last tested
    long timeCircuitOpenedOrWasLastTested = circuitOpenedOrLastTestedTime.get();
    // 1) if the circuit is open
    // 2) and it's been longer than 'sleepWindow' since we opened the circuit
    // If the circuit is open and more than the sleep window has elapsed, allow a test request; to avoid races between threads, the timestamp is also updated with CAS
    if (circuitOpen.get() && System.currentTimeMillis() > timeCircuitOpenedOrWasLastTested + properties.circuitBreakerSleepWindowInMilliseconds().get()) {
        // We push the 'circuitOpenedTime' ahead by 'sleepWindow' since we have allowed one request to try.
        // If it succeeds the circuit will be closed, otherwise another singleTest will be allowed at the end of the 'sleepWindow'.
        if (circuitOpenedOrLastTestedTime.compareAndSet(timeCircuitOpenedOrWasLastTested, System.currentTimeMillis())) {
            // if this returns true that means we set the time so we'll return true to allow the singleTest
            // if it returned false it means another thread raced us and allowed the singleTest before we did
            return true;
        }
    }
    return false;
}
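The interplay between the open flag, the sleep window, and the single trial request is easier to follow when it can be stepped through with a controllable clock. The following is a minimal, standalone sketch of that cycle and not the Hystrix implementation: SketchBreaker, trip, and markSuccess are names chosen for this example, the error-rate check is replaced by an explicit trip() call, and the metrics reset that happens when the circuit closes is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Simplified model of the open / trial / close cycle (hypothetical, stdlib only).
final class SketchBreaker {
    private final AtomicBoolean open = new AtomicBoolean(false);
    private final AtomicLong openedOrLastTested = new AtomicLong();
    private final long sleepWindowMs;
    private final LongSupplier clock; // injectable so the example does not need to sleep

    SketchBreaker(long sleepWindowMs, LongSupplier clock) {
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
    }

    // Stands in for the error-rate check: only the thread that wins the CAS records the time.
    void trip() {
        if (open.compareAndSet(false, true)) {
            openedOrLastTested.set(clock.getAsLong());
        }
    }

    // Called when a trial request succeeds: close the circuit again.
    void markSuccess() {
        open.set(false);
    }

    boolean allowRequest() {
        // || short-circuits, so the time check only runs while the circuit is open.
        return !open.get() || allowSingleTest();
    }

    private boolean allowSingleTest() {
        long last = openedOrLastTested.get();
        long now = clock.getAsLong();
        // Only one thread per sleep window wins the CAS and gets to send the trial request.
        return open.get() && now > last + sleepWindowMs
                && openedOrLastTested.compareAndSet(last, now);
    }

    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0);
        SketchBreaker breaker = new SketchBreaker(5000, now::get);
        breaker.trip();
        System.out.println(breaker.allowRequest()); // false: circuit open, sleep window not over
        now.set(5001);
        System.out.println(breaker.allowRequest()); // true: the single trial request
        System.out.println(breaker.allowRequest()); // false: trial already handed out for this window
        breaker.markSuccess();
        System.out.println(breaker.allowRequest()); // true: circuit closed again
    }
}
```

Passing the clock in as a LongSupplier is purely a convenience for the example; the original code reads System.currentTimeMillis() directly.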
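Summary
Hystrix's circuit breaker cuts down on useless calls while errors are occurring. The key piece is HealthCounts, which records the error rate and the request volume. These figures are not updated in real time but at a fixed interval, which is implemented with RxJava; a later post will look into RxJava.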
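The "not real-time" property can be pictured with a small sketch in which readers only ever see a snapshot that is refreshed at most once per interval. This is an illustration under stated assumptions, not how Hystrix does it: it uses plain synchronized Java instead of RxJava, it has no rolling window, and SnapshotHealth with its methods is invented for this example.

```java
// Hypothetical stand-in for interval-based health statistics (no RxJava, no rolling window).
final class SnapshotHealth {
    private final long intervalMs;
    private long total;          // live counters, updated on every request
    private long errors;
    private long snapTotal;      // last published snapshot, which is what readers see
    private long snapErrors;
    private long lastSnapshotAt;

    SnapshotHealth(long intervalMs, long startMs) {
        this.intervalMs = intervalMs;
        this.lastSnapshotAt = startMs;
    }

    synchronized void record(boolean success) {
        total++;
        if (!success) {
            errors++;
        }
    }

    // Returns the error percentage of the last snapshot, refreshing it once the interval has passed.
    synchronized int errorPercentage(long nowMs) {
        if (nowMs - lastSnapshotAt >= intervalMs) {
            snapTotal = total;
            snapErrors = errors;
            lastSnapshotAt = nowMs;
        }
        return snapTotal == 0 ? 0 : (int) (snapErrors * 100 / snapTotal);
    }

    public static void main(String[] args) {
        SnapshotHealth health = new SnapshotHealth(500, 0);
        health.record(false);
        health.record(false);
        System.out.println(health.errorPercentage(100)); // 0: the failures are not visible yet
        System.out.println(health.errorPercentage(500)); // 100: snapshot refreshed
    }
}
```

The consequence for the circuit breaker is that a burst of failures only influences the trip decision after the next snapshot is taken.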