Overview
Service governance addresses the challenges of managing distributed services: who provides a service, who consumes it, where to register, how to handle failures, how to ensure quality, and how to degrade gracefully under pressure.
Key areas of microservice governance:
- Service Registration & Discovery: After decomposing a monolith into microservices, callers need to locate target service addresses dynamically.
- Observability: Call topology, metrics, logging, and distributed tracing.
- Traffic Management: Canary releases, blue-green deployments, A/B testing.
- Fault Tolerance: Circuit breaking, isolation, rate limiting, fallback, and timeout mechanisms.
- Security: Authentication and authorization between services.
- Control: Real-time distribution of governance policies.
- Node Health: Detect and remove unhealthy instances from the cluster.
Service Registration & Discovery
Silky supports automatic service registration and discovery using Zookeeper, Nacos, or Consul as the registry center. Instance online/offline status is detected in real time.
- When a service instance starts, it registers or updates its service metadata (and endpoint address) in the registry.
- With Zookeeper or Nacos, Silky uses a pub-sub model to receive the latest metadata and endpoint changes immediately.
- With Consul, Silky polls the registry on a heartbeat interval.
- When IO or communication errors occur during RPC, the failing instance is marked unhealthy and removed after Governance:UnHealthAddressTimesAllowedBeforeRemoving failures (0 = immediate removal).
- Long connections support heartbeat (Rpc:EnableHeartbeat = true). Heartbeat failures trigger the same unhealthy removal logic.
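The removal rule described above can be sketched as a per-address failure counter. This is a minimal illustration, not Silky's implementation: the class name `UnhealthyAddressTracker` and its methods are hypothetical, and resetting the counter on a successful call is an assumption the text above does not confirm.

```csharp
using System.Collections.Concurrent;

// Hypothetical helper (not part of Silky): counts failures per address and
// reports when an address has reached the removal threshold.
public sealed class UnhealthyAddressTracker
{
    private readonly int _timesAllowedBeforeRemoving;
    private readonly ConcurrentDictionary<string, int> _failures =
        new ConcurrentDictionary<string, int>();

    public UnhealthyAddressTracker(int timesAllowedBeforeRemoving)
    {
        _timesAllowedBeforeRemoving = timesAllowedBeforeRemoving;
    }

    // Records one IO/communication or heartbeat failure.
    // Returns true when the address should now be removed.
    public bool ReportFailure(string address)
    {
        // A threshold of 0 means the first failure removes the address.
        if (_timesAllowedBeforeRemoving <= 0)
        {
            return true;
        }

        int count = _failures.AddOrUpdate(address, 1, (_, current) => current + 1);
        if (count >= _timesAllowedBeforeRemoving)
        {
            _failures.TryRemove(address, out _);
            return true;
        }

        return false;
    }

    // Assumption: a successful call clears the failure count for the address.
    public void ReportSuccess(string address)
    {
        _failures.TryRemove(address, out _);
    }
}
```

With a threshold of 3, the first two failures for an address return false and the third returns true; with a threshold of 0, every failure returns true.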
Load Balancing
Silky supports four load balancing strategies:
| Strategy | Description |
|---|---|
| Polling (default) | Round-robin across all healthy instances |
| Random | Randomly select a healthy instance |
| HashAlgorithm | Consistent hash: the same parameter always routes to the same instance |
| Appoint | Direct routing to a specific address (framework internal use only) |
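To make the difference between Polling and HashAlgorithm concrete, here is a rough sketch of both selection techniques. It is not Silky's code: `RoundRobinSelector`, `HashRingSelector`, and `Fnv1a` are hypothetical names, and the use of FNV-1a with 100 virtual nodes per address is an illustrative assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

// Stable 32-bit FNV-1a hash. string.GetHashCode is randomized per process
// on modern .NET, so it is unsuitable for routing decisions.
public static class Fnv1a
{
    public static uint Hash(string text)
    {
        uint hash = 2166136261;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            hash ^= b;
            hash = unchecked(hash * 16777619);
        }
        return hash;
    }
}

// Polling: hand out healthy instances in turn.
public sealed class RoundRobinSelector
{
    private int _next = -1;

    public string Select(IReadOnlyList<string> healthy)
    {
        if (healthy.Count == 0) throw new InvalidOperationException("No healthy instances.");
        // Interlocked.Increment wraps on overflow; the uint cast keeps the index non-negative.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _next));
        return healthy[(int)(ticket % (uint)healthy.Count)];
    }
}

// Consistent hashing: each address owns several points on a ring, and a key
// is routed to the owner of the first point at or after the key's hash.
public sealed class HashRingSelector
{
    private readonly uint[] _points;
    private readonly string[] _owners;

    public HashRingSelector(IEnumerable<string> addresses, int virtualNodes = 100)
    {
        var ring = new SortedDictionary<uint, string>();
        foreach (string address in addresses)
        {
            for (int i = 0; i < virtualNodes; i++)
            {
                ring[Fnv1a.Hash(address + "#" + i)] = address;
            }
        }
        _points = new uint[ring.Count];
        _owners = new string[ring.Count];
        ring.Keys.CopyTo(_points, 0);     // keys come out in ascending order
        ring.Values.CopyTo(_owners, 0);   // values follow the same order
    }

    public string Select(string hashKey)
    {
        if (_points.Length == 0) throw new InvalidOperationException("No instances on the ring.");
        int index = Array.BinarySearch(_points, Fnv1a.Hash(hashKey));
        if (index < 0) index = ~index;            // first point greater than the hash
        if (index == _points.Length) index = 0;   // wrap around the ring
        return _owners[index];
    }
}
```

The ring is the reason the same hash key keeps landing on the same instance while the instance set is unchanged, and why removing one instance only remaps the keys that instance owned.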
Global Configuration
governance:
  shuntStrategy: Random
Per-Method Override
[HttpGet("{name}")]
[Governance(ShuntStrategy = ShuntStrategy.HashAlgorithm)]
Task<TestOut> Get([HashKey] string name);
When using HashAlgorithm, mark the parameter used for hashing with [HashKey]:
Task<OrderOutput> GetOrderAsync([HashKey] long orderId);
Direct Invocation via IAppointAddressInvoker
public class MyService : IMyService, IScopedDependency
{
    private readonly IAppointAddressInvoker _appointAddressInvoker;

    public MyService(IAppointAddressInvoker appointAddressInvoker)
    {
        _appointAddressInvoker = appointAddressInvoker;
    }

    public async Task<MyOutput> CallSpecificInstance(MyInput input)
    {
        // Route the call to one explicit address instead of using load balancing.
        return await _appointAddressInvoker.Invoke<MyOutput>(
            "192.168.1.100:2200",
            "Your.Service.Entry.Id_Get",
            new object[] { input });
    }
}
Timeout Control
Configure RPC timeout globally or per method:
governance:
  timeoutMillSeconds: 5000 # 5 seconds (0 = unlimited)
Per-method override:
[Governance(TimeoutMillSeconds = 3000)]
Task<OrderOutput> CreateAsync(CreateOrderInput input);
Failover (Retry)
On IO or communication errors, Silky automatically selects another healthy instance and retries:
governance:
  retryTimes: 3 # number of retries
  retryIntervalMillSeconds: 50 # delay between retries (ms)
Warning
Retries only trigger on infrastructure exceptions (IO errors, connection failures). Business logic exceptions (UserFriendlyException, validation errors) do NOT trigger a retry.
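The distinction between retryable and non-retryable failures can be illustrated with a small sketch. It is not Silky's implementation: `FailoverSketch` is a hypothetical name, `IOException` stands in for the whole class of infrastructure exceptions, and picking the next address by attempt number is an assumption about how "another healthy instance" is chosen.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class FailoverSketch
{
    // Makes one initial attempt plus up to retryTimes retries, moving to the
    // next address after each infrastructure failure.
    public static async Task<T> InvokeAsync<T>(
        IReadOnlyList<string> healthyAddresses,
        Func<string, Task<T>> call,
        int retryTimes,
        int retryIntervalMillSeconds)
    {
        if (healthyAddresses.Count == 0)
        {
            throw new InvalidOperationException("No healthy instances.");
        }

        for (int attempt = 0; ; attempt++)
        {
            string address = healthyAddresses[attempt % healthyAddresses.Count];
            try
            {
                return await call(address);
            }
            // Only infrastructure errors are retried, and only while retries
            // remain. Any other exception (for example a business exception)
            // propagates to the caller immediately.
            catch (IOException) when (attempt < retryTimes)
            {
                await Task.Delay(retryIntervalMillSeconds);
            }
        }
    }
}
```

With `retryTimes` set to 3, a call that keeps failing with an IO error is attempted four times in total before the last exception reaches the caller.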
Circuit Breaking
Silky uses Polly to implement circuit breaking:
governance:
  enableCircuitBreaker: true
  exceptionsAllowedBeforeBreaking: 3 # consecutive exceptions to open the circuit
  breakerSeconds: 60 # seconds the circuit stays open
When the circuit is open, all calls to that service entry immediately fail fast without attempting a network request.
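The behavior controlled by these two settings can be sketched with a hand-written breaker. This is deliberately not Polly's implementation: `SimpleCircuitBreaker` and `CircuitOpenException` are hypothetical names, and the sketch omits the half-open trial state a production breaker normally has, so the circuit simply closes again once the break duration has elapsed.

```csharp
using System;

public sealed class CircuitOpenException : Exception
{
    public CircuitOpenException() : base("Circuit is open; failing fast.") { }
}

public sealed class SimpleCircuitBreaker
{
    private readonly int _exceptionsAllowedBeforeBreaking;
    private readonly TimeSpan _breakDuration;
    private readonly Func<DateTimeOffset> _clock;   // injectable for testing
    private readonly object _gate = new object();
    private int _consecutiveFailures;
    private DateTimeOffset _openUntil = DateTimeOffset.MinValue;

    public SimpleCircuitBreaker(
        int exceptionsAllowedBeforeBreaking,
        TimeSpan breakDuration,
        Func<DateTimeOffset> clock)
    {
        _exceptionsAllowedBeforeBreaking = exceptionsAllowedBeforeBreaking;
        _breakDuration = breakDuration;
        _clock = clock;
    }

    public T Execute<T>(Func<T> action)
    {
        lock (_gate)
        {
            // While open, reject without invoking the action at all.
            if (_clock() < _openUntil) throw new CircuitOpenException();
        }

        try
        {
            T result = action();
            lock (_gate) { _consecutiveFailures = 0; }   // success resets the streak
            return result;
        }
        catch (Exception)
        {
            lock (_gate)
            {
                _consecutiveFailures++;
                if (_consecutiveFailures >= _exceptionsAllowedBeforeBreaking)
                {
                    _openUntil = _clock() + _breakDuration;   // open the circuit
                    _consecutiveFailures = 0;
                }
            }
            throw;
        }
    }
}
```

Constructed as `new SimpleCircuitBreaker(3, TimeSpan.FromSeconds(60), () => DateTimeOffset.UtcNow)`, the sketch mirrors the configuration values shown above: three consecutive exceptions open the circuit, and calls fail fast for the next 60 seconds.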
RPC Concurrent Rate Limiting
Limit the number of concurrent RPC requests handled by a single instance:
governance:
  maxConcurrentHandlingCount: 50 # excess requests are routed to other instances
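One common way to enforce such a cap is a non-blocking semaphore. The sketch below only illustrates that idea under stated assumptions: `ConcurrencyGate` is a hypothetical name, and how Silky actually tracks in-flight requests or redirects the excess is not shown in this document.

```csharp
using System.Threading;

// Hypothetical per-instance gate: admits at most maxConcurrentHandlingCount
// requests at a time and rejects the rest without waiting.
public sealed class ConcurrencyGate
{
    private readonly SemaphoreSlim _slots;

    public ConcurrencyGate(int maxConcurrentHandlingCount)
    {
        _slots = new SemaphoreSlim(maxConcurrentHandlingCount, maxConcurrentHandlingCount);
    }

    // Returns false immediately when every slot is taken, so the caller can
    // route the request to a different instance instead of queueing.
    public bool TryEnter()
    {
        return _slots.Wait(0);
    }

    // Must be called exactly once for every successful TryEnter.
    public void Exit()
    {
        _slots.Release();
    }
}
```

A handler would call `TryEnter()` before processing a request and `Exit()` in a `finally` block afterwards, so that a slot is returned even when processing throws.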
HTTP Rate Limiting
Integrate AspNetCoreRateLimit for external HTTP rate limiting at the gateway:
ipRateLimiting:
  enableEndpointRateLimiting: true
  generalRules:
    - endpoint: "*"
      period: 1s
      limit: 100
Service Fallback
Define a fallback class to handle RPC failures gracefully:
[ServiceRoute]
[Fallback(typeof(InventoryAppServiceFallback))]
public interface IInventoryAppService
{
    Task<int> GetStockAsync(long productId);
}

public class InventoryAppServiceFallback : IInventoryAppService
{
    public Task<int> GetStockAsync(long productId)
    {
        // Return a safe default when the remote call fails
        return Task.FromResult(-1);
    }
}
Global Governance Configuration Reference
governance:
  shuntStrategy: Polling # load balancing strategy
  timeoutMillSeconds: 5000 # RPC timeout (ms); 0 = unlimited
  enableCachingInterceptor: true # enable cache interception
  retryTimes: 3 # failover retry count
  retryIntervalMillSeconds: 50 # delay between retries (ms)
  enableCircuitBreaker: true # enable circuit breaking
  exceptionsAllowedBeforeBreaking: 3 # consecutive exceptions to open the circuit
  breakerSeconds: 60 # circuit open duration (s)
  addressFuseSleepDurationSeconds: 60
  unHealthAddressTimesAllowedBeforeRemoving: 3 # failures before an unhealthy instance is removed; 0 = immediate
  maxConcurrentHandlingCount: 50 # per-instance max concurrent requests
