Introduction
Building resilient cloud services is a critical aspect of modern software development, especially as cloud-based solutions become more prevalent. With the release of .NET 8, new tools and libraries have been introduced to enhance the resilience and stability of cloud services. This blog post delves into the features and strategies available in .NET 8 for building resilient cloud services, including the updated Polly library and new resilience features.
Understanding Resilience in Cloud Services
In the context of cloud services, failures are inevitable. These failures can arise from network outages, transient errors on the service side, or prolonged service outages. A resilient cloud service anticipates these errors, recovers from them, and adapts to changing conditions.
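One essential tool for building resilient applications in .NET is the Polly library. Polly is a popular .NET resilience library that provides resilience strategies such as retries, circuit breakers, timeouts, fallbacks, and rate limiting. For example, the following policy, written in the classic (pre-v8) Polly syntax, retries a request up to three times when it throws an HttpRequestException or returns a non-success status code: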
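// Retry up to 3 times on HttpRequestException or a non-success status code
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => !r.IsSuccessStatusCode)
    .RetryAsync(3);

// Execute the request through the policy
var response = await retryPolicy.ExecuteAsync(() => httpClient.GetAsync("https://example.com"));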
Polly and .NET 8
Polly v8, developed in collaboration between the open-source community and Microsoft, introduces several improvements, including a modernized API, a simplified design, and better performance through low-allocation APIs. Instead of policies, v8 composes resilience strategies into a resilience pipeline. It also has built-in support for telemetry and dependency injection, making it easier to integrate into cloud services.
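// Requires: using Polly; using Polly.Retry;
var resiliencePipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 3,                      // up to 3 retries after the first attempt
        BackoffType = DelayBackoffType.Exponential // wait longer before each retry
    })
    .Build();

// The callback receives a CancellationToken, which is passed on to the HTTP call
var response = await resiliencePipeline.ExecuteAsync(
    async token => await httpClient.GetAsync("https://example.com", token));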
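The classic policy shown earlier also retried on non-success status codes. In Polly v8 that can be expressed with a typed pipeline and a predicate. The following is a minimal sketch, assuming the Polly 8.x NuGet package; the class name `RetryPipelineSketch`, the method `CountAttemptsAsync`, and the always-failing stand-in operation are hypothetical and exist only to make the retry count observable without a network call.

```csharp
// Requires the Polly NuGet package (8.x).
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;

public static class RetryPipelineSketch
{
    // Builds a pipeline that retries on HttpRequestException or a non-success status code.
    public static ResiliencePipeline<HttpResponseMessage> Create() =>
        new ResiliencePipelineBuilder<HttpResponseMessage>()
            .AddRetry(new RetryStrategyOptions<HttpResponseMessage>
            {
                ShouldHandle = new PredicateBuilder<HttpResponseMessage>()
                    .Handle<HttpRequestException>()
                    .HandleResult(r => !r.IsSuccessStatusCode),
                MaxRetryAttempts = 3,
                BackoffType = DelayBackoffType.Exponential,
                Delay = TimeSpan.FromMilliseconds(10) // short base delay, for demonstration only
            })
            .Build();

    // Runs a stand-in operation that always fails and returns how often it was invoked.
    public static async Task<int> CountAttemptsAsync()
    {
        var attempts = 0;
        var pipeline = Create();

        using var response = await pipeline.ExecuteAsync(token =>
        {
            attempts++;
            return new ValueTask<HttpResponseMessage>(
                new HttpResponseMessage(HttpStatusCode.ServiceUnavailable));
        });

        return attempts; // one initial attempt plus MaxRetryAttempts retries
    }
}
```

In real code the callback would call `httpClient.GetAsync(url, token)` instead of returning a canned response, and the base delay would normally be much longer than 10 milliseconds.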
Enhancing HTTP Clients with Resilience
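Adding resilience to HTTP clients is a common scenario in cloud services. The Microsoft.Extensions.Http.Resilience library, built on top of Polly v8, integrates with the HTTP client factory (IHttpClientFactory) and includes APIs for creating resilient HTTP pipelines. The numbered examples that follow use the classic Polly policy syntax, attached through AddPolicyHandler from the Microsoft.Extensions.Http.Polly package; each one shows a single strategy.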
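For comparison, here is a rough sketch of how the v8-based library can be wired into the HTTP client factory. It assumes the Microsoft.Extensions.Http.Resilience NuGet package; the client names, the pipeline name `custom-pipeline`, the extension method `AddResilientClients`, and all option values are illustrative assumptions, not recommendations.

```csharp
// Requires the Microsoft.Extensions.Http.Resilience NuGet package.
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Http.Resilience;
using Polly;

public static class HttpResilienceSketch
{
    public static IServiceCollection AddResilientClients(this IServiceCollection services)
    {
        // Option 1: the library's standard handler with its default settings.
        services.AddHttpClient("StandardClient")
            .AddStandardResilienceHandler();

        // Option 2: a custom pipeline. Strategies added first are the outermost.
        services.AddHttpClient("CustomClient")
            .AddResilienceHandler("custom-pipeline", pipeline =>
            {
                pipeline.AddRetry(new HttpRetryStrategyOptions
                {
                    MaxRetryAttempts = 3,
                    BackoffType = DelayBackoffType.Exponential,
                    UseJitter = true
                });
                pipeline.AddCircuitBreaker(new HttpCircuitBreakerStrategyOptions
                {
                    BreakDuration = TimeSpan.FromSeconds(30)
                });
                pipeline.AddTimeout(TimeSpan.FromSeconds(5)); // applies to each attempt
            });

        return services;
    }
}
```

A client configured this way is resolved like any other named client, for example with `IHttpClientFactory.CreateClient("CustomClient")`.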
1. Retry Policy
A retry policy is used to automatically retry a failed request a specified number of times before giving up. This can be useful for handling transient errors that are likely to resolve themselves.
services.AddHttpClient("ResilientClient")
    .AddPolicyHandler(Policy<HttpResponseMessage>
        .Handle<HttpRequestException>()
        .OrResult(r => !r.IsSuccessStatusCode)
        .RetryAsync(3));
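2. Circuit Breaker Policy
A circuit breaker policy stops sending requests to a service that keeps failing, giving it time to recover instead of adding more load. For example, a circuit breaker configured with a threshold of two failures and a one-minute break duration will open (prevent further requests) after two consecutive failures and will remain open for one minute before allowing attempts to execute again.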
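A circuit breaker with those settings could be registered roughly as follows. This is a sketch in the same classic Polly syntax as the other examples, assuming the Microsoft.Extensions.Http.Polly package; the class name `CircuitBreakerSketch` and its two methods are hypothetical, and the failure conditions are copied from the retry example rather than taken from the original post.

```csharp
// Requires the Microsoft.Extensions.Http.Polly NuGet package.
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Polly;

public static class CircuitBreakerSketch
{
    // Opens after 2 consecutive handled failures and stays open for 1 minute.
    public static IAsyncPolicy<HttpResponseMessage> CreatePolicy() =>
        Policy<HttpResponseMessage>
            .Handle<HttpRequestException>()
            .OrResult(r => !r.IsSuccessStatusCode)
            .CircuitBreakerAsync(2, TimeSpan.FromMinutes(1));

    // Attaches one shared policy instance to the named client,
    // so every request made through that client sees the same circuit state.
    public static void Register(IServiceCollection services) =>
        services.AddHttpClient("ResilientClient")
            .AddPolicyHandler(CreatePolicy());
}
```

While the circuit is open, calls fail immediately with a `BrokenCircuitException` instead of reaching the remote service.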
3. Timeout Policy
A timeout policy limits how long an operation can take before it is considered a failure. In the example below, any request taking longer than 10 seconds is aborted and treated as failed; Polly signals this by throwing a TimeoutRejectedException.
services.AddHttpClient("ResilientClient")
    .AddPolicyHandler(Policy.TimeoutAsync<HttpResponseMessage>(TimeSpan.FromSeconds(10)));
4. Bulkhead Policy
A bulkhead policy limits the number of concurrent calls to a specific service, preventing resource exhaustion and improving system stability under load.
services.AddHttpClient("ResilientClient")
    .AddPolicyHandler(Policy.BulkheadAsync<HttpResponseMessage>(maxParallelization: 5, maxQueuingActions: 10));
This example limits the number of concurrent calls to five, with up to ten additional requests being queued.
5. Fallback Policy
A fallback policy provides an alternative action when all other policies fail, ensuring the system can still respond gracefully.
services.AddHttpClient("ResilientClient")
    .AddPolicyHandler(Policy<HttpResponseMessage>
        .Handle<Exception>()
        .FallbackAsync(new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent("Fallback response")
        }));
Here is a more concise way to add multiple resilience policies to a single HttpClient.
// Program.cs
// Requires the Microsoft.Extensions.Http.Polly NuGet package.
using Microsoft.Extensions.DependencyInjection;
using Polly;
using Polly.Bulkhead;
using Polly.CircuitBreaker;
using Polly.Extensions.Http;
using Polly.Timeout;
using System.Net.Http;

var builder = WebApplication.CreateBuilder(args);

// MapControllers below needs the controller services to be registered
builder.Services.AddControllers();

// Define Polly policies
var retryPolicy = HttpPolicyExtensions
    .HandleTransientHttpError() // HttpRequestException, 5xx, and 408 responses
    .Or<TimeoutRejectedException>() // thrown when the timeout policy below fires
    .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.NotFound)
    .RetryAsync(3);

var circuitBreakerPolicy = HttpPolicyExtensions
    .HandleTransientHttpError()
    .CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)); // open for 30 seconds after 5 consecutive failures

var timeoutPolicy = Policy.TimeoutAsync<HttpResponseMessage>(5); // 5 seconds per attempt

var bulkheadPolicy = Policy.BulkheadAsync<HttpResponseMessage>(maxParallelization: 10, maxQueuingActions: 20);

var fallbackPolicy = Policy<HttpResponseMessage>
    .Handle<Exception>()
    .FallbackAsync(new HttpResponseMessage(System.Net.HttpStatusCode.OK)
    {
        Content = new StringContent("Fallback response")
    });

// Add HTTP client with Polly policies.
// The first handler added is the outermost one: the fallback wraps the retry,
// which wraps the circuit breaker, and so on down to the per-attempt timeout.
// With the fallback innermost, it would swallow exceptions before the retry
// and circuit breaker could react to them.
builder.Services.AddHttpClient("PollyDemoClient")
    .AddPolicyHandler(fallbackPolicy)
    .AddPolicyHandler(retryPolicy)
    .AddPolicyHandler(circuitBreakerPolicy)
    .AddPolicyHandler(bulkheadPolicy)
    .AddPolicyHandler(timeoutPolicy);

var app = builder.Build();
app.MapControllers();
app.Run();
Conclusion
Building resilient cloud services with .NET 8 involves leveraging powerful libraries and tools like Polly v8 to manage and mitigate failures. By implementing strategies such as retries, circuit breakers, timeouts, bulkheads, and fallbacks, developers can create robust and reliable cloud-based applications. The enhancements in .NET 8 and the new resilience libraries make it easier to integrate these strategies and build services that can withstand and adapt to failures.
This blog post covers the core aspects of building resilient cloud services with .NET 8, focusing on practical examples and strategies to handle failures effectively. The use of code snippets and diagrams helps in understanding the implementation and benefits of these resilience techniques.
You can also look at my code on GitHub and send me a PR if you think it could be improved further.