Overview

Bifrost has been rigorously tested under high-load conditions to ensure optimal performance in production deployments. Our benchmarks demonstrate exceptional performance at 5,000 requests per second (RPS) across different AWS EC2 instance types.

Key Performance Highlights:
  • Perfect Success Rate: 100% request success rate under high load
  • Minimal Overhead: As little as ~11 µs added latency per request (t3.xlarge profile)
  • Efficient Queue Management: Sub-microsecond queue wait times on optimized instances
  • Fast Key Selection: Near-instantaneous weighted API key selection (~10 ns)

Test Environment Summary

Bifrost was benchmarked on two primary AWS EC2 instance configurations:

t3.medium (2 vCPUs, 4GB RAM)

  • Buffer Size: 15,000
  • Initial Pool Size: 10,000
  • Use Case: Cost-effective option for moderate workloads

t3.xlarge (4 vCPUs, 16GB RAM)

  • Buffer Size: 20,000
  • Initial Pool Size: 15,000
  • Use Case: High-performance option for demanding workloads

Performance Comparison at a Glance

| Metric | t3.medium | t3.xlarge | Improvement |
| --- | --- | --- | --- |
| Success Rate @ 5k RPS | 100% | 100% | No failed requests |
| Bifrost Overhead | 59 µs | 11 µs | -81% |
| Average Latency | 2.12 s | 1.61 s | -24% |
| Queue Wait Time | 47.13 µs | 1.67 µs | -96% |
| JSON Marshaling | 63.47 µs | 26.80 µs | -58% |
| Response Parsing | 11.30 ms | 2.11 ms | -81% |
| Peak Memory Usage | 1,312.79 MB | 3,340.44 MB | +155% |
Note: t3.xlarge tests used significantly larger response payloads (~10 KB vs ~1 KB), yet still achieved better performance metrics.

Which Instance Should You Choose?

Choose t3.medium when:

  • Budget-conscious deployments
  • Moderate traffic (< 3,000 RPS sustained)
  • Memory-constrained environments
  • Development/staging workloads
  • You can tolerate slightly higher latency for cost savings

Choose t3.xlarge when:

  • High-throughput production workloads
  • Low-latency requirements are critical
  • Large response payloads are common
  • Enterprise applications with strict SLAs
  • Memory usage is not a primary constraint

Configuration Flexibility

One of Bifrost’s key strengths is its configuration flexibility. You can fine-tune the speed ↔ memory trade-off based on your specific requirements:
| Configuration Parameter | Effect |
| --- | --- |
| `initial_pool_size` | Higher values = faster performance, more memory usage |
| `buffer_size` & `concurrency` | Controls queue depth and maximum parallel workers (per provider) |
| `retry` & `timeout` | Tune aggressiveness per provider to meet your SLOs |
Configuration Philosophy:
  • Higher settings (like t3.xlarge profile) prioritize raw speed
  • Lower settings (like t3.medium profile) optimize for memory efficiency
  • Custom tuning lets you find the sweet spot for your specific workload
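To make the trade-off concrete, a provider block tuned toward the t3.xlarge profile from the benchmarks above might look like the following. This is a schematic sketch using the parameter names from the table; the exact field names and nesting depend on your Bifrost version, so check the configuration reference before copying:

```json
{
  "initial_pool_size": 15000,
  "providers": {
    "openai": {
      "buffer_size": 20000,
      "concurrency": 5000,
      "retry": 2,
      "timeout": 30
    }
  }
}
```

Lowering `initial_pool_size` and `buffer_size` toward the t3.medium profile (10,000 and 15,000) trades some latency for a smaller memory footprint.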

Next Steps

  • Detailed Performance Analysis
  • Run Your Own Tests

Ready to dive deeper? Choose your instance type above or learn how to run your own performance tests.