Skip to content

Latest commit

 

History

History
247 lines (178 loc) · 10.2 KB

File metadata and controls

247 lines (178 loc) · 10.2 KB

Real-Time Metrics Visualization

memtier_benchmark supports streaming real-time metrics to a StatsD-compatible server for live visualization during benchmark runs. This is particularly useful for:

  • Observing performance during long-running tests
  • Monitoring latency spikes during scaling events (e.g., Redis Enterprise slot migrations)
  • Comparing multiple benchmark runs side-by-side
  • Sharing live dashboards with team members

Quick Start

1. Start the Monitoring Stack

A pre-configured Docker Compose setup is included with Graphite (StatsD receiver) and Grafana:

docker-compose -f docker-compose.statsd.yml up -d

This starts:

  • Graphite + StatsD on localhost:8125 (UDP) - receives metrics
  • Grafana on http://localhost:3000 - visualization (login: admin / admin)

2. Run a Benchmark with Metrics

./memtier_benchmark -s <redis-host> -p <redis-port> \
    --statsd-host=localhost \
    --test-time=60

3. View the Dashboard

Open http://localhost:3000 in your browser, log in with admin/admin, and navigate to the Memtier Benchmark dashboard. You'll see live metrics updating in real-time.

4. Stop the Monitoring Stack

docker-compose -f docker-compose.statsd.yml down

To also remove stored data:

docker-compose -f docker-compose.statsd.yml down -v

Command-Line Options

Option Default Description
--statsd-host=HOST (disabled) StatsD server hostname. Metrics are only sent when this is set.
--statsd-port=PORT 8125 StatsD server UDP port.
--statsd-prefix=PREFIX memtier Prefix for all metric names in Graphite.
--statsd-run-label=LABEL default Label to identify this benchmark run. Use different labels to compare runs.
--graphite-port=PORT 8080 Graphite HTTP port for event annotations. Use 80 when running inside the same Docker network.

Examples

Basic usage:

./memtier_benchmark -s redis.example.com --statsd-host=localhost --test-time=120

With a custom run label for comparison:

./memtier_benchmark -s redis.example.com --statsd-host=localhost \
    --statsd-run-label=baseline --test-time=60

# Later, run another test with a different label
./memtier_benchmark -s redis.example.com --statsd-host=localhost \
    --statsd-run-label=after-upgrade --test-time=60

Custom prefix (useful if sharing a Graphite instance):

./memtier_benchmark -s redis.example.com --statsd-host=metrics.internal \
    --statsd-prefix=team1.memtier --statsd-run-label=prod-test

Metrics Reference

The following metrics are sent approximately every 1 second during the benchmark.

Throughput

Metric StatsD type Graphite path Description
ops_sec gauge stats.gauges.<prefix>.<label>.ops_sec Instantaneous ops/sec over the last interval
ops_sec_avg gauge stats.gauges.<prefix>.<label>.ops_sec_avg Running average ops/sec since benchmark start
bytes_sec gauge stats.gauges.<prefix>.<label>.bytes_sec Instantaneous byte throughput over the last interval
bytes_sec_avg gauge stats.gauges.<prefix>.<label>.bytes_sec_avg Running average byte throughput since benchmark start

Latency

Metric StatsD type Graphite path Description
latency_ms timing (ms) stats.timers.<prefix>.<label>.latency_ms.* Instantaneous average latency (ms) over the last interval
latency_avg_ms timing (ms) stats.timers.<prefix>.<label>.latency_avg_ms.* Running average latency (ms) since benchmark start
latency_p<N> gauge stats.gauges.<prefix>.<label>.latency_p<N> Instantaneous latency at percentile N (ms). One metric per percentile configured via --print-percentiles. Default: latency_p50, latency_p99, latency_p99_9. Decimal points are replaced with underscores (e.g. latency_p99_9 for p99.9).

Note: latency_ms and latency_avg_ms are sent as StatsD timing metrics (ms type). StatsD processes them into derived stats (mean, upper, lower, etc.) that appear under stats.timers.* in Graphite — not under stats.gauges.* like the other metrics. The per-percentile latency_p<N> metrics are plain gauges and appear under stats.gauges.*.

Connections and Errors

Metric StatsD type Graphite path Description
connections gauge stats.gauges.<prefix>.<label>.connections Active connection count (--clients × active thread count)
connection_errors gauge stats.gauges.<prefix>.<label>.connection_errors Cumulative connection error count. Only sent when the count is > 0 and not zeroed at run end — stale values may linger in Graphite after errors clear. Protocol-level command errors are not tracked here.

Progress

Metric StatsD type Graphite path Description
progress_pct gauge stats.gauges.<prefix>.<label>.progress_pct Benchmark completion percentage (0–100)

Events (Graphite annotations)

Two events are sent via HTTP POST to the Graphite events API (not StatsD UDP):

Event Tags When
Benchmark Started memtier,start Immediately before the benchmark loop begins
Benchmark Completed memtier,end Immediately after all threads finish

These appear as vertical annotation lines on the Grafana dashboard.

End-of-run zeroing

When the benchmark completes, the following gauges are explicitly zeroed so graphs return to baseline rather than holding the last value: ops_sec, ops_sec_avg, bytes_sec, bytes_sec_avg, progress_pct.

connections, latency_ms, latency_avg_ms, latency_p<N>, and connection_errors are not zeroed at run end.

Comparing Multiple Benchmark Runs

One of the most powerful features is the ability to overlay multiple benchmark runs on the same graphs for direct comparison. This is ideal for A/B testing, before/after comparisons, or evaluating different configurations.

How It Works

  1. Run benchmarks with different labels:

    # First run - baseline
    ./memtier_benchmark -s redis-server --statsd-host=localhost \
        --statsd-run-label=baseline --test-time=60
    
    # Second run - after changes
    ./memtier_benchmark -s redis-server --statsd-host=localhost \
        --statsd-run-label=after-tuning --test-time=60
    
    # Third run - different configuration
    ./memtier_benchmark -s redis-server --statsd-host=localhost \
        --statsd-run-label=high-concurrency -c 100 -t 8 --test-time=60
  2. Use the Run Label dropdown at the top of the Grafana dashboard to:

    • Select multiple labels - Hold Ctrl/Cmd and click to select several runs
    • Select "All" - Overlay every run on the same graphs
    • Deselect runs - Click to toggle individual runs on/off
  3. Compare visually - Each run appears as a separate line with its own color, making it easy to spot performance differences.

Example Use Cases

Scenario Labels to Compare
Before/after Redis upgrade redis-6.2, redis-7.0
Connection pool tuning pool-10, pool-50, pool-100
Cluster scaling test 3-shards, 6-shards, 12-shards
Network latency impact same-az, cross-az, cross-region

Grafana Dashboard

The included dashboard provides:

  • Operations per Second - Current and average ops/sec over time
  • Latency - Current and average latency in milliseconds
  • Throughput - Current and average bytes/sec
  • Connections - Active connection count
  • Progress - Benchmark completion percentage
  • Connection Errors - Error count indicator

All panels support multi-run overlay when multiple run labels are selected.

Troubleshooting

Verify StatsD is Receiving Metrics

Send a test metric manually:

echo "test.metric:100|g" | nc -u -w1 localhost 8125

Check Graphite's web UI at http://localhost:8080 to see if metrics appear.

No Data in Grafana

  1. Ensure the benchmark is running with --statsd-host set
  2. Check that port 8125/UDP is accessible (not blocked by firewall)
  3. Verify Grafana's Graphite datasource is configured (should be automatic with the Docker setup)
  4. Try refreshing the dashboard or adjusting the time range to "Last 5 minutes"

Metrics Delayed or Missing

  • StatsD aggregates metrics every 1 second by default
  • The dashboard refreshes every 1 second
  • If running memtier from a container, use --statsd-host=host.docker.internal (macOS/Windows) or the host's IP

Reset Stored Data

To clear all historical metrics and start fresh:

docker-compose -f docker-compose.statsd.yml down -v
docker-compose -f docker-compose.statsd.yml up -d

Architecture

┌─────────────────────┐     UDP:8125      ┌─────────────────────┐
│ memtier_benchmark   │ ───────────────── │ Graphite + StatsD   │
│ --statsd-host=...   │                   │ (metrics storage)   │
└─────────────────────┘                   └──────────┬──────────┘
                                                     │
                                          ┌──────────▼──────────┐
                                          │      Grafana        │
                                          │   localhost:3000    │
                                          └─────────────────────┘

Using with External StatsD/Graphite

If you have an existing StatsD-compatible metrics infrastructure:

./memtier_benchmark -s redis-server \
    --statsd-host=statsd.your-company.com \
    --statsd-port=8125 \
    --statsd-prefix=benchmarks.memtier \
    --statsd-run-label=redis-perf-$(date +%Y%m%d-%H%M%S)

You can import the dashboard from grafana/dashboards/memtier.json into your Grafana instance. You may need to adjust the datasource UID to match your Graphite datasource.