Metrics

Metrics: Why You Should Care

Most metrics systems were built for infrastructure. They want you to track CPU load on a node you don't control, using query languages that require a cheat sheet. But you're building an app, not running a data center. You care about actual things:

  • How many users signed up last week?
  • What was the p95 response time for checkout?
  • Are any customers getting hit with weird performance issues?

FlexLogs Metrics are made for you, the app developer.

They work like you think:

  • Increments track when things happen: a signup, a login, an error. You just send a +1 each time.
  • Observations measure things: how long something took, how big a payload was, how deep a queue got. You send the value directly.

And that's it. Two types. Real math. No agents, no exporters, no Prometheus arcana.
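
For instance, here's a minimal sketch of both types in practice. (`logger` is whatever logger your app already uses; `processCheckout` is a hypothetical stand-in for your own code.)

    // Increment: something happened, send a +1
    logger.info("flexlogs{metric: 'user.signup'}");

    // Observation: measure a value and send it directly
    const startMs = Date.now();
    processCheckout(); // hypothetical app function doing real work
    const durationMs = Date.now() - startMs;
    logger.info(`flexlogs{metric: 'checkout.duration', type: 'observation', value: ${durationMs}}`);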

You get automatic rollups over time (1min, 5min, 1hr, etc.) with real stats:

  • sum, count, avg, min, max
  • median, p90, p95, p99
  • std_dev for spread

All of it based on your actual data — not buckets or estimates.

Getting Started with Metrics

Setting up Metrics is as simple as tagging your logs. Here are some examples:

The most basic metric is an increment metric, which increases each time it's logged. The default value is 1. The following example shows how to create a couple of simple metrics.

    // Create a basic (increment) metric
    logger.info("flexlogs{metric: 'page-load'}");

    // The same metric using the HTML-like format
    logger.info("<flexlogs metric=page-load />");

    // Create a basic (observation) metric
    logger.info("flexlogs{metric: 'cpu-usage', type: 'observation', value: 0.75}");

Options

Key      Description               Type / Options              Default
metric   A name for your metric    String
type     The type of metric        increment or observation    increment
value    Metric value              Number                      1
tags     List of tags              Array                       []
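
To see all four options together, here's a sketch of a single observation metric that sets every key (the metric name, value, and tags are illustrative):

    // metric + type + value + tags in one call
    logger.info("flexlogs{metric: 'payload.size', type: 'observation', value: 2048, tags: ['api', 'upload']}");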

Examples

    // add some tags
    logger.info("flexlogs{metric: 'order.new', tags: ['order', 'checkout']}");

    // observation metric for monitoring queue length
    logger.info(`flexlogs{metric: 'queue.length', value: ${ Queue.length() }, type: 'observation'}`);

    // explicitly set type and value
    logger.info("flexlogs{metric: 'user.signup', type: 'increment', value: 1}");

    // build from a JSON object
    logger.info("flexlogs" + JSON.stringify({metric: "new-feature.enabled"}));
    // same as -> logger.info("flexlogs{metric: 'new-feature.enabled'}")

    // add multiple tags
    logger.info("flexlogs{metric: 'beta_feature.error', tags: ['error', 'beta']}");

    // interested in other examples? let us know!
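If you're building these strings in several places, a small helper keeps things tidy. This is a hypothetical convenience wrapper (not part of FlexLogs), built on the JSON form shown above:

    // Hypothetical helper: build the flexlogs marker from a plain object
    function flexMetric(fields) {
      return "flexlogs" + JSON.stringify(fields);
    }

    logger.info(flexMetric({ metric: "order.new", tags: ["order", "checkout"] }));
    logger.info(flexMetric({ metric: "queue.length", type: "observation", value: 12 }));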

Working with Metrics

Aggregation Functions

Metrics are only useful if they help you make sense of behavior over time. That’s where aggregation functions come in. Instead of staring at raw values, we roll your data up across intervals — 1 minute, 5 minutes, 1 hour, and more — so you can see trends, catch outliers, and compare what's typical versus what's terrible.

Here are the key aggregation functions FlexLogs supports (and why you should care):

  • sum — Adds up all values in the interval. Useful for things like "how many emails were sent?" or "how many signups happened this week?"
  • count — Shows how many events were recorded in that interval. Handy for tracking throughput or overall volume.
  • avg — Gives the mean value. Great for tracking averages: average response time, average retries per job, average cart value.
  • min / max — Show the extremes. Want to know if a queue was ever empty? Or how bad a memory spike got? This tells you.
  • median — The value right in the middle. More resistant to outliers than average, so it gives you a better feel for "typical" performance.
  • p90, p95, p99 — The high-percentile views of your data. These reveal tail behavior: how bad is performance for your slowest users? How often are people waiting too long?
  • std_dev — Standard deviation: how much the values bounce around. Useful for spotting instability or inconsistency (especially with durations).

Aggregation Examples: Request Duration

Imagine you track the duration of HTTP requests, and you see this set of values over a minute: [10, 25, 40, 2400] (all in ms).

sum → 2475 ms

Total time spent handling requests. Could be useful for understanding compute cost or overall system load.

count → 4

Number of requests in that window.

avg → 618.75 ms

Sure, that seems high — but that’s because of a single 2400ms outlier. Average alone can mislead you.

min / max → 10 ms / 2400 ms

Fastest vs slowest. That’s a massive range, which should make you ask why that one request was so slow.

median → 32.5 ms

Much more reasonable than the average. Most requests were actually fast — the 2400ms outlier is skewing things.

p95 → ~2400 ms

This tells you that 95% of requests finished at or below ~2400ms. The slowest 5%? They're painful. p95 is your canary for bad user experience.

std_dev → High

One slow request can wreck your average, and std_dev shows you that spread directly. If this number is large, things are inconsistent.
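
If you want to sanity-check those numbers yourself, here's a quick sketch that reproduces them locally. This is purely illustrative: FlexLogs computes these aggregates for you during rollups, and the nearest-rank percentile and population std_dev used here are common methods assumed for the example.

    // Reproduce the stats for [10, 25, 40, 2400] from the example above
    const samples = [10, 25, 40, 2400]; // request durations in ms

    const sum = samples.reduce((a, b) => a + b, 0);      // 2475
    const count = samples.length;                        // 4
    const avg = sum / count;                             // 618.75
    const sorted = [...samples].sort((a, b) => a - b);
    const median = (sorted[1] + sorted[2]) / 2;          // 32.5 (mean of the middle two)
    const p95 = sorted[Math.ceil(0.95 * count) - 1];     // 2400 (nearest-rank method)
    const variance = samples.reduce((a, v) => a + (v - avg) ** 2, 0) / count;
    const stdDev = Math.sqrt(variance);                  // ~1028.5 (population std dev)

    console.log({ sum, count, avg, min: sorted[0], max: sorted[count - 1], median, p95, stdDev });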

Why It Matters

You might not care about each individual request, but you do care about patterns:

  • "How bad was traffic during the promo campaign?"
  • "Are we getting slower over time?"
  • "Was that deploy rough for anyone?"
  • "Do users in a specific region always get p95 performance issues?"

FlexLogs gives you the power to ask and answer those questions, with real math and no guesswork. Because when you’re trying to improve your app, you can’t afford to fly blind.

FlexLogs Metrics give you real insight without the infrastructure baggage. You get the data that matters, the way you already think about your app.