Introducing Reldex: Measure release process effectiveness

A new human-friendly scoring and grading system to help mobile app teams measure the effectiveness of their release process.

Reldex is a new human-friendly scoring and grading system to help mobile app teams measure how effectively choreographed their app release process is. It looks at a selection of important signals, weighs them, and produces a single grade that a team can rally around in retrospectives or use as an objective anchor for improving releases.

The DevOps Research and Assessment (DORA) metrics have been the industry standard for measuring release performance in recent years. DORA centers around four key metrics:

  • Lead time for changes: How long does it take to go from having the code committed to successfully running in production?
  • Deployment frequency: How often is code deployed to production?
  • Change fail rate: What percentage of changes released to production or released to users result in degraded service?
  • Time to restore service: How long does it take to restore service after a service incident or defect impacts users?

Not all of these fit neatly into the mobile DevOps world: continuous delivery is tricky, and often impossible, to practice on mobile, and changes reach users through proprietary deployment systems, namely the app stores.

Reldex proposes the following variants for mobile teams that roughly proxy DORA metrics and expand them to be more relevant in the mobile world:

  • Overall time for a release: Number of days from release start to 100% rollout
  • Time to stabilize before starting rollout: Number of days it takes to stabilize the build or release before submitting it to the stores for a review
  • Time to complete a full phased rollout: Number of days between rollout start and 100% rollout of the release
  • Number of hotfixes against the release: Bug fixes to the release after the release has reached all the users
  • Delay between the last rollout and the next hotfix: Time taken to start working on a hotfix since an issue was discovered in the last rollout

Once these metrics are collected, they are assigned certain acceptable ranges and weights. For example, a mature team releasing on a weekly cadence could set up the following:

  • Overall time for a release — 10 to 12 days & 20% weight
  • Time to stabilize before starting rollout — 2 to 3 days & 20% weight
  • Time to complete a full phased rollout — 7 days & 10% weight
  • Number of hotfixes against a release — 0 & 40% weight
  • Delay between the last rollout and the next hotfix — 6 to 12 hours & 10% weight

Each metric is then scored individually against its acceptable range:

  • Better than the acceptable range — a score of 1
  • Within acceptable range — a score of 0.5
  • Worse than the acceptable range — a score of 0
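
As a quick sketch (not Tramline's actual implementation), the rule above could look like this in Python. It assumes every metric is lower-is-better, which holds for the durations and hotfix counts listed earlier, so "better than the range" means below its lower bound:

```python
def score_metric(value: float, low: float, high: float) -> float:
    """Score a metric against its acceptable range [low, high].

    Assumes lower values are better, so "better than the range"
    means below its lower bound.
    """
    if value < low:        # better than the acceptable range
        return 1.0
    if value <= high:      # within the acceptable range
        return 0.5
    return 0.0             # worse than the acceptable range

# "Time to stabilize" with an acceptable range of 2-3 days:
score_metric(1, 2, 3)  # -> 1.0, better than the range
score_metric(3, 2, 3)  # -> 0.5, within the range
score_metric(5, 2, 3)  # -> 0.0, worse than the range
```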

These scores are then combined with their weights to compute a final Reldex score between 0 and 1:

reldex = Σ(metric_score × metric_weight)

The final score is then labeled Excellent, Acceptable, or Mediocre, according to another set of ranges:

  • Better than 0.9 — Excellent
  • Between 0.75 and 0.9 — Acceptable
  • Less than 0.75 — Mediocre

In the example above, the Acceptable band for the final score is 0.75 to 0.9.
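
Putting the pieces together, here is an illustrative end-to-end computation using the example ranges and weights above. The metric names, the scoring helper, and the zero-hotfix convention are our assumptions for this sketch, not a prescribed implementation:

```python
def score(value, low, high):
    """1 if better than the range, 0.5 if within it, 0 if worse (lower is better)."""
    if value < low:
        return 1.0
    return 0.5 if value <= high else 0.0

# (low, high, weight) per metric, from the weekly-cadence example above.
# Note: with a point range of 0 hotfixes, even a hotfix-free release scores
# "within range" (0.5) under this sketch, capping the best possible score.
CONFIG = {
    "overall_time_days":  (10, 12, 0.20),
    "stabilization_days": (2, 3, 0.20),
    "rollout_days":       (7, 7, 0.10),
    "hotfix_count":       (0, 0, 0.40),
    "hotfix_delay_hours": (6, 12, 0.10),
}

def reldex(release):
    total = sum(score(release[name], low, high) * weight
                for name, (low, high, weight) in CONFIG.items())
    if total > 0.9:
        grade = "Excellent"
    elif total >= 0.75:
        grade = "Acceptable"
    else:
        grade = "Mediocre"
    return total, grade

total, grade = reldex({"overall_time_days": 9, "stabilization_days": 1,
                       "rollout_days": 6, "hotfix_count": 0,
                       "hotfix_delay_hours": 4})
print(round(total, 2), grade)  # 0.8 Acceptable
```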

[Image: how Reldex works in a nutshell]

We've put together a spreadsheet where you can play around with Reldex scores by punching in data from your previous releases. Give it a spin!

The name

Reldex is short for Release Index. Its resemblance to Apdex, the open standard for scoring user satisfaction with application performance, is not accidental: Reldex takes inspiration from Apdex for its grading math, but the similarities mostly end at the math and the name.

Reldex is not about the performance of your app through a user-satisfaction lens or how good your features are; it is about scoring your app delivery process based on the parts you see as valuable. Whether a release or feature turns out to be a hit or a miss once it rolls out to users isn't of direct concern to Reldex. Those things are best measured with bug trackers or APMs like Sentry and BugSnag.

Meaningful for teams

From building Tramline, we've received plenty of feedback asking for a way to condense all the metrics and data we radiate during and across app releases into a single, simple-to-understand grade or score. But Reldex is more than just that: it is also a system for setting acceptable ranges for your DevOps metrics.

Since you set the acceptable ranges, Reldex represents the truth of your team. It is not some North Star metric that every app team in the world must shoot for equally. Teams of different sizes will set very different-looking ranges.

A mature, fast-moving team might have ranges that look like this:

  • Overall time for a release — 10 to 12 days
  • Time to stabilize before starting rollout — 2 to 3 days
  • Time to complete a full phased rollout — 7 days
  • Number of hotfixes against a release — 0
  • Delay between the last rollout and the next hotfix — 6 to 12 hours

But a slower, more compliance-heavy team might set ranges like this:

  • Overall time for a release — 14 to 17 days
  • Time to stabilize before starting rollout — 7 to 10 days
  • Time to complete a full phased rollout — 7 to 10 days
  • Number of hotfixes against a release — 1 to 2
  • Delay between the last rollout and the next hotfix — 1 to 2 days

Reldex forces teams to look at delivery processes objectively without relying on a release pilot’s subjective assessment of how the release went.

Some kind of DevOps metric

A goal without a method is cruel
— W. Edwards Deming

The value of a simple-to-consume grade is most apparent when things inevitably change in a release process. Perhaps the team has grown, the number of hotfixes has crept up, and releases are getting delayed. These shifts tend to happen gradually and under the radar. Because Reldex watches all of those signals, it flags the drift immediately. It is the check engine light for the release process as a whole: ideally the score stays in the Excellent range, dips to Acceptable infrequently, and hits Mediocre only rarely.

But much like the DORA metrics, the grade isn't something to stack-rank teams with internally or turn into OKRs, however tempting that might seem. It is another tool in your mobile DevOps belt, one that points out the whats, not necessarily the whys.

Tramline is perfectly positioned

Tramline manages your release state, which puts it in a unique position to do all this. Not a lot of platform tooling out there is release-aware; most release-adjacent tools treat release data as mere metadata.

As Tramline gathers more insight into your releases, Reldex only becomes more accurate. In just the couple of weeks since introducing Reldex in Tramline, we've added two more indicators to the mix:

  • the number of days since the previous release was made
  • the time delay between the last rollout and the next hotfix

Tramline ships with default acceptable ranges for each metric, based on observing high-performing teams and picking defaults that work for most people.

[Image: an example of an Excellent Reldex for a release]

A standard approach

This system works for everyone. It's a grading system on top of DORA-style metrics, and if you already measure those on your own, deriving a Reldex from them is simple. It also isn't limited to the metrics described above: it's a general-purpose system that you can extend by plugging in your own metrics.
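
For instance, plugging in a custom metric can be as simple as adding another range-and-weight entry. In this sketch, `review_wait_hours` is a made-up metric and the weights are purely illustrative:

```python
def score(value, low, high):
    # lower is better: 1 below the range, 0.5 within it, 0 above it
    if value < low:
        return 1.0
    return 0.5 if value <= high else 0.0

metrics = [
    # (name, value, low, high, weight); weights should sum to 1
    ("overall_time_days", 11, 10, 12, 0.15),
    ("hotfix_count", 0, 0, 0, 0.35),
    ("review_wait_hours", 20, 24, 48, 0.50),  # a custom, made-up metric
]

reldex = sum(score(v, lo, hi) * w for _, v, lo, hi, w in metrics)
print(round(reldex, 3))  # 0.75
```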

Even though releasing mobile apps inherently sits atop proprietary software systems, we prefer that most of Reldex's constituent metrics not depend on complex paid tools, but instead come from observing foundational, directly measurable aspects of the release process itself.

We would love to see more mobile teams adopt Reldex as the standard for measuring the performance of app release cycles.

Join our Discord community if you find this interesting and would like to explore and talk more ☎️!