Visual Audio - the low-bandwidth layer for the physical world

Marcel Heyne

A few months ago, I watched someone try to learn something “simple” on a phone: how to check a solar power setup, and what to do when it stops working.

The content existed. The intention was there. But the network wasn’t.

The video thumbnail loaded. Then nothing. Buffering. Another retry. Then the person gave up and asked someone nearby instead. Not because they didn’t care, but because the format assumed a connection that wasn’t available.

This is the gap we built WOM.fm for.

Not more content. Not another app. A low-bandwidth layer that sits on real-world touchpoints - packaging, manuals, posters, kiosks, out-of-home surfaces - and actually completes in the field.

We call it Visual Audio.

Visual Audio in one sentence

Visual Audio is audio plus step-by-step images, delivered via a QR code and short link, designed to load fast on slow connections - with analytics you can report on.

If you want to see it in action, here’s a simple demo we built for solar and PAYGO after-sales: wom.fm/126

Why this format exists

When you build for Sub-Saharan Africa (and many similar markets), you learn quickly that “digital” doesn’t automatically mean “video-first” or “app-first”.

In many real environments, the limiting factor is not interest. It’s bandwidth, cost, and patience.

So the question becomes: what format still works when the connection is weak?

Audio is a strong starting point, especially in multilingual settings and where literacy varies. But audio alone can still leave people guessing when it comes to procedures.

That’s why Visual Audio combines audio with a few clear images - the kind you’d expect in a printed manual - only now they are on the phone, synchronized with the explanation.

The data argument (simple, but important)

A low-resolution YouTube video still eats meaningful data. Published estimates put YouTube at around 5–7.5 MB per minute at 360p, and around 8–11 MB per minute at 480p.

By comparison, audio streamed at 64 kbps works out to 8 KB per second, or about 29 MB per hour, which is roughly 0.5 MB per minute.

Now add a few step-by-step images. On the web, it’s common for optimized images to land in the ~100–500 KB range depending on size and compression.

So a practical rule of thumb is:

Visual Audio often delivers the same “guided instruction” experience with roughly a tenth of the data of a low-res video, provided the images are kept light - and the gap grows as video quality increases.
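
The rule of thumb can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up for this post, and the inputs (a 3-minute guide, four images of 100 KB each, video at 5 MB per minute) are assumptions picked from the ranges quoted above, not measurements from WOM.fm.

```python
# Rough data-budget comparison. All inputs are illustrative assumptions
# taken from the ranges quoted in the text, not measured values.


def audio_mb_per_minute(bitrate_kbps: float) -> float:
    """Convert an audio bitrate (kilobits per second) to megabytes per minute."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * 60 / 1_000_000


def visual_audio_mb(minutes: float, bitrate_kbps: float,
                    image_count: int, image_kb: float) -> float:
    """Size of an audio track plus a handful of still images, in MB."""
    audio = audio_mb_per_minute(bitrate_kbps) * minutes
    images = image_count * image_kb / 1000
    return audio + images


def video_mb(minutes: float, mb_per_minute: float) -> float:
    """Size of a video of the same length, in MB."""
    return minutes * mb_per_minute


if __name__ == "__main__":
    minutes = 3  # assumed length of a short guide
    guide = visual_audio_mb(minutes, bitrate_kbps=64, image_count=4, image_kb=100)
    video = video_mb(minutes, mb_per_minute=5.0)  # low end of the 360p range
    print(f"Visual Audio: {guide:.2f} MB, video: {video:.1f} MB, "
          f"ratio: {guide / video:.2f}")
```

With these inputs the guide comes to about 1.8 MB against 15 MB for video, a ratio of roughly 0.12. Heavier images push the ratio up (four images at 300 KB each give closer to 0.18), which is why image weight matters as much as the audio bitrate.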

That’s not a marketing trick. It’s physics.

And in low-bandwidth environments, physics decides whether your training is completed or abandoned.

From touchpoint to learning - the solar example

In the demo (wom.fm/126), the “manual” isn’t a PDF hidden on a website. It’s guidance attached to the point of need.

Someone scans a QR on a product, a box, a sticker, or a poster. They get a short audio explanation in a language that fits the user. They see exactly what to do, step by step, with images that make the instruction unambiguous.

That is the real shift:

From “content somewhere online” to knowledge directly connected to the physical world.

Proof for reporting - without personal data

One more thing we kept seeing in projects: teams can usually prove that something was produced, but struggle to prove what was actually used.

WOM.fm is designed for reporting from the start. It tracks what happened at the touchpoint - scans, listens, completion, and sharing - and turns that into a concise weekly report by default.

And it’s built privacy-first: no logins, no user profiles, no personal data required.
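
To make the idea concrete, here is a minimal sketch of how touchpoint events could be rolled up into a weekly report without storing anything about the person. This is not WOM.fm's actual implementation: the event names, the tuple layout, the touchpoint ID "126", and the definition of completion rate (completions divided by scans) are all assumptions for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical event types, named after the list in the text.
EVENT_TYPES = ("scan", "listen", "complete", "share")


def weekly_report(events):
    """Count events per (ISO year, ISO week, touchpoint).

    Each event is a (day, touchpoint_id, event_type) tuple. There is no
    user identifier anywhere in the input, so none can end up in the report.
    """
    counts = {}
    for day, touchpoint, event_type in events:
        if event_type not in EVENT_TYPES:
            continue  # ignore anything we do not report on
        iso = day.isocalendar()
        key = (iso[0], iso[1], touchpoint)
        counts.setdefault(key, Counter())[event_type] += 1

    report = {}
    for key, counter in counts.items():
        row = {name: counter[name] for name in EVENT_TYPES}
        scans = row["scan"]
        # Assumed definition: completions divided by scans.
        row["completion_rate"] = row["complete"] / scans if scans else None
        report[key] = row
    return report


if __name__ == "__main__":
    sample = [
        (date(2024, 1, 1), "126", "scan"),
        (date(2024, 1, 1), "126", "listen"),
        (date(2024, 1, 3), "126", "scan"),
        (date(2024, 1, 3), "126", "complete"),
        (date(2024, 1, 8), "126", "scan"),
    ]
    for key, row in sorted(weekly_report(sample).items()):
        print(key, row)
```

The point of the design is that the report is built from counts per touchpoint and week, so there is nothing to anonymize afterwards.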

Where this goes next

Visual Audio is a format, but it’s also a category: low-bandwidth knowledge infrastructure.

We’re already seeing it fit naturally into:

  • solar and PAYGO after-sales support
  • telco-led digital financial literacy and consumer protection
  • stewardship and safe-use guidance in agribusiness
  • out-of-home activations that need engagement reporting
  • public health and citizen service guidance where completion matters

If you’re working on anything that needs to be understood reliably in the field, and you want proof for reporting, we should talk.

Try the demo: wom.fm/126
