
bards.ai at RDK Summit 2025

Cutting 3-5 minutes of manual work per screen to just ~20 seconds. At scale, that's thousands of hours saved.

Michal Pogoda-Rosikon

May 15, 2025 · 5 min read

At RDK Summit 2025, I joined Damian Danylko and Artur Gebicz to present the results of the collaboration between Comcast and bards.ai on advancing Firebolt Connect (RDKM).

Firebolt now supports automated visual testing for set-top box applications. The process is simple:

  • Define how the UI should look on a golden sample by tagging key elements (buttons, images, text, etc.)
  • Automatically verify that those elements render correctly on other hardware, firmware, and software configurations.

This tagging step, defining Points of Interest (POIs), usually takes 5+ minutes per screen. With ~10 screens per app and thousands of apps, it adds up VERY quickly: at 5 minutes a screen, every thousand apps already mean roughly 800 hours of manual tagging.
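To make the idea concrete, here is a minimal sketch of a tagged POI and the kind of overlap check the verification step boils down to. The field names, the IoU threshold, and the comparison itself are illustrative, not the actual Firebolt Connect schema or algorithm.

```python
from dataclasses import dataclass

@dataclass
class POI:
    """One tagged element on the golden sample (or detected on a device)."""
    label: str                               # e.g. "play_button" (hypothetical name)
    box: tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels

def iou(a: POI, b: POI) -> float:
    """Intersection-over-union of two POI bounding boxes."""
    ax1, ay1, ax2, ay2 = a.box
    bx1, by1, bx2, by2 = b.box
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def rendered_correctly(golden: POI, detected: POI, threshold: float = 0.8) -> bool:
    """Pass if the device renders the same element close enough to the golden tag."""
    return golden.label == detected.label and iou(golden, detected) >= threshold

# Golden tag vs. what was detected on another hardware/firmware configuration.
golden = POI("play_button", (100, 540, 260, 600))
on_device = POI("play_button", (102, 542, 258, 598))
print(rendered_correctly(golden, on_device))  # True
```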

We evaluated off-the-shelf solutions, including Gemini (which natively supports bounding box detection) and OmniParser 1/2. Neither met the accuracy or reliability we needed, so we built a custom dataset and trained a model from the ground up.
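The shape of that last step looks roughly like the sketch below: annotate screenshots with the POI classes, train a detector on them, then run it on screens captured from devices under test. The detector family (Ultralytics YOLO), the dataset config, and the hyperparameters are assumptions for illustration; the post does not disclose the actual architecture or training setup.

```python
from ultralytics import YOLO

# Build a detector from an architecture config (random weights, no pretraining)
# and train it on a custom dataset of UI screenshots annotated with POI classes
# such as "button", "image", "text". "ui_pois.yaml" is a hypothetical dataset config.
model = YOLO("yolov8n.yaml")
model.train(data="ui_pois.yaml", epochs=100, imgsz=1280)

# Detect POIs on a screenshot captured from a device under test.
results = model.predict("screen_under_test.png", conf=0.5)
for box in results[0].boxes:
    print(int(box.cls), box.xyxy)  # class index and (x1, y1, x2, y2) box
```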

The result: a 10x reduction in asset preparation time.

It shows that custom models are far from dead - at least in computer vision. But in NLP? I'm not so sure anymore. Foundation models are catching up fast.

Written by

Michal Pogoda-Rosikon

Co-founder @ bards.ai

Bridging AI research and real product engineering. Writing about what works once the demo ends.
