bards.ai at RDK Summit 2025
Cutting 3-5 minutes of manual work per screen to just ~20 seconds. At scale, that's thousands of hours saved.
Michal Pogoda-Rosikon
May 15, 2025 · 5 min read

At RDK Summit 2025, I joined Damian Danylko and Artur Gebicz to present the results of the collaboration between Comcast and bards.ai on advancing Firebolt Connect (RDKM).
Firebolt now supports automated visual testing for set-top box applications. The process is simple:
- Define how the UI should look on a golden sample by tagging key elements (buttons, images, text, etc.)
- Automatically verify that those elements render correctly across other hardware, firmware, and software configurations.
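To make the verification step concrete, here is a minimal sketch of how a golden-sample check could work. All names, box formats, and the IoU threshold are hypothetical; the actual Firebolt Connect implementation is not public.

```python
# Hypothetical sketch: compare tagged POIs on a golden sample against
# elements detected on a device under test, using intersection-over-union.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def verify_screen(golden_pois, detected, threshold=0.8):
    """For each POI tagged on the golden sample, check that the same
    element was detected in (roughly) the same place on the test device."""
    report = {}
    for name, golden_box in golden_pois.items():
        box = detected.get(name)
        report[name] = box is not None and iou(golden_box, box) >= threshold
    return report

golden = {"play_button": (100, 500, 180, 560), "logo": (20, 20, 120, 60)}
observed = {"play_button": (102, 501, 181, 561), "logo": (300, 20, 400, 60)}
print(verify_screen(golden, observed))
# The play button overlaps the golden box almost exactly, so it passes;
# the logo has shifted, so it fails.
```

In this sketch the per-element detections on the test device would come from the trained model described below; the comparison itself is just geometry.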
This tagging step - defining Points of Interest (POIs) - usually takes 5+ minutes per screen. With ~10 screens per app and thousands of apps, it adds up very quickly.
We evaluated off-the-shelf solutions - including Gemini (with its native bounding-box detection) and Omniparser 1/2. Neither met the accuracy or reliability bar we needed, so we built a custom dataset and trained a model from the ground up.
The result: a 10x reduction in asset preparation time.
It shows that custom models are far from dead - at least in computer vision. But in NLP? I'm not so sure anymore. Foundation models are catching up fast.
Written by
Michal Pogoda-Rosikon
Co-founder @ bards.ai
Bridging AI research and real product engineering. Writing about what works once the demo ends.