bards.ai at RDK Summit 2025
Cutting 3-5 minutes of manual work per screen to just ~20 seconds. At scale, that's thousands of hours saved.
Michal Pogoda-Rosikon
May 15, 2025 · 5 min read

At RDK Summit 2025, I joined Damian Danylko and Artur Gebicz to present the results of the collaboration between Comcast and bards.ai on advancing Firebolt Connect (RDKM).
Firebolt now supports automated visual testing for set-top box applications. The process is simple:
- Define how the UI should look on a golden sample by tagging key elements (buttons, images, text, etc.)
- Automatically verify that those elements render correctly across other hardware, firmware, and software configurations.
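To make the verification step concrete, here is a minimal sketch of how a golden-sample check could work. All names, box formats, and the IoU threshold are hypothetical; the actual Firebolt Connect implementation is not public.

```python
# Hypothetical sketch: compare tagged POIs on a golden sample against
# elements detected on a device under test, using intersection-over-union.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def verify_screen(golden_pois, detected, threshold=0.8):
    """For each POI tagged on the golden sample, check that the same
    element was detected in (roughly) the same place on the test device."""
    report = {}
    for name, golden_box in golden_pois.items():
        box = detected.get(name)
        report[name] = box is not None and iou(golden_box, box) >= threshold
    return report

golden = {"play_button": (100, 500, 180, 560), "logo": (20, 20, 120, 60)}
observed = {"play_button": (102, 501, 181, 561), "logo": (300, 20, 400, 60)}
print(verify_screen(golden, observed))
# The play button overlaps the golden box almost exactly, so it passes;
# the logo has shifted, so it fails.
```

In this sketch the per-element detections on the test device would come from the trained model described below; the comparison itself is just geometry.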
This tagging step - defining Points of Interest (POIs) - usually takes 5+ minutes per screen. With ~10 screens per app and thousands of apps, it adds up very quickly.
We evaluated off-the-shelf solutions - including Gemini (with its native bounding-box detection) and Omniparser 1/2. Neither met the accuracy or reliability bar we needed, so we built a custom dataset and trained a model from the ground up.
The result: a 10x reduction in asset preparation time.
It shows that custom models are far from dead - at least in computer vision. But in NLP? I'm not so sure anymore. Foundation models are catching up fast.
Written by
Michal Pogoda-Rosikon
Co-founder @ bards.ai
Bridging AI research and real product engineering. Writing about what works once the demo ends.