// Production / Agentic Workflows & RAG
AI agents on AWS AgentCore.
You're already on AWS. Bedrock has the models inside your VPC and your IAM perimeter. AgentCore is the native runtime for agents that have to stay in your AWS stack - for compliance, data residency, and the security review your team has already passed. We build the agent layer inside your account, with your engineers and your compliance team's sign-off.
// What we see
Console works. Production breaks. Always in the same places.
01
Your supervisor adds 2-5 seconds per hop
Multi-agent collaboration looked clean in the diagram. In production, every supervisor → worker → return cycle adds 2-5 seconds. A 4-hop conversation that demoed at 8 seconds answers in 30. Users notice.
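The arithmetic is easy to sanity-check. A back-of-envelope sketch, taking the 8-second demo path and a worst-case 5-second hop overhead as illustrative assumptions, not AgentCore guarantees:

```python
# Hedged hop-latency model. The 8s base and 5s-per-hop overhead are
# illustrative assumptions from the worst end of the 2-5s range.

BASE_ANSWER_S = 8.0    # demo-path latency with no delegation
HOP_OVERHEAD_S = 5.0   # each supervisor -> worker -> return cycle

def production_latency_s(hops: int) -> float:
    """Wall-clock seconds for a conversation with `hops` delegations."""
    return BASE_ANSWER_S + hops * HOP_OVERHEAD_S

for hops in (0, 2, 4):
    print(f"{hops} hops -> {production_latency_s(hops):.0f}s")
```

The point of writing it down: hop count is the only variable that matters, so it is the number the architecture has to control.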
02
Memory bleeds across users
Long-term memory configured, short-term configured, but the scope is wrong somewhere. User A's context starts showing up in User B's responses. You don't catch it until support tickets stack up - and at the scale your compliance team operates, that's an incident report, not a bug fix.
03
Bedrock costs surprise you at scale
Cost per query was acceptable in dev. At 50 RPS production load the bill is 8x what you modeled, because each supervisor hop is a separate billed Bedrock call and nobody mapped the conversation tree to the pricing.
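A minimal sketch of why the bill multiplies: every supervisor → worker cycle is two billed calls, and each call re-reads a transcript that keeps growing. Prices and token counts below are illustrative placeholders, not AWS pricing.

```python
# Why the bill outruns the dev-phase model: each hop is two billed Bedrock
# calls, and every call re-reads the growing transcript. All numbers are
# illustrative assumptions, not current AWS pricing.

USD_PER_1K_IN = 0.003
USD_PER_1K_OUT = 0.015

def conversation_cost_usd(hops: int, context_tokens: int = 2000,
                          output_tokens: int = 500) -> float:
    cost = 0.0
    for _ in range(hops):
        for _role in ("supervisor", "worker"):          # two billed calls per hop
            cost += context_tokens / 1000 * USD_PER_1K_IN
            cost += output_tokens / 1000 * USD_PER_1K_OUT
            context_tokens += output_tokens             # transcript grows
    return cost

# Cost grows faster than linearly in hops - which is exactly what a flat
# per-query number measured in dev silently ignores.
single = conversation_cost_usd(1)
deep = conversation_cost_usd(4)
print(f"1 hop: ${single:.4f}  4 hops: ${deep:.4f}  ratio: {deep / single:.1f}x")
```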
// Case Study
Embedded analytics agent - multi-marketplace data, AWS Bedrock + AgentCore
A $50M+ ARR ecommerce data SaaS wanted their end-customers to ask precise, cross-marketplace questions in plain language - without an analyst in the loop. We built a dedicated-toolkit agent on AWS Bedrock + AgentCore: typed tools per question pattern, SQL kept only as a long-tail escape hatch. Robust where text-to-SQL falls over. 3 months from kickoff to production.
3 months
concept to production
5+
marketplaces unified
AWS
Bedrock + AgentCore, in-perimeter

// What we do
Three decisions that decide whether AgentCore holds up.
Most AgentCore deployments don't fail on the model layer. They fail on memory shape, supervisor topology, and observability that doesn't see what's actually slow.
Memory shape decides the project
Long-term, short-term, and session memory each have a different scope, retention, cost, and data-residency profile. For regulated workloads the memory tier is also a compliance decision. We map your conversation patterns and your data classification to the right tier early - because almost every production AgentCore problem traces back to a memory choice made in week one.
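The mapping itself can be made explicit and reviewable. A hedged sketch of a tier-selection policy - the tier names follow AgentCore's short/long-term split, but the decision rules here are our illustrative assumptions, not an AWS prescription:

```python
# Illustrative tier-selection policy: data classification and recall needs
# drive the memory tier. The rules are an example, not an AWS rule.

def memory_tier(classification: str, needs_cross_session_recall: bool) -> str:
    if classification == "regulated" and not needs_cross_session_recall:
        return "session-only"   # nothing persists past the conversation
    if needs_cross_session_recall:
        return "long-term"      # persists across sessions; retention is audited
    return "short-term"         # survives within the session window only

print(memory_tier("regulated", False))
```

Writing the policy as code in week one is what lets compliance sign off on it before it is baked into the agent graph.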
Supervisor latency is a design constraint
Supervisor → worker hops cost 2-5 seconds each. We design the agent graph around it - fewer hops, parallel workers where the conversation allows, tool calls inlined where AgentCore semantics permit. Your latency budget shapes the architecture, not the other way around.
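When workers don't depend on each other, fanning them out is the biggest single lever. A minimal sketch with `asyncio` stand-ins for Bedrock-backed workers (names and delays are illustrative):

```python
import asyncio
import time

async def worker(name: str, latency_s: float) -> str:
    # Stand-in for a Bedrock-backed worker agent; sleep simulates its latency.
    await asyncio.sleep(latency_s)
    return f"{name}:done"

async def sequential(jobs):
    # One hop at a time: total latency is the sum of all workers.
    return [await worker(n, s) for n, s in jobs]

async def parallel(jobs):
    # Independent workers fan out: total latency ~= the slowest worker.
    return list(await asyncio.gather(*(worker(n, s) for n, s in jobs)))

jobs = [("inventory", 0.05), ("pricing", 0.05), ("reviews", 0.05)]

t0 = time.perf_counter()
seq = asyncio.run(sequential(jobs))
t_seq = time.perf_counter() - t0

t0 = time.perf_counter()
par = asyncio.run(parallel(jobs))
t_par = time.perf_counter() - t0

print(f"sequential: {t_seq:.2f}s  parallel: {t_par:.2f}s")
```

Same answers, a fraction of the wall-clock time - which is why the conversation's dependency structure, not the org chart of agents, should drive the graph.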
Observability inside your AWS stack
Native AgentCore + CloudWatch shows what fired, not what failed quietly. We layer trace replay, per-tool latency breakdowns, and cost-per-conversation dashboards on top - feeding into your existing CloudTrail and audit pipeline. The regression that adds 30% latency or 2x cost is visible to your on-call before customers complain.
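A hedged sketch of the dashboard feed using CloudWatch custom metrics via boto3. The metric names and namespace are our illustrative choices, not an AgentCore convention:

```python
def conversation_metrics(latency_ms: float, cost_usd: float,
                         tool_timings_ms: dict) -> list:
    """Build CloudWatch metric datums for one finished conversation."""
    data = [
        {"MetricName": "ConversationLatencyMs", "Value": latency_ms,
         "Unit": "Milliseconds"},
        {"MetricName": "ConversationCostUsd", "Value": cost_usd,
         "Unit": "None"},
    ]
    for tool, ms in tool_timings_ms.items():
        data.append({
            "MetricName": "ToolLatencyMs",
            "Dimensions": [{"Name": "Tool", "Value": tool}],
            "Value": ms,
            "Unit": "Milliseconds",
        })
    return data

def publish(data, namespace: str = "Agents/Prod") -> None:
    # Namespace is an assumption - pick one your on-call already watches.
    import boto3  # imported lazily so the sketch runs without AWS credentials
    boto3.client("cloudwatch").put_metric_data(Namespace=namespace,
                                               MetricData=data)

datums = conversation_metrics(4200.0, 0.031, {"sql": 800.0, "search": 350.0})
print(len(datums), "datums")
```

With per-tool dimensions in place, a CloudWatch alarm on `ConversationCostUsd` is what turns the 2x cost regression into a page instead of a surprise on the bill.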
// Method fit
AgentCore isn't the right runtime for every agent.
skip it if
You're not on AWS
AgentCore's value is being inside the AWS perimeter - VPC, IAM, GuardDuty, CloudTrail, KMS. On GCP or Azure you're paying for plumbing that doesn't connect to anything you own.
Your data can leave AWS
If your security team is comfortable with LLM API calls leaving your AWS perimeter, you have more options - direct OpenAI, Anthropic, multi-cloud gateways. AgentCore's value is keeping the agent and its memory inside your existing audit boundary.
Your team owns the runtime
AgentCore is opinionated. If you have an SRE team that wants to control retries, queue semantics, and tool dispatch end-to-end, building on LangGraph + your own EKS cluster gives you more freedom - at the cost of the ops you'd be running yourself.
LangGraph Agents
use it if
AgentCore fits when you're already on AWS, your compliance perimeter is the determining factor, you have approved-model access through Bedrock, and you want a managed runtime your security team has already cleared.
// How we work
Architecture first. Iterate in a shared workspace. Hand off the dashboards.
Every AgentCore engagement starts with the architecture decisions that are expensive to undo - memory tier, supervisor topology, identity boundaries. From there we build in your AWS account, with your engineers in the loop the entire time.
01
Architecture review (week one)
We sit with your team and map the agent graph - supervisors, workers, memory scope, tool boundaries, identity flow. The output is a written design your engineers approve before any code is shipped.
02
Build in your AWS account
All work happens in your account, your VPC, your IAM. Your engineers have access to the same Bedrock workspace, CloudWatch dashboards, and trace tooling we use. No vendor sandbox we have to migrate out of later.
03
Hand off cost and latency dashboards
We hand off code, agent definitions in IaC (CDK or Terraform), the eval suite in your CI, and dashboards your on-call uses. We stay in your Slack for 30 days after delivery for the questions that come up after we leave.

// Expert insight
“AgentCore bills CPU-seconds and peak memory, not per invocation. Most of an agent's runtime is waiting on Bedrock - memory billed, CPU not. Smaller memory footprint matters more than fewer calls.”
Michał Świędrowski
Co-founder @ bards.ai
// Why bards.ai
Why us, instead of the two senior AWS engineers you'd hire.
You could hire the team. It would take a year and they'd learn AgentCore on you - on your bill, in your audit trail. We've already learned it.
Production agent deployments at scale
Brand24's internal agent unifies 13 data sources behind one interface, sub-5s p50. The patterns transfer to AgentCore: memory scoping, supervisor design, observability you can debug at 11pm.
Eval-first methodology
Every agent change ships with a measured delta - latency, cost-per-conversation, task completion. Shared dashboard, no silent regressions to land a cosmetic win.
Senior engineers only, no juniors
Every person on your engagement has shipped agents to production. No AWS ramp-up tax, no learning AgentCore on your account.
// FAQ
Common questions about AgentCore
How is AgentCore different from Bedrock Agents?
Bedrock Agents was the previous generation - a managed agent service tied to Bedrock LLMs with a fixed action-group model. AgentCore is the runtime layer underneath it, exposed: Memory, Identity, Gateway, Browser, and Code Interpreter as separate primitives. AgentCore gives you more control and composability; Bedrock Agents is faster to demo but harder to bend. New projects with non-trivial agent logic should start on AgentCore.
When should we pick LangGraph instead of AgentCore?
LangGraph wins when you need full control of state, retries, and tool dispatch, or when AWS-specific compliance isn't a driver. AgentCore wins when you're AWS-first, need VPC/IAM isolation as a non-negotiable, and want a managed runtime that handles agent lifecycle, scaling, and durability without you owning the cluster. Both can coexist - LangGraph as inner-loop orchestration inside an AgentCore-hosted worker is a pattern we've shipped.
How do we keep Bedrock costs under control?
Three things: (1) Map the conversation tree to the pricing tree before any code lands - every supervisor hop is a billed call, and the model choice per hop matters. (2) Use cheaper models (Claude Haiku, Nova Lite) for supervision and routing, premium models (Sonnet, Nova Pro) only where the user-facing answer needs them. (3) Cost-per-conversation dashboards from day one, alerted on regression. The goal is to know the unit economics, not to discover them after the bill arrives.
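Point (2) is a one-function decision once it's written down. A sketch with example Bedrock model identifiers; the prices are illustrative - check current AWS pricing:

```python
# Per-hop model routing: supervision and routing hops get the cheap model,
# only the user-facing answer gets the premium one. Model IDs are example
# Bedrock identifiers; prices are illustrative assumptions.

MODELS = {
    "cheap":   {"id": "anthropic.claude-3-haiku-20240307-v1:0",
                "usd_per_1k_in": 0.00025},
    "premium": {"id": "anthropic.claude-3-5-sonnet-20240620-v1:0",
                "usd_per_1k_in": 0.003},
}

def model_for_hop(role: str) -> str:
    """Only the hop that writes the user-facing answer gets the premium model."""
    tier = "premium" if role == "answer" else "cheap"
    return MODELS[tier]["id"]

for role in ("supervisor", "router", "answer"):
    print(role, "->", model_for_hop(role))
```

With input-token prices an order of magnitude apart, routing the chatty supervision hops to the cheap tier is usually where most of the 8x goes away.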
What does an engagement cost?
Engagements start at $40K. Most AgentCore projects land between $40K and $120K depending on the number of agents, memory scope complexity, identity integration depth, and whether multi-region is in scope. Fixed-fee proposal after the first scoping call - no time-and-materials surprise.
// Let's ship it
Send us your agent graph. We'll send back a plan.
Tell us about the conversation shapes, the AWS perimeter you have to stay inside, and your latency and cost bars. We'll come back, usually within a business day, with an architecture and an eval plan your security team can actually approve. Engagements from $40K, typically 4-8 weeks.

Michał Świędrowski
Co-founder @ bards.ai