
Scaling AI Agents with Qdrant at Lyzr Agent Studio 


AI agents today are processing millions of queries, managing large-scale knowledge bases, and operating under intense concurrency demands.

For an agent builder platform that supports 100+ live AI agents across industries, choosing the right vector database is critical.

At scale, even slight inefficiencies in search latency or indexing can impact agent performance and user experience. To meet production demands, we needed a solution that could deliver both speed and cost efficiency without compromising reliability.

Today, you’ll find out how Lyzr Agent Studio optimized its vector search stack with Qdrant. We’ll explore why it was the right fit, how it elevated system performance, and the real-world impact across customer use cases.

Initial Setup with Weaviate & other vector databases

In the initial phase, Lyzr Agent Studio integrated Weaviate as the core vector database, along with experiments on other platforms like Pinecone to benchmark early-stage performance. 

The system was designed to handle a knowledge base of approximately 1,000 to 1,500 entries, comprising a mix of short-form content, technical briefs, and structured records.

The setup operated under controlled development conditions:

| Parameter | Details |
| --- | --- |
| Deployment Type | Single-node or small-cluster (Weaviate and other vector databases) |
| Embedding Model | Sentence-transformer (768 dimensions) |
| Concurrent Agents | 10 to 20 knowledge search agents |
| Query Rate per Agent | 5-10 queries per minute |
| Traffic Pattern | Steady, no significant spikes |
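As a rough check on the load this setup implies, the aggregate query rate is simply the number of agents multiplied by the per-agent query rate. The sketch below is illustrative arithmetic on the figures in the table; the function name is an assumption and not part of any Lyzr tooling.

```python
def aggregate_qpm(agents: int, queries_per_agent_per_min: int) -> int:
    """Total queries per minute if every agent queries at the given rate."""
    return agents * queries_per_agent_per_min


# Bounds taken from the setup table: 10-20 agents, 5-10 queries/min each.
low = aggregate_qpm(10, 5)
high = aggregate_qpm(20, 10)

print(f"{low}-{high} queries per minute")
print(f"{low / 60:.2f}-{high / 60:.2f} queries per second")
```

Even at the upper bound this is about 200 queries per minute, roughly a fifth of the 1,000+ queries per minute the system later saw at peak.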


In this environment, Weaviate and Pinecone consistently performed well. 

Query latency remained between 80 ms and 150 ms, and vector search results were highly relevant within the given domain context. Indexing of the dataset completed within hours, aided by Weaviate’s HNSW-based indexing and Pinecone’s managed vector infrastructure.

Performance remained optimal under:

  • A limited corpus (sub-2,000 records).
  • Moderate agent concurrency typical of early-stage validation.
  • Query loads staying well within standard operational limits.

Challenges Faced During System Growth

As the system scaled, limitations with Weaviate and other databases like Pinecone began to surface.

The knowledge base expanded from around 1,500 to over 2,500 entries, while agent concurrency increased beyond 100 knowledge search agents. This added both volume and operational complexity, as agents were generating a much higher query load across the system.

Key issues encountered:

  • Increased query latency: Query response times grew from an earlier average of 80-150ms to 300-500ms under higher load. With over 1,000 agent queries per minute during peak times, agents experienced noticeable delays in fetching relevant vectors.
  • System under strain during concurrency spikes: Agents began facing slowdowns and occasional timeouts when retrieving search results, which disrupted downstream reasoning and decision-making flows.
  • Indexing slowdowns: Scaling beyond 2,500 entries led to longer indexing and update times, especially during bulk ingestion. The re-indexing process consumed higher memory and CPU, causing performance drops during high activity periods.

Evaluation of Alternative Vector Databases

With growing data volume and rising agent concurrency, it became clear that a more scalable and efficient vector database was needed.

The goal was to find a solution that could handle heavier loads while maintaining fast response times and reducing operational overhead. This led to a thorough evaluation of alternative vector databases based on key performance criteria.

| Criteria | Focus Area | Impact on System |
| --- | --- | --- |
| Scalability & Distributed Computing | Horizontal scaling, clustering | Support growing datasets and high agent concurrency |
| Indexing Performance | Ingestion speed, update efficiency | Reduce downtime and enable faster bulk data updates |
| Query Latency & Throughput | Search response under load | Ensure agents maintain fast, real-time responses |
| Consistency & Reliability | Handling concurrency & failures | Avoid timeouts and failed queries during peak usage |
| Resource Efficiency | CPU, memory, and storage usage | Optimize infrastructure costs while scaling workload |
| Benchmark Results | Real-world load simulation | Validate sustained performance under >1,000 QPM loads |
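The post does not describe the benchmark harness that was used. As a minimal sketch of what a load simulation can look like, the example below fires concurrent queries at a search function and reports throughput and latency. Here `stub_search` is a hypothetical stand-in that only sleeps; in a real test it would be replaced by a call to the database under evaluation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def stub_search(query: str) -> list[str]:
    """Hypothetical stand-in for a vector search call; sleeps to mimic I/O."""
    time.sleep(0.002)
    return [f"result for {query}"]


def timed_call(search, query: str) -> float:
    """Run one search and return its latency in milliseconds."""
    start = time.perf_counter()
    search(query)
    return (time.perf_counter() - start) * 1000.0


def run_load_test(search, total_queries: int, workers: int) -> dict:
    """Fire total_queries searches from a pool of workers and summarise."""
    queries = [f"query-{i}" for i in range(total_queries)]
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(lambda q: timed_call(search, q), queries))
    elapsed = time.perf_counter() - started
    return {
        "queries": len(latencies),
        "queries_per_minute": len(latencies) / elapsed * 60.0,
        "avg_latency_ms": sum(latencies) / len(latencies),
        "max_latency_ms": max(latencies),
    }


report = run_load_test(stub_search, total_queries=200, workers=20)
print(report)
```

Each worker thread plays the role of one concurrent agent. Raising `workers` and `total_queries` pushes the simulated load toward the >1,000 QPM target named in the table.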


Transition to Qdrant

At Lyzr Agent Studio, users can easily connect to a variety of vector databases to support advanced RAG (Retrieval-Augmented Generation) and agent workflows.


 The platform offers native integrations with leading vector databases, enabling efficient storage, retrieval, and querying of embeddings.


But after evaluating scalability challenges with earlier implementations, Qdrant was chosen as the core vector database to power storage and search capabilities within Lyzr Agent Studio, delivering greater reliability and faster performance at scale.


Why Qdrant?

  • Sustained performance at scale: Qdrant demonstrated the ability to maintain high request throughput and consistently low query latency, even under demanding workloads generated by hundreds of concurrent agents.
  • Optimized vector indexing and retrieval: Leveraging HNSW-based indexing, Qdrant enabled faster ingestion of data and quicker nearest neighbor searches, while supporting incremental updates without disrupting live operations.
  • Resource and cost optimization: By requiring fewer compute and memory resources to achieve similar or improved performance compared to earlier solutions, Qdrant offered a more cost-effective scaling strategy as data volume and agent demands increased.
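To make the indexing and update points concrete, the sketch below builds the request bodies for three Qdrant REST operations: creating a collection with an HNSW configuration, upserting points, and searching. Nothing is sent over the network. The collection name, the HNSW values (`m`, `ef_construct`), and the sample point are assumptions for illustration; the post does not state the settings Lyzr uses. The 768-dimension size comes from the embedding model mentioned earlier.

```python
COLLECTION = "agent_knowledge_base"  # hypothetical collection name


def create_collection_request(dim: int) -> tuple[str, str, dict]:
    """PUT /collections/{name}: vector size, distance metric, HNSW settings."""
    body = {
        "vectors": {"size": dim, "distance": "Cosine"},
        "hnsw_config": {"m": 16, "ef_construct": 100},  # illustrative values
    }
    return "PUT", f"/collections/{COLLECTION}", body


def upsert_request(points: list[dict]) -> tuple[str, str, dict]:
    """PUT /collections/{name}/points: add or overwrite points by id."""
    return "PUT", f"/collections/{COLLECTION}/points", {"points": points}


def search_request(vector: list[float], limit: int) -> tuple[str, str, dict]:
    """POST /collections/{name}/points/search: nearest-neighbour query."""
    body = {"vector": vector, "limit": limit, "with_payload": True}
    return "POST", f"/collections/{COLLECTION}/points/search", body


requests_to_send = [
    create_collection_request(dim=768),
    upsert_request(
        [{"id": 1, "vector": [0.1] * 768, "payload": {"title": "example entry"}}]
    ),
    search_request([0.1] * 768, limit=3),
]
for method, path, body in requests_to_send:
    print(method, path, sorted(body))
```

Because an upsert targets an existing collection, new or changed entries can be written without recreating the collection, which is what makes incremental updates possible.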

Use Case 1: NTT Data – Change Request Management Agent

In an enterprise environment with NTT Data, the Change Request Management Agent was initially powered by Cosmos DB within Azure infrastructure. 

Tech Stack: 

  1. Hosting: Azure Cloud (enterprise-grade environment)
  2. Vector Database: Initially Cosmos DB with Azure, later migrated to Qdrant for improved vector search performance
  3. Agents: Change Request Management Agent (deployed for automating and managing IT change requests) 
  4. Framework: React for the frontend, FastAPI for the backend.

While Cosmos DB provided ease of integration with the existing Azure ecosystem, it exhibited limitations in vector search accuracy. The available indexing techniques restricted the agent’s ability to surface highly relevant information for complex change request workflows.

The transition to Qdrant addressed these challenges. With Qdrant’s advanced vector indexing capabilities and support for scalable deployments, the agent achieved:

  • Higher search accuracy on change request records.
  • Improved reliability across large volumes of historical and live data.
  • Simplified horizontal scaling, enabling smooth handling of increased agent activity as project demands grew.
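The post does not say how search accuracy was measured. One common way to quantify it is recall@k: the fraction of known-relevant records that appear in the top k results. The sketch below assumes a small hand-labelled set; the record IDs and both rankings are made up for illustration and are not NTT Data results.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant items that appear in the first k retrieved items."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


# Hypothetical labelled query: two change requests are known to be relevant.
relevant = {"CR-101", "CR-205"}
ranking_a = ["CR-330", "CR-101", "CR-412", "CR-205"]
ranking_b = ["CR-101", "CR-205", "CR-330", "CR-412"]

print(recall_at_k(ranking_a, relevant, k=2))
print(recall_at_k(ranking_b, relevant, k=2))
```

Here `ranking_b` places both relevant records in the top two, so it scores higher than `ranking_a`. Averaging this score over many labelled queries gives a single number for comparing two vector search backends.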

Use Case 2: NPD – Customer Support Agent

NPD deployed a customer support agent using Lyzr Agent Studio to automate troubleshooting assistance on its website. The agent leverages a knowledge base containing website URLs, troubleshooting documents, and product-specific resources.

How it works:

Users submit troubleshooting questions, and the agent responds with precise answers while directing them to the correct URL for the product or documentation they need.

Tech Stack: 

  1. Agents
    • Customer support agents: guide users to the right product URL for their needs and troubleshoot issues with existing products
    • 6 agents for each website
  2. Knowledge Base
    • 6 knowledge bases for 6 NPD websites
    • Entire website scraped (URL, content)
  3. Vector DB: Qdrant
  4. Hosting
    • Frontend: NPD’s own frontend
    • Backend: Lyzr Studio (direct API call), using a router to hide the API key
  5. Modules
    • Knowledge Base
    • Short Term Memory
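Hiding the API key behind a router is a common pattern: the browser calls a small server-side route, and that route attaches the secret before forwarding the request. A minimal sketch of the forwarding step follows. The upstream URL, header name, environment variable, and payload shape are all placeholders chosen for illustration; they are not Lyzr's actual API.

```python
import json
import os
import urllib.request

UPSTREAM_URL = "https://agent-api.example.com/v1/chat"  # placeholder URL
API_KEY_ENV = "AGENT_API_KEY"  # placeholder environment variable name


def build_upstream_request(client_payload: dict) -> urllib.request.Request:
    """Wrap a browser payload in a server-side request that carries the key."""
    api_key = os.environ.get(API_KEY_ENV, "")
    if not api_key:
        raise RuntimeError(f"{API_KEY_ENV} is not set on the server")
    # Forward only the field the route expects; the key never goes in the body.
    body = {"message": str(client_payload.get("message", ""))}
    return urllib.request.Request(
        UPSTREAM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Demo value so the sketch runs locally; a real key would come from the host.
os.environ.setdefault(API_KEY_ENV, "demo-key-for-local-testing")
request = build_upstream_request({"message": "How do I reset my device?"})
print(request.get_method(), request.full_url)
```

The request object is only built here, not sent. Passing it to `urllib.request.urlopen` from the server-side route would perform the call, and the browser would only ever see the route's own URL.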


To meet the demands for fast, accurate query handling, the agent required a vector database optimized for contextual search and low-latency performance. Lyzr Agent Studio integrated Qdrant as the vector backend, delivering:

  • High accuracy in matching user queries to the right URLs and support content.
  • Fast vector search across thousands of knowledge base entries.
  • Seamless scalability, allowing the agent to serve increasing user traffic without performance degradation.
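Conceptually, matching a question to the right URL is a nearest-neighbour lookup over embedded knowledge base entries, each carrying its URL as metadata. The sketch below uses brute-force cosine similarity over three hand-made 3-dimensional vectors, purely for illustration. A production system would use a real embedding model and the vector database's index rather than a linear scan. All URLs and vectors here are invented.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy knowledge base: each scraped page keeps its URL next to its vector.
KNOWLEDGE_BASE = [
    {"url": "https://example.com/products/overview", "vector": [0.9, 0.1, 0.0]},
    {"url": "https://example.com/support/reset", "vector": [0.1, 0.9, 0.1]},
    {"url": "https://example.com/docs/warranty", "vector": [0.0, 0.2, 0.9]},
]


def best_url(query_vector: list[float]) -> str:
    """Return the URL of the entry most similar to the query vector."""
    best = max(KNOWLEDGE_BASE, key=lambda e: cosine(query_vector, e["vector"]))
    return best["url"]


# A query vector that points mostly along the second axis.
print(best_url([0.2, 0.8, 0.0]))
```

The query vector is closest to the second entry, so the lookup returns the support URL. Storing the URL as payload alongside each vector is what lets the agent hand back a link together with its answer.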

Performance Gains and Results with Qdrant 

The transition to Qdrant resulted in measurable improvements across multiple operational parameters for Lyzr Agent Studio.

1. Performance Comparison

| Metric | Weaviate | Pinecone | Qdrant |
| --- | --- | --- | --- |
| Avg Query Latency (at 100 agents) | 300-500ms | 250-450ms | 20-50ms (P99) |
| Indexing Time (2,500+ entries) | ~3 hours | ~2.5 hours | ~1.5 hours |
| Query Throughput (QPS) | ~80 QPS | ~100 QPS | 250+ QPS |
| Resource Utilization (CPU/Memory) | High | Medium-High | Low-Medium |
| Horizontal Scalability | Moderate | Moderate | Highly Scalable |
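The relative gains implied by the comparison can be worked out directly from the throughput and indexing rows. The sketch below does only that arithmetic. Since the source figures are approximate ("~") or lower bounds ("250+"), the results should be read as rough ratios, not precise measurements.

```python
# Figures copied from the comparison table (approximate values).
throughput_qps = {"Weaviate": 80, "Pinecone": 100, "Qdrant": 250}
indexing_hours = {"Weaviate": 3.0, "Pinecone": 2.5, "Qdrant": 1.5}

speedups = {}
savings = {}
for name in ("Weaviate", "Pinecone"):
    # Throughput ratio: how many times more queries per second Qdrant handled.
    speedups[name] = throughput_qps["Qdrant"] / throughput_qps[name]
    # Fraction of indexing time saved relative to the earlier database.
    savings[name] = 1 - indexing_hours["Qdrant"] / indexing_hours[name]
    print(
        f"vs {name}: {speedups[name]:.1f}x throughput, "
        f"{savings[name]:.0%} less indexing time"
    )
```

By these figures, Qdrant delivered roughly 3.1 times Weaviate's throughput and 2.5 times Pinecone's, while cutting indexing time by about 50% and 40% respectively.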

2. Latency Improvements

  • Achieved P99 latency of 20ms even with over 1 million vectors in production.
  • Query performance remained consistent during peak workloads, reducing prior spikes in response times experienced with other vector databases.
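For readers less familiar with the metric, P99 latency is the value that 99% of requests finish at or under. The sketch below computes it with the nearest-rank method, which is one of several percentile definitions. The latency samples are made up to show how a percentile differs from the average and the maximum.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample that has at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up latencies in ms: 98 fast requests, one slower, one outlier.
latencies_ms = [12.0] * 98 + [18.0, 95.0]

print("P50:", percentile(latencies_ms, 50))
print("P99:", percentile(latencies_ms, 99))
print("max:", percentile(latencies_ms, 100))
print("avg:", sum(latencies_ms) / len(latencies_ms))
```

In this sample the single 95 ms outlier sets the maximum but not the P99, which is why P99 is a common way to describe tail latency without letting one stray request dominate the figure.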

3. Throughput and Scaling Success

  • Sustained handling of 250+ queries per second across distributed agents.
  • System stability remained intact under concurrent load from 100+ live knowledge search agents.

4. Cost Reductions

  • Reduced compute and storage requirements led to an overall 30% decrease in infrastructure costs compared to earlier deployments.
  • Improved index efficiency allowed for a leaner cluster footprint without sacrificing performance.

5. Stability & Reliability

At over 1,000 queries per minute, Weaviate can slow down at times and Pinecone stays mostly stable with occasional spikes, while Qdrant stays consistently stable. At 100+ live agents, both Weaviate and Pinecone face some delays, whereas Qdrant runs smoothly.

Wrapping Up

Using Qdrant has strengthened how vector search supports agents under heavy workloads. Faster search, greater stability, and easier scaling have all contributed to smoother agent performance across real-world applications.

The key takeaway? The right vector database is critical to keeping agents fast and reliable as data and usage grow.

With Qdrant as its core vector database, Lyzr Agent Studio continues to help teams build and operate AI agents at scale.

Ready to build? Try Lyzr Agent Studio today.
