🏆 Lyzr wins the Accenture Tech Next Challenge 2024

What is an AI Agent? Why are they gaining popularity?

What is an AI Agent? Why are they gaining popularity?

Table of Contents

September 6, 2024

Estimated reading time: 19 minutes

What is an AI Agent?

An AI Agent is a sophisticated collection of functions that software code can perform by interacting with large language models (LLMs).

These functions can range from simple operations like chat or search to complex processes such as conducting deep research on a topic and writing an SEO-optimized blog post.

AI Agents represent a significant advancement in the field of artificial intelligence, enabling more dynamic and versatile applications of LLMs.

Screenshot 2024 09 06 at 5.44.07 PM
What is an AI Agent? Why are they gaining popularity? 10

When you examine the core of AI Agents, you’ll encounter what we call in the AI world a term known as ‘function calling’.

OpenAI, the creator of ChatGPT, launched the function calling feature in 2023, which quickly became a favorite among developers due to its versatility and power.

To fully grasp the importance of an agent, it’s crucial to first understand function calling and the agent function. The agent function is essential for mapping information to specific actions, thereby showcasing the intelligence and reasoning capabilities of the agent.

Understanding Function Calling

Function calling is a fundamental block of code that accepts an input, passes it to an LLM like GPT-4 or Google Gemini, and returns the output generated by the LLM.

This process forms the backbone of how AI Agents operate and interact with language models. Function calling consists of three key components:

  1. System Message
  2. User Message
  3. Assistant Message
Screenshot 2024 09 06 at 5.44.45 PM
What is an AI Agent? Why are they gaining popularity? 11

Let’s break down each component:

System Message: This is the core instruction that the LLM uses to perform the assigned task. The System Message is often used to set a persona for the LLM, such as GPT-4. This allows the LLM to contextualize the action it is about to take, ensuring that its responses align with the desired role or expertise.

User Message: This is the input that the user provides to the function. It can be a question, a prompt, or any form of instruction that guides the LLM’s response.

Assistant Message: This is the output that the LLM generates based on the System Message and User Message. It represents the AI’s response or completion of the given task.

To illustrate how function calling works, let’s examine an example where we ask an LLM to write a tweet about New York City:

System Message: You are an intelligent AI assistant skilled at writing engaging tweets.

User Message: Write a tweet about why New York City is the best city in the world.

For this input, the assistant message (output) might be:

Assistant Message: New York City isn’t just a place—it’s a vibe. From the never-ending hustle to the limitless opportunities, every corner tells a story. It’s where dreams are born, diversity thrives, and the energy is unmatched. NYC isn’t just the best city in the world; it’s the center of it. 🌆 #NYC #CityOfDreams

This example demonstrates the typical interaction that occurs behind the scenes with ChatGPT. When you pose a query to ChatGPT, it processes your input (the user message) along with a predefined system message (which users don’t have access to) and generates an output (the assistant message).

Some of the most common applications of simple function calling include chat interfaces and text summarization.

These applications demonstrate the basic capabilities of LLMs when used with function calling. However, while function calling may seem straightforward, its simplicity can become a limitation when attempting to perform more complex tasks.

Issues with function calling

Platforms like Langchain are built on the concept of function calling. However, when you want to perform a task more complex than simple summarization, challenges arise.

Consider our previous example: if you want to review the generated tweet, provide feedback to the LLM, and have it regenerate the tweet incorporating your feedback, the process becomes more complicated.

The only way to accomplish this is by building a series of functions and chaining them together. This concept is referred to as function-chaining or prompt-chaining.

Model based reflex agents offer a more sophisticated alternative by maintaining an internal model of the environment, enhancing decision-making in complex scenarios.

Screenshot 2024 09 06 at 5.45.44 PM
What is an AI Agent? Why are they gaining popularity? 12

In some cases, such as when implementing advanced data handling capabilities, a function may not need to interact with an LLM at all.

Instead, it might require a program that performs a task by calling a machine learning algorithm or executing hard-coded ‘if-else’ logic. For example, if you want to ensure that there are no ‘toxic’ or ‘objectionable’ words in the tweet, you would need to send the output of the LLM to a machine learning model trained to detect toxic content.

The results would then be used to decide the next course of action, potentially leading to further processing or regeneration of the content.

Screenshot 2024 09 06 at 5.46.19 PM
What is an AI Agent? Why are they gaining popularity? 13

As tasks continue to grow in complexity, the number of functions increases, and the chain becomes longer. This introduces more complexity, more potential points of failure, and becomes a troubleshooting nightmare.

Many developers, including us at Lyzr, started building these function-chain models in 2023. Through this process, we realized that several of these functions could be templatized and re-used across different applications.

This realization led to the creation of AI Agents, which aim to address the limitations and complexities of function chaining.

How does an agent solve the issue of function chaining?

Now that you understand function calling, it’s easier to grasp the concept of an AI Agent. An Agent still has function calling at its core, which performs the assigned tasks, but it comes equipped with an array of pre-built functions (we refer to these as features, while others may call them modules or agent functions) that can be enabled and utilized for various tasks.

Additionally, model-based agents utilize an internal model to evaluate potential outcomes and make informed decisions, distinguishing them from simpler reflex agents.

Over several months, developers at Lyzr and contributors within the open-source community have been busy identifying and building many of these agent features.

Below are some of the most popular and useful ones. Let’s explore these features to understand the ‘agentic automation’ concept, which we’ll illustrate by examining the backend of an AI SDR agent. (AI SDRs like Jazon by Lyzr have become popular due to the successful application of Generative AI to value-producing functions in sales and marketing.)

Key Features of AI Agents

Learning agents are systems that evolve through continuous learning from prior experiences in order to enhance their performance. These agents adapt their behavior based on feedback and sensory input, and are commonly employed in unpredictable environments, utilizing deep learning techniques to inform personalized services in applications like e-commerce and streaming platforms.

what is ai agent 1 1
  1. Short Term Memory: This feature stores memory in-session during the performance of a task. Its most popular use case is in chatbots handling customer service requests, allowing the AI to maintain context within a single conversation.
  2. Long Term Memory: This feature summarizes the short-term memory of each session and stores it as persistent memory. This gives the agent context each time it interacts with an LLM. It’s important to remember that LLMs are stateless and do not innately remember your previous interactions. While ChatGPT appears to remember previous interactions, this is because it’s an application with built-in memory at the application layer. The APIs that developers use to build agents don’t have this capability by default. If you don’t want to build this feature into the agent, you can simply call 3rd party services like GetZep, which is a dedicated memory handling service.
  3. RAG (Retrieval Augmented Generation): RAG stands for retrieval augmented generation. This feature allows you to send ‘tribal knowledge’ about you, your product, or a concept that only you may know and that LLMs may not have in their dataset. The most common use case for RAG is search. Imagine a Perplexity-style search engine that works only with your data. RAG is how you supply your data, which allows the agent to search for the content you asked for and retrieve relevant information. An example of this is the ‘SuperPhil‘ application, a RAG-powered search agent that can answer any question an enterprise CIO might have by looking into Phil Fersht’s (Founder of HFS Research, a leading research company) blog posts. You can try this with your own data using Lyzr’s Knowledge Search demo app.
  4. Chat: Since LLMs are stateless, building a chat application requires you to add a temporary memory unit that stores the chats and sends them to the LLM to provide context during every interaction. Automating this module helps you convert any agent into a chat agent by simply enabling this feature (in UI terms, think of just clicking an ‘ON’ button). This feature is especially important when building an agent for customer service or lead generation use cases. You can try this agent feature with your own data using Lyzr’s Chatbot demo app.
  5. Tool Calling: Tool calling gives superpowers to your agent because it allows you to connect with any API or call custom functions. For example, if the tweet we generated earlier needs to be posted to Twitter (now called X), you can simply call the Twitter API to post the tweet automatically.
  6. Human-in-Loop: This important feature allows humans to maintain control over these agents. In the tweet example, if you want to moderate the tweet before posting, enabling the human-in-loop feature allows you to verify the agent’s output and take further action accordingly.
  7. Agent Learning from Human Feedback (ALHF): This is a modified version of the popular RLHF (Reinforcement Learning from Human Feedback) concept used in machine learning. ALHF accepts human feedback on a generated output, processes the feedback, plans improvements, and regenerates an output that adheres to the feedback.
    In our tweet example, if you ask the agent to rewrite the tweet to mention ‘Multi-cultural’, the agent will incorporate this in the new output.

Original Tweet Generated by the Agent: New York City isn’t just a place—it’s a vibe. From the never-ending hustle to the limitless opportunities, every corner tells a story. It’s where dreams are born, diversity thrives, and the energy is unmatched. NYC isn’t just the best city in the world; it’s the center of it. 🌆 #NYC #CityOfDreamsUser
 

Feedback: Mention the financial capital and fashion capital aspects.

Revised Tweet Generated by the Agent: New York City is where the world’s heartbeat syncs. As the financial capital, it drives global markets; as the fashion capital, it sets trends that inspire the world. In NYC, power and creativity collide, making it the ultimate city of ambition and style. 🌆💼👗 #NYC #FinancialCapital #FashionCapital

  1. Agent Learning from AI Feedback (ALAIF): Similar to ALHF, the ALAIF feature allows the agent to learn from its own performance metrics. For example, if the agent is tasked with writing engaging tweets about your brand, ALAIF allows the agent to track the best-performing tweets and add them to an ‘example set’ following a FIFO (First In First Out) model. The FIFO approach helps the agent continuously adjust its output quality to user preferences. ALAIF plays a major role in the ‘self-learning’ capability of the agent. Additionally, with versioning, it allows you to revert to any previous version of the agent you prefer.
  2. Input Guardrails: You wouldn’t want your agents to pass critical and sensitive information to the LLMs when performing a task. That’s where input guardrails come in. With this feature, you can enable PII (Personally Identifiable Information) redaction or define specific guardrails you want the agent to follow.
  3. Output Guardrails: Similar to input guardrails, output guardrails ensure the agent produces desired output. Lyzr agents come with features like ‘Toxicity Controller‘, an ML model which we open-sourced and published on HuggingFace. This model checks the output for toxic and objectionable language. If detected, it creates a guardrail, adds it to the guardrails list, and prompts the agent to regenerate the output.
  4. Prompt Enhancer: More than 90% of cases where LLMs don’t meet customer expectations are due to poor prompting. While the core system prompts of LLMs are improving and becoming more forgiving of bad prompts, a good prompt will further enhance the agent’s output. While there are some excellent prompt generators like Anthropic’s Prompt Generator, it’s highly beneficial if the agent has this as a built-in feature. Lyzr’s agents come with an automatic prompt enhancer, which we launched as a free tool called MagicPrompts, currently used by over 1,500 active users.
  5. Self-Reflection: This feature is crucial for helping the agent produce high-quality output consistently. Self-Reflection is much like the introspection that we humans do. With reflection, the agent will review its output against all the input conditions and verify if the output was generated as per instructions and guardrails. You can define the number of times you want the agent to reflect on its output. While a higher number of reflections results in increased LLM usage (and thus cost), it also increases quality, consistency, and relevance. A variation of self-reflection is cross-reflection, where you use a different LLM to do the review.
Screenshot 2024 09 06 at 5.48.20 PM
What is an AI Agent? Why are they gaining popularity? 14

13. Humanizer: If you’re looking to modify the agent’s output to make it sound more human-like, then the Humanizer feature handles that. This feature is often less used, mostly appearing in email generator agents where a more natural, conversational tone is desired.

14. LLM Selector: A recent paper called RouteLLM highlights the benefits of choosing the right LLM based on the task. This approach can significantly reduce the cost of using LLMs. Agents can be enabled with this feature by default, saving developers time from writing another routing function.

  1. Fact Checker: For agents that generate news articles by analyzing various news sources, fact-checking is critical. The Fact Checker is an emerging feature that allows agents to verify facts before publishing the output, ensuring accuracy and reliability.
  2. Output Eval: If you have certain test cases, you can pass them to the agent, allowing it to evaluate the output results against these test cases. If the results are inconsistent, the agent will regenerate the output to satisfy the test cases, ensuring quality and adherence to specified criteria.

While this may sound exhaustive, these various features transform a simple ‘function calling’ into a very capable, reliable AI agent.

Any function that has repeat applicability could be added as a feature.

This not only saves a ton of development time but also enables agents to have interesting combinations of capabilities, which may yield innovative and powerful applications.

Types of Agents: Simple Reflex Agents

Goal-based agents are AI systems with advanced reasoning capabilities that evaluate environmental data and adapt their decision-making processes to achieve specific objectives. They highlight the flexibility of these agents in choosing efficient pathways for complex tasks, especially in fields like robotics and natural language processing.

Over the past few months, three distinct types of AI agents have emerged, each with its own strengths and applications:

Screenshot 2024 09 06 at 5.53.27 PM
What is an AI Agent? Why are they gaining popularity? 15
  1. Building Block Agents: These versatile agents serve as the foundation for creating complex agentic automation workflows. Popular frameworks like Lyzr (The Enterprise Agent Framework), Langchain (The Popular Open-Source Framework), and CrewAI (A Langchain-based agent framework) offer building block agents. Their flexibility allows them to automate a wide range of tasks, from simple chatbots to intricate workflows, making them invaluable tools for developers and businesses alike.
  1. Persona Agents: This category has rapidly gained popularity, featuring agents designed to embody specific roles or personas. Examples include SDR (Sales Development Representative) agents, Marketing agents, and Legal agents. Harvey, a well-known legal agent, and Jazon, a popular AI SDR, exemplify this category. The appeal of persona agents lies in their specialized capabilities and, in many cases, their availability as Software-as-a-Service (SaaS) solutions. This makes them accessible and immediately useful for businesses.

    However, it’s worth noting that most of these agents (with exceptions like Lyzr’s role agents that run in the customer’s cloud) operate as “black boxes.” This means that customers cannot access or modify the system prompts and backend logic, which can limit customization and transparency.

    An interesting approach in this space comes from Ema, which is developing a universal agent concept capable of handling various tasks for an organization, potentially offering greater flexibility within the persona agent paradigm.
  2. Task Automation Agents: This category currently holds the largest market share among AI agents. These agents are designed to perform specific, often singular tasks efficiently. Examples include:
  • Chatbots for customer interaction
  • RAG (Retrieval Augmented Generation) powered search engines
  • Text summarizers
  • Text-to-SQL converters
  • Other single-task automation tools

The popularity of these agents stems from their focused functionality and ease of integration into existing workflows.

A Look at Various Agent Frameworks in Multi Agent Systems

The past 12-18 months have seen the emergence of several notable agent frameworks. Let’s examine some of the most popular and actively growing platforms:

Autonomous agents play a crucial role in the workforce by assisting human employees with various tasks, enhancing productivity and job satisfaction. Training and upskilling workers are essential to effectively integrate these agents into workflows.

  1. Langchain:
  • Originally an LLM application development platform
  • Launched LangGraph in January 2024, a dedicated agent library
  • Known for its integration capabilities with the parent Langchain framework and robust routing features
  • Key Metric: 100K+ contributions by developers to the Langchain open-source framework
  • Notable Customers: CommandBar, Adyen, New Computer
  1. Lyzr:
  • Launched its commercial framework for enterprise customers in September 2023
  • Positioning itself as the leading enterprise alternative to Langchain
  • Strengths include a rich set of agent features, enabling rapid development of sophisticated agents
  • Offers one-click deployment and extended deployment support through Lyzr Professional Services
  • Recently introduced the Agent API, revolutionizing how developers can build on Lyzr
  • Unique in offering pre-built agents like Jazon (AI SDR) and Skott (AI Marketer)
  • Key Metric: 825,000 man-hours saved for clients to date
  • Notable Customers: HFS Research, SurePeople, Evalueserve, Kastle
  1. LlamaIndex:
  • Originated as a RAG framework and remains a strong contender in that space
  • Recently expanded into the agent framework domain
  1. Flowise:
  • Serves as a visual alternative to Langchain and LlamaIndex
  • Allows users to build Langchain or LlamaIndex workflows using an intuitive drag-and-drop interface
  • Ideal for those who prefer visual programming over traditional coding
  1. AutoGen:
  • Considered one of the first true agent frameworks, demonstrating the potential of the agentic approach
  • Quickly gained momentum with open-source contributions
  • Questions remain about its enterprise readiness, which may affect its adoption in corporate environments
  1. CrewAI:
  • Gained viral popularity in early 2024
  • Known for its simplicity and enjoyable development experience
  • Allows easy agent creation by inputting persona details, including backstory and description
  • Facilitates the assembly of multiple agents into workflows
  1. Gumloop:
  • A Y Combinator-backed company offering an interesting approach to workflow automation
  • While not strictly an “agent framework,” its method of automating custom workflows is agentic in nature
  • Can be viewed as the generative AI equivalent of Zapier, bridging the gap between traditional automation and AI-driven processes
  1. Ema:
  • Like Lyzr, focuses on the enterprise market
  • Offers a unique approach to agent building with its universal agent concept
  • Acts as a frontend for multiple backend agents, allowing organizations to run multiple Emas to automate various workflows
  • Key Metric: Raised over $60 million in funding from prominent venture capital firms
  1. Wordware:
  • An Integrated Development Environment (IDE) built specifically for developing agents using English-based programming
  • Represents one of the most innovative approaches to agent development
  • While powerful, it does come with a learning curve as developers adapt to its unique programming paradigm

Agent Use Cases

As AI agents continue to evolve and improve, their applications across various industries are expanding. Here are some of the most popular and common use cases we’ve observed being built on these agent frameworks:

AI agents can be designed to perform specific tasks tailored to users’ needs, such as booking appointments or accessing account details.

  1. Automating Sales Outreach:
  • AI SDRs (Sales Development Representatives) have become extremely popular
  • Represent a proven use case for generative AI in sales and marketing
  • Can handle initial customer interactions, qualify leads, and schedule appointments
  1. Blog Generation:
  • Agents can research data for blog articles
  • Write high-quality, SEO-optimized blog posts
  • Automatically publish content to various platforms
  • This use case significantly speeds up content creation processes
  1. Customer Service Automation:
  • Handle incoming customer queries
  • Respond with appropriate answers based on company knowledge bases
  • Close tickets or escalate to human agents when necessary
  • Improve response times and customer satisfaction while reducing workload on human agents
  1. Document Review:
  • Review provided documents based on specific instructions
  • Highlight issues, inconsistencies, or areas of concern
  • Particularly useful in legal, compliance, and contract management scenarios
  1. Product Recommendation:
  • Traditional recommendation systems were primarily text-based
  • With generative AI-powered agents, businesses can build multi-modal recommendation engines
  • These can consider text, images, user behavior, and other data points to make more accurate and personalized recommendations
  1. Data Analysis:
  • Process natural language queries from users
  • Convert these queries into SQL or other database query languages
  • Run queries on structured data
  • Return actionable insights in a format easily understood by non-technical users
  • This use case bridges the gap between complex data structures and business users who need quick insights
Screenshot 2024 09 06 at 5.57.24 PM
What is an AI Agent? Why are they gaining popularity? 16

Natural language processing is essential for AI agents to understand and respond to human language, making them valuable for tasks across various sectors such as customer service and healthcare.

These use cases represent just a fraction of the potential applications for AI agents. As the technology continues to advance and more businesses adopt these tools, we can expect to see an even broader range of innovative applications across various industries and domains.

The rapid development of AI agent frameworks and their diverse applications highlight the transformative potential of this technology.

From streamlining business processes to enhancing customer experiences, AI agents are poised to play a crucial role in shaping the future of how we interact with and leverage artificial intelligence in our daily lives and work.

Lyzr Introduces the 3-Month Pilot Program

While the AI Agents is not new to Lyzr team who has been working on this since 2023, it is still a new technology for a lot of organizations including large enterprises and small businesses.

Hence Lyzr partnered with Amazon Web Services and Google Cloud in sponsoring a 3-month pilot program that will help customers build and try the AI Agents in production, and then decide to implement them full time for their business.

Screenshot 2024 09 06 at 5.58.20 PM
What is an AI Agent? Why are they gaining popularity? 17

Lyzr’s 3-month pilot program includes,

  • A discovery session to identify potential use cases
  • Development and customization of 2 agents
  • Deployment and 24*7 technical support for 2 agents
  • Additional integrations to customer’s tech stack

Reach out to us and try these powerful AI agents or build custom AI agents to automate your business.

What’s your Reaction?
+1
4
+1
0
+1
1
+1
0
+1
0
+1
0
+1
0
Book A Demo: Click Here
Join our Slack: Click Here
Link to our GitHub: Click Here
Share this:
Share
Similar Posts
Need a demo?
Speak to the founding team.
Launch prototypes in minutes. Go production in hours.
No more chains. No more building blocks.