Adobe enhances developer productivity using HAQM Bedrock Knowledge Bases
Adobe Inc. excels in providing a comprehensive suite of creative tools that empower artists, designers, and developers across various digital disciplines. Their product landscape is the backbone of countless creative projects worldwide, ranging from web design and photo editing to vector graphics and video production.
Adobe’s internal developers use a vast array of wiki pages, software guidelines, and troubleshooting guides. Recognizing the challenge developers faced in efficiently finding the right information for troubleshooting, software upgrades, and more, Adobe’s Developer Platform team sought to build a centralized system. This led to the initiative Unified Support, designed to help thousands of the company’s internal developers get immediate answers to questions from a centralized place and reduce time and cost spent on developer support. For instance, a developer setting up a continuous integration and delivery (CI/CD) pipeline in a new AWS Region or running a pipeline on a dev branch can quickly access Adobe-specific guidelines and best practices through this centralized system.
The initial prototype for Adobe’s Unified Support provided valuable insights and confirmed the potential of the approach. This early phase highlighted key areas requiring further development to operate effectively at Adobe’s scale, including addressing scalability needs, simplifying resource onboarding, improving content synchronization mechanisms, and optimizing infrastructure efficiency. Building on these learnings, improving retrieval precision emerged as the next critical step.
To address these challenges, Adobe partnered with the AWS Generative AI Innovation Center, using HAQM Bedrock Knowledge Bases and the Vector Engine for HAQM OpenSearch Serverless. This solution dramatically improved their developer support system, resulting in a 20% increase in retrieval accuracy. Metadata filtering empowers developers to fine-tune their search, helping them surface more relevant answers across complex, multi-domain knowledge sources. This improvement not only enhanced the developer experience but also contributed to reduced support costs.
In this post, we discuss the details of this solution and how Adobe enhances their developer productivity.
Solution overview
Our project aimed to address two key objectives:
- Document retrieval engine enhancement – We developed a robust system to improve search result accuracy for Adobe developers. This involved creating a pipeline for data ingestion, preprocessing, metadata extraction, and indexing in a vector database. We evaluated retrieval performance against Adobe’s ground truth data to produce high-quality, domain-specific results.
- Scalable, automated deployment – To support Unified Support across Adobe, we designed a reusable blueprint for deployment. This solution accommodates large-scale data ingestion of various types and offers flexible configurations, including embedding model selection and chunk size adjustment.
Using HAQM Bedrock Knowledge Bases, we created a customized, fully managed solution that improved the retrieval effectiveness. Key achievements include a 20% increase in accuracy metrics for document retrieval, seamless document ingestion and change synchronization, and enhanced scalability to support thousands of Adobe developers. This solution provides a foundation for improved developer support and scalable deployment across Adobe’s teams. The following diagram illustrates the solution architecture.
Let’s take a closer look at our solution:
- HAQM Bedrock Knowledge Bases index – The backbone of our system is HAQM Bedrock Knowledge Bases. Data is indexed through the following stages:
- Data ingestion – We start by pulling data from HAQM Simple Storage Service (HAQM S3) buckets. This could be anything from resolutions of past issues to wiki pages.
- Chunking – HAQM Bedrock Knowledge Bases breaks data down into smaller pieces, or chunks, defining the specific units of information that can be retrieved. This chunking process is configurable, allowing for optimization based on the specific needs of the business.
- Vectorization – Each chunk is passed through an embedding model (in this case, HAQM Titan V2 on HAQM Bedrock), creating a 1,024-dimension numerical vector. This vector represents the semantic meaning of the chunk, allowing for similarity searches (see the embedding sketch after this list).
- Storage – These vectors are stored in the HAQM OpenSearch Serverless vector database, creating a searchable repository of information.
- Runtime – When a user poses a question, our system completes the following steps:
- Query vectorization – With the HAQM Bedrock Knowledge Bases Retrieve API, the user’s question is automatically embedded using the same embedding model used for the chunks during data ingestion.
- Similarity search and retrieval – The system retrieves the most relevant chunks from the vector database based on their similarity scores to the query.
- Ranking and presentation – The corresponding documents are ranked based on the semantic similarity of their most relevant chunks to the query, and the top-ranked information is presented to the user.
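To make the vectorization stage concrete, the following is a minimal sketch that embeds a single chunk with the Titan Text Embeddings V2 model through the Bedrock runtime API. The Region and chunk text are illustrative assumptions, not Adobe's production code.

```python
# A minimal sketch of the vectorization stage, assuming the public
# Titan Text Embeddings V2 model ID. The Region and chunk text are
# placeholders.
import json

import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

chunk = "Guidelines for setting up a CI/CD pipeline in a new Region."
response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": chunk, "dimensions": 1024}),
)

# The response body contains the 1,024-dimension vector that gets indexed
# in the OpenSearch Serverless collection.
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # 1024
```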
Multi-tenancy through metadata filtering
As developers, we often find ourselves seeking help across various domains. Whether it’s tackling CI/CD issues, setting up project environments, or adopting new libraries, the landscape of developer challenges is vast and varied. Sometimes, our questions even span multiple domains, making it crucial to have a system that retrieves the right information from the right sources. Metadata filtering empowers developers to retrieve not just semantically relevant information, but a well-defined subset of that information based on specific criteria. By applying filters to retrievals, developers can narrow the search results to a limited set of documents, improving the relevancy of the search.
To use this feature, each source data file in the S3 bucket is accompanied by a corresponding metadata file. These metadata files use the same base name as the source file, with a .metadata.json suffix, and include relevant attributes (such as domain, year, or type) to support multi-tenancy and fine-grained filtering in OpenSearch Service. The following is an illustrative example of a metadata file (the attribute names and values shown are hypothetical):
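```json
{
  "metadataAttributes": {
    "domain": "ci-cd",
    "year": 2024,
    "type": "troubleshooting-guide"
  }
}
```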
Retrieve API
The Retrieve API allows querying a knowledge base to retrieve relevant information. You can use it as follows:
- Send a POST request to /knowledgebases/knowledgeBaseId/retrieve.
- Include a JSON body with the following:
- retrievalQuery – Contains the text query.
- retrievalConfiguration – Specifies search parameters, such as number of results and filters.
- nextToken – For pagination (optional).
The following is an example request syntax (the query text, filter, and pagination values are illustrative):
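```json
{
  "retrievalQuery": {
    "text": "How do I run a pipeline on a dev branch?"
  },
  "retrievalConfiguration": {
    "vectorSearchConfiguration": {
      "numberOfResults": 5,
      "filter": {
        "equals": {
          "key": "domain",
          "value": "ci-cd"
        }
      }
    }
  },
  "nextToken": "optional-pagination-token"
}
```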
Additionally, you can set up the retriever with ease using the langchain-aws package. The following is a minimal sketch using its AmazonKnowledgeBasesRetriever class; the knowledge base ID and filter values are placeholders:
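```python
# A minimal sketch using langchain-aws; the knowledge base ID and
# filter values are placeholders.
from langchain_aws import AmazonKnowledgeBasesRetriever

retriever = AmazonKnowledgeBasesRetriever(
    knowledge_base_id="XXXXXXXXXX",  # replace with your knowledge base ID
    retrieval_config={
        "vectorSearchConfiguration": {
            "numberOfResults": 5,
            # Optional metadata filter, mirroring the Retrieve API
            "filter": {"equals": {"key": "domain", "value": "ci-cd"}},
        }
    },
)

docs = retriever.invoke("How do I run a pipeline on a dev branch?")
for doc in docs:
    print(doc.metadata.get("score"), doc.page_content[:200])
```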
This approach enables semantic querying of the knowledge base to retrieve relevant documents based on the provided query, simplifying the implementation of search.
Experimentation
To deliver the most accurate and efficient knowledge retrieval system, the Adobe and AWS teams put the solution to the test, conducting a series of rigorous experiments to fine-tune the system and find the optimal settings.
Before we dive into our findings, let’s discuss the metrics and evaluation process we used to measure success. We used the open source model evaluation framework Ragas to evaluate the retrieval system across two metrics: document relevance and mean reciprocal rank (MRR). Although Ragas comes with many metrics for evaluating model performance out of the box, we needed to implement these metrics by extending the Ragas framework with custom code.
- Document relevance – Document relevance offers a qualitative approach to assessing retrieval accuracy. This metric uses a large language model (LLM) as an impartial judge to compare retrieved chunks against user queries. It evaluates how effectively the retrieved information addresses the developer’s question, providing a score between 1–10.
- Mean reciprocal rank – On the quantitative side, we have the MRR metric. MRR evaluates how well a system ranks the first relevant item for a query. For each query, find the rank k of the highest-ranked relevant document. The score for that query is 1/k. MRR is the average of these 1/k scores over the entire set of queries. A higher score (closer to 1) signifies that the first relevant result is typically ranked high.
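To make the MRR computation concrete, the following is a minimal sketch. The data structures are illustrative; our actual implementation extended the Ragas framework.

```python
# A minimal sketch of MRR. `ranked_results` maps each query to its ranked
# list of retrieved document IDs; `relevant_docs` maps each query to the
# set of document IDs judged relevant. Both structures are illustrative.
def mean_reciprocal_rank(ranked_results, relevant_docs):
    reciprocal_ranks = []
    for query, ranked_ids in ranked_results.items():
        rr = 0.0
        for k, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_docs[query]:
                rr = 1.0 / k  # rank of the first relevant document
                break
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Example: the first relevant document appears at rank 2 for one query
# and rank 1 for the other, so MRR = (1/2 + 1/1) / 2 = 0.75.
print(mean_reciprocal_rank(
    {"q1": ["d3", "d1", "d2"], "q2": ["d5", "d4"]},
    {"q1": {"d1"}, "q2": {"d5"}},
))
```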
These metrics provide complementary insights: document relevance offers a content-based assessment, and MRR provides a ranking-based evaluation. Together, they offer a comprehensive view of the retrieval system’s effectiveness in finding and prioritizing relevant information.

In our recent experiments, we explored various data chunking strategies to optimize the performance of retrieval. We tested several approaches, including fixed-size chunking as well as more advanced semantic chunking and hierarchical chunking. Semantic chunking focuses on preserving the contextual relationships within the data by segmenting it based on semantic meaning, which aims to improve the relevance and coherence of retrieved results. Hierarchical chunking organizes data into a hierarchical parent-child structure, allowing for more granular and efficient retrieval based on the inherent relationships within your data.
For more information on how to set up different chunking strategies, refer to HAQM Bedrock Knowledge Bases now supports advanced parsing, chunking, and query reformulation giving greater control of accuracy in RAG based applications.
We tested the following chunking methods with HAQM Bedrock Knowledge Bases:
- Fixed-size short chunking – 400-token chunks with a 20% overlap (shown as the blue variant in the following figure)
- Fixed-size long chunking – 1,000-token chunks with a 20% overlap
- Hierarchical chunking – Parent chunks of 1,500 tokens and child chunks of 300 tokens, with a 60-token overlap
- Semantic chunking – 400-token chunks with a 95% similarity percentile threshold
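To show how one of these strategies is configured in practice, the following is a minimal sketch that expresses the fixed-size 400-token configuration when creating a data source with the boto3 bedrock-agent client; the knowledge base ID, data source name, and bucket ARN are placeholders.

```python
# A minimal sketch: configuring 400-token fixed-size chunking with 20%
# overlap when creating a Knowledge Bases data source. The IDs, name,
# and bucket ARN are placeholders.
import boto3

bedrock_agent = boto3.client("bedrock-agent")

bedrock_agent.create_data_source(
    knowledgeBaseId="XXXXXXXXXX",  # placeholder knowledge base ID
    name="unified-support-docs",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::example-bucket"},
    },
    vectorIngestionConfiguration={
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                "maxTokens": 400,
                "overlapPercentage": 20,
            },
        }
    },
)
```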
For reference, a paragraph of approximately 1,000 characters typically translates to around 200 tokens. To assess performance, we measured document relevance and MRR across different context sizes, ranging from 1 to 5 retrieved chunks. This comparison aims to provide insights into the most effective chunking strategy for organizing and retrieving information for this use case. The following figures illustrate the MRR and document relevance metrics, respectively.
As a result of these experiments, we found that MRR is a more sensitive metric for evaluating the impact of chunking strategies, particularly when varying the number of retrieved chunks (top-k from 1 to 5). Among the approaches tested, the fixed-size 400-token strategy—shown in blue—proved to be the simplest and most effective, consistently yielding the highest accuracy across different retrieval sizes.
Conclusion
In the journey to design Adobe’s developer Unified Support search and retrieval system, we successfully harnessed the power of HAQM Bedrock Knowledge Bases to create a robust, scalable, and efficient solution. By configuring fixed-size chunking and using the HAQM Titan V2 embedding model, we achieved a remarkable 20% increase in accuracy metrics for document retrieval compared to Adobe’s existing solution, based on evaluations run on the customer’s testing system and provided dataset.

The integration of metadata filtering emerged as a game-changing feature, allowing for seamless navigation across diverse domains and enabling customized retrieval. This capability proved invaluable for Adobe, given the complexity and breadth of their information landscape. Our comprehensive comparison of retrieval accuracy for different configurations of the HAQM Bedrock Knowledge Bases index has yielded valuable insights. The metrics we developed provide an objective framework for assessing the quality of retrieved context, which is crucial for applications demanding high-precision information retrieval.

As we look to the future, this customized, fully managed solution lays a solid foundation for continuous improvement in developer support at Adobe, offering enhanced scalability and seamless support infrastructure in tandem with evolving developer needs.
For those interested in working with AWS on similar projects, visit Generative AI Innovation Center. To learn more about HAQM Bedrock Knowledge Bases, see Retrieve data and generate AI responses with knowledge bases.
About the Authors
Kamran Razi is a Data Scientist at the HAQM Generative AI Innovation Center. With a passion for delivering cutting-edge generative AI solutions, Kamran helps customers unlock the full potential of AWS AI/ML services to solve real-world business challenges. With over a decade of experience in software development, he specializes in building AI-driven solutions, including AI agents. Kamran holds a PhD in Electrical Engineering from Queen’s University.
Nay Doummar is an Engineering Manager on the Unified Support team at Adobe, where she’s been since 2012. Over the years, she has contributed to projects in infrastructure, CI/CD, identity management, containers, and AI. She started on the CloudOps team, which was responsible for migrating Adobe’s infrastructure to the AWS Cloud, marking the beginning of her long-term collaboration with AWS. In 2020, she helped build a support chatbot to simplify infrastructure-related assistance, sparking her passion for user support. In 2024, she joined a project to unify support for the Developer Platform, aiming to streamline support and boost productivity.
Varsha Chandan Bellara is a Software Development Engineer at Adobe, specializing in AI-driven solutions to boost developer productivity. She leads the development of an AI assistant for the Unified Support initiative, using HAQM Bedrock, implementing RAG to provide accurate, context-aware responses for technical support and issue resolution. With expertise in cloud-based technologies, Varsha combines her passion for containers and serverless architectures with advanced AI to create scalable, efficient solutions that streamline developer workflows.
Jan Michael Ong is a Senior Software Engineer at Adobe, where he supports the developer community and engineering teams through tooling and automation. Currently, he is part of the Developer Experience team at Adobe, working on AI projects and automation contributing to Adobe’s internal Developer Platform.
Justin Johns is a Deep Learning Architect at HAQM Web Services who is passionate about innovating with generative AI and delivering cutting-edge solutions for customers. With over 5 years of software development experience, he specializes in building cloud-based solutions powered by generative AI.
Gaurav Dhamija is a Principal Solutions Architect at HAQM Web Services, where he helps customers design and build scalable, reliable, and secure applications on AWS. He is passionate about developer experience, containers, and serverless technologies, and works closely with engineering teams to modernize application architectures. Gaurav also specializes in generative AI, using AWS generative AI services to drive innovation and enhance productivity across a wide range of use cases.
Sandeep Singh is a Senior Generative AI Data Scientist at HAQM Web Services, helping businesses innovate with generative AI. He specializes in generative AI, machine learning, and system design. He has successfully delivered state-of-the-art AI/ML-powered solutions to solve complex business problems for diverse industries, optimizing efficiency and scalability.
Anila Joshi has more than a decade of experience building AI solutions. As a Senior Manager, Applied Science at AWS Generative AI Innovation Center, Anila pioneers innovative applications of AI that push the boundaries of possibility and accelerate the adoption of AWS services with customers by helping customers ideate, identify, and implement secure generative AI solutions.