Announcing latency-optimized inference for Amazon Nova Pro foundation model in Amazon Bedrock
The Amazon Nova Pro foundation model now supports latency-optimized inference in preview on Amazon Bedrock, enabling faster response times and improved responsiveness for generative AI applications. Latency-optimized inference speeds up responses for latency-sensitive applications, improving the end-user experience and giving developers more flexibility to tune performance for their use case. These capabilities require no additional setup or model fine-tuning, so existing applications can immediately benefit from faster response times.
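As a sketch of what "no additional setup" looks like in practice, the example below opts into latency-optimized inference through the Bedrock Runtime Converse API using the AWS SDK for Python (boto3). The `performanceConfig` latency setting and the cross-region inference profile ID `us.amazon.nova-pro-v1:0` are assumptions based on this announcement; verify both against the Amazon Bedrock documentation for your account and Region.

```python
def build_converse_request(prompt: str) -> dict:
    """Build a Converse API request that opts into latency-optimized inference.

    The inference profile ID and performanceConfig values below are assumptions;
    check the Amazon Bedrock documentation for the exact identifiers.
    """
    return {
        # Assumed cross-region inference profile for Amazon Nova Pro.
        "modelId": "us.amazon.nova-pro-v1:0",
        "messages": [
            {"role": "user", "content": [{"text": prompt}]},
        ],
        # Opting into latency-optimized inference; no model changes needed.
        "performanceConfig": {"latency": "optimized"},
    }


def invoke_nova_pro(prompt: str, region: str = "us-west-2") -> str:
    """Send the request to Bedrock and return the first text block of the reply.

    Requires boto3 and configured AWS credentials with Bedrock access.
    """
    import boto3  # imported here so the request builder stays dependency-free

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(**build_converse_request(prompt))
    return response["output"]["message"]["content"][0]["text"]
```

Because the opt-in is a single request parameter rather than a separate endpoint or fine-tuned model, switching an existing Converse call to latency-optimized inference is a one-line change.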
Latency-optimized inference for Amazon Nova Pro is available via cross-region inference in the US West (Oregon), US East (N. Virginia), and US East (Ohio) Regions. To learn more about Amazon Nova foundation models, see the AWS News Blog, the Amazon Nova product page, and the Amazon Nova User Guide. To learn more about latency-optimized inference on Amazon Bedrock, see the documentation. You can get started with Amazon Nova foundation models in Amazon Bedrock from the Amazon Bedrock console.