SEEK Asia modernizes search with CI/CD and HAQM OpenSearch Service

This post was written in collaboration with Abdulsalam Alshallah (Salam), Software Architect, and Hans Roessler, Principal Software Engineer at SEEK Asia.

SEEK is a market leader in online employment marketplaces with deep and rich insights into the future of work. As a global business, SEEK has a presence in Australia, New Zealand, Hong Kong, Southeast Asia, Brazil, and Mexico, and its websites attract over 400 million visits per year. SEEK Asia’s business operates across seven countries, includes leading portal brands such as jobsdb.com and jobstreet.com, and leverages data and technology to create innovative solutions for candidates and hirers.

In this post, we share how SEEK Asia modernized their search-based system with a continuous integration and continuous delivery (CI/CD) pipeline and HAQM OpenSearch Service.

Challenges associated with a self-managed search system

SEEK Asia provides a search-based system that enables employers to manage interactions between hirers and candidates. Although the system was already on AWS, it was a self-managed system running on HAQM Elastic Compute Cloud (HAQM EC2) with limited automation.

The self-managed system posed several challenges:

Slower release cycles – Deploying new configurations or new field mappings into the Elasticsearch cluster was a high-risk activity because changes affected the stability of the system. Limited automation in both the self-managed cluster and the surrounding workflows led to slower release cycles.
Higher operational overhead – Sizing the cluster to deliver greater performance while managing cost effectively was another challenge. As with every other distributed system, even with sizing guidance, identifying the appropriate number of shards per node and the number of nodes to meet performance requirements still required some trial and error, turning the exercise into a tedious and time-consuming activity. This also led to slower release cycles. To overcome this challenge, on many occasions, oversizing the cluster became the quickest way to achieve the desired time to market, at the expense of cost.

Further challenges the team faced with self-managing their own Elasticsearch cluster included keeping up with new security patches, and minor and major platform upgrades.

Automating search delivery with HAQM OpenSearch Service

SEEK Asia knew that automation would be the key to solving the challenges of their existing search service. Automating the undifferentiated heavy lifting would enable them to deliver more value to their customers quickly and improve staff productivity.

With the problems defined, the team set out to solve the challenges by automating the following:

Search infrastructure deployment
Search A/B testing infrastructure deployment
Redeployment of search infrastructure for any new infrastructure configuration (such as security patches or platform upgrades) and index mapping updates

The key enablers of this automation would be HAQM OpenSearch Service and a search infrastructure CI/CD pipeline.

Architecture overview

The following diagram illustrates the architecture of the SEEK infrastructure and CI/CD pipeline with HAQM OpenSearch Service.

The workflow includes the following steps:

Before the workflow kicks off, an existing HAQM OpenSearch Service cluster is in place, hydrated by a live feeder. The live feeder is a serverless application built on HAQM Simple Notification Service (HAQM SNS), HAQM Simple Queue Service (HAQM SQS), and AWS Lambda. HAQM SNS enables data fanout (if required), HAQM SQS queues documents for processing, and a Lambda function is invoked to process messages in the SQS queue and import data into HAQM OpenSearch Service. The feeder receives live updates for changes that need to be reflected on the cluster. Write concurrency to HAQM OpenSearch Service is managed by limiting the number of concurrent Lambda function invocations.
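As a rough illustration of the live feeder's Lambda side, the following sketch unwraps documents from an SQS event whose message bodies may be SNS notification envelopes. The function names and the document fields are hypothetical, and the write to the cluster is deliberately omitted; this is not SEEK's actual implementation.

```python
import json


def extract_documents(event):
    """Return the document payloads carried by an SQS-triggered Lambda event."""
    documents = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])
        # When SNS delivers to SQS without raw message delivery, the payload
        # is a JSON string inside the "Message" field of a Notification envelope.
        if isinstance(body, dict) and body.get("Type") == "Notification":
            body = json.loads(body["Message"])
        documents.append(body)
    return documents


def handler(event, context):
    """Hypothetical Lambda entry point for the feeder."""
    documents = extract_documents(event)
    # A real feeder would write `documents` to the search cluster here.
    return {"received": len(documents)}
```

Handling both enveloped and raw bodies keeps the function usable whether or not the SNS fanout is in the path.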
The HAQM OpenSearch Service index mapping is version controlled in SEEK’s Git repository. Whenever an update to the index mapping is committed, the CI/CD pipeline kicks off a new HAQM OpenSearch Service cluster provisioning workflow.
As part of the workflow, a new data hydration initialization feeder is deployed. The initialization feeder construct is similar to the live feeder, with one additional component: a script that runs within the CI/CD pipeline to calculate the number of batches required to hydrate the newly provisioned HAQM OpenSearch Service cluster up to a specific timestamp. The feeder systems were designed for idempotent processing: unique identifiers (UIDs) from the source data stores are reused for each document, so a duplicated document simply overwrites the existing document with the exact same values.
At the same time as Step 3, an HAQM OpenSearch Service cluster is deployed. To temporarily accelerate the initial data hydration process, the new cluster may be sized two or three times larger than sizing guidance suggests, with shard replicas and the index refresh interval disabled until the hydration process is complete. The existing HAQM OpenSearch Service cluster remains as is, which means that two clusters are running concurrently.
The script inspects the number of documents in the source data store and groups the documents into batches. After running numerous experiments, SEEK identified that 1,000 documents per batch provided the best ingestion time.
Each batch is represented as one message and is queued into HAQM SQS via HAQM SNS. Every message that lands in HAQM SQS invokes a Lambda function. The Lambda function queries a separate data store, builds the documents, and loads them into HAQM OpenSearch Service. The more messages that go into the queue, the more functions are invoked. To create baselines that allowed for further indexing optimization, the team took the following configurations into consideration and iterated to achieve higher ingestion performance:
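The idempotency described in this step can be sketched with the bulk request format, in which each `index` action carries an explicit `_id`. The index name `jobs`, the `uid` field name, and the in-memory `apply_bulk` stand-in for the cluster are assumptions made for illustration only.

```python
import json


def build_bulk_body(index_name, documents, id_field="uid"):
    """Build a newline-delimited bulk body that reuses each source UID as _id."""
    lines = []
    for doc in documents:
        action = {"index": {"_index": index_name, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


def apply_bulk(store, bulk_body):
    """In-memory stand-in for the cluster: an index action overwrites by _id."""
    lines = bulk_body.strip().split("\n")
    for action_line, doc_line in zip(lines[0::2], lines[1::2]):
        doc_id = json.loads(action_line)["index"]["_id"]
        store[doc_id] = json.loads(doc_line)
    return store
```

Because the document ID comes from the source data rather than being generated at write time, replaying the same batch leaves the store unchanged, which is what makes redelivered queue messages harmless.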
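A minimal sketch of the index settings involved, expressed as request bodies for the index settings API. `number_of_replicas` and `refresh_interval` are standard index settings; the steady-state values shown (one replica, a one-second refresh) are assumptions, not figures from SEEK.

```python
import json

# Settings applied while the new cluster is being hydrated:
# no replica shards, and a refresh interval of "-1" to disable refreshes.
HYDRATION_SETTINGS = {
    "index": {"number_of_replicas": 0, "refresh_interval": "-1"}
}


def steady_state_settings(replicas=1, refresh_interval="1s"):
    """Settings to restore once hydration is complete (values are assumptions)."""
    return {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }


def to_request_body(settings):
    """Serialize a settings dict as the JSON body of a settings update request."""
    return json.dumps(settings)
```

Skipping replication and refreshes during the bulk load avoids repeating indexing work on replicas and repeatedly creating small segments, at the cost of the new index not being searchable or redundant until the settings are restored.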
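The batch calculation can be sketched as follows. The offset-based batch description is an assumption for illustration; the source states only that documents are grouped into batches and that 1,000 documents per batch worked best.

```python
def plan_batches(total_documents, batch_size=1000):
    """Return (offset, size) pairs that cover total_documents in fixed-size batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if total_documents < 0:
        raise ValueError("total_documents must not be negative")
    # Integer ceiling division avoids floating-point rounding on large counts.
    batch_count = (total_documents + batch_size - 1) // batch_size
    return [
        (index * batch_size, min(batch_size, total_documents - index * batch_size))
        for index in range(batch_count)
    ]
```

For example, 2,500 documents yield three batches: two full batches of 1,000 and a final batch of 500.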
1. Memory of the Lambda function
2. Size of batch
3. Size of each document in the batch
4. Size of cluster (memory, vCPU, and number of primary shards)
With the initialization feeder running, new documents are streamed to the cluster until it is synced with the data source. Eventually, the newly provisioned HAQM OpenSearch Service cluster catches up and is in the same state as the existing cluster. The hydration is complete when there are no remaining messages in the SQS queue.
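One way to express the "no remaining messages" check is to inspect the approximate message counts that HAQM SQS reports for a queue. The sketch below takes the attribute map as input so it runs without AWS access; treating in-flight and delayed messages as remaining work is an assumption about how the pipeline might decide completion.

```python
# Queue attributes that count messages still waiting or being processed.
PENDING_ATTRIBUTES = (
    "ApproximateNumberOfMessages",
    "ApproximateNumberOfMessagesNotVisible",
    "ApproximateNumberOfMessagesDelayed",
)


def queue_is_drained(attributes):
    """Return True when every pending-message counter in the attribute map is zero.

    `attributes` is the "Attributes" map of an SQS GetQueueAttributes response,
    in which counts arrive as strings. Missing counters are treated as zero.
    """
    return all(int(attributes.get(name, "0")) == 0 for name in PENDING_ATTRIBUTES)
```

Because these counters are approximate, a pipeline would likely want to see the queue report empty over several consecutive polls before tearing down the feeder.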
The initialization feeder is deleted and the HAQM OpenSearch Service cluster is downsized automatically to complete the deployment workflow, with replica shards created and the index refresh interval configured.
Live search traffic is routed to the newly provisioned cluster when A/B testing is enabled via the API layer built on Application Load Balancer, HAQM Elastic Container Service (HAQM ECS), and HAQM CloudFront. The API layer decouples the client interface from the backend implementation that runs on HAQM OpenSearch Service.
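The source does not describe how the API layer splits traffic between clusters. As one hypothetical approach, a deterministic hash of a user identifier can assign each user to a cluster so that the same user sees consistent results throughout a test. The function name, labels, and percentage parameter are all assumptions.

```python
import hashlib


def choose_cluster(user_id, new_cluster_percent):
    """Deterministically assign a user to the "new" or "existing" cluster."""
    if not 0 <= new_cluster_percent <= 100:
        raise ValueError("new_cluster_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto a bucket in the range 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "new" if bucket < new_cluster_percent else "existing"
```

Keeping this decision in the API layer means clients never need to know which cluster served a request.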

Improved time to market and other outcomes

With HAQM OpenSearch Service, SEEK was able to automate an entire cluster, complete with Kibana, in a secure, managed environment. If testing didn’t produce the desired results, the team could change the dimensions of the cluster horizontally or vertically using different instance offerings within minutes. This enabled them to perform stress tests quickly to identify the sweet spot between performance and cost of the workload.

“By integrating HAQM OpenSearch Service with our existing CI/CD tools, we’re able to fully automate our search function deployments, which accelerated software delivery time,” says Abdulsalam Alshallah, APAC Software Architect. “The newly found confidence in the modern stack, alongside improved engineering practices, allowed us to mitigate the risk of changes—improving our time to market by 89% with zero impact to uptime.”

With the adoption of HAQM OpenSearch Service, other teams also saw improvements, including the following:

The number of Common Vulnerabilities and Exposures (CVE) findings has dropped to zero, with HAQM OpenSearch Service handling the underlying hardware security updates on SEEK’s behalf, improving their security posture
Improved availability with the HAQM OpenSearch Service Availability Zone awareness feature

Conclusion

The managed capabilities of HAQM OpenSearch Service have helped SEEK Asia improve customer experience with speed and automation. By removing the undifferentiated heavy lifting, teams can deploy changes quickly to their search engines, allowing customers to get the latest search features faster and ultimately contributing to the SEEK purpose of helping people live more productive working lives and organisations succeed.

To learn more about HAQM OpenSearch Service, see HAQM OpenSearch Service features, the Developer Guide, or Introducing OpenSearch.

About the Authors

Fabian Tan is a Principal Solutions Architect at HAQM Web Services. He has a strong passion for software development, databases, data analytics and machine learning. He works closely with the Malaysian developer community to help them bring their ideas to life.

Hans Roessler is a Principal Software Architect at SEEKAsia. He is excited about new technologies and upgrading legacy to newer stacks. Always staying in touch with the latest technologies is one of his passions.

Abdulsalam Alshallah (Salam) is a Software Architect at SEEK. Previously a Lead Cloud Architect for SEEKAsia, Salam has always been excited about new technologies, cloud, serverless, and DevOps, in addition to his passion for eliminating wasted time, effort, and resources. He is also one of the leaders of AWS User Group Malaysia.

AWS Big Data Blog