AWS for Games Blog

Training AI models for skill-based matchmaking using Amazon SageMaker AI

In competitive multiplayer games, skill-based matchmaking is crucial for creating fun, evenly matched experiences. Determining player skill is difficult today because of the vast array of metrics games record (such as hits, misses, assists, time played, level, and more), making it challenging to identify which factors are most indicative of skill. Instead of hand-crafting algorithms to rate player skill, machine learning (ML) techniques (particularly supervised learning) can automatically identify patterns across game metrics to produce more accurate skill ratings. These ML-derived ratings enable more balanced matchmaking, ultimately enhancing player satisfaction and engagement.

In this first part of our two-part blog series, we’ll show you how to use Amazon SageMaker AI to quickly create and deploy an automated ML pipeline. Amazon SageMaker AI provides the capabilities to build, train, and deploy ML and foundation models, with fully managed infrastructure, tools, and workflows. The model and pipeline we build will produce a more reflective and precise rating of each player’s skill.

To accomplish this task, we will be building upon the Guidance for AI-Driven Player Insights on Amazon Web Services (AWS). The following architecture diagram for the guidance shows how game studios can use this low-code solution to quickly build, train, and deploy high-quality models that predict player skill from historic player data. Operators simply upload their historic player data to Amazon Simple Storage Service (Amazon S3). This invokes a complete workflow, orchestrated by Amazon SageMaker Pipelines, to extract insights, select algorithms, tune hyperparameters, evaluate models, and deploy the best performing model for your dataset to a prediction API.

AWS reference architecture diagram illustrating an AI-driven player insights system. The workflow shows a linear progression from data capture to model deployment. Starting with game player event data being uploaded to Amazon S3, the process flows through SageMaker pipelines which include preprocessing, AutoML model training, and evaluation steps. The trained model is stored in the SageMaker Model Registry and ultimately deployed as a model endpoint that game clients can query for player behavior predictions. The diagram uses AWS architectural icons connected by directional arrows to show data and process flow.

Figure 1: Architecture diagram of the Guidance for AI-Driven Player Insights on AWS.

Figure 2 shows the architecture you will implement in this blog. The diagram shows how a player’s matchmaking request is handled. The matchmaking request triggers Amazon API Gateway, which invokes an AWS Lambda function that retrieves the relevant player data from Amazon DynamoDB. The data is then passed to the Amazon SageMaker AI endpoint, which runs an inference to produce the more holistic skill value used by Amazon GameLift FlexMatch in the matchmaking process. A code sketch of this flow follows Figure 2.

AWS architecture diagram for Matchmaking Simulation. The flow starts with a Player icon connected to Amazon GameLift Testing Toolkit. It then progresses through Request Handler (Amazon API Gateway), Request Matchmaking (AWS Lambda), Skill Inference Endpoint (Amazon SageMaker AI), and finally Matchmaking (Amazon GameLift FlexMatch). The Request Matchmaking step also connects to Player Data (Amazon DynamoDB). Each component is represented by its respective AWS service icon and linked with arrows showing the process flow.

Figure 2: Matchmaking workflow using matchmaking simulator.
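
To make this flow concrete, here is a minimal sketch of what the Request Matchmaking Lambda function could look like, written with boto3. It is illustrative only: the table name, matchmaking configuration name, item attribute names, and request body format are assumptions for this sketch, not the guidance’s actual code.

import json
import boto3

dynamodb = boto3.resource("dynamodb")
sagemaker_runtime = boto3.client("sagemaker-runtime")
gamelift = boto3.client("gamelift")

PLAYER_TABLE = "PlayerData"                    # assumed table name
ENDPOINT_NAME = "PlayerSkills-Endpoint"
MATCHMAKING_CONFIG = "SkillBasedMatchmaking"   # assumed FlexMatch configuration name

FEATURE_COLUMNS = ("kills", "hits", "timePlayed", "gamesPlayed",
                   "assists", "misses", "shots", "deaths", "winRate")

def lambda_handler(event, context):
    player_id = json.loads(event["body"])["playerId"]

    # Retrieve the player's stored statistics from DynamoDB.
    item = dynamodb.Table(PLAYER_TABLE).get_item(Key={"playerId": player_id})["Item"]

    # The endpoint expects a CSV row of metrics in the training column order.
    payload = ",".join(str(item[column]) for column in FEATURE_COLUMNS)
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",
        Body=payload,
    )
    skill = float(response["Body"].read().decode("utf-8"))

    # Pass the inferred skill to FlexMatch as a numeric player attribute.
    ticket = gamelift.start_matchmaking(
        ConfigurationName=MATCHMAKING_CONFIG,
        Players=[{
            "PlayerId": player_id,
            "PlayerAttributes": {"skill": {"N": skill}},
        }],
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"ticketId": ticket["MatchmakingTicket"]["TicketId"]}),
    }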

The following architecture diagram shows how you will implement this solution in an actual game by connecting FlexMatch to an Amazon GameLift Servers queue. This triggers GameLift Servers to place or spin up game servers for the newly created matches.

AWS architecture diagram for Matchmaking/Game Session Placement for Actual Gameplay. The diagram shows a complete game matchmaking flow divided into two sections: Game Backend Services and Amazon GameLift. Starting with a Player connected to a Game Client, the flow moves through Request Handler (Amazon API Gateway), Request Matchmaking (Lambda), and Skill Inference (SageMaker Endpoint) in the backend services section. The Amazon GameLift section contains FlexMatch, Amazon GameLift Queue, and Game Server components. Player Data (DynamoDB) connects to both the Request Handler and Request Matchmaking components. Game Session Info flows back from the Game Server to the backend services.

Figure 3: Matchmaking workflow as part of a game backend.

Walkthrough

Prerequisites

For this walkthrough, you should have the following prerequisites:

An AWS account
The Guidance for AI-Driven Player Insights on AWS repository on GitHub
The PlayerStats.csv sample dataset, downloaded from the repository (shown in Figure 4)

GitHub page showing a preview of a CSV file with game statistics. The table headers include 'kills', 'hits', 'timePlayed', 'gamesPlayed', 'assists', 'misses', 'shots', 'deaths', 'winRate', and 'playerSkill'. Six rows of numerical data are visible. The interface includes GitHub's standard file view options such as 'Preview', 'Code', 'Blame', and a download button in the top right corner.

Figure 4: PlayerStats.csv download page.

SageMaker AI domain setup

  1. Open the SageMaker AI console.
  2. Under the left side menu, choose Domains.
  3. Choose Create domain.
Amazon SageMaker Domains management page showing an empty domains list. The page header displays 'Domains (0)' with a search bar below. The main table has columns for Name, Id, Status, Created on, and Modified on, with the message 'No domains' in the center. A highlighted 'Create domain' button appears in the top right corner of the interface.

Figure 5: Amazon SageMaker domain dashboard.

  4. Select the Quick Setup option and choose Set up.
SageMaker Domain setup page showing two main options: 'Set up for single user (Quick setup)' and 'Set up for organizations'. The Quick setup option is selected, displaying features like IAM role setup, public internet access, Studio integrations, and IAM Authentication with checkmarks. The organizations setup option shows additional features for advanced configuration. A 'Set up' button is highlighted in orange in the bottom right corner.

Figure 6: Amazon SageMaker domain creation.

Access the Amazon SageMaker Studio dashboard

  1. Select the domain you created in the previous section.
A SageMaker AI domain management interface showing a single domain entry titled "QuickSetupDomain-20250225T142414" with ID "d-Shelizvryyzk". The domain is marked as "InService" and was created on Feb 25, 2025 at 20:24 UTC, with last modification at 20:27 UTC. The interface includes column headers for Name, Id, Status, Created on, and Modified on.

Figure 7: Amazon SageMaker domain selection.

  2. Under the User profiles section, select the Launch dropdown button next to the profile you want to use, and choose Studio. SageMaker Studio will open in another tab. Leave this tab open; we will come back to it in a later section.
The domain settings interface for "QuickSetupDomain-20250225T142414" displaying the User profiles tab. A single user profile "default-20250225T142414" is listed, created and modified on Feb 25, 2025. The interface includes navigation tabs for Domain settings, Space management, App Configurations, Environment, and Resources. A Launch dropdown menu shows options for Personal apps (including Studio, Canvas, TensorBoard, and Profiler) and Collaborative tools like Spaces.

Figure 8: Domain user profiles.

Deploy the AI-driven player insights solution

Follow the deployment steps provided in the AWS AI-Driven Player Insights repository linked in the prerequisites section of this blog. If you do not have access to an AWS Cloud9 development environment, deploy the solution from your own device. Follow the Deployment Steps up to Step 5 within the guide.

Understanding AI-driven player insights on AWS

Traditionally, developing effective machine learning models requires data science experience: builders must determine appropriate data pre-processing methods based on metric relationships, select optimal machine learning algorithms, and establish model performance evaluation strategies. With this solution, you no longer need extensive machine learning experience to build and deploy machine learning models. Instead, you will use Amazon SageMaker Autopilot.

Amazon SageMaker Autopilot automates the complete process of building, training, tuning, and deploying machine learning models. Amazon SageMaker Autopilot analyzes your data, selects algorithms suitable for your problem type, and preprocesses the data for training. It also handles automatic model training and performs hyperparameter optimization to find the best performing model for your dataset. To accelerate training and deploying machine learning models as data changes over time, our solution provides a pre-defined machine learning pipeline. The pipeline triggers the entire training and deployment process the moment your data is uploaded to your S3 bucket.

This solution requires a player statistics dataset that includes the performance metrics relevant to your game along with the current skill rating used for matchmaking. For this tutorial, we will use the PlayerStats.csv file you downloaded in the prerequisites section.
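
Based on the column headers shown in Figure 4, the file has the following shape. The data row here is illustrative only (it reuses the sample payload from the testing section later in this post), not the file’s actual contents:

kills,hits,timePlayed,gamesPlayed,assists,misses,shots,deaths,winRate,playerSkill
10597,20312,602,205,1916,56266,76578,9725,8.0,0.47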

Preparing the machine learning pipeline for linear regression

The AI-driven player insights machine learning pipeline is configured by default to predict player churn with an output of “True” or “False”. Because that output is categorical data with discrete outcomes, the pipeline is set up for a classification problem. In this tutorial, we want to output numeric data: a player’s “Skill” value. Since we are doing supervised learning on numeric data with continuous outcomes, we need to modify the pipeline for linear regression, the most common supervised learning model for predicting numeric data.
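
For context, this is roughly what a regression Autopilot job looks like when configured directly with the SageMaker Python SDK. This is a sketch for orientation only; the role ARN, output path, and max_candidates values are placeholder assumptions, and the guidance instead drives Autopilot through the pipeline files we modify in the following steps.

from sagemaker.automl.automl import AutoML

# Sketch: configure Autopilot for regression on the playerSkill column.
automl_job = AutoML(
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder role ARN
    target_attribute_name="playerSkill",   # label column the model learns to predict
    problem_type="Regression",             # continuous output instead of discrete classes
    job_objective={"MetricName": "MSE"},   # optimize mean squared error
    output_path="s3://[YOUR DATA BUCKET NAME]/automl-output",  # placeholder output location
    max_candidates=10,                     # placeholder limit on candidate models
)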

  1. In your preferred code editor, open the /player-insights/constants.py file and replace its contents with the following (be sure to replace the values for SM_DOMAIN_ID and REGION with your own values):
WORKLOAD_NAME = "PlayerSkills"
REGION = "[YOUR REGION]" 
SM_DOMAIN_ID = "[YOUR SAGEMAKER AI DOMAIN ID]" 
DATA_FILE = "PlayerStats.csv" 
TARGET_ATTRIBUTE = "playerSkill" 
PERFORMANCE_THRESHOLD = 0.00 
ENDPOINT_TYPE = "SERVERLESS"
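
A few notes on these values, based on how the guidance wires them together: TARGET_ATTRIBUTE names the label column the model learns to predict, PERFORMANCE_THRESHOLD feeds the model-quality condition we modify in the steps below, and setting ENDPOINT_TYPE to "SERVERLESS" deploys the trained model to a serverless inference endpoint that provisions capacity on demand.
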
  2. Since we will be training a linear regression model, we need to change the evaluation metric to Mean Squared Error (MSE). Replace the main function in /player-insights/evaluation.py with the following code:
if __name__ == "__main__":
    logger.debug("Starting Evaluation ...")
    logger.info("Reading Test Predictions")
    y_pred_path = "/opt/ml/processing/input/predictions/x_test.csv.out"
    y_pred = pd.read_csv(y_pred_path, header=None).squeeze()  # Assuming one column
    logger.info("Reading Test Labels")
    y_true_path = "/opt/ml/processing/input/true_labels/y_test.csv"
    y_true = pd.read_csv(y_true_path, header=None).squeeze()  # Assuming one column
    mse = mean_squared_error(y_true, y_pred)
    logger.info(f"Mean Squared Error: {mse}")
    report_dict = {
        "regression_metrics": {
            "mean_squared_error": {
                "value": mse,
                "standard_deviation": "NaN",
            },
        },
    }
    output_dir = "/opt/ml/processing/evaluation"
    pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)
    evaluation_path = os.path.join(output_dir, "evaluation_metrics.json")
    logger.info("Saving Evaluation Report")
    with open(evaluation_path, "w") as f:
        f.write(json.dumps(report_dict))
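
This replacement assumes the imports at the top of evaluation.py cover everything the new main function uses. The original churn-oriented script likely imports a classification metric such as f1_score rather than mean_squared_error, so confirm the following are present, adding any that are missing:

import json
import logging
import os
import pathlib

import pandas as pd
from sklearn.metrics import mean_squared_error

# evaluation.py most likely already defines its logger; keep the existing one if so.
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
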
  3. Open /player-insights/workflow.py. Here we adjust the machine learning pipeline to evaluate with the MSE metric rather than the F1 score, since the F1 score applies to classification problems. First, modify the failure step defined on line 227 so its error message indicates that the MSE, rather than the F1 score, fell below the specified threshold.
    failure_step = FailStep(
        name="ModelEvaluationFailure",
        error_message=Join(
            on=" ",
            values=["Pipeline execution failure: MSE is less than the specified Evaluation Threshold"] #CHANGED: Updated to reflect evaluating MSE rather than F1 score
        )
    )
  4. You will also need to modify the conditional step defined at line 259 in the same file to use the MSE metric for evaluation instead of the weighted F1 score.
    conditional_step = ConditionStep(
        name="ModelQualityCondition",
        conditions=[
            ConditionGreaterThanOrEqualTo(
                left=JsonGet(
                    step_name=evaluation_step.name,
                    property_file=evaluation_report,
                    json_path="regression_metrics.mean_squared_error.value"  #CHANGED: changed line to evaluate MSE instead of F1 score
                ),
                right=metric_threshold
            )
        ],
        if_steps=[step_register_model, deployment_step],
        else_steps=[failure_step]
    )
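
One caveat worth noting: unlike the F1 score, MSE is a lower-is-better metric. With ConditionGreaterThanOrEqualTo and the PERFORMANCE_THRESHOLD of 0.00 we set in constants.py, any non-negative MSE passes the condition, so every successful training run is deployed. If you later want the gate to reject high-error models, one option (a sketch, not part of the guidance) is to swap in SageMaker's ConditionLessThanOrEqualTo with a positive threshold:

from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo

# Sketch: deploy only when the MSE is at or below the configured threshold.
ConditionLessThanOrEqualTo(
    left=JsonGet(
        step_name=evaluation_step.name,
        property_file=evaluation_report,
        json_path="regression_metrics.mean_squared_error.value",
    ),
    right=metric_threshold,  # set PERFORMANCE_THRESHOLD in constants.py to a value such as 0.05
)
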
  5. Verify that the AWS Cloud Development Kit (AWS CDK) correctly synthesizes the AWS CloudFormation templates, making sure your changes are reflected in the stack, by executing the following command in your terminal: cdk synth
  6. Deploy your modified solution by executing the following command in your terminal: cdk deploy
  7. Locate the S3 bucket created by the CloudFormation template by opening the AWS CloudFormation console. Choose the PlayerSkills-Stack.
A screenshot of the AWS CloudFormation Stacks dashboard showing a single stack titled "PlayerSkills-Stack". The stack is marked as CREATE_COMPLETE with a creation timestamp of 2025-02-26 13:02:51 UTC-0600. The stack description indicates it provides "Guidance for AI-driven player insights on AWS (SO9401)". The filter is set to "Active" and there are options to view nested stacks, delete, update, and create new stacks.

Figure 9: CloudFormation stack.

  8. On the right side of your console screen, you will see a tab labeled Outputs. Select the Outputs tab and take note of the value for the key DataBucketName.
AWS CloudFormation console showing the "PlayerSkills-Stack" details. The "Outputs" tab is selected, displaying one output with key "DataBucketName" and value "playerskills-data-us-east-1-989038966811". The stack status is "CREATE_COMPLETE" with a creation timestamp visible. Navigation options and search fields are present in the interface.

Figure 10: CloudFormation stack outputs.

  9. Open the Amazon S3 console and choose the bucket with the name you noted in the previous step. Choose Upload.
S3 bucket interface showing the bucket "playerskills-data-us-east-1-989038966811". The Objects tab is selected, displaying an empty bucket with options for uploading files, creating folders, and managing objects. The interface includes standard Amazon S3 controls, like Copy S3 URI, Copy URL, Download, Open, Delete, and Actions buttons. A search bar for finding objects by prefix is visible, along with columns for Name, Type, Last modified, Size, and Storage class.

Figure 11: Amazon S3 object upload.

  10. Choose Add files, choose the CSV file you downloaded in the prerequisites, and choose Upload. This triggers your machine learning pipeline. Training and model deployment will take roughly 30 minutes to complete. (An AWS CLI alternative is shown after Figure 12.)
AWS S3 upload interface for the bucket "playerskills-data-us-east-1-989038966811". The screen shows a drag-and-drop area for files and folders, with options to "Add files" or "Add folder". The Files and folders section is empty, indicating no files have been selected yet. The Destination section shows the S3 bucket URL. Additional expandable sections for Destination details, Permissions, and Properties are visible. At the bottom right, there are "Cancel" and "Upload" buttons, with "Upload" highlighted.

Figure 12: Amazon S3 object upload – adding files.
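
Alternatively, as mentioned in the previous step, you can trigger the same pipeline from your terminal by uploading the file with the AWS CLI, substituting the bucket name you noted from the stack outputs:

aws s3 cp PlayerStats.csv s3://[YOUR DATA BUCKET NAME]/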

Check pipeline completion progress

  1. Navigate back to the tab with SageMaker Studio open. In the left side menu, choose Pipelines.
Amazon SageMaker Studio home interface displaying a dashboard. The left sidebar shows various applications and tools including JupyterLab, RStudio, Canvas, Code Editor, and MLflow, with "Pipelines" highlighted. The main panel shows an onboarding plan with three cards for "Take the tour," "Access your EFS data," and "Access your Studio Classic apps." Below that are JupyterLab and Code Editor sections with options to view their respective spaces. The interface includes navigation tabs for Overview, Getting started, and What's new.

Figure 13: Amazon SageMaker Studio home page.

  2. Choose the PlayerSkills-AutoMLPipeline and choose your most recent execution.
An AWS web interface showing a pipeline listing. A single pipeline named 'PlayerSkills-AutoMLPipeline' is displayed in a table view. The pipeline was created on Feb 26, 2025 and last modified on the same day. It has a tag 'WorkloadName: PlayerSkills'. The interface includes search functionality and options to delete pipelines or create pipelines in a visual editor.

Figure 14: Amazon SageMaker pipeline.

An AWS pipeline execution details view showing a single execution of 'PlayerSkills-AutoMLPipeline'. The execution ID 'execution-1740596741814' is shown with a 'Succeeded' status. The pipeline ran for 19 minutes and 57 seconds, was created on Feb 26, 2025 at 19:05:41 GMT and was modified at 19:25:38 GMT. The interface includes tabs for Executions, Graph, Parameters, and Information, along with search functionality and execution controls.

Figure 15: Amazon SageMaker pipeline executions.

  3. After the execution completes, you will see a graph showing the pipeline steps and their results.
A graph view of an AWS ML pipeline workflow showing execution details. The pipeline ran for 19m 57s, starting at 19:05:41 GMT and ending at 19:25:38 GMT on Feb 26, 2025. The workflow diagram shows several connected steps: DataPreprocessingStep (Process data), AutoMLTrainingStep (AutoML), ModelCreationStep (Create model), InferenceTestingStep (Deploy model batch inference), ModelEvaluationStep (Prepare data), ModelQualityCondition (Condition), ModelRegistrationStep (Register model), ModelDeploymentStep (Lambda), and ModelEvaluationFailure. The steps are connected by directional arrows showing the workflow sequence, and most steps show green checkmarks indicating successful completion. The view includes zoom controls and is currently at 64 percent zoom.

Figure 16: Amazon SageMaker pipeline execution details graph.

Test your machine learning model

  1. Once training and model deployment complete successfully, an Amazon SageMaker AI endpoint is created. Locate it by navigating to Amazon SageMaker AI in the AWS Management Console. In the left side menu, under the Inference section, choose Endpoints. The endpoint in this tutorial is named PlayerSkills-Endpoint. Note the name of your created endpoint; you will refer to it later. (A command line check is shown after Figure 17.)
An Amazon SageMaker AI interface showing the Endpoints page. There is one endpoint listed named 'PlayerSkills-Endpoint' with its corresponding ARN (Amazon Resource Name). The endpoint is shown as 'InService' status, was created on 2/26/2025 at 1:25:37 PM and last updated at 1:27:36 PM. The interface includes options to update endpoint, create new endpoints, and perform other actions. The left navigation panel shows various SageMaker features and configurations, with 'Endpoints' selected under the 'Inference' section.

Figure 17: Amazon SageMaker AI endpoint.
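
As referenced in the previous step, you can also confirm the endpoint from your terminal with the AWS CLI; the EndpointStatus field in the response should read InService before you run the test script:

aws sagemaker describe-endpoint --endpoint-name PlayerSkills-Endpoint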

  2. To test the newly created model and endpoint, replace lines 23-25 of the /player-insights/assets/examples/churn_inference.py file with the following code:
response = predictor.predict(
        "10597,20312,602,205,1916,56266,76578,9725,8.0"
    ).decode("utf-8")
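
The nine comma-separated values form a single sample input row. They should follow the feature columns of PlayerStats.csv shown in Figure 4 (kills, hits, timePlayed, gamesPlayed, assists, misses, shots, deaths, winRate), in training order, with the playerSkill column omitted because it is the value being predicted.
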
  3. Rename churn_inference.py to skill_inference.py.
  4. To run the script, navigate in your command line to the /player-insights/assets/examples directory and run the script with the following commands:

cd assets/examples
python3 skill_inference.py --endpoint-name PlayerSkills-Endpoint

  5. If the script runs successfully, it returns a value between 0 and 1 from the endpoint’s output, as shown in Figure 18. Changing the values entered in Step 2 will change the output value.
The image shows a console or terminal output displaying the results of an inference request to a SageMaker endpoint. The endpoint being used is named "PlayerSkills-Endpoint". An inference request was sent with a test payload, and the SageMaker endpoint returned a response value of 0.46999722719192505.

Figure 18: Example output from skill_inference.py.

Cleaning up

You’ll be using what you deployed in this blog in the second part of this series, so we won’t clean this up yet. We’ll show you how to clean up any resources you’ve deployed in the next blog.

In the second part of this series, we’ll show you how to use the newly determined “Skill” value in conjunction with Amazon GameLift FlexMatch. Amazon GameLift FlexMatch will handle the logic of matchmaking players while giving you, the developer, a way to adjust which matches are created through a rules-based syntax called FlexMatch rule sets.

Conclusion

We showed how to deploy a solution for AI-driven player insights on AWS, and how to build an ML model that more holistically infers a player’s Skill value. Based on what makes a player skillful within your game, you can choose which in-game factors the ML model uses to determine player skill. The result is a more precise player skill rating that you can use to create balanced and competitive matches.

Be sure to join us again for Implementing AI-Powered Matchmaking with Amazon GameLift FlexMatch, the second blog in this series. We’ll show you how to use the results from this blog’s model to match players using Amazon GameLift FlexMatch. You will also learn how to simulate matchmaking using the Amazon GameLift Testing Toolkit to test both the ML model and your matchmaking parameters.

Contact an AWS Representative to learn how we can help accelerate your business.


Christina Defoor

Christina Defoor is a Solutions Architect at AWS. She helps gaming studios architect and implement cloud solutions that drive innovation and scale their games and services.

Alexander Qin

Alexander Qin is a Startups Solutions Architect for AWS who is passionate about games and game development.