AWS Cloud Operations Blog
Monitor AWS Transit Gateway Flow Logs centrally using HAQM Managed Grafana
As organizations continue to expand their cloud infrastructure by connecting multiple HAQM Virtual Private Clouds (HAQM VPC) across accounts and regions, the complexity of managing their network environment increases. AWS Transit Gateway has emerged as a powerful solution to simplify this complexity by providing a centralized hub for secure communication between HAQM VPCs, on-premises systems, and other transit gateways.
HAQM VPC Transit Gateway Flow Logs gives you visibility into the network traffic that passes through your transit gateways. The logs capture detailed information about each flow, such as source and destination IP addresses, ports, protocol, traffic counters, timestamps, and other metadata. Flow logs support use cases such as network troubleshooting, capacity planning, compliance, and security. You can use HAQM Managed Grafana to visualize and monitor Transit Gateway Flow Logs centrally. By analyzing traffic patterns, teams can proactively identify anomalies, plan capacity, and detect suspicious activity. This visibility helps organizations troubleshoot issues, scale infrastructure, and maintain the overall health of their distributed cloud environments.
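To make the record structure concrete, here is a minimal sketch of parsing one plain-text flow log line in Python. It assumes a custom log format that contains only the nine fields listed in FIELDS; that subset and the sample line in the usage note are illustrative assumptions, not the default Transit Gateway format.

```python
# Assumed custom log format: a small, illustrative subset of flow log fields.
FIELDS = ["srcaddr", "dstaddr", "srcport", "dstport", "protocol",
          "packets", "bytes", "start", "end"]
INT_FIELDS = {"srcport", "dstport", "protocol", "packets", "bytes", "start", "end"}


def parse_record(line, fields=FIELDS):
    """Split one space-separated flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    record = {}
    for name, value in zip(fields, values):
        if value == "-":
            # Flow logs write "-" when a field has no data for the record.
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record
```

For example, parse_record("10.0.1.5 10.1.2.9 44312 443 6 12 3480 1700000000 1700000060") returns a dictionary in which "bytes" is the integer 3480. In this post the same parsing is handled by AWS Glue and HAQM Athena, so the sketch is only meant to show what a record contains.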
In this post, we will set up an HAQM Managed Grafana dashboard to visualize and centrally monitor Transit Gateway Flow Logs stored in an HAQM Simple Storage Service (HAQM S3) bucket, leveraging AWS Glue and HAQM Athena for data cataloging and querying respectively.
Architecture Overview
The following architecture diagram illustrates how flow logs for traffic from multiple HAQM VPCs traversing your transit gateways are delivered to a centralized HAQM S3 bucket. An AWS Glue crawler crawls the log data in this bucket and creates table definitions in the AWS Glue Data Catalog. Next, HAQM Athena is used to create a view over the cataloged data for efficient querying. Finally, we use the HAQM Athena data source in HAQM Managed Grafana to create dashboards and visualize the AWS Transit Gateway Flow Logs.
Figure 1 – Architecture overview
Prerequisites
- Existing AWS Transit Gateways. If you don’t have a Transit Gateway set up in your account, please refer to the AWS documentation to create one.
- Create an HAQM S3 bucket for storing Transit Gateway Flow Logs.
- Set up Athena workgroups with HAQM Managed Grafana prerequisites.
- Configure an HAQM Managed Grafana workspace by following the steps in Creating a workspace. This will help you set up the workspace and assign a user as administrator, so that they can access the Grafana dashboard using the workspace URL with full administrative capabilities.
- In this post, we’re using the AWS IAM Identity Center option with HAQM Managed Grafana. To set up Authentication and Authorization, follow the instructions in the HAQM Managed Grafana User Guide to enable AWS IAM Identity Center.
- To use AWS data source configuration, first use the HAQM Managed Grafana console to enable service-managed AWS Identity and Access Management (IAM) roles that grant the workspace the IAM policies necessary to access resources in your AWS account or AWS Organization. Then, use the HAQM Managed Grafana workspace console to add the HAQM Athena data source.
Security/IAM Note: This setup is for demonstration purposes only. Configure IAM permissions following the principle of least privilege (PoLP). Refer to Security best practices in IAM.
- The HAQM S3 permissions for accessing the underlying data source of an Athena query are not included in this managed policy. You must add the necessary permissions for the HAQM S3 buckets manually, on a case-by-case basis. Refer to Athena prerequisites documentation.
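As a rough illustration of the manual S3 permissions mentioned in the last prerequisite, the following Python sketch builds a read-only IAM policy document for the flow log bucket. The bucket name example-tgw-flow-logs is hypothetical, and the action list is a minimal assumption to review against the Athena prerequisites documentation before use.

```python
import json


def athena_s3_read_policy(bucket_name):
    """Build a read-only policy document for the bucket that holds the flow logs."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListFlowLogBucket",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": bucket_arn,  # bucket-level actions
            },
            {
                "Sid": "ReadFlowLogObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"{bucket_arn}/*",  # object-level actions
            },
        ],
    }


# Hypothetical bucket name, for illustration only.
print(json.dumps(athena_s3_read_policy("example-tgw-flow-logs"), indent=2))
```

The location where Athena writes query results needs write access as well, which this read-only sketch does not cover.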
Step 1: Launch the AWS CloudFormation template
We use an AWS CloudFormation (CFN) template to build the infrastructure. The template creates:
- An HAQM S3 bucket to store Transit Gateway Flow Logs in the primary AWS Account
- AWS Glue crawler and database configuration
- HAQM Athena workgroup setup
- Athena view deployment as a Named Query
Note: Some of the resources that this stack deploys incur costs when in use.
After you have confirmed that you meet all prerequisites, deploy the CloudFormation template: BlogCFN.Yaml
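If you prefer to deploy from a script instead of the console, the following is a minimal sketch using boto3. The stack name and local template path are placeholders you supply, and the CAPABILITY_NAMED_IAM acknowledgement is an assumption that the template creates IAM roles; check the template before relying on it.

```python
def build_create_stack_request(stack_name, template_body):
    """Assemble keyword arguments for the CloudFormation CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # Assumption: the template creates IAM roles (for example, for the
        # Glue crawler), so the IAM capability has to be acknowledged.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def deploy(stack_name, template_path):
    """Create the stack and block until CloudFormation reports completion."""
    import boto3  # imported here so the request builder works without boto3

    with open(template_path) as handle:
        request = build_create_stack_request(stack_name, handle.read())
    cloudformation = boto3.client("cloudformation")
    cloudformation.create_stack(**request)
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
```

Calling deploy with a stack name of your choice and the path to your downloaded copy of the template starts the deployment and returns once the stack is created.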
Step 2: Configure AWS Transit Gateway Flow Logs and store them to HAQM S3 bucket
You can configure flow logs to publish log data to HAQM S3 using the AWS Management Console or the AWS Command Line Interface (AWS CLI). The logs are published to an existing HAQM S3 bucket that you specify. To create a flow log from the console:
- Launch the HAQM VPC console.
- From the navigation pane choose Transit gateways or Transit gateway attachments.
- Choose the checkbox for one or more transit gateways or transit gateway attachments.
- Choose Actions > Create flow log.
- For Destination, choose Send to an S3 bucket.
- For S3 bucket ARN, you can either use the automatically created S3 bucket (created by AWS CloudFormation template in the step above) or specify the HAQM Resource Name (ARN) of an existing HAQM S3 bucket. When configuring the flow logs, you can optionally specify a subfolder within the S3 bucket, like “my-bucket/my-logs/”, with the S3 bucket ARN. Note that “AWSLogs” cannot be used as a subfolder name, as it is a reserved term. If you own the bucket, AWS automatically creates a resource policy and attaches it to the bucket. For more information, see HAQM S3 bucket permissions for flow logs.
- For Log record format, specify the format for the flow log record.
- To use the default flow log record format, choose AWS default format.
- To create a custom format, choose Custom format. For Log format, choose the fields to include in the flow log record.
- For Log file format, specify the format for the log file.
- Text – Plain text. This is the default format.
- Parquet – Apache Parquet is a columnar data format. Queries on data in Parquet format are 10 to 100 times faster compared to queries on data in plain text. Data in Parquet format with Gzip compression takes 20 percent less storage space than plain text with Gzip compression.
- (Optional) To use Hive-compatible S3 prefixes, choose Hive-compatible S3 prefix, Enable.
- (Optional) To partition your flow logs per hour, choose Every 1 hour (60 mins).
- (Optional) To add a tag to the flow log, choose Add new tag and specify the tag key and value.
- Choose Create flow log.
Figure 2 – Create AWS Transit Gateway Flow logs
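The console steps above can also be approximated with the EC2 CreateFlowLogs API. The sketch below builds the request in Python and validates the reserved "AWSLogs" subfolder name. The function name, its defaults, and the IDs and ARNs in the usage note are illustrative assumptions.

```python
def build_flow_log_request(transit_gateway_ids, bucket_arn, subfolder="",
                           file_format="parquet", hive_compatible=True,
                           per_hour=True):
    """Assemble keyword arguments for the EC2 CreateFlowLogs call."""
    if file_format not in ("plain-text", "parquet"):
        raise ValueError("file_format must be 'plain-text' or 'parquet'")
    folder = subfolder.strip("/")
    # Conservative check: reject the reserved name at any level of the path.
    if "AWSLogs" in folder.split("/"):
        raise ValueError('"AWSLogs" is reserved and cannot be used as a subfolder')
    destination = f"{bucket_arn}/{folder}/" if folder else bucket_arn
    return {
        "ResourceIds": list(transit_gateway_ids),
        "ResourceType": "TransitGateway",
        "LogDestinationType": "s3",
        "LogDestination": destination,
        # Transit gateway flow logs use a 60-second aggregation interval.
        "MaxAggregationInterval": 60,
        "DestinationOptions": {
            "FileFormat": file_format,
            "HiveCompatiblePartitions": hive_compatible,
            "PerHourPartition": per_hour,
        },
    }


def create_flow_log(request):
    """Send the request; requires boto3 and AWS credentials."""
    import boto3  # imported here so the request builder works without boto3

    return boto3.client("ec2").create_flow_logs(**request)
```

For example, build_flow_log_request(["tgw-0123456789abcdef0"], "arn:aws:s3:::my-bucket", subfolder="my-logs/") produces a request whose LogDestination is "arn:aws:s3:::my-bucket/my-logs/". To log individual attachments instead, change ResourceType to "TransitGatewayAttachment" and pass attachment IDs.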
The IP traffic going to and from the AWS Transit Gateway is captured and stored in the S3 bucket specified when creating the flow log. For this blog post, we configured the AWS Glue crawler to run every hour. You can modify this schedule based on your requirements by following the AWS Glue documentation on updating crawler schedules.
Once a flow log file is generated and stored in the S3 bucket, the AWS Glue crawler scans and catalogs the data from the bucket and automatically creates or updates metadata in the AWS Glue database and tables.
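As one way to change the crawler schedule programmatically, the sketch below builds an AWS cron expression and passes it to the Glue UpdateCrawlerSchedule API. The helper names are illustrative, and the crawler name is a placeholder because it depends on what the CloudFormation stack created.

```python
def hourly_cron(every_hours=1):
    """Return an AWS cron expression that fires at minute 0 every N hours."""
    if not 1 <= every_hours <= 23:
        raise ValueError("every_hours must be between 1 and 23")
    if every_hours == 1:
        return "cron(0 * * * ? *)"
    # "0/N" in the hours field means: starting at hour 0, every N hours.
    return f"cron(0 0/{every_hours} * * ? *)"


def update_schedule(glue_client, crawler_name, every_hours):
    """Apply the new schedule; glue_client is a boto3 Glue client."""
    glue_client.update_crawler_schedule(
        CrawlerName=crawler_name,
        Schedule=hourly_cron(every_hours),
    )
```

With boto3 installed, update_schedule(boto3.client("glue"), "<your-crawler-name>", 6) would switch the crawler to a six-hour schedule.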
Step 3: Create an HAQM Athena view using the saved queries created as part of the AWS CloudFormation stack
- Go to HAQM Athena > Query editor > Saved queries tab and choose the query named “aws_tgw_centralized_logging”.
Note: The workgroup created by the CloudFormation stack is named “tgw-logs-athena”.
Figure 3 – Athena saved query
- On the Query editor, verify the Data source, Database and Table names while running the query. Upon successful execution, the query creates a view named “tgwlogs”.
Figure 4 – Run Athena saved query
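To confirm that the view returns data before wiring it into a dashboard, you could run a quick query through the Athena API. The following is a minimal sketch that starts a query in the “tgw-logs-athena” workgroup and polls until it finishes. The database name is a placeholder, and the workgroup is assumed to have a query result location configured.

```python
import time


def run_query(athena_client, sql, database, workgroup="tgw-logs-athena",
              poll_seconds=1.0):
    """Run a query and return the first page of result rows.

    athena_client is a boto3 Athena client (or any object with the same methods).
    """
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )
    query_id = started["QueryExecutionId"]
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)  # still QUEUED or RUNNING
    if state != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {state}")
    results = athena_client.get_query_results(QueryExecutionId=query_id)
    return results["ResultSet"]["Rows"]
```

For example, run_query(boto3.client("athena"), "SELECT * FROM tgwlogs LIMIT 10", "<your-glue-database>") returns the header row followed by up to ten records from the view.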
Step 4: Configure HAQM Athena data source in HAQM Managed Grafana
- After creating the HAQM Managed Grafana workspace and assigning the user as an administrator as described in the prerequisites, log in to the HAQM Managed Grafana dashboard using the workspace URL.
- Navigate to Data sources and select HAQM Athena from the options.
- Configure the HAQM Athena settings by specifying the Default Region, Data source, Database, and Workgroup, and set the HAQM S3 Output Location for your HAQM Athena queries.
- Choose “Save & test” to confirm that the data source is functioning properly. You can now begin querying and visualizing metrics from the AWS environment.
Figure 5 – Add Athena as data source
Step 5: Create HAQM Managed Grafana dashboard using Athena as data source
HAQM Managed Grafana is a fully managed service that makes it easy to create, configure, and share interactive dashboards and charts for monitoring your data. You can also use HAQM Managed Grafana to set up alerts and notifications based on specific conditions or thresholds, allowing you to quickly identify and respond to issues.
In this step, we will use HAQM Managed Grafana to create a near real-time dashboard to visualize your AWS Transit Gateway Flow Logs.
- You can either create a new HAQM Managed Grafana dashboard or import one using JSON to visualize your transit gateway flow logs.
- Download the sample dashboard JSON file that you can import to visualize various metrics and build upon this template.
Figure 6 – Import HAQM Managed Grafana dashboard template
- Once the sample JSON is loaded successfully, your HAQM Managed Grafana dashboard for the Transit Gateway Flow Logs will provide:
Comprehensive data insights: Track the total bytes exchanged between source and destination addresses for a clear overview of data transfers.
Strategic prioritization: Identify the top source and destination addresses based on packet counts, enabling efficient prioritization of network analysis.
Network optimization: Gain valuable insights by visualizing the top three source and destination subnets or HAQM VPCs according to bytes transferred, aiding in optimizing network performance.
Granular trend analysis: Utilize HAQM Managed Grafana to analyze byte and packet flow trends, both inbound and outbound, within specific Regions and time ranges using selected transit gateways and their attachment IDs.
Proactive issue detection: Stay ahead by detecting packet drops due to routing issues or black holes. Monitor and identify these incidents within chosen Regions and time frames for prompt action.
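If you build panels of your own instead of importing the sample, queries along the following lines could back the "top talkers" and "packet drops" panels. This is a sketch, not the SQL used by the sample dashboard. It assumes that the “tgwlogs” view exposes columns named srcaddr, dstaddr, bytes, packets, packets_lost_no_route, and packets_lost_blackhole, so verify the column names in your view first.

```python
def top_talkers_sql(view="tgwlogs", metric="bytes", limit=10):
    """Rank source/destination pairs by total bytes or packets."""
    if metric not in ("bytes", "packets"):
        raise ValueError("metric must be 'bytes' or 'packets'")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return (
        f"SELECT srcaddr, dstaddr, SUM({metric}) AS total_{metric} "
        f"FROM {view} "
        f"GROUP BY srcaddr, dstaddr "
        f"ORDER BY total_{metric} DESC "
        f"LIMIT {int(limit)}"
    )


def dropped_packets_sql(view="tgwlogs"):
    """Total packets lost to missing routes or black holes (assumed column names)."""
    return (
        "SELECT SUM(packets_lost_no_route) AS lost_no_route, "
        "SUM(packets_lost_blackhole) AS lost_blackhole "
        f"FROM {view}"
    )


print(top_talkers_sql(metric="packets", limit=3))
```

The view name is interpolated directly into the SQL text, so pass only trusted identifiers. In a real panel you would also restrict the query to the dashboard's selected time range.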
Figure 7 – Visualize AWS Transit Gateway flow logs on HAQM Managed Grafana Dashboard
Now we have the AWS Transit Gateway insights on HAQM Managed Grafana. This dashboard refreshes every five minutes and runs queries against the view that we previously created in HAQM Athena. Finally, HAQM Managed Grafana alerting provides actionable alerts that help us learn about problems in the system shortly after they occur. To learn more about HAQM Managed Grafana alerting, visit “Alerts in Grafana”.
Clean up
To avoid ongoing charges in your AWS account, delete the AWS resources created in this post, including those listed in the prerequisites section. Also delete any resources that you created manually in the AWS Management Console.
- Delete AWS CloudFormation Stack.
- Delete HAQM Managed Grafana workspace.
- Delete HAQM Athena workgroup.
Note: AWS CloudFormation can delete only empty S3 buckets. Stack deletion fails if the S3 bucket contains objects, so empty the bucket before you delete the CloudFormation stack.
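Emptying the bucket can be scripted before the stack is deleted. The following minimal sketch lists objects page by page and deletes each page in one batch. It assumes that the bucket is not versioned; a versioned bucket would also need its object versions and delete markers removed.

```python
def empty_bucket(s3_client, bucket):
    """Delete every object in the bucket and return how many were removed.

    s3_client is a boto3 S3 client (or any object with the same methods).
    """
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # Each page holds at most 1,000 keys, which fits one DeleteObjects call.
        keys = [{"Key": item["Key"]} for item in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

After empty_bucket(boto3.client("s3"), "<your-flow-log-bucket>") returns, the stack can be removed, for example with boto3.client("cloudformation").delete_stack(StackName="<your-stack-name>").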
Conclusion
This blog post demonstrated how to create visualizations using HAQM Managed Grafana dashboards for your AWS Transit Gateway Flow Logs. The ability to visualize metrics data helps save time through proactive capacity planning and trend identification, which leads to infrastructure cost savings. Additionally, visualization on HAQM Managed Grafana dashboard helps identify anomalies in source-destination traffic and enables prompt troubleshooting steps to minimize resolution time.
You can get hands-on experience by exploring the One Observability Workshop. Visit the AWS Observability guide to learn more about best practices. To get started and learn more, visit HAQM VPC Transit Gateways Flow Logs and HAQM Managed Grafana Dashboards.
About the authors: