Optimize HAQM VPC Flow Logs analysis with Cribl Stream sampling
To remain competitive, software development enterprises must constantly seek methods to optimize their cloud infrastructure monitoring. HAQM Virtual Private Cloud (HAQM VPC) Flow Logs, an important feature in AWS, captures information about IP traffic going to and from network interfaces in your VPC. These logs are essential for network monitoring, security analysis, and compliance auditing. However, organizations face significant challenges when working with VPC Flow Logs. The sheer volume of data generated can lead to high storage and processing costs. Additionally, the complexity of log formats makes it difficult to extract meaningful insights. Enriching the logs with contextual information and creating effective visualizations typically requires specialized tools and significant effort.
Cribl, an AWS Partner Network (APN) member available in AWS Marketplace, addresses these challenges head-on. As a leading telemetry data pipeline, Cribl Stream sits between data sources and destinations, allowing for flexible manipulation of flow logs in transit. It offers unique capabilities, including real-time filtering to reduce data volume, on-the-fly enrichment with external data sources, and format transformation for easier analysis. This approach not only cuts costs but also enhances the value of flow log data, enabling more efficient network monitoring and security insights.
Prerequisites
To implement this solution, you need the following prerequisites in place:
- An AWS account with administrative access
- An existing HAQM S3 bucket that VPC Flow Logs has permission to write to, or a new S3 bucket created for this purpose
- Appropriate Identity and Access Management (IAM) permissions configured for the user who needs to enable flow logs
- A Cribl Cloud account
Solution overview
This post demonstrates how to integrate Cribl Stream with VPC Flow Logs to optimize your network traffic analysis. You’ll learn how to:
- Subscribe to Cribl through AWS Marketplace
- Set up VPC Flow Logs integration with Cribl Stream
- Use Cribl Search to query data from Cribl Lake
- Generate custom queries and charts for VPC flow log metrics
Figure 1 shows the architectural overview of the solution.
Solution walkthrough
This section walks you through the steps required to deploy the solution.
Subscribe to Cribl Cloud through AWS Marketplace
- Sign in to your AWS Management Console and go to the Cribl page in AWS Marketplace.
- Choose Cribl.Cloud Suite. Read through the End User License Agreement and choose Try for free.
- When the subscription is complete, you’ll receive a confirmation email from AWS Marketplace.
Step 1: Enable VPC Flow Logs and publish flow logs
To enable VPC Flow Logs and publish flow logs to HAQM Simple Storage Service (HAQM S3), follow these steps (a scripted alternative follows the list):
- On the HAQM VPC console, in the navigation pane, choose Your VPCs. Select the checkbox for the VPC you want to monitor.
- Choose Actions and then Create flow log. For more information, refer to Create a flow log that publishes to HAQM S3.
- For Destination, choose Send to an HAQM S3 bucket.
- For S3 bucket ARN, specify the HAQM Resource Name (ARN) of an existing S3 bucket.
- Choose Create flow log.
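If you prefer to script this step, the following is a minimal sketch using the AWS SDK for Python (Boto3); the VPC ID and bucket ARN are placeholders to replace with your own values.

# Hypothetical sketch: enable VPC Flow Logs to HAQM S3 with Boto3.
import boto3

ec2 = boto3.client("ec2")

response = ec2.create_flow_logs(
    ResourceType="VPC",
    ResourceIds=["vpc-0abc123de456f7890"],  # the VPC selected in the console
    TrafficType="ALL",  # capture accepted and rejected traffic
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::my-flow-logs-bucket",  # existing bucket ARN
)
print(response["FlowLogIds"])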
Step 2: Create an HAQM SQS queue
To create an HAQM Simple Queue Service (HAQM SQS) queue, follow these steps:
- In the HAQM SQS console in your AWS Region, create a queue. For instructions, see Getting Started with HAQM SQS in the HAQM Simple Queue Service Developer Guide.
- Replace the access policy that’s attached to the SQS queue with a policy that allows HAQM S3 to send messages to the queue (a sample is sketched after this list).
- Copy the queue ARN for use in Step 3.
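As a reference, here is one way to script the queue setup with Boto3; the queue name, bucket name, and account ID are placeholder assumptions. The attached policy grants HAQM S3 permission to send event notifications to the queue.

# Hypothetical sketch: create the SQS queue and attach an access policy
# that lets HAQM S3 publish event notifications to it.
import json
import boto3

sqs = boto3.client("sqs")

queue_url = sqs.create_queue(QueueName="vpc-flow-logs-queue")["QueueUrl"]
queue_arn = sqs.get_queue_attributes(
    QueueUrl=queue_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {
            # Placeholder bucket name and account ID
            "ArnLike": {"aws:SourceArn": "arn:aws:s3:::my-flow-logs-bucket"},
            "StringEquals": {"aws:SourceAccount": "111122223333"},
        },
    }],
}
sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": json.dumps(policy)})
print(queue_arn)  # copy this ARN for Step 3 and Step 4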
Step 3: Enable SQS notifications using the HAQM S3 console
To enable SQS notifications using the HAQM S3 console, follow these steps (a scripted equivalent follows the list):
- Sign in to the AWS Management Console and open the HAQM S3 console.
- Select the bucket that captures the flow logs from Step 1.
- Choose Properties, Event Notifications, and Create event notification. For more information, refer to Enabling HAQM SNS, HAQM SQS, or Lambda notifications using the HAQM S3 console.
- In the Destination section, choose SQS queue as the destination type and select the ARN of the HAQM SQS queue you created in Step 2.
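A minimal Boto3 sketch of the same configuration, assuming the placeholder bucket and queue from the previous steps:

# Hypothetical sketch: notify the SQS queue whenever a new flow log
# object lands in the bucket.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-flow-logs-bucket",
    NotificationConfiguration={
        "QueueConfigurations": [{
            "QueueArn": "arn:aws:sqs:us-east-1:111122223333:vpc-flow-logs-queue",
            "Events": ["s3:ObjectCreated:*"],  # fire on every new log object
        }]
    },
)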
Step 4: Set up Cribl Stream with HAQM SQS as the source
To set up Cribl Stream with HAQM SQS as the source, follow these steps (an optional verification sketch follows these steps):
- On the top bar, navigate to Products, then choose Cribl Stream.
- From the top navigation, select Manage, then select a Worker Group to configure.
- To configure using the graphical QuickConnect UI, choose Routing and then QuickConnect (Stream).
- On the left side, choose Add Source. From the resulting drawer’s tiles, select Pull, then HAQM, and then HAQM SQS.
- Under Input ID, enter a unique name to identify this SQS source definition.
- Under Queue, enter the ARN of the SQS queue to read events from (copied in Step 2).
- For Authentication Configuration, use the Authentication method dropdown menu to select an AWS authentication method, and provide the AWS access key and secret key you generated.
- Choose Save.
Figure 2 shows these steps in the console.
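Before or while wiring up the source, you can confirm that S3 event notifications are actually arriving in the queue. The following is a quick Boto3 check with a placeholder queue URL; unread messages reappear after the visibility timeout, since this sketch doesn’t delete them.

# Hypothetical sketch: peek at pending S3 event notifications.
import boto3

sqs = boto3.client("sqs")

resp = sqs.receive_message(
    QueueUrl="http://sqs.us-east-1.amazonaws.com/111122223333/vpc-flow-logs-queue",
    MaxNumberOfMessages=5,
    WaitTimeSeconds=10,  # long polling
)
for msg in resp.get("Messages", []):
    print(msg["Body"][:200])  # each body is an S3 event notification (JSON)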
Step 5: Configure Cribl Lake as the destination
To configure Cribl Lake as the destination, follow these steps:
- In the Cribl Stream Console, select Worker Groups and choose your Worker Group.
- On the Routing tab, choose QuickConnect.
- On the right, choose Add Destination.
- From the list, select the destination you want; in this example, Cribl Lake.
- On the destination, choose Configure.
- Under the Lake dataset dropdown menu, choose default_metrics.
- Choose Save.
In the example shown (Figure 3), we’re adding Cribl Lake as the destination for the flow logs.
Step 6: Configure Cribl Pipeline
To configure the Cribl Pipeline, follow these steps:
- In the Cribl Stream Console, select Worker Groups and choose your Worker Group.
- Select Routing, then QuickConnect. For more information, refer to QuickConnect.
- To connect the source and destination, select the + icon on your source and drag a line to your destination.
- In the Connection Configuration modal, select Pipeline or Pack, as shown in Figure 4.
In the example shown in Figure 5, we’re installing a Pack called VPC Flow Pack, which helps security teams reduce data volume and enrich data with contextual information such as GeoIP. The GeoIP Function enriches events with geographic fields based on IP addresses. It works with MaxMind’s GeoIP binary database.
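Conceptually, GeoIP enrichment is a lookup of each IP address against the MaxMind database. The following Python sketch illustrates the idea only; it is not the Pack’s implementation, and the database path and output field names are assumptions.

# Illustrative sketch: enrich an event with geographic fields by looking
# up its source IP in a MaxMind GeoLite2 binary database, using the open
# source maxminddb package.
import maxminddb

reader = maxminddb.open_database("GeoLite2-City.mmdb")  # assumed path

event = {"srcaddr": "203.0.113.10", "dstaddr": "10.0.0.5", "bytes": 540}
geo = reader.get(event["srcaddr"]) or {}  # None for IPs not in the database

# Copy a couple of geographic fields onto the event, mirroring what a
# GeoIP enrichment function does conceptually.
event["src_country"] = geo.get("country", {}).get("iso_code")
loc = geo.get("location", {})
event["src_lat"], event["src_lon"] = loc.get("latitude"), loc.get("longitude")
print(event)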
Step 7: Commit, deploy, and verify the configuration
To commit, deploy, and verify the configuration, follow these steps:
- To deploy the source, destination, and pipeline configurations, in the top right corner, choose Commit & Deploy.
- Verify that the source and destination are working by checking them for common errors and warnings.
Step 8: Query and analyze data
To query and analyze data, follow these steps:
- Under Products, choose Cribl Lake.
- Choose SEARCH and enter default_metrics in the search bar. For more information, refer to Meet Cribl Search.
- In the query editor, you can run queries written in Kusto Query Language (KQL) to analyze your flow logs.
The following are some Cribl sample queries, as shown in Figure 6:
dataset="default_metrics"
| extract parser='AWS VPC Flow Logs'
| where bytes > 100
This query parses events from the default_metrics dataset with the AWS VPC Flow Logs parser and keeps only records that transferred more than 100 bytes.
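To make the parser’s output concrete, here is an illustrative Python sketch (not Cribl code) of the fields a default-format version 2 flow log record flattens into; the sample record values are made up.

# Illustrative sketch: a default-format VPC flow log record is
# space-separated with these fields in this order.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

record = ("2 111122223333 eni-0a1b2c3d 203.0.113.10 10.0.0.5 "
          "443 49152 6 10 840 1620000000 1620000060 ACCEPT OK")

event = dict(zip(FIELDS, record.split()))
print(event["bytes"], event["action"])  # fields the sample queries filter on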
To render parsed results as a table:
dataset="cribl_search_sample"
| extract parser='AWS VPC Flow Logs'
| render table
This query renders the parsed VPC Flow Logs from Cribl’s sample dataset as a table; with the GeoIP enrichment applied, source and destination locations give clear visibility into network traffic patterns (Figure 7).
To filter by time range:
dataset="default_metrics" earliest=-30d latest=-1h
| extract parser='AWS VPC Flow Logs'
| where bytes > 100 and action == "ACCEPT"
| timestats span=1d by host
This query, shown in Figure 8, narrows results to flow logs from between 30 days and 1 hour ago, keeps records where bytes > 100 and the action is ACCEPT, and aggregates them daily by host.
Cleanup
There is a cost associated with using this solution for the services (AWS and Cribl) involved. To avoid incurring unnecessary charges, follow these steps to clean up the resources you created during this walkthrough:
- Delete the VPC flow logs, including the S3 bucket created as part of the solution walkthrough. After you delete a flow log, it can take several minutes to stop collecting data. Deleting a flow log doesn’t delete the log data from the destination or modify the destination resource. You must delete the existing flow log data directly from the destination and clean up the destination resource, using the console for the destination service.
- Remove any temporary or unused configurations used in Cribl.
- Follow the AWS Marketplace process for canceling your SaaS subscription.
Conclusion
In this post, we showed how Cribl improves VPC Flow Logs management through efficient data routing, filtering, and transformation. Organizations using Cribl Stream can reduce their AWS network traffic analysis costs while improving log quality. The platform enriches VPC Flow Logs with additional context and standardizes log formats, making analysis more straightforward. Benefits include reduced data volume and storage costs, enhanced network visibility, improved security monitoring, and flexible routing to preferred analytics tools.
For next steps, learn more about VPC Flow Logs and subscribe to Cribl in AWS Marketplace.