AWS Marketplace

Optimize HAQM VPC Flow Logs analysis with Cribl Stream sampling

To remain competitive, software development enterprises must constantly seek methods to optimize their cloud infrastructure monitoring. HAQM Virtual Private Cloud (HAQM VPC) Flow Logs, an important feature in AWS, captures information about IP traffic going to and from network interfaces in your VPC. These logs are essential for network monitoring, security analysis, and compliance auditing. However, organizations face significant challenges when working with VPC Flow Logs. The sheer volume of data generated can lead to high storage and processing costs. Additionally, the complexity of log formats makes it difficult to extract meaningful insights. Enriching the logs with contextual information and creating effective visualizations typically requires specialized tools and significant effort.

Cribl, an AWS Partner Network (APN) member available in AWS Marketplace, addresses these challenges head-on. As a leading telemetry data pipeline, Cribl Stream sits between data sources and destinations, allowing for flexible manipulation of flow logs in transit. It offers unique capabilities, including real-time filtering to reduce data volume, on-the-fly enrichment with external data sources, and format transformation for easier analysis. This approach not only cuts costs but also enhances the value of flow log data, enabling more efficient network monitoring and security insights.

Prerequisites

To perform the solution, you need to have the following prerequisites in place:

  • An AWS account with permissions to manage HAQM VPC, HAQM S3, and HAQM SQS resources
  • A VPC with network traffic that you want to capture and analyze

Solution overview

This post demonstrates how to integrate Cribl Stream with VPC Flow Logs to optimize your network traffic analysis. You’ll learn how to:

  • Subscribe to Cribl through AWS Marketplace
  • Set up VPC Flow Logs integration with Cribl Stream
  • Use Cribl Search to query data from Cribl Lake
  • Generate custom queries and charts for VPC flow log metrics

Figure 1 shows the architectural overview of the solution.

Figure 1: Architecture overview of flow logs ingestion to Cribl environment


Solution walkthrough

This section walks you through the steps required to deploy the solution.

Subscribe to Cribl Cloud through AWS Marketplace

  1. Sign in to your AWS Management Console and go to the Cribl page in AWS Marketplace.
  2. Choose Cribl.Cloud Suite. Read through the End User License Agreement and choose Try for free.
  3. When the subscription is complete, you’ll receive a confirmation email from AWS Marketplace.

Step 1: Enable VPC Flow Logs and publish flow logs

To enable VPC Flow Logs and publish flow logs to HAQM Simple Storage Service (HAQM S3), follow these steps:

  1. On the HAQM VPC console, in the navigation pane, choose Your VPCs. Select the checkbox for the VPC you want to monitor.
  2. Choose Actions and then Create flow log. For more information, refer to Create a flow log that publishes to HAQM S3.
  3. For Destination, choose Send to an HAQM S3 bucket.
  4. For S3 bucket ARN, specify the HAQM Resource Name (ARN) of an existing S3 bucket.
  5. Choose Create flow log.
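The console steps above can also be scripted. The following is a minimal boto3 sketch for the same CreateFlowLogs call; the VPC ID and bucket ARN are placeholders you must replace with your own values:

```python
def flow_log_args(vpc_id: str, bucket_arn: str) -> dict:
    """Build the arguments for the EC2 CreateFlowLogs API with an S3 destination."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",          # capture both accepted and rejected traffic
        "LogDestinationType": "s3",
        "LogDestination": bucket_arn,  # e.g. arn:aws:s3:::my-flow-log-bucket
    }

# With boto3 installed and credentials configured, the call would be:
# import boto3
# ec2 = boto3.client("ec2")
# ec2.create_flow_logs(**flow_log_args("vpc-0123456789abcdef0",
#                                      "arn:aws:s3:::my-flow-log-bucket"))
```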

Step 2: Create an HAQM SQS queue

To create an HAQM Simple Queue Service (HAQM SQS) queue, follow these steps:

  1. In the HAQM SQS console in your AWS Region, create a queue. For instructions, see Getting Started with HAQM SQS in the HAQM Simple Queue Service Developer Guide.
  2. Replace the access policy that’s attached to the SQS queue with the following policy.
  3. Copy the queue ARN for use in Step 3.
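The access policy itself is not reproduced above. A typical policy for this pattern grants the HAQM S3 service principal permission to send messages to the queue, scoped to your bucket and account; the sketch below builds such a policy document (the ARNs and account ID are placeholders you must replace):

```python
import json

def s3_to_sqs_policy(queue_arn: str, bucket_arn: str, account_id: str) -> dict:
    """Access policy that lets the S3 bucket in your account deliver
    event notifications to the SQS queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "SQS:SendMessage",
            "Resource": queue_arn,
            "Condition": {
                # Restrict senders to your bucket and your account.
                "ArnLike": {"aws:SourceArn": bucket_arn},
                "StringEquals": {"aws:SourceAccount": account_id},
            },
        }],
    }

print(json.dumps(s3_to_sqs_policy(
    "arn:aws:sqs:us-east-1:111122223333:flow-log-queue",
    "arn:aws:s3:::my-flow-log-bucket",
    "111122223333"), indent=2))
```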

Step 3: Enable SQS notifications using the HAQM S3 console

To enable SQS notifications using the HAQM S3 console, follow these steps:

  1. Sign in to the AWS Management Console and open the HAQM S3 console.
  2. Select the bucket that captures flow logs from Step 1.
  3. Choose Properties, Event Notifications, and Create event notification. For more information, refer to Enabling HAQM SNS, HAQM SQS, or Lambda notifications using the HAQM S3 console.
  4. In the Destination section, choose SQS Queue as the destination type and select the ARN of the HAQM SQS queue you created in Step 2.
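The same event notification can be applied programmatically. This is a hedged boto3-style sketch, assuming a bucket name of my-flow-log-bucket; it sends all object-creation events (new flow log files) to the queue:

```python
def notification_config(queue_arn: str) -> dict:
    """Bucket notification configuration that routes s3:ObjectCreated:*
    events to the SQS queue."""
    return {
        "QueueConfigurations": [{
            "QueueArn": queue_arn,
            "Events": ["s3:ObjectCreated:*"],
        }]
    }

# With boto3 installed and credentials configured:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_notification_configuration(
#     Bucket="my-flow-log-bucket",
#     NotificationConfiguration=notification_config(
#         "arn:aws:sqs:us-east-1:111122223333:flow-log-queue"))
```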

Step 4: Set up Cribl Stream with HAQM SQS as the source

To set up Cribl Stream with HAQM SQS as the source, follow these steps:

  1. On the top bar, navigate to Products, then choose Cribl Stream.
  2. From the top navigation, select Manage, then select a Worker Group to configure.
  3. To configure using the graphical QuickConnect UI, choose Routing and then QuickConnect (Stream).
  4. On the left side, choose Add Source. From the resulting drawer’s tiles, select Pull, then HAQM, and then HAQM SQS.
  5. Under Input ID, enter a unique name to identify this SQS source definition.
  6. Under Queue, enter the ARN of the SQS queue to read events from.
  7. For Authentication Configuration, choose an AWS authentication method from the Authentication method dropdown menu and provide the AWS access key ID and secret access key you generated.
  8. Choose Save.

Figure 2 shows these steps in the console.

Figure 2: Configuring the source in Cribl Stream as SQS

Step 5: Configure Cribl Lake as the destination

To configure Cribl Lake as the destination, follow these steps:

  1. In the Cribl Stream Console, select Worker Groups and choose your Worker Group.
  2. On the Routing tab, choose QuickConnect.
  3. On the right, choose Add Destination.
  4. From the list, select the destination you want. In this example, you are adding Cribl Lake as the destination for flow logs.
  5. On the destination, choose Configure.
  6. Under the Lake dataset dropdown menu, choose default_metrics.
  7. Choose Save.

In the example shown (Figure 3), we’re adding Cribl Lake as the destination for the flow logs.

Figure 3: Configuring the destination as Cribl Lake


Step 6: Configure Cribl Pipeline

To configure the Cribl Pipeline, follow these steps:

  1. In the Cribl Stream Console, select Worker Groups and choose your Worker Group.
  2. Select Routing, then QuickConnect. For more information, refer to QuickConnect.
  3. To connect source and destination, drag a line between the desired source and destination. Select and drag from the + icon on your source to your destination.
  4. In the Connection Configuration modal, select Pipeline or Pack, as shown in Figure 4.

Figure 4: Configuring Pipeline and Packs in Cribl Stream

In the example shown in Figure 5, we’re installing a Pack called VPC Flow Pack, which helps security teams reduce data volume and enrich data with contextual information such as GeoIP. The GeoIP Function enriches events with geographic fields based on IP addresses. It works with MaxMind’s GeoIP binary database.

Figure 5: Configuring VPC Flow Logs Pack for security teams in Cribl Stream

Step 7: Commit, deploy, and verify configuration

To commit, deploy, and verify the configuration, follow these steps:

  1. To deploy the source, destination, and pipeline configurations, in the top right corner, choose Commit & Deploy.
  2. Verify that the source and destination are working by checking the logs for errors and warnings.

Step 8: Query and analyze data

To query and analyze data, follow these steps:

  1. Under Products, choose Cribl Lake.
  2. Choose SEARCH and enter default_metrics in the search bar. For more information, refer to Meet Cribl Search.
  3. In the query editor, you can run Kusto Query Language (KQL) queries to analyze your flow logs.

The following are some Cribl sample queries, as shown in Figure 6:

dataset="default_metrics"
| extract parser='AWS VPC Flow Logs'
| where bytes > 100

This query parses records from the default_metrics dataset with the AWS VPC Flow Logs parser and returns only flows with more than 100 bytes.

Figure 6: Chart inside Cribl Search querying Cribl Lake for bytes

To render parsed results as a table:

dataset="cribl_search_sample"
| extract parser='AWS VPC Flow Logs'
| render table

This query analyzes VPC Flow Logs by mapping source and destination locations through GeoIP, providing clear visibility into network traffic patterns (Figure 7).

Figure 7: Chart inside Cribl Search querying Cribl Lake based on filtered parameters

To filter by time range:

dataset="default_metrics" earliest=-30d latest=-1h
| extract parser='AWS VPC Flow Logs'
| where bytes > 100 and action == "ACCEPT"
| timestats span=1d by host

This query, shown in Figure 8, narrows results to flow logs from the last 30 days up to 1 hour ago, keeps only flows with more than 100 bytes and an ACCEPT action, and aggregates results in one-day buckets by host.

Figure 8: Chart inside Cribl Search querying Cribl Lake grouped by hosts
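The fields these queries filter on, such as bytes and action, come from the flow log records themselves. As a quick illustration, the following sketch parses one record in the default VPC Flow Logs format (the sample record is the standard 14-field version 2 layout):

```python
# Field order of the default VPC Flow Logs format (version 2).
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def parse_flow_log(line: str) -> dict:
    """Split one space-delimited flow log record into named fields."""
    rec = dict(zip(FIELDS, line.split()))
    # Convert numeric fields; "-" means no data for that field.
    for key in ("srcport", "dstport", "protocol", "packets", "bytes", "start", "end"):
        if rec.get(key, "-") != "-":
            rec[key] = int(rec[key])
    return rec

sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
rec = parse_flow_log(sample)
print(rec["bytes"], rec["action"])  # 4249 ACCEPT
```

A record like this one would pass both the `bytes > 100` and `action == "ACCEPT"` conditions used in the queries above.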

Cleanup

There is a cost associated with using this solution for the services (AWS and Cribl) involved. To avoid incurring unnecessary charges, follow these steps to clean up the resources you created during this walkthrough:

  • Delete the VPC flow logs, including the S3 bucket created as part of the solution walkthrough. After you delete a flow log, it can take several minutes to stop collecting data. Deleting a flow log doesn’t delete the log data from the destination or modify the destination resource. You must delete the existing flow log data directly from the destination and clean up the destination resource using the console for the destination service.
  • Remove any temporary or unused configurations used in Cribl.
  • Follow the AWS Marketplace process for canceling your SaaS subscription.

Conclusion

In this post, we showed how Cribl improves VPC Flow Logs management through efficient data routing, filtering, and transformation. Organizations using Stream reduce their AWS network traffic analysis costs while improving log quality. The platform enriches VPC Flow Logs with additional context and standardizes log formats, making analysis more straightforward. Benefits include reduced data volume and storage costs, enhanced network visibility, improved security monitoring, and flexible routing to preferred analytics tools.

For next steps, learn more about VPC Flow Logs and subscribe to Cribl in AWS Marketplace.

About the authors

riz.jpg

Rizwan Mushtaq

Rizwan is a Principal Solutions Architect at AWS. He helps customers design innovative, resilient, and cost-effective solutions using AWS services. He holds an MS in Electrical Engineering from Wichita State University.

suramac.jpg

Sunil Ramachandra

Sunil is a Senior Solutions Architect enabling hyper-growth Independent Software Vendors (ISVs) to innovate and accelerate on AWS. He partners with customers to build highly scalable and resilient cloud architectures. When not collaborating with customers, Sunil enjoys spending time with family, running, meditating, and watching movies on Prime Video.

Kamilo-1.jpg

Kamilo “Kam” Amir

Kamilo “Kam” Amir is the Director of Business Development for Cribl and is based in the Washington, D.C. area. He’s been with Cribl since LogStream version 2.2 and leads the technical alliances program. If you need to find him, just look for him hiking in Rock Creek Park with his family and husky or in the Cribl Slack Community.