AWS for M&E Blog
Streamlining content compliance: Automating media analysis with HAQM Nova
Transform content compliance from a bottleneck to a competitive advantage. Learn how media organizations can reduce content review time using HAQM Bedrock, and HAQM Nova foundation models. Automate the analysis of content across multiple rating systems while maintaining accuracy and reducing costs.
Introduction
Every minute counts in today’s competitive streaming landscape. As content libraries expand and international markets multiply, the challenge of ensuring content compliance has grown. Content teams face:
- Escalating volumes of content
- Increasingly complex and disparate rating systems across regions (such as, GARM, MPAA, TVPG, OfCom , BBFC)
- Pressing need to maintain accuracy while accelerating time-to-market
Compliance officers spend countless hours reviewing content frame-by-frame, juggling multiple rating systems, and documenting findings—all while racing against release deadlines. This approach isn’t just time-consuming; it’s expensive, prone to inconsistency, and increasingly unsustainable as content volumes grow.
The recent class of generative AI models includes HAQM Nova, a new generation of state-of-the-art foundation models available exclusively in HAQM Bedrock. HAQM Nova offers understanding models that accept text, images, and video inputs and generate text outputs.
These powerful models can be utilized to revolutionize content compliance workflows by bringing generative AI to content moderation at scale. Using these models, we can:
- Create an intelligent content moderation system that can analyze content
- Automatically flag potential issues with the associated timestamps
- Adapt to multiple rating systems simultaneously
Compliance officers can then focus their expertise on edge cases and nuanced decisions that truly require human judgment.
Through the lens of analyzing music videos, we’ll explore how to build this automated content compliance system, examining the technical architecture, implementation steps, and real-world benefits. This solution offers a blueprint for scaling content compliance, while reducing cost, without sacrificing accuracy or speed.
Understanding the challenge
Content compliance in the media industry has reached a critical inflection point. Some customers manage libraries of hundreds of thousands of hours of content across multiple territories. This scale makes traditional frame-by-frame manual review increasingly impractical.
Adding to this complexity is the proliferation of rating systems and content standards worldwide. Content providers must simultaneously navigate multiple frameworks—from GARM’s brand safety guidelines to MPAA’s content ratings, and from OfCom’s broadcasting standards to BBFC’s classification requirements. Each system has its own nuanced criteria, cultural considerations, and compliance requirements, creating a multi-dimensional challenge for content teams.
Human moderators, despite their expertise, face inherent limitations in this environment. Consistency becomes difficult to maintain across long review sessions, and fatigue can impact judgment, particularly when dealing with sensitive or challenging content. The psychological toll of continuous exposure to potentially disturbing content also raises important concerns about reviewer wellbeing.
The financial implications of maintaining purely manual review processes are equally significant. Beyond the direct labor costs, the time required for comprehensive manual review can delay content releases, impacting revenue opportunities and market competitiveness. For organizations expanding into new territories, scaling up manual review teams for each market becomes prohibitively expensive.
These challenges create a clear imperative for transformation. Media organizations need a solution that can:
- Process high volumes of content efficiently and consistently
- Adapt to multiple rating systems simultaneously
- Support human reviewers rather than replace them
- Reduce operational costs while maintaining or improving accuracy
This is where the combination of HAQM Nova foundation models and HAQM Bedrock offers a compelling path forward, enabling a hybrid approach that amplifies human expertise with generative AI capabilities.
Solution overview
This solution contains the following key components:
- HAQM Bedrock provides access to leading foundation models through a single API, enabling flexible development of generative AI applications.
- HAQM Nova Lite and HAQM Nova Pro are multimodal understanding models. HAQM Nova Lite and HAQM Nova Pro support a 300k context window and can analyze videos, documents, and images, making it ideal for comprehensive content analysis. HAQM Nova models understand the publicly available media taxonomies that are used to generate compliance analysis.
Traditional computer vision models, such as HAQM Rekognition, are used for image classification, label detection, object segmentation, text detection, face recognition, and content moderation. While computer vision models excel at certain tasks, we will instead use foundation models (FMs) for deep image and video understanding. FMs are capable of handling a variety of tasks in a single shot, and allow us to craft multi-use-case prompts that are capable of understanding media compliance taxonomies.
In Step 1, we pass in the music video (video), accompanying lyrics (text document), and a carefully crafted prompt with the necessary ratings and compliance taxonomies (text). HAQM Nova processes the request with low latency, and returns a compliance analysis with confidence scores. We convert the compliance analysis to a standard report, and use it to determine if we need a deeper frame-by-frame analysis (Step 2).
Step 1: Video level analysis using HAQM Nova
In order to achieve higher accuracy for video understanding, both video and audio should be used as context to the model. HAQM Nova multimodal understanding models currently require the audio transcript to be passed in separately from the video. In cases where the audio transcript is not available in text format, HAQM Transcribe may be used to provide the transcription. HAQM Transcribe is a fully managed automatic speech recognition service that provides high accuracy transcriptions.
In the following example prompt, we use the HAQM Bedrock Converse API to pass the video and transcript to HAQM Nova. This is to assess regional compliance against the MPAA Rating Assessment, TVPG Rating Assessment, BBFC Rating Assessment, OfCom Broadcasting code, and GARM Brand Safety Analysis. We will also be requesting a general content moderation analysis and executive summary:
Analyze the attached video and lyrics. Provide a detailed analysis including:
Content Rating Analysis:
1. MPAA Rating Assessment
- Suggested rating (G, PG, PG-13, R, NC-17)
- Confidence score (0-100%)
- Justification for rating
- Content elements influencing rating
2. TV Parental Guidelines (TVG) Rating Assessment
- Suggested rating (TV-Y, TV-Y7, TV-G, TV-PG, TV-14, TV-MA)
- Content descriptors (D, L, S, V, FV)
- Confidence score (0-100%)
- Justification for rating
- Content elements influencing rating
3. BBFC Rating Assessment
- Suggested rating (U, PG, 12A, 12, 15, 18, R18)
- Confidence score (0-100%)
- Key classification issues
- Content advisories
4. OfCom Broadcasting Code
- Watershed assessment (Pre/Post 9pm)
- Content warnings required
- Confidence score (0-100%)
- Compliance with Section 1 (Protection of Under 18s)
- Compliance with Section 2 (Harm and Offense)
5. GARM Brand Safety Analysis
- Brand Safety Floor Categories assessment
- Brand Suitability Categories (Low, Medium, High Risk)
- Confidence score (0-100%)
- Content labels and warnings
6. Content Moderation Flags
- Violence
- Adult Content
- Hate Speech
- Self-Harm
- Profanity
- Drug Reference
- Alcohol Reference
(Include confidence score for each flag)
Content Analysis:
7. Theme and Messaging
8. Visual Content Assessment
9. Lyrical Content Analysis
10. Target Audience
11. Potentially Sensitive Content Areas
Format the ratings and flags with confidence scores in the response.
Provide a brief executive summary (2-3 sentences) of the overall content analysis and rating recommendation.
An example response can be seen in the following. The response shows that HAQM Nova has performed the assessment and flagged that there are compliance considerations. It has also provided content moderation flags with confidence scores and an executive summary.
Content Rating Analysis:
1. MPAA Rating Assessment:
Suggested rating: PG-13
Confidence score: 85%
Justification for rating: The video contains profanity, suggestive themes, references to alcohol and drug use, and some suggestive content. These elements are consistent with an MPAA PG-13 rating. The lyrical content includes themes of crime, loss, and struggle, which are mature in nature.
Content elements influencing rating:
- Profanity (Confidence score: 90%)
- References to alcohol and drug use (Confidence score: 75%)
2. TV Parental Guidelines (TVG) Rating Assessment:
Suggested rating: TV-PG
Content descriptors: L (Language), D (Drug Reference), S (Suggestive Themes)
Confidence score: 80%
Justification for rating: The video includes strong language, and references to alcohol and drug use, which are appropriate for a TV-PG rating.
Content elements influencing rating:
- Language (Confidence score: 85%)
- Drug Reference (Confidence score: 70%)
- Suggestive Dialogue (Confidence score: 80%)
3. BBFC Rating Assessment:
Suggested rating: 12
Confidence score: 80%
Key classification issues: Strong language, references to drug and alcohol use, and suggestive themes.
Content advisories: Strong language, and drug reference.
4. OfCom Broadcasting Code:
Watershed assessment: Post 9pm
Content warnings required: Yes, for strong language, and drug reference.
Confidence score: 80%
Compliance with Section 1 (Protection of Under 18s): Requires appropriate warnings and restrictions.
Compliance with Section 2 (Harm and Offense): Likely to cause offense due to strong language.
5. GARM Brand Safety Analysis:
Brand Safety Floor Categories assessment: Medium Risk
Brand Suitability Categories: Medium Risk
Confidence score: 75%
Content labels and warnings: Explicit language, depictions of violence, drug and alcohol references.
6. Content Moderation Flags:
- Violence: 10%
Depictions of violence in the video and lyrics.
- Adult Content: 60%
References to sexual content and mature themes.
- Hate Speech: 10%
No direct hate speech, but some lyrics may be interpreted as aggressive.
- Self-Harm: 5%
Indirect references to personal struggles and loss.
- Profanity: 80%
Frequent use of explicit language in the lyrics.
- Drug Reference: 75%
References to drug use and dealing in the lyrics.
- Alcohol Reference: 80%
References to partying and nightlife, including drinking alcohol.
Content Analysis:
7. Theme and Messaging: The video revolves around themes of betrayal, blame, and the consequences of reckless behavior. The lyrics suggest a narrative of mutual involvement in wrongdoing and the fallout from it.
8. Visual Content Assessment: The visuals are dark and moody, with flashing lights and silhouettes that enhance the dramatic and intense atmosphere of the content.
9. Lyrical Content Analysis: The lyrics contain strong language, references to drug use, and suggest a narrative of mutual blame and involvement in wrongdoing.
10. Target Audience: The target audience is likely adults, given the mature themes and language. The content may not be suitable for younger viewers.
11. Potentially Sensitive Content Areas:
- References to drug use and alcohol consumption.
- Strong language.
- Suggestive themes.
Executive Summary:
The video contains strong language, references to drug and alcohol use, and suggestive themes, making it suitable for a PG-13 (MPAA), TV-PG (TVG), and 12 (BBFC) rating. It carries a medium risk for brand safety and requires appropriate warnings and restrictions for broadcast. The content is likely to cause offense and is best suited for an adult audience.
The results of the analysis can be output in JSON format, which can then be parsed to create a compliance report for use within your organization.
Following is an example compliance report created by parsing the JSON file:
If any of the flags exceed your thresholds, the detailed frame analysis can be triggered as a secondary step.
Step 2: Frame level analysis using HAQM Nova
This stage is only invoked if needed, to further explore any compliance issues discovered in the previous stage. With each frame analyzed, we can generate a timeline analysis visualization, which shows a time series and detailed analysis of specific compliance event.
Frame-by-frame analysis follows two steps:
-
- Extract frames from the video at a given frequency. For example, by using AWS Elemental MediaConvert to extract one frame per second.
- For each frame extracted:
- Skip the frame if it is too similar to the previous frame. One way to assess this would be using perceptual hashing between frames—reducing costs.
- Analyze frames for moderation flags.
Use the following prompt, along with passing the frame image to HAQM Nova:
You are an expert content moderator. Analyze the image for:
- Violence or graphic content
- Adult or inappropriate content
- Hate or offensive symbols
- Self-harm indicators
- Drug or alcohol references
Provide detailed analysis with confidence scores.
The timestamp of any moderation issues can be calculated by knowing the rate of extraction of frames and how many frames into the sequence of processing we are. Flags and timestamps can be collated to generate a timeline analysis visualization, which shows a time series of specific compliance events.
Following (Figure 5) is an example compliance report created by parsing the timeline analysis JSON file.
Benefits and results
Music videos pose unique challenges for content review due to their fast-paced nature, multiple themes, and complex visual elements. Music videos are fast-paced with frequent shot changes, they may present multiple themes in a short period of time, and are visually powerful. These elements make human review of content more error prone. We tested the visual understanding capabilities of HAQM Nova Lite and HAQM Nova Pro with music videos across all MPAA categories, genres, and artistic styles. There are many variables that contributed to our output, and this analysis is meant to be directional and not definitive. We encourage you to run your own tests on your specific content. The following section describes observations from our tests.
Key Observations:
-
-
- Both HAQM Nova models demonstrated caution, rating categories conservatively
- HAQM Nova Pro showed higher accuracy in detecting nuanced content elements
- HAQM Nova Lite proved sufficient for standard cases with greater than 83 percent accuracy
- HAQM Nova Lite was sufficient in many test cases, though there were cases where HAQM Nova Pro provided more detailed descriptions of scenes (without being prompted to describe these aspects). Consider using HAQM Nova Pro when more detail is required for:
- Facial expressions
- Advanced plot or theme analysis
- Spatial understanding
- Specific location of objects
- A human reviewer neglected to identify elements such as:
- Fast flashing lights
- Certain self-harm themes
- Partial nudity in the background due to strong blurring in out-of-focus areas
- Suggestive content displayed on body tattoos
-
Best practices and considerations
When implementing content compliance automation with HAQM Nova models, organizations should consider both cost and performance optimizations. Our testing revealed that analyzing a typical four-minute music video using HAQM Nova Lite consumed 76,497 input tokens and 496 output tokens, resulting in a cost of $0.0047 based on current pricing. This three-stage approach (content preparation, video analysis, and frame analysis) enables independent scaling and targeted problem-solving without disrupting the entire workflow.
For maximum effectiveness, prompts should be structured hierarchically with clear categories and specific examples, with media content presented before analysis instructions. For more prompting best practices, refer to the Prompting best practices for HAQM Nova understanding models.
Organizations should leverage HAQM Bedrock Prompt Management for version control and updates. Implementing human-in-the-loop workflows for content within specific confidence thresholds is recommended. You should also include feedback mechanisms that allow moderators to correct AI decisions and build valuable datasets for future improvements. Regular performance reviews and automated metrics should inform prompt refinements, enabling fast turnaround updates independent from the application.
By following these best practices, organizations can optimize their content compliance automation, ensuring efficient, accurate, and cost-effective operations while maintaining the flexibility to adapt to changing requirements.
Conclusion
Leveraging HAQM Nova foundation models and HAQM Bedrock for content compliance workflows represents a paradigm shift for media organizations. This generative AI-powered approach not only addresses the immediate challenges of scale, complexity, and cost but also positions content teams to thrive in an increasingly global and content-rich landscape.
Organizations can expect faster content review, consistent ratings, and significant cost savings. Begin by assessing your current workflow and implementing a proof of concept before scaling to full deployment.
Ready to transform your content compliance workflow? Contact an AWS Representative to know how we can help accelerate your business.
Visit the following links to learn more about additional media and entertainment industry use cases: