RHEcosystemAppEng/nvidia-video-search-and-summarization

NVIDIA AI Blueprint: Video Search and Summarization

Overview

This repository powers the build experience, showcasing a video search and summarization agent built with NVIDIA NIM microservices.

Insightful, accurate, and interactive video analytics AI agents enable a range of industries to make better decisions faster. These AI agents are given tasks through natural language and can perform complex operations like video summarization and visual question-answering, unlocking entirely new application possibilities. The NVIDIA AI Blueprint makes it easy to get started building and customizing video analytics AI agents for video search and summarization, all powered by generative AI, vision language models (VLMs) like Cosmos Nemotron VLMs, large language models (LLMs) like Llama Nemotron LLMs, NVIDIA NeMo Retriever, and NVIDIA NIM.

Use Case / Problem Description

The NVIDIA AI Blueprint for Video Search and Summarization addresses the challenge of efficiently analyzing and summarizing large volumes of video data. It can be used to create vision AI agents for a multitude of use cases, such as monitoring smart spaces, warehouse automation, and SOP validation, where quick and accurate video analysis leads to better decision-making and enhanced operational efficiency.

Software Components

  1. NIM microservices: The blueprint uses the following models: a VLM (Cosmos Reason2 8B), an LLM (Llama 3.1 70B), an embedding model (llama-3.2-nv-embedqa-1b-v2), and a reranker (llama-3.2-nv-rerankqa-1b-v2).

  2. Ingestion Pipeline:

    The process involves decoding video segments (chunks) generated by the stream handler, selecting frames, and using a vision-language model (VLM) along with a caption prompt to generate detailed captions for each chunk. A computer vision pipeline enhances video analysis by providing detailed metadata on detected objects. In parallel, the audio track is extracted and transcribed. These dense captions, along with the audio transcripts and CV metadata, are then indexed into vector and graph databases for use in the Context-Aware Retrieval-Augmented Generation workflow.
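The chunk-level flow above can be sketched as a simple loop. Everything below (the helper names, the stub VLM and transcription calls) is an illustrative assumption rather than the blueprint's actual API:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """One video segment (chunk) produced by the stream handler."""
    chunk_id: int
    frames: list        # frames sampled from the decoded segment
    audio: bytes = b""  # extracted audio track, if any

def vlm_caption(frames, prompt):
    # Stand-in for a call to the VLM NIM: a real pipeline would send the
    # sampled frames plus the caption prompt and get a dense caption back.
    return f"dense caption for {len(frames)} frames"

def transcribe_audio(audio):
    # Stand-in for the audio transcription step.
    return "transcript" if audio else ""

def ingest(chunks, caption_prompt, vector_index, graph_index):
    """Caption and transcribe each chunk, then index the results."""
    for chunk in chunks:
        doc = {
            "chunk_id": chunk.chunk_id,
            "caption": vlm_caption(chunk.frames, caption_prompt),
            "transcript": transcribe_audio(chunk.audio),
        }
        vector_index.append(doc)  # vector DB for dense retrieval
        graph_index.append(doc)   # graph DB for entity/relation queries
    return vector_index, graph_index
```

In the real pipeline, the CV metadata would also be attached to each indexed document.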

  3. CA-RAG module:

    The Context-Aware Retrieval-Augmented Generation (CA-RAG) module leverages both Vector RAG and Graph-RAG as primary sources for video understanding. It underpins key features such as summarization, Q&A, and alerts. During the Q&A workflow, the CA-RAG module extracts relevant context from the vector and graph databases to enhance temporal reasoning, anomaly detection, multi-hop reasoning, and scalability. This approach offers deeper contextual understanding and efficient management of extensive video data. Additionally, the context manager maintains its working context by drawing on both short-term memory, such as chat history, and long-term memory resources like the vector and graph databases, as needed.
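As a rough illustration of the Q&A flow, a context manager can pull candidate context from both stores and combine it with chat history before invoking the LLM. The class and the toy keyword-overlap retrieval below are illustrative assumptions, not the CA-RAG module's actual interface:

```python
def retrieve(store, question, k=2):
    # Toy relevance: keep documents that share at least one word with the
    # question; a real system would use embeddings or graph queries instead.
    words = set(question.lower().split())
    hits = [d for d in store if words & set(d["text"].lower().split())]
    return hits[:k]

class ContextManager:
    """Combines short-term memory (chat history) with long-term stores."""

    def __init__(self, vector_db, graph_db):
        self.vector_db = vector_db  # long-term memory: dense captions
        self.graph_db = graph_db    # long-term memory: entities/relations
        self.history = []           # short-term memory: (question, answer)

    def build_context(self, question):
        ctx = retrieve(self.vector_db, question) + retrieve(self.graph_db, question)
        return {
            "question": question,
            "context": [d["text"] for d in ctx],
            "history": list(self.history),
        }

    def ask(self, question, llm):
        payload = self.build_context(question)
        answer = llm(payload)  # the LLM NIM would be invoked here
        self.history.append((question, answer))
        return answer
```

The point of the sketch is the memory split: retrieval results come from the long-term stores per question, while the chat history accumulates across turns.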

Target Audience

This blueprint is designed for easy setup, with extensive configuration options for users with deeper technical expertise. It is intended for:

  1. Video Analysts and IT Engineers: Professionals focused on analyzing video data and ensuring efficient processing and summarization. The blueprint offers 1-click deployment steps, easy-to-manage configurations, and plug-and-play models, making it accessible for early developers.

  2. GenAI Developers / Machine Learning Engineers: Experts who need to customize the blueprint for specific use cases. This includes modifying the RAG pipeline for unique datasets and fine-tuning LLMs as needed. For advanced users, the blueprint provides detailed configuration options and custom deployment possibilities, enabling extensive customization and optimization.

Repository Structure Overview

  • deploy/: Scripts for Docker Compose and Helm chart deployment, along with a notebook for Brev Launchable deployment.
  • src/: Source code for the video search and summarization agent.
  • examples/: Training notebooks for using VSS and use-case examples.

Documentation

For detailed instructions and additional information about this blueprint, please refer to the official documentation.

Prerequisites

Obtain API Key

Hardware Requirements

The platform requirement can vary depending on the configuration and deployment topology used for VSS and dependencies like VLM, LLM, etc. For a list of validated GPU topologies and what configuration to use, see the supported platforms.

| Deployment Type | VLM | LLM | Embedding (llama-3.2-nv-embedqa-1b-v2) | Reranker (llama-3.2-nv-rerankqa-1b-v2) | Minimum GPU Requirement |
|---|---|---|---|---|---|
| Local deployment (Default topology) | Local (Cosmos Reason2 8B) | Local (Llama 3.1 70B) | Local | Local | 8xB200, 8xH200, 8xH100, 8xA100 (80GB), 8xL40S, 8xRTX PRO 6000 Blackwell |
| Local deployment (Reduced Compute) | Local (Cosmos Reason2 8B) | Local (Llama 3.1 70B) | Local | Local | 4xB200, 4xH200, 4xH100, 4xA100 (80GB), 6xL40S, 4xRTX PRO 6000 Blackwell |
| Local deployment (Single GPU) | Local (Cosmos Reason2 8B) | Local (Llama 3.1 8B, low-memory mode) | Local | Local | 1xB200, 1xH200, 1xH100, 1xA100 (80GB), 1xRTX PRO 6000 Blackwell, DGX Spark, GH200, GB200 |
| Local VLM deployment | Local (Cosmos Reason2 8B) | Remote | Remote | Remote | 1xB200, 1xH200, 1xH100, 2xA100 (80GB), 1xL40S, 1xRTX PRO 6000 Blackwell, Jetson Thor, DGX Spark, GH200, GB200 |
| Complete remote deployment | Remote | Remote | Remote | Remote | Minimum 8 GB VRAM GPU, Jetson Thor, DGX Spark, GH200, GB200 |

Quickstart Guide

Launchable Deployment

Ideal for: Quickly getting started with your own videos without worrying about hardware and software requirements.

Follow the steps in the documentation and the notebook in the deploy/ directory to complete all prerequisites and deploy the blueprint using a Brev Launchable on an 8xL40S Crusoe instance.

Docker Compose Deployment

Ideal for: Development phase where you need to run VSS locally, test different models, and experiment with various deployment configurations. This method offers greater flexibility for debugging each component.

For custom VSS deployments through Docker Compose, multiple samples are provided to show different combinations of remote and local model deployments. See the README in the /deploy/docker directory for all the details.

System Requirements (x86 systems)

  • Ubuntu 22.04
  • NVIDIA driver 580.65.06 (Recommended minimum version)
  • CUDA 13.0+ (CUDA driver installed with NVIDIA driver)
  • NVIDIA Container Toolkit 1.13.5+
  • Docker 27.5.1+
  • Docker Compose 2.32.4
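Several of these components carry minimum versions. As a quick sanity check, a small helper (hypothetical, not part of the blueprint) can compare dotted version strings against the floors listed above; you would capture the installed versions yourself, e.g. from the output of `docker --version` or `nvidia-smi`:

```python
def version_tuple(v):
    """Parse '580.65.06' into (580, 65, 6), ignoring non-numeric suffixes."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """True if the installed version is at least the required minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

# Minimums copied from the x86 requirements list above.
MINIMUMS = {
    "nvidia-driver": "580.65.06",
    "cuda": "13.0",
    "nvidia-container-toolkit": "1.13.5",
    "docker": "27.5.1",
}
```

For example, `meets_minimum("27.5.1", MINIMUMS["docker"])` returns `True`, while a CUDA 12.x installation would fail the check.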

Please refer to the Prerequisites section of the documentation for more information.

System Requirements (NVIDIA Jetson Thor)

Please refer to NVIDIA Jetson Thor Setup Instructions.

System Requirements (NVIDIA DGX Spark)

Please refer to NVIDIA DGX Spark Setup Instructions.

Helm Chart Deployment

Ideal for: Production deployments that need to integrate with other systems. Helm offers advantages such as easy upgrades, rollbacks, and management of complex deployments.

The /deploy/helm/ directory contains the nvidia-blueprint-vss-2.4.1.tgz file, which can be used to spin up VSS. Refer to the documentation for detailed instructions.

System Requirements

  • Ubuntu 22.04
  • NVIDIA driver 580.65.06 (Recommended minimum version)
  • CUDA 13.0+ (CUDA driver installed with NVIDIA driver)
  • Kubernetes v1.31.2
  • NVIDIA GPU Operator v23.9 (Recommended minimum version)
  • Helm v3.x

NOTE: Helm deployments are supported only for x86 platforms.

OpenShift Deployment

Ideal for: Production deployments on Red Hat OpenShift and OpenShift AI (RHOAI) clusters.

Deploying VSS on OpenShift requires additional configuration to handle security context constraints, storage permissions, and GPU scheduling. A deployment script and OpenShift-specific Helm value overrides are provided.

Known CVEs

VSS Engine 2.4.1 Container has the following known CVEs:

| CVE | Description |
|---|---|
| GHSA-58pv-8j8x-9vj2 | Impacts the jaraco.context < 6.1.0 Python package. Does not affect VSS, since VSS does not install user-provided Python packages. |
| CVE-2025-69223 | Impacts the aiohttp < 3.13.3 Python package. Does not affect VSS, since the package is included only as a private package inside ray, and ray is not used by VSS. |
| GHSA-f83h-ghpp-7wcc | Impacts the pdfminer.six < 20251230 Python package. Does not affect VSS, since VSS does not implement PDF parsing. |
| CVE-2025-68973 | Impacts gnupg < 2.4.8. Does not affect VSS, since VSS does not implement GPG encryption. |
| GHSA-mcmc-2m55-j8jj, GHSA-mrw7-hf4f-83pf, CVE-2025-62372 | Impact the vLLM < 0.11.1 Python package. Do not affect VSS, since VSS does not support user-provided embeddings. |
| CVE-2026-21441 | Impacts the urllib3 < 2.6.3 Python package. Does not affect VSS, since VSS does not access user-provided URLs at runtime. |
| CVE-2025-3887 | Impacts the GStreamer H.265 codec parser. Maliciously malformed streams can cause a stack overflow in the parser, crashing the application. Take care that malicious H.265 streams are not added to VSS. This can be remedied by building and installing the GStreamer 1.24.2 codec parser library after applying the patch described at https://gstreamer.freedesktop.org/security/sa-2025-0001.html. |
| GHSA-rcfx-77hg-w2wv | Impacts the fastmcp < 2.14.0 Python package. Does not affect VSS, since VSS already uses an updated version of the MCP SDK. |

License

Refer to the LICENSE file.

About

Deploying NVIDIA's Blueprint for ingesting massive volumes of live or archived videos and extracting insights for summarization and interactive Q&A on OpenShift.
