x-dream media

x-dream AI Hub

Audio / Video Analysis, Search & Compositing AI

x-dream AI Hub is an AI software stack designed to analyse audio and video content at scale. It aggregates results into rich semantic content descriptions, enables powerful search across your entire media library, and creates new content from existing assets.

AI has been one of x-dream-Fabrik's core processing engines since 2024. x-dream AI Hub is the next step: our own dedicated AI development, designed to run as a standalone hub that any platform — ours or a partner's — can call over a clean API.

With a built-in API, x-dream AI Hub integrates seamlessly as part of the x-dream-Fabrik platform, as a standalone component, or as a module embedded in any third-party software. Software development partners can leverage its comprehensive API to power their own intelligent media applications.

Every analysis run processes your media through multiple AI engines in a single pipeline, from speech-to-text, speaker diarization, and speaker identification to face identification and contextual scene analysis, delivering a rich, multi-dimensional metadata layer over your entire content archive.

About

Use Cases

News Production

Journalists describe the story they want to tell in natural language. x-dream AI Hub identifies matching archive shots, generates a text and editing script with beats, and finally produces an EDL, including voiceover, for direct rendering or craft editing, all in minutes.
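As an illustration of the last step, the following is a minimal sketch of how a list of selected shots could be written out as a CMX3600-style EDL. The `Shot` fields, the clip names, and the 25 fps non-drop frame rate are assumptions made for this example; the page does not state which EDL dialect or frame rates x-dream AI Hub produces.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    reel: str     # hypothetical source clip / reel name
    src_in: int   # source in-point, in frames
    src_out: int  # source out-point (exclusive), in frames


def frames_to_timecode(frames: int, fps: int = 25) -> str:
    """Convert a frame count to non-drop-frame HH:MM:SS:FF timecode."""
    seconds, ff = divmod(frames, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def build_edl(title: str, shots: list[Shot], fps: int = 25) -> str:
    """Lay the shots back to back on the record timeline as video cuts."""
    lines = [f"TITLE: {title}", "FCM: NON-DROP FRAME", ""]
    record = 0  # running record-side position, in frames
    for number, shot in enumerate(shots, start=1):
        duration = shot.src_out - shot.src_in
        lines.append(
            f"{number:03d}  {shot.reel:<8} V     C        "
            f"{frames_to_timecode(shot.src_in, fps)} "
            f"{frames_to_timecode(shot.src_out, fps)} "
            f"{frames_to_timecode(record, fps)} "
            f"{frames_to_timecode(record + duration, fps)}"
        )
        record += duration
    return "\n".join(lines) + "\n"


edl = build_edl(
    "STORY_DRAFT",
    [Shot("CLIP001", 250, 375), Shot("CLIP002", 1000, 1050)],
)
print(edl)
```

Each event line carries the source in/out followed by the record in/out, so an editing system can conform the cut from the original media.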

Archive Enrichment

Retroactively enrich legacy archives with semantic metadata. Every clip receives shot-level descriptions, transcripts, face IDs, and tags — making decades of footage instantly searchable.

MAM / PAM Integration

Expose the x-dream AI Hub API to your existing MAM or PAM system. Automatic metadata enrichment on ingest improves asset discoverability and reduces manual logging time by over 80%.
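To give a feel for what an ingest hook in a MAM or PAM system might look like, here is a minimal sketch that builds (but does not send) an HTTP/JSON analysis request. The base URL, the `/analyses` path, the payload field names, and the bearer-style header are all hypothetical; the actual x-dream AI Hub API is not documented on this page.

```python
import json
import urllib.request

# Hypothetical values: the real base URL, path, and field names are not given here.
BASE_URL = "https://aihub.example.com/api"
API_KEY = "replace-me"


def build_analysis_request(
    asset_id: str, media_url: str, engines: list[str]
) -> urllib.request.Request:
    """Build a POST request asking the hub to analyse one newly ingested asset."""
    payload = {"assetId": asset_id, "mediaUrl": media_url, "engines": engines}
    return urllib.request.Request(
        url=f"{BASE_URL}/analyses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_analysis_request(
    "asset-0001",
    "https://storage.example.com/ingest/asset-0001.mxf",
    ["transcription", "face-recognition"],
)
print(request.get_method(), request.full_url)
```

In a real integration the request would be sent with `urllib.request.urlopen(request)` from the ingest workflow, and the returned metadata written back to the asset record.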

Sports & Events

Detect player faces, objects, and action moments across live recordings. Generate highlight clips and social media edits from event footage in near real-time using semantic search and story creation.

Multilingual Publishing

Language detection, transcription, and translation in a single pipeline. Produce localised captions, subtitles, and metadata in multiple languages from a single analysis run.

Compliance & Research

OCR extracts on-screen text; face and voice recognition cross-reference known persons. Used by broadcasters and research institutions for content compliance monitoring and corpus analysis.

Target Customers

Broadcasters

  • National TV
  • Regional TV
  • News channels
  • Special interest channels
  • Event channels (e.g. sports, news, entertainment)

Content Owners

  • Archive Ingest
  • Asset Aggregation
  • B2B content delivery

Post-Production Facilities

  • On-premises Editing
  • Distributed Production
  • Content Delivery

Media Groups

  • Archive Ingest
  • Event channels (e.g. sports, music, society)
  • Special interest channels

Localisation Agencies

  • Translation
  • Subtitling

Corporate & Public

  • Content Production
  • Cross Media Publishing
  • Archive Ingest
  • Business TV

Deployment

x-dream AI Hub is infrastructure-agnostic. Choose the deployment model that fits your security, latency, and cost requirements.

On-Premises

Full control and data sovereignty. Deploy x-dream AI Hub on your own hardware for maximum security and lowest latency on local storage.

Datacentre

Co-locate with your existing media infrastructure in a managed datacentre. Combine performance with reduced operational overhead.

Cloud

Scale elastically on AWS, Azure, or GCP. Pay-per-use GPU resources for burst workloads without capital expenditure on hardware.

Hybrid

Keep sensitive content on-premises while routing non-confidential workloads to cloud AI services via the built-in Cloud Connector.
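The routing rule behind a hybrid setup can be sketched in a few lines. This is an illustrative model only: the `Job` fields, the `confidential` flag, and the set of cloud-capable engines are assumptions, not the Cloud Connector's actual configuration.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    ON_PREM = "on-prem"
    CLOUD = "cloud"


@dataclass(frozen=True)
class Job:
    asset_id: str
    engine: str
    confidential: bool  # hypothetical flag set by the content owner


# Hypothetical: engines for which a cloud service is configured.
CLOUD_CAPABLE_ENGINES = {"transcription", "translation"}


def route(job: Job) -> Target:
    """Confidential content never leaves the site; the rest may use the cloud."""
    if job.confidential:
        return Target.ON_PREM
    if job.engine in CLOUD_CAPABLE_ENGINES:
        return Target.CLOUD
    return Target.ON_PREM


jobs = [
    Job("asset-1", "transcription", confidential=True),
    Job("asset-2", "transcription", confidential=False),
    Job("asset-3", "face-recognition", confidential=False),
]
for job in jobs:
    print(job.asset_id, route(job).value)
```

Checking confidentiality first means a misconfigured engine list can never cause sensitive material to be sent off-site.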

Features

Core Capabilities

  • Audio & Video analysis pipeline
  • Shot-level scene detection & timeline
  • Face & voice identification across assets
  • OCR & overlay text extraction
  • LLM-powered contextual enrichment
  • Semantic content description aggregation
  • Natural language semantic search
  • AI-assisted story & content creation
  • RESTful API for third-party integration
  • Flexible deployment (cloud, on-prem, hybrid)

Metadata ↔ Semantic Synergy

In x-dream-Fabrik, manually entered metadata and AI-derived semantic data work both ways: an existing title or description can be fed to x-dream AI Hub as a hint, guiding the analysis and reducing false detections. In return, the AI-generated shot summary can become the synopsis in your standard metadata set. Both layers stay fully optional and independently usable.
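The two directions of this exchange can be sketched as follows. The dictionary keys (`title`, `description`, `synopsis`) and both function names are hypothetical stand-ins for whatever fields a given metadata set uses.

```python
def build_hints(metadata: dict) -> list[str]:
    """Collect non-empty manual fields to pass to the analysis as hints."""
    hints = []
    for key in ("title", "description"):
        value = (metadata.get(key) or "").strip()
        if value:
            hints.append(value)
    return hints


def backfill_synopsis(metadata: dict, shot_summaries: list[str]) -> dict:
    """Return a copy with the synopsis filled from AI summaries, if it was empty.

    A synopsis entered by hand is never overwritten.
    """
    updated = dict(metadata)
    if not (updated.get("synopsis") or "").strip() and shot_summaries:
        updated["synopsis"] = " ".join(shot_summaries)
    return updated


record = {"title": "Harbour festival opening", "description": "", "synopsis": None}
print(build_hints(record))
print(backfill_synopsis(record, ["Crowd gathers at the quay.", "Mayor cuts the ribbon."]))
```

Because both functions only read or copy the record, either layer can be used without the other, in line with the "fully optional" design described above.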

AI Engines

Language Detection

Automatically identifies the spoken or written language present in audio and video content, enabling correct routing to transcription and translation engines.

Transcription

Converts spoken audio to accurate timestamped text transcripts. Works across languages and supports broadcast-quality audio with background noise.
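Timestamped transcripts map directly onto subtitle formats. As a minimal sketch, assuming each transcript segment is available as a `(start_seconds, end_seconds, text)` tuple (a shape chosen for this example, not the hub's documented output), SRT cues could be produced like this:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    millis = round(seconds * 1000)
    hh, rest = divmod(millis, 3_600_000)
    mm, rest = divmod(rest, 60_000)
    ss, ms = divmod(rest, 1000)
    return f"{hh:02d}:{mm:02d}:{ss:02d},{ms:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)


segments = [
    (0.0, 2.4, "Good evening and welcome."),
    (2.4, 5.0, "Here are tonight's headlines."),
]
print(to_srt(segments))
```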

Translation

Translates transcribed text into target languages, enabling cross-language search and multilingual metadata generation from a single analysis run.

Diarization

Identifies and separates different speakers within an audio track, assigning speaker labels to transcript segments with timestamps for accurate attribution.
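One common way to attach speaker labels to transcript segments is to pick, for each segment, the speaker turn with the greatest time overlap. The sketch below shows that generic technique; the tuple layouts and the `UNKNOWN` fallback label are assumptions, and the engine's actual method is not described on this page.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(transcript, turns):
    """Assign each (start, end, text) segment the speaker with most overlap.

    `turns` is a list of (start, end, speaker) tuples from diarization.
    """
    labelled = []
    for start, end, text in transcript:
        best = max(
            turns,
            key=lambda turn: overlap(start, end, turn[0], turn[1]),
            default=None,
        )
        if best is not None and overlap(start, end, best[0], best[1]) > 0:
            speaker = best[2]
        else:
            speaker = "UNKNOWN"  # no diarization turn covers this segment
        labelled.append((start, end, speaker, text))
    return labelled


transcript = [
    (0.0, 2.0, "Good evening."),
    (2.0, 5.0, "Thanks, over to you."),
    (9.0, 10.0, "Back to the studio."),
]
turns = [(0.0, 2.2, "SPEAKER_1"), (2.2, 6.0, "SPEAKER_2")]
for row in label_segments(transcript, turns):
    print(row)
```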

Voice Detection

Detects and matches voiceprints across audio tracks, confirming a speaker's identity independent of face recognition — useful for off-camera narration, archive audio, or phone and radio sources.

Object Detection

Detects and classifies objects, persons, and entities within video frames, enriching each shot with structured visual metadata for downstream search.

Image Captioning

Generates natural language descriptions for individual video frames and shots, creating human-readable summaries of visual content at the scene level.

Image Classification

Categorises frames using a rich taxonomy of scene types, locations, activities, and visual attributes, enabling faceted filtering in search interfaces.

Face Recognition

Identifies and clusters faces across all analysed assets, enabling person-based search and tracking of known individuals throughout a media library.

Contextual Analysis

Uses a Large Language Model to synthesise outputs from all other engines into coherent semantic descriptions, actions, places, and canonical tags per shot.
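Before an LLM can summarise a shot, the outputs of the other engines have to be collected per shot. The following sketch shows one plausible way to do that grouping and to assemble a plain-text prompt. The tuple layouts, the engine names, and the prompt wording are all illustrative assumptions rather than the product's actual pipeline.

```python
from collections import defaultdict


def group_by_shot(shots, detections):
    """Bucket engine detections into the shot whose time range contains them.

    shots:      list of (shot_id, start_seconds, end_seconds)
    detections: list of (engine_name, time_seconds, value)
    """
    grouped = {shot_id: defaultdict(list) for shot_id, _, _ in shots}
    for engine, time, value in detections:
        for shot_id, start, end in shots:
            if start <= time < end:
                grouped[shot_id][engine].append(value)
                break  # a detection belongs to at most one shot
    return grouped


def build_prompt(shot_id, engine_outputs):
    """Turn one shot's grouped outputs into a prompt for a language model."""
    lines = [f"Describe shot '{shot_id}' based on these engine outputs:"]
    for engine in sorted(engine_outputs):
        lines.append(f"- {engine}: {', '.join(engine_outputs[engine])}")
    return "\n".join(lines)


shots = [("shot-1", 0.0, 4.0), ("shot-2", 4.0, 9.0)]
detections = [
    ("face", 1.0, "Anchor A"),
    ("ocr", 4.5, "BREAKING NEWS"),
    ("object", 5.0, "microphone"),
    ("object", 12.0, "car"),  # falls outside every shot and is dropped
]
grouped = group_by_shot(shots, detections)
prompt = build_prompt("shot-2", grouped["shot-2"])
print(prompt)
```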

Specifications

x-dream AI Hub is a containerised microservices stack. Each AI service is independently deployable and scalable. Below are reference hardware configurations and supported capabilities.

Hardware Requirements

Entry level system

  • Intel Xeon CPU or VM with 8 cores, min 2.9 GHz
  • 32 GByte RAM
  • 1 TB NVMe SSD
  • NVIDIA RTX 3090 / A5000 (recommended)
  • 1 Gbps Ethernet
  • Ubuntu 24.04 / Windows Server 2026
  • Docker

Best value system

  • Dual Intel Xeon CPU or VM with 16 cores, min 3.2 GHz
  • 128 GByte ECC RAM
  • 4 TB NVMe SSD RAID
  • NVIDIA A100 / H100 (multi-GPU support)
  • 10 Gbps Ethernet
  • Ubuntu 24.04 / Windows Server 2026
  • Docker

Supported Input Formats

Containers

  • MXF
  • MOV
  • MP4
  • TS
  • AVI
  • MKV

Video Codecs

  • AVC/H.264
  • HEVC/H.265
  • AVC-Intra
  • XAVC
  • XDCAM 
  • MPEG-2
  • DNxHD
  • ProRes

Audio Codecs

  • AAC
  • MP2
  • MP3
  • PCM

Images

  • JPEG
  • PNG
  • TIFF (for frame analysis)

API & Integrations

  • API: RESTful HTTP/JSON
  • Authentication: JWT / API Key
  • Transport: HTTPS, WebSocket (status callbacks)
  • Vector database: Qdrant (embedded or remote)
  • Cloud Connectors: AWS, Azure, GCP AI services
  • MAM Integration: x-dream-Fabrik, third-party via 
  • Export: JSON, XML, EDL, SRT/VTT subtitles
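Semantic search backed by a vector database generally works by comparing an embedding of the query with embeddings of the indexed content. The sketch below illustrates the idea with a brute-force cosine similarity search in plain Python. It is a stand-in for what a vector store such as Qdrant does at scale, not Qdrant client code, and the three-dimensional vectors are toy values invented for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec: list[float], index: dict[str, list[float]], top_k: int = 2):
    """Return the ids of the top_k shots most similar to the query vector."""
    scored = [(cosine(query_vec, vec), shot_id) for shot_id, vec in index.items()]
    scored.sort(reverse=True)
    return [shot_id for _, shot_id in scored[:top_k]]


# Toy embeddings; real ones would come from an embedding model.
index = {
    "shot-a": [1.0, 0.0, 0.0],
    "shot-b": [0.7, 0.7, 0.0],
    "shot-c": [0.0, 0.0, 1.0],
}
print(search([1.0, 0.1, 0.0], index))
```

A dedicated vector database replaces the linear scan with an approximate nearest-neighbour index, which is what keeps search fast across a large archive.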

Supported Languages

  • German
  • English
  • French
  • Spanish
  • Italian
  • Portuguese
  • Dutch
  • Polish
  • Russian
  • Arabic
  • Turkish
  • Chinese (Mandarin)
  • Japanese
  • Korean
  • Swedish
  • Norwegian
  • Danish
  • Finnish
  • Czech
  • Hungarian
  • Romanian

+ 50 further languages via cloud connector

Documentation

All x-dream AI Hub related documentation:

Brochures

  • More to come
Demo Request
Trial Request

Media Suite interview, IBC 2025

x-dream Media Suite xIngest Demo

Questions? Curious? Interested?
