Universal Video Scraper & AI Analysis Platform

https://github.com/sreejagatab/videoprocessing

A comprehensive, production-ready web application that automatically scrapes, downloads, and processes videos from any target URL, then runs AI-powered analysis on them.

🚀 Features

  • Universal Video Support: Works with YouTube, Vimeo, Facebook, Twitter, TikTok, direct URLs, and 50+ other platforms
  • AI-Powered Analysis: Transcription, summarization, sentiment analysis, topic modeling, and more
  • Real-time Processing: Live progress updates with WebSocket notifications
  • Scalable Architecture: Handles viral content and traffic spikes
  • Enterprise Ready: Production-grade security, monitoring, and reliability

🏗️ Architecture

Backend Stack

  • FastAPI with SQLAlchemy and Alembic
  • Celery + Redis for async video processing
  • PostgreSQL for data storage
  • MinIO/S3 for object storage
  • FFmpeg for video processing
  • yt-dlp for universal video downloading
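
To illustrate how the FFmpeg step might feed the transcription stage, here is a minimal Python sketch. It assumes the pipeline extracts a mono 16 kHz WAV track before transcription, which is a common input format for speech models but is not confirmed by this README. The helper names `build_audio_extract_cmd` and `extract_audio` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_audio_extract_cmd(src: Path, dst: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argument list that extracts mono PCM audio from a video.

    Hypothetical helper: the project's real processing steps are not shown here.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # downmix to one audio channel
        "-ar", str(sample_rate),  # resample to the requested rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM, as used in WAV
        str(dst),
    ]


def extract_audio(src: Path, dst: Path) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with a non-zero code."""
    subprocess.run(build_audio_extract_cmd(src, dst), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names, which matters when the names come from scraped content.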

AI/ML Stack

  • OpenAI GPT for content analysis
  • Whisper/AssemblyAI for transcription
  • Transformers for NLP processing
  • OpenCV for computer vision

Frontend Stack

  • Next.js 14 with TypeScript
  • Tailwind CSS for styling
  • React Player for video playback
  • SWR for data fetching
  • Zustand for state management

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • Node.js 18+ (for frontend development)
  • Python 3.11+ (for backend development)
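
Before running the setup steps, it can help to confirm the prerequisites are present. The following is a small sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only checks that `docker` and `node` are on the PATH rather than verifying their versions.

```python
import shutil
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 11)) -> dict[str, bool]:
    """Report which prerequisites look available on this machine.

    Only the Python version is compared against a minimum; docker and node
    are checked for presence on the PATH, not for a specific version.
    """
    return {
        "python": sys.version_info >= min_python,
        "docker": shutil.which("docker") is not None,
        "node": shutil.which("node") is not None,
    }
```

Calling `check_prerequisites()` returns a mapping from tool name to a boolean, so any entry that is `False` points at what still needs installing.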

Development Setup

  1. Clone the repository
git clone <repository-url>
cd videoprocessing
  2. Start the development environment
docker-compose up -d
  3. Install dependencies
# Backend
cd backend
pip install -r requirements.txt

# Frontend
cd ../frontend
npm install
  4. Run database migrations
cd ../backend
alembic upgrade head
  5. Start the development servers
# Backend (in one terminal, from the repository root)
cd backend
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

# Frontend (in another terminal, from the repository root)
cd frontend
npm run dev

📁 Project Structure

videoprocessing/
├── backend/                 # FastAPI backend
│   ├── app/
│   │   ├── api/            # API routes
│   │   ├── core/           # Core configuration
│   │   ├── models/         # Database models
│   │   ├── services/       # Business logic
│   │   ├── tasks/          # Celery tasks
│   │   └── utils/          # Utilities
│   ├── alembic/            # Database migrations
│   ├── tests/              # Backend tests
│   └── requirements.txt
├── frontend/               # Next.js frontend
│   ├── src/
│   │   ├── components/     # React components
│   │   ├── pages/          # Next.js pages
│   │   ├── hooks/          # Custom hooks
│   │   ├── utils/          # Utilities
│   │   └── types/          # TypeScript types
│   ├── public/             # Static assets
│   └── package.json
├── docker/                 # Docker configurations
├── k8s/                    # Kubernetes manifests
├── terraform/              # Infrastructure as Code
├── docs/                   # Documentation
└── docker-compose.yml      # Development environment

🔧 Configuration

Environment Variables

Create a .env file in the backend/ directory and a .env.local file in the frontend/ directory:

Backend (.env)

DATABASE_URL=postgresql://user:password@localhost:5432/videoprocessing
REDIS_URL=redis://localhost:6379/0
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
OPENAI_API_KEY=your-openai-key
ASSEMBLYAI_API_KEY=your-assemblyai-key
SECRET_KEY=your-secret-key

Frontend (.env.local)

NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_WS_URL=ws://localhost:8000

🧪 Testing

# Backend tests
cd backend
pytest

# Frontend tests
cd frontend
npm test

# E2E tests
npm run test:e2e

📊 Monitoring

  • Health Check: GET /health
  • Metrics: GET /metrics
  • API Documentation: http://localhost:8000/docs
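
The health endpoint can be polled from a script, for example as a readiness check after starting the stack. This is a minimal sketch using only the standard library. The function name `is_healthy` is hypothetical, and it assumes that a healthy backend answers GET /health with HTTP 200; the response body format is not documented here, so it is ignored.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, DNS failures, and
        # 4xx/5xx responses (urlopen raises HTTPError for those).
        return False
```

With the development server from the Quick Start running, `is_healthy("http://localhost:8000")` would be expected to return `True`.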

🚀 Deployment

Production Deployment

  1. Build Docker images
docker build -t videoprocessing-backend ./backend
docker build -t videoprocessing-frontend ./frontend
  2. Deploy with Kubernetes
kubectl apply -f k8s/
  3. Or deploy with Docker Compose
docker-compose -f docker-compose.prod.yml up -d

📝 API Documentation

Interactive API documentation is available at:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
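
FastAPI also serves the raw OpenAPI schema, by default at /openapi.json, which makes it possible to list the available routes from a script. The sketch below assumes the project keeps that default path; the function names `fetch_spec` and `list_routes` are hypothetical.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def fetch_spec(base_url: str, timeout: float = 5.0) -> dict:
    """Download and parse the OpenAPI schema (assumes the default path)."""
    url = base_url.rstrip("/") + "/openapi.json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def list_routes(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    routes = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for key in sorted(item):
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                routes.append((key.upper(), path))
    return routes
```

For example, `list_routes(fetch_spec("http://localhost:8000"))` would return the method and path of every documented endpoint on a running development server.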

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

For support, please open an issue on GitHub or contact the development team.

Sree Jagatab

AI Automation Expert and Python Developer based in Wisbech, Cambridgeshire. With over 96 production-ready solutions deployed, I help businesses transform through innovative technology implementations. Specializing in AI integration, business process automation, and custom software development.