BabylPhysh Terminal - Professional Manual v0.8a3-Enterprise

BABYLPHYSH TERMINAL

v0.8a3-Enterprise

Professional Subtitle Production Management System

Copyright © 2025 Douglas Beechwood

System Overview

Current Release: v0.8a3-Enterprise
Status: Production Ready
Release Date: September 23, 2025
Architecture: Modular Python Package

BabylPhysh Terminal is a comprehensive subtitle production management system designed for professional documentary film workflows. It provides quality control, multi-language translation, batch processing, and intelligent file management capabilities.

Key Capabilities

  • Complete subtitle quality control and validation
  • Multi-language translation with 108+ language support
  • Intelligent language detection and file naming
  • Professional HTML/CSV reporting system
  • Frame rate detection and alignment
  • Batch processing and pipeline automation
  • Production analytics and insights

Terminal Interface

Interactive Command-Line Interface: BabylPhysh presents a simple menu-driven terminal interface for all operations.

Main Menu Display

╔══════════════════════════════════════════════════════════════════╗
║ 🎬 BABYLPHYSH TERMINAL v0.8a3-Enterprise ║
║ Professional Subtitle Production System ║
╚══════════════════════════════════════════════════════════════════╝
CORE PROCESSING MODES:
----------------------------------------------------------------------
1) QC-A      - Quality Check Analysis (No Changes)
2) QC-FX     - Quality Check & Fix
3) QC-DIFF   - Compare Original vs Processed
4) TRANS     - Translation
5) BT-QC     - Back-Translation Quality Check
6) CONFORM   - Language Conforming
7) SYNC      - Transcript Sync
8) SRT-NAME  - Batch Append Language Names
ENHANCED WORKFLOW MODES:
----------------------------------------------------------------------
9) BATCH-QC  - Batch Quality Control
P) PIPELINE  - Complete Production Workflow
A) ANALYSIS  - Enhanced Analysis Dashboard
----------------------------------------------------------------------
0) Exit
======================================================================
Enter number/letter or mode name (e.g., '1', 'P', or 'qc-a')
🎬 Select mode: _
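
The prompt accepts a number, a letter, or a mode name. As a rough sketch of how such input could be normalized (the `MODES` table and `resolve_mode` are hypothetical names for illustration, not taken from the BabylPhysh source):

```python
# Hypothetical lookup table mirroring the menu; not BabylPhysh's actual code.
MODES = {
    "1": "QC-A", "2": "QC-FX", "3": "QC-DIFF", "4": "TRANS",
    "5": "BT-QC", "6": "CONFORM", "7": "SYNC", "8": "SRT-NAME",
    "9": "BATCH-QC", "P": "PIPELINE", "A": "ANALYSIS", "0": "EXIT",
}

def resolve_mode(raw: str):
    """Map user input such as '1', 'p', or 'qc-a' to a mode name, or None."""
    key = raw.strip().upper()
    if key in MODES:              # menu key: '1', 'P', 'A', '0'
        return MODES[key]
    if key in MODES.values():     # mode name typed directly: 'qc-a'
        return key
    return None                   # unrecognized input; the caller can re-prompt
```

Upper-casing the input once keeps the lookup case-insensitive for both the menu keys and the mode names.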

Example User Session

$ python3 launch.py
Welcome to BabylPhysh Terminal Subtitle Toolkit!
Working directory: /Applications/BabylPhysh Terminal v0.8a3
[Menu displays as shown above]
🎬 Select mode: 4
=== TRANS: Translation ===
Available files:
1) [masters] documentary_episode1.en.sdh.srt
2) [masters] documentary_episode2.en.sdh.srt
3) [to_qc]   movie_subtitle.en.srt
Choose file: 1
Selected: documentary_episode1.en.sdh.srt (245 subtitles)
🌍 Target Languages:
1) Essential Pack (5 languages)
2) Extended Pack (10 languages)
3) Comprehensive Pack (20 languages)
4) Custom selection
Select package [1]: 1
Processing: Spanish (Latin America)... ✓
Processing: French... ✓
Processing: German... ✓
Processing: Portuguese (Brazil)... ✓
Processing: Italian... ✓
[SUCCESS] Translation Complete!
[OUTPUT] 5 language versions created in to_qc/
[NEXT] Use CONFORM mode to check for English remnants

Core Features - v0.8a3-Enterprise

1. QC-A: Quality Check Analysis
Analyze subtitle files without modifications
  • CPL/CPS/Duration checks
  • Reading speed analysis
  • HTML/CSV reports
  • No file modifications
2. QC-FX: Quality Check & Fix
Analyze and automatically fix issues
  • All QC-A analysis
  • Automatic corrections
  • Frame grid alignment
  • Before/after reports
3. QC-DIFF: File Comparison
Compare original vs processed files
  • Visual diff highlighting
  • Timing analysis
  • Change tracking
  • Color-coded output
4. TRANS: Translation
Multi-language subtitle translation
  • Single to multiple languages
  • Language packages
  • Progress tracking
  • API integration
5. BT-QC: Back-Translation QC
Translation quality validation
  • Back-translation analysis
  • Similarity scoring
  • Quality assessment
  • Statistical reporting
6. CONFORM: Language Conforming
Fix mixed-language content
  • English remnant detection
  • Mixed script fixing
  • Sound effect translation
  • Consistency checking
7. SYNC: Transcript Sync
Match transcript to timing
  • Text-to-timing alignment
  • Fuzzy matching
  • Auto distribution
  • Confidence scoring
8. SRT-NAME: Language Naming
Detect and rename files
  • Auto language detection
  • Filename + content analysis
  • Batch renaming
  • 108+ language support
9. BATCH-QC: Batch Processing
Process multiple files
  • Multi-file selection
  • Batch analysis/fix
  • Progress tracking
  • Summary reports
10. PIPELINE: Complete Workflow
Automated end-to-end
  • SDH to multi-language
  • Stage automation
  • Master reports
  • Production ready
11. ANALYSIS: Dashboard
Production analytics
  • File statistics
  • Quality scoring
  • Workflow insights
  • Comparative analysis
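
Frame grid alignment, listed under QC-FX above, amounts to snapping each timestamp to the nearest frame boundary for the detected frame rate. The following is a minimal sketch under that assumption; `snap_to_frame` is a hypothetical helper, not the function BabylPhysh uses.

```python
from fractions import Fraction

def snap_to_frame(ms: int, fps: Fraction) -> int:
    """Round a timestamp in milliseconds to the nearest frame boundary."""
    frame = round(Fraction(ms, 1000) * fps)      # nearest frame index
    return round(Fraction(frame) / fps * 1000)   # frame index back to milliseconds

# 25 fps has a 40 ms frame grid; 23.976 fps is expressed exactly as 24000/1001.
print(snap_to_frame(1010, Fraction(25)))
print(snap_to_frame(1000, Fraction(24000, 1001)))
```

Using `Fraction` avoids floating-point drift for NTSC-style rates such as 24000/1001, which matters when many cues are snapped in sequence.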

Installation & Setup

Quick Start

git clone https://github.com/growgoo/Babyl.git
cd Babyl
git checkout Babylphysh-08a3-Enterprise
pip install -r requirements.txt
python3 launch.py

API Configuration

Required: OpenAI API key for translation features

Edit data/api_config.csv:

provider,api_key,model
openai,sk-YOUR_API_KEY_HERE,gpt-3.5-turbo
Get your API key at: platform.openai.com/api-keys
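
The config file is a CSV with a header row (provider, api_key, model) followed by one provider row. A minimal sketch of reading and sanity-checking it with the standard library follows; `load_api_config` is a hypothetical helper, and BabylPhysh's own loader may behave differently.

```python
import csv
import io

def load_api_config(handle):
    """Return the first provider row as a dict with provider, api_key, model."""
    reader = csv.DictReader(handle)
    row = next(reader, None)
    if row is None:
        raise ValueError("api_config.csv has a header but no provider row")
    if not row.get("api_key") or "YOUR_API_KEY_HERE" in row["api_key"]:
        raise ValueError("replace the placeholder API key before translating")
    return row

# In-memory sample with the same columns as data/api_config.csv.
sample = io.StringIO("provider,api_key,model\nopenai,sk-example,gpt-3.5-turbo\n")
config = load_api_config(sample)
print(config["provider"], config["model"])
```

With a real file, the handle would come from `open("data/api_config.csv", newline="", encoding="utf-8")`.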

Complete Version History

September 23, 2025
v0.8a3-Enterprise (Current)

Major Enhancements

  • Fixed all syntax errors and import dependencies
  • Enhanced SRT-NAME with advanced language detection
  • Improved workflow management and file selection
  • Professional package structure with proper initialization
  • Comprehensive error handling and logging
  • Production-ready deployment system

Critical Fixes

  • config.py syntax error (line 269) - RESOLVED
  • srt_name.py import issues - RESOLVED
  • __init__.py formatting errors - RESOLVED
  • workflow module dependencies - RESOLVED

New Features

  • Self-contained file selection in SRT-NAME
  • Filename + content language detection
  • Confidence scoring for detection accuracy
  • Batch language naming with selective processing
  • Professional rename preview and confirmation

September 2025
v0.8a2

Focus: Core Stability & Reporting

  • Enhanced HTML/CSV reporting system
  • Improved QC analysis algorithms
  • Frame rate detection capabilities
  • Basic batch processing implementation
  • Professional report templates

August 2025
v0.8a1

Focus: Foundation & Basic Functionality

  • Initial release of v0.8 series
  • Core QC functionality established
  • Basic translation support
  • SRT parsing and writing engines
  • Command-line interface framework

2024-2025
v0.7 Series

Focus: Prototype & Early Development

  • Proof of concept implementation
  • Basic mode structure design
  • Initial CLI interface
  • Early translation experiments
  • Core architecture planning

Forward Looking: v0.9 "Intelligence Enhanced"

Release Target: Q1 2026
Development Timeline: 8 weeks
Status: Planning Phase

Strategic Vision

Transform BabylPhysh from a fixed-API tool into a flexible AI-powered subtitle production platform where users select optimal AI models for different tasks. The flagship feature addresses critical ASR transcription errors through an innovative Hallucination Removal Engine.

Priority Features

CRITICAL: Hallucination Removal Engine

Problem: Whisper creates fictional dialogue during unclear audio, foreign speech, or background noise

Solution:

  • Manual marker detection [Hallucination begin/end]
  • User-defined replacement text
  • Batch correction across languages
  • Pattern analysis and detection
  • Timing preservation

Use Case: Documentary producer identifies 8 minutes of Tibetan speech that Whisper hallucinated as English dialogue. Marks sections, replaces with "[Tibetan speech]" across all 12 language versions in batch.
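
Since this engine is still in planning, the following is only a sketch of the marker-based idea: cues between a begin and an end marker have their text replaced while start and end times stay untouched. The marker spelling, the tuple layout, and `replace_hallucinations` are all assumptions made for illustration.

```python
# Assumed marker spelling, derived from "[Hallucination begin/end]".
BEGIN, END = "[Hallucination begin]", "[Hallucination end]"

def replace_hallucinations(cues, replacement):
    """Replace the text of cues between markers; timings are preserved.

    `cues` is a list of (start, end, text) tuples. A marker is assumed to sit
    inside the text of the first and the last affected cue.
    """
    result, inside = [], False
    for start, end, text in cues:
        if BEGIN in text:
            inside = True
        result.append((start, end, replacement if inside else text))
        if END in text:
            inside = False
    return result

cues = [
    ("00:00:01,000", "00:00:02,000", "Welcome to the monastery."),
    ("00:00:03,000", "00:00:04,000", "[Hallucination begin] I like the weather."),
    ("00:00:05,000", "00:00:06,000", "Thanks for watching. [Hallucination end]"),
    ("00:00:07,000", "00:00:08,000", "The ceremony ends at noon."),
]
fixed = replace_hallucinations(cues, "[Tibetan speech]")
```

Because cue indices and timings do not change, the same marked ranges could in principle be applied to each translated version of the file.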

Multi-AI Model Selection

Choose Your Intelligence:

  • GPT-3.5 Turbo (fast, economical)
  • GPT-4 (high quality)
  • GPT-4 Turbo (balanced)
  • GPT-4o (latest model)
  • Real-time cost estimation
  • Quality vs cost optimization

Multi-Provider Support

Flexible API Integration:

  • OpenAI (current, enhanced)
  • Claude/Anthropic
  • Google Gemini
  • Azure OpenAI
  • Automatic failover
  • Cost comparison
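
Automatic failover could be as simple as trying providers in priority order and falling through on errors. This sketch assumes each provider is wrapped in a plain callable; `translate_with_failover` and the stub functions are hypothetical, and no real provider SDK is called.

```python
def translate_with_failover(text, providers):
    """Try each (name, translate_fn) pair in order; return the first success."""
    errors = []
    for name, translate in providers:
        try:
            return name, translate(text)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stub providers standing in for real API clients.
def primary(text):
    raise ConnectionError("timeout")

def backup(text):
    return text.upper()

used, output = translate_with_failover("hola", [("primary", primary), ("backup", backup)])
```

Returning the provider name alongside the result lets a report record which provider actually produced each translation.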

Translation Memory

Consistency & Efficiency:

  • Remember previous translations
  • Terminology database
  • Context preservation
  • Auto-apply approved terms
  • Export/import memories
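
One way to picture a translation memory is a lookup keyed by source text and target language, with a JSON round trip for export and import. The class below is a hypothetical sketch of that idea, not the planned v0.9 design.

```python
import json

class TranslationMemory:
    """Minimal in-memory store of approved translations."""

    def __init__(self):
        self._entries = {}  # (source_text, target_lang) -> approved translation

    def remember(self, source, lang, translation):
        self._entries[(source, lang)] = translation

    def lookup(self, source, lang):
        """Return the stored translation, or None if there is no exact match."""
        return self._entries.get((source, lang))

    def export_json(self):
        rows = [{"source": s, "lang": l, "translation": t}
                for (s, l), t in self._entries.items()]
        return json.dumps(rows, ensure_ascii=False)

    @classmethod
    def import_json(cls, payload):
        memory = cls()
        for row in json.loads(payload):
            memory.remember(row["source"], row["lang"], row["translation"])
        return memory

tm = TranslationMemory()
tm.remember("[door slams]", "es-419", "[portazo]")
restored = TranslationMemory.import_json(tm.export_json())
```

This version only handles exact matches; fuzzy matching and context preservation would need more than a dictionary lookup.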

AI-Powered QC Enhancements

Intelligent Analysis:

  • Smart profile suggestions
  • Context-aware error detection
  • Cultural/linguistic checking
  • Hallucination pattern hints
  • Confidence scoring integration

Smart Batch Processing

Automated Optimization:

  • Automatic model selection
  • Load balancing across providers
  • Cost optimization strategies
  • Quality vs speed balancing
  • Batch hallucination correction

Development Timeline

Phase     Duration    Focus                   Deliverables
Phase 1   Weeks 1-2   Foundation              Hallucination removal engine, model selection framework, enhanced configuration
Phase 2   Weeks 3-4   Provider Expansion      Claude/Gemini integration, provider failover, cost estimation engine
Phase 3   Weeks 5-6   Intelligence Features   Translation memory, AI-powered QC, context-aware processing
Phase 4   Weeks 7-8   Advanced Features       Smart batch optimization, enhanced analytics, production reporting

Expected Benefits

Cost Optimization
  • 30% cost reduction through smart model selection
  • Real-time cost estimation
  • Budget-aware processing
Quality Improvement
  • 20% translation quality increase
  • 95% hallucination removal accuracy
  • AI-powered error detection
Productivity Boost
  • 40% workflow time reduction
  • Translation memory efficiency
  • Automated batch processing

Technical Specifications

System Requirements

Component        Minimum            Recommended
Python Version   3.8                3.10+
RAM              2 GB               4 GB+
Disk Space       500 MB             1 GB+
Internet         Required for API   Broadband

Supported File Formats

  • Input: SRT (SubRip Text) files
  • Output: SRT, HTML reports, CSV data
  • Encoding: UTF-8, Latin-1 (auto-detected)
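
Auto-detection between UTF-8 and Latin-1 is commonly done by attempting a strict UTF-8 decode first and falling back to Latin-1. The sketch below shows that approach; it is an assumption about the technique, and `decode_subtitle_bytes` is a hypothetical name.

```python
def decode_subtitle_bytes(data: bytes):
    """Decode as UTF-8 (tolerating a BOM), falling back to Latin-1.

    Returns a (text, encoding_name) tuple.
    """
    try:
        return data.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this decode cannot fail.
        return data.decode("latin-1"), "latin-1"

text, encoding = decode_subtitle_bytes("café".encode("latin-1"))
```

Because Latin-1 accepts any byte sequence, a file in a third encoding (for example Windows-1251) would decode without error but produce wrong characters, so the fallback is only safe when the input is known to be one of the two.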

API Integration

Provider           Status (v0.8a3)   Status (v0.9)
OpenAI             Active            Enhanced
Claude/Anthropic   -                 Planned
Google Gemini      -                 Planned
Azure OpenAI       -                 Planned

Supported Languages

BabylPhysh supports 108+ languages via BCP47 language codes, including:

Essential Languages
  • English (US, UK)
  • Spanish (Latin America, Spain)
  • French
  • German
  • Portuguese (Brazil, Portugal)
Asian Languages
  • Japanese
  • Korean
  • Chinese (Simplified, Traditional)
  • Hindi
  • Thai, Vietnamese
European Languages
  • Italian, Dutch, Swedish
  • Polish, Czech, Hungarian
  • Norwegian, Danish, Finnish
  • Romanian, Bulgarian, Greek

Quick User Guide

Basic Workflow

Typical Documentary Production Pipeline:
  1. Place English SDH file in masters/ folder
  2. Run Mode 4 (TRANS) to generate multiple languages
  3. Run Mode 6 (CONFORM) to fix partial translations
  4. Run Mode 9 (BATCH-QC) for quality control
  5. Use Mode 8 (SRT-NAME) to organize files
  6. Generate final reports and deliver

Common Use Cases

1. Quick Quality Check

  1. Launch: python3 launch.py
  2. Select Mode 1 (QC-A)
  3. Choose QC profile (FilmHub/Broadcast)
  4. Select subtitle file
  5. Review HTML report in output/reports/
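
The QC-A report is built from per-cue checks such as characters per line (CPL), characters per second (CPS), and duration. A minimal sketch of those checks follows. The default thresholds are illustrative placeholders, not the values of the FilmHub or Broadcast profiles, and `check_cue` is a hypothetical helper.

```python
def check_cue(start_ms, end_ms, text, max_cpl=42, max_cps=17.0,
              min_ms=1000, max_ms=7000):
    """Return a list of issue strings for one subtitle cue (empty if clean)."""
    issues = []
    duration = end_ms - start_ms
    lines = text.split("\n")

    longest = max(len(line) for line in lines)
    if longest > max_cpl:
        issues.append(f"CPL {longest} > {max_cpl}")

    if duration < min_ms:
        issues.append(f"duration {duration} ms < {min_ms} ms")
    elif duration > max_ms:
        issues.append(f"duration {duration} ms > {max_ms} ms")

    if duration > 0:  # avoid dividing by zero on zero-length cues
        cps = sum(len(line) for line in lines) / (duration / 1000)
        if cps > max_cps:
            issues.append(f"CPS {cps:.1f} > {max_cps}")
    return issues

print(check_cue(0, 2000, "Hello there"))
print(check_cue(0, 500, "x" * 50))
```

Passing the thresholds as parameters is one way a QC profile could be represented: each profile would simply supply its own set of limits.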

2. Multi-Language Production

  1. Launch: python3 launch.py
  2. Select Mode P (PIPELINE)
  3. Choose source SDH file
  4. Select language package (Essential/Extended/Comprehensive)
  5. Automatic translation + QC + reporting
  6. Review master report for delivery status

3. Language Detection & Organization

  1. Launch: python3 launch.py
  2. Select Mode 8 (SRT-NAME)
  3. Choose files to analyze (individual/range/all)
  4. Review detected languages
  5. Confirm batch rename
  6. Files organized with language identifiers
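
The filename half of language detection can be illustrated with names such as documentary_episode1.en.sdh.srt, where the language tag sits between the base name and the extension. This sketch only checks the shape of the tag with a simplified BCP47-style pattern; `language_from_filename` is hypothetical, and the content analysis SRT-NAME also performs is not shown.

```python
import re

# Simplified BCP47 shape: 2-3 letter language, optional subtags (e.g. pt-BR).
LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")

def language_from_filename(name: str):
    """Return the tag embedded in names like 'film.pt-BR.sdh.srt', or None."""
    parts = name.split(".")
    if len(parts) < 3 or parts[-1].lower() != "srt":
        return None
    for part in reversed(parts[1:-1]):   # skip the base name and the extension
        if part.lower() != "sdh" and LANG_TAG.match(part):
            return part
    return None

print(language_from_filename("documentary_episode1.en.sdh.srt"))
print(language_from_filename("film.pt-BR.srt"))
```

A shape check alone will accept any short dotted segment (for example "film.cut.srt"), which is one reason to combine it with content analysis and a confidence score before renaming.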

Troubleshooting

Common Issues & Solutions:
  • Translation fails: Check API key in data/api_config.csv
  • Import errors: Ensure Python 3.8+ is in use and dependencies are installed (pip install -r requirements.txt)
  • File not found: Verify file paths and folder structure
  • Encoding issues: BabylPhysh auto-detects only UTF-8 and Latin-1; re-save files in other encodings as UTF-8

Contributing & Support

How to Contribute

  • Report bugs via GitHub Issues
  • Suggest features in Discussions
  • Submit pull requests with improvements
  • Help with documentation
  • Share your production workflows

Support Resources

Resource            Link
GitHub Repository   github.com/growgoo/Babyl
Issue Tracker       github.com/growgoo/Babyl/issues
Documentation       docs/ folder in repository
Current Branch      Babylphysh-08a3-Enterprise