A Coding Implementation to Build Interactive Transcription and PDF Analysis with the Lyzr Chatbot Framework

In this tutorial, we introduce a streamlined approach to extracting, processing, and analyzing YouTube video transcripts with Lyzr, an advanced AI-driven framework designed to simplify interaction with textual data. Using Lyzr's intuitive chatbot interface alongside youtube-transcript-api and FPDF, users can effortlessly convert video content into structured PDF documents and conduct insightful analyses through dynamic interactions. Ideal for researchers, educators, and content creators, Lyzr accelerates the process of deriving meaningful insights, generating summaries, and formulating creative questions directly from multimedia resources.

!pip install lyzr youtube-transcript-api fpdf2 ipywidgets
!apt-get update -qq && apt-get install -y fonts-dejavu-core

We set up the environment needed for the tutorial. The first command installs the required Python libraries: Lyzr for AI-driven chat, youtube-transcript-api for transcript extraction, fpdf2 for PDF generation, and ipywidgets for building interactive chat interfaces. The second command ensures the DejaVu Sans font is installed on the system to support full Unicode text rendering in the generated PDF files.

import os
import openai


os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY_HERE"
openai.api_key = os.getenv("OPENAI_API_KEY")

We configure OpenAI API key access for the tutorial. We import the os and openai modules and make the key available through the OPENAI_API_KEY environment variable (replace the placeholder with your own key, or export it before launching the notebook). This setup is essential for using OpenAI's models within the Lyzr framework.
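Hard-coding the key is fine for a throwaway notebook, but a safer pattern is to prompt for it at runtime so it never appears in the source. Below is a minimal sketch using the standard-library getpass module; ensure_api_key is a hypothetical helper, not part of the tutorial's code:

```python
import os
from getpass import getpass

def ensure_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, prompting only when it is absent."""
    key = os.environ.get(env_var)
    if not key:
        key = getpass(f"Enter your {env_var}: ")  # hidden input, never echoed
        os.environ[env_var] = key
    return key
```

Calling ensure_api_key() before initializing any Lyzr or OpenAI client guarantees the environment variable is populated either way.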

import json
from lyzr import ChatBot
from youtube_transcript_api import YouTubeTranscriptApi, TranscriptsDisabled, NoTranscriptFound, CouldNotRetrieveTranscript
from fpdf import FPDF
from ipywidgets import Textarea, Button, Output, Layout
from IPython.display import display, Markdown
import re

Check out the full notebook here

We import the libraries required for the tutorial: json for data handling, Lyzr's ChatBot for AI-powered chat features, and YouTubeTranscriptApi for extracting transcripts from YouTube videos. We also bring in FPDF for PDF generation, ipywidgets for interactive UI components, and IPython.display for rendering Markdown content in the notebook. Finally, the re module is imported for the regular-expression operations used in the text-processing steps.

def transcript_to_pdf(video_id: str, output_pdf_path: str) -> bool:
    """
    Download YouTube transcript (manual or auto) and write it into a PDF
    using the system-installed DejaVuSans.ttf for full Unicode support.
    Fixed to handle long words and text formatting issues.
    """
    try:
        entries = YouTubeTranscriptApi.get_transcript(video_id)
    except (TranscriptsDisabled, NoTranscriptFound, CouldNotRetrieveTranscript):
        try:
            entries = YouTubeTranscriptApi.get_transcript(video_id, languages=['en'])
        except Exception:
            print(f"[!] No transcript for {video_id}")
            return False
    except Exception as e:
        print(f"[!] Error fetching transcript for {video_id}: {e}")
        return False


    text = "\n".join(e['text'] for e in entries).strip()
    if not text:
        print(f"[!] Empty transcript for {video_id}")
        return False


    pdf = FPDF()
    pdf.add_page()


    font_path = "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf"
    try:
        if os.path.exists(font_path):
            pdf.add_font("DejaVu", "", font_path)
            pdf.set_font("DejaVu", size=10)
        else:
            pdf.set_font("Arial", size=10)
    except Exception:
        pdf.set_font("Arial", size=10)


    pdf.set_margins(20, 20, 20)
    pdf.set_auto_page_break(auto=True, margin=25)


    def process_text_for_pdf(text):
        text = re.sub(r'\s+', ' ', text)
        text = text.replace('\n\n', '\n')


        processed_lines = []
        for paragraph in text.split('\n'):
            if not paragraph.strip():
                continue


            words = paragraph.split()
            processed_words = []
            for word in words:
                if len(word) > 50:
                    chunks = [word[i:i+50] for i in range(0, len(word), 50)]
                    processed_words.extend(chunks)
                else:
                    processed_words.append(word)


            processed_lines.append(' '.join(processed_words))


        return processed_lines


    processed_lines = process_text_for_pdf(text)


    for line in processed_lines:
        if line.strip():
            try:
                pdf.multi_cell(0, 8, line.encode('utf-8', 'replace').decode('utf-8'), align='L')
                pdf.ln(2)
            except Exception as e:
                print(f"[!] Warning: Skipped problematic line: {str(e)[:100]}...")
                continue


    try:
        pdf.output(output_pdf_path)
        print(f"[+] PDF saved: {output_pdf_path}")
        return True
    except Exception as e:
        print(f"[!] Error saving PDF: {e}")
        return False


This function, transcript_to_pdf, automates the conversion of YouTube video transcripts into clean, readable PDF documents. It retrieves the transcript with YouTubeTranscriptApi, handles exceptions gracefully (such as disabled or unavailable transcripts), and formats the text to avoid issues such as overly long words breaking the PDF layout. It also ensures proper Unicode support by using the DejaVu Sans font when available, and it optimizes text for PDF rendering by splitting excessively long words and maintaining uniform margins. The function returns True if the PDF is generated successfully and False if an error occurs.
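The long-word safeguard inside process_text_for_pdf is worth seeing in isolation. Here is a self-contained sketch of the same chunking idea (chunk_long_words is a hypothetical helper; a small max_len of 5 is used purely so the effect is visible on short input, where the real function uses 50):

```python
import re

def chunk_long_words(text: str, max_len: int = 50) -> list[str]:
    """Collapse whitespace, then split any word longer than max_len into
    max_len-sized chunks so it can never overflow a PDF cell."""
    text = re.sub(r'\s+', ' ', text).strip()
    out: list[str] = []
    for word in text.split(' '):
        if len(word) > max_len:
            out.extend(word[i:i + max_len] for i in range(0, len(word), max_len))
        else:
            out.append(word)
    return out

# With max_len=5, a 12-character "word" is split into 5+5+2 chunks.
print(chunk_long_words("hello supercalifra", max_len=5))
# → ['hello', 'super', 'calif', 'ra']
```

Without this step, a single unbroken token (a URL, for example) wider than the page would raise a layout error in FPDF's multi_cell.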

def create_interactive_chat(agent):
    input_area = Textarea(
        placeholder="Type a question…", layout=Layout(width="80%", height="80px")
    )
    send_button = Button(description="Send", button_style="success")
    output_area = Output(layout=Layout(
        border="1px solid gray", width="80%", height="200px", overflow='auto'
    ))


    def on_send(btn):
        question = input_area.value.strip()
        if not question:
            return
        with output_area:
            print(f">> You: {question}")
            try:
                print("<< Bot:", agent.chat(question), "\n")
            except Exception as e:
                print(f"[!] Error: {e}\n")


    send_button.on_click(on_send)
    display(input_area, send_button, output_area)


This function, create_interactive_chat, builds a simple interactive chat interface in Colab. Using ipywidgets, it provides a text input area (Textarea) for users to type questions, a send button (Button) to trigger the chat, and an output area (Output) that displays the conversation. When the user clicks Send, the entered question is passed to the Lyzr chatbot agent, which generates and displays a response. This lets users hold dynamic Q&A sessions over the transcript analysis, making the interaction feel like a live conversation with the AI model.
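The widget wiring above is notebook-specific, but the underlying pattern (read input, call agent.chat, render the reply, swallow errors) is framework-agnostic. A minimal sketch of that pattern follows; EchoAgent and handle_send are hypothetical stand-ins, not part of Lyzr or ipywidgets:

```python
class EchoAgent:
    """Hypothetical stand-in for a Lyzr ChatBot agent: echoes the question back."""
    def chat(self, question: str) -> str:
        return f"You asked: {question}"

def handle_send(agent, question: str) -> str:
    """Mirror on_send(): ignore blank input and catch agent errors."""
    question = question.strip()
    if not question:
        return ""
    try:
        return agent.chat(question)
    except Exception as e:
        return f"[!] Error: {e}"

print(handle_send(EchoAgent(), "  What does the video cover?  "))
# → You asked: What does the video cover?
```

Keeping the handler separate from the widgets this way also makes the chat logic easy to unit-test without a running notebook.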

def main():
    video_ids = ["dQw4w9WgXcQ", "jNQXAC9IVRw"]
    processed = []


    for vid in video_ids:
        pdf_path = f"{vid}.pdf"
        if transcript_to_pdf(vid, pdf_path):
            processed.append((vid, pdf_path))
        else:
            print(f"[!] Skipping {vid} — no transcript available.")


    if not processed:
        print("[!] No PDFs generated. Please try other video IDs.")
        return


    first_vid, first_pdf = processed[0]
    print(f"[+] Initializing PDF-chat agent for video {first_vid}…")
    bot = ChatBot.pdf_chat(
        input_files=[first_pdf]
    )


    questions = [
        "Summarize the transcript in 2–3 sentences.",
        "What are the top 5 insights and why?",
        "List any recommendations or action items mentioned.",
        "Write 3 quiz questions to test comprehension.",
        "Suggest 5 creative prompts to explore further."
    ]
    responses = {}
    for q in questions:
        print(f"[?] {q}")
        try:
            resp = bot.chat(q)
        except Exception as e:
            resp = f"[!] Agent error: {e}"
        responses[q] = resp
        print(f"[/] {resp}\n" + "-"*60 + "\n")


    with open('responses.json','w',encoding='utf-8') as f:
        json.dump(responses,f,indent=2)
    md = "# Transcript Analysis Report\n\n"
    for q,a in responses.items():
        md += f"## Q: {q}\n{a}\n\n"
    with open('report.md','w',encoding='utf-8') as f:
        f.write(md)


    display(Markdown(md))


    if len(processed) > 1:
        print("[+] Generating comparison…")
        _, pdf1 = processed[0]
        _, pdf2 = processed[1]
        compare_bot = ChatBot.pdf_chat(
            input_files=[pdf1, pdf2]
        )
        comparison = compare_bot.chat(
            "Compare the main themes of these two videos and highlight key differences."
        )
        print("[+] Comparison Result:\n", comparison)


    print("\n=== Interactive Chat (Video 1) ===")
    create_interactive_chat(bot)


Our main() function acts as the driver for the entire tutorial pipeline. It processes a list of YouTube video IDs, converting each available transcript into a PDF with the transcript_to_pdf function. Once the PDFs are generated, a Lyzr PDF-chat agent is initialized on the first PDF, allowing the model to answer predefined questions such as summarizing the content, identifying insights, and generating quiz questions. The answers are stored in responses.json and formatted into a Markdown report (report.md). If multiple PDFs were created, the function compares them with a second Lyzr agent to highlight key differences between the videos. Finally, it launches an interactive chat interface that enables dynamic conversations grounded in the transcript content, showcasing Lyzr's power for frictionless PDF analysis and AI-powered interaction.
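The report-assembly step in main() is simple enough to factor out and verify on its own. Below is a sketch of that step as a standalone function (build_report is a hypothetical refactoring, and the sample dictionary is illustrative data, not real model output):

```python
import json

def build_report(responses: dict[str, str]) -> str:
    """Assemble the Markdown report the same way main() does:
    a title followed by one '## Q:' section per question/answer pair."""
    md = "# Transcript Analysis Report\n\n"
    for q, a in responses.items():
        md += f"## Q: {q}\n{a}\n\n"
    return md

# Illustrative input; in main() the answers come from bot.chat().
sample = {"Summarize the transcript in 2-3 sentences.": "A short demo summary."}
print(build_report(sample))
print(json.dumps(sample, indent=2))
```

Because Python dictionaries preserve insertion order, the report sections appear in the same order the questions were asked.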

if __name__ == "__main__":
    main()

We ensure that the main() function runs only when the script is executed directly, not when it is imported as a module. This is a standard Python best practice for controlling execution flow.

Finally, by integrating Lyzr into our workflow as demonstrated in this tutorial, we can effortlessly transform YouTube videos into insightful, actionable knowledge. Lyzr's intelligent PDF-chat capability simplifies extracting core themes and generating comprehensive summaries, and it enables engaging, interactive exploration of content through an intuitive conversational interface. Adopting Lyzr lets users unlock deeper insights and significantly improve productivity when working with video transcripts, whether for academic research, education, or creative content analysis.




Asif Razzaq is the CEO of Marktechpost Media Inc. His latest endeavor is the launch of Marktechpost, an artificial intelligence media platform that stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among readers.
