Video Transcribing & Translating with cloud-based (AWS/Google/other) or ML-based techniques.



  • We have an education platform website. The teacher uploads video webinars through a web-based teacher-portal. The admin declines/approves the video. Upon approval, these videos are visible on a web-based student-portal.

    The videos are stored in the local filesystem. There is a MySQL database with the following columns: `file_path`, `approve_status`.

    Project Goals:
    We want a background cron/scheduled script that checks the MySQL database for new videos, automatically transcribes and translates each video, and saves the results in a folder. More precisely, the script should do the following:

    1. AWS Transcribe - We want each uploaded video to be transcribed. That is, your script would take in a video in English, say "video.mp4", process it, and produce an English subtitles file as output, i.e. "video_en.srt". Note: All videos are in English.

    2. Google Cloud Translate - The "video_en.srt" file will then be translated into other Indian languages to generate a translated version of the subtitle file for each language, like "video_hi.srt" (Hindi), "video_te.srt" (Telugu), "video_mr.srt" (Marathi), etc.

    3. AWS Polly / Google Cloud Text-to-Speech - The translated .srt files will then be used to generate a translated video. That is, for example, the "video_hi.srt" subtitles would be used to generate a translated video "video_hi.mp4", and similarly for the other languages.
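To illustrate what we mean by steps 1 and 2, here is a minimal Python sketch, not a required implementation. It assumes the transcription service returns timed segments as (start, end, text) tuples, and it takes the translation call as a plain function argument so any provider can be plugged in. The names `segments_to_srt`, `translate_srt`, and `format_timestamp` are hypothetical and ours only. The key point is that translation should touch the cue text only and leave the cue numbers and timings unchanged.

```python
import re
from typing import Callable, List, Tuple

# Assumed shape of one transcription segment: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Segment]) -> str:
    """Build the contents of an .srt file from timed segments."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}")
    return "\n\n".join(blocks) + "\n"


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate only the cue text; keep cue numbers and timings unchanged.

    `translate` is a placeholder for whichever translation service is chosen.
    """
    out_blocks = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            out_blocks.append(block)  # malformed cue: pass through untouched
            continue
        index, timing, text_lines = lines[0], lines[1], lines[2:]
        translated = translate(" ".join(text_lines))
        out_blocks.append(f"{index}\n{timing}\n{translated}")
    return "\n\n".join(out_blocks) + "\n"
```

For example, `segments_to_srt([(0.0, 2.5, "Hello class.")])` gives the text to save as the English .srt file, and `translate_srt(english_srt, some_translate_function)` gives the text for one translated .srt file.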

    Workflow:
    The expected end workflow would be the following:

    1. Teacher - Uploads a video.
    2. Background Worker (YOUR SCRIPT) - Transcribes: creates a .srt file.
    3. Admin Panel - Approves / rejects the transcription.
    4. Background Worker (YOUR SCRIPT) - Upon video approval, translates the subtitles and creates a translated video.
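One possible way for the scheduled script to decide what to do on each run is sketched below in Python. It is only an illustration under assumptions of ours: that `approve_status` holds the string "approved" once the admin approves, that output files sit next to the source video, and that Hindi and Telugu are the target languages. The names `pending_jobs` and `output_path` are hypothetical. Deciding from the files that already exist means a cron run can be repeated safely without redoing finished work.

```python
import os
from typing import List, Sequence, Set, Tuple

APPROVED = "approved"        # assumed value of the approve_status column
TARGET_LANGS = ("hi", "te")  # assumed subset of target languages

Job = Tuple[str, str]  # (action, output file path)


def output_path(video_path: str, lang: str, ext: str) -> str:
    """Map e.g. 'video.mp4' to 'video_hi.srt' or 'video_hi.mp4'."""
    base, _ = os.path.splitext(video_path)
    return f"{base}_{lang}{ext}"


def pending_jobs(
    video_path: str,
    approve_status: str,
    existing: Set[str],
    langs: Sequence[str] = TARGET_LANGS,
) -> List[Job]:
    """Return the work still to be done for one database row.

    `existing` is the set of output file paths already on disk.
    """
    english_srt = output_path(video_path, "en", ".srt")
    if english_srt not in existing:
        # Workflow step 2: transcription happens before approval.
        return [("transcribe", english_srt)]
    if approve_status != APPROVED:
        # Workflow step 3: wait for the admin's decision.
        return []
    jobs: List[Job] = []
    for lang in langs:
        # Workflow step 4: translate subtitles, then build the translated video.
        srt = output_path(video_path, lang, ".srt")
        video = output_path(video_path, lang, ".mp4")
        if srt not in existing:
            jobs.append(("translate", srt))
        if video not in existing:
            jobs.append(("dub", video))
    return jobs
```

A cron entry would then loop over the rows read from MySQL, call `pending_jobs` for each one, and hand the resulting jobs to the transcription, translation, and text-to-speech steps.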

    Stack: We are open to using ANY technology for your background script, as well as any cloud service or custom machine learning solution. The AWS/Google services above are only suggestions; there is no need to stick to them. We are also flexible on the workflow and the solution, as long as the end result of transcribing and translating is met.

     

    Project Type: One-time project

     

    Skills and expertise

     

    Machine Learning Methods: Deep Learning
    Machine Learning Languages: Python

