Repurpose your long-form videos to short-form using Veehive.ai

letsveehive
6 min read · Aug 16, 2021
We would love to share our research and development on “Repurpose your long-form videos to short-form using Veehive.ai” with you!

With the rising demand for online content, the video streaming industry is at its peak. However, research shows that when it comes to video content, less is more: the shorter the video, the higher the user engagement. Veehive, an AI-based multi-purpose media streaming platform, has embraced this mantra, deep-diving into the emerging digital trend of creating bite-sized, easy-to-consume videos.

Sathish Jeyakumar, Founder of Veehive.ai, says, “Content creators have their closed context communities, and want to explore commerce through them. However, the current legacy platforms, with their existing business models and technologies, do not support them enough with modern tools and user experience.”

As a result, communities across the globe are sitting on large libraries of lengthy, unconsumed video content. There is a dire need to convert these into short-form videos that are crisp, consumable and highly engaging for today’s audience.

He adds, “Veehive provides personalisation through an array of professional & managed services to pick from; one such service uses AI, to provide a platform where you can convert your hour-long videos to 3–5 minutes of short-form video content and gauge the attention of content consumers. This technology will help communities to repurpose their old videos into content that is easily consumable in an effective manner on mobile devices.”

How does Veehive use AI to generate short-form videos?

Artificial Intelligence (AI) has come a long way in the digital world, and the industry is rapidly adopting its capabilities. Veehive has delved deep into this technology to become a trailblazer in online content and community management. Its main objective is to create short-form content geared towards a more controllable and shareable viewing experience. Since generating short-form content is now a megatrend, it becomes essential to work with existing content and extract the best out of it. Veehive’s AI team achieves this by giving communities control over remodelling their long content: they select which key content should be part of the generated short-form video, without needing any editing tool. Syed Arsalan and Mehak Kashmiria, Machine Learning Engineers at Veehive.ai, designed and developed this system using deep neural networks, including transformer models, Natural Language Processing (NLP) and other machine learning techniques.

Natural Language Processing (NLP) deals with processing human language, as speech or text, so that computers can work with it. Veehive deploys transformer models: deep neural networks built on the ‘attention’ mechanism that solve sequence-to-sequence encoding-decoding tasks in NLP.
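The ‘attention’ mechanism at the heart of transformers can be sketched in a few lines. The toy scaled dot-product attention below (plain Python, not Veehive’s code) shows how a query vector is softly matched against key vectors to produce a weighted blend of the value vectors:

```python
import math

def scaled_dot_product_attention(query, keys, values):
    """Toy scaled dot-product attention over a short token sequence.

    query: one query vector (list of floats)
    keys, values: one vector per token
    Returns (context vector, attention weights).
    """
    d_k = len(query)
    # Similarity of the query with every key, scaled by sqrt(d_k).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
              for key in keys]
    # Softmax turns scores into weights that sum to 1.
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Blend the value vectors according to the weights.
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return context, weights

context, weights = scaled_dot_product_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
```

Because the query aligns with the first key, the first value dominates the blended context; a full transformer stacks many such attention heads with learned projections.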

Veehive applies AI to audio processing and to semantic and contextual analysis of the transcribed text. Recurrent Neural Networks (RNNs) are deployed for several of the NLP tasks. RNNs are neural networks designed for sequential data such as text, time series, audio and video. Since Veehive’s AI engine works with audio and text data, RNNs are a good fit for the job.
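As a rough illustration of why RNNs suit sequential data, here is a toy single-unit RNN in plain Python. The weights are arbitrary constants chosen for the example; real systems learn them from data:

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a toy single-unit RNN: h_t = tanh(w_x*x + w_h*h_prev + b)."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the RNN, carrying hidden state between steps."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, 0.0])
```

Even after the input goes silent, the hidden state keeps a decaying trace of what it saw earlier; that carried context is what makes RNNs useful for text and audio.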

“Vee aim to help content creators focus on content than technologies. AI and the Veehive platform should enable many Digital Entrepreneurs and Closed Context Communities to grow their dream to learn, share and add commerce to their communities.”- Sathish

This is how the platform works:

  • A user uploads their long-form video on Veehive’s dedicated short-form creation platform.
  • The video gets processed, and its audio is extracted in the background. The corresponding audio transcript also gets generated.
  • This proprietary model then performs text detection, restoration and correction for the transcript.
  • The platform generates key highlights of the video content, which the user can select based on their preference for inclusion in the short-form video.
  • The last step is the creation of the short-form video itself based on user-selected content.
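The steps above can be sketched as a minimal Python pipeline. All function names and the transcript format are hypothetical stand-ins for illustration, not Veehive’s actual API:

```python
# Hypothetical sketch of the pipeline steps; the names and the
# transcript structure are invented for illustration.

def extract_audio_transcript(video_path):
    """Steps 1-2: stand-in for upload + speech recognition;
    returns timed transcript segments."""
    return [
        {"start": 0.0, "end": 30.0, "text": "intro and welcome"},
        {"start": 30.0, "end": 90.0, "text": "key topic one explained"},
        {"start": 90.0, "end": 150.0, "text": "key topic two explained"},
    ]

def detect_highlights(segments):
    """Steps 3-4: stand-in for text correction + highlight detection."""
    return [s for s in segments if "key topic" in s["text"]]

def build_short_form(highlights, selected_indices):
    """Step 5: keep only the user-selected highlight segments."""
    return [highlights[i] for i in selected_indices]

segments = extract_audio_transcript("lecture.mp4")
highlights = detect_highlights(segments)
short_video = build_short_form(highlights, selected_indices=[0])
```

The key design point is the human-in-the-loop step: the model proposes highlights, but the user decides which ones make the final cut.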

Voila! That is how easy it is to produce short-form videos from your otherwise neglected long-form content. Time to bring those videos out and transform them into more engaging, digestible content.

Let’s dive into the model components

Below is the high-level architecture of how the different model pieces fit together. Audio processing deals with the main attributes of an audio signal and enhances those elements to extract meaningful insights from the audio data. It is generally done through automated speech recognition (ASR): an ASR engine recognises speech in the audio data and then applies machine learning algorithms to its various elements. Veehive uses an end-to-end approach based on deep neural networks for this task. Deep neural networks are no longer just a textbook concept; they are efficient models that extract richer insights from data.
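To make the end-to-end neural ASR idea concrete, here is a toy greedy CTC decoder, a common final decoding step in such systems. The alphabet and per-frame probabilities are made up for illustration; this is the general technique, not Veehive’s implementation:

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Greedy CTC decoding, as used by end-to-end neural ASR systems:
    pick the most likely symbol per audio frame, collapse repeats,
    then drop the blank symbol."""
    # Best symbol index for each frame.
    best = [max(range(len(frame)), key=frame.__getitem__)
            for frame in frame_probs]
    decoded = []
    prev = None
    for idx in best:
        # Emit a symbol only when it changes and is not the blank.
        if idx != prev and idx != blank:
            decoded.append(alphabet[idx])
        prev = idx
    return "".join(decoded)

# alphabet[0] is the CTC blank; each frame holds probabilities
# over [blank, 'h', 'i'].
alphabet = ["-", "h", "i"]
frames = [
    [0.1, 0.8, 0.1],  # 'h'
    [0.1, 0.7, 0.2],  # 'h' again (repeat, collapsed)
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.1, 0.8],  # 'i'
]
text = ctc_greedy_decode(frames, alphabet)
```

The blank symbol and repeat-collapsing are what let the network emit one character over many audio frames without the transcript repeating letters.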

Next, text processing involves analyzing and pre-processing the text extracted from the audio, transforming it into a form the machine can easily understand. A bidirectional recurrent neural network (BRNN) performs text detection and restoration to convert it into a machine-readable format. In a BRNN, the hidden layers connect to the output layer through both forward and backward states, so the output at each step draws on information from both directions of the sequence, making the predictions more accurate.
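A toy sketch of the bidirectional idea: run the same recurrent cell forward and backward over the sequence and pair the states, so each position sees context from both sides. The weights are arbitrary constants for illustration; real BRNNs learn them:

```python
import math

def rnn_pass(sequence, w_x=0.5, w_h=0.8):
    """Single-direction toy RNN: h_t = tanh(w_x*x_t + w_h*h_{t-1})."""
    h, states = 0.0, []
    for x in sequence:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

def birnn(sequence):
    """Bidirectional wrapper: each position gets a (forward, backward)
    state pair, so its output reflects context from both sides."""
    fwd = rnn_pass(sequence)
    bwd = list(reversed(rnn_pass(list(reversed(sequence)))))
    return list(zip(fwd, bwd))

out = birnn([1.0, 0.0, 0.0])
```

At position 0 the forward state has only seen the first input, but the backward state has already traversed the whole rest of the sequence; combining both is what sharpens the prediction at every step.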

The main highlight of the model is keyword generation. This is where things get interesting and the real essence of the short-form content emerges. A user’s video may contain a lot of redundant, repetitive information. Veehive’s AI platform lets the user select the key highlights from their video content and uses them to generate the short-form video. The keyword extraction process uses pre-trained transformer models for natural language processing to understand the context of words in the text, introducing diversity and minimising redundancy in the content while maintaining the semantics.
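One common way to minimise redundancy while maintaining semantics is Maximal Marginal Relevance (MMR) over embedding similarities. The sketch below uses tiny hand-made vectors in place of real transformer embeddings; it illustrates the general technique, not Veehive’s implementation:

```python
def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: sum(x * x for x in v) ** 0.5
    return dot / (norm(a) * norm(b))

def mmr_select(doc_vec, candidates, k=2, lam=0.5):
    """Maximal Marginal Relevance: pick candidates relevant to the
    document but dissimilar to those already chosen, which reduces
    redundancy in the extracted highlights.
    candidates: phrase -> embedding vector."""
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(phrase):
            relevance = cosine(remaining[phrase], doc_vec)
            redundancy = max((cosine(remaining[phrase], candidates[s])
                              for s in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

doc = [1.0, 1.0, 0.0]
phrases = {
    "video streaming": [1.0, 0.9, 0.0],   # relevant
    "streaming video": [0.9, 1.0, 0.0],   # relevant but near-duplicate
    "audio transcript": [0.0, 1.0, 1.0],  # covers a different aspect
}
picked = mmr_select(doc, phrases, k=2)
```

The near-duplicate phrase is penalised after the first pick, so the second slot goes to a phrase that adds new information, which is exactly the diversity property described above.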

Based on the user-selected topics above, the uploaded video is edited to create a short-form video with seamless transitions without losing any audio content. After this, the short-form video undergoes processing to create a SubRip Subtitle file (SRT). This SRT is based on a fixed segment length for a clear display on the screen and maps precisely to the spoken audio.
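Generating fixed-length SRT cues from word-level timings can be sketched as follows; the timing data here is invented for illustration:

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def build_srt(words, words_per_cue=4):
    """Group (start, end, word) tuples into fixed-length SRT cues so each
    caption stays short enough to display clearly on screen."""
    cues = []
    for n, i in enumerate(range(0, len(words), words_per_cue), start=1):
        chunk = words[i:i + words_per_cue]
        start, end = chunk[0][0], chunk[-1][1]
        text = " ".join(w for _, _, w in chunk)
        cues.append(f"{n}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(cues)

words = [(0.0, 0.4, "Welcome"), (0.4, 0.9, "to"), (0.9, 1.5, "the"),
         (1.5, 2.2, "short-form"), (2.2, 2.9, "video"), (2.9, 3.6, "demo")]
srt = build_srt(words)
```

Because each cue’s start and end come straight from the word timings, the captions map precisely onto the spoken audio, as the text above describes.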

Well, this is it! That is the whole of Veehive’s short-form content-generation platform. These modules, working together in harmony, transform long videos into short form.

User Experience

The entire short-form creation flow is straightforward: everything happens on screen, and the user is kept updated at each step. The model also lets the user view the transcript of each highlighted section, which helps them select or reject a particular topic for the short-form video.

As soon as the short-form video is generated, the user may wish to preview it, download it or upload more content. Plans are to make it more scalable and robust and to improve user experience as we grow.

Veehive’s AI team is working on providing a smooth and scalable UI for its users. Users will be able to download the videos and SRT files, and to select from AI auto-generated short-form content to save time and effort.

Sathish believes “Veehive’s ‘long-form to short-form video’ platform will enable many communities to repurpose their long-lost video content and transform it into digestible bite-sized short content or golden nuggets.”

Future enhancements

Veehive will evolve this working model into a more comprehensive, fully automated solution in due course. The UI is being refined. In addition, the model will be enhanced to generate multiple candidate short-form videos, based on topics/keywords the model itself selects, for the user to choose from. This will bring diversity to the short-form content being created.

Currently, the model only caters to videos containing human speech, where the speech audio can be transcribed to text. Future plans focus on handling videos with background music, which will bring computer vision and Optical Character Recognition (OCR) into play.

Another potential feature to include is summarizing the entire video content and showcasing the most relevant nuggets of information. This will help to give an overall view of the long content. The feature will also provide a mechanism to manually select and edit the transcript which the user wants to include.

Veehive is working to advance into the future with the mission to actualize innovation and intuitive customer experience for bringing communities together using AI. Veehive’s vision is to reach a pinnacle in the world of audio & video content creation, management and sharing.

Vee aim to help content creators focus on content rather than technologies. AI and the Veehive platform should enable many Digital Entrepreneurs and Closed Context Communities to grow into global communities that learn, share and commercialize relevant content.

#letsveehive #aiforgood #mlengineers #aiml
