Correct text/audio alignment errors
shipped
Ben Woodward
If the auto-transcription fails to detect a string of words in the audio, you can edit the transcript and add them in. The software will make a judgement about which waveforms to align these words to. However, if the software makes an incorrect assessment, it is currently not possible to manually adjust the word positioning.
Kevin from Descript
shipped
Hello Ben + co., this is possible in the timeline! Simply hold cmd/ctrl to click and drag word tabs.
https://share.descript.com/view/FqHotWNwUJ1
Ron Passaro
Hi all, this is not really shipped. It's still WAY WAY too hard to line up multiple words/paragraphs of text. We need a way to move multiple words at once, or at least to click and drag a single word easily to a new position without having to zoom in a massive amount, select the start and end point of each word, and adjust those positions separately bit by bit. Often we have to do that several times on ONE word just to get it in the right place. Imagine having to do that with 400 words or so. This is literally the difference between seconds and many, many hours of work. I don't know if I should open a new request with this one already in existence, but I'm happy to do so.
Andrew Mason
open
Joel Rendall
Editing videos in Portuguese, I have this problem with almost every sentence. I can't really just edit the transcript without also manually correcting the in point, since it usually clips off half the following word.
Toggle Rat
This is definitely still a problem. Being able to select text and align a group of words would be nice.
Rick Maranta
This is still a big problem. I find it almost impossible at times to put the correct audio on a part of the waveform. Super frustrating.
Yannick Vez
Rick Maranta: I wanted to kill myself three times because of this shit, so frustrating
J D
Yannick Vez: Helps to use both a mouse and a touch pad. You can use a pinch action to make the waveform wider quickly and use the mouse with ctrl/cmd to grab the edges.
Would certainly be nice if there was a more fluid way of moving the words on the waveform around and expanding/contracting them as grouped text rather than one word at a time when Descript aligns them really poorly.
J D
Can we merge this with my thoughts here: https://feedback.descript.com/feature-requests/p/pin-or-lock-words-to-wave-form
Yannick Vez
J D: Thanks, I'll try it.
daytona
How about a word/word group slide function? Instead of adjusting the in/out of a word, just slide it along the timeline to where it goes!
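To make the idea concrete, here is a minimal sketch of what "sliding" a group of words could mean in terms of timings. Everything in it is an assumption for illustration: the `Word` record, the `slide_words` name, and the idea that each word carries a start and end time in seconds are hypothetical and are not taken from Descript.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Word:
    """Hypothetical aligned word: text plus start/end time in seconds."""
    text: str
    start: float
    end: float


def slide_words(words: List[Word], first: int, last: int, offset: float) -> List[Word]:
    """Return a copy of `words` with words[first..last] shifted by `offset` seconds.

    Each moved word keeps its duration; words outside the selection are untouched.
    This sketch does not check for overlap with neighbouring words.
    """
    if not 0 <= first <= last < len(words):
        raise IndexError("selection out of range")
    moved = list(words)
    for i in range(first, last + 1):
        w = words[i]
        moved[i] = replace(w, start=w.start + offset, end=w.end + offset)
    return moved


# Usage: slide the last two words half a second later.
words = [Word("a", 0.0, 0.5), Word("b", 1.0, 1.5), Word("c", 1.5, 2.0)]
shifted = slide_words(words, 1, 2, 0.5)
```

The point of the sketch is that a slide is a single offset applied to the whole selection, rather than two separate edge adjustments per word.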
Yannick Vez
daytona: YES YES I want this!
Jay Mutzafi
How is this not solved yet? I have a clip where it failed completely to detect words, and if I add them manually, it adds them on a "separate" track, and I have found no way to align them with the audio whatsoever.
J D
Jay Mutzafi: Looks like this topic is under review, so that's good. I've been working around alignment problems by using the blade tool and then typing text and isolating it with a blade cut on both sides. This allows me to "pin" a phrase where I want it, even if the audio is silent, such as with a silent video where I want to add an overdub. So we can technically do this without any modifications to the tool, but I think it's beyond the ability of casual users. It will be nice to see how the Descript team decides to implement this.
Rick Maranta
Yes. Just had this issue. The first half of a sentence failed to have words and it was impossible to add them in. Even aligning the words with the text is a pain, having to drag both ends to the right spot. Very fiddly and frustrating. I had to clip to a new composition. Took me a long time. We should also be able to re-transcribe a selection or clip.
Harry Hawk
Related to this I put in a request to support a 2nd pass on the AI transcription to find gaps.
Specifically, if you (like me) are using a manually supplied "human" transcript there are often gaps where non-word filler sounds appear (ummm), etc.
Since it's clear from the timeline that Descript knows where they are (they are colored black vs. pink), Descript could fire off a 2nd-pass AI transcription that uses those time codes to transcribe only those "black" sections.
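The gap-finding step of such a second pass could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the aligned words are available as (start, end) pairs in seconds, and the function name `find_untranscribed_gaps` and the `min_gap` threshold are hypothetical, not part of any Descript API.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_seconds, end_seconds)


def find_untranscribed_gaps(
    words: List[Span], duration: float, min_gap: float = 0.25
) -> List[Span]:
    """Return stretches of the timeline that no aligned word covers.

    `words` holds the (start, end) times of words already aligned to audio.
    Gaps shorter than `min_gap` seconds are ignored as ordinary pauses.
    """
    gaps: List[Span] = []
    cursor = 0.0  # end of the latest covered audio seen so far
    for start, end in sorted(words):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


# Usage: three aligned words in a 4-second clip.
aligned = [(0.0, 0.5), (0.6, 1.0), (2.0, 2.5)]
gaps = find_untranscribed_gaps(aligned, duration=4.0)
# gaps == [(1.0, 2.0), (2.5, 4.0)]
```

Only the returned spans would then need to go to the transcription engine, which leaves the human-supplied transcript untouched everywhere else.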