Correct text/audio alignment errors
under review
Ben Woodward
If the auto-transcription fails to detect a string of words in the audio, you can edit the transcript and add them in. The software will make a judgement about which waveforms to align these words to. However, if the software makes an incorrect assessment it is currently not possible to manually adjust the word positioning.
Harry Hawk
Related to this I put in a request to support a 2nd pass on the AI transcription to find gaps.
Specifically, if you (like me) are using a manually supplied "human" transcript there are often gaps where non-word filler sounds appear (ummm), etc.
Since it's clear from the timeline that Descript knows where they are (they are colored black vs. pink), Descript could fire off a 2nd pass AI transcription that uses those time codes to instruct the AI to transcribe only those "black" sections.
https://feedback.descript.com/feature-requests/p/human-ai-transcripts

Yoz Grahame
I think this feature has now been added: https://help.descript.com/hc/en-us/articles/360054482811
Russell Silber
@Yoz Grahame: They did and it's a great start, but if there are large blocks of unaligned text it is labor intensive to fix in the current implementation. I think there needs to also be a way to move/slide the word tabs, not just trim them at the edges, and then importantly, select multiple word tabs to move/slide in a batch. Right now when I have to manually fix 30-40 word tabs that all bunched up for some reason it is not fun.
Alastair Budge
@Russell Silber: definitely agree with this. In my experience (now editing 100+ podcasts with Descript) the errors with alignment are almost always multiple sentences, not individual words. So fixing the alignment takes a huge amount of time, and it's actually easier to edit the timestamps in the exported VTT files.
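For anyone trying the VTT route mentioned above, here is a minimal sketch of shifting a run of cues in one go instead of retyping each timestamp. It assumes the exported file is standard WebVTT with `HH:MM:SS.mmm` (or `MM:SS.mmm`) timestamps; the function names `shift_cues`, `to_ms` and `format_ms` are made up for this example and are not part of Descript.

```python
import re

# Matches WebVTT timestamps such as 00:01:02.345 or 01:02.345 (hours are optional).
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def to_ms(match):
    """Convert a matched timestamp to milliseconds."""
    hours, minutes, seconds, millis = match.groups()
    total_seconds = (int(hours or 0) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def format_ms(total_ms):
    """Format milliseconds as HH:MM:SS.mmm, clamping negative values to zero."""
    total_ms = max(total_ms, 0)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def shift_cues(vtt_text, offset_ms, start_ms=0, end_ms=None):
    """Shift every cue whose start time falls in [start_ms, end_ms) by offset_ms."""
    out = []
    for line in vtt_text.splitlines():
        if "-->" in line:  # cue timing line
            stamps = list(TIMESTAMP.finditer(line))
            if len(stamps) >= 2:
                cue_start = to_ms(stamps[0])
                in_range = cue_start >= start_ms and (end_ms is None or cue_start < end_ms)
                if in_range:
                    # Move both the start and the end timestamp by the same offset.
                    line = TIMESTAMP.sub(lambda m: format_ms(to_ms(m) + offset_ms), line)
        out.append(line)
    return "\n".join(out)
```

For example, `shift_cues(text, 1500, start_ms=60_000)` would push every cue starting at or after the one-minute mark 1.5 seconds later and leave earlier cues untouched. This only helps when a whole block has drifted by a roughly constant amount; cues that are bunched together still need individual edits.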
Russell Silber
@Alastair Budge: that’s a great tip, I’ll play around with that as a workaround.
Toggle Rat
@Yoz Grahame: I didn't know this was a thing. Thanks!
Mark Bramhill
Been a fan of Descript since the alphas, and this is one of the biggest gaps that's still in the app. Sure it's annoying when there are errors, but as long as there's a way to fix them it's just a mild annoyance. When there's nothing I can do to fix things, it's maddening.
Trilly
Very surprised this hasn't been addressed yet - this would enable me to get far more out of Descript than I currently can.
I'm not using it for much more than transcription atm, and don't feel I can seriously use it to produce finished audio without this feature.
Andrew Mason
under review
Kyle Frager
I was super excited to use Descript when I found it, but this exact issue is the reason why I will not be paying for a subscription when my free trial ends. This seems like a major issue that should have been fixed right away.
Russell Silber
This is still a major problem, I just submitted a ticket today. And I work with very clean audio off good mics, good diction, no accents, etc... We primarily use the transcription for captioning and the out-of-sync text ruins it. I think the ability to merge/split/move the word tabs along the top of the timeline is a crucial feature that needs to be added.

Karl Blattmann
Bingo! I really need this in one case, and I'm not sure how to work around it.
Ryan Nantell
I posted a related request at link below... Descript really struggles with noisy audio too, where it will assume there's no speaker/words for long stretches, making that audio impossible to work with. Really need the ability to "correct text" on those kinds of files but Descript currently does not allow that. You're just stuck with a blank transcript for that part.
Lynn Friedman
This is huge - we are looking at using another tool for this very reason. We are working with archival audio, and sometimes background noise, cross-talk or bad mics prevent proper alignment for certain sections of the recording, even after remastering the audio. That said - it is pleasantly surprising how often the alignment does work well. But for those times when it doesn't - we need a way to manually "re-pin" a word to a new timestamp so that when we generate captions, they bear some resemblance to the audio in those challenging sections.