Real-time dubbing is a service that takes in streamed audio and returns translated speech. Its main challenges are translation accuracy and preserving the original speaker's emotion, especially in sports and news broadcasting. Sports events, which have a global audience and are typically watched live, can tolerate some added latency for dubbing because viewers benefit from listening in their native language. Capturing emotion and timing is crucial in sports; current voice cloning models can replicate some aspects, but they do not yet match the emotional depth of live commentators. In news broadcasting, the focus shifts to accuracy and nuance in translation, since some concepts require a cultural sensitivity that automated systems sometimes lack. Future advances in real-time dubbing may draw on additional context such as images and video, or on "emotional transcripts," to improve the delivery of dubbed audio. Conversational dubbing for real-time, in-person conversations is also being explored, using predictive models to anticipate speech and translate it more seamlessly.