Text-to-image (T2I) and text-to-video (T2V) generation have advanced significantly with recent generative models. While T2I models can control subject identity well, extending this capability to T2V remains challenging. Existing T2V methods lack precise control over generated content, particularly for identity-specific generation in human-related scenarios. Efforts to leverage T2I advancements for video generation struggle to maintain…
