Impact Factor (2025): 6.9
DOI Prefix: 10.47001/IRJIET
Text-to-image
and video AI models represent technologies that combine narrative with visual
content. The model works by converting written text (descriptions, sentences or
phrases) into corresponding images or videos. Leveraging advanced deep learning
architectures such as Generative Adversarial Networks (GANs) or Transformers, this
intelligence can interpret content in narratives and generate visual content
consistent with text. In the text-to-image domain, the model creates real
images based on text that describe scenes, objects, or even complex scenes
described in the text. In film, he arranges images or frames to create a
well-rounded, coherent film that suits the narrative. The impact of this
technology is broad, providing powerful tools to transform the content of
content into a graphical representation, expanding content creation, visual
arts, e-commerce, and accessibility for the visually impaired. For producing
high resolution images we have implemented EDSR4X model. The EDSR (Enhanced
Deep Super-Resolution) model is a state-of-the-art architecture specifically
designed for single-image super-resolution tasks. It belongs to the category of
convolutional neural networks (CNNs) and focuses on improving the resolution of
low-quality images.
Country : India
IRJIET, Volume 8, Issue 1, January 2024 pp. 11-14