AI-Driven Image Generation and Virtual Try-On for Personalized Fashion Experiences

Ashwini K. Suganawar

doi:https://doi.org/10.47001/IRJIET/2024.809014

Ashwini K. Suganawar, M. Tech Student, Department of Computer Science & Engineering, Shri Balasaheb Mane Shikshan Prasarak Mandal’s Ashokrao Mane Group of Institutions, Vathar, Kolhapur, India

Vol 8 No 9 (2024): Volume 8, Issue 9, September 2024 | Pages: 112-118

International Research Journal of Innovations in Engineering and Technology

OPEN ACCESS | Research Article | Published Date: 17-09-2024

Abstract

The integration of Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) offers a way to generate personalized, contextually relevant images tailored to specific user preferences. This project combines RAG and LLMs in a scalable image generation pipeline that joins natural language processing with computer vision techniques. The process begins with a RAG model, which combines retrieval-based methods and generative models to produce images that are coherent with the input prompts and enriched with context from external knowledge sources. After image generation, a dedicated preprocessing module resizes and optimizes the images so that they meet the quality standards required for subsequent integration. The next phase detects human upper bodies in photographs using Haar Cascade classifiers, a machine learning-based approach known for its efficiency in real-time object detection. Accurate identification of the upper-body regions is crucial for the following step, in which the generated images are overlaid onto the detected regions using OpenCV, a widely used computer vision library. This integration aligns the images with the contours of the human body, creating a visually realistic and aesthetically pleasing effect. To support user interaction and deployment, the entire process is encapsulated in a Flask application. Flask handles the backend processing, including API requests and image processing tasks, and serves a user-friendly frontend interface through which users interact with the system.
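The detection and overlay steps described in the abstract could look roughly like the sketch below. It is an illustration under stated assumptions, not the author's implementation: the function names `scaled_garment_size` and `overlay_garment` are hypothetical, the choice of OpenCV's bundled `haarcascade_upperbody.xml` file and the `scaleFactor` and `minNeighbors` values are guesses, and the garment is pasted as an opaque rectangle without any transparency handling.

```python
def scaled_garment_size(garment_size, box_width):
    """Scale a garment (width, height) to a target width, keeping aspect ratio.

    Hypothetical helper: the abstract only says images are resized.
    """
    garment_w, garment_h = garment_size
    scale = box_width / garment_w
    # Never return a zero height, which cv2.resize would reject.
    return box_width, max(1, round(garment_h * scale))


def overlay_garment(photo, garment):
    """Paste `garment` over every upper body detected in `photo`.

    Both arguments are BGR images as NumPy arrays (as returned by cv2.imread).
    Returns a modified copy of `photo`.
    """
    # Imported here so the size helper above stays usable without OpenCV.
    import cv2

    # Assumption: the upper-body cascade that ships with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_upperbody.xml"
    )
    gray = cv2.cvtColor(photo, cv2.COLOR_BGR2GRAY)
    # Assumed detection parameters; real values would need tuning.
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    result = photo.copy()
    for (x, y, w, h) in boxes:
        new_w, new_h = scaled_garment_size(
            (garment.shape[1], garment.shape[0]), int(w)
        )
        resized = cv2.resize(garment, (new_w, new_h), interpolation=cv2.INTER_AREA)
        # Crop the garment if it would extend past the photo's edges.
        visible_h = min(new_h, result.shape[0] - y)
        visible_w = min(new_w, result.shape[1] - x)
        result[y:y + visible_h, x:x + visible_w] = resized[:visible_h, :visible_w]
    return result
```

In a Flask deployment such as the one the abstract mentions, a request handler could decode the uploaded photo, call `overlay_garment`, and return the encoded result. Blending with an alpha mask instead of an opaque paste would be a natural refinement.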

Keywords

Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Image Generation, Machine Learning, OpenCV, Deep Learning

Citation of this Article

Ashwini K. Suganawar (2024). AI-Driven Image Generation and Virtual Try-On for Personalized Fashion Experiences. International Research Journal of Innovations in Engineering and Technology - IRJIET, 8(9), 112-118. Article DOI: https://doi.org/10.47001/IRJIET/2024.809014

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
