Bridging the Visual Gap: Integrating Vision Language Models (VLM) and Artificial Intelligence (AI) with Enterprise Resource Planning (ERP) Software System

Abstract

Businesses generate large volumes of visual data (e.g., quality-check photos, warehouse snapshots, invoices, customer images), but traditional Enterprise Resource Planning (ERP) systems, built for structured data, cannot process it. This study explores integrating Vision Language Models (VLMs), AI models that combine computer vision with language processing, into ERP systems to automate tasks such as quality control, inventory monitoring, and document processing. We assess the feasibility of integration with Microsoft Dynamics 365 Business Central, Salesforce, and SAP S/4HANA, and propose an API-driven system architecture. VLMs face precision challenges, and ERP readiness varies: Microsoft Dynamics 365 Business Central needs custom development, Salesforce offers flexible APIs, and SAP S/4HANA is robust but complex. With strategic planning that builds on the strengths of VLMs, organizations can move toward AI-enhanced enterprise systems.

Country: India

1 T Bharath Chandra

  1. Senior Product Owner, Product Managers, Aptean India Pvt Ltd, Bengaluru, India

IRJIET, Volume 9, Special Issue of ICCIS-2025, May 2025, pp. 200-205

doi.org/10.47001/IRJIET/2025.ICCIS-202532

References

  1. Daffodil Software. (2025, March 3). All You Need To Know About Vision Language Models. https://insights.daffodilsw.com/blog/all-you-need-to-know-about-vision-language-models
  2. IBM. (2025, May 6). What Are Vision Language Models (VLMs)? https://www.ibm.com/think/topics/vision-language-models
  3. Alayrac, J.-B., et al. (2022). Flamingo: a Visual Language Model for Few-Shot Learning. arXiv:2204.14198. https://arxiv.org/abs/2204.14198
  4. Bai, J., et al. (2023). Qwen-VL: A Versatile Vision-Language Model. arXiv:2308.12966. https://arxiv.org/abs/2308.12966
  5. Liu, H., et al. (2023). Visual Instruction Tuning (LLaVA). arXiv:2304.08485. https://arxiv.org/abs/2304.08485
  6. Mhaskey, S.V. (2024, December). AI in ERP Systems: Opportunities and Challenges. Int. J. Comp. Eng. Res. Trends, 11(12), 1-9. https://www.ijcert.org/index.php/ijcert/article/view/1036