Chatbot Building with BERT for E-Commerce

Authors: Dr Guru Kesava Dasu Gopisetty, Dusari Eswar Teja, Gunturu Venkata Satish Kumar, Duggirala Rahul Dinesh, Chikkam Akhil

DOI Link: https://doi.org/10.22214/ijraset.2024.59449


Abstract

Building a chatbot powered by BERT (Bidirectional Encoder Representations from Transformers) involves leveraging its pre-trained language understanding abilities to create an interface that mimics human conversation. Developed by Google, BERT marks a significant advancement in natural language processing (NLP), showcasing remarkable performance across a range of tasks. In the era of increasing artificial intelligence (AI) adoption, chatbots have emerged as crucial tools for engaging users, particularly on mobile platforms where they adapt to different contexts and communication modes, including text and voice. BERT's bidirectional architecture allows it to grasp word meanings within their surrounding context, thanks to its extensive pre-training on vast textual datasets. Fine-tuning BERT for chatbot applications involves training it on a dataset containing user queries paired with suitable responses, with annotations indicating response appropriateness. Tokenization, a crucial preprocessing step, breaks sentences down into smaller tokens that BERT can process. The chatbot architecture integrates BERT, potentially incorporating additional layers to enhance context understanding and response generation. The model is then fine-tuned on the prepared dataset, with hyperparameters adjusted for optimal performance. Evaluation of the chatbot typically involves testing it on a validation set or through interactive sessions to assess its effectiveness. Any necessary refinements to the architecture or fine-tuning process are guided by performance analysis. Ultimately, deploying the chatbot involves seamless integration into real-world platforms such as web or mobile applications, enabling smooth interaction between users and the chatbot across various scenarios, all while prioritizing originality and integrity in the development process.


I. INTRODUCTION

Creating a chatbot with BERT (Bidirectional Encoder Representations from Transformers) entails harnessing its pretrained language understanding capabilities to develop a conversational interface that emulates human interaction. Originating from Google, BERT stands as a significant leap forward in natural language processing (NLP), showcasing exceptional performance across diverse tasks. With the rising prominence of artificial intelligence (AI), chatbots have become indispensable tools for user engagement, particularly on mobile platforms, where they adapt to different contexts and communication modalities, including text and voice. BERT's bidirectional design enables it to grasp word meanings within their contextual framework, owing to its extensive pretraining on vast amounts of textual data.

Fine-tuning BERT for chatbot functionalities involves training it on a dataset comprising user queries paired with appropriate responses, along with annotations indicating response suitability. Tokenization plays a pivotal role in preprocessing, breaking sentences down into smaller units, or tokens, that BERT can process. The architecture of the chatbot integrates BERT, potentially incorporating additional layers for enhanced context comprehension and response generation. The model is then fine-tuned on the prepared dataset, with hyperparameters adjusted to optimize performance. Evaluation of the chatbot typically entails testing it on a validation set or through interactive sessions to gauge its efficacy. Any necessary refinements to the architecture or fine-tuning process are informed by performance analysis. Finally, deploying the chatbot involves seamless integration into real-world platforms like web or mobile applications, facilitating fluid interaction between users and the chatbot across various scenarios, all while ensuring originality and integrity in the development process.
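The tokenization and data-preparation steps described above can be sketched in plain Python. This is only an illustration, not the authors' implementation: the vocabulary, the helper names (wordpiece, tokenize, encode_pair) and the example sentences are all hypothetical. It shows greedy longest-match-first subword splitting in the style of BERT's WordPiece tokenizer, and the packing of a query and a candidate response into a single labelled input with [CLS] and [SEP] markers.

```python
import re

# Toy vocabulary (hypothetical). Real BERT vocabularies hold roughly 30,000 entries.
VOCAB = {"[PAD]", "[UNK]", "[CLS]", "[SEP]", "where", "is", "my", "order",
         "your", "ship", "##ped", "##ping", "refund", "##s", "today", "?", "."}


def wordpiece(word, vocab=VOCAB):
    """Split one lowercase word by greedy longest-match-first lookup."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def tokenize(text, vocab=VOCAB):
    """Lowercase, split off punctuation, then apply wordpiece to each word."""
    tokens = []
    for word in re.findall(r"\w+|[^\w\s]", text.lower()):
        tokens.extend(wordpiece(word, vocab))
    return tokens


def encode_pair(query, response, label, max_len=16):
    """Pack a (query, response, label) example as a BERT-style sentence pair."""
    q, r = tokenize(query), tokenize(response)
    tokens = ["[CLS]"] + q + ["[SEP]"] + r + ["[SEP]"]
    segment_ids = [0] * (len(q) + 2) + [1] * (len(r) + 1)
    # Naive truncation (assumption): a real pipeline would keep the final [SEP].
    tokens, segment_ids = tokens[:max_len], segment_ids[:max_len]
    pad = max_len - len(tokens)
    return {
        "tokens": tokens + ["[PAD]"] * pad,
        "segment_ids": segment_ids + [0] * pad,
        "attention_mask": [1] * len(tokens) + [0] * pad,
        "label": label,  # 1 = appropriate response, 0 = inappropriate
    }


example = encode_pair("Where is my order?", "Your order shipped today.", label=1)
print(example["tokens"])
```

In a real system the tokens would be mapped to integer ids and fed to a pre-trained BERT encoder with a small classification layer on top. The sketch stops at the input format because that is the part the text describes.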

II. LITERATURE SURVEY

Artificial intelligence (AI) plays an important role in today’s technological environment, especially when combined with natural language processing (NLP) and machine learning algorithms.[1]

This paper presents an in-depth analysis of the use of AI in chatbots on different platforms that provide different services to different users. The emphasis is on design methods and learning methods. The area deals with computer applications that use AI to simulate human decision making and provide a wide range of services.[2]

The study addresses the importance of AI-powered chatbots, highlighting the platforms used in their development. Applications vary depending on the intended tasks and focus on adapting to the needs of the user. A key feature discussed is the ability of chatbots to gain experience by learning from previous interactions.[3]

Different algorithms are used to optimize chatbots and improve their performance. Training data is an important resource: it enables a chatbot to consult its knowledge base and return accurate responses to user queries through client-side applications.[4]

The study examines contemporary approaches to chatbot development and introduces a new framework to address the challenges associated with sequential models. The proposed model incorporates conditional Wasserstein Generative Adversarial Networks and a transformer model for answer generation in chatbots.[5]

Artificial intelligence, together with NLP and machine learning algorithms, plays a major role in today's technology. A chatbot is a computer program that uses artificial intelligence to imitate human decision making and to provide various kinds of services, and this forms the basis for the survey of artificial intelligence in chatbots. The paper therefore surveys the different platforms used to build chatbots that provide various kinds of services to different kinds of users. The design techniques for building a chatbot depend on the services it is meant to provide. The chatbot gains experience by learning from past interactions using various algorithms. It can be trained on data, which enables it to consult the knowledge base and return accurate results for the user's query through client-side applications.[6]

By embedding transformers in both the generator and the discriminator, this architecture stands out as a leading approach in generative chatbots. Various NLP techniques and models are available. Another important area covered in the literature review is the use of deep learning and neurolinguistic models in multi-domain transfer learning scenarios.[7]

BERT stands for Bidirectional Encoder Representations from Transformers. Although earlier architectures such as RNNs and CNNs are competent, the Transformer is considered a significant improvement because it does not have to process a sequence token by token in a fixed order, as RNNs in particular do. Because Transformers can process all positions of a sequence in parallel, they enable training on larger amounts of data than was previously practical. This, in turn, facilitated the creation of pre-trained models like BERT, which was trained on massive amounts of language data prior to its release.[8]
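The parallel processing described above comes from self-attention, which computes each position's output from all positions at once, with no step-by-step recurrence. The following is a minimal sketch under simplifying assumptions: it uses identity projections in place of the learned query, key and value matrices, a single head, and no position embeddings. The function names are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x is a list of token vectors (lists of floats). Returns (outputs, weights),
    where weights[i][j] is how strongly position i attends to position j.
    """
    d = len(x[0])
    outputs, weights = [], []
    for query in x:  # each iteration is independent, so it could run in parallel
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in x]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * value[i] for w_j, value in zip(w, x))
                        for i in range(d)])
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
print(weights[0])
```

Without position embeddings this computation is indifferent to input order: reversing the input simply reverses the output. BERT restores word-order information by adding position embeddings to the token embeddings before the first attention layer.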

A chatbot serves as a communication tool between a human user and a machine, producing an appropriate answer based on the human input. In more recent approaches, a combination of Natural Language Processing and sequential models is used to build a generative chatbot.

The main challenge of these models is their sequential nature, which leads to less accurate results. To tackle this challenge, the cited work proposes a novel architecture using conditional Wasserstein Generative Adversarial Networks and a transformer model for answer generation in chatbots. While the generator of the proposed model consists of a full transformer model to generate an answer, the discriminator includes only the encoder part of a transformer model followed by a classifier. According to its authors, this is the first time that a generative chatbot has been proposed using the embedded transformer in both generator and discriminator models. Relying on the parallel computing of the transformer model, the results of the proposed model on the Cornell Movie-Dialog corpus and the Chit-Chat datasets confirm the superiority of the proposed model compared to state-of-the-art alternatives using different evaluation metrics.[9]
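For orientation, the adversarial objective behind such a model can be sketched with the standard Wasserstein GAN losses. This is the generic textbook form, not the cited work's exact training procedure: the text above does not say how the Lipschitz constraint is enforced (weight clipping or a gradient penalty, for example). The conditioning on the user's question is assumed to happen inside the scoring model. The scores are plain floats standing in for discriminator outputs.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores):
    """Loss minimized by the discriminator (critic).

    Minimizing it raises the scores of real answers and lowers the scores
    of generated answers.
    """
    return mean(fake_scores) - mean(real_scores)


def generator_loss(fake_scores):
    """Loss minimized by the generator: raise the critic's score on its answers."""
    return -mean(fake_scores)


# Hypothetical critic scores for a batch of two real and two generated answers.
real = [2.0, 4.0]
fake = [1.0, 1.0]
print(critic_loss(real, fake), generator_loss(fake))
```

In the architecture described above, the scores would come from the transformer encoder plus classifier acting as discriminator, and the generated answers from the full transformer generator.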

III. PROPOSED METHODOLOGY

Our project revolves around an Artificial Intelligence-powered chatbot designed to enhance user interactions and provide valuable web services. Built using Python, the software offers a user-friendly interface, simplifying connections to the internet and ensuring accessibility to reliable online services. Specifically, we have developed a sample chatbot using Python, tailored for Twitch, an online platform offering chatbot services to clients. This web-based platform offers a vast array of intelligent capabilities, facilitating problem-solving simulations for users. Users can interact with our chatbot to inquire about various topics or seek assistance with queries. Our methodology uses a BERT model as the core of the chatbot, together with interactive functionality. The backend operations are handled using Python, ensuring seamless functioning. Furthermore, our chatbot integrates various machine learning algorithms, enabling it to learn from user interactions and requests and thus continually improve its performance and adaptability.
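One way a Python backend of this kind could answer a query is to score the user's message against a small knowledge base and return the best-matching reply. The sketch below is a hypothetical illustration, not the system described above. The FAQ entries, the threshold and the function names are invented for the example. The word-overlap scorer is only a stand-in for the relevance score a fine-tuned BERT model would produce.

```python
from typing import Callable, List, Tuple

# Hypothetical knowledge base of (known question, reply) pairs.
FAQ: List[Tuple[str, str]] = [
    ("how do i track my order", "You can track your order from the Orders page."),
    ("what is the return policy", "Items can be returned within the stated return window."),
    ("how do i reset my password", "Use the 'Forgot password' link on the sign-in page."),
]

FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


def overlap_score(query: str, known_question: str) -> float:
    """Placeholder scorer: Jaccard overlap of lowercase words.

    A fine-tuned BERT model would replace this with a learned relevance score.
    """
    q = set(query.lower().split())
    k = set(known_question.lower().split())
    if not q or not k:
        return 0.0
    return len(q & k) / len(q | k)


def answer(query: str,
           faq: List[Tuple[str, str]] = FAQ,
           score: Callable[[str, str], float] = overlap_score,
           threshold: float = 0.2) -> str:
    """Return the reply whose known question scores highest, or a fallback."""
    best_question, best_reply = max(faq, key=lambda item: score(query, item[0]))
    if score(query, best_question) < threshold:
        return FALLBACK  # nothing in the knowledge base is close enough
    return best_reply


print(answer("track my order"))
```

Keeping the scorer as a parameter means the rest of the backend does not change when the placeholder is swapped for a model-based scorer.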

V. ACKNOWLEDGEMENT

We would like to extend our sincerest gratitude to all individuals and entities who have contributed to the successful completion of our project focused on building a chatbot using the BERT (Bidirectional Encoder Representations from Transformers) model.

First and foremost, we express our deepest appreciation to our project supervisor/mentor for their invaluable guidance, expertise, and unwavering support throughout the duration of this project. Their insights and encouragement have been instrumental in shaping the direction and scope of our work.

We are also immensely grateful to our team members for their dedication, collaboration, and collective effort in bringing this project to fruition. Each member has played a crucial role, contributing their skills and expertise to various aspects of the project, from research and development to testing and implementation.

Furthermore, we would like to acknowledge the research community and developers who have contributed to the advancement of natural language processing (NLP) technologies, particularly the development of the BERT model. Their groundbreaking work has paved the way for innovative applications like ours in the field of chatbot development.

Additionally, we extend our appreciation to the open-source community for providing access to tools, libraries, and resources that have facilitated the implementation and experimentation phases of our project. The collaborative spirit of the open-source community has been integral to our success.

Finally, we express our gratitude to our friends, families, and loved ones for their unwavering support, patience, and understanding throughout the project. Their encouragement has been a source of motivation during challenging times.

Together, the collective efforts of all those involved have contributed to the successful completion of our project, and we look forward to further developments and opportunities in the field of AI-powered chatbots.

References

[1] M. Ganesan, Deepika C., and Harivashini B., "A Survey on Chatbots using Artificial Intelligence," in 2020 International Conference on System, Computation, Automation and Networking (ICSCAN), July 2020, doi: 10.1109/ICSCAN49426.2020.9262366.
[2] Nikita Kanodia and Khandakar Entenam, "Question Answering Model Based Conversational Chatbot using BERT Model and Google Dialogflow," in 2021 31st International Telecommunication Networks and Applications Conference (ITNAC), November 2021, doi: 10.1109/ITNAC53136.2021.9652153.
[3] Brucce Neves dos Santos, Ricardo Marcondes Marcacini, and Solange Oliveira Rezende, "Multi-Domain Aspect Extraction Using Bidirectional Encoder Representations," Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos 13566-590, Brazil, July 2021, doi: 10.1109/ACCESS.2021.3089099.
[4] István Üveges and Orsolya Ring, "HunEmBERT: A Fine-Tuned BERT-Model for Classifying Sentiment and Emotion in Political Communication," Centre for Social Sciences, 1097 Budapest, Hungary, May 2023, doi: 10.1109/ACCESS.2023.3285536.
[5] Ahmad F. Subahi, "BERT-Based Approach for Greening Software Requirements Engineering Through Non-Functional Requirements," Department of Computer Science, University College of Al Jamoum, Umm Al-Qura University, Makkah 21421, Saudi Arabia, August 2023, doi: 10.1109/ACCESS.2023.3317798.
[6] R. S. Wallace, "The anatomy of A.L.I.C.E.," in Parsing the Turing Test. Dordrecht, The Netherlands: Springer, 2009, pp. 181–210.
[7] S. Jafarpour and C. J. C. Burges, "Filter, rank, and transfer the knowledge: Learning to chat," Microsoft Research, Redmond, WA, USA, Tech. Rep. MSR-TR-2010-93, 2010. [Online]. Available: https://www.microsoft.com/en-us/research/publication/filter-rank-andtransfer-the-knowledge-learning-to-chat/
[8] Z. Yan, N. Duan, J. Bao, P. Chen, M. Zhou, Z. Li, and J. Zhou, "DocChat: An information retrieval approach for chatbot engines using unstructured documents," in Proc. 54th Annu. Meeting Assoc. Comput. Linguistics, vol. 1. Berlin, Germany: Association for Computational Linguistics, Aug. 2016, pp. 516–525. [Online]. Available: https://aclanthology.org/P16-1049
[9] R. Al-Rfou, M. Pickett, J. Snaider, Y.-H. Sung, B. Strope, and R. Kurzweil, "Conversational contextual cues: The case of personalization and history for response ranking," 2016, arXiv:1606.00372.
[10] E. Riloff and J. Wiebe, "Learning extraction patterns for subjective expressions," in Proc. Conf. Empirical Methods Natural Lang. Process. Stroudsburg, PA, USA: Association for Computational Linguistics, 2003, pp. 105–112.
[11] H. Yu and V. Hatzivassiloglou, "Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences," in Proc. Conf. Empirical Methods Natural Lang. Process. Stroudsburg, PA, USA: Association for Computational Linguistics, 2003, pp. 129–136.
[12] R. Feldman, "Techniques and applications for sentiment analysis," Commun. ACM, vol. 56, no. 4, pp. 82–89, Apr. 2013.
[13] I. P. Matsuno, R. G. Rossi, R. M. Marcacini, and S. O. Rezende, "Aspect-based sentiment analysis using semi-supervised learning in bipartite heterogeneous networks," J. Inf. Data Manage., vol. 7, no. 2, p. 141, 2016.
[14] R. M. Marcacini, R. G. Rossi, I. P. Matsuno, and S. O. Rezende, "Cross-domain aspect extraction for sentiment analysis: A transductive learning approach," Decis. Support Syst., vol. 114, pp. 70–80, Oct. 2018.
[15] A. Nazir, Y. Rao, L. Wu, and L. Sun, "Issues and challenges of aspect-based sentiment analysis: A comprehensive survey," IEEE Trans. Affect. Comput., early access, Jan. 30, 2020, doi: 10.1109/TAFFC.2020.2970399.
[16] A. Yadav and D. K. Vishwakarma, "Sentiment analysis using deep learning architectures: A review," Artif. Intell. Rev., vol. 53, no. 6, pp. 4335–4385, Aug. 2020.
[10] O. Irsoy and C. Cardie, "Opinion mining with deep recurrent neural networks," in Proc. Conf. Empirical Meet
[17] A. D. Tran, J. I. Pallant, and L. W. Johnson, "Exploring the impact of chatbots on consumer sentiment and expectations in retail," Journal of Retailing and Consumer Services, 2021. 63: p. 102718.
[18] A. Miklosik, N. Evans, and A. M. A. Qureshi, "The Use of Chatbots in Digital Business Transformation: A Systematic Literature Review," IEEE Access, 2021. 9: p. 106530-106539.
[19] C. W. Okonkwo and A. Ade-Ibijola, "Chatbots applications in education: A systematic review," Computers and Education: Artificial Intelligence, 2021. 2: p. 100033.
[20] E. Mogaji et al., "Emerging-market consumers' interactions with banking chatbots," Telematics and Informatics, 2021. 65: p. 101711.
[21] M.-H. Tsai et al., "Four-Stage Framework for Implementing a Chatbot System in Disaster Emergency Operation Data Management: A Flood Disaster Management Case Study," KSCE Journal of Civil Engineering, 2020. 25.
[22] E. Adamopoulou and L. Moussiades, "Chatbots: History, technology, and applications," Machine Learning with Applications, 2020. 2: p. 10000
[23] Y. Zhu et al., "Proactive Retrieval-Based Chatbots Based on Relevant Knowledge and Goals," SIGIR '21, 2021: p. 2000–2004.
[24] M. Dhyani and R. Kumar, "An intelligent Chatbot using deep learning with Bidirectional RNN and attention model," Materials Today: Proceedings, 2021. 34: p. 817-824.
[25] Y. Wang et al., "Augmenting Dialogue Response Generation With Unstructured Textual Knowledge," IEEE Access, 2019. 7: p. 1.

Copyright

Copyright © 2024 Dr Guru Kesava Dasu Gopisetty, Dusari Eswar Teja, Gunturu Venkata Satish Kumar, Duggirala Rahul Dinesh, Chikkam Akhil. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET59449

Publish Date : 2024-03-26

ISSN : 2321-9653

Publisher Name : IJRASET
