Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Shelar Aniket, Kowe Ankit, Prof Hiranwale S. B
DOI Link: https://doi.org/10.22214/ijraset.2024.62586
Abstract: In the medical field, images of the body are essential, but answering questions about these images to support diagnosis is difficult. We present a model called CGMVQA to help with this task. It handles different types of questions, both sorting them into categories and generating answers. We use dedicated techniques to interpret written questions and to make images clearer, and we made the model run faster by adjusting some of its settings. The model performs best among existing approaches at answering medical image questions, such as identifying what an image shows or matching words to images, which means doctors can use it to support diagnoses. BioGPT is another helpful tool, well suited to scientists and teachers in biology. It gives quick and accurate answers to complex biology questions. As the technology improves, BioGPT will help us learn more about life and speed up biological research.
I. INTRODUCTION
Among recent advances in artificial intelligence (AI), OpenAI garnered attention with the launch of ChatGPT, an innovative chatbot, in November 2022. A lesser-known yet equally significant unveiling came from Microsoft in January of the following year with the introduction of BioGPT. Unlike ChatGPT, BioGPT is tailored specifically to the biomedical domain, offering an AI tool designed to evaluate biomedical research and provide insights into complex biomedical queries. As a generative language model trained on millions of published biomedical research articles, BioGPT can extract relevant information, generate text, and answer biomedical questions. Its introduction marks a significant step in giving researchers an AI-driven resource for gaining fresh perspectives and strengthening biomedical research. In this paper, we examine the development and capabilities of BioGPT, a generative pretrained Transformer language model, and highlight its potential for biomedical text generation and analysis.
II. PROJECT SCOPE
The project aims to develop a large language model that can generate and understand biomedical text in a comprehensive and informative way.
This includes the following goals:
III. MATHEMATICAL MODEL
A mathematical model for BioGPT could involve various mathematical techniques and algorithms to represent the underlying processes involved in natural language processing (NLP) and machine learning. Since BioGPT is designed for bioinformatics applications, the mathematical model would need to incorporate domain-specific knowledge and techniques tailored to analyzing biological data.
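As one concrete instance of such a model, and assuming BioGPT follows the standard autoregressive objective of GPT-style language models (the notation here is illustrative and not taken from the source), the training loss can be written as the negative log-likelihood of each token given the tokens before it:

```latex
\begin{aligned}
P_\theta(x_1,\dots,x_n) &= \prod_{i=1}^{n} P_\theta(x_i \mid x_1,\dots,x_{i-1}) \\
\mathcal{L}(\theta) &= -\sum_{i=1}^{n} \log P_\theta(x_i \mid x_1,\dots,x_{i-1})
\end{aligned}
```

Here $x_1,\dots,x_n$ are the tokens of a biomedical text and $\theta$ denotes the model parameters. Minimising $\mathcal{L}(\theta)$ over a corpus trains the model to predict the next token, which is what allows it to generate text.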
The BioGPT system architecture consists of the following components:
a. Data Layer: Stores the data used to train and deploy BioGPT. This data includes a large dataset of unlabeled biomedical text and a labeled dataset for fine-tuning BioGPT on specific biomedical tasks.
b. Training Layer: Trains the BioGPT model on data from the data layer so that it can perform a variety of tasks, such as generating text, answering questions, and extracting relationships between entities in biomedical text.
c. Inference Layer: Deploys the trained BioGPT model and makes it available to users. It receives requests from users and uses the model to generate responses.
d. API Layer: Provides the way for users to interact with the BioGPT system. It exposes a set of endpoints that users can call to generate text, answer questions, and extract relationships between entities in biomedical text.
The BioGPT system architecture is designed to be scalable and reliable. The data layer is distributed across multiple servers to ensure that it can handle a large volume of data. The training layer is also distributed across multiple servers to speed up the training process. The inference layer is stateless, which means that it can be scaled horizontally to handle a large number of concurrent requests.
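The layering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual BioGPT implementation: `StubModel` is a hypothetical placeholder that simply memorises labeled pairs, and the names `DataLayer`, `train`, `infer`, and `api_endpoint` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DataLayer:
    """Holds unlabeled text and labeled (input, target) pairs."""
    unlabeled: List[str]
    labeled: List[Tuple[str, str]]


class StubModel:
    """Hypothetical stand-in for BioGPT: memorises labeled pairs."""

    def __init__(self) -> None:
        self.answers: Dict[str, str] = {}

    def fit(self, labeled: List[Tuple[str, str]]) -> None:
        for text, target in labeled:
            self.answers[text.lower()] = target

    def generate(self, prompt: str) -> str:
        return self.answers.get(prompt.lower(), "unknown")


def train(data: DataLayer) -> StubModel:
    """Training layer: builds a model from the data layer."""
    model = StubModel()
    model.fit(data.labeled)
    return model


def infer(model: StubModel, request: Dict[str, str]) -> Dict[str, str]:
    """Inference layer: stateless, so every call depends only on its arguments."""
    return {"task": request["task"], "output": model.generate(request["text"])}


SUPPORTED_TASKS = {"generate", "answer", "extract"}


def api_endpoint(model: StubModel, task: str, text: str) -> Dict[str, str]:
    """API layer: validates the task name, then delegates to inference."""
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {task}")
    return infer(model, {"task": task, "text": text})
```

Because `infer` keeps no state between calls, any number of copies can serve requests in parallel, which is the property that makes horizontal scaling possible.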
IV. SOFTWARE AND HARDWARE REQUIREMENTS
A. Software Requirements
1. Cloud Computing Platforms
a. Amazon Web Services (AWS): AWS offers a comprehensive set of cloud computing services, including compute, storage, database, and machine learning services. BioGPT can leverage AWS Lambda for serverless computing, Amazon S3 for scalable storage, and Amazon SageMaker for machine learning model training and deployment.
b. Microsoft Azure: Azure provides a wide range of cloud services, including virtual machines, databases, AI services, and DevOps tools. BioGPT can utilize Azure Functions for serverless computing, Azure Blob Storage for data storage, and Azure Machine Learning for model training and deployment.
2. Containerization Platforms
a. Docker: Docker enables packaging BioGPT and its dependencies into lightweight, portable containers that can run consistently across different environments. BioGPT containers can be deployed on-premises or in cloud environments, providing flexibility and portability.
3. Serverless Computing Platforms
a. AWS Lambda: AWS Lambda allows running BioGPT functions without provisioning or managing servers. It offers automatic scaling, fine-grained billing based on usage, and seamless integration with other AWS services.
b. Azure Functions: Azure Functions provides serverless computing capabilities similar to AWS Lambda, enabling BioGPT to execute code in response to events with automatic scaling and pay-per-use pricing.
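A serverless deployment of the kind described above might look like the following sketch. It assumes the event arrives in the API Gateway proxy format (a JSON string under `"body"`), and `generate_text` is a hypothetical placeholder for a call to a loaded BioGPT model, not a real API.

```python
import json


def generate_text(prompt: str) -> str:
    """Hypothetical placeholder for invoking a loaded BioGPT model."""
    return f"[model output for: {prompt}]"


def _response(status: int, payload: dict) -> dict:
    """Builds a response in the shape API Gateway proxy integration expects."""
    return {"statusCode": status, "body": json.dumps(payload)}


def lambda_handler(event, context):
    """Entry point invoked by AWS Lambda for each request."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "invalid JSON"})

    if not isinstance(body, dict):
        return _response(400, {"error": "body must be a JSON object"})

    prompt = body.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        return _response(400, {"error": "missing 'prompt'"})

    return _response(200, {"completion": generate_text(prompt)})
```

The handler holds no per-request state, so the platform can run as many concurrent instances as demand requires. An Azure Functions version would follow the same pattern with that platform's own request and response types.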
B. Hardware Requirements
2. Memory (RAM)
3. Storage
4. Networking
5. Operating System
V. ALGORITHM DETAILS
A. Transformer Architecture
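As a minimal sketch of the Transformer's core operation, the following assumes the standard scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, with a causal mask as used in GPT-style decoders. The function names are illustrative and this is not BioGPT's own code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V, causal=True):
    """Scaled dot-product attention over lists of row vectors.

    With causal=True, position i attends only to positions 0..i,
    which is what lets a decoder generate text left to right.
    """
    d_k = len(K[0])
    outputs = []
    for i, q in enumerate(Q):
        limit = i + 1 if causal else len(K)
        scores = [
            sum(a * b for a, b in zip(q, K[j])) / math.sqrt(d_k)
            for j in range(limit)
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors visible to position i.
        outputs.append([
            sum(w * V[j][c] for j, w in enumerate(weights))
            for c in range(len(V[0]))
        ])
    return outputs
```

Each output row is a weighted average of the value vectors, with weights determined by how closely the query matches each key.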
B. Masked Language Modeling (MLM)
C. Next Sentence Prediction (NSP)
D. Contrastive Learning of Biomedical Entities (CLBE)
VI. FUTURE SCOPE
BioGPT emerges as an indispensable asset for fostering inspired innovations across diverse sectors. Its capacity to swiftly generate creative solutions to existing challenges empowers individuals and organizations to maintain a competitive edge within their industries. Moreover, BioGPT's versatility transcends boundaries, rendering it applicable across various fields of study and industries. Whether seeking novel ideas or solutions, BioGPT stands out as the ultimate tool for innovation. Its rapid ideation capabilities revolutionize conventional problem-solving approaches, promising transformative outcomes. Furthermore, BioGPT contributes to advancing science communication and education by simplifying intricate biological concepts, thereby facilitating broader accessibility and understanding. As a catalyst for innovation and knowledge dissemination, BioGPT epitomizes the potential of AI in driving positive change and progress.
Copyright © 2024 Shelar Aniket, Kowe Ankit, Prof Hiranwale S. B. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET62586
Publish Date : 2024-05-23
ISSN : 2321-9653
Publisher Name : IJRASET