Agriculture is a major sector of the Indian economy. A prevalent problem among Indian farmers is that crops are often not selected according to the nutrient content of the soil, which makes high productivity difficult to achieve. Smart agriculture has emerged as a solution to this problem. It is a contemporary farming approach that uses research data on soil characteristics, soil types, and crop yields to suggest suitable crops to farmers based on their specific soil parameters. This approach reduces the likelihood of choosing unsuitable crops and improves overall productivity. Our ongoing project aims to develop an intelligent system that helps Indian farmers make well-informed crop selection decisions based on their farm's geographical location and soil characteristics. The system will also provide fertilizer recommendations for the recommended crops to support sound agricultural practice.
I. INTRODUCTION
Agriculture is a major contributor to the Indian economy and an important occupation in India. More than 60% of the country's land is used for agriculture, which feeds more than 1.3 billion people. Agriculture is a primary activity that includes growing crops, vegetables, and fruits and rearing livestock. Crops need soil, so soil is a central factor in agriculture, and soil health is essential for good crop production: soil provides roots with essential nutrients, water, oxygen, and support. India has several soil types with different nutrient profiles, including black soil (generally poor in phosphorus), red soil (rich in potash), laterite soil (deficient in nitrogen and potash), and alluvial soil (rich in potash but poor in phosphorus). Numerous studies have aimed to improve agricultural practices. Smart farms operate very differently from farms of a few decades ago, primarily because of advances in sensors, devices, machines, and information technology. Today's agriculture uses technologies such as temperature and moisture sensors, machine learning, and a range of IoT devices. These technologies help businesses and farmers become more profitable and more environmentally friendly. Various machine learning techniques can be used to recommend crops. Environmental data collected by remote sensors is processed by algorithms into statistics that are easy to interpret, which helps farmers make decisions and keep track of their farms. The ultimate aim is for farmers to use these technologies to improve their harvests by making better-informed decisions in the field.
II. RELATED WORK
The literature reports many works in this domain.
Sameer M. Patel, Mittal B. Jain, and Sagar D. Korde propose a system built around the Internet of Things (IoT). The system uses several sensors to collect data and processes the sensed data with an ATmega328P microcontroller. The data received from the hardware device is then analysed using different machine learning algorithms, and a simple UI tells the farmer what to do next. However, the user has to enter the NPK values manually, which adds overhead for the user.
Geo Abraham, Raksha R., and M. Nithya propose a system that uses a NodeMCU microcontroller to collect data from the sensors. They implement Agri Stick, which is used to monitor various crop parameters, and the data obtained is uploaded to the cloud. The data is then retrieved from the cloud, and crop prediction is performed using three machine learning techniques: the k-nearest neighbours (KNN) algorithm, support vector machine, and logistic regression. The reported accuracy of these algorithms is up to 85%, which leaves scope for more accurate algorithms.
Shafiulla Shariff and Pushpa H propose a system in which sensed data is received from sensors and classification models such as Decision Tree, Random Forest, Naïve Bayes, and KNN are used to predict a suitable crop. The approach recommends the optimum crop based on a few soil nutrients, and a performance analysis of the models is carried out. However, no suggestion template is developed to advise on the measures needed to maximize productivity.
Fahad Kamraan Syed, Ajay Kumar, and Agniswar Paul propose a system that uses a Raspberry Pi, a low-cost, credit-card-sized single-board computer. All the sensors, i.e. the temperature sensor, soil moisture sensor, humidity sensor, and wind sensor, are connected to the GPIO pins of the Pi to collect real-time information. Taking the soil type and the weather conditions of the current season into account, the solution predicts the best crop for the soil and the minerals and nutrients that have to be added to it.
III. METHODOLOGY
A. Collection of Dataset
Healthy crop cultivation depends on factors such as temperature, humidity, soil pH, soil nutrients, and soil moisture. These conditions must be satisfied to obtain a high yield, but the required values vary with the crop and the soil. The dataset for this system was taken from Kaggle. It is a crop recommendation dataset that lists various crops together with the parameters that decide which crop is suitable for growing: N (nitrogen content of the soil), P (phosphorus content of the soil), K (potassium content of the soil), temperature, humidity, and pH value.
B. Data Cleaning and Preprocessing
The first step is to ensure that the dataset is accurate. The dataset should not contain missing values; if it does, they should be replaced with suitable values. The distribution of each feature should be checked to see whether it is approximately normal, and outliers should be removed to improve the quality of the dataset.
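The two cleaning steps described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code: median imputation and the 1.5 × IQR outlier rule are assumed choices, the function names are hypothetical, and the pH readings are made-up example values.

```python
from statistics import median, quantiles

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences placed k * IQR beyond the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outlier_rows(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies inside the IQR fences."""
    low, high = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if low <= row[column] <= high]

# Made-up pH readings: None marks a missing value, 14.0 an implausible reading.
ph = [6.5, 6.8, None, 7.0, 6.2, 6.9, 14.0, 6.6]
ph_filled = impute_missing(ph)
rows = [{"ph": value} for value in ph_filled]
cleaned = drop_outlier_rows(rows, "ph")
```

The median is used for imputation here because, unlike the mean, it is not pulled towards extreme readings that are removed in the next step.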
C. Machine Learning Algorithms
Decision Tree: The Decision Tree is a widely used supervised learning technique. It is best known for classification problems, although it can also be used for regression. It works by asking a yes/no question about a feature at each node and splitting the data into smaller nodes according to the answer. The split at each node is chosen either by measuring impurity (Gini impurity) or by measuring the change in entropy. Decision Trees are prone to overfitting, which lowers accuracy on unseen data. This problem can be mitigated by using the random forest algorithm.
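The two split criteria can be illustrated with a short sketch. The function names and the crop labels are hypothetical; the formulas are the standard definitions of Gini impurity and entropy, and the gain of a split is the parent impurity minus the size-weighted impurity of the child nodes.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())

def split_gain(parent, left, right, impurity):
    """Impurity of the parent minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

# Hypothetical node with two crops that a single question separates perfectly.
parent = ["rice", "rice", "maize", "maize"]
left, right = ["rice", "rice"], ["maize", "maize"]
gini_gain = split_gain(parent, left, right, gini)
entropy_gain = split_gain(parent, left, right, entropy)
```

A tree-building procedure would evaluate this gain for every candidate question and keep the one with the largest value.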
Random Forest: Random Forest is a supervised machine learning algorithm used for both classification and regression problems. It is based on bagging: multiple decision trees are built on random resamples of the training data, and their outputs are combined, by majority vote for classification or by averaging for regression, to give the final output. Since individual decision trees are prone to overfitting, random forests mitigate its impact and consequently give more accurate results. However, random forests are not specifically designed for imbalanced datasets, and their performance depends on hyperparameter tuning.
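The bagging idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a full random forest: each "tree" is a one-question threshold rule on a single feature, there is no random feature selection, and the humidity values, crop names, and function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(samples):
    """Fit a one-question rule `x <= threshold` with the fewest training errors."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        left = [y for x, y in samples if x <= threshold]
        right = [y for x, y in samples if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label

def fit_bagged(samples, n_models=25, seed=0):
    """Train each model on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(fit_stump(bootstrap))
    return models

def predict_bagged(models, x):
    """Combine the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical (humidity in %, crop) training pairs.
data = [(20, "chickpea"), (25, "chickpea"), (30, "chickpea"), (35, "chickpea"),
        (70, "rice"), (75, "rice"), (80, "rice"), (85, "rice")]
models = fit_bagged(data)
dry_choice = predict_bagged(models, 22)
humid_choice = predict_bagged(models, 82)
```

Because every model sees a slightly different resample, an unusual sample that misleads one model is outvoted by the others, which is how bagging reduces overfitting.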
XGBoost: XGBoost is one of the most popular algorithms in use today because of its accuracy and speed. It is a tree-based algorithm that uses the gradient boosting technique. Trees are added one at a time, and each new tree is trained to correct the errors made by the trees built before it. This iterative process improves the ensemble and contributes to higher accuracy.
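The iterative error-correcting process can be illustrated with plain gradient boosting on a squared-error regression task. This sketch is not XGBoost itself, which adds regularization and second-order gradient information; it uses one-split regression stumps, a regression target for simplicity, and made-up data, and all names are hypothetical.

```python
def fit_regression_stump(xs, residuals):
    """Pick the threshold whose left/right means give the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_regression_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(stump(x) for stump in stumps)

def mean_squared_error(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up one-feature data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_boosted(xs, ys)
baseline_mse = mean_squared_error(lambda x: sum(ys) / len(ys), xs, ys)
boosted_mse = mean_squared_error(model, xs, ys)
```

For squared error, the residuals are the negative gradient of the loss, which is why fitting each new stump to the residuals counts as a gradient step.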
LightGBM: LightGBM is an optimized gradient boosting framework built on decision trees. It improves training efficiency and reduces memory consumption. LightGBM uses two key techniques, Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), to overcome the limitations of the histogram-based algorithms commonly employed in gradient boosting decision tree frameworks.
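Gradient-based One-Side Sampling can be sketched as follows. This is a simplified illustration of the idea, not LightGBM's implementation: the instances with the largest absolute gradients are all kept, a random fraction of the remaining instances is sampled, and the sampled instances are up-weighted by (1 - a) / b, where a and b are the two sampling rates. The function name, parameter names, and gradient values are hypothetical.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by Gradient-based One-Side Sampling."""
    n = len(gradients)
    # Order instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top = order[:n_top]    # large-gradient instances are always kept
    rest = order[n_top:]   # small-gradient instances are only sampled
    sampled = random.Random(seed).sample(rest, n_other)
    # Up-weight the sampled instances to compensate for the ones left out.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights

# Made-up per-instance gradients from a boosting round.
gradients = [0.05, -2.0, 0.1, 1.5, -0.02, 0.3, -0.01, 0.2, 0.04, -0.15]
indices, weights = goss_sample(gradients)
```

Only 3 of the 10 instances are used to evaluate splits in this example, while the weighting keeps the estimated gain close to what the full data would give.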
D. Crop Prediction
A machine learning model is used to determine the most suitable crop for a specific soil type. The crop recommendation model is trained using data collected from Arduino sensors, and machine learning algorithms identify the crop with the highest probability of yielding well. The Light Gradient Boosting Machine (LightGBM) algorithm is used to select the best crop for the given land by analyzing humidity, temperature, soil moisture, pH level, and soil nutrients. Based on this analysis, the model recommends the specific crops that farmers should cultivate.
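Selecting the crop with the highest probability can be sketched as a ranking over the per-crop scores returned by a classifier. The function name, the crop names, and the probabilities below are hypothetical; in the real system the scores would come from the trained model.

```python
def top_crops(probabilities, k=3):
    """Return the k (crop, probability) pairs with the highest probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Hypothetical class probabilities for one soil sample.
scores = {"rice": 0.62, "maize": 0.21, "chickpea": 0.12, "cotton": 0.05}
best_crop, best_score = top_crops(scores, k=1)[0]
shortlist = [crop for crop, _ in top_crops(scores, k=2)]
```

Returning a short ranked list instead of a single crop lets the farmer weigh other considerations, such as market price, against the model's first choice.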
E. Building UI
Finally, we developed a user interface (UI) through which users can enter their data. The model processes the information entered by the user together with the sensor-collected data and recommends the most suitable crop to cultivate under the given conditions.
IV. SYSTEM ARCHITECTURE
The system will contain a low-level hardware device that measures environmental variables such as temperature, humidity, soil moisture, soil pH, and nutrients such as nitrogen (N), phosphorus (P), and potassium (K).
The data received from the hardware device will be pushed to the cloud platform and later analyzed using different machine learning algorithms.
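One possible shape for the record that the hardware device pushes to the cloud is sketched below. The class name, field names, and units are assumptions made for illustration, and JSON is an assumed transport format; the source does not specify the payload or the cloud platform.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SensorReading:
    """One set of measurements from the field device (hypothetical field names)."""
    temperature_c: float
    humidity_pct: float
    soil_moisture_pct: float
    ph: float
    nitrogen: float
    phosphorus: float
    potassium: float

    def validate(self):
        """Reject readings that fall outside physically possible ranges."""
        if not 0.0 <= self.ph <= 14.0:
            raise ValueError("pH must lie between 0 and 14")
        for name in ("humidity_pct", "soil_moisture_pct"):
            if not 0.0 <= getattr(self, name) <= 100.0:
                raise ValueError(f"{name} must lie between 0 and 100")

    def to_json(self):
        """Validate the reading and serialize it as a JSON object."""
        self.validate()
        return json.dumps(asdict(self))

# Made-up example reading.
reading = SensorReading(28.5, 71.0, 35.0, 6.7, 42.0, 38.0, 21.0)
payload = reading.to_json()
```

Validating on the device side keeps obviously faulty sensor values out of the data that the machine learning algorithms later analyze.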
Based on the results of the machine learning algorithms, a suggestion template will be developed in the web application so that farmers know which crop is suitable for their soil and how much of each nutrient they should add to obtain a good yield.
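The nutrient part of the suggestion can be sketched as the gap between the measured soil values and target values for the recommended crop. The function name is hypothetical, and the target figures below are placeholders in arbitrary units, not agronomic recommendations; real targets would have to come from the dataset or from domain experts.

```python
def nutrient_deficits(measured, target):
    """Return how much of each nutrient is missing relative to the target (never negative)."""
    return {name: max(0.0, target[name] - measured.get(name, 0.0)) for name in target}

# Placeholder values in arbitrary units, for illustration only.
measured = {"N": 40.0, "P": 55.0, "K": 20.0}
target = {"N": 80.0, "P": 45.0, "K": 40.0}
deficits = nutrient_deficits(measured, target)
```

A nutrient that already exceeds its target, such as P in this example, yields a deficit of zero rather than a negative amount, so the template only ever suggests additions.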
V. CONCLUSION
Agriculture is one of the major sectors driving the economic development of India. The traditional agricultural sector faces numerous challenges, such as inadequate crop growth and unfavorable climate conditions. To address these issues, real-time sensor data combined with machine learning algorithms can help farmers decide which crop is suitable to grow in a particular region. In addition, users will receive fertilizer recommendations based on factors such as soil quality and climate conditions. By using this technology, farmers can maximize their crop yields and contribute to the sustainable development of the Indian economy. As future work, the crop recommendation system could be integrated with a yield prediction subsystem that gives the farmer an estimate of production for the recommended crop.