A Movie Recommender System Using Hybrid Approach: A Review

Authors: Sara Mohile, Hemant Ramteke, Pragati Shelgaonkar, Hritika Phule, M. M. Phadtare

DOI Link: https://doi.org/10.22214/ijraset.2022.41014

Abstract

This paper reviews movie recommendation. Movie recommendation matters in our social lives because it helps people find better entertainment. Such a system can recommend a set of movies to users based on their interests or their admiration for the films. A recommendation system suggests things to buy or see, drawing on a large collection of information to steer consumers to the items that best match their needs. A recommender system, also known as a recommendation system, is a type of information filtering system that attempts to predict a user's "rating" or "preference" for an item. Such systems are mostly employed for commercial purposes. MOVREC also helps users locate movies of their choice efficiently and effectively, based on the movie experiences of other users, without wasting time on pointless searching.

Introduction

I. INTRODUCTION

A. Overview

The goal of scientific discourse is to disseminate new ideas. Every year, a growing number of journal articles and conference papers are produced, making it increasingly difficult to locate relevant research. With the digitization of research publications, computers are increasingly used to aid the search for relevant movies. Such tools are known as "research film recommendation systems."

Recommender systems can be thought of as a black box that takes in a user's profile and matches it against a candidate set of items in order to recommend previously unseen items to that user. The recommendations are the items judged most useful for that particular user.

Content-based, collaborative filtering, and hybrid recommendation are the three main types of movie recommendation systems now available. A content-based system recommends movies whose content is similar to what the user has recently watched, and it is often straightforward to implement. In some cases, however, a bag of words in the user profile cannot capture the person's preferences effectively. Collaborative filtering instead suggests movies based on user ratings and is generally independent of item content. It suffers from the "cold-start" problem when the system has too few users or too little past behaviour to draw on. Although content-based and collaborative filtering recommendation systems can yield relevant results, their limitations must be taken into consideration.

This study presents a three-stage user profile for a new movie recommendation system. To accomplish this, we came up with a brand-new technique we dubbed BAP [Behaviour and Popularity]. Based on the user's reading behaviour and popularity of a movie, the approach gives each historical film a corresponding weight instead of 0, 1, or some predetermined value. Furthermore, when dealing with short-term profiles, we suggest a time function that adjusts the user's preferences for all historical movies rather than just part of them. A more objective and complete short-term profile of the user can be created in this way. In total, the system consists of four primary components: movie collecting and processing, user profiling, tailored movie suggestion, and location-aware personalised movie recommendation.
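As a rough illustration of the weighting idea described above, the following Python sketch gives each previously viewed movie a weight derived from a behaviour score, a popularity score, and a time function, and then aggregates the weights into a short-term genre profile. The function names, the [0, 1] score ranges, the product used to combine behaviour and popularity, and the exponential half-life decay are all assumptions made for illustration; the text does not specify the actual BAP formula or time function.

```python
from collections import defaultdict


def history_weight(behaviour, popularity, age_days, half_life_days=30.0):
    """Weight of one historical movie (hypothetical formula).

    behaviour and popularity are assumed to be scores in [0, 1].
    The time function is an assumed exponential decay: the weight
    halves every `half_life_days` days.
    """
    decay = 0.5 ** (age_days / half_life_days)
    return behaviour * popularity * decay


def short_term_profile(history):
    """Aggregate weighted history records into a normalised genre profile.

    Every record contributes, with older records contributing less,
    rather than only the few most recent ones.
    """
    profile = defaultdict(float)
    for item in history:
        weight = history_weight(
            item["behaviour"], item["popularity"], item["age_days"]
        )
        for genre in item["genres"]:
            profile[genre] += weight
    total = sum(profile.values())
    if total == 0:
        return {}
    return {genre: weight / total for genre, weight in profile.items()}


# Toy history: same behaviour and popularity, different ages.
history = [
    {"genres": ["drama"], "behaviour": 1.0, "popularity": 0.8, "age_days": 0},
    {"genres": ["comedy"], "behaviour": 1.0, "popularity": 0.8, "age_days": 30},
]
profile = short_term_profile(history)
```

In this toy history the older comedy record still contributes to the profile, but with half the weight of the drama record viewed today.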

In this paper, we address some of the current system's issues. For starters, many movie recommendation systems have user profiles that are only one-sided and cannot accurately reflect the preferences of consumers. The degree to which people favour historical films has yet to be quantified, which is a second issue.

In actuality, moviegoers' tastes diverge greatly. Treating all of these past records in the same way while trying to figure out a user's preferences is therefore absurd. Third, most studies abandon or use only a few recent browsing records when constructing a short-term profile. As a result, the user may receive recommendations that are too similar to what they just read, leading to a plethora of unexpected outcomes and a faulty interpretation of their preferences.

II. RELATED WORK

A. Collaborative Filtering Method

The study in [2] presents rough set-based collaborative filtering to predict a user's missing movie category rating values and to improve the ranking of novel movie items. An end-to-end system prototype provides customised movie suggestions by comparing user profile data with a database of movie articles and then reranking those articles according to novelty and CPCC similarity. Advantages: the novelty of movie items can be detected automatically, and an active user's missing ratings are discovered efficiently and automatically. Disadvantage: the approach does not support dynamic community detection.
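To make the idea of predicting a missing rating concrete, here is a minimal user-based collaborative filtering sketch in Python. It uses the plain Pearson correlation over co-rated movies as a stand-in; it is not the rough set method or the CPCC similarity of [2], and the toy ratings and function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two users over the movies both rated."""
    common = [movie for movie in a if movie in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    num = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    den = math.sqrt(sum((a[m] - mean_a) ** 2 for m in common)) * math.sqrt(
        sum((b[m] - mean_b) ** 2 for m in common)
    )
    return num / den if den else 0.0


def predict(ratings, user, movie):
    """Predict a missing rating as the user's mean rating plus the
    similarity-weighted deviations of positively correlated neighbours."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore dissimilar users in this sketch
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += sim * (other_ratings[movie] - other_mean)
        den += sim
    return user_mean + num / den if den else user_mean


# Toy data: alice has not rated movie "D".
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "C": 2, "D": 1},
}
predicted = predict(ratings, "alice", "D")
```

Because bob's ratings move in step with alice's and carol's do not, only bob's opinion of "D" influences the prediction in this example.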

The paper [4] analyses the web server access logs of an online movie publisher in order to discover trends in online readership. A model is first developed to forecast which articles a specific user will read most often; the most important features of the articles are then examined and the learnt model is interpreted. One finding is that the choice of time window can greatly affect prediction accuracy. Disadvantage: adding the Users feature is the most computationally demanding step.

[6] presents PENETRATE, a Personalized Movie Recommendation Framework based on ensemble hierarchical clustering that provides attractive recommendation results. Unlike content-based techniques and collaborative filtering, the framework considers both the individual user's activity and the group's behaviour simultaneously. Advantages: increased accuracy and efficiency. Disadvantage: a considerable time investment.

According to [5], users' reading preferences can be modelled seamlessly using long-term and short-term user profiles. The long-term profile is used to identify movie genres and subgenres, and the short-term profile is used to target specific movie articles to individual users. Advantage: the short-term model can capture recently or currently expressed user preferences for fine-grained movie topics. The approach also aims to increase the user's interest in reading by introducing a wider range of topics and styles.

B. Content Based Recommendation

The research in [8] proposes a scalable approach to movie recommendation that combines multi-dimensional similarity calculation, fast clustering with K-means, and Top-N recommendation. Traditional collaborative filtering suffers from data sparsity; the multi-dimensional similarity calculation addresses this by measuring how similar two users are based on the movies' rich content and on the users' behaviour and time. Advantages: the approach effectively solves the scalability problem, and the quality of movie recommendations improves when data are sparse.

Only content-based recommendation is presented in this work, not collaborative filtering.

Web content recommendation is the goal of the paper [9]. Users' long-term interest and preference models are built from their navigational history and connected with the recommender system. Web content is then supplied to users according to whether or not it matches these models. Benefits include a navigation model that frees users from tiresome and repetitive web surfing, automatic classification of web pages, and increased productivity.

The paper [10] presents two methods that take the meaning of words into account, using concepts and semantic similarities to determine the similarity between film items. The first method is Synset Frequency - Inverse Document Frequency (SF-IDF), and the second is Semantic Similarity (SS). Ceryx, an add-on to the Hermes Movie Portal's movie personalisation service, implements the proposed methods for movie item recommendation. Advantage: the methods statistically outperform TF-IDF.

C. Hybrid Method

Location-aware personalised movie recommendation with explicit semantic analysis (LP-ESA), a hybrid method proposed in [1], takes into account both the user's location and their personal interests when making movie recommendations. A further approach, LP-DSA (deep semantic analysis), is proposed to address LP-ESA's high dimensionality, sparseness, and redundancy. LP-DSA uses deep neural networks to extract feature representations for locations, users, and other entities. Advantages: the effectiveness and efficiency of movie recommendation improve, and computation time is reduced. Disadvantage: LP-DSA's performance could still be improved by learning a better abstract feature space.

The article [3] proposes personalised movie recommendation through an algorithmic combination of content-based (CB) and collaborative filtering (CF) algorithms that use the multi-dimensional domain-specific information contained in movies to produce recommendations.

According to its authors, this is the first recommendation system to use both domain-specific and general movie features. To increase the accuracy of movie recommendations, the paper proposes a novel CB algorithm that takes advantage of the trends feature (a domain-specific feature of movies), which indicates that different movie categories have different lifecycles and play different roles in building up user profiles.

Advantages: movie domain-specific features improve movie recommendation systems. The deviation-based hybrid strategy outperforms both the individual approaches and the benchmarked hybrid strategies in terms of accuracy and stability. FereBSP and FereRBML perform better than their comparable benchmarks. Disadvantage: only publishers who can keep track of their readers over a lengthy period of time can benefit from the recommended strategy.

Wesomender [7] is a hybrid recommendation engine architecture based on both content and collaborative filtering. Collaborative filtering and content-based filtering are its two fundamental components. Each component examines and makes independent suggestions for movies that the user hasn't viewed or rated yet. Advantages: context-sensitive recommendation is useful in journalism, where context-aware adaptive recommendation engines can help journalists with their everyday tasks by finding relevant information in real time. Heuristics, such as rejecting very old movies or those that took place too far away, can be used to lessen the effort required of the other component.
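A simple way to picture how two independent components can be merged is a weighted blend of their scores. The Python sketch below assumes that each component returns one score per unseen movie; the blending weight alpha, the function names, and the sample scores are illustrative assumptions and are not taken from Wesomender.

```python
def hybrid_scores(cb_scores, cf_scores, alpha=0.5):
    """Blend content-based (cb) and collaborative (cf) scores.

    alpha is an assumed weight for the content-based component.
    A movie missing from one component gets a score of 0 there.
    """
    movies = set(cb_scores) | set(cf_scores)
    return {
        movie: alpha * cb_scores.get(movie, 0.0)
        + (1 - alpha) * cf_scores.get(movie, 0.0)
        for movie in movies
    }


def top_n(scores, n=2):
    """Return the n best movies, breaking ties alphabetically."""
    return sorted(scores, key=lambda movie: (-scores[movie], movie))[:n]


# Hypothetical component outputs for movies the user has not seen.
cb = {"Heat": 0.9, "Up": 0.2}
cf = {"Up": 0.8, "Alien": 0.6}
blended = hybrid_scores(cb, cf)
recommended = top_n(blended)
```

A weighted blend is only one of several hybridisation strategies; switching between components or cascading them are alternatives that this sketch does not cover.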

D. Multidimensional Approach

The multidimensional approach supports multiple dimensions, profiling information for users, and a hierarchical aggregation of recommender systems. It can also be used to approximate an integral by means of a machine learning regression model trained to mimic the target integrand. Any bias in the estimate is corrected with a bias correction term, so that the final integral result carries a statistically accurate estimate of the uncertainty due to the prediction error introduced by machine learning.

E. TF-IDF Vectorization

TF-IDF is used primarily in information retrieval and data extraction. It attempts to quantify the significance of a term in a document that is part of a corpus of documents. Term frequency-inverse document frequency transforms text into a numerical vector that other algorithms can use. It combines Term Frequency (TF) with Inverse Document Frequency (IDF).

Term frequency measures how often a term occurs in a document. How frequently a word is used can indicate how important it is within an article or other piece of writing. The corpus is represented as a matrix whose rows correspond to documents and whose columns correspond to the distinct terms.
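The calculation can be sketched in a few lines of Python. This is a minimal, unsmoothed variant (TF as relative frequency, IDF as the natural log of N divided by document frequency) written with the standard library only; real libraries typically add smoothing and normalisation, and the function name and toy documents here are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute a TF-IDF matrix for a list of non-empty token lists.

    Returns (vocab, matrix): one row per document, one column per term.
    TF is the relative frequency of the term in the document;
    IDF is log(N / df) with no smoothing (an assumed, simple variant).
    """
    vocab = sorted({term for doc in docs for term in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        matrix.append([counts[term] / len(doc) * idf[term] for term in vocab])
    return vocab, matrix


# Toy movie descriptions, already tokenised.
docs = [
    "space adventure space".split(),
    "romantic comedy".split(),
    "space comedy".split(),
]
vocab, matrix = tfidf(docs)
```

In this toy corpus "adventure" appears in only one description, so it receives a higher IDF than "space", which appears in two.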

F. Linear Kernel Method

In machine learning, kernel methods are a class of algorithms used to analyse patterns. They are used to learn and identify common types of relationships (such as correlation, classification, ranking, clusters, and principal components). A linear kernel is used when the data are linearly separable, that is, when they can be separated by a single line. It is one of the most commonly used kernels, and it is well suited to data sets with a large number of features. Text classification is one such case, since each distinct term is a separate feature. In text classification, the linear kernel is often the kernel of choice.
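The linear kernel itself is simply the dot product of two feature vectors. One common use in content-based recommenders, assumed here rather than stated in the text, is to apply it to length-normalised text vectors, in which case it equals cosine similarity. The Python sketch below shows this with hypothetical toy vectors and function names.

```python
import math


def l2_normalise(vector):
    """Scale a vector to unit Euclidean length (zero vectors are copied)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def linear_kernel(rows):
    """Pairwise linear kernel: K[i][j] is the dot product of rows i and j."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in rows]
        for row_i in rows
    ]


# Toy feature vectors for three movies (e.g. term weights).
vectors = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
]
sims = linear_kernel([l2_normalise(v) for v in vectors])
```

With unit-length inputs, each entry of `sims` lies between -1 and 1, and movies that share no features score 0.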

G. K-folds Cross Validation

Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter, k, which is the number of groups into which the data sample is split; it is therefore called k-fold cross-validation. When a specific value of k is chosen, it can be used in place of k in the name of the method, so that k = 10 becomes 10-fold cross-validation. The method is popular because it is easy to understand and generally produces a less biased or less optimistic estimate of the model's skill than other strategies, such as a simple train/test split.
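The splitting step can be sketched as follows. This minimal Python version produces contiguous folds without shuffling, which is an assumption made to keep the example short; in practice the sample is usually shuffled first. The function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Each sample appears in exactly one test fold. When n_samples is not
    divisible by k, the first folds are one sample larger.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# 5-fold split of 10 samples: each fold holds out 2 samples.
folds = list(k_fold_indices(10, 5))
```

A model would be trained on each `train` list and scored on the matching `test` list, and the k scores would then be averaged.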

H. Singular Value Decomposition

The singular value decomposition (SVD) can be used to reduce large datasets. It is very useful for finding solutions with a smaller number of parameters: although fewer values are retained, they preserve much of the variability of the original data. The SVD factorises a matrix into three matrices. It has some neat algebraic properties, and it reveals a lot about linear transformations in terms of geometry and theory. It also has a number of uses in data science. Matrix operations such as the matrix inverse, as well as data reduction methods in machine learning, frequently involve the SVD.
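To illustrate the data reduction idea, the Python sketch below approximates only the largest singular value and its two singular vectors by power iteration, then rebuilds a rank-1 approximation of a small ratings matrix. This is a teaching sketch under stated assumptions, not a full SVD: the function name and the toy matrix are hypothetical, and a numerical library would normally be used instead.

```python
import math


def top_singular_triplet(a, iterations=100):
    """Approximate (u, sigma, v) for the largest singular value of `a`.

    Uses power iteration on A^T A. The all-ones starting vector is an
    assumption; it fails if it is orthogonal to the true singular vector.
    """
    rows, cols = len(a), len(a[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iterations):
        # w = A^T (A v)
        av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * av[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    av = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av] if sigma else [0.0] * rows
    return u, sigma, v


# Toy ratings matrix (3 users x 2 movies) that happens to have rank 1.
ratings = [
    [2.0, 4.0],
    [1.0, 2.0],
    [3.0, 6.0],
]
u, sigma, v = top_singular_triplet(ratings)
# Rank-1 approximation: sigma * u * v^T.
rank1 = [[sigma * u[i] * v[j] for j in range(2)] for i in range(3)]
```

Because the toy matrix has rank 1, a single singular triplet reproduces it, which shows how a few values can carry most of the information in the original data.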

I. Gap Analysis

Content-based, collaborative filtering, and hybrid recommendation are the three main types of movie recommendation systems now available. A content-based system locates movies whose content is similar to what the user has recently watched. Content-based recommendation systems are often straightforward to implement. In some cases, however, a bag of words in the user profile cannot capture the person's preferences effectively. Collaborative filtering instead suggests movies based on user ratings and is generally independent of item content. A cold-start problem occurs when a system's user base is too small or the past behaviour of its users is insufficient. Although content-based and collaborative filtering recommendation systems can yield relevant results, their limitations must be taken into consideration.

Conclusion

Users can choose from an agreed-upon set of attributes, and the system then recommends movies for them to watch based on the cumulative weight of those attributes. It is difficult to evaluate our system's success because there is no right or wrong recommendation; it is only a matter of personal preference. We received a positive reaction from a small group of users who participated in informal reviews. We would like more data so that our system can produce more useful results. If we can combine multiple machine learning and clustering techniques, we will be able to compare their outcomes. Our ultimate goal is to create a web-based user interface with a user database and a personalised learning model for each individual user.

References

[1] C. Chen, X. Meng, Z. Xu, and T. Lukasiewicz, "Location-aware personalized movie recommendation with deep semantic analysis," IEEE Access, vol. 5, pp. 1624-1638, 2017.

[2] K. G. Saranya and G. S. Sadasivam, "Personalized movie article recommendation with novelty using collaborative filtering based rough set theory," Mobile Netw. Appl., vol. 22, no. 4, pp. 719-729, 2017.

[3] M. Phadatare, "Review on techniques of incremental mining of high utility patterns," International Journal for Research in Applied Science and Engineering Technology, July 2020.

[4] M. M. Phadatare and S. S. Nandgaonkar, "Uncertain data mining using decision tree and bagging technique," International Journal of Computer Science and Information Technology, April 2014.

[5] P. Lv, X. Meng, and Y. Zhang, "FeRe: Exploiting influence of multidimensional features resided in movie domain for recommendation," Inf. Process. Manage., vol. 53, no. 5, pp. 1215-1241, 2017.

[6] B. Fortuna, P. Moore, and M. Grobelnik, "Interpreting movie recommendation models," in Proc. 24th Int. Conf. World Wide Web (WWW Companion), 2015, pp. 891-892.

[7] L. Li, L. Zheng, F. Yang, and T. Li, "Modeling and broadening temporal user interest in personalized movie recommendation," Expert Syst. Appl., vol. 41, no. 7, pp. 3168-3177, 2014.

[8] L. Zheng, L. Li, W. Hong, and T. Li, "PENETRATE: Personalized movie recommendation using ensemble hierarchical clustering," Expert Syst. Appl., vol. 40, no. 6, pp. 2127-2136, 2013.

[9] Montes-García, J. M. Álvarez-Rodríguez, J. E. Labra-Gayo, and M. Martínez-Merino, "Towards a journalist-based movie recommendation system: The Wesomender approach," Expert Syst. Appl., vol. 40, no. 17, pp. 6735-6741, 2013.

[10] M. Lu, Z. Qin, Y. Cao, Z. Liu, and M. Wang, "Scalable movie recommendation using multi-dimensional similarity and Jaccard-Kmeans clustering," J. Syst. Softw., vol. 95, pp. 242-251, Sep. 2014.

[11] H. Wen, L. Fang, and L. Guan, "A hybrid approach for personalized recommendation of movie on the Web," Expert Syst. Appl., vol. 39, no. 5, pp. 5806-5814, 2012.

[12] M. Capelle, F. Frasincar, M. Moerland, and F. Hogenboom, "Semantics based movie recommendation," in Proc. 2nd Int. Conf. Web Intell., Mining Semantics, 2012, pp. 27-36.

Copyright

Copyright © 2022 Sara Mohile, Hemant Ramteke, Pragati Shelgaonkar, Hritika Phule, M. M. Phadtare. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET41014

Publish Date : 2022-03-26

ISSN : 2321-9653

Publisher Name : IJRASET
