Event-Based Probabilistic Embedding for POI Recommendation
Zhang, Tiancheng;Liu, Hengyu;Geng, Xue;Yu, Ge
2023-01-17 00:00:00
applied sciences

Article

Tiancheng Zhang 1,*, Hengyu Liu 1, Xue Geng 2 and Ge Yu 1

1 School of Computer Science and Engineering, Northeastern University, Shenyang 110167, China
2 Institute for Infocomm Research (I2R), Agency for Science, Technology and Research, Singapore 138632, Singapore
* Correspondence: tczhang@mail.neu.edu.cn

Abstract: Location-based social networks (LBSNs) have collected massive geo-tagged information, enabling the derivation of user preference for points of interest (POIs) in support of personalized recommendation. The existing embedding techniques deal with multiple factors by embedding a separate model for each factor. As a result, the interaction amongst various factors cannot be captured properly. In addition, we notice that the effectiveness of personalized recommendation is closely related to the current time and location. It is obvious that users would check into a POI which fits their interests, even if the current location is far away from the POI or the time is inappropriate. Therefore, it is necessary to recommend the right POI according to the time and geographic location of the user. In other words, it is necessary to predict the most likely visiting event, including users, POI, event time, and event location. In this paper, we propose a probabilistic embedding model called Topic And Region Embedding (TARE), which embeds events by simulating the users' decision-making process. The results of TARE not only take various factors and their interaction into consideration but also consider the time and geographic location of events. Extensive experiments on three location-based social network datasets show that TARE achieves better performance in recommendation accuracy than existing state-of-the-art methods.

Keywords: POI recommendation; probabilistic generation model; deep neural network; probabilistic embedding

Citation: Zhang, T.; Liu, H.; Geng, X.; Yu, G. Event-Based Probabilistic Embedding for POI Recommendation. Appl. Sci. 2023, 13, 1236. https://doi.org/10.3390/app13031236
Academic Editor: Giacomo Fiumara
Received: 17 December 2022
Revised: 10 January 2023
Accepted: 13 January 2023
Published: 17 January 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The development of location-based social networks (LBSNs) enables the easy acquisition of massive geo-tagged information which can be used in the derivation of user preference for points of interest (POIs) to provide personalized recommendations. POI recommendation [1] aims to model user behavior in support of future prediction regarding which POIs to visit. The behavior can be regarded as a synthesized activity influenced by various factors including geographical effect, temporal effect, user preference, POI categories, and so on. For example, for a person who usually goes to a punk-style bar near his home at night, his behavior may be subject to time (the user likes night activities), user preferences (the user likes to drink), geographic location (the user likes to visit places near home), and POI category (the user likes punk-style POIs).

To tackle the POI recommendation problem, traditional Matrix Factorization (MF)-based recommendation algorithms [2–7] decompose the check-in matrix into the user–topic matrix and the topic–POI matrix. Then, the user–topic matrix multiplies the topic–POI matrix to fill the missing values in the check-in matrix. Accordingly, a user's behavior is predicted by using the filled-in value in the check-in matrix. However, the users' behavior in the real world is complicated. The linear models based on MF take various influential factors into consideration, but the nature of a linear model limits its performance in predicting the user's behavior pattern where the various factors can interact with each other.

To address this issue, a growing number of researchers have integrated deep learning techniques into recommendation systems, e.g., PACE [8], POI2Vec [9], Geo-teaser [10] and so on [11–14]. The major ideas behind these systems include two steps. The first step is to obtain user and POI embeddings. The second step takes the POI and user embedding into a non-linear model [8] or a linear model [15] for further joint training. There are two main types of embedding algorithms. The first one is a variant of the word2vec algorithm [9,10]. It generates the POI sequence according to the check-in time, and then, it uses the POI sequence to embed the POI. The other one is a variant of the node2vec algorithm [8,16,17]. It constructs a graph of users and POIs, such as the user's social network graph, the geographical graph of the POI, and so on, in order to use the graphs to perform POI embedding.

Nevertheless, these existing embedding methods have certain deficiencies. First, the existing embedding methods consider multiple factors by embedding a separate model for each factor without exploring the intrinsic relationship between different factors. Notice that the intrinsic relationship between different factors has a great impact on the user's behavior. For example, users may like to visit a POI near their home at night while preferring to visit a POI near their work location during the day (time affects the user's geographic preferences).
Second, we notice that the effectiveness of personalized recommendation is closely related to the current time and location. The event time, together with the user's historical check-in records, can be used to infer the user's preferences at that time. Event locations can filter out POIs that are too far away and thus improve the accuracy of user behavior predictions. So, the embedding methods should take into account the time and geographic location for better recommendation performance.

To address the above issues in existing methods, we propose a novel embedding method called Topic And Region Embedding (TARE), which takes into account a variety of factors including geographical effect, temporal effect, user preference and POI categories. Unlike the existing embedding methods, the object embedded by TARE is not a user or a POI but an event that includes user, POI, event time and event location. The results obtained by TARE not only take into account user preferences and POI information but also event time and event locations. To explore the intrinsic link between multiple factors, we use a probabilistic generative model to embed events. The basic idea is to "mimic" user behaviors in a process of decision making. So, we use a hidden variable topic to explore the joint impact of POI category and time factors on the user behavior and use a hidden variable region to explore the joint impact of geographic location and time factors on user behavior. As region and topic are interdependent, TARE combines the time factor, geographic factor, POI category, and user preference into a unified model.

The main contributions of this work are summarized as follows:

• We propose a novel embedding method called TARE which takes into account a variety of factors including geographical effect, temporal effect, user preference and POI categories.
In addition, TARE is also able to explore the intrinsic link between time and geographic factors and the intrinsic link between time factors and POI categories to improve the accuracy of recommendations.
• The objects embedded in TARE are events. Compared with the existing embedding methods, TARE not only considers user preference and POI information but also time and location of the event. To the best of our knowledge, this is the first attempt to explore event-based embedding in POI recommendation.
• We conduct extensive experiments on three real-world datasets to evaluate the performance enhancement in recommendation accuracy of TARE, and our approach achieves up to 12.39% performance gains in recommendation accuracy compared to existing state-of-the-art methods.

2. Topic And Region Embedding

In this section, we first formulate the problem and then present our proposed embedding method TARE (Topic and Region Embedding).

2.1. Preliminary

Definition 1 (POI). A POI is a uniquely identified site (e.g., a bar, gym or restaurant). In our model, a POI has three attributes: identifier, location and category. We use $v$ to represent a POI identifier and $l_v$ to denote its corresponding geographical attribute in terms of longitude and latitude coordinates. We use the notation $W_v$ to denote the set of its categories.

Definition 2 (Check-in Activity). A user check-in activity is represented by a five-tuple $(u, v, l_v, W_v, t)$ denoting that user $u$ visits POI $v$ at time $t$.

Definition 3 (Event). An event is represented by a four-tuple $(u, v, t_q, l_q)$ denoting that user $u$, at time $t_q$ and at location $l_q$, visits POI $v$.

Definition 4 (User Profile). For each user $u$, we create a user profile $D_u$, which consists of a set of check-in activities associated with $u$. The dataset $D$ used in our model consists of user profiles, i.e., $D = \{D_u : u \in U\}$.

Problem 1 (Event-based Embedding).
Given a check-in activity dataset $D$ and an event $e = (u, v, t_q, l_q)$, our goal is to embed $e$ according to $u$'s profile, $l_q$, $t_q$, and the category $W_v$ and location $l_v$ of POI $v$.

2.2. Model Description

The POI recommendation is mainly used to predict user behavior. In order to handle POI recommendations better, we analyze which factors influence user behavior and how these factors affect user behavior. Figure 1 shows the decision-making process of users. An intuitive factor that affects user behavior is time. Furthermore, the time affects the activity type of a user which progressively affects the user's geographic preferences and POI category preferences. For example, users are inclined to decide which place to eat between 12:00 p.m. and 1:00 p.m. Meanwhile, geographic information and POI category are also affecting the decision. Users may choose an appropriate POI geographic location and POI category according to personal preference and current location. Finally, users select a POI to visit based on activity type, POI location and POI category.

Figure 1. The decision-making process of a user to visit a place.

According to the above process, this paper proposes a new embedding method called Topic And Region Embedding (TARE), which embeds events according to user preferences, event time, event location and POI by simulating the user decision process. TARE divides the factors that influence user behavior into four categories: geographic information, time factor, POI category, and personal preference. User behavior has a similar periodic pattern, and this is manifested in the fact that users always visit similar POIs at specific times [18]. In other words, time factors affect user behavior all the time. To this end, TARE uses two hidden variables $(r, z)$ to discover the relationship between time, POI category and geographic information. Table 1 lists the notations of our model. Table 2 lists the abbreviation and full name of different concepts in this paper. Figure 2 shows the graphical representation of TARE. TARE consists of two parts: embedding of events and prediction of event occurrence, which will be described below.

Table 1. Notations of parameters.

| Symbol | Description |
|---|---|
| $N, V, R, Z$ | The number of users, POIs, regions, topics |
| $D_u$ | The profile of user $u$ |
| $v_{u,i}$ | The POI of the $i$-th record in $D_u$ |
| $l_{u,i}$ | The location of POI $v_{u,i}$ |
| $\theta_{z|u}, \theta_{r|u}$ | The interest of user $u$, as distributions over topics and regions |
| $\theta_{w|z}$ | The distribution of topic $z$ over POI categories |
| $\mu_r, \Sigma_r$ | The mean location and location covariance of region $r$ |
| $\theta_{v|r,z}, \theta_{t|r,z}$ | The distributions of region $r$ and topic $z$ over POIs and time |

Figure 2. The Graphical Representation of TARE.

Table 2. The abbreviation and full name of concepts.

| Abbreviation | Full Name |
|---|---|
| LBSNs | Location-Based Social Networks |
| POIs | Points of Interest |
| MF | Matrix Factorization |
| SGD | Stochastic Gradient Descent |
| PSSG | Projected Scaled Sub-Gradient |
| TARE | Topic And Region Embedding |

2.3. Embedding of Events

Users' time preferences and category preferences often interact with each other. For example, people who like midnight activities often like to go to the bar. To this end, we explore the interaction between time preferences and category preferences through the hidden variable topic $z$. Technically, each topic $z$ in our model is not only associated with a multinomial distribution $\theta_{w|z}$ over categories of POIs but also related with a multinomial distribution over a POI's identifier $\theta_{v|r,z}$ and a check-in time $\theta_{t|r,z}$. Users liking a topic $z$ indicates that the time preference and category preference corresponding to topic $z$ are in line with the users' preference. So, we use a user's distribution $\theta_{z|u}$ of interest on a set of topics to simulate a user's interest in topic $z$.
In addition, we use the hidden variable region $r$ to explore the interaction between time preference and geographical preference. Unlike Wang et al. [19], we use two hidden variables (topic $z$ and region $r$) instead of just one hidden variable, because using two hidden variables has the following advantages. First, the use of two hidden variables can better explain the meaning of the model. The two hidden variables, respectively, represent the geographical regions and semantic topics. Second, using two hidden variables can make the model perform better with the same number of free variables. Similar to topic $z$, each region $r$ is associated with a multinomial distribution over the POI's identifier $\theta_{v|r,z}$, a check-in time $\theta_{t|r,z}$ and a Gaussian distribution $\mathcal{N}(\mu_r, \Sigma_r)$. Then, we use $N$ distributions $\theta_{r|u}$ to represent each user's preference for $r$. Finally, we model the user decision process as shown in Algorithm 1, and the time complexity of Algorithm 1 is $O(|D_u||W_v|)$.

Algorithm 1 Modeling user decision process.
for each $D_u$ in $D$ do
    for each check-in record $(u, v, l_v, W_v, t)$ in $D_u$ do
        Draw a topic index $z \sim \mathrm{Multi}(\theta_{z|u})$
        Draw a region index $r \sim \mathrm{Multi}(\theta_{r|u})$
        Draw a POI $v \sim \mathrm{Multi}(\theta_{v|r,z})$
        Draw a time $t \sim \mathrm{Multi}(\theta_{t|r,z})$
        for each word $w \in W_v$ do
            Draw $w \sim \mathrm{Multi}(\theta_{w|z})$
        end for
        Draw a location $l_v \sim \mathcal{N}(\mu_r, \Sigma_r)$
    end for
end for

To incorporate the check-in time information to the topic discovery process, we employ the widely adopted discretization method by [20] to split a day into hourly-based slots. The joint probability is characterized by $P(u, v, l_v, W_v, t, z, r)$ as follows.

$$P(u, v, l_v, W_v, t, z, r) = \theta_{z|u}^{(z)}\, \theta_{r|u}^{(r)}\, \theta_{t|r,z}^{(t)}\, \theta_{v|r,z}^{(v)}\, p(l_v \mid \mu_r, \Sigma_r) \prod_{w_i \in W_v} \theta_{w|z}^{(w_i)} \tag{1}$$

where $\theta_{z|u}^{(i)}$ is the probability that user $u$ generates topic $i$, $\theta_{r|u}^{(i)}$ is the probability that user $u$ generates region $i$, $\theta_{t|r,z}^{(i)}$ is the probability that region $r$ and topic $z$ jointly generate time $i$, $\theta_{v|r,z}^{(i)}$ is the probability that region $r$ and topic $z$ jointly generate POI $i$ and $\theta_{w|z}^{(w_i)}$ is the probability that topic $z$ generates word $w_i$. The probability that POI $v$ appears in region $r$ is characterized by $p(l_v \mid \mu_r, \Sigma_r)$ as follows.

$$p(l_v \mid \mu_r, \Sigma_r) = \frac{1}{2\pi\sqrt{|\Sigma_r|}} \exp\left(-\frac{(l_v - \mu_r)^{\top} \Sigma_r^{-1} (l_v - \mu_r)}{2}\right) \tag{2}$$

where $\mu_r$ and $\Sigma_r$ denote the mean vector and covariance matrix, respectively.

The event-embedding result is characterized by $E_{u,v,t_q,l_q}$ as follows.

$$\begin{aligned} E_{u,v,t_q,l_q} &= [E_r^{(1)}, E_r^{(2)}, \ldots, E_r^{(R)}, E_z^{(1)}, E_z^{(2)}, \ldots, E_z^{(Z)}] \\ &= \mathrm{ReLU}([P_r^{(1)}, P_r^{(2)}, \ldots, P_r^{(R)}, P_z^{(1)}, P_z^{(2)}, \ldots, P_z^{(Z)}]) \end{aligned} \tag{3}$$

where $\mathrm{ReLU}(x) = \max(0, x)$.

Given event $e = [u, v, l_q, t_q]$, $P_r^{(i)}$ and $P_z^{(i)}$ are defined as:

$$\begin{aligned} P_r^{(i)} &= \left(\theta_{t_q|r}^{(i)} \cdot \theta_{v|r}^{(i)}\right) \theta_{r|u}^{(i)}\, p(l_v \mid \mu_r, \Sigma_r)\, p(l_q \mid \mu_r, \Sigma_r) \\ P_z^{(i)} &= \left(\theta_{t_q|z}^{(i)} \cdot \theta_{v|z}^{(i)}\right) \theta_{z|u}^{(i)} \prod_{w_i \in W_v} \theta_{w|z}^{(w_i)} \end{aligned} \tag{4}$$

where $\cdot$ represents the dot product of two vectors, $\theta_{z|u}^{(i)}$ is the probability that user $u$ generates topic $i$, $\theta_{r|u}^{(i)}$ is the probability that user $u$ generates region $i$ and $\theta_{w|z}^{(w_i)}$ is the probability that topic $z$ generates word $w_i$. In the first line, the Gaussian terms are evaluated for region $r = i$; in the second line, the word probabilities are evaluated for topic $z = i$.

In addition, $\theta_{t_q|r}^{(i)}$, $\theta_{v|r}^{(i)}$, $\theta_{t_q|z}^{(i)}$ and $\theta_{v|z}^{(i)}$ are defined as follows.

$$\begin{aligned} \theta_{t_q|r}^{(i)} &= [\theta_{t|i,z=1}^{(t_q)}, \theta_{t|i,z=2}^{(t_q)}, \ldots, \theta_{t|i,z=Z}^{(t_q)}] \\ \theta_{v|r}^{(i)} &= [\theta_{v|i,z=1}^{(v)}, \theta_{v|i,z=2}^{(v)}, \ldots, \theta_{v|i,z=Z}^{(v)}] \\ \theta_{t_q|z}^{(i)} &= [\theta_{t|r=1,i}^{(t_q)}, \theta_{t|r=2,i}^{(t_q)}, \ldots, \theta_{t|r=R,i}^{(t_q)}] \\ \theta_{v|z}^{(i)} &= [\theta_{v|r=1,i}^{(v)}, \theta_{v|r=2,i}^{(v)}, \ldots, \theta_{v|r=R,i}^{(v)}] \end{aligned} \tag{5}$$

where $\theta_{t|r,z}^{(i)}$ is the probability that region $r$ and topic $z$ jointly generate time $i$ and $\theta_{v|r,z}^{(i)}$ is the probability that region $r$ and topic $z$ jointly generate POI $i$.

2.4. Prediction of Event Occurrence

We feed the embedded results to the deep neural network.
At time $t_q$, the prediction of user $u$'s preference for POI $v$ at location $l_q$ is as follows.

$$y_{u,v,t_q,l_q} = H(E_{u,v,t_q,l_q} \mid \theta_{neural}) \tag{6}$$

where $E_{u,v,t_q,l_q}$ is the embedding result of the event, $\theta_{neural}$ denotes the parameters in the hidden layers and $y_{u,v,t_q,l_q}$ is user $u$'s preference for POI $v$ at location $l_q$ and time $t_q$.

Given the input feature vector $x$, the $q$-th hidden layer is denoted as $h^{q}$, which is a non-linear function of the previous hidden layer $h^{q-1}$ defined as:

$$h^{q}(x) = \mathrm{ReLU}(W^{q} h^{q-1}(x) + b^{q}) \tag{7}$$

where $W^{q}$ and $b^{q}$ are parameters of the $q$-th layer. We adopt the rectified linear unit $\mathrm{ReLU}(x) = \max(0, x)$ as the non-linear function. By combining Equation (7), the hidden layers can be defined as follows.

$$\begin{aligned} x_{pre} &= H(E_{u,v,t_q,l_q}) \\ &= h^{Q}(\ldots h^{1}(h^{0}(E_{u,v,t_q,l_q})) \ldots) \end{aligned} \tag{8}$$

Due to the one-class nature of implicit feedback in POI recommendation, we first connect a binary softmax layer on the top of $H$, which is basically a logistic regression with sigmoid function, so we have:

$$\begin{aligned} \hat{y}_{u,v,t_q,l_q} &= H(E_{u,v,t_q,l_q} \mid \theta_{neural}) \\ &= \sigma(H(E_{u,v,t_q,l_q})\, w) \end{aligned} \tag{9}$$

where the sigmoid function is defined as $\sigma(x) = 1/(1 + e^{-x})$.

3. Model Inference and Learning

Our goal with model inference is to learn the parameters that maximize the likelihood of observed random variables and minimize the loss function of neural network. TARE's optimization goal can be expressed as follows.

$$J = J_{PGM} + J_{neural} \tag{10}$$

where $J_{PGM}$ is the optimization target of event embedding and $J_{neural}$ is the loss function of the neural network. The $J_{PGM}$ is defined as follows.

$$\begin{aligned} J_{PGM} &= P(\vec{z}, \vec{r}, \vec{w}, \vec{t}, \vec{v} \mid \vec{u}, \theta_{embedding}) \\ &= P(\vec{z} \mid \vec{u}, \theta_z)\, P(\vec{r} \mid \vec{u}, \theta_r)\, P(\vec{t} \mid \vec{r}, \vec{z}, \theta_t)\, P(\vec{v} \mid \vec{r}, \vec{z}, \theta_v)\, P(\vec{w} \mid \vec{z}, \theta_w)\, P(\vec{l} \mid \vec{\mu}, \Sigma) \\ &= \prod_{u=1}^{N} \prod_{i=1}^{|D_u|} \theta_{z|u}^{(z_{u,i})}\, \theta_{r|u}^{(r_{u,i})}\, \theta_{t|r_{u,i},z_{u,i}}^{(t_{u,i})}\, \theta_{v|r_{u,i},z_{u,i}}^{(v_{u,i})}\, p(l_{v_{u,i}} \mid \mu_{r_{u,i}}, \Sigma_{r_{u,i}}) \prod_{w \in W_{v_{u,i}}} \theta_{w|z_{u,i}}^{(w)} \end{aligned} \tag{11}$$

The loss function $J_{neural}$ of the neural network is defined as follows.

$$J_{neural} = \sum_{u=1}^{N} \sum_{i=1}^{|D_u|} \left\lVert y_{u,v_{u,i},t_{u,i},l_q} - H(E_{u,v_{u,i},t_{u,i},l_q}) \right\rVert + \lambda \lVert W \rVert_F \tag{12}$$

where $D_u$ represents the set of $u$'s check-in activities, and $|D_u|$ represents the number of $u$'s check-in activities. We use L2 regularization to prevent overfitting.

However, it is difficult to minimize $J$. Therefore, we apply the Gibbs EM algorithm [21], which is a mixture between EM and a Monte Carlo sampler, to minimize $J$. In the E-step, we sample regions and topics by using Gibbs sampling. In the M-step, we optimize the model parameters $\theta_{neural}$ and $\theta_{PGM}$ by fixing all topic and region assignments. The two steps are conducted repeatedly until convergence.

More specifically, we draw region $r$ and topic $z$ for each check-in activity in the E-step. We use Gibbs sampling to sample region $r$ and topic $z$ according to Equations (13) and (14). When we sample region $r$, we assume all other variables are fixed. $r_{\neg u,i}$ represents the region assignments for all user activities except the $i$-th activity of user $u$, and the sampling of topic $z$ is similar to the sampling of region $r$.

$$P(z_{u,i} \mid z_{\neg u,i}, \vec{u}, \vec{v}, \vec{w}, \vec{l}, \vec{t}) \propto \theta_{z|u}^{(z_{u,i})} \sum_{r=1}^{R} \theta_{r|u}^{(r)}\, \theta_{t|r,z_{u,i}}^{(t_{u,i})}\, \theta_{v|r,z_{u,i}}^{(v_{u,i})}\, p(l_{v_{u,i}} \mid \mu_r, \Sigma_r) \prod_{w_i \in W_{v_{u,i}}} \theta_{w|z_{u,i}}^{(w_i)} \tag{13}$$

$$P(r_{u,i} \mid r_{\neg u,i}, \vec{u}, \vec{v}, \vec{w}, \vec{l}, \vec{t}) \propto \theta_{r|u}^{(r_{u,i})}\, p(l_{v_{u,i}} \mid \mu_{r_{u,i}}, \Sigma_{r_{u,i}}) \sum_{z=1}^{Z} \theta_{z|u}^{(z)}\, \theta_{t|r_{u,i},z}^{(t_{u,i})}\, \theta_{v|r_{u,i},z}^{(v_{u,i})} \prod_{w_i \in W_{v_{u,i}}} \theta_{w|z}^{(w_i)} \tag{14}$$

In the M-step, we optimize the parameters $\theta_{neural}$ and $\theta_{PGM}$ to minimize $J$ with all topic and region assignments fixed. We use Stochastic Gradient Descent (SGD) to update the parameters $\theta_{neural}$ and $\theta_{PGM}$ for minimizing $J_{neural}$.

There are no negative samples in the training data because this is a one-class problem. So, we generate negative samples by negative sampling. For each positive sample, we generate five negative samples by randomly changing one of the elements ($u$, $v$, $t_q$ or $l_q$) in positive samples.
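The negative-sampling step just described can be sketched as follows. This is a hedged illustration only: the function name, the `pools` argument (candidate values for each of the four event fields) and the rejection of candidates that coincide with a known positive event are assumptions of this sketch, since the text does not say how replacement values are drawn or whether collisions with observed events are filtered.

```python
import random


def negative_samples(event, pools, positives, n_neg=5, rng=None, max_tries=1000):
    """Corrupt one field of a positive event (u, v, t_q, l_q) to build negatives.

    event     : tuple (u, v, t_q, l_q) observed in the training data
    pools     : four sequences of candidate values, one per field (hypothetical)
    positives : set of observed events, used to reject accidental positives
                (an assumption of this sketch, not stated in the text)
    """
    rng = rng or random.Random()
    negatives = []
    tries = 0
    while len(negatives) < n_neg and tries < max_tries:
        tries += 1
        field = rng.randrange(len(event))        # pick u, v, t_q or l_q at random
        replacement = rng.choice(pools[field])   # draw a new value for that field
        if replacement == event[field]:
            continue                             # nothing changed, try again
        candidate = event[:field] + (replacement,) + event[field + 1:]
        if candidate in positives:
            continue                             # would be a true check-in
        negatives.append(candidate)
    return negatives
```

Each returned tuple differs from the positive event in exactly one position, which matches the "change one element" rule; labels of 1 for the positive and 0 for the corrupted events would then feed the sigmoid output of Equation (9).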
Due to space limitations, the gradient of the weight of the neural network is not listed, and the gradients of model parameters $\theta_{t|r,z}^{t}$, $\theta_{v|r,z}^{v}$, $\theta_{r|u}^{r}$, $\theta_{z|u}^{z}$ and $\theta_{w|z}^{w}$ are provided as follows.

$$\frac{\partial J_{neural}}{\partial \theta_{t|r,z}^{t}} = \frac{\partial J_{neural}}{\partial P_{rz}} \frac{\partial P_{rz}}{\partial \theta_{t|r,z}^{t}} = \delta_{E,r}\, \theta_{v|r,z}^{(v)}\, P(l_v \mid \mu_r, \Sigma_r)\, \theta_{r|u}^{(r)} + \delta_{E,z}\, \theta_{v|r,z}^{(v)}\, \theta_{z|u}^{(z)} \prod_{w_i \in W_v} \theta_{w|z}^{(w_i)} \tag{15}$$

$$\frac{\partial J_{neural}}{\partial \theta_{v|r,z}^{v}} = \frac{\partial J_{neural}}{\partial P_{rz}} \frac{\partial P_{rz}}{\partial \theta_{v|r,z}^{v}} = \delta_{E,r}\, \theta_{t|r,z}^{(t)}\, P(l_v \mid \mu_r, \Sigma_r)\, \theta_{r|u}^{(r)} + \delta_{E,z}\, \theta_{t|r,z}^{(t)}\, \theta_{z|u}^{(z)} \prod_{w_i \in W_v} \theta_{w|z}^{(w_i)} \tag{16}$$

$$\frac{\partial J_{neural}}{\partial \theta_{r|u}^{r}} = \frac{\partial J_{neural}}{\partial P_{rz}} \frac{\partial P_{rz}}{\partial \theta_{r|u}^{r}} = \delta_{E,r} \left(\theta_{t|r}^{(r)} \cdot \theta_{v|r}^{(r)}\right) P(l_v \mid \mu_r, \Sigma_r) \tag{17}$$

$$\frac{\partial J_{neural}}{\partial \theta_{z|u}^{z}} = \frac{\partial J_{neural}}{\partial P_{rz}} \frac{\partial P_{rz}}{\partial \theta_{z|u}^{z}} = \delta_{E,z} \left(\theta_{t|z}^{(z)} \cdot \theta_{v|z}^{(z)}\right) \prod_{w_i \in W_v} \theta_{w|z}^{(w_i)} \tag{18}$$

$$\frac{\partial J_{neural}}{\partial \theta_{w|z}^{w}} = \frac{\partial J_{neural}}{\partial P_{rz}} \frac{\partial P_{rz}}{\partial \theta_{w|z}^{w}} = \delta_{E,z} \left(\theta_{t|z}^{(z)} \cdot \theta_{v|z}^{(z)}\right) \theta_{z|u}^{(z)} \prod_{w_i \in W_v,\ w_i \neq w} \theta_{w|z}^{(w_i)} \tag{19}$$

where $P_{rz} = [P_r^{(1)}, P_r^{(2)}, \ldots, P_r^{(R)}, P_z^{(1)}, P_z^{(2)}, \ldots, P_z^{(Z)}]$, $\delta_E$ is the residual of the embedded layer $E$, and $\delta_{E,r}^{i}$, $\delta_{E,z}^{i}$ represent the residuals of $E_r^{i}$, $E_z^{i}$. $\delta_E$ can be calculated by neural network back-propagation. Due to space limitations, the derivation process of $\delta_E$ is not listed.

To update the parameters $\theta_{PGM}$ for maximizing $J_{PGM}$, we use the gradient descent learning algorithm Projected Scaled Sub-Gradient (PSSG) [22], which is designed to solve optimization problems with L1 regularization on parameters. The gradients of model parameters $\theta_{t|r,z}^{t}$, $\theta_{v|r,z}^{v}$, $\theta_{r|u}^{r}$, $\theta_{z|u}^{z}$ and $\theta_{w|z}^{w}$ are provided as follows.

$$\frac{\partial J_{PGM}}{\partial \theta_{t|r,z}^{t}} = d(t, z, r) - \theta_{t|r,z}^{(t)} \sum_{j=1}^{|D_t|} \theta_{z|u_j}^{(z)}\, \theta_{r|u_j}^{(r)} \tag{20}$$

$$\frac{\partial J_{PGM}}{\partial \theta_{v|r,z}^{v}} = d(v, z, r) - \theta_{v|r,z}^{(v)} \sum_{j=1}^{|D_v|} \theta_{z|u_j}^{(z)}\, \theta_{r|u_j}^{(r)} \tag{21}$$

$$\frac{\partial J_{PGM}}{\partial \theta_{r|u}^{r}} = d(u, r) - \theta_{r|u}\, N_u \tag{22}$$

$$\frac{\partial J_{PGM}}{\partial \theta_{z|u}^{z}} = d(u, z) - \theta_{z|u}\, N_u \tag{23}$$

$$\frac{\partial J_{PGM}}{\partial \theta_{w|z}^{w}} = d(z, w) - d(z)\, \theta_{w|z}^{(w)} \tag{24}$$

where $d(t, z, r)$ is the number of activities assigned to topic $z$ and region $r$ at time $t$; $d(v, z, r)$ is the number of activities where the POI $v$ is assigned to topic $z$ and region $r$; $d(u, r)$ and $d(u, z)$ are the numbers of $u$'s activities which are assigned to region $r$ and to topic $z$, respectively; $d(z, w)$ is the number of activities where the word $w$ is assigned to topic $z$; $d(z)$ is the number of activities assigned to topic $z$; $D_t$ denotes the set of activities occurring at time $t$; $D_v$ denotes the set of activities occurring at POI $v$; and $N_u$ is the number of $u$'s activities.

For geographical modeling, the parameters $\vec{\mu}$ and $\Sigma$ can be obtained in closed form as Equations (25) and (26).

$$\mu_r = \frac{1}{|D_r|} \sum_{j=1}^{|D_r|} l_j \tag{25}$$

$$\Sigma_r = \frac{1}{|D_r| - 1} \sum_{j=1}^{|D_r|} (l_j - \mu_r)(l_j - \mu_r)^{\top} \tag{26}$$

where $D_r$ represents the set of check-in activities assigned to region $r$ and $l_j$ is the location where the $j$-th activity in $D_r$ occurs.

4. Experiments

In this section, we evaluate our model for POI recommendation with extensive experiments on real-world check-in datasets.

4.1. Dataset

Gowalla. The Gowalla check-in dataset (http://snap.stanford.edu/data/loc-gowalla.html, accessed on 12 January 2023) is generated from February 2009 to October 2010. We use the records generated within the United States and filter out those users with fewer than 20 check-ins and the POIs with fewer than 15 visitors. After filtering, the sparsity of the user–POI check-in matrix is 99.72%.

Foursquare. The Foursquare dataset (https://sites.google.com/site/yangdingqi/home/foursquare-dataset, accessed on 12 January 2023) includes check-in data from April 2012 to September 2013. We use the records generated within New York, Philadelphia and Trenton, and we filter those users with fewer than 10 check-ins as well as those POIs with fewer than 15 visitors. After filtering, the sparsity of the user–POI check-in matrix is 99.56%.

Yelp.
The Yelp Challenge Dataset (https://www.yelp.com/datasetchallenge, accessed on 20 February 2016) comes from the Yelp Dataset Challenge round 7 in 2016. The data contain a large number of geotagged businesses and reviews within several cities. We use the records generated within Phoenix and eliminate those users with fewer than 20 check-ins as well as those POIs with fewer than 15 visitors. The sparsity of the user–POI check-in matrix is 99.78%.

The statistics of the used data sets are shown in Table 3. We partition each dataset into a training set and test set. For each user, we use the earliest 80% check-ins as the training data and the most recent 20% check-ins as the test data. Different learning rate schedules have a significant impact on performance [23], so in order to ensure the fairness of the experiment, we apply a constant learning rate for all models in the experiment. We implement TARE with Tensorflow (http://www.tensorflow.org/, accessed on 12 January 2023) for the training of our model.

Table 3. The statistics of the used data sets.

| Dataset | #user | #POI | #check-in | Sparsity |
|---|---|---|---|---|
| Gowalla | 19,017 | 11,681 | 643,620 | 99.72% |
| Foursquare | 7961 | 10,629 | 459,221 | 99.56% |
| Yelp | 17,624 | 15,040 | 602,445 | 99.78% |

4.2. Baselines for Comparison

We compare TARE with the following six state-of-the-art methods with well-tuned parameters.

• LRT: LRT [20] is a time-enhanced MF model based on observed temporal properties that characterize each user by different latent vectors for different time slots.
• GeoMF: GeoMF [4] is a geographical Weighted Matrix Factorization (WMF) model which integrates geographical influence by modeling users' activity regions and the influence propagation on geographical space.
• JIM: JIM [24] is a joint probabilistic generative model that mimics users' check-in behaviors by integrating the factors of temporal effect, content effect, geographical influence and word-of-mouth effect.
• TRM: TRM [25] is a unified probabilistic generative model that simultaneously discovers the semantic, temporal, and spatial patterns of users' check-in activities, and it models their joint effect on users' decision making for the selection of POIs to visit.
• PACE: PACE [8] is a general and principled combination of CF and SSL based on neural networks that model user preference for POIs. PACE considers both social effect and geographic effect. In addition, PACE is an embedding algorithm based on the node2vec algorithm.
• Geo-Teaser: Geo-Teaser [10] is a temporal POI-embedding model that captures the check-ins' sequential contexts and the various temporal characteristics on different days. In addition, Geo-Teaser is an embedding algorithm based on the word2vec algorithm.

4.3. Evaluation Methods and Metrics

As for the performance comparison with baselines, we apply two widely used metrics, i.e., Accuracy@k and MAP@k, where Accuracy@k measures how often the correct POI appears in the recommendation result, while MAP@k also considers the rank of the correct POI in the result.

Particularly, for each test case $(u, v, l_v, W_v, t_q)$, we use a Gaussian function to generate a coordinate $l_q$ within the circle of radius $d$ centered at $l_v$ to represent the current standing point of user $u$. Thus, a query $q = (u, l_q, t_q)$ is formed for the test case. The set of all queries is represented as $S_{test}$.

For each query $(u, l_q, t_q)$ in $S_{test}$, we calculate the user preferences for POIs and generate a ranked list based on their preference scores. Let $p$ denote the rank of the test item $v$ within this list, counted from 1. Finally, we obtain a top-$k$ recommendation list. If $p \le k$, we have a hit (i.e., the test item $v$ is recommended to the user).

The computation of Accuracy@k proceeds as follows. For a single test, we define hit@k as 1 when $v$ appears in the top-$k$ recommendation list; otherwise, hit@k is 0.
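The per-query hit test can be sketched in a few lines. The function names and the dictionary of preference scores below are illustrative assumptions; `accuracy_at_k` simply averages the hits over test cases, and `map_at_k` averages hit@k divided by the rank $p$, which is one reading of the MAP@k metric named in this section rather than a confirmed transcription of the authors' evaluation script.

```python
def rank_of(scores, target):
    """1-indexed rank p of `target` when POIs are sorted by descending score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked.index(target) + 1


def hit_at_k(rank, k):
    """hit@k is 1 when the test POI is inside the top-k list, else 0."""
    return 1 if rank <= k else 0


def accuracy_at_k(ranks, k):
    """Fraction of test cases whose ground-truth POI is in the top-k list."""
    return sum(hit_at_k(p, k) for p in ranks) / len(ranks)


def map_at_k(ranks, k):
    """Average of hit@k / p, so hits near the top of the list count more."""
    return sum(hit_at_k(p, k) / p for p in ranks) / len(ranks)


# Example: three test cases whose ground-truth POIs were ranked 1st, 2nd and 5th.
example_ranks = [1, 2, 5]
example_accuracy = accuracy_at_k(example_ranks, 3)
example_map = map_at_k(example_ranks, 3)
```

With a 1-indexed rank, the hit condition is $p \le k$; a 0-indexed rank would use $p < k$ instead and could not be used directly as the divisor in the rank-weighted average.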
The overall Accuracy@k is defined by averaging all test cases:

$$Accuracy@k = \frac{\#hit@k}{|S_{test}|} \tag{27}$$

where $\#hit@k$ denotes the number of hits in the test set, and $|S_{test}|$ is the number of all test cases.

MAP@k depends strongly on the position of $v$ in the list (i.e., $p$). When we have a hit, we focus on the value of $p$. MAP@k is defined as follows:

$$MAP@k = \frac{\sum_{i=1}^{|S_{test}|} \frac{hit@k}{p}}{|S_{test}|} \tag{28}$$

4.4. Performance Analysis

In this part, we present the effectiveness of the recommendation methods with well-tuned parameters. Figure 3 shows the performance under Accuracy@k and MAP@k on the Gowalla, Yelp and Foursquare datasets, respectively.

Although the relative performance among the baselines varies across different datasets and metrics, TARE outperforms other competitor models significantly. Compared with the strongest baseline, TARE yields approximately a 12.39% relative improvement in terms of Accuracy@10 and 10.14% relative improvement in terms of MAP@10. Such consistent improvements clearly demonstrate the strength of TARE over the baseline methods.

Taking a closer look at the results, several observations are made: (1) Both LRT and GeoMF are based on the MF model, but the performances vary greatly. The result shows that temporal influence is more critical than geographical property for POI recommendation. (2) The experimental results show that the methods (TARE, TRM, JIM) which consider query time and query location are much better than other methods. Therefore, query time and query location are crucial to the accuracy of recommendation. (3) The embedded method PACE which considers geographic and social factors has a better performance than Geo-Teaser, which only considers geographical factors. The result shows that embedded methods which effectively consider a large number of factors are superior to those only considering a small number of factors.
(4) Although TARE and TRM consider the same factors, TARE takes into account the intrinsic link between time and other factors (POI category and POI location). So, TARE can achieve better performance compared to TRM.

The main reasons why TARE is superior to other methods can be concluded as follows. First, the objects embedded in TARE are events. TARE takes full advantage of all the information in an event. Compared to TARE, other embedding methods only take advantage of a small portion of the information in an event. Second, TARE considers the intrinsic link between multiple factors. TARE explores the intrinsic link between time factors and POI category and the intrinsic relationship between time factors and geographic factors.

Figure 3. Performance compared. Panels: (a) Accuracy@k-Gowalla, (b) Accuracy@k-Yelp, (c) Accuracy@k-Foursquare, (d) MAP@k-Gowalla, (e) MAP@k-Yelp, (f) MAP@k-Foursquare. Horizontal axis: Rank (0 to 20); vertical axis: Accuracy@k or MAP@k. Methods plotted: PACE, Geo-Teaser, GeoMF, LRT, JIM, TRM, TARE.

4.5. Impact of Different Parameters

We fix the number of hidden layers as 4 and the sizes of layers as 128 → 64 → 32 → 16 for neural networks. However, the effectiveness of TARE is subject to the number of topics and regions (i.e., $Z$ and $R$). To this end, we study the effectiveness of event-based embedding by varying $Z$ and $R$. The results are shown in Table 4.
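For concreteness, the fixed network configuration (four hidden layers of sizes 128, 64, 32 and 16, with ReLU activations and a sigmoid output as in Equations (7) to (9)) can be sketched in plain Python. This is a hypothetical illustration: the helper names, the uniform weight initialisation and the input width of 200 (an embedding of length $R + Z$ with $R = Z = 100$) are assumptions of the sketch, and the paper's own implementation uses Tensorflow.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (assumed init scheme)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(input_dim, hidden_sizes=(128, 64, 32, 16), seed=0):
    """Stack the hidden layers and a final weight vector for the sigmoid output."""
    rng = random.Random(seed)
    layers, n_in = [], input_dim
    for n_out in hidden_sizes:
        layers.append(init_layer(n_in, n_out, rng))
        n_in = n_out
    w_out = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]
    return layers, w_out


def predict(embedding, layers, w_out):
    """Forward pass: h^q = ReLU(W^q h^{q-1} + b^q), then sigmoid of a linear readout."""
    h = list(embedding)
    for weights, biases in layers:
        h = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    logit = sum(w * x for w, x in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-logit))


# Example: score one event embedding of length R + Z = 200.
layers, w_out = build_network(input_dim=200)
score = predict([0.01] * 200, layers, w_out)
```

Because only the first layer's input width depends on $R + Z$, varying the number of regions and topics changes the size of that first weight matrix while the rest of the network stays fixed.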
From the results, we observe that the recommendation accuracy of TARE first improves as the number of topics increases and then does not change drastically once the number of topics exceeds 80. Similar observations can be made for the number of regions (i.e., R): the recommendation accuracy of TARE improves as the number of regions increases and then remains stable when the number of regions is larger than 80. The reason is that Z and R represent the model complexity. Thus, when Z and R are too small, the model has limited ability to describe the data. It should be noted that the performance reported in Figure 3 is achieved with Z = 100 and R = 100.

Table 4. Accuracy@10 with different numbers of topics (Z) and regions (R).

| Capacity | R = 20 | R = 40 | R = 60 | R = 80 | R = 100 |
|----------|--------|--------|--------|--------|---------|
| Z = 20   | 0.221  | 0.292  | 0.355  | 0.378  | 0.431   |
| Z = 40   | 0.254  | 0.337  | 0.403  | 0.430  | 0.461   |
| Z = 60   | 0.314  | 0.375  | 0.417  | 0.472  | 0.481   |
| Z = 80   | 0.317  | 0.364  | 0.448  | 0.465  | 0.527   |
| Z = 100  | 0.349  | 0.380  | 0.465  | 0.523  | 0.540   |

5. Related Work

5.1. Latent Factor Model

The latent factor model (LFM) is popular in recommender systems for both implicit and explicit feedback. It assumes the existence of a latent factor space and leverages matrix factorization (MF) to obtain low-dimensional matrices. Probabilistic matrix factorization (PMF) was developed to optimize regularized MF by introducing a Gaussian prior. Many researchers have tried to improve and optimize models based on MF and PMF. Cheng et al. [3] proposed PMFSR to integrate social influence into PMF with social regularization. In [20], Gao et al. proposed a Location Recommendation framework with Temporal effects (LRT), which incorporates temporal influence into an MF model. In addition, Lian et al. proposed the GeoMF model [4] to incorporate geographical influence into a weighted regularized matrix factorization model (WRMF) [26,27]. However, the computational performance of MF-based methods degrades significantly on large-scale data.
Furthermore, MF-based methods rely on global computation, which makes incremental updates difficult.

In addition, probabilistic generative models [28] are also applied to POI recommendation due to their strong interpretability. Zhao et al. [29] proposed a unified sentiment–aspect–region model (SAR) to learn user preference based on the reviews, categories, and locations of POIs. However, it merely obtains topical regions and topical aspects by applying the topic model and sets parameters manually to fuse the different influence factors. JIM [24] was developed later; it integrates the factors of temporal effect, content effect, geographical influence, and word-of-mouth effect, especially for out-of-town users. Different from the fused model, JIM is a joint model [30] that learns several influential factors together. LSARS [31] takes crowd sentiment into account to improve the effectiveness of JIM. However, the linear nature of these models limits their performance.

5.2. Deep Learning Model

Neural networks [32–34] are widely used in various fields and have been shown to perform well in both research and practice. In recent years, a growing number of researchers have integrated deep learning techniques into recommendation systems. These recommendation systems can basically be divided into two modules. The first is an embedding module that embeds users or POIs. The second is the preference learning module, which makes recommendations based on the results of the embedding module. Refs. [8,16,17,35,36] use the node2vec algorithm to embed users and POIs. They establish different user networks and POI networks by considering different factors, and these networks are then embedded by the node2vec algorithm. However, the embedding methods based on node2vec cannot consider the intrinsic link between multiple factors. Refs. [9,10] treat a POI as a word and a POI sequence as a sentence, and use the word2vec algorithm to embed POIs by analyzing POI sequences. Ref. [15] uses a convolutional neural network to embed documents and then applies the results to MF. However, it is difficult for these methods to take other factors into account.

6. Conclusions

In this paper, we propose an event-based embedding for POI recommendation called TARE. Different from existing embedding methods, TARE not only considers the influence of various factors on user behavior in the process of embedding but also considers the intrinsic relationship between these factors. In addition, TARE integrates the time and location of the event into the embedding process to improve the accuracy of the recommendation. Experimental results show that TARE achieves up to a 12.39% improvement in recommendation accuracy compared to up-to-date solutions.

Author Contributions: Methodology, H.L.; Writing – original draft, H.L.; Writing – review & editing, X.G.; Supervision, T.Z. and G.Y.; Funding acquisition, T.Z. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by National Natural Science Foundation of China (62272093).

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Not applicable.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Raza, S.; Ding, C. Progress in context-aware recommender systems: An overview. Comput. Sci. Rev. 2019, 31, 84–97.
2. Bao, J.; Zheng, Y.; Mokbel, M.F. Location-based and preference-aware recommendation using sparse geo-social networking data. In SIGSPATIAL/GIS; ACM: New York, NY, USA, 2012; pp. 199–208.
3. Cheng, C.; Yang, H.; King, I.; Lyu, M.R. Fused matrix factorization with geographical and social influence in location-based social networks. In AAAI; AAAI Press: Washington, DC, USA, 2012.
4. Lian, D.; Zhao, C.; Xie, X.; Sun, G.; Chen, E.; Rui, Y. GeoMF: Joint geographical modeling and matrix factorization for point-of-interest recommendation. In KDD; ACM: New York, NY, USA, 2014; pp. 831–840.
5. Davtalab, M.; Alesheikh, A.A. A POI recommendation approach integrating social spatio-temporal information into probabilistic matrix factorization. Knowl. Inf. Syst. 2021, 63, 65–85.
6. Rahmani, H.A.; Aliannejadi, M.; Baratchi, M.; Crestani, F. Joint Geographical and Temporal Modeling Based on Matrix Factorization for Point-of-Interest Recommendation. In Proceedings of the Advances in Information Retrieval-42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, 14–17 April 2020.
7. Zhang, Z.; Liu, Y.; Zhang, Z.; Shen, B. Fused matrix factorization with multi-tag, social and geographical influences for POI recommendation. World Wide Web 2019, 22, 1135–1150.
8. Yang, C.; Bai, L.; Zhang, C.; Yuan, Q.; Han, J. Bridging collaborative filtering and semi-supervised learning: A neural approach for POI recommendation. In KDD; ACM: New York, NY, USA, 2017; pp. 1245–1254.
9. Feng, S.; Cong, G.; An, B.; Chee, Y.M. Poi2vec: Geographical latent representation for predicting future visitors. In AAAI; AAAI Press: Washington, DC, USA, 2017; pp. 102–108.
10. Zhao, S.; Zhao, T.; King, I.; Lyu, M.R. Geo-teaser: Geo-temporal sequential embedding rank for point-of-interest recommendation. In WWW (Companion Volume); ACM: New York, NY, USA, 2017; pp. 153–162.
11. Liu, X.; Yang, Y.; Xu, Y.; Yang, F.; Huang, Q.; Wang, H. Real-time POI recommendation via modeling long- and short-term user preferences. Neurocomputing 2022, 467, 454–464.
12. Liu, Y.; Yang, Z.; Li, T.; Wu, D. A novel POI recommendation model based on joint spatiotemporal effects and four-way interaction. Appl. Intell. 2022, 52, 5310–5324.
13. Li, Q.; Xu, X.; Liu, X.; Chen, Q. An Attention-Based Spatiotemporal GGNN for Next POI Recommendation. IEEE Access 2022, 10, 26471–26480.
14. Chen, Y.C.; Thaipisutikul, T.; Shih, T.K. A Learning-Based POI Recommendation With Spatiotemporal Context Awareness. IEEE Trans. Cybern. 2022, 52, 2453–2466.
15. Kim, D.H.; Park, C.; Oh, J.; Lee, S.; Yu, H. Convolutional matrix factorization for document context-aware recommendation. In RecSys; ACM: New York, NY, USA, 2016; pp. 233–240.
16. Xie, M.; Yin, H.; Wang, H.; Xu, F.; Chen, W.; Wang, S. Learning graph-based POI embedding for location-based recommendation. In CIKM; ACM: New York, NY, USA, 2016; pp. 15–24.
17. Xie, M.; Yin, H.; Xu, F.; Wang, H.; Zhou, X. Graph-based metric embedding for next POI recommendation. In WISE (2); Springer: Cham, Switzerland, 2016; pp. 207–222.
18. Cho, E.; Myers, S.A.; Leskovec, J. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21–24 August 2011; pp. 1082–1090.
19. Wang, W.; Yin, H.; Chen, L.; Sun, Y.; Sadiq, S.; Zhou, X. ST-SAGE: A Spatial-Temporal Sparse Additive Generative Model for Spatial Item Recommendation. ACM Trans. Intell. Syst. Technol. (TIST) 2017, 8, 48:1–48:25.
20. Gao, H.; Tang, J.; Hu, X.; Liu, H. Exploring temporal effects for location recommendation on location-based social networks. In RecSys; ACM: New York, NY, USA, 2013; pp. 93–100.
21. Wallach, H.M. Topic Modeling: Beyond Bag-of-words. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 977–984.
22. Schmidt, M.; Niculescu-Mizil, A.; Murphy, K. Learning graphical model structure using L1-regularization paths. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-07), Vancouver, BC, USA, 22–26 July 2007; pp. 1278–1283.
23. Liu, H.; Fu, Q.; Du, L.; Zhang, T.; Yu, G.; Han, S.; Zhang, D. Learning Rate Perturbation: A Generic Plugin of Learning Rate Schedule towards Flatter Local Minima. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, 17–21 October 2022; pp. 4234–4238.
24. Yin, H.; Zhou, X.; Shao, Y.; Wang, H.; Sadiq, S.W. Joint modeling of user check-in behaviors for point-of-interest recommendation. In CIKM; ACM: New York, NY, USA, 2015; pp. 1631–1640.
25. Yin, H.; Cui, B.; Zhou, X.; Wang, W.; Huang, Z.; Sadiq, S.W. Joint modeling of user check-in behaviors for real-time point-of-interest recommendation. ACM Trans. Inf. Syst. 2016, 35, 11:1–11:44.
26. Hu, Y.; Koren, Y.; Volinsky, C. Collaborative filtering for implicit feedback datasets. In ICDM; IEEE Computer Society: Washington, DC, USA, 2008; pp. 263–272.
27. Pan, R.; Zhou, Y.; Cao, B.; Liu, N.N.; Lukose, R.M.; Scholz, M.; Yang, Q. One-class collaborative filtering. In ICDM; IEEE Computer Society: Washington, DC, USA, 2008; pp. 502–511.
28. Liu, H.; Zhang, T.; Li, F.; Gu, Y.; Yu, G. Tracking knowledge structures and proficiencies of students with learning transfer. IEEE Access 2020, 9, 55413–55421.
29. Zhao, K.; Cong, G.; Yuan, Q.; Zhu, K.Q. SAR: A sentiment-aspect-region model for user preference analysis in geo-tagged reviews. In ICDE; IEEE Computer Society: Washington, DC, USA, 2015; pp. 675–686.
30. Zhao, S.; King, I.; Lyu, M.R. A survey of point-of-interest recommendation in location-based social networks. arXiv 2016, arXiv:1607.00647.
31. Wang, H.; Fu, Y.; Wang, Q.; Yin, H.; Du, C.; Xiong, H. A location-sentiment-aware recommender system for both home-town and out-of-town users. In KDD; ACM: New York, NY, USA, 2017; pp. 1135–1143.
32. Du, L.; Shi, X.; Fu, Q.; Ma, X.; Liu, H.; Han, S.; Zhang, D. GBK-GNN: Gated Bi-Kernel Graph Neural Networks for Modeling Both Homophily and Heterophily. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 1550–1558.
33. Yu, M.; Li, F.; Liu, H.; Zhang, T.; Yu, G. ContextKT: A Context-Based Method for Knowledge Tracing. Appl. Sci. 2022, 12, 8822.
34. Yu, M.; Zhang, Y.; Zhang, T.; Yu, G. Semantic enhanced top-k similarity search on heterogeneous information networks. In International Conference on Database Systems for Advanced Applications; Springer: Cham, Switzerland, 2020; pp. 104–119.
35. Hang, M.; Pytlarz, I.; Neville, J. Exploring student check-in behavior for improved point-of-interest prediction. In KDD; ACM: New York, NY, USA, 2018; pp. 321–330.
36. Yang, J.; Eickhoff, C. Unsupervised learning of parsimonious general-purpose embeddings for user and location modeling. ACM Trans. Inf. Syst. 2018, 36, 32:1–32:33.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Applied Sciences
Multidisciplinary Digital Publishing Institute