International Journal of Computational Intelligence Systems

Volume 13, Issue 1, 2020, Pages 1699 - 1713

Link Prediction in Social Networks by Neutrosophic Graph

Authors
Rupkumar Mahapatra1, ORCID, Sovan Samanta2, *, ORCID, Madhumangal Pal1, ORCID, Qin Xin3, ORCID
1Department of Applied Mathematics with Oceanology and Computer Programming, Vidyasagar University, Midnapore, 721102, India
2Department of Mathematics, Tamralipta Mahavidyalaya, West Bengal, 721636, India
3Faculty of Science and Technology, University of the Faroe Islands, Tórshavn, 100, Faroe Islands
*Corresponding author. Email: ssamantavu@gmail.com
Corresponding Author
Sovan Samanta
Received 29 July 2020, Accepted 3 October 2020, Available Online 24 October 2020.
DOI
10.2991/ijcis.d.201015.002How to use a DOI?
Keywords
Social network; Neutrosophic graph; Link prediction; Modified RSM index
Abstract

The computation of link prediction is one of the most important tasks on a social network. Several methods are available in the literature to predict links in networks and RSM index is one of them. The RSM index is applicable in the fuzzy environment and it does not incorporate the notion of falsity and indecency parameters which occur frequently in uncertain environments. In the present method, the behaviors of the common neighbor and the other parameters, like nature of job, location, etc., are considered. In this paper, more parameters are included in the RSM index for making it more flexible and realistic and it is best fitted in the neutrosophic environment. Many important properties are studied for this modified RSM index. A small network from Facebook is considered to illustrate the problem.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Nowadays, the use of social networks [1] are progressing very fast. Social networks can be used for many purposes. Many types of social networks are available. These social networks are prepared to grow their business rapidly, and hence the providers of social networks try to increase their networks.

Over the past few years, online social networking has exploded in popularity as a means for people to share information and build connections with others. For communication, marketing, and spreading of news, etc., it becomes a vital instrument. In the social network market, there is a substantial competitive situation, so all social network organizations are trying to enhance their networks' popularity. So popularity directly depends on how many users and edges/relationship are there between users. In social networks, it is essential to know how to improve the number of edges.

A user of a social network wants to connect to another user by nature of the user, therefore, at first, he/she gathers some information like common friends, personality, age, sex, educational background, job and living area.

It is analyzed that the personalities of common friends are proportional to build a link between to unknown friends. There are lots of graph-theoretic measures for link prediction. However, the given data in social networks are not precise all the times. Fuzzy systems capture these uncertainties with a degree of memberships. So, the fuzzy graph gives a more effective result for this calculation. Samanta and Pal [25] introduced many types of fuzzy graphs. Mahapatra et al. [68] presented many applications of the fuzzy graph.

It is a common research topic in social networks on how to improve the number of edges in networks, and many types of methods are used to increase links. Nowell et al. [9] introduced common neighbor (CN) methods of link prediction and it is modified as Jaccard's coefficient [10]. Sorensen [11] introduced Sorensen Index. Adamic/Adar (AA) index was introduced by Adamic and Adar [12].

Almost all the link prediction methods are calculated on the basis of the neighbors. The prediction score for links is based on the number of neighbors. If the number of neighbors increases, the probability of predicting links will also increase. Yet neighbors' behavior is essential. But, in link prediction calculation nature of common neighbor plays a vital role. Mahapatra et al. [13] introduced RSM index for link prediction calculation depending on the nature of common neighbor in the fuzzy graph.

Though capturing false data is a research question, some times, data are displayed with falsity and indeterminacy. The number of friends of person may be assumed as true value. But the number of inactive or fake friends may be assumed as falsity. Sometimes data are not available or the data are contradicting the facts. In these cases, indeterminacy may be taken to capture the notions. All three values, true value, falsity, indeterminacy, are taken in neutrosophic graphs [14]. Manh Tuan et al. [15] proposed link prediction calculation by neutrosophic modelling. In these cases, too, the nature of common friends is ignored. Besides, there are some other realistic notions like locality, jobs, educations which have effect in link prediction calculation. In the proposed method, the RSM index has been updated with these notions in neutrosophic fuzzy environment. This article proposes an advanced idea of RSM index, modified RSM index.

Some Notations

All the basic notations are shown in Table 1.

Notation Meaning
ξ Fuzzy graph
ζ Neutrosophic graph
V Vertex set
E Edge set
TA(x),IA(x),FA(x) True membership value, indeterminacy membership value, falsity membership value of the vertex x of ζ
TB(x,y),IB(x,y),FB(x,y) True membership value, indeterminacy membership value, falsity membership value of the edge (x,y) of ζ
C(x) Set of neighbor of the vertex x
dx Degree of vertex x
(TCi,ICi,FCi) True value, indeterminacy value, falsity value of nature of common neighbors
Sxy Link prediction value between the vertices x and y
(TDi,IDi,FDi) True value, indeterminacy value, falsity value of the other parameters
(TL,IL,FL) True value, indeterminacy value, falsity value of link prediction by modified RSM index
Ŝxy Score of link prediction between the vertices x and y by modified RSM index
Table 1

Some basic notations.

2. PRELIMINARIES

A fuzzy graph ξ=(V,σ,μ) is complete if μ(u,v)=min{σ(u),σ(v)} for all u,vV, where (u,v) denotes the edge between the vertices u and v.

A neutrosophic set A in X is characterized by a truth membership function TA(x), an indeterminacy membership function IA(x) and a falsity membership function FA(x). The functions TA(x), IA(x) and FA(x) are real standard or nonstandard subset of ]0,1+[. That is, TA(x):X]0,1+[, IA(x):X]0,1+[ and FA(x):X]0,1+[ and 0TA(x)+IA(x)+FA(x)3+.

A neutrosophic graph is an order pair ζ=(A,B), where A:V[0,1] is a neutrosophic set in V (nonempty) and B:V×V[0,1] is a neutrosophic relation on V such that

TB(x,y)min{TA(x),TA(y)},
IB(x,y)max{IA(x),IA(y)},
FB(x,y)max{FA(x),FA(y)}
for all x,yV.

2.1. Link Prediction Methods

In a social network, two unknown users may be connect in future. Suppose in Figure 1 consider a small networks, at time t the vertices c and d has no edge but in future at time t there may have some chance to connect each other. This type of chance is calculated by method of link prediction calculation.

Figure 1

Example of link prediction.

Various types of link prediction methods are available, some of these are given below.

2.1.1. Common neighbors (CN)

CNs [9] methods directly depend on number of common neighbors. Suppose, set of neighbor of a vertex x is C(x) then link prediction value by this method is

S(a,b)=|C(a)C(b)|.

2.1.2. Salton index

C(x) is the set of neighbor and dx is the degree of a vertex x then the link prediction value by Salton index [16] defined as

S(a,b)=|C(a)C(b)|dadb.

2.1.3. Jaccard index

Suppose, set of neighbor of a vertex x is C(x) then the link prediction value by Jaccard index [10] is

S(a,b)=|C(a)C(b)||C(a)C(b)|.

2.1.4. Sorensen index

Soresen index [11] of link prediction is defined as

S(a,b)=|C(a)C(b)|da+db.

2.1.5. Hub promoted index

The link prediction value by Hub Promoted index [17] is defined as

S(a,b)=|C(a)C(b)|min(da,db).

2.1.6. Hub depressed index

The link prediction value by Hub Depressed index is defined as

S(a,b)=|C(a)C(b)|max(da,db).

2.1.7. Leicht-Holme-Newman index

The link prediction value by Leicht-Holme-Newman index [18] is defined as

S(a,b)=|C(a)C(b)|(dadb).

2.1.8. Preferential attachment index

The link prediction value by Preferential Attachment index is defined as

S(a,b)=dadb.

2.1.9. AAdar index

The link prediction value by AA index [12] is defined as

S(a,b)=PN(a)N(b)1(logdP).

2.1.10. Resource allocation index

The link prediction value by Resource Allocation index [19] is defined as

S(a,b)=PN(a)N(b)1dP.

2.1.11. RSM index in fuzzy graph

The link prediction value by RSM index [13] in fuzzy graph is defined as

S(a,b)=i=1rNir.

Ni is the nature of the common neighbors and it is calculated by Ni=min{μ(a,xi),μ(b,xi)},xi is the common neighbors between the vertices a, b where μ(a,xi) is the edge membership value of edge (a,xi) of a fuzzy graph and i=1,2,3,,r.

3. MODIFIED RSM INDEX FOR LINK PREDICTION

All above methods except “RSM index” are based on the number of neighbors. As a result, link prediction value depends on the number of neighbors. The number of common neighbors increases, then the probability of link prediction will be increase. However, the nature of neighbors is significant for the calculation of link prediction in “RSM index.” But, there are several other factors which are essential to predict the link between two unknown people. That is, despite no mutual friends, there are still chances to be linked between two unknown people based on other parameters like location, job, education, etc. In the proposed Modified RSM Index, link prediction value is calculated from nature of neighbor and a few other parameters in neutrosophic environment.

Now introduce the Modified RSM Index in neutrosophic environment-

Consider u and v be any two nonadjacent vertices of a neutrosophic graph ζ. Also, let u and v have n number of neighbors as w1,w2,,wn. Then the link prediction between u and v depends on the nature of this common neighbor and some other associated parameters like (i) job, (ii) location, (iii) education, etc.

Now, true value, indeterminacy value, falsity value of nature of common neighbors are denoted by (TCi,ICi,FCi) defined by

TCi=min{TB(u,wi),TB(wi,v)}
ICi=max{IB(u,wi),IB(wi,v)}
FCi=max{FB(u,wi),FB(wi,v)},
where, TB(u,wi),IB(u,wi)FB(u,wi) are the true membership value, indeterminacy membership value, falsity memberships value of the edge (u,wi) and i=1,2,,n.

Also, consider the (TDi,IDi,FDi) be the true membership value, indeterminacy membership value, falsity memberships value of the other m associated parameters.

Then the link prediction between u and v is denoted by (TL,IL,FL) is defined by

TL=i=1nTCi+j=1mTDjm+n
IL=i=1nICi+j=1mIDjm+n
FL=i=1nFCi+j=1mFDjm+n

Score of link prediction by Modified RSM Index

Ŝuv=2+TLILFL3.

3.1. Algorithm to Calculate Score of Link Prediction by the Neutrosophic Graph

Input: ζ=(A,B) be a neutrosophic graph.

Output:- Score of link prediction between two vertices a and b of ζ.

Step 1: Calculate true value, indeterminacy value, falsity value of nature of common neighbors (direct) between a, b are (TCi,ICi,FCi), where i=1,2,3,,n.

Step 2: Calculate (TDi,IDi,FDi) be true membership value, indeterminacy membership value, falsity memberships value of the other m associated parameters.

Step 3: Calculate true value, indeterminacy value, falsity value of link prediction between a and b is (TL,Il,FL).

Step 4: Calculate the score of link prediction by Modified RSM Index is Ŝab=2+TLILFL3.

Example 1.

Consider in Figure 2, a neutrosophic graph with five vertices and the vertices membership value are consider as in Table 2. Edges membership values are consider as in Table 3. Here, we consider three others parameters like (i) job, (ii) location, (iii) education. The membership value between of three parameters between a and b are location (0.7,0.3,0.5), job (0.8,0.2,0.1), education (0.5,0.1,0.2). Now, the nature of this common neighbor c is

TCc=min{0.6,0.6}=0.6
ICc=max{0.6,0.5}=0.6
FCc=max{0.4,0.3}=0.4.
Figure 2

Link prediction by modified RSM index between the vertices a and b.

Similarly, nature of d is TCd=0.5, ICd=0.6, FCd=0.5 and for vertex e is TCe=0.2, ICe=0.5, FCe=0.5. Then the link prediction between a and b is La,b=(TL,IL,FL) is

TL=0.6+0.5+0.2+0.8+0.5+0.76=3.36=0.55
IL=0.6+0.6+0.5+0.2+0.1+0.36=2.36=0.383
FL=0.4+0.5+0.5+0.5+0.1+0.26=2.26=0.367.

Score of link prediction by modified RSM is Ŝab=2+TLILFL3=2+0.550.3830.3673=1.83=0.6

Lemma 1.

The value of nature of a vertex is not fixed.

Proof.

In Figure 3 a neutrosophic graph shown here Q is the common neighbor between the vertices P and R. So, the nature of Q is (0.4,0.5,0.6). Also, Q is the common neighbor between the vertices R and S. In this case nature of Q is (0.3,0.5,0.6). So, the nature of Q is not fixed. Therefore, the nature of a vertex is not fixed.

Figure 3

A neutrosophic graph.

Vertex Membership Value
a (0.8,0.5,0.2)
b (0.7,0.4,0.3)
c (0.7,0.5,0.3)
d (0.6,0.3,0.4)
e (0.3,0.3,0.4)
Table 2

Vertex membership value of Figure 2.

Edge Membership Value Edge Membership Value
(a,c) (0.6,0.6,0.4) (c,b) (0.6,0.5,0.3)
(a,d) (0.5,0.6,0.4) (d,b) (0.5,0.4,0.5)
(a,e) (0.3,0.5,0.5) (e,b) (0.2,0.5,0.4)
Table 3

Edges membership value of Figure 2.

Theorem 1.

Score of link prediction between u and v in a neutrosophic graph ζ by Modified RSM Index is Ŝuv then 0Ŝuv1.

Proof.

Ŝuv is the Score of link prediction between u and v in a neutrosophic graph ζ. Then, Ŝuv=2+TLILFL3 and 0TL1, 0IL1 and 0FL1 are true.

The value of Ŝuv will maximum if the value of TL is maximum and IL, FL are minimum value.

So, the maximum value of Ŝuv is Ŝuv=2+1003=1.

Also, the value of Ŝuv will minimum if the value of TL is minimum and IL, FL are maximum value. So, the minimum value of Ŝuv is Ŝuv=2+0113=0.

Then 0Ŝuv1.

In the modified RSM index ignore the other parameter then following theorems hold.

Lemma 2.

Let ζn(n4) be a neutrosophic cycle graph then link prediction by the modified RSM index and nature of neighbor are equal.

Proof.

In the Figure 4, consider ζ5 is a neutrosophic cycle graph with 5 vertices. Now link prediction between B and E (having common neighbor A) is equal to nature of A. In the cycle neutrosophic graph, there is only one common neighbor and cycle graph less than three vertices is a complete graph. Therefore, link prediction by the modified RSM index in neutrosophic cycle graph is equal to nature of the neighbor.

Figure 4

The link prediction for the neutrosophic cyclic graph.

Theorem 2.

Let ζ be a star neutrosophic graph then the link prediction by modified RSM index between any two vertices is equal to the nature of center against those vertices.

Proof.

In Figure 5, consider a star neutrosophic graph with the vertices A,B,D,E,F,G,H,I and C is the center vertex. In this case link prediction between A and B is equal to the nature of the center vertex C against the vertices A and B (Lemma 1, it is proved that nature of a vertex is not fixed). So, in star neutrosophic graph there is a only one neighbor C. So, the link prediction between any two vertices is equal to the nature of center vertex against those vertices.

Figure 5

The link prediction for the neutrosophic star graph.

4. VERIFICATION OF MODIFIED RSM INDEX IN A SMALL NETWORK FROM FACEBOOK

Few data have been collected from Facebook, which is reflected in Figure 6. A small network of Facebook friends has been considered for the analysis of modified RSM index. All the users of Facebook have been considered as vertices of this network and there exists an edge between two vertices if they are friends in Facebook. Based on these data, a small network has been considered (see Figure 7). Figure 8 shows the mutual friend between two friends. In the considered network, few link-predicted measures have been shown by Modified RSM index. For the considered graph, vertex have true membership value, falsity and indeterminacy membership value. A total number of Facebook friends of a node are assumed as the indicator of true membership value. The normalized value is taken as true membership value of a vertex. Among Facebook friends, few friends are inactive. These friends indicate the falsity membership value of a vertex. The ratio of inactive friends to the total number of friends is assumed as falsity. The number of inactive friends is considered by calculating the “likes/comments” in the last five posts. Now, indeterminacy membership value of a vertex is taken as 1(numberofactivefriendstotalnumberoffriends)2. This is taken as when the number of active friends increases, the indeterminacy value decreases. All the calculation of vertex membership value have been shown in Figure 9.

Figure 6

Source data, data taken from Facebook.

Figure 7

Source graph, data taken from Facebook.

Figure 8

Calculation of edge membership value.

Figure 9

Calculation of vertex membership value.

The true membership value of an edge is calculated from the number of common friends. True membership value of an edge is “Normalized value of mutual friends × minimum true membership value of end vertices,” i.e., μT(x,y)= {Normalized value of mutual friends} × {σT(x)σT(y)}. For the simplification we assumed the indeterminacy membership of an edge is the maximum indeterminacy membership value of end vertices, i.e.,μI(x,y)=σI(x)σI(y). Also, same condition applies for the falsity membership value of an edge, i.e., μF(x,y)=σF(x)σF(y). The membership values of all edges are shown in Figure 8. Also, the neutrosophic graph of this source graph has been shown in Figure 10. The location membership value of all predicted edges is shown in Figure 11. Here, the distance between their (predicted vertex) home address is shown in Figure 11 and the normalized score of this distance taken as true membership value of location. Also, the distance between their (predicted vertex) Facebook address is shown in Figure 11. The calculation of falsity membership value and indeterminacy membership value of location has been shown in Figure 11. Then, the calculation of the link prediction by modified RSM index of all the thirty-six predicted edges is shown in Figure 12. Also, calculation of the value of link prediction by first five methods of the Subsection 3 of all thirty-six predicted edges is shown in Figure 13 and the normalized score of link prediction by first five methods and modified RSM index is shown in Figure 14. Also, a comparison graph with the RSM index and the first five methods is shown in Figure 15.

Figure 10

Neutrosophic graph of the source graph.

Figure 11

Calculation of location membership value.

Figure 12

Calculation of link prediction by modified RSM index.

Figure 13

Calculation of link prediction by common neighbors, Salton Jaccard, Sorensen, Hub Promoted, Hub Depressed, Newman, Adamic-Adar, Preferential and Resource Allocation.

Figure 14

Normalized scores of modified RSM index along with common neighbors, Salton, Jaccard, Sorensen and HUB methods.

Figure 15

Comparison graph with common neighbors, Salton, Jaccard, Sorensen and HUB methods.

Also, calculation of the value of link prediction by another five methods of the Subsection 3 of all thirty-six predicted edges is shown in Figure 16 and the normalized score of link prediction by another five methods and modified RSM index is shown in Figure 17. Also, a comparison graph with the RSM index and another five methods is shown in Figure 10.

Figure 16

Normalized scores of modified RSM index along with HUB Depressed, Newman, Adamic-Adar, Preferential and Resource Allocation indexes.

Figure 17

Comparison graph with HUB Depressed, Newman, Adamic-Adar, Preferential and Resource Allocation.

Here, the common neighbor method gives the same value for maximum prediction links. However, the modified RSM index gives the different value of predictions links compare to the common neighbor method. For example, consider a link in serial number 6 of Figure 14. It is the link between JM and AM. The common neighbor method only counts the number of common neighbors. However, our method modified RSM index checks the nature of common neighbors and other parameters. Also, this value considers the neutrosophic environment. Thus the results of our proposed methods depend on the human behaviors and other parameters like the location. HUB Promoted index gives high value for the prediction links compare to modified RSM index. The score of link prediction of the links of serial numbers 6,7,17,28,34 of Figure 14 is different from our proposed method. Jaccard method gives the score proportional to our proposed method except for some links. Few results are upper than the modified RSM index, and some are lower than abovementioned links.

The almost similar score is given for HUB depressed and modified RSM index except for some links. Leicht-Holme-Newman index gives identical results with modified RSM index except for that specific links of serial numbers of Figure 16 are 3,4,5,18,22,34. The almost same scores are given for the methods AA and Preferential attachment. Thus it can be decided that the modified RSM index is the considerable expected result for link prediction than other existing methods.

5. CONCLUSION

In the modified RSM index, few limitations are there. Calculation of true value, falsity and indeterminacy from crisp data is not easy to capture. There are no available methods to find such data. In this study, some parameters which display the results are assumed for true value, some parameters which are taken wrongly in the displayed results are assumed as falsity, and the parameters which are neutral for showing the results are indeterminacy. Otherwise, this method captures all the notions of link prediction. This link prediction captures the nature of common neighbors as well as jobs, education and living places are assumed in the calculations.

CONFLICTS OF INTEREST

The authors do not have any kinds of conflicts of interests.

AUTHORS' CONTRIBUTIONS

Mr. Rupkumar Mahapatra, Dr. Sovan Samanta and Prof. (Dr.) Madhumangal Pal equally contributed to preparing the article. Prof.(Dr.) Qin Xin verified the results and statements and guided the work.

ACKNOWLEDGMENTS

The authors are highly thankful to the Editor-in-Chief, guest editor of the special issue and reviewers of the paper. The authors are also grateful to Mr Rajib Dolai, Assistant Professor, Economics, Tamralipta Mahavidyalaya for helping in data collection.

REFERENCES

3.S. Samanta and M. Pal, Fuzzy threshold graphs, CIIT Int. J. Fuzzy Syst., Vol. 3, 2011, pp. 360-364.
4.S. Samanta and M. Pal, Irregular bipolar FGs, Int. J. Appl. Fuzzy Sets, Vol. 2, 2012, pp. 91-102.
11.T. Sorensen, A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons, Biol. Skr., Vol. 5, 1948, pp. 1.
16.G. Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Auckland, New Zealand, 1983.
Journal
International Journal of Computational Intelligence Systems
Volume-Issue
13 - 1
Pages
1699 - 1713
Publication Date
2020/10/24
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891
DOI
10.2991/ijcis.d.201015.002How to use a DOI?
Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

Cite this article

TY  - JOUR
AU  - Rupkumar Mahapatra
AU  - Sovan Samanta
AU  - Madhumangal Pal
AU  - Qin Xin
PY  - 2020
DA  - 2020/10/24
TI  - Link Prediction in Social Networks by Neutrosophic Graph
JO  - International Journal of Computational Intelligence Systems
SP  - 1699
EP  - 1713
VL  - 13
IS  - 1
SN  - 1875-6883
UR  - https://doi.org/10.2991/ijcis.d.201015.002
DO  - 10.2991/ijcis.d.201015.002
ID  - Mahapatra2020
ER  -