International Journal of Computational Intelligence Systems

Volume 13, Issue 1, 2020, Pages 698 - 705

Individual Heterogeneous Learning with Global Centrality in Prisoner Dilemma Evolutionary Game on Complex Network

Authors
Zundong Zhang1, 2, *, ORCID, Yifang Zhang3, ORCID, William Danziger4
1College of Electrical and Control Engineering, North China University of Technology, No. 5 Jinyuanzhuang Road, Beijing, Shijingshan, China†
2Department of Civil and Environment Engineering, University of Washington, 3760 E. Stevens Way NE, Seattle, WA, US‡
3College of Electrical and Control Engineering, North China University of Technology, No. 5 Jinyuanzhuang Road, Beijing, Shijingshan, China
4Department of Civil and Environmental Engineering, University of Washington, 121G More Hall, Seattle, WA, US

The permanent address of the author.

The present address of the author.

*Corresponding author. Email: zyfncut@163.com
Corresponding Author
Zundong Zhang
Received 6 November 2019, Accepted 2 April 2020, Available Online 17 June 2020.
DOI
10.2991/ijcis.d.200603.002How to use a DOI?
Keywords
Evolution game; Individual heterogeneity; Global centrality; Cooperation rate
Abstract

The influence of individual heterogeneity on the evolutionary game has been studied extensively in recent years. Whereas many theoretical studies have found that the heterogeneous learning ability effects cooperation rate, the individual learning ability in networks is still not well understood. It is known that an individual's learning ability is influenced not only by its first order neighbors, but also by higher order individuals, and even by the whole network. At present, existing methods to represent individual learning ability are based on degree centrality, resulting in ignoring the global centrality of nodes. In this paper, we design a method for describing the heterogeneous learning ability by taking advantage of a pre-factor θx related to the node betweenness. And a parameter α is used to tune θx. Experiments show that individual heterogeneous learning ability is effected by global information. Our findings provide a new perspective to understand the important influence of the global attributes of nodes on the evolutionary game.

Copyright
© 2020 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Cooperative behaviors are widespread in a biological system and human society [1]. However, understanding and explaining the emergence of costly cooperative behaviors among selfish individuals in social dilemmas such as the prisoner's dilemma game (PDG) and the public goods game remains a conundrum [24]. To a certain degree, the evolutionary game theory which assumes that individuals are boundedly rational has successfully explained the evolution of cooperation from various perspectives [5]. Nowak [6] systematically summarizes five mechanisms for promoting cooperation including kin selection, direct reciprocity, indirect reciprocity, network reciprocity, and group selection. In the real world, individuals generally do not interact with all other individuals, but only interact with some other individuals, thus forming a corresponding network structure. On a network with a certain structure, collaborators can form clusters to help each other, so that the level of cooperation can be maintained, which also creates space reciprocity [7]. The Evolution of Cooperation, written by Axelrod, first focuses on the impact of spatial structure on cooperative evolution [8]. In 1992, Nowak and May studied the prisoner's dilemma game on the grid and found that the spatial structure can not only increase the cooperative frequency but also produce rich spatial patterns under certain conditions [7,9]. Later, due to the rise of complex network theory, the research on spatial evolution game was pushed to a new climax.

In recent years, extensive research has been conducted on PDG on complex networks of various spatial structures, such as on rule networks: Szabó et al. adopt a random strategy updating rule to study the prisoner's dilemma game behavior on a square lattice [10]; Perc et al. systematically studied the impact of stochastic payoff variations with different distributions on the evolution of cooperation in the spatial PDG[11]. Inspired by Wu et al. [12,13], Szolnoki and Szabó studied the effects of individual heterogeneity on cooperative behavior in the prisoner's dilemma game on Kagome lattice and grid networks [14]. Yu and Wang study the PDG on square lattice by interacting the deterministic and Data envelopment analysis efficient rule into adaptive rules [15]. For small world networks: Santos et al. compared the differences in prisoner's dilemma behaviors between WS small-world networks and random regular networks generated by random exchange edges [16]. Ren et al. investigated the effect of randomness in both the topological randomness in individual relationships and the dynamical randomness in decision makings on the evolution of cooperation on homogeneous small-world networks [17]. Fu et al. studied the influence of degree heterogeneity on cooperative behavior based on the NW small world network model, indicating that appropriate heterogeneity can effectively promote the emergence of cooperative behavior in the network [18]. Santos et al. first studied the evolution of cooperation on scale-free networks [1921] and found that BA scale-free networks could greatly improve the level of network cooperation. As a continuation of the work of [1921], Gomez-Gardenes et al. study the effect of clustering on the organization of cooperation by analyzing the evolutionary dynamics of the “Prisoner's Dilemma” on scale-free networks with a tunable value of clustering [22], Pusch et al. also reached a similar conclusion [23]. Ren et al. proposed the preferential learning mechanism on PD games. Compared with no preference, the addition of preference mechanism can effectively improve the cooperation frequency in BA scale-free networks [24]. Recently, Szolnoki et al. found that the diversity in the reproduction capability, related to inherently different attitudes of individuals, can enforce the emergence of cooperative behavior among selfish competitors [25]. Here, Lima and Tarik study an agent-based model of the evolution of tag-mediated cooperation on ER graphs [26].

In addition to the network structure, it is also found that when using evolutionary games on complex networks to study cooperation issues, many other mechanisms also have an impact on cooperative behavior and evolutionary dynamics [2731]. Yang and Chen found punishment is an effective way to sustain cooperation among selfish individuals [32]. Wang and Li introduce the concept of social influence into the model to enable cooperators and extortioners to be evolutionarily stable [33]. Qin and Chen reveal that the introduction of neighborhood diversity elevates the level of cooperation in various types of social dilemmas [34]. In the real world, both biological individuals and humans exhibit rich diversity or heterogeneity, such as individual learning to teach heterogeneity [14,35], heterogeneity in the income matrix [11,36], conformity-driven reproductive ability [37], degree of heterogeneity [18], and so on. Lin and Yang propose an opinion dynamics model to study the effects of heterogeneous influence of individuals on the global consensus [38]. The evolutionary game model research on complex networks found that the heterogeneity of individuals plays an important role in the emergence and maintenance of cooperative behavior. Motivated by degree heterogeneity research, an individual's learning ability algorithm based on node degree is proposed to study cooperative behaviors in both prisoners dilemma [39] and public goods game [40]. In this algorithm, different learning abilities are designed according to the degree of the player, and the change of the cooperation level of the network is observed accordingly. Therefore, the algorithm of individual learning ability based on node degree provides insight into how cooperative behaviors emerge and evolve in social dilemmas. Using node degrees to represent individual learning ability means that the individual's learning ability is only related to its first order neighbor's information.

Actually, in reality, the connection between individuals is not limited to local, and the individual's learning ability is also affected by global information. Therefore, this paper proposes a method for describing the heterogeneous learning ability by taking advantage of a pre-factor θx related to the node betweenness. And a parameter α is used to tune θx. The effects on the evolution of cooperation are studied by considering the inhomogeneous activity of learning ability in PDG on regular lattices, small-world networks, random networks, and scale-free networks. Our goal is to address the following questions:What impact does the learning ability represented by the individual's global attributes have on the emergence and evolution of cooperation? Is there any difference in the level of cooperation among different network structures? If the difference does exist, which network structures cooperation level is more affected by individual learning ability.

The remainder of the paper is organized as follows: In Section 2, we describe the particularities of the evolutionary prisoner's dilemma game and the strategy updating rule used in this study. The main simulation results and theoretical analyses are presented in Section 3. Finally, the main conclusions are summarized in Section 4.

2. EVOLUTIONARY PRISONER'S DILEMMA GAME

Game theory provides a useful framework for describing the evolution of systems consisting of selfish individuals. The PDG as a metaphor for investigating the evolution of cooperation has drawn considerable attention.

In the PDG, two players simultaneously choose whether to cooperate or defect. Mutual cooperation results in payoff R for both players, whereas mutual defection leads to payoff P gained both. If one cooperates while the other defects, the defector gains the highest payoff T, while the cooperator bears a cost S. The prisoner's dilemma payoff matrix is shown in Table 1.

Table 1

The prisoner's dilemma game's (PDG) payoff matrix.

This thus gives a simply rank of four payoff values: T>R>P>S. One can see that in the PDG, it is best to defect regardless of the co-players decision to gain the highest payoff T.

The individuals can follow only two simple strategies: C (cooperate) or D (defect), described by

z=10 or 01(1)
respectively. Each individual plays the PDG with its neighbors defined by their spatial relationships. The total income of the player x can be expressed as
Px=yΩxzxTAzy(2)
where zx and zy denote the strategy of node x and y. Ωx is all neighbors of node x. A denotes the payoff matrix. The payoff matrix has rescaled forms in
A=10b0(3)
where 1<b<2. An individual can be either a cooperator or a defector. All pairs of connected individuals play the game simultaneously and gain benefits according to the payoff parameters mentioned in the last section. During the evolutionary process, each player is allowed to learn from one of its neighbors and update its strategy at each round.

The probability of a player x selecting one of its neighbors y can be defined as f(y)vf(v), where the sum runs over the set of neighbors of x. f() could be degree, betweenness or one of other characteristics of a node. The assumption takes into account the fact that individuals with more interactions usually cause more attraction in society. In other words, well-known persons will have more influences than the others. In the initial stage, each individual can either choose to cooperate or defect with equal probability. During the evolutionary process, each individual is allowed to learn from one of its neighbors and update its strategy synchronously with a probability in each generation. On the basis of the previous studies, each player x can choose neighbor y randomly. And player x adopts the neighbor y's strategy with a probability q depending on their payoff difference, which is determined by the Fermi function:

q=11+expPxPyk(4)
where k characterizes the noise introduced to permit irrational choices. Here k is set to 0.1. Px and Py respectively represent the total payoff of player x and y. According to the evolutionism, q reflects the rule of natural selection based on relative fitness.

In this paper, the effects on the evolution of cooperation are studied by considering the inhomogeneous activity of learning ability in PDG on complex networks. We present the heterogeneous learning ability of players related to their global centrality. Therefore, the formula for learning ability uses the redefined q representation

q=θx11+expPxPyk(5)
where the pre-factor θx reflects the strength of node x's heterogeneous learning capabilities. The pre-factor θx can be given by
θx=BxBiα(6)
where B is node betweenness centrality. α is the adjustment factor that tunes the heterogeneous learning ability. The range of α is [3,3].

3. EXPERIMENTS AND RESULT ANALYSIS

3.1. Experimental Data

In this section, we focus on the influence of node betweenness centrality of complex networks on the evolution of the networked PDG. Firstly, we construct networks using four typical models—the scale-free network (BA model), the small-world network (SW model), the random network (ER model), and the grid regular network (GD), as shown in Figure 1. Each example network has 20 nodes.

Figure 1

The four example networks.

Two main mechanisms for forming a scale-free network: growth and preferred attachment. Because complex systems in the real world are evolving open systems, the BA network model essentially characterizes the most important characteristics of real networks. The specific travel rules of the small world model are as follows: starting from a one-dimensional ring nearest neighbor coupling network with N nodes, and then reconnecting all edges in the network randomly with probability p. Obviously, p=0 corresponds to a completely regular network, and p=1 corresponds to a completely random network. By adjusting the p-value, you can control the transition from a completely regular network to a completely random network. The nodes in a random network are connected together in a certain random manner. Although the ER random graph as an actual complex network model has obvious defects, in the last 40 years of the 20th century, the ER random graph theory has been studying complex networks. The basic theory of topology, some of which are still important for the current research on complex networks. The grid rule network is that the nodes are connected in a certain way.

Complex networks have many special properties. The average shortest path length refers to the average of the shortest path lengths among all nodes in the network. The shortest path between any two points in the network is the minimum number of edges from one node to another. The average node betweenness of a network refers to the average of the betweenness of all nodes in the network. A node betweenness refers to the proportion of the shortest path between any two nodes passing through the node to all the shortest paths. The clustering coefficient is a parameter describing the clustering characteristics of the network. The degree of a network node refers to the number of other nodes connected to it. The attributes of the four networks constructed in this article are shown in Table 2.

Average Shortest Path Length Average Node Betweenness Average Clustering Coefficient Average Node Degree Neighborhood Connectivity
BA network 2.67 0.07 0.01 3.4 4.58
ER network 2.10 0.06 0.18 4.11 4.65
GD network 2.96 0.12 0 3.16 3.22
SW network 2.35 0.35 0.3 4.05 4.29
Table 2

The Statistics of the four networks.

The Grid network (a kind regular graph) have long average paths and high clustering (as the nodes tend to be densely connected in groups). The random ER (Erdös-Rényi) graph is generated by starting with a disconnected set of nodes that are then paired with a uniform probability. The ER network has low heterogeneity (most nodes have the same number of connections), the degree distribution is a Gaussian bell-shaped curve. The ER network built after ER algorithm has short average paths and low clustering. The SW network is between the regular and random in their structure and are referred to as small-world (SW) networks. The SW network is very close structurally to many social networks in that they have a higher clustering and almost the same average path than the random networks with the same number of nodes and edges. The SW network also has high modularity (groups of the nodes that are more densely connected together than to the rest of the network). Finally, there is a large class of so-called scale-free (SF) networks characterized by a highly heterogeneous degree distribution, which follows a power-law. In the BA network, there is a few, but significant number of nodes with a lot of connections and theres a trailing tail of nodes with a very few connections at each level of magnification. The BA network's structure combines heterogeneity and randomness, it can have low or high modularity.

3.2. Experimental Analysis

In the following, we discuss the computer simulation results of our model on the four networks. This paper consider networks with finite node number N=20. One of the most important factors is the cooperation rate fc, which is defined as the ratio of the number of cooperators to the number of all nodes in the network. Obviously, the value of the cooperation rate ranges from 0 to 1. When the cooperation rate is 0, all individuals in the network are defectors, and 1 indicates that the network is full of cooperators. In this experiment we used Monte Carlo (MC) simulation. The MC method refers to the use of random numbers to solve certain computer problems, mainly including these three steps: construct or describe the probability process; implement sampling from a known probability distribution; and establish various estimators. The MC algorithm indicates that the more samples, the closer the optimal solution is. In each independent iteration, the cooperation rate is obtained by the averaging over 1000 MC time steps after a long relaxation of 10,000 MC time steps, when the dynamical system equilibrium has been convincingly achieved. The simulation results are averaged over 50 iterations and a different realization of networks is performed in each round.

First, we investigated the overall change of the cooperation rate for different values of b and α in Figure 2. The meanings of positive and negative alpha are as follows: when α>0, for the same alpha, the higher the degree of connection between nodes, the higher the individual heterogeneous learning ability of the nodes; when α<0, for the same alpha, the relationship between nodes the smaller the degree, the higher the individual heterogeneous learning ability of the node. The cooperation level of the network depends on the size of parameter b and adjustment factor α in the whole prisoner's dilemma. The model is equivalent to standard PDG when α=0.

Figure 2

Overall change of the cooperation rate.

We can see the relationship between α and the cooperation frequency by observing the variation of cooperation frequency for the same b under different α. For the four networks, within the whole region of b, the change trend of cooperation rate presents a non-monotonous phenomenon with the increment of α, and the maximum value corresponding to different b is different. In the middle part of the α, the cooperation rate fc reaches its maximum, and the highest cooperation level is obtained around the standard prisoner's dilemma (α=0).

The proportion of collaborators depends on the adjustment factor α and on the size of the parameter b in the entire prisoner's dilemma. In order to observe the relationship between b and the cooperation rate much clearer, we further explore the cooperation frequency as a function of different values of b. This model is equivalent to the standard prisoner's dilemma game when α=0.

For the prisoner's dilemma game on the BA network (in Figure 3), when α<0, the cooperation rate gradually decreases with the decrease of α. When b1.9, all the individuals in the network are defectors. When α>0, the cooperation rate gradually decreases with the increase of α. When b1.7, all individuals in the network are defectors. Therefore, compared with α<0 or α>0, when α=0, the cooperation rate level of the entire network is the highest.

Figure 3

The change of the cooperation rate in the BA model.

In the SW network as shown in Figure 4, when α<0, the cooperation rate gradually decreases with the decrease of α. When b1.4, the cooperation rate of the network is close to 0. Almost all the individuals in the network are defectors. When α>0, the cooperation rate gradually decreases with the increase of α. When b1.5, the cooperation rate of the network is close to 0. At this time, almost all the individuals in the network are defectors. Compared with α<0 or α>0, when α=0, the cooperative rate level of the whole network is the highest.

Figure 4

The change of the cooperation rate in the SW model.

For the prisoner's dilemma game on the ER network in Figure 5, when α<0, the cooperation rate gradually decreases with the decrease of α. And the downward trend is basically consistent with α=0. When α=0 and b=2, all individuals in the network are defectors. When α>0, the cooperation rate gradually decreases with the increase of α. But the downward trend is completely different from that of α=0. The cooperation rate drops sharply at b=1.3. When b1.4, the cooperation rate of the network is close to 0. At when, there are only a few partners in all the individuals in the network. And the rest are almost all defectors. Compared with α<0 or α>0, when α=0, the cooperation rate level of the whole network is the highest.

Figure 5

The change of the cooperation rate in the ER model.

In the GD network (Figure 6), when α<0, the partner frequency gradually decreases with the increase of b. When b1.4, the cooperative rate on the network is 0. When α>0, As α decreases, the cooperative rate rises at b=1.1, then gradually decreases. When b1.4, the cooperative rate of the network is close to 0. At when, all the networks individuals are almost defectors. The cooperative frequency decreases rapidly with the increase of α. The falling speed is extremely fast. When b1.2, the cooperative rate is close to 0. At that point, almost all the individuals in the network are defectors. Compared with α<0 or α>0, when α=0, the cooperation frequency level of the whole network is the highest.

Figure 6

The change of the cooperation rate in the Grid network.

In the BA network (Figure 7a), the results reveal the collaborator's dissipation mechanism more deeply, and we use the cooperative rate as a function of time step. We want to find the relationship between α and cooperate rate, we must choose a fixed b=1.5 as a constant in the range of b. Selecting an intermediate value of b can more clearly see the changing trend of cooperation frequency. We can see from Figure 7a that the cooperation rate of the entire network drops first and then reaches a steady state. In the initial round, each individual chooses to cooperate or betray with the same possibility. Therefore, some nodes with high betweenness may become defectors. We assume that the node X with a large node betweenness value is a defector in the whole network. It can be observed that its first-order neighbors (cooperators) bring a higher return to X. So X has the greatest benefit, which will make it is easier to invade collaborators in its neighbors. In other words, neighbors of X have a greater probability of learning X's strategy. That will cause the first-order neighbors of X to become traitors. And the strategies of the first-order neighbors affect the second-order neighbors' strategies. Finally, the cooperation rate of the whole network will gradually decrease.

Figure 7

The change of the cooperation rate with 10,000 iterations.

For the SW (Figure 7b), ER (Figure 7c), GD (Figure 7d) networks, in the initial stage, the overall cooperation rate of the network is relatively high. Therefore, several nodes with higher betweenness may become defectors, and the rest are collaborators. When α>0, the learning ability of those neighbor nodes who having lower betweenness have been enhanced. So they tend to be influenced and invaded faster by the defectors with higher betweenness, resulting in a rapid decline of the network cooperation rate with the lowering of α. For α>0, the learning ability of high-betweenness nodes is stronger than low-betweenness nodes. Therefore, the propagation speed of defect has decreased, and the cooperation level of the network has also started to decline after maintaining a high level for a while. With the increase of α, the rate of decline gradually slows down. However, due to the characteristic of θx, the learning ability of the node is weaker than the standard prisoner's dilemma.

In conclusion, when the network size is large enough and there are enough nodes in the network, the cooperator will get a chance to invade high-betweenness nodes. Once a node with high betweenness becomes a collaborator, its neighbors will cooperate. On the other side, once a collaborator invades a central node, its fitness will increase rapidly. And it will be difficult to be invaded by defectors. When α<0, nodes with low betweenness will easily change their strategies and become collaborators because of their enhanced learning ability. Nodes with high betweenness have weakened learning ability, which in turn can keep their own cooperation strategies, as well as the cooperation rate of the entire network is also kept.

The advantage of our proposed method is that it introduces the pre-factor related to the betweenness to represent the heterogeneous learning ability of individuals, considers the global attributes of nodes, and provides a new perspective for understanding the important impact of global attributes of nodes on evolutionary games. The disadvantage is that due to the limited computing power of our computer, the number of nodes in the network model we construct is not enough, which may cause the obtained experimental results to be not fine enough, which can only indicate a general trend.

4. CONCLUSIONS

In summary, previous experimental analysis shows that the individual's heterogeneous learning ability can improve the level of cooperation. To have a more detailed observation about microscopic dynamical processes of diversity-promoted cooperation, we introduce the global attributes of individuals into individual learning abilities. In the paper, we design a pre-factor θx, which is related to node betweenness, to adjust the inhomogeneous activity of learning. As well, α is used to adjust the size of individual learning ability. It has been found in experiments that the heterogeneous learning abilities related to individual global attributes can significantly affect the level of cooperation in complex networks.

Furthermore, in most cases, network cooperation is in a high level. Only in some middle values of α, it can significantly inhibit the spread of defectors and improve the level of cooperation of the entire network. By carefully investigating the impact of learning capabilities represented by individual global attributes in collaborative behavior, we find that all individuals in the network can enhance the sustainability of collaboration by reducing learning ability, although this will delay the evolutionary process.

From the conclusions of this study, it is known that the influence of individual global attributes on game strategy is huge. It is not comprehensive to consider the global attributes of nodes only considering the local attributes of nodes. Therefore, our work may be much meaningful in understanding the evolutionary game research of heterogeneous learning ability on complex networks. Our research can be used to solve traffic network congestion problems, such as the problem of optimal control of regional signals. In the future we will make more research in this area.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

AUTHORS' CONTRIBUTIONS

The authors confirm contribution to the paper as follows: study conception and design: Z. Zhang, J. Ban; data collection: Y. Zhang; analysis and interpretation of results: Z. Zhang, Y. Zhang; draft manuscript preparation: Z. Zhang, Y. Zhang. All authors reviewed the results and approved the final version of the manuscript.

ACKNOWLEDGMENTS

This paper is supported by The Chinese the State 13 Five-year Scientic and Tech- nological Support Project (2016YFB1200402), the Big-Data Based Beijing Road Trac Congestion Reduction Decision Support Project (PXM2016014212000036), the Project of the Innovation and Collaboration Capital Center for World Urban Transport Improvement (PXM2016014212000030), and the Basic Research Business Fee Project of Science and Technology Innovation Service Building (110052971921/031).

REFERENCES

24.J. Ren, W.-X. Wang, G. Yan, and B. Wang, Emergence of cooperation induced by preferential learning, OALib J., 2006. arXiv:physics/0603007
Journal
International Journal of Computational Intelligence Systems
Volume-Issue
13 - 1
Pages
698 - 705
Publication Date
2020/06/17
ISSN (Online)
1875-6883
ISSN (Print)
1875-6891
DOI
10.2991/ijcis.d.200603.002How to use a DOI?
Copyright
© 2020 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

Cite this article

TY  - JOUR
AU  - Zundong Zhang
AU  - Yifang Zhang
AU  - William Danziger
PY  - 2020
DA  - 2020/06/17
TI  - Individual Heterogeneous Learning with Global Centrality in Prisoner Dilemma Evolutionary Game on Complex Network
JO  - International Journal of Computational Intelligence Systems
SP  - 698
EP  - 705
VL  - 13
IS  - 1
SN  - 1875-6883
UR  - https://doi.org/10.2991/ijcis.d.200603.002
DO  - 10.2991/ijcis.d.200603.002
ID  - Zhang2020
ER  -