International Journal of Computational Intelligence Systems

Volume 10, Issue 1, 2017, Pages 663 - 676

An orthogonal clustering method under hesitant fuzzy environment

Authors
Yanmin Liu, 1298958905@qq.com
Institute of Sciences, PLA University of Science and Technology, Nanjing, Jiangsu 211101, China
Hua Zhao, zhaohua_pla@163.com
Institute of Sciences, PLA University of Science and Technology, Nanjing, Jiangsu 211101, China
Zeshui Xu*, xuzeshui@263.net
Business School, Sichuan University, Chengdu, Sichuan 610064, China
*Corresponding author.
Received 15 February 2016, Accepted 13 January 2017, Available Online 30 January 2017.
DOI
10.2991/ijcis.2017.10.1.44
Keywords
Hesitant fuzzy set; distance measure; clustering analysis; orthogonal method
Abstract

In this paper, we investigate clustering techniques for hesitant fuzzy information. Considering that the distance measure is one of the most widely used tools in clustering analysis, we first point out the weaknesses of the existing distance measures for hesitant fuzzy sets (HFSs), and then put forward a novel distance measure for HFSs, which involves a new hesitation degree. Moreover, we construct the distance matrix and choose different values of λ so as to obtain the λ-cutting matrix, each column of which is treated as a vector. After that, an orthogonal clustering method is developed for HFSs. The main idea of this clustering method is that column vectors of the cutting matrix that are not orthogonal should be clustered into the same group, and according to the different values of λ, the procedure is repeated until all the cases are considered. Finally, two numerical examples are given to demonstrate the effectiveness of our algorithm.

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).

1. Introduction

In our daily life, there exist some phenomena that cannot be described accurately in mathematical forms, for example, the words “fast”, “big”, “beautiful”, “rich” and so on. This encourages people to find a more effective way to study and handle such uncertain problems. In 1965, Zadeh1 originally put forward the concept of fuzzy set, which opened the door to fuzzy theory research. Since then, fuzzy set theory has been developed from various angles. In 1986, Atanassov extended the fuzzy set to the intuitionistic fuzzy set 2, which takes into account the membership degree, the non-membership degree and the hesitance degree. Compared with the fuzzy set, it includes more details to distinguish different objects. Later, Atanassov and Gargov introduced the concept of interval-valued intuitionistic fuzzy set3, in which the membership degree and the non-membership degree are interval-valued. There are also some other kinds of fuzzy sets, such as the type-2 fuzzy set 4,5, the interval type-2 fuzzy set 6, the type-n fuzzy set 4, and the hesitant fuzzy linguistic set 7,8, etc.

The fuzzy sets mentioned above can solve many decision making problems appropriately, but they do not perform well when analyzing hesitant fuzzy information 7,8. In some decision making problems, when considering the degree to which an alternative satisfies a criterion, different experts have different opinions. Some experts who are optimistic may assign 0.9, some may give 0.6, and some 0.1. Their attitudes cannot be changed and none of the opinions can be ignored. Clearly, this issue cannot be solved by using the fuzzy sets discussed above. The hesitant fuzzy set 7,8 proposed by Torra and Narukawa can handle the issue more convincingly. Xia and Xu 9,10 gave the mathematical expression of the HFS, and called its components hesitant fuzzy elements (HFEs). Obviously, using the HFE {0.9,0.6,0.1} to describe the above situation is much more reasonable than the interval-valued fuzzy set [0.1,0.9] or a single value. Therefore, it is necessary and essential to use HFEs (or HFSs) to describe the hesitant information in decision making problems. So far, a lot of research has been done to develop hesitant fuzzy set theory. Torra et al. 7,8 explored in depth the differences between the HFS and other fuzzy sets, and gave some basic operations of HFSs, such as complement, union and intersection. Xia and Xu 9,10 developed some aggregation operators for HFEs, and also studied the distance, similarity and correlation measures of HFSs. Rodriguez et al.11 presented an overview on hesitant fuzzy sets and pointed out further research directions. Xu and Xia 10 proposed several distance measures for HFSs and applied them to clustering analysis. In addition, distance measures have been widely used in decision making 12-14, medical diagnosis 15 and pattern recognition 16, etc. However, the existing distance measures have some drawbacks, such as changing the original information and ignoring the hesitation degree. To overcome these drawbacks, in this paper we develop a novel method to calculate the distance involving hesitation degrees, compare it with the existing distance measures, and finally apply the proposed distance measure to clustering analysis.

Clustering is a process that divides a set of different kinds of objects into a few groups, generally according to their characteristics, and it has been widely used in various fields, such as economics, computer science, astronomy and so on 17-19. Similar objects are clustered into the same group. Based on the properties of the generated clusters, clustering techniques are generally classified into the partitional clustering method and the hierarchical clustering method21. A partitional clustering algorithm divides the data into several partitions based on a certain objective function, where each partition represents a cluster, such as the K-means clustering algorithm. A hierarchical clustering algorithm, on the other hand, gathers all the data into a tree-shaped structure: it compares the distances or similarities between each pair of clusters in each layer and forms a new layer, and through continuous cycles the clustering results are finally obtained. Recently, some scholars have been conducting research on hesitant fuzzy clustering techniques. Chen et al.22 constructed a correlation matrix by calculating the correlation coefficients for each pair of HFSs, then formed the correlation coefficient equivalent matrix, and finally clustered the HFSs based on the λ-cutting matrix. Zhang and Xu 23 proposed a minimal spanning tree (MST) clustering technique, but drawing the MST is rather complicated. Zhang and Xu 21 adopted the traditional agglomerative hierarchical clustering method 24, which calculates the centers of the groups again and again and thus also needs considerable computational effort. Chen et al. 25 put forward a clustering method of HFSs based on the K-means clustering algorithm, which takes the results of hierarchical clustering as the initial input.

Looking into the clustering algorithms discussed above, we find that some need a large amount of computational effort, some need complicated transformations, and most of them take a lot of time to finish clustering. To overcome these issues, in this paper we propose a novel orthogonal clustering method for HFSs. In this method, we first construct the distance matrix using our new distance measure for HFSs. After that, we choose a confidence level λ to obtain the λ-cutting matrix, every column of which is treated as a vector. If two vectors are not orthogonal, then we cluster the corresponding objects into the same group.

The remainder of the paper is organized as follows: Section 2 reviews some basic knowledge related to HFSs. Section 3 gives a novel method to calculate the distance and defines a new concept of hesitation degree. In Section 4, we put forward the orthogonal clustering method for HFSs. We illustrate the effectiveness of the method via two numerical examples in Section 5. The paper ends with some concluding remarks in Section 6.

2. Preliminaries

2.1. The basic knowledge related to HFSs

In some decision making problems, many experts are needed to express individual opinions on the same problem. When considering the degree that an alternative satisfies a criterion, different experts have different opinions, and thus, different people may assign different values to the alternatives. To solve these problems, Torra 7 generalized the fuzzy set to hesitant fuzzy set (HFS), in which the membership degree of an element to a set is expressed as several possible values between 0 and 1.

Let X = {x1, x2, …, xm} be a fixed set, a HFS A on X is defined as a function that when applied to X returns a subset of [0,1]. Xia and Xu9 expressed it by a mathematical symbol as follows:

$$A=\{\langle x, h_A(x)\rangle \mid x\in X\}$$
where x represents a criterion and hA(x) is a set of some values in [0,1], representing the possible membership degrees of the element x to the set A. For convenience, Xu and Xia 9 called hA(x) a hesitant fuzzy element (HFE).

Example 1.

Let X = {x1, x2, x3} be a fixed set, and let hA(x1) = {0.1,0.2,0.3}, hA(x2) = {0.4,0.5}, and hA(x3) = {0.3,0.5,0.6} be the HFEs of xi (i = 1,2,3) to a set A. Then the HFS A can be expressed as:

$$A=\{\langle x_1,(0.1,0.2,0.3)\rangle,\langle x_2,(0.4,0.5)\rangle,\langle x_3,(0.3,0.5,0.6)\rangle\}$$

To compare the HFEs, we introduce the score of the HFE9:

$$s(h)=\frac{1}{l_h}\sum_{\gamma\in h}\gamma$$
where $l_h$ is the number of elements in h, generally called the length of h, and γ denotes the elements in h. In fact, the score of an HFE is the average value of the numbers in h. For two HFEs h1 and h2, if s(h1) > s(h2), then h1 is superior to h2, denoted by h1 > h2. However, if s(h1) = s(h2), then the score alone cannot distinguish which one is bigger. For example, h1 = {0.1, 0.1, 0.7} and h2 = {0.2, 0.4} are two HFEs. We can easily get the scores of these two HFEs: s(h1) = 0.3 and s(h2) = 0.3. Since s(h1) = s(h2), we cannot distinguish these two HFEs by the score. Clearly, even if two HFEs h1 and h2 have the same score, their deviation degrees may be different. To better compare the HFEs, Chen et al.22 defined the concept of deviation degree: let the HFE h(x) = {x1, x2, …, xn}; then the deviation degree of h(x) is expressed as:
$$\sigma(h)=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}}$$
where $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$. The deviation degree is just the standard deviation of the values xi.
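
For concreteness, the following is a minimal Python sketch of the score and the deviation degree of an HFE (the function names `score` and `deviation` are ours, not from the paper); for the two HFEs above it confirms that the scores coincide while the deviation degrees differ.

```python
from math import sqrt

def score(h):
    """Score of an HFE: the average of its membership values."""
    return sum(h) / len(h)

def deviation(h):
    """Deviation degree of an HFE: the standard deviation of its values."""
    mean = sum(h) / len(h)
    return sqrt(sum((x - mean) ** 2 for x in h) / len(h))

h1 = [0.1, 0.1, 0.7]
h2 = [0.2, 0.4]
print(score(h1), score(h2))          # both are 0.3 (up to floating point)
print(deviation(h1), deviation(h2))  # the deviation degrees differ
```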

The distance measure is an important tool in clustering analysis with hesitant fuzzy information. Xu and Xia10 gave its concept as follows:

Definition 110.

Let A1 and A2 be two HFSs defined on X, then the distance measure between A1 and A2 is defined as d (A1, A2), which should satisfy:

  1. (1)

    0 ≤ d(A1, A2) ≤ 1.

  2. (2)

    d(A1, A2) = 0 if and only if A1 = A2.

  3. (3)

    d(A1, A2) = d(A2, A1).

Based on the above properties of distance measure between each pair of HFSs, we can construct the distance matrix below:

Definition 210.

Let Ai (i = 1,2, ···, m) be m HFSs; then $D=(d_{ij})_{m\times m}$ is called a distance matrix, where dij = d(Ai, Aj) is the distance measure between Ai and Aj, and dij should satisfy the following properties:

  1. (1)

    0 ≤ dij ≤ 1 for all i, j = 1,2, ···, m.

  2. (2)

    d(Ai, Aj) = 0 if and only if Ai = Aj.

  3. (3)

    dij = dji for all i, j = 1,2, ···, m.

Definition 324.

If D = (dij)m×m is a distance matrix, then we define $D_\lambda=(d_{ij}^{\lambda})_{m\times m}$ as the λ-cutting matrix of D, where

$$d_{ij}^{\lambda}=\begin{cases}0, & \text{if } d_{ij}<\lambda,\\ 1, & \text{if } d_{ij}\geq\lambda,\end{cases}\qquad i,j=1,2,\ldots,m$$
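
In code, the λ-cutting matrix is simply an element-wise thresholding of the distance matrix. A minimal sketch (the helper name `cutting_matrix` and the small test matrix are ours):

```python
def cutting_matrix(D, lam):
    """Lambda-cutting matrix of a distance matrix D (list of lists):
    entry (i, j) is 1 if D[i][j] >= lam, and 0 otherwise."""
    return [[1 if dij >= lam else 0 for dij in row] for row in D]

# tiny usage example with a hypothetical 3x3 distance matrix
D = [[0.0, 0.2, 0.5],
     [0.2, 0.0, 0.4],
     [0.5, 0.4, 0.0]]
print(cutting_matrix(D, 0.4))  # [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
```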

2.2. The existing distance measures for HFSs

Clustering is a process that divides different kinds of elements into a few groups. Elements in the same group have something in common, whereas elements in different groups differ widely. So it is very important to find a suitable method to measure the relationship between different elements. Generally, we estimate this relation by a distance, similarity or correlation measure. Only a good measure can lead to accurate results. In what follows, we review some existing distance measures for HFSs.

In earlier research, many approaches have been proposed to calculate the distance between any two HFSs. Let hA(x) and hB(x) be two HFEs. In most cases, the numbers of values in the HFEs hA(x) and hB(x) may be different. According to Ref. [10], the number of values in the HFE hA(xi) is called the length of the HFE hA(xi), denoted by lhA(xi). That is to say, the lengths of the HFEs may be different. For convenience, Xu and Xia10 put forward some rules as follows: arrange the values in hA(x) and hB(x) in descending order, so that $h_A^{\sigma(j)}(x)$ represents the jth largest value in hA(x). The equation hA(x) = hB(x) holds if and only if $h_A^{\sigma(j)}(x)=h_B^{\sigma(j)}(x)$, j = 1,2, ···, m. Generally, if lhA(xi) and lhB(xi) represent the lengths of hA(xi) and hB(xi), then lxi = max{lhA(xi), lhB(xi)} for each xi in X. Only when the two HFEs have the same length can we proceed further. In the previous algorithms, if the lengths of two HFEs are different, then values are added to the HFE with fewer values until both have the same length. The principle of adding values reflects the risk preferences of the decision makers: optimists may add the maximum value to the HFE, while pessimists, who expect negative consequences, may add the minimum value. To keep a consistent convention, in this paper we extend the shorter one by its maximum value, assuming that the decision makers are all optimistic. For example, for the two HFEs hA(x1) = {0.6,0.5,0.3,0.3,0.3} and hB(x1) = {0.4,0.2}, assuming that the decision makers are optimistic, we extend hB(x1) to h′B(x1) = {0.4, 0.4, 0.4, 0.4, 0.2}.

Let A and B be two HFSs defined on X = {x1, x2, ···, xn}, and let hA(xi) and hB(xi) be their HFEs. Based on the well-known Hamming distance and Euclidean distance, Xu and Xia10 proposed a generalized hesitant normalized distance:

$$d_1(A,B)=\sum_{i=1}^{n}\left[\frac{1}{l_{x_i}}\sum_{j=1}^{l_{x_i}}\left|h_A^{\sigma(j)}(x_i)-h_B^{\sigma(j)}(x_i)\right|^{\lambda}\right]^{1/\lambda}$$
where λ > 0. In particular, if λ = 1, 2, then the generalized hesitant normalized distance reduces to the hesitant Hamming distance and the hesitant Euclidean distance, respectively:
$$d_2(A,B)=\sum_{i=1}^{n}\left[\frac{1}{l_{x_i}}\sum_{j=1}^{l_{x_i}}\left|h_A^{\sigma(j)}(x_i)-h_B^{\sigma(j)}(x_i)\right|\right]$$
$$d_3(A,B)=\left[\sum_{i=1}^{n}\left(\frac{1}{l_{x_i}}\sum_{j=1}^{l_{x_i}}\left|h_A^{\sigma(j)}(x_i)-h_B^{\sigma(j)}(x_i)\right|^{2}\right)\right]^{1/2}$$

If the weight wi, of each element xi is taken into consideration, then the generalized hesitant weighted distance is defined as follows:

$$d_4(A,B)=\sum_{i=1}^{n}w_i\left[\frac{1}{l_{x_i}}\sum_{j=1}^{l_{x_i}}\left|h_A^{\sigma(j)}(x_i)-h_B^{\sigma(j)}(x_i)\right|^{\lambda}\right]^{1/\lambda}$$
where $w_i\geq 0$, $i=1,2,\ldots,n$, and $\sum_{i=1}^{n}w_i=1$.
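
To illustrate the extension rule and these distance measures in code, here is a minimal Python sketch for a single pair of HFEs (the helper names `extend` and `generalized_distance` are ours; the code only follows the conventions stated above, it is not the authors' implementation).

```python
def extend(h, target_len, optimistic=True):
    """Extend an HFE to target_len by repeating its maximum value
    (optimistic) or its minimum value (pessimistic)."""
    filler = max(h) if optimistic else min(h)
    return h + [filler] * (target_len - len(h))

def generalized_distance(hA, hB, lam=2, optimistic=True):
    """Per-criterion generalized hesitant distance between two HFEs:
    extend the shorter one, sort both in descending order, and apply
    the Xu-Xia formula (lam = 1: Hamming, lam = 2: Euclidean)."""
    L = max(len(hA), len(hB))
    a = sorted(extend(hA, L, optimistic), reverse=True)
    b = sorted(extend(hB, L, optimistic), reverse=True)
    inner = sum(abs(x - y) ** lam for x, y in zip(a, b)) / L
    return inner ** (1.0 / lam)

# The extension example from the text: hB is filled with its maximum value 0.4.
hA = [0.6, 0.5, 0.3, 0.3, 0.3]
hB = [0.4, 0.2]
print(extend(hB, len(hA)))              # same values as {0.4, 0.4, 0.4, 0.4, 0.2}
print(generalized_distance(hA, hB, 2))  # the result depends on the chosen extension rule
```

Changing the `optimistic` flag changes the extended HFE and hence, in general, the resulting distance, which is precisely the drawback analyzed in Section 3.1.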

Example 2.

For two HFEs h1(x) = {0.8, 0.1} and h2(x) = {0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1}, we now calculate the distance between them. According to the distance measures given above, we first extend h1(x) until h1(x) and h2(x) have the same length. Suppose that the decision maker is pessimistic; then we should add the minimum value to h1(x), and h1(x) is modified to h′1(x) = {0.8, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}. By the hesitant Euclidean distance, we get the distance between h1 and h2 as d3(h1, h2) = 0.096.

3. A novel method to calculate the distance between HFSs

3.1. The drawbacks of the existing distance measures for HFSs

In Section 2, we have reviewed some distance measures. Clearly, all of the above distance measures satisfy the three conditions in Definition 1. However, we can easily find that these distance measures have some shortcomings. Most importantly, these distance measures are based on the following assumptions:

  • The values in an HFE are arranged in ascending or descending order.

  • Assume that the lengths of each two corresponding HFEs are the same.

However, in most practical decision making problems, the lengths of the corresponding HFEs may be different, and in some cases the differences may be large. It is also possible that the values in hA and hB are already given in a meaningful order, in which case rearranging them may be improper. The existing distance measures thus have the weaknesses described in the introduction. In the following, we give two short examples to illustrate the weakness of the existing methods:

Example 3.

Suppose that we are going to compute the distance between h1(x) = {0.8, 0.1} and h2(x) = {0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1}. In Example 2, we obtained the distance d3(h1,h2) = 0.096. However, in the course of the calculation, we added six values to h1(x), which originally had just two values with an average value of 0.45; after adding the values, the average value becomes 0.1875, which is quite different. Furthermore, if we add the maximum value to the shorter one, supposing that the decision maker is optimistic, then we extend h1(x) to h′1(x) = {0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.1}; the measured distance is then d3(h1, h2) = 0.119, and the average value is 0.7125. Compared with the result obtained before, it is completely different. However, according to the rules given in Ref. [10], both results are reasonable even though they differ very much. This is the weakness of the existing methods: they change the original information when calculating the distances of HFEs. In the following, we propose a novel method to estimate distances so as to avoid the drawback mentioned above.

3.2. A novel distance measure for HFSs

Let A = {< xi, hA(xi) > | xi ∈ X, i = 1,2,···,n} and B = {< xi, hB(xi) > | xi ∈ X, i = 1,2, …, n} be two HFSs on X = {x1, x2, …, xn}. According to the principle discussed before, if the values are given in disorder, we need not rearrange the values in the HFEs in decreasing or increasing order. If the lengths of the corresponding HFEs are different, then adding the minimum value to the shorter one just for computing the distances of HFSs is also redundant. To keep the original information, we may consider a new distance measure between the HFSs A and B as:

$$d_9(A,B)=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|$$
where $l_{h_A(x_i)}$ is the length of $h_A(x_i)$. Now we need to check whether d9(A, B) satisfies Definition 1 or not. Apparently, d9(A, B) = d9(B, A) and 0 ≤ d9(A, B) ≤ 1 hold. What remains to check is whether d9(A, B) = 0 if and only if A = B. However,
$$d_9(A,B)=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|=\frac{1}{n}\sum_{i=1}^{n}\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\left|l_{h_B(x_i)}\sum_{m=1}^{l_{h_A(x_i)}}h_A^{m}(x_i)-l_{h_A(x_i)}\sum_{k=1}^{l_{h_B(x_i)}}h_B^{k}(x_i)\right|$$
If $l_{h_B(x_i)}\sum_{m=1}^{l_{h_A(x_i)}}h_A^{m}(x_i)-l_{h_A(x_i)}\sum_{k=1}^{l_{h_B(x_i)}}h_B^{k}(x_i)=0$ holds for every xi, then d9(A, B) = 0, but in this case A = B cannot be guaranteed. For example, for A = {0.5, 0.4, 0.3} and B = {0.6, 0.5, 0.4, 0.1}, we have d9(A, B) = 0 although A ≠ B. Thus, the condition "d9(A, B) = 0 if and only if A = B" cannot be guaranteed in all cases.
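
To make the counterexample explicit, treat A and B as HFSs over a single criterion (n = 1); then

$$d_9(A,B)=\left|\frac{1}{3\cdot 4}\sum_{m=1}^{3}\sum_{k=1}^{4}\left(h_A^{m}-h_B^{k}\right)\right|=\left|\frac{0.5+0.4+0.3}{3}-\frac{0.6+0.5+0.4+0.1}{4}\right|=|0.4-0.4|=0,$$

although A ≠ B, because the double sum only compares the average values of the two HFEs.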

Through analyzing the data, we find that what leads to this conclusion is that we have not considered the influence of the length and the deviation of the data on the result. These two parameters are also essential, as they reflect the hesitance of the decision makers. In the following, we analyze the importance of the hesitance of the decision makers and develop a novel distance measure considering the hesitation degrees.

Example 4.

Consider two HFEs h1(x) = {0.8} and h2(x) = {0.8, 0.8, 0.8, 0.8, 0.8, 0.8}. According to the rules in Ref. [10], the lengths of these two HFEs are different. For the convenience of calculation, we extend h1(x) to h′1(x) = {0.8, 0.8, 0.8, 0.8, 0.8, 0.8}, so we would conclude that these two HFEs are the same. But considering the number and distribution of the values, the decision makers who provided h2(x) seem to be more hesitant. As a result, the hesitation degrees cannot be ignored.

In previous research, Li26 defined a concept of hesitation degree considering only the lengths of HFEs:

Definition 426.

Let A be a HFS on X = {x1, x2, … xn}. Then the hesitation degree of the HFS A is defined as:

$$\mu(A)=\frac{1}{n}\sum_{i=1}^{n}\mu(h(x_i))$$
where
$$\mu(h(x_i))=1-\frac{1}{l(h(x_i))}$$
and $l(h(x_i))$ is the length of $h(x_i)$.

For the HFE h(xi), the value of µ(h(xi)) represents the hesitation degree of a decision maker when he or she determines the membership degree. If l(h(xi)) = 1, then µ(h(xi)) = 0, which means that the decision maker is quite sure about the membership degree. In contrast, if l(h(xi)) tends to infinity, the hesitation degree µ(h(xi)) tends to 1, which indicates that the decision maker is quite hesitant. For example, for the HFEs h1(x) = (0.9, 0.8, 0.7) and h2(x) = (0.8, 0.5), $\mu(h_1(x))=1-\frac{1}{3}=\frac{2}{3}$ and $\mu(h_2(x))=1-\frac{1}{2}=\frac{1}{2}$.

Superficially, the result mentioned above seems reasonable, and it is clear that the hesitation degree is closely associated with the length of the HFE. However, to a certain extent, this method is not comprehensive. For example, when talking about the membership degree of the same alternative with respect to the same property, one decision maker expresses his/her preference by the HFE h1(x) = (0.9,0.8,0.7), while another expresses the preference with h2(x) = (0.9,0.2,0.1). Obviously, the lengths of h1(x) and h2(x) are the same, so if we evaluate the hesitation degree by the method mentioned above, then the hesitation degrees of the two HFEs will be the same as well, namely µ(h1(x)) = µ(h2(x)). However, by analyzing the data, we can see that the values in h2(x) spread more widely than those in h1(x). This means that the second decision maker is doubtful about the membership degree and actually very uncertain, so in fact µ(h1(x)) < µ(h2(x)); that is why this decision maker gives 0.9, 0.2 and 0.1, three values that are far away from each other. From this example, we can see that considering only the length of the HFE is not enough. When analyzing the hesitation degree, it is essential to take the deviation degree of the data into account as well: the more widely the data are distributed, the bigger the deviation degree is.

Consequently, combining the standard deviation, we put forward a generalized hesitation degree that considers the influence of both the divergence and the length of the HFE.

Definition 5.

Let A be an HFS on X = {x1, x2, …, xn}. The generalized hesitation degree can be expressed as:

$$\mu(h(x))=\alpha\left(1-\frac{1}{l(h(x))}\right)+\beta\sqrt{\frac{\sum_{i=1}^{l(h(x))}(x_i-\bar{x})^2}{l(h(x))}}$$
$$\mu(A)=\frac{1}{n}\sum_{i=1}^{n}\mu(h(x_i))$$
where α and β are the weight coefficients, 0 ≤ α, β ≤ 1 and α + β = 1, and $x_1,\ldots,x_{l(h(x))}$ denote the values in h(x). If α = 0, then we pay no attention to the influence of the length of the HFE; in contrast, if β = 0, then we ignore the standard deviation. For convenience, here we let $\alpha=\beta=\frac{1}{2}$, so the generalized hesitation degree takes the following form:
$$\mu(h(x))=\frac{1}{2}\left(1-\frac{1}{l(h(x))}+\sqrt{\frac{\sum_{i=1}^{l(h(x))}(x_i-\bar{x})^2}{l(h(x))}}\right)$$
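
A minimal Python sketch of this generalized hesitation degree with α = β = 1/2 (the function name `hesitation_degree` is ours); for the HFE {0.5, 0.4, 0.3} it returns approximately 0.3742, the value obtained for hA1(x1) in Example 5 of Section 5.

```python
from math import sqrt

def hesitation_degree(h, alpha=0.5, beta=0.5):
    """Generalized hesitation degree of an HFE: a weighted sum of the
    length term 1 - 1/l(h) and the standard deviation of the values."""
    l = len(h)
    mean = sum(h) / l
    std = sqrt(sum((x - mean) ** 2 for x in h) / l)
    return alpha * (1 - 1 / l) + beta * std

print(round(hesitation_degree([0.5, 0.4, 0.3]), 4))  # 0.3742
```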

To preserve the original information, we combine the generalized hesitation degrees and define a new distance measure between the HFSs A and B as:

$$d_{10}(h_A(x_i),h_B(x_i))=\alpha\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|+\beta\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|$$
$$d_{10}(A,B)=\frac{1}{n}\sum_{i=1}^{n}d_{10}(h_A(x_i),h_B(x_i))$$
where α and β are the weight coefficients, 0 < α, β < 1 and α + β = 1, $l_{h_A(x_i)}$ is the length of $h_A(x_i)$, and $\mu_{h_A}(x_i)$ is the generalized hesitation degree of the HFE $h_A(x_i)$. For convenience, suppose that we have the same preference between the hesitation degree and the membership values, i.e., $\alpha=\beta=\frac{1}{2}$; then the distance measure can be expressed as:
$$d_{10}(h_A(x_i),h_B(x_i))=\frac{1}{2}\left[\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|+\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|\right]$$

It is clear that we have taken the generalized hesitation degree into consideration in the formula (14). Next, we will prove that this distance measure satisfies all the properties in Definition 1.

Proof.

  1. (1)

    Since all the values in an HFE and all the generalized hesitation degrees lie in [0,1], it is obvious that 0 ≤ d10(A, B) ≤ 1.

  2. (2)

    If A = B, then $h_A^{m}(x_i)=h_B^{m}(x_i)$ for any xi ∈ X and any m. Moreover, their lengths and deviation degrees are both the same, so $\mu_{h_A}(x_i)=\mu_{h_B}(x_i)$ for any 1 ≤ i ≤ n. As a result, d10(A, B) = 0. On the contrary, if d10(A, B) = 0, then for any 1 ≤ i ≤ n, $\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|=0$ and $\left|l_{h_B(x_i)}\sum_{m=1}^{l_{h_A(x_i)}}h_A^{m}(x_i)-l_{h_A(x_i)}\sum_{k=1}^{l_{h_B(x_i)}}h_B^{k}(x_i)\right|=0$. Since $\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|=0$, the lengths and the deviation degrees of hA(xi) and hB(xi) are both the same. Thus, $h_A^{m}(x_i)=h_B^{m}(x_i)$ for any xi ∈ X and any m. As a result, the HFS A is equal to the HFS B. That is to say, d10(A, B) = 0 if and only if A = B.

  3. (3)

    It is straightforward that d10(A, B) = d10(B, A).

Considering that the objects xi (i = 1, 2, …, n) may have different weights in some cases, we propose a weighted form of the distance measure for HFSs. Let w = (w1, w2, …, wn) be the weight vector of xi (i = 1,2,…,n), with wi ≥ 0, i = 1,2,···,n, and $\sum_{i=1}^{n}w_i=1$. Then the hesitant weighted distance is defined as:

$$d_{11}(A,B)=\sum_{i=1}^{n}\frac{w_i}{2}\left[\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|+\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|\right]$$
which also satisfies the properties in Definition 1.
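
The following sketch implements d11 on top of the generalized hesitation degree (helper names are ours). The inner double sum in the formula reduces to the difference of the mean membership values, which the code uses directly. Applied to the first two computers of Table 1 in Section 5 with the weight vector w = (0.1, 0.2, 0.3, 0.25, 0.15), it returns about 0.0868, matching the corresponding entry of the distance matrix computed there.

```python
from math import sqrt

def hesitation_degree(h):
    """Generalized hesitation degree with alpha = beta = 1/2."""
    l = len(h)
    mean = sum(h) / l
    return 0.5 * (1 - 1 / l) + 0.5 * sqrt(sum((x - mean) ** 2 for x in h) / l)

def d10(hA, hB):
    """Distance between two HFEs (alpha = beta = 1/2): half the absolute
    difference of the mean membership values plus half the absolute
    difference of the generalized hesitation degrees."""
    mean_diff = abs(sum(hA) / len(hA) - sum(hB) / len(hB))
    return 0.5 * (mean_diff + abs(hesitation_degree(hA) - hesitation_degree(hB)))

def d11(A, B, w):
    """Weighted distance between two HFSs, each given as a list of HFEs."""
    return sum(wi * d10(hA, hB) for wi, hA, hB in zip(w, A, B))

# A1 and A2 from Table 1 (Section 5) with w = (0.1, 0.2, 0.3, 0.25, 0.15)
A1 = [[0.5, 0.4, 0.3], [0.9, 0.8, 0.7], [0.5, 0.4, 0.3], [0.9, 0.6, 0.5, 0.4], [0.5, 0.4]]
A2 = [[0.5, 0.3], [0.9, 0.7, 0.6, 0.6], [0.8, 0.6, 0.5, 0.1], [0.7, 0.5, 0.3], [0.6, 0.3, 0.3]]
w = [0.1, 0.2, 0.3, 0.25, 0.15]
print(round(d11(A1, A2, w), 4))  # 0.0868
```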

In the following work, we apply the novel distance measure developed in this paper to clustering analysis. Actually, there are some existing clustering methods, such as the MST clustering algorithm 23 and the hierarchical clustering algorithm 21, but these methods are somewhat complicated and need much calculation and many transformations. In the following, we propose a straightforward clustering algorithm, called the orthogonal method, for clustering HFSs.

4. An orthogonal method for clustering the HFSs

The main idea of the orthogonal clustering algorithm is not complicated at all: firstly, we utilize the developed distance measure to compute the distance between each two HFSs, and then construct a distance matrix M; secondly, we choose a confidence level λ ∈ [0,1] to get the λ-cutting matrix Mλ of the distance matrix M, and then take each column of the matrix Mλ as a vector, so the matrix Mλ can be expressed as $M_\lambda=(\alpha_1,\alpha_2,\ldots,\alpha_n)$, where $\alpha_j=(\alpha_{1j},\alpha_{2j},\ldots,\alpha_{nj})^T$; finally, we utilize the orthogonal relations among the column vectors to cluster the objects. The detailed process can be described as follows:

  • Step 1. Let {A1, A2, ···, An} be a set of HFSs over X = {x1, x2,…, xm}, representing n samples. Using the distance measure defined before, we calculate the distance between each two HFSs, and then construct a distance matrix M = (dij)n×n, where dij = d(Ai, Aj).

  • Step 2. Choose the confidence level λ ∈ [0, 1], and then construct the corresponding λ-cutting matrix Mλ according to Definition 3. We choose the value of λ from the values in the matrix M, in order from the biggest value to the smallest one.

  • Step 3. After getting the λ-cutting matrix, we take each column of the matrix Mλ as a vector. As a result, the matrix Mλ can be expressed as $M_\lambda=(\alpha_1,\alpha_2,\ldots,\alpha_n)$, where $\alpha_j=(\alpha_{1j},\alpha_{2j},\ldots,\alpha_{nj})^T$. The inner product of any two column vectors is $(\alpha_i,\alpha_j)=\alpha_i^T\alpha_j$. If $(\alpha_i,\alpha_j)=\alpha_i^T\alpha_j=0$, then we say that these two column vectors are orthogonal.

  • Step 4. We cluster the objects into a few classes according to the orthogonal relations among the column vectors (a short code sketch of the whole procedure is given after this list). The detailed procedure is as follows:

    • ① If $(\alpha_i,\alpha_j)\neq 0$, then we cluster the objects Ai and Aj into the same class. This is called the direct clustering principle.

    • ② If there exist 1 ≤ n1, n2, …, ns ≤ n such that $(\alpha_i,\alpha_{n_1})(\alpha_{n_1},\alpha_{n_2})\cdots(\alpha_{n_s},\alpha_j)\neq 0$, then we cluster the objects Ai and Aj into the same class. This is called the indirect clustering principle.

    • ③ If $(\alpha_i,\alpha_j)=0$, and for any 1 ≤ n1, n2, …, ns ≤ n we have $(\alpha_i,\alpha_{n_1})(\alpha_{n_1},\alpha_{n_2})\cdots(\alpha_{n_s},\alpha_j)=0$, then the objects Ai and Aj are not in the same group.
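
A compact Python sketch of Steps 2-4 (function names are ours). The direct and indirect clustering principles together amount to taking the connected components of the graph whose vertices are the objects and whose edges join columns of the cutting matrix with a nonzero inner product.

```python
def cutting_matrix(D, lam):
    """Lambda-cutting matrix: entry (i, j) is 1 if D[i][j] >= lam, else 0."""
    return [[1 if dij >= lam else 0 for dij in row] for row in D]

def orthogonal_clusters(D, lam):
    """Cluster objects by the orthogonal method: objects i and j fall into
    the same class if their columns in the cutting matrix are linked,
    directly or through a chain, by nonzero inner products."""
    M = cutting_matrix(D, lam)
    n = len(M)
    cols = [[M[r][c] for r in range(n)] for c in range(n)]
    def inner(i, j):
        return sum(a * b for a, b in zip(cols[i], cols[j]))
    labels, clusters = [-1] * n, []
    for start in range(n):            # connected components by depth-first search
        if labels[start] != -1:
            continue
        labels[start] = len(clusters)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if labels[j] == -1 and j != i and inner(i, j) != 0:
                    labels[j] = labels[start]
                    stack.append(j)
        clusters.append(sorted(group))
    return clusters                   # clusters of 0-based object indices
```

Running this for the values of λ drawn from the distance matrix in decreasing order should reproduce the nested groupings reported for the two examples in Section 5.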

Compared with the hesitant fuzzy MST clustering algorithm 23 and the hesitant fuzzy agglomerative hierarchical clustering algorithm 21, the hesitant fuzzy orthogonal clustering method we propose is much easier and can be realized by computer programs. It is practical and can be generalized to large data environments. In the hesitant fuzzy agglomerative hierarchical clustering algorithm 21, we should first divide the alternatives into certain clusters, compute the distance of each pair of clusters, combine the clusters with the minimum distance to form a new cluster, and repeat the above steps until all the alternatives are in the same cluster. Obviously, it is much more complicated, needs a large amount of computational effort and takes a lot of time to accomplish. In the hesitant fuzzy MST clustering algorithm 23, after we get the distance of each pair of alternatives, we need to draw a hesitant fuzzy graph where every node represents an alternative and every edge has a weight showing the dissimilarity degree; then we carry out the clustering analysis by using the hesitant fuzzy minimal spanning tree. Although this method is easy to understand, it is not convenient to compute with automatic programs, which is a big limitation.

5. Applications

In what follows, two numerical examples are given to demonstrate the effectiveness and practicality of the proposed clustering method:

Example 5.

Different people have different requirements for computers. Someone who likes watching movies may prefer a computer with a clear screen over other characteristics, while game players may like a computer with a high-speed CPU so that they can control the figures in time.

Now there are six different computers to be evaluated. In order to provide recommendations to consumers, a digital evaluation website invites six experts to evaluate the performance of these six computers, mainly from five aspects: price (x1), electricity consumption (x2), the speed of the CPU (x3), the quality of the screen (x4), and design (x5). The weight of each attribute is not the same and can be changed according to different demands; for a businessman, electricity consumption is very important, so we can increase the weight of x2 to pick out the most suitable computer. In this paper, we assume that the attribute weight vector is w = (0.1, 0.2, 0.3, 0.25, 0.15)T. The evaluation results for each computer are expressed by hesitant fuzzy numbers, as shown in Table 1. When the experts evaluate the performance of each computer, the value of the evaluation is given; for example, if the CPU of one computer runs very fast, the experts believe that the CPU is very good and may give a relatively high value, maybe close to 1. If someone is hesitant in giving the membership degree, he/she may give two or more values. Gathering all the evaluation results, we form Table 1. The larger the value of the hesitant fuzzy element in Table 1 is, the better the experts think that attribute of the alternative is.

x1 x2 x3 x4 x5
A1 {0.5,0.4,0.3} {0.9,0.8,0.7} {0.5,0.4,0.3} {0.9,0.6,0.5,0.4} {0.5,0.4}
A2 {0.5,0.3} {0.9,0.7,0.6,0.6} {0.8,0.6,0.5,0.1} {0.7,0.5,0.3} {0.6,0.3,0.3}
A3 {0.7} {0.9,0.5} {0.7,0.5,0.3} {0.8,0.4} {0.8,0.6,0.4,0.2}
A4 {0.8,0.7,0.5,0.4} {0.7,0.4,0.4} {0.5,0.1} {0.7,0.3} {0.5,0.3}
A5 {0.7,0.5,0.3} {0.8,0.2} {0.9,0.8,0.7} {0.3} {0.7,0.5,0.3}
A6 {0.9,0.7,0.6,0.2,0.1} {0.8,0.6} {0.8} {0.3,0.1} {0.9,0.7,0.6,0.2}
Table 1.

The characteristic information of the computers

In the following, we utilize the hesitant fuzzy orthogonal clustering method proposed in this paper to classify these six computers, which involves the following steps:

  • Step 1. We calculate the hesitation degrees by the formula (11):

    $$\mu(h(x))=\frac{1}{2}\left(1-\frac{1}{l(h(x))}+\sqrt{\frac{\sum_{i=1}^{l(h(x))}(x_i-\bar{x})^2}{l(h(x))}}\right)$$
    For $h_{A_1}(x_1)$, l = 3 and $\bar{x}=0.4$, so we have $\mu(h_{A_1}(x_1))=0.3742$. The others can be calculated in a similar way. As a result, all the hesitation degrees can be expressed as a matrix:
    $$h=\begin{pmatrix}0.3742&0.3742&0.3742&0.4685&0.2750\\0.3000&0.4362&0.5025&0.4150&0.4040\\0&0.3500&0.4150&0.3500&0.4868\\0.4541&0.4040&0.3500&0.3500&0.3000\\0.4150&0.4000&0.3742&0&0.4150\\0.5517&0.3000&0&0.3000&0.5025\end{pmatrix}$$

  • Step 2. According to the formula (15), we can calculate the distance with the weighting vector w = (0.1, 0.2, 0.3, 0.25, 0.15) between each two computers:

    $$d_{11}(A,B)=\sum_{i=1}^{n}\frac{w_i}{2}\left[\left|\frac{1}{l_{h_A(x_i)}\,l_{h_B(x_i)}}\sum_{k=1}^{l_{h_B(x_i)}}\sum_{m=1}^{l_{h_A(x_i)}}\left(h_A^{m}(x_i)-h_B^{k}(x_i)\right)\right|+\left|\mu_{h_A}(x_i)-\mu_{h_B}(x_i)\right|\right]$$
    Consequently, we can get the distance matrix:
    $$D=\begin{pmatrix}0&0.0868&0.1017&0.0985&0.2099&0.2468\\0.0868&0&0.0861&0.1097&0.1838&0.2258\\0.1017&0.0861&0&0.1269&0.1935&0.2148\\0.0985&0.1097&0.1269&0&0.1709&0.2417\\0.2099&0.1838&0.1935&0.1709&0&0.1570\\0.2468&0.2258&0.2148&0.2417&0.1570&0\end{pmatrix}$$

  • Step 3. We choose the confidence level λ from the distance matrix to get the λ-cutting matrix $D_\lambda=(d_{ij}^{\lambda})_{6\times 6}$ according to the principles given before. For example, if λ = 0.0868, then every value in the distance matrix that is not smaller than λ turns into 1; otherwise, it becomes 0. Consequently, when λ = 0.0868, the λ-cutting matrix is expressed as:

    $$D_\lambda=\begin{pmatrix}0&1&1&1&1&1\\1&0&0&1&1&1\\1&0&0&1&1&1\\1&1&1&0&1&1\\1&1&1&1&0&1\\1&1&1&1&1&0\end{pmatrix}=(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5,\alpha_6)$$
    If λ = 0.2148, then the λ-cutting matrix is expressed as:
    $$D_\lambda=\begin{pmatrix}0&0&0&0&0&1\\0&0&0&0&0&1\\0&0&0&0&0&1\\0&0&0&0&0&1\\0&0&0&0&0&0\\1&1&1&1&0&0\end{pmatrix}=(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5,\alpha_6)$$
    Here each column of the matrix $D_\lambda$ is taken as a vector, so that $D_\lambda=(\alpha_1,\alpha_2,\ldots,\alpha_6)$ with $\alpha_j=(\alpha_{1j},\alpha_{2j},\ldots,\alpha_{6j})^T$. The inner product of any two column vectors is $(\alpha_i,\alpha_j)=\alpha_i^T\alpha_j$; if $(\alpha_i,\alpha_j)=0$, these two column vectors are orthogonal, and orthogonal vectors cannot be clustered into the same group. That is to say, if $(\alpha_i,\alpha_j)\neq 0$, then we cluster the computers Ai and Aj into the same class. For example, when λ = 0.2148, $(\alpha_1,\alpha_2)\neq 0$, so the computers A1 and A2 are clustered into the same class. Consequently, we can get all the possible classifications of Ai (i = 1,2,…,6) as follows (a code sketch reproducing one of these cases is given after the list):
    1. (1)

      If 0 ≤ λ ≤ 0.1570, then Ai(i = 1,2,…,6) are clustered into the same group:

      • {A1,A2,A3,A4,A5,A6}

    2. (2)

      If 0.1570 < λ ≤ 0.2099, then Ai (i = 1,2,…,6) are clustered into two groups:

      • {A1,A2,A3,A4},{A5,A6}

    3. (3)

      If 0.2099 < λ ≤ 0.2148, then Ai(i = 1,2,…,6) are clustered into three groups:

      • {A1,A2,A3,A4},{A5},{A6}

    4. (4)

      If 0.2148 < λ ≤ 0.2258, then Ai (i = 1,2,…,6) are clustered into four groups:

      • {A1,A2,A4},{A3},{A5},{A6}

    5. (5)

      If 0.2258 < λ ≤ 0.2417, then Ai(i = 1,2,…,6) are clustered into five groups:

      • {A1,A4},{A2},{A3},{A5},{A6}

    6. (6)

      If 0.2417 < λ ≤1, then Ai(i = 1,2,…,6) are clustered into six groups:

      • {A1},{A2},{A3},{A4},{A5},{A6}
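
As a check, a self-contained version of the sketch from Section 4 applied to the distance matrix above reproduces, for instance, case (5): with λ = 0.2417 the groups are {A1, A4}, {A2}, {A3}, {A5}, {A6}.

```python
D = [
    [0,      0.0868, 0.1017, 0.0985, 0.2099, 0.2468],
    [0.0868, 0,      0.0861, 0.1097, 0.1838, 0.2258],
    [0.1017, 0.0861, 0,      0.1269, 0.1935, 0.2148],
    [0.0985, 0.1097, 0.1269, 0,      0.1709, 0.2417],
    [0.2099, 0.1838, 0.1935, 0.1709, 0,      0.1570],
    [0.2468, 0.2258, 0.2148, 0.2417, 0.1570, 0],
]

def clusters_at(lam):
    """Orthogonal clustering of the six computers at confidence level lam."""
    n = len(D)
    M = [[1 if D[i][j] >= lam else 0 for j in range(n)] for i in range(n)]
    cols = [[M[r][c] for r in range(n)] for c in range(n)]
    linked = lambda i, j: sum(a * b for a, b in zip(cols[i], cols[j])) != 0
    groups, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], {s}
        while stack:
            i = stack.pop()
            for j in range(n):
                if j not in comp and j != i and linked(i, j):
                    comp.add(j)
                    stack.append(j)
        seen |= comp
        groups.append(sorted(x + 1 for x in comp))  # 1-based labels A1..A6
    return groups

print(clusters_at(0.2417))  # [[1, 4], [2], [3], [5], [6]]
```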

Obviously, according to the clustering results, the computer market can divide these computers into several groups. The computers in the same group are similar to some extent. If some consumers regard one attribute as particularly important, such as the speed of the CPU (x3), then the attribute weight vector can be adjusted to pick out the most suitable computers. Different attribute weight vectors lead to different clustering results.

Liu et al. 27 proposed a hesitant fuzzy netting clustering method, whose process is as follows:

  • Step 1. According to the similarity measures, we compute the similarity between each two HFSs.

  • Step 2. Construct the similarity matrix S = (Sij)m×m, where sij = s(Ai, Aj), i, j = 1, 2,...,m.

  • Step 3. Delete all the elements above the diagonal of the matrix, and replace the elements on the diagonal with the labels of the objects.

  • Step 4. Choose the confidence level λ and construct the corresponding λ-cutting matrix. Replace ‘1’ with ‘*’ and delete all the ‘0’ entries in the matrix. From each point ‘*’ in the matrix, we can draw a vertical and a horizontal line to the object labels on the diagonal; obviously, each ‘*’ links two points on the diagonal, which represent two objects, and we cluster these two objects into the same group. After going through all the points ‘*’, we get the clustering result corresponding to the selected λ. Choosing different values of λ, we can get different clustering results, until all the objects are clustered into one class.

In the following, we cluster the six kinds of computers with the hesitant fuzzy netting clustering method27 again. Here, we utilize the distance measure to estimate the relationship between the computers instead of the similarity measure. After we get the λ-cutting matrix, if we use the netting clustering method, we should first replace ‘1’ with ‘*’ and delete all the ‘0’ entries in the matrix according to Step 4. For example, when λ = 0.2258, the λ-cutting matrix is expressed as:

$$D_\lambda=\begin{pmatrix}0&0&0&0&0&1\\0&0&0&0&0&1\\0&0&0&0&0&0\\0&0&0&0&0&1\\0&0&0&0&0&0\\1&1&0&1&0&0\end{pmatrix}$$
According to Step 4, the matrix can be adjusted to
$$D_\lambda=\begin{pmatrix}A_1&&&&&\\&A_2&&&&\\&&A_3&&&\\&&&A_4&&\\&&&&A_5&\\ *&*&&*&&A_6\end{pmatrix}$$
In this case, the computers can be clustered into four classes (a small code sketch reproducing this star pattern is given after the list below):
  • {A1, A2, A4},{A3},{A5},{A6}
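
For comparison, a small sketch that builds the lower-triangular netting display for a given λ (the function name `netting_display` is ours, and the distance matrix is the one of this example); for λ = 0.2258 it places the stars exactly where they appear above, in the last row under A1, A2 and A4. The graphical reading-off step that follows is the part that is hard to automate.

```python
D = [
    [0,      0.0868, 0.1017, 0.0985, 0.2099, 0.2468],
    [0.0868, 0,      0.0861, 0.1097, 0.1838, 0.2258],
    [0.1017, 0.0861, 0,      0.1269, 0.1935, 0.2148],
    [0.0985, 0.1097, 0.1269, 0,      0.1709, 0.2417],
    [0.2099, 0.1838, 0.1935, 0.1709, 0,      0.1570],
    [0.2468, 0.2258, 0.2148, 0.2417, 0.1570, 0],
]

def netting_display(D, lam):
    """Keep only the lower triangle: label the diagonal A1..An and mark
    with '*' every below-diagonal entry whose distance is at least lam."""
    rows = []
    for i in range(len(D)):
        cells = ['*' if D[i][j] >= lam else ' ' for j in range(i)] + ['A%d' % (i + 1)]
        rows.append('\t'.join(cells))
    return '\n'.join(rows)

print(netting_display(D, 0.2258))
```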

We can combine the results of these two algorithms as follows:

Classes Netting clustering method Orthogonal clustering method
6 {A1},{A2},{A3},{A4},{A5},{A6} {A1},{A2},{A3},{A4},{A5},{A6}
5 {A1,A4},{A2},{A3},{A5},{A6} {A1,A4},{A2},{A3},{A5},{A6}
4 {A1,A2,A4},{A3},{A5},{A6} {A1,A2,A4},{A3},{A5},{A6}
3 {A1,A2,A3,A4},{A5},{A6} {A1,A2,A3,A4},{A5},{A6}
2 {A1,A2,A3,A4},{A5,A6} {A1,A2,A3,A4},{A5,A6}
1 {A1,A2,A3,A4,A5,A6} {A1,A2,A3,A4,A5,A6}
Table 2.

Clustering results

Obviously, the results derived by these two algorithms are the same, since the two algorithms utilize the same distance formula and their core ideas are similar. However, there still exist some differences between these two methods. In the hesitant fuzzy orthogonal method, we take every column of the cutting matrix as a vector, and the relationship between each two objects can easily be found through the inner products of these vectors. Thus, this method can be carried out easily and efficiently by MATLAB programs, which speeds up the clustering process. The hesitant fuzzy netting clustering method27, on the other hand, is not very convenient to implement with computer programs: the netting is complicated to realize in a program, and the netting graph needs to be read off by people, so if the data are very complicated and very large, clustering so many objects is really difficult with the hesitant fuzzy netting clustering method 27. Compared with the hesitant fuzzy netting method 27, the hesitant fuzzy orthogonal method is much more effective and simple.

In order to illustrate the computational complexity, we generate a few HFSs at random for clustering and compare the two algorithms, measuring the computation time needed to obtain the clustering results for each. The run times of the two methods are shown in Table 3. Considering practical applications, we think the orthogonal clustering method can save much time for big data problems.

Number of HFSs 6 10 15 20
The orthogonal method 0.000164 0.000523 0.001056 0.001619
The netting method 0.000205 0.000852 0.001431 0.002057
Table 3.

The running time for each method

In the following example, we compare the clustering results by using our distance measure and the existing distance measure given in Ref. [10]:

Example 6.

As we all know, piracy is one of the most important factors threatening the security of merchant shipping. With different social backgrounds, pirates in different oceans differ greatly in equipment and strength; furthermore, their attack targets and means of crime are usually different. However, it is common that they all escape quickly from the scene, usually before the arrival of a modern navy. In this case, we cannot know their real strength, such as their weapons, the number of people, and the areas in which they usually appear, while fuzzy mathematics can handle such uncertain problems better. According to the features of every incident, we can cluster the pirates in any area into a few groups, so that we can easily find out which areas are very dangerous and what the situations in those areas are. What's more, by comparing the strengths of the pirates in all areas, we can give advice to passing ships, which contributes greatly to a ship's emergency plan and risk management.

According to the reports of the IMO (International Maritime Organization), we first extract the features of every attack incident, including the number of pirates, the damage degree of the ships, the loss degree of the packages, the damage degree of the ship crews and so on.

Since the damage degree cannot be expressed accurately in mathematical forms, it is better to estimate them by HFSs. For every accident, the specialists will make judgements about the damage degrees. According to the given data, we can divide the ocean area and estimate the threatening degrees of any areas and any pirates.

In the recent three months, there were 10 incidents of pirate attacks. After each crime, some specialists were invited to evaluate the damage degrees of these 10 incidents by HFSs, including the number of pirates (x1), the damage degree of the ships (x2), the loss degree of the packages (x3), and the damage degree of the ship crews (x4). We now cluster these data in order to give advice to the passing ships. The attribute weight vector is w = (0.4, 0.2, 0.1, 0.3), and the data are shown below:

x1 x2 x3 x4
A1 {0.1,0.1,0.2,0.4,0.5,0.5} {0.2,0.4} {0.5,0.7,0.8} {0.6,0.6,0.7,0.7}
A2 {0.1,0.2,0.2,0.4} {0.2,0.4,0.7} {0.3,0.5,0.5,0.6} {0.5,0.5,0.6}
A3 {0.3,0.4} {0.3,0.3,0.4,0.4,0.5} {0.7,0.8} {0.2,0.2,0.4,0.5,0.5,0.6}
A4 {0.5,0.6} {0.2} {0.5,0.7,0.7} {0.5,0.5}
A5 {0.3} {0.4,0.6,0.6} {0.5,0.6,0.7} {0.3,0.4,0.5}
A6 {0.5,0.6,0.6} {0.3,0.6} {0.4,0.5} {0.6,0.7,0.8}
A7 {0.3,0.4,0.4,0.5,0.6,0.6} {0.5,0.7} {0.4,0.5} {0.2,0.3,0.6}
A8 {0.3,0.5} {0.3,0.7,0.8} {0.4,0.7,0.9} {0.7,0.8,0.9,0.9}
A9 {0.3,0.4,0.5,0.5,0.5} {0.4,0.5,0.7} {0.4,0.5,0.7,0.7,0.8} {0.2,0.6}
A10 {0.3,0.5} {0.4,0.5} {0.5,0.5} {0.3,0.4,0.5,0.6}
Table 4.

The characteristic information of pirate crimes

  • Step 1. Using the distance measure (15), we can get the distance matrix:

    $$D=\begin{pmatrix}0&0.0919&0.1391&0.1831&0.1778&0.1284&0.1317&0.1355&0.1248&0.1284\\0.0919&0&0.1233&0.1942&0.1481&0.1322&0.1303&0.1407&0.0896&0.1125\\0.1391&0.1233&0&0.1643&0.1163&0.1558&0.1297&0.1249&0.1084&0.0688\\0.1831&0.1942&0.1643&0&0.2120&0.1403&0.1857&0.1973&0.1632&0.1353\\0.1778&0.1481&0.1163&0.2120&0&0.1961&0.1665&0.1693&0.1271&0.1261\\0.1284&0.1322&0.1558&0.1403&0.1961&0&0.1171&0.1153&0.1244&0.0995\\0.1317&0.1303&0.1297&0.1857&0.1665&0.1171&0&0.1498&0.0616&0.0834\\0.1355&0.1407&0.1249&0.1973&0.1693&0.1153&0.1498&0&0.1252&0.1077\\0.1248&0.0896&0.1084&0.1632&0.1271&0.1244&0.0616&0.1252&0&0.0932\\0.1284&0.1125&0.0688&0.1353&0.1261&0.0995&0.0834&0.1077&0.0932&0\end{pmatrix}$$

  • Step 2. Choose the confidence level λ from the distance matrix to get the λ-cutting matrix $D_\lambda=(d_{ij}^{\lambda})_{10\times 10}$ according to the formulas given before. Finally, by the proposed hesitant fuzzy orthogonal clustering method, we can get the clustering results as follows:

    1. (1)

      If 0 ≤ λ ≤ 0.1322, then Ai (i = 1,2,…,10) can be clustered into the same group:

      • {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10}

    2. (2)

      If 0.1322 < λ ≤ 0.1558, then Ai (i = 1,2,…,10) are clustered into two groups:

      • {A1,A2,A3,A4,A5,A6,A7,A8,A9},{A10}

    3. (3)

      If 0.1558 < λ ≤ 0.1643, then Ai (i = 1,2,…,10) are clustered into three groups:

      • {A1,A2,A3,A4,A5,A6,A7,A8},{A9},{A10}

    4. (4)

      If 0.1643 < λ ≤ 0.1778, then Ai (i = 1,2,…,10) are clustered into four groups:

      • {A1,A2,A4,A5,A6,A7,A8},{A3},{A9},{A10}

    5. (5)

      If 0.1778 < λ ≤ 0.1831, then Ai (i = 1,2,…,10) are clustered into five groups:

      • {A1,A2,A5,A7,A8},{A4,A6},{A3},{A9},{A10}

    6. (6)

      If 0.1831 < λ ≤ 0.1857, then Ai (i = 1,2,…,10) are clustered into six groups:

      • {A1},{A2,A5,A7,A8},{A4,A6},{A3},{A9},{A10}

    7. (7)

      If 0.1857 < λ ≤ 0.1942, then Ai (i = 1,2,…,10) are clustered into seven groups:

      • {A1},{A2,A5,A8},{A4,A6},{A3},{A7},{A9},{A10}

    8. (8)

      If 0.1942 < λ ≤ 0.1961, then Ai (i = 1,2, …,10) are clustered into eight groups:

      • {A1},{A2},{A5, A8},{A4, A6},{A3},{A7},{A9},{A10}

    9. (9)

      If 0.1961 < λ ≤ 0.1973, then Ai (i = 1,2,…,10) are clustered into nine groups:

      • {A1},{A2},{A5, A8},{A4},{A6},{A3},{A7},{A9},{A10}

    10. (10)

      If 0.1973 < λ ≤ 1, then Ai (i = 1,2,…,10) are clustered into ten groups:

      • {A1},{A2},{A5},{A8},{A4},{A6},{A3},{A7},{A9},{A10}

In the following, we will use the existing distance measure 10 in our hesitant fuzzy orthogonal clustering method:

According to the method proposed by Xu and Xia 10, if the lengths of the HFEs are different, then we should extend the shorter one by adding the minimum value or the maximum value until they have the same length. Consequently, we can get the extended HFSs as follows:

x1 x2 x3 x4
A1 {0.1,0.1,0.2,0.4,0.5,0.5} {0.2,0.4,0.4,0.4,0.4} {0.5,0.7,0.8,0.8,0.8} {0.6,0.6,0.7,0.7,0.7,0.7}
A2 {0.1,0.2,0.2,0.4,0.4,0.4} {0.2,0.4,0.7,0.7,0.7} {0.3,0.5,0.5,0.6,0.6} {0.5,0.5,0.6,0.6,0.6,0.6}
A3 {0.3,0.4,0.4,0.4,0.4,0.4} {0.3,0.3,0.4,0.4,0.5} {0.7,0.8,0.8,0.8,0.8} {0.2,0.2,0.4,0.5,0.6,0.6}
A4 {0.5,0.6,0.6,0.6,0.6,0.6} {0.2,0.2,0.2,0.2,0.2} {0.5,0.7,0.7,0.7,0.7} {0.5,0.5,0.5,0.5,0.5,0.5}
A5 {0.3,0.3,0.3,0.3,0.3,0.3} {0.4,0.6,0.6,0.6,0.6} {0.5,0.6,0.7,0.7,0.7} {0.3,0.4,0.5,0.5,0.5,0.5}
A6 {0.5,0.6,0.6,0.6,0.6,0.6} {0.3,0.6,0.6,0.6,0.6} {0.4,0.5,0.5,0.5,0.5} {0.6,0.7,0.8,0.8,0.8,0.8}
A7 {0.3,0.4,0.4,0.5,0.6,0.6} {0.5,0.7,0.7,0.7,0.7} {0.4,0.5,0.5,0.5,0.5} {0.2,0.3,0.6,0.6,0.6,0.6}
A8 {0.3,0.5,0.5,0.5,0.5,0.5} {0.3,0.7,0.8,0.8,0.8} {0.4,0.7,0.9,0.9,0.9} {0.7,0.8,0.9,0.9,0.9,0.9}
A9 {0.3,0.4,0.5,0.5,0.5,0.5} {0.4,0.5,0.7,0.7,0.7} {0.4,0.5,0.7,0.7,0.8} {0.2,0.6,0.6,0.6,0.6,0.6}
A10 {0.3,0.5,0.5,0.5,0.5,0.5} {0.4,0.5,0.5,0.5,0.5} {0.5,0.5,0.5,0.5,0.5} {0.3,0.4,0.5,0.6,0.6,0.6}
Table 5.

The characteristic information of pirate crimes (after extension)

  • Step 1. Calculate the distances dij = d1(Ai, Aj) by the formula (2), and let the attribute weight vector be w = (0.4, 0.2, 0.1, 0.3). Then we can get the distance matrix D = (dij)n×n as:

    $$D=\begin{pmatrix}0&0.1080&0.1580&0.2013&0.1797&0.1983&0.2057&0.1937&0.1580&0.1627\\0.1080&0&0.1580&0.2240&0.1237&0.2050&0.1283&0.2123&0.1107&0.1373\\0.1580&0.1580&0&0.1680&0.1083&0.2470&0.1443&0.2423&0.1267&0.1113\\0.2013&0.2240&0.1680&0&0.2023&0.1610&0.2017&0.2617&0.1813&0.1487\\0.1797&0.1237&0.1083&0.2023&0&0.2233&0.1327&0.2347&0.1170&0.1117\\0.1983&0.2050&0.2470&0.1610&0.2233&0&0.1507&0.1327&0.1523&0.1437\\0.2057&0.1283&0.1443&0.2017&0.1327&0.1507&0&0.1847&0.0610&0.0797\\0.1937&0.2123&0.2423&0.2617&0.2347&0.1327&0.1847&0&0.1397&0.1830\\0.1580&0.1107&0.1267&0.1813&0.1170&0.1523&0.0610&0.1397&0&0.0667\\0.1627&0.1373&0.1113&0.1487&0.1117&0.1437&0.0797&0.1830&0.0667&0\end{pmatrix}$$

  • Step 2. We still use the orthogonal clustering method to perform the analysis. Firstly, we choose the confidence level λ and construct the corresponding λ-cutting matrix. Then we can group the incidents into several clusters as follows:

    1. (1)

      If 0 < λ ≤ 0.1813, then we get

      • {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10}.

    2. (2)

      If 0.1813 < λ ≤ 0.1830, then we get

      • {A1,A2,A3,A4,A5,A6,A7,A8,A10},{A9}.

    3. (3)

      If 0.1830 < λ ≤ 0.2017, then we get

      • {A1,A2,A3,A4,A5,A6,A7,A8},{A9},{A10}.

    4. (4)

      If 0.2017 < λ ≤ 0.2123, then we get

      • {A1},{A2,A3,A4,A5,A6,A8},{A7},{A9},{A10}.

    5. (5)

      If 0.2123< λ ≤ 0.2240, then we get

      • {A1},{A3, A4, A5},{A2, A6, A8},{A7},{A9},{A10}.

    6. (6)

      If 0.2240 < λ ≤ 0.2347, then we get

      • {A1},{A3,A4,A5},{A2},{A6,A8},{A7},{A9},{A10}.

    7. (7)

      If 0.2347 < λ ≤ 0.2423, then we get

      • {A1},{A3,A4},{A5},{A2},{A6,A8},{A7},{A9},{A10}.

    8. (8)

      If 0.2423< λ ≤ 1, then we get

      • {A1},{A3},{A4},{A5},{A2},{A6},{A8},{A7},{A9},{A10}.

To compare these two distance measures, we combine the final clustering results in one table. See Table 6.

Classes Distance in this article Euclid distance [10]
10 {A1},{A2},{A5},{A8},{A4},{A6}, {A3},{A7},{A9},{A10} {A1},{A3},{A4},{A5},{A2},{A6},{A8}, {A7},{A9},{A10}
9 {A1},{A2},{A5,A8},{A4},{A6},{A3}, {A7},{A9},{A10} (none)
8 {A1},{A2},{A5,A8},{A4,A6},{A3}, {A7},{A9},{A10} {A1},{A3,A4},{A5},{A2},{A6,A8}, {A7},{A9},{A10}
7 {A1},{A2,A5,A8},{A4,A6},{A3},{A7}, {A9},{A10} {A1},{A3,A4,A5},{A2},{A6,A8}, {A7},{A9},{A10}
6 {A1},{A2,A5,A7,A8},{A4,A6}, {A3},{A9},{A10} {A1},{A3,A4,A5},{A2,A6,A8}, {A7},{A9},{A10}
5 {A1,A2,A5,A7,A8},{A4,A6}, {A3},{A9},{A10} {A1},{A2,A3,A4,A5,A6,A8}, {A7},{A9},{A10}
4 {A1,A2,A4,A5,A6,A7,A8}, {A3},{A9},{A10} (none)
3 {A1,A2,A3,A4,A5,A6,A7,A8}, {A9},{A10} {A1,A2,A3,A4,A5,A6,A7,A8}, {A9},{A10}
2 {A1,A2,A3,A4,A5,A6,A7,A8,A9},{A10} {A1,A2,A3,A4,A5,A6,A7,A8,A10},{A9}
1 {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10} {A1,A2,A3,A4,A5,A6,A7,A8,A9,A10}
Table 6.

Clustering results

From the above numerical example, we can find that the clustering results obtained by using these two measures are quite different. Firstly, the incidents can only be partitioned into eight distinct groupings when using the traditional distance measure, while the results derived by using our measure are finer. Secondly, there exist some differences among the results. For example, for five classes, if we use the traditional Euclidean distance, then the result is {A1},{A2, A3, A4, A5, A6, A8},{A7},{A9},{A10}, while if we use the distance measure proposed in this paper, then the result becomes

  • {A1,A2,A5,A7,A8}, {A4,A6},{A3},{A9},{A10}.

The main reason is that the distance measure proposed in this paper does not extend the shorter HFE until all the considered HFEs have the same length, and thus, we do not change the original information. Furthermore, in this paper, we take the hesitation degree into account. It is also an essential parameter when comparing and clustering the HFSs. In the following, we give some discussions:

(1) By comparing the original information, we can easily find that A1 is close to A7 and A2 is close to A7 as well, which indicates that the pirate crimes A1, A2 and A7 are similar; that is to say, the pirates behind A1, A2 and A7 may be the same. However, if we extend the shorter HFEs when using the traditional distance, then the original information is changed, and A1 becomes totally different from A7. In this way, the distance measure proposed in this paper is more convincing. After clustering the incidents, we can analyze the characteristics of these cases and speculate about the strengths of these pirates, such as their armed equipment and so on. With this information, we can give advice to the passing ships and the local navy. Accurate analysis can reduce much loss and guarantee the security of international shipping.

(2) Comparing these two results, we can find that the result calculated by the distance measure proposed in this paper is more detailed and convincing. Since the lengths of the HFEs differ very much, adding values to the shorter ones changes the original information greatly, and therefore the clustering results become totally different. From this example, we can see that it is essential and important to consider the influence of the hesitation degrees and to keep the original information. Only when we consider all the influencing factors can the results be convincing.

6. Concluding remarks

In order to keep the original information and consider the influencing factors more comprehensively, in this paper we have proposed a novel distance measure which combines an improved hesitation degree. What's more, a new method, called the orthogonal clustering method, has been proposed and applied to cluster hesitant fuzzy information. Computational tests on the novel distance measure and the new clustering method have shown that the orthogonal clustering method is feasible and efficient. Furthermore, compared with the hesitant fuzzy netting clustering method 27 in the numerical example, we have found that the two methods give the same clustering results, while the hesitant fuzzy netting clustering method 27 needs human recognition, which is much more inconvenient. In addition, the distance measure proposed in this paper takes the hesitation degrees into consideration and does not change the original information, and thus it is more reasonable and convincing.

Acknowledgements

The authors would like to thank the anonymous reviewers for their insightful and constructive comments, which have led to an improved version of this paper. The work was supported by the National Natural Science Foundation of China (No. 71571123).

References

4.D Dubois and H Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.
18.B Everitt, S Landau, and M Leese, Cluster Analysis, 4th edition, Arnold, London, 2001.
23.XL Zhang and ZS Xu, A MST clustering analysis method under hesitant fuzzy environment, Control and Cybernetics, Vol. 41, 2012, pp. 645-666.
24.S Miyamoto, Fuzzy sets in information retrieval and cluster analysis, Kluwer, Dordrecht, 1990, pp. 181-192.
27.XD Liu, JJ Zhu, and SF Liu, Similarity measure of hesitant fuzzy sets based on symmetric cross entropy and its application in clustering analysis, Control and Decision, Vol. 29, 2014, pp. 1816-182.
