
Novel distance measures of hesitant fuzzy sets and their applications in clustering analysis

Abstract

Distance and similarity measures play an important role in clustering, pattern recognition, decision-making and other scientific fields. Most existing hesitant fuzzy distance measures do not consider the hesitance degree, and those that do consider only the degree of dispersion or only the number of hesitant fuzzy values. To address these shortcomings, a new hesitance degree is defined that accounts for both and therefore discriminates better between hesitant fuzzy elements. Then, several hesitant fuzzy distance measures based on the proposed hesitance degree are put forward, which overcome some shortcomings of the existing distance measures. Finally, the new hesitant fuzzy distance is applied to the hierarchical hesitant fuzzy K-means clustering algorithm, and an illustrative example is given to demonstrate the effectiveness of the proposed method.

Introduction

The theory of fuzzy sets proposed by Zadeh [1] has achieved great success in various fields. Since then, scholars have proposed many new theories and approaches for handling uncertainty and imprecision, such as intuitionistic fuzzy sets (IFS) [2], interval-valued intuitionistic fuzzy sets [3], linguistic variables [4], type-2 fuzzy sets [5], fuzzy multisets [6] and picture fuzzy sets (PFS) [7]. With the growing complexity and uncertainty of real-life problems, it is hard to establish a single degree of membership for a fuzzy set. To address this, Torra [8] introduced the concept of the hesitant fuzzy set (HFS), which permits the membership to take a set of possible values. As an extension of the fuzzy set, the hesitant fuzzy set can better model the hesitation of decision makers who waver between several possible values. Since its introduction, the hesitant fuzzy set has received extensive attention and produced rich research results. For example, Zhang [9] proposed the hesitant fuzzy power average operator, in which the weight of a piece of hesitant fuzzy information depends on the degree of support it receives from the other hesitant fuzzy information. Considering that attributes may be related to each other in realistic decision-making problems, Zhu [10, 11] proposed the hesitant fuzzy Bonferroni mean operator and the hesitant fuzzy Bonferroni geometric operator. Wei [12] considered the priority relationship between attributes and proposed the hesitant fuzzy prioritized operator. Xu et al. [13] introduced a hesitant fuzzy TOPSIS method based on the principle of maximum deviation and applied it to multi-attribute decision-making problems. Liao et al. [14] presented the hesitant fuzzy VIKOR multi-attribute decision-making method, which considers the psychological preference of decision makers. Wang et al. [15] introduced the prospect value function of hesitant fuzzy elements based on prospect theory and a distance measure, and then proposed a multi-attribute decision-making method, built on the TOPSIS method, that considers the risk preference of the decision maker. Hesitant fuzzy sets have also been applied in other fields, such as cluster analysis [16–19], decision analysis [20–23] and pattern recognition [24–27].

Distance measures are one of the important research directions in the theory of hesitant fuzzy sets, and many results on hesitant fuzzy distances have been obtained. For instance, Xu and Xia [28] first proposed a variety of hesitant fuzzy distance measures and discussed their properties. Building on the hesitant fuzzy distance measure of Xu and Xia, Tong [29] introduced a hybrid hesitant fuzzy distance measure that considers the preference of decision makers, and Peng [30] presented a generalized hesitant fuzzy cooperative weighted distance measure. Although the above hesitant fuzzy distance measures have many merits, they require corresponding hesitant fuzzy elements to have the same length. When the lengths differ, values must be added to the shorter element, which inevitably changes the original information of the hesitant fuzzy elements, that is, it alters what the experts actually expressed. To overcome this shortcoming, Tang et al. [31] proposed a distance measure that does not depend on the length of the hesitant fuzzy elements. However, unless the hesitant fuzzy element has length 1, the distance between two identical hesitant fuzzy elements is not equal to 0, which is counterintuitive. Later, some researchers further considered the hesitance degree of hesitant fuzzy elements in the distance measure. Zhang and Xu [18] proposed the concept of a hesitation index, which is determined by the degree of dispersion of the values in the hesitant fuzzy element, and proposed a series of distance and similarity measures that take this index into account. Li et al. [32] proposed the concept of a hesitance degree, which is determined by the number of values in the hesitant fuzzy element, and proposed a series of hesitant fuzzy distance measures that include it. However, each of these hesitance degrees considers only the degree of dispersion or only the number of values in the hesitant fuzzy element, so it is incomplete and discriminates poorly between hesitant fuzzy elements.

Methods

According to the above analysis, the existing hesitant fuzzy distance measures each have shortcomings. To overcome them, we first define a new hesitance degree that considers both the degree of dispersion and the number of values in a hesitant fuzzy element, and then put forward several distance measures based on it. The distance is defined separately for two hesitant fuzzy elements of equal length and of unequal length, which avoids the distortion of the original information caused by padding elements when the lengths differ. Further, we apply the new hesitant fuzzy distance to hierarchical hesitant fuzzy K-means clustering.

The paper is organized as follows. In the Preliminaries section, some concepts related to hesitant fuzzy sets are introduced. In the Some New hesitant fuzzy distance measures section, a new hesitance degree and some new hesitant fuzzy distance measures are proposed, and their properties are discussed. In the Hesitant fuzzy clustering based on new distance measure section, we apply the new distance measure to the hierarchical hesitant fuzzy K-means clustering algorithm. The final section concludes the paper.

Preliminaries

Definition 1

[8] Given a fixed set X, a hesitant fuzzy set (HFS) on X is defined in terms of a function that, when applied to X, returns a subset of [0, 1].

For convenience, Xia and Xu [33] usually express HFS simply as a mathematical symbol:

$$ E=\left\{< x, h_{E}(x)>\mid x \in X\right\} $$
(1)

where \(h_{E}(x)\) is a set of values in [0,1], representing the possible membership degrees of the element \(x \in X\) to E. For convenience, we call \(h=h_{E}(x)\) a hesitant fuzzy element (HFE) and H the set of all HFEs.

For ease of comparison, we arrange the elements of \(h_{E}(x_{i})\) in increasing order and let \(h_{E}^{\sigma (j)}(x_{i})\) be the jth smallest value in \(h_{E}(x_{i})\).

Li [32] put forward the axiomatic definition of distance measure for hesitant fuzzy sets (HFSs).

Definition 2

[32]. Let A, B and C be three HFSs on X. Then, d is called a hesitant fuzzy distance measure for HFSs if it satisfies the following properties:

(1) 0≤d(A,B)≤1;

(2) d(A,B)=0 if and only if A=B;

(3) d(A,B)=d(B,A);

(4) d(A,B)+d(B,C)≥d(A,C).

Note that different HFEs may contain different numbers of values. To compare two HFEs, Xu and Xia [28] extend the shorter one by repeatedly adding the same value until both have the same length. Let \(l(h_{E}(x_{i}))\) be the number of values in \(h_{E}(x_{i})\), and \(l_{x_{i}}=\max \{l(h_{A}(x_{i})),l(h_{B}(x_{i}))\}\). Xu and Xia [28] proposed a series of distances for hesitant fuzzy sets as follows:

Definition 3

[28]. Let A and B be two HFSs on \(X=\{x_{1},x_{2},\ldots,x_{n}\}\). Then, the hesitant normalized Hamming distance is defined as follows:

$$ d_{xh}(A, B)\,=\,\frac{1}{n} \!\sum_{i=1}^{n}\left[\frac{1}{l_{x_{i}}} \sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\!\left(x_{i}\right)\,-\,h_{B}^{\sigma(j)}\!\left(x_{i}\right)\right|\right] $$
(2)

the hesitant normalized Euclidean distance as follows:

$$ d_{xe}(A, B)=\!\left[\!\frac{1}{n}\! \sum_{i=1}^{n}\left(\frac{1}{l_{x_{i}}} \sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\!\left(x_{i}\right)\,-\,h_{B}^{\sigma(j)}\!\left(x_{i}\right)\right|^{2}\right)\right]^{\frac{1}{2}} $$
(3)

the generalized hesitant normalized distance:

$$ d_{xg}(A, B)=\!\left[\!\frac{1}{n} \!\sum_{i=1}^{n}\left(\frac{1}{l_{x_{i}}} \sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\!\left(x_{i}\right)\,-\,h_{B}^{\sigma(j)}\!\left(x_{i}\right)\right|^{\lambda}\right)\right]^{\frac{1}{\lambda}} $$
(4)

where λ>0.
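The three distances of Definition 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `extend` and `d_xg` are hypothetical, each HFS is represented as a list of HFEs, and the shorter HFE is padded with its largest value, which is only one possible reading of "adding the same value".

```python
def extend(h, length, pad=max):
    """Sort an HFE in increasing order and pad it to `length` values.

    Assumption: the padding value is pad(h) (the maximum by default); the
    text only says the shorter HFE is extended by adding the same value.
    """
    values = sorted(h)
    return sorted(values + [pad(values)] * (length - len(values)))


def d_xg(A, B, lam=1.0):
    """Generalized hesitant normalized distance, Eq. (4).

    A and B are lists of HFEs, one list of membership values per x_i.
    lam=1 gives Eq. (2) (Hamming) and lam=2 gives Eq. (3) (Euclidean).
    """
    n = len(A)
    total = 0.0
    for hA, hB in zip(A, B):
        l = max(len(hA), len(hB))
        a, b = extend(hA, l), extend(hB, l)
        total += sum(abs(u - v) ** lam for u, v in zip(a, b)) / l
    return (total / n) ** (1.0 / lam)


# Illustrative data (not from the paper): two HFSs over two elements.
A = [[0.3, 0.5], [0.2]]
B = [[0.5, 0.6], [0.2, 0.4]]
print(d_xg(A, B, lam=1))  # Hamming case
print(d_xg(A, B, lam=2))  # Euclidean case
```

Passing `pad=min` instead would pad with the smallest value; the choice changes the result whenever the lengths differ, which is the information distortion discussed in the Introduction.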

In order to measure the deviation among the values of each HFE in an HFS, Zhang and Xu [18] proposed the concept of the hesitance degree of an HFS.

Definition 4

[18]. Let H be an HFS on a reference set X, denoted by \(H=\{<x, h_{H}(x)>\mid x \in X\}\), with \(h_{H}(x_{i})=\left \{h_{H}^{\sigma (j)}(x_{i}) \mid j=1,2, \ldots, l_{h}\right \}\). Then, the hesitance degree of \(x_{i}\) in H is defined as follows:

$$ h_{Z}\left(h_{H}(x_{i})\right)= \left\{\begin{array}{ll} \sqrt{\frac{2 \sum\limits_{k>j=1}^{l_{h}}\left(h_{H}^{\sigma(k)}(x_{i})-h_{H}^{\sigma(j)}(x_{i})\right)^{2}}{l_{h} \times\left(l_{h}-1\right)}}, & l_{h}>1\\ 0, & l_{h}=1 \end{array} \right. $$
(5)

where \(l_{h}\) is the number of values in \(h_{H}(x_{i})\).
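As a small sketch of Eq. (5), the following Python function computes the dispersion-based hesitance degree of a single HFE. The name `hesitance_zx` is hypothetical and the sample values are for illustration only.

```python
from itertools import combinations
from math import sqrt


def hesitance_zx(h):
    """Hesitance degree of one HFE by Eq. (5)."""
    l = len(h)
    if l == 1:
        return 0.0
    # Sum of squared gaps over all unordered pairs of values (k > j).
    squared = sum((hk - hj) ** 2 for hj, hk in combinations(sorted(h), 2))
    return sqrt(2.0 * squared / (l * (l - 1)))


print(round(hesitance_zx([0.3, 0.5]), 4))
print(round(hesitance_zx([0.3, 0.5, 0.6]), 4))
```

The measure depends only on how far apart the values are, not on how many there are, so adding a value inside the existing spread changes it only slightly.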

In general, the wider the spread of the possible values in an HFE, the larger its hesitance degree. By considering the impact of the hesitance degree of HFEs, Zhang and Xu [18] proposed a new method for measuring the distance between HFSs:

Definition 5

[18]. Let A and B be two HFSs on X. Then, the hesitant normalized Hamming distance including hesitance degree between A and B is defined as:

$$ {}d_{h z h}(A, B) =\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\alpha}{l_{x_{i}}}\sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\left(x_{i}\right)- h_{B}^{\sigma(j)}\left(x_{i}\right)\right|+\beta\left|h_{Z}\left(h_{A}\left(x_{i}\right)\right)-h_{Z}\left(h_{B}\left(x_{i}\right)\right)\right|\right) $$
(6)

the hesitant normalized Euclidean distance including hesitance degree is defined as:

$$ {}d_{h z e}(A, B) \,=\, \left[\!\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\alpha}{l_{x_{i}}} \sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\left(x_{i}\right)- h_{B}^{\sigma(j)}\left(x_{i}\right)\right|^{2}+\beta\left|h_{Z}\left(h_{A}\left(x_{i}\right)\right)-h_{Z}\left(h_{B}\left(x_{i}\right)\right)\right|^{2}\right)\!\right]^{\frac{1}{2}} $$
(7)

the generalized hesitant normalized distance including hesitance degree is defined as:

$$ {}d_{h z g}(A, B) \,=\,\left[\!\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\alpha}{l_{x_{i}}} \sum_{j=1}^{l_{x_{i}}}\left|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)\right|^{\lambda}+\beta\left|h_{Z}\left(h_{A}\left(x_{i}\right)\right)-h_{Z}\left(h_{B}\left(x_{i}\right)\right)\right|^{\lambda}\right)\!\right]^{\frac{1}{\lambda}} $$
(8)

where λ>0, \(\alpha, \beta \in [0,1]\), α+β=1, \(h_{A}^{\sigma (j)}(x_{i})\) and \(h_{B}^{\sigma (j)}(x_{i})\) are the jth values in \(h_{A}(x_{i})\) and \(h_{B}(x_{i})\), respectively, and \(h_{Z}(h_{A}(x_{i}))\) and \(h_{Z}(h_{B}(x_{i}))\) are the hesitance degrees of the two HFEs \(h_{A}(x_{i})\) and \(h_{B}(x_{i})\), respectively.

Li [32] defined a hesitance degree based on the number of hesitant fuzzy values in hesitant fuzzy elements, and proposed a series of hesitant fuzzy distance measures.

Definition 6

[32]. Let H be an HFS on \(X=\{x_{1},x_{2},\ldots,x_{n}\}\). Then, the hesitance degree of \(x_{i}\) in H is defined as follows:

$$ h_{L}\left(h_{H}\left(x_{i}\right)\right)=1-\frac{1}{l\left(h_{H}\left(x_{i}\right)\right)} $$
(9)

where \(l(h_{H}(x_{i}))\) is the length of \(h_{H}(x_{i})\).

Therefore, the hesitance degree of the HFS H is defined as:

$$ h_{L}(H)=\frac{1}{n} \sum_{i=1}^{n} h_{L}\left(h_{H}\left(x_{i}\right)\right) $$
(10)
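Eqs. (9) and (10) translate directly into code. The sketch below uses hypothetical function names and represents an HFS as a list of HFEs.

```python
def hesitance_li(h):
    """Eq. (9): 1 - 1/l, which depends only on the number of values."""
    return 1.0 - 1.0 / len(h)


def hesitance_li_set(H):
    """Eq. (10): mean of the element-wise hesitance degrees of an HFS."""
    return sum(hesitance_li(h) for h in H) / len(H)


# Two HFEs with the same length but different spreads get the same degree.
print(hesitance_li([0.3, 0.5]), hesitance_li([0.1, 0.9]))
print(hesitance_li_set([[0.1], [0.3, 0.5]]))
```

Because only the length enters the formula, any two HFEs with the same number of values are indistinguishable under this measure.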

Definition 7

Let \(M_{1},M_{2},\ldots,M_{m}\) be a set of HFSs on \(X=\{x_{1},x_{2},\ldots,x_{n}\}\). Then, for any \(M_{k}\) and \(M_{t}\), k,t=1,2,…,m, the normalized Hamming distance including hesitance degree between \(M_{k}\) and \(M_{t}\) is defined as follows:

$$ {}d_{h l h}\left({M_{k},M_{t}}\right)=\frac{1}{2 n} \sum_{i=1}^{n}\left[\left|h_{L}\left(h_{{M_{k}}}\left(x_{i}\right)\right)-h_{L}\!\left(\!h_{{M_{t}}}\!\left(x_{i}\!\right)\right)\right|+\frac{1}{l\left(x_{i}\right)}\! \sum_{j=1}^{l\left(x_{i}\right)} | h_{{M_{k}}}^{\sigma(j)}\!\left(x_{i}\right)-h_{{M_{t}}}^{\sigma(j)}\!\left(x_{i}\right)|\right] $$
(11)

the normalized Euclidean distance including hesitance degree between Mk and Mt is defined as follows:

$$ {}d_{\text{hle}}\left({M_{k},M_{t}}\right)\,=\,\left[\!\frac{1}{2 n} \sum_{i=1}^{n}\left(\!\left|h_{L}\left(h_{{M_{k}}}\left(x_{i}\right)\right)-h_{L}\!\left(\!h_{{M_{t}}}\!\left(x_{i}\!\right)\right)\right|^{2}\!\,+\,\frac{1}{l\left(x_{i}\!\right)} \sum_{j=1}^{l\left(x_{i}\right)}\!\left|h_{{M_{k}}}^{\sigma(j)}\!\left(\!x_{i}\!\right)\,-\,h_{{M_{t}}}^{\sigma(j)}\!\left(\!x_{i}\!\right)\right|^{2}\!\right)\!\right]^{\frac{1}{2 }} $$
(12)

the normalized generalized distance including hesitance degree between Mk and Mt is defined as follows:

$$ {}d_{\text{hlg}}\left({M_{k},M_{t}}\right)\,=\,\left[\!\frac{1}{2 n} \sum_{i=1}^{n}\left(\left|h_{L}\left(h_{{M_{k}}}\left(x_{i}\right)\right)\,-\,h_{L}\!\left(\!h_{{M_{t}}}\!\left(x_{i}\!\right)\right)\right|^{\lambda}\,+\,\frac{1}{l\left(x_{i}\!\right)} \sum_{j=1}^{l\left(x_{i}\right)}\!\left|h_{{M_{k}}}^{\sigma(j)}\!\left(\!x_{i}\!\right)\,-\,h_{{M_{t}}}^{\sigma(j)}\!\left(\!x_{i}\!\right)\right|^{\lambda}\!\right)\!\right]^{\frac{1}{\lambda }} $$
(13)

where λ≥1, \(l(x_{i})=\max \{l(h_{M_{k}}(x_{i})),l(h_{M_{t}}(x_{i}))\}\), and \(h_{M_{k}}^{\sigma (j)}(x_{i})\) and \(h_{M_{t}}^{\sigma (j)}(x_{i})\) are the jth values in \(h_{M_{k}}(x_{i})\) and \(h_{M_{t}}(x_{i})\), respectively.

In order to relax the requirement that corresponding hesitant fuzzy elements have the same length, Tang et al. [31] proposed a series of distance measures.

Definition 8

Let A and B be two HFSs on X. Then, the hesitant normalized Hamming distance between A and B is defined as:

$$ d_{lth}(A, B) =\frac{1}{n} \sum\limits_{i=1}^{n}\frac{\sum_{j=1}^{l_{A}\left(x_{i}\right)} \!\sum_{k=1}^{l_{B}\left(x_{i}\right)}|h_{A}^{\sigma(j)}\!\left(x_{i}\right) -h_{B}^{\sigma(k)}\!\left(x_{i}\right)|}{l_{A}\left(x_{i}\right) l_{B}\left(x_{i}\right)} $$
(14)

the normalized Euclidean distance between A and B is defined as follows:

$$ d_{lte}(A, B) =\left[\frac{1}{n} \sum_{i=1}^{n}\frac{\sum_{j=1}^{l_{A}\left(x_{i}\right)} \!\sum_{k=1}^{l_{B}\left(x_{i}\right)}(h_{A}^{\sigma(j)}\!\left(x_{i}\right) \!-h_{B}^{\sigma(k)}\!\left(x_{i}\right))^{2}}{l_{A}\left(x_{i}\right) l_{B}\left(x_{i}\right)}\right]^{\frac{1}{2}} $$
(15)

the normalized generalized distance between A and B is defined as follows:

$$ d_{ltg}(A, B) =\left[\frac{1}{n} \sum_{i=1}^{n}\frac{\sum_{j=1}^{l_{A}\left(x_{i}\right)} \!\sum_{k=1}^{l_{B}\left(x_{i}\right)}|h_{A}^{\sigma(j)}\!\left(x_{i}\right) \!-h_{B}^{\sigma(k)}\!\left(x_{i}\right)|^{\lambda}}{l_{A}\left(x_{i}\right) l_{B}\left(x_{i}\right)}\right]^{\frac{1}{\lambda}} $$
(16)

where λ>0, \(h_{A}^{\sigma (j)}(x_{i})\) is the jth value in \(h_{A}(x_{i})\), \(h_{B}^{\sigma (k)}(x_{i})\) is the kth value in \(h_{B}(x_{i})\), and \(l_{A}(x_{i})\) and \(l_{B}(x_{i})\) are the lengths of \(h_{A}(x_{i})\) and \(h_{B}(x_{i})\), respectively.
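A minimal sketch of Eq. (16) makes the behaviour noted in the Introduction concrete: because every value of one HFE is compared with every value of the other, the distance from an HFE to itself is nonzero as soon as the HFE has more than one value. The function name `d_ltg` is hypothetical.

```python
def d_ltg(A, B, lam=1.0):
    """Eq. (16): average of |a - b|**lam over all cross pairs, no padding.

    A and B are lists of HFEs; lam=1 gives Eq. (14), lam=2 gives Eq. (15).
    """
    n = len(A)
    total = 0.0
    for hA, hB in zip(A, B):
        pairs = sum(abs(a - b) ** lam for a in hA for b in hB)
        total += pairs / (len(hA) * len(hB))
    return (total / n) ** (1.0 / lam)


A = [[0.3, 0.5]]
print(d_ltg(A, A))            # self-distance of a two-valued HFE
print(d_ltg([[0.4]], [[0.4]]))  # self-distance of a single-valued HFE
```

For the two-valued HFE, the cross pairs (0.3, 0.5) and (0.5, 0.3) each contribute 0.2, so the self-distance is 0.4/4 = 0.1 rather than 0.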

Some New hesitant fuzzy distance measures

According to the above analysis, the existing hesitance degrees consider only the number of values or only their degree of dispersion, which is one-sided. Therefore, by considering both simultaneously, we propose a new hesitance degree as follows.

Definition 9

Let A be an HFS on a reference set \(X=\{x_{1},x_{2},\ldots,x_{n}\}\), denoted by \(A=\{<x_{i}, h_{A}(x_{i})>\mid x_{i} \in X\}\), with \(h_{A}(x_{i})=\left \{h_{A}^{\sigma (j)}(x_{i}) \mid j=1,2, \ldots, l_{h_{A}}\right \}\). Then, the hesitance degree of \(x_{i}\) in A is defined as follows:

$$ h(h_{A}(x_{i}))=\left\{ \begin{array}{ll} \frac{2\theta}{l_{h_{A}} \times(l_{h_{A}}-1)}\sum\limits_{k>j=1}^{l_{h_{A}}}\left|h_{A}^{\sigma(k)}(x_{i})-h_{A}^{\sigma(j)}(x_{i})\right|+\mu g\left(l_{h_{A}}-1\right), & l_{h_{A}}>1\\ 0, & l_{h_{A}}=1 \end{array}\right. $$
(17)

where \(l_{h_{A}}\) is the length of \(h_{A}(x_{i})\), g is the minimum accuracy of the values in the hesitant fuzzy element \(h_{A}(x_{i})\), \(\theta, \mu \in [0,1]\) and θ+μ=1.

Therefore, the hesitance degree of the HFS A is defined as:

$$ h(A)=\frac{1}{n} \sum_{i=1}^{n} h\left(h_{A}\left(x_{i}\right)\right) $$
(18)

Remark 1

If n is the number of digits after the decimal point in the values of the hesitant fuzzy element (here n does not denote the size of X), then \(g=1/10^{n}\). For example, if h={0.2,0.3}, then the minimum accuracy is g=1/10=0.1; if h={0.25,0.36}, then \(g=1/10^{2}=0.01\).

Next, we use a numerical example to illustrate the advantages of the proposed hesitance degree in processing hesitant fuzzy information.

Example 1

Let \(h_{1}=\{0.3,0.5\}\), \(h_{2}=\{0.5,0.6\}\) and \(h_{3}=\{0.3,0.5,0.6\}\) be three hesitant fuzzy elements, with g=0.1 and θ=μ=0.5. Their hesitance degrees are calculated by the different formulas as follows.

The result calculated by formula (5) is:

$$h_{Z}(h_{1})=0.2, h_{Z}(h_{2})=0.1, h_{Z}(h_{3})=0.2 $$

The result calculated by formula (9) is:

$$h_{L}(h_{1})=0.5, h_{L}(h_{2})=0.5, h_{L}(h_{3})=0.66 $$

The result calculated by formula (17) is:

$$h(h_{1})=0.15, h(h_{2})=0.1, h(h_{3})=0.2 $$

From the above results, we find that, at the reported one-decimal precision, \(h_{Z}(h_{1})=h_{Z}(h_{3})\), that is, \(h_{Z}(\{0.3,0.5\})=h_{Z}(\{0.3,0.5,0.6\})\), and that \(h_{L}(h_{1})=h_{L}(h_{2})\), that is, \(h_{L}(\{0.3,0.5\})=h_{L}(\{0.5,0.6\})\). Formulas (5) and (9) therefore fail to separate hesitant fuzzy elements that clearly differ, which is unreasonable. In contrast, \(h(h_{1}) \neq h(h_{2}) \neq h(h_{3})\). That is to say, the proposed hesitance degree clearly distinguishes the hesitance degrees of the hesitant fuzzy elements \(h_{1}\), \(h_{2}\) and \(h_{3}\), which is consistent with intuition. Therefore, the proposed hesitance degree is more reasonable than the existing hesitance degrees mentioned above.
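The values reported for formula (17) can be reproduced with a short Python sketch. The function name `hesitance` and its keyword defaults are hypothetical; the minimum accuracy g is passed in directly, following Remark 1.

```python
from itertools import combinations


def hesitance(h, g, theta=0.5, mu=0.5):
    """Eq. (17): theta * mean pairwise gap + mu * g * (l - 1); 0 if l == 1."""
    l = len(h)
    if l == 1:
        return 0.0
    gaps = sum(abs(hk - hj) for hj, hk in combinations(h, 2))
    return 2.0 * theta * gaps / (l * (l - 1)) + mu * g * (l - 1)


for h in ([0.3, 0.5], [0.5, 0.6], [0.3, 0.5, 0.6]):
    print(h, round(hesitance(h, g=0.1), 4))
```

Rounded to two decimals, the three results are 0.15, 0.1 and 0.2, matching the figures given for formula (17). The first term responds to the spread of the values and the second to their number, so neither effect alone determines the result.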

Based on the proposed hesitance degree, we propose some new distance measures that can compare HFEs of equal or unequal length, which avoids destroying the original information by adding elements when the lengths are unequal.

Definition 10

Let hA(xi) and hB(xj) be two HFEs. Then, the normalized Hamming distance between hA(xi) and hB(xj) is defined as:

$$ d_{hllh}(h_{A}, h_{B}) =\left\{ \begin{array}{ll} \frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)| & {}_{l_{h_{A}} \neq l_{h_{B}}}\\ +\beta\left|h(h_{A})-h(h_{B})\right|\\ \frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}\left|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)\right|\quad &{}_{l_{h_{A}}=l_{h_{B}}}\\ +\beta\left|h(h_{A})-h(h_{B})\right| \\ \end{array}\right. $$
(19)

The normalized Euclidean distance between hA(xi) and hB(xj) is defined as:

$$ d_{hlle}(h_{A}, h_{B}) =\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}\left(h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)\right)^{2}\right. & {}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left.+\beta\left(h(h_{A})-h(h_{B})\right)^{2}\right]^{1 / 2}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}\left(h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)\right)^{2}\quad \right. & {}_{l_{h_{A}}=l_{h_{B}}}\\ \left. +\beta\left(h(h_{A})-h(h_{B})\right)^{2}\right]^{1 / 2} \\ \end{array}\right. $$
(20)

The Hausdorff metric distance between hA(xi) and hB(xj) is defined as:

$$ d_{hllh}(h_{A}, h_{B}) =\left\{ \begin{array}{ll} \max \limits_{i,j}\left(\alpha|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|\right. &{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left.+\beta\left|h(h_{A})-h(h_{B})\right|\right)\\ \max \limits_{i}\left(\alpha|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|\quad \right. &{}_{l_{h_{A}}=l_{h_{B}}}\\ \left.+\beta|h(h_{A})-h(h_{B})|\right) \\ \end{array}\right. $$
(21)

The normalized generalized distance between hA(xi) and hB(xj) is defined as:

$$ d_{hllg}(h_{A}, h_{B}) =\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|^{\lambda} \right.&{}_{l_{h_{A}} \neq l_{h_{B}}} \\ \left.+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}\quad \right.&{}_{l_{h_{A}}=l_{h_{B}}}\\ \left.+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda} \\ \end{array}\right. $$
(22)

where λ>0, \(\alpha, \beta \in [0,1]\), α+β=1, and \(l_{h_{A}}\) and \(l_{h_{B}}\) are the lengths of the HFEs \(h_{A}(x_{i})\) and \(h_{B}(x_{j})\), respectively.

In particular, if λ=1, then formula (22) reduces to formula (19). If λ=2, then formula (22) reduces to formula (20). If \(\lambda \rightarrow +\infty\), then formula (22) reduces to formula (21).

Example 2

Let \(h_{1}=\{0.3,0.5\}\), \(h_{2}=\{0.5,0.6\}\) and \(h_{3}=\{0.3,0.5,0.6\}\) be three hesitant fuzzy elements, with g=0.1 and θ=μ=0.5. Then, the Hamming distances between the HFEs are calculated by formula (19) as follows:

$$\begin{aligned} &d(h_{1},h_{2})=\frac{\alpha}{2}(|0.3-0.5|+|0.5-0.6|)+\beta|0.15-0.1| \\ &d(h_{2},h_{3})=\frac{\alpha}{2\times3}(|0.5-0.3|+|0.5-0.5|+|0.5-0.6|+|0.6-0.3|\\ &\qquad\qquad\quad+|0.6-0.5|+|0.6-0.6|)+\beta|0.1-0.2| \end{aligned} $$
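The two expressions above can be evaluated with the following sketch of formula (19). Example 2 does not state α and β, so α=β=0.5 is assumed here purely for illustration; the function names are hypothetical.

```python
from itertools import combinations


def hesitance(h, g, theta=0.5, mu=0.5):
    """Proposed hesitance degree of one HFE, Eq. (17)."""
    l = len(h)
    if l == 1:
        return 0.0
    gaps = sum(abs(hk - hj) for hj, hk in combinations(h, 2))
    return 2.0 * theta * gaps / (l * (l - 1)) + mu * g * (l - 1)


def d_hllh(hA, hB, g, alpha=0.5, beta=0.5, theta=0.5, mu=0.5):
    """Proposed normalized Hamming distance between two HFEs, Eq. (19)."""
    hes = abs(hesitance(hA, g, theta, mu) - hesitance(hB, g, theta, mu))
    if len(hA) == len(hB):
        # Equal lengths: compare the sorted values position by position.
        core = sum(abs(a - b) for a, b in zip(sorted(hA), sorted(hB))) / len(hA)
    else:
        # Unequal lengths: average over all cross pairs, no padding.
        core = sum(abs(a - b) for a in hA for b in hB) / (len(hA) * len(hB))
    return alpha * core + beta * hes


h1, h2, h3 = [0.3, 0.5], [0.5, 0.6], [0.3, 0.5, 0.6]
print(round(d_hllh(h1, h2, g=0.1), 4))  # equal-length branch
print(round(d_hllh(h2, h3, g=0.1), 4))  # unequal-length branch
print(d_hllh(h1, h1, g=0.1))            # identical HFEs
```

Under the assumed weights, the first two distances come out at about 0.1 and 0.1083, and the distance from an HFE to itself is 0 because the equal-length branch compares values position by position.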

Lemma 1

(Minkowski’s inequality [34]). Let \((a_{1},a_{2},\ldots,a_{n}),(b_{1},b_{2},\ldots,b_{n}) \in \mathbb{R}^{n}\), and \(1 \leq p<\infty\). Then

$$ \left(\sum_{k=1}^{n}\left|a_{k}+b_{k}\right|^{p}\right)^{\frac{1}{p}} \!\leqslant\left(\sum_{k=1}^{n}\left|a_{k}\right|^{p}\right)^{\frac{1}{p}}+\left(\sum_{k=1}^{n}\left|b_{k}\right|^{p}\right)^{\frac{1}{p}} $$
(23)

Theorem 1

Let \(\{A_{1},A_{2},\ldots,A_{m}\}\) be a set of HFSs on \(X=\{x_{1},x_{2},\ldots,x_{n}\}\), I={1,2,…,m}, and \(k,t \in I\). Then, \(d_{hllh}(A_{k},A_{t})\), \(d_{hlle}(A_{k},A_{t})\), and \(d_{hllg}(A_{k},A_{t})\) are hesitant fuzzy distances.

Proof

As \(d_{hllh}\), \(d_{hlle}\) and \(d_{hd}\) are special cases of \(d_{hllg}\), we only prove that \(d_{hllg}\) is a distance measure. By Definition 10, it is easy to see that Property (1) and Property (2) in Definition 2 hold. In the following, we prove that Property (3) and Property (4) hold.

(1) By Definition 10, we have

$$\begin{aligned} &d_{hllg}(h_{A}, h_{B})\\ &=\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{B}}} \\ \end{array}\right.\\ &=\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}\left(x_{j}\right)-h_{A}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{A})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{j=1}^{l_{h_{A}}}|h_{B}^{\sigma(j)}\left(x_{i}\right)-h_{A}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{A})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{B}}} \\ \end{array}\right.\\ =&d_{hllg}(h_{B}, h_{A}) \end{aligned} $$

Thus, dhllg(hA,hB)=dhllg(hB,hA), i.e., Property (3) holds.

(2) Property (4) states that d(A,B)+d(B,C)≥d(A,C). By Definition 10, it can be equivalently transformed into the following inequality:

$$\begin{array}{*{20}l} &\quad\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{B}}} \\ \end{array}\right.\\ &+\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}\left(x_{j}\right)-h_{C}\left(x_{k}\right)|^{\lambda} +\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{B}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{B}}=l_{h_{C}}} \\ \end{array}\right.\\ &\geq\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{C}\left(x_{k}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{C}}} \\ \end{array}\right. \end{array} $$

which can be further converted into:

$$\begin{array}{*{20}l} &\quad\left\{ \begin{array}{ll} \left[\frac{\alpha l_{h_{C}}}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{B}}} \\ \end{array}\right.\\ &+\!\left\{ \begin{array}{ll} \left[\frac{\alpha l_{h_{A}}}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}\left(x_{j}\right)-h_{C}\left(x_{k}\right)|^{\lambda} +\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{B}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{B}}=l_{h_{C}}} \\ \end{array}\right.\\ &\geq\left\{ \begin{array}{ll} \left[\frac{\alpha l_{h_{B}}}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{C}\left(x_{k}\right)|^{\lambda} +\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{C}}} \\ \end{array}\right. \end{array} $$

i.e.,

$$\begin{array}{*{20}l} &\quad\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}}\sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{B}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{B}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{B})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{B}}} \\ \end{array}\right.\\ &+\!\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}} \sum\limits_{j=1}^{l_{h_{B}}}\sum\limits_{i=1}^{l_{h_{A}}}|h_{B}\left(x_{j}\right)-h_{C}\left(x_{k}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{B}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{B}}} \sum\limits_{j=1}^{l_{h_{B}}}|h_{B}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{B})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{B}}=l_{h_{C}}} \\ \end{array}\right.\\ \end{array} $$
$$\begin{array}{*{20}l} &\geq\left\{ \begin{array}{ll} \left[\frac{\alpha}{l_{h_{A}} l_{h_{B}} l_{h_{C}}} \sum\limits_{k=1}^{l_{h_{C}}}\sum\limits_{j=1}^{l_{h_{B}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}\left(x_{i}\right)-h_{C}\left(x_{k}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}&{}_{l_{h_{A}} \neq l_{h_{C}}}\\ \left[\frac{\alpha}{l_{h_{A}}} \sum\limits_{i=1}^{l_{h_{A}}}|h_{A}^{\sigma(j)}\left(x_{i}\right)-h_{C}^{\sigma(j)}\left(x_{i}\right)|^{\lambda}+\beta|h(h_{A})-h(h_{C})|^{\lambda}\right]^{1 / \lambda}\quad &{}_{l_{h_{A}}=l_{h_{C}}} \\ \end{array}\right. \end{array} $$

Since the following equation holds:

$$ \left|h_{A}\left(x_{i}\right)\!-h_{C}\left(x_{k}\right)\right| =\left| h_{A}\left(x_{i}\right)-h_{B}\left(x_{j}\right)+h_{B}\left(x_{j}\right) -h_{C}\left(x_{k}\right)\right| $$
(24)

On the condition that \(1 \leq \lambda<+\infty\), we can deduce from Lemma 1 that \(d_{hllg}(A,C) \leq d_{hllg}(A,B)+d_{hllg}(B,C)\). Therefore, Property (4) is verified.

Thus, we complete the proof of Theorem 1.

Example 3

Let \(h_{1}=\{0.1,0.5\}\), \(h_{2}=\{0.3,0.8\}\), \(h_{3}=\{0.5,0.6\}\), \(h_{4}=\{0.3,0.5\}\) and \(h_{5}=\{0.3,0.5,0.6\}\) be five hesitant fuzzy elements, with g=0.1, θ=μ=0.5 and α=β=0.5. The distances calculated with the different formulas are shown in Table 1.

Table 1 Comparison of different distance measures

From Table 1, it can be seen that \(d_{hllh}(h_{1},h_{1})=0\) and \(d_{hllh}(h_{1},h_{2}) \neq d_{hllh}(h_{1},h_{3}) \neq d_{hllh}(h_{1},h_{4}) \neq d_{hllh}(h_{1},h_{5})\), which is consistent with intuition. That means the results based on the proposed distance measure are more reasonable than those of the above-mentioned distance measures.

On the other hand, we compare the characteristics of the proposed distance measure with those of the existing distance measures. The results are shown in Table 2.

Table 2 The characteristic comparisons with existing distance measures

From Table 2, it can be seen that the proposed distance measure has all of the listed characteristics, whereas the existing distance measures mentioned above do not. This suggests that the proposed distance measure is better suited than the existing ones to many complex situations.

Hesitant fuzzy clustering based on new distance measure

The description of clustering Algorithm

Recently, many studies have focused on the clustering analysis of HFSs. Chen and Xu [35] studied the clustering of hesitant fuzzy sets based on the K-means clustering algorithm, which takes the result of hierarchical clustering as the initial clusters. Zhang and Xu [36] proposed a novel hesitant fuzzy agglomerative hierarchical clustering algorithm. The algorithm treats each of the given HFSs as a separate cluster and then compares each pair of HFSs by using the weighted Hamming distance or the weighted Euclidean distance. The two clusters with the smallest distance are merged, and the process is repeated until the desired number of clusters is reached.

We study the hierarchical hesitant fuzzy K-means clustering algorithm and use the new distance measure to calculate the distance between hesitant fuzzy sets. The specific steps of the hierarchical hesitant fuzzy K-means clustering algorithm are as follows:

step1. (Hierarchical clustering) Consider each hesitant fuzzy set Ai(i=1,2,…,n) as an independent cluster {A1},{A2},…,{An}. Then calculate the distance between Ai and Aj, which is denoted by dij=d(Ai,Aj). The two clusters with smaller distance are jointed by average function, which is given as follows:

$$ {}f\left(A_{1}, A_{2}\right)\,=\,\frac{1}{2}\left(A_{1} \oplus A_{2}\right) \,=\,\left\{\!<\! x_{i}, \cup_{r_{1} \in h_{A_{1}}\left(x_{1}\right), r_{2} \in h_{A_{2}}\left(x_{i}\right)}\left\{1\,-\,\left[\left(1\,-\,r_{1}\right)\left(1-r_{2}\right)\right]^{1 / 2}\right\}>\mid x_{i} \in X\right\} $$
(25)

This iterative process is repeated until all clusters are aggregated into one cluster.

step2. According to the given number of clusters, select the corresponding result of step 1 as the initial clusters, then calculate the distance between each hesitant fuzzy set Ai(i=1,2,…,n) and the center of each cluster. Finally, assign Ai to the cluster with the closest center.

step3. Recalculate the cluster centers using the average function of hesitant fuzzy sets.

step4. Repeat steps 2 and 3 until all cluster centers are stable.
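The four steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `center`, `toy_distance`, `hierarchical_partitions` and `hierarchical_kmeans` are hypothetical, the center of a cluster with more than two members uses an assumed n-ary generalization of formula (25), and `toy_distance` is a simple placeholder that stands in for the proposed distance measure, which is not reproduced here.

```python
from itertools import combinations, product
from math import prod


def center(members):
    """Cluster center as the average of its member HFSs.

    For two members this is formula (25); for more members the n-ary form
    1 - (prod(1 - r_k)) ** (1 / n) is an assumed generalization.
    Each HFS is a list of hesitant fuzzy elements (lists of floats).
    """
    n = len(members)
    result = []
    for elements in zip(*members):  # the n elements given for one attribute x_i
        values = {round(1 - prod(1 - r for r in combo) ** (1 / n), 10)
                  for combo in product(*elements)}
        result.append(sorted(values))
    return result


def toy_distance(a, b):
    """Placeholder distance (NOT the proposed measure): mean gap of element means."""
    gaps = [abs(sum(h1) / len(h1) - sum(h2) / len(h2)) for h1, h2 in zip(a, b)]
    return sum(gaps) / len(gaps)


def hierarchical_partitions(sets, dist):
    """Step 1: merge the two closest clusters until a single cluster remains.

    Returns {number_of_clusters: partition}, each partition a list of index lists.
    Ties are broken by taking the first minimal pair found.
    """
    clusters = [[i] for i in range(len(sets))]
    history = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        centers = [center([sets[i] for i in c]) for c in clusters]
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: dist(centers[pq[0]], centers[pq[1]]))
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
        history[len(clusters)] = [c[:] for c in clusters]
    return history


def hierarchical_kmeans(sets, dist, k, max_iter=100):
    """Steps 2-4: start from the k-cluster partition and reassign until stable."""
    clusters = hierarchical_partitions(sets, dist)[k]
    for _ in range(max_iter):
        centers = [center([sets[i] for i in c]) for c in clusters]
        new = [[] for _ in centers]
        for i, s in enumerate(sets):
            nearest = min(range(len(centers)), key=lambda j: dist(s, centers[j]))
            new[nearest].append(i)
        new = [c for c in new if c]  # drop clusters that became empty
        if new == clusters:  # memberships (and hence centers) are stable
            break
        clusters = new
    return clusters


# Made-up data: four HFSs, each with a single attribute.
sets = [[[0.1]], [[0.15]], [[0.8]], [[0.9]]]
print(center([[[0.75]], [[0.0, 0.75]]]))  # formula (25) applied to one attribute
print(hierarchical_kmeans(sets, toy_distance, k=2))
```

Passing the distance as a parameter keeps the sketch independent of any particular measure, so the proposed distance or the distance of Ref. [35] could be substituted for `toy_distance` without changing the clustering code.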

Illustrative example

A specific example (adapted from Ref. [35]) is given below to illustrate the above algorithm. The proposed hesitant fuzzy distance is applied to the hierarchical hesitant fuzzy K-means clustering algorithm.

Five tourism resources need to be evaluated and classified. Experts give evaluation information on the tourism resources (with the parameters g=0.1, θ=μ=0.5, α=β=0.5) from six aspects, namely scale, environmental conditions, integrity, service, tourist routes and convenient transportation, which are expressed as X={x1,x2,…,x6}. The evaluation information of the five tourism resources is represented by hesitant fuzzy sets Ai(i=1,2,3,4,5), which are listed in Table 3:

Table 3 Hesitant fuzzy assessment information

step1. Consider each hesitant fuzzy set Ai(i=1,2,3,4,5) as an independent cluster: {A1},{A2},{A3},{A4} and {A5}. Using formula (21), calculate the distance between each hesitant fuzzy set and the other four hesitant fuzzy sets:

$$\begin{array}{l} d\left(A_{1}, A_{2}\right)=0.3326, d\left(A_{1}, A_{3}\right)=0.2473 \\ d\left(A_{1}, A_{4}\right)=0.2256, d\left(A_{1}, A_{5}\right)=0.4590 \\ d\left(A_{2}, A_{3}\right)=0.1797, d\left(A_{2}, A_{4}\right)=0.3444 \\ d\left(A_{2}, A_{5}\right)=0.1955, d\left(A_{3}, A_{4}\right)=0.1845 \\ d\left(A_{3}, A_{5}\right)=0.2293, d\left(A_{4}, A_{5}\right)=0.3052 \end{array} $$

Clearly, {A2} and {A3} are the two closest clusters, so they are merged and the new cluster {A2,A3} is calculated by formula (25). The hesitant fuzzy sets Ai(i=1,2,3,4,5) are thus divided into the following four clusters: {A1},{A2,A3},{A4} and {A5}. Next, calculate the distance between each cluster and the other three clusters:

$$\begin{array}{c} d\left(\left\{A_{2}, A_{3}\right\}, A_{1}\right)=0.2881 \\ d\left(\left\{A_{2}, A_{3}\right\}, A_{4}\right)=0.2644 \\ d\left(\left\{A_{2}, A_{3}\right\}, A_{5}\right)=0.2097 \\ d\left(A_{1}, A_{4}\right)=0.2256, d\left(A_{1}, A_{5}\right)=0.4950 \\ d\left(A_{4}, A_{5}\right)=0.3052 \end{array} $$

Because {A2,A3} and {A5} are the two closest clusters, they are merged, and the hesitant fuzzy sets are divided into the following three clusters: {A2,A3,A5},{A1} and {A4}. Calculate the new cluster and the distances between each cluster and the other clusters:

$$\begin{array}{c} d\left(A_{1}, \left\{A_{2}, A_{3}, A_{5}\right\}\right)=0.4341 \\ d\left(A_{4}, \left\{A_{2}, A_{3}, A_{5}\right\}\right)=0.3085 \\ d\left(A_{4}, A_{1}\right)=0.2256 \end{array} $$

Here {A1} and {A4} are the two closest clusters, so the hesitant fuzzy sets are divided into two clusters: {A2,A3,A5} and {A1,A4}.

In the end, the two clusters are merged into one cluster: {A1,A2,A3,A4,A5}.
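The merge order in step 1 can be checked directly from the distances listed above. The following is a small sketch, not part of the original method: the helper name `closest_pairs` and the string labels for merged clusters (for example "A2A3") are illustrative choices, and the numbers are copied from the three distance tables of this step.

```python
# Distances copied from the three tables of step 1 (proposed distance measure).
rounds = [
    {("A1", "A2"): 0.3326, ("A1", "A3"): 0.2473, ("A1", "A4"): 0.2256,
     ("A1", "A5"): 0.4590, ("A2", "A3"): 0.1797, ("A2", "A4"): 0.3444,
     ("A2", "A5"): 0.1955, ("A3", "A4"): 0.1845, ("A3", "A5"): 0.2293,
     ("A4", "A5"): 0.3052},
    {("A2A3", "A1"): 0.2881, ("A2A3", "A4"): 0.2644, ("A2A3", "A5"): 0.2097,
     ("A1", "A4"): 0.2256, ("A1", "A5"): 0.4950, ("A4", "A5"): 0.3052},
    {("A1", "A2A3A5"): 0.4341, ("A4", "A2A3A5"): 0.3085, ("A4", "A1"): 0.2256},
]


def closest_pairs(table):
    """Return every pair attaining the minimum distance (more than one means a tie)."""
    smallest = min(table.values())
    return [pair for pair, d in table.items() if d == smallest]


merges = [closest_pairs(table) for table in rounds]
for number, pairs in enumerate(merges, start=1):
    print(f"round {number}: merge {pairs}")
```

Each round returns exactly one pair, which matches the merge sequence {A2,A3}, then {A2,A3,A5}, then {A1,A4} described in the text.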

step2. Assume that the number of clusters c=3 is given. According to the result of step 1, c1={A1}, c2={A2,A3,A5} and c3={A4} are selected as the initial clusters. Next, calculate the distance between each hesitant fuzzy set Ai(i=1,2,…,5) and each initial cluster cj(j=1,2,3) as follows:

$$\begin{array}{c} d\left(A_{1}, c_{1}\right)=0,\ d\left(A_{1}, c_{2}\right)=0.4310 \\ d\left(A_{1}, c_{3}\right)=0.2806,\ d\left(A_{2}, c_{1}\right)=0.4149\\ d\left(A_{2}, c_{2}\right)=0.2122,\ d\left(A_{2}, c_{3}\right)=0.4500 \\ d\left(A_{3}, c_{1}\right)=0.3139,\ d\left(A_{3}, c_{2}\right)=0.1792 \\ d\left(A_{3}, c_{3}\right)=0.2222,\ d\left(A_{4}, c_{1}\right)=0.2806\\ d\left(A_{4}, c_{2}\right)=0.2948,\ d\left(A_{4}, c_{3}\right)=0\\ d\left(A_{5}, c_{1}\right)=0.5972,\ d\left(A_{5}, c_{2}\right)=0.1365\\ d\left(A_{5}, c_{3}\right)=0.4000 \end{array} $$

According to the above calculation results, the clustering result is c1={A1}, c2={A2,A3,A5} and c3={A4}.

step3. The cluster centers remain unchanged, so the iteration ends.
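The assignment in step 2 amounts to picking, for each Ai, the cluster center with the smallest listed distance. A minimal sketch of that lookup is given below; the dictionary layout and variable names are illustrative assumptions, and the values are copied from the distances listed in step 2.

```python
# Distances from each hesitant fuzzy set to the three initial cluster centers (step 2).
distances = {
    "A1": {"c1": 0.0,    "c2": 0.4310, "c3": 0.2806},
    "A2": {"c1": 0.4149, "c2": 0.2122, "c3": 0.4500},
    "A3": {"c1": 0.3139, "c2": 0.1792, "c3": 0.2222},
    "A4": {"c1": 0.2806, "c2": 0.2948, "c3": 0.0},
    "A5": {"c1": 0.5972, "c2": 0.1365, "c3": 0.4000},
}

# Assign each set to its nearest center.
assignment = {name: min(row, key=row.get) for name, row in distances.items()}

# Group the sets by assigned cluster.
clusters = {}
for name, cluster in assignment.items():
    clusters.setdefault(cluster, []).append(name)

print(clusters)
```

The grouping reproduces c1={A1}, c2={A2,A3,A5} and c3={A4}, that is, the memberships equal the initial clusters, which is why the centers do not change in step 3.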

Comparative analysis

In order to illustrate the performance of the proposed method, we make a comparative analysis with the hierarchical hesitant fuzzy k-means clustering algorithm introduced by Chen et al. [35].

Consider each hesitant fuzzy set Ai(i=1,2,3,4,5) as an independent cluster: {A1},{A2},{A3},{A4} and {A5}. Calculate the distance between each hesitant fuzzy set and the other four hesitant fuzzy sets:

$$\begin{array}{l} d\left(A_{1}, A_{2}\right)=0.4194, d\left(A_{1}, A_{3}\right)=0.3139 \\ d\left(A_{1}, A_{4}\right)=0.2806, d\left(A_{1}, A_{5}\right)=0.5972 \\ d\left(A_{2}, A_{3}\right)=0.2222, d\left(A_{2}, A_{4}\right)=0.4500 \\ d\left(A_{2}, A_{5}\right)=0.2444, d\left(A_{3}, A_{4}\right)=0.2222 \\ d\left(A_{3}, A_{5}\right)=0.3000, d\left(A_{4}, A_{5}\right)=0.4000 \end{array} $$

We find that d(A2,A3)=d(A3,A4)=min{d(Ai,Aj) | i,j=1,2,3,4,5, i≠j}=0.2222, so there are two options when merging two clusters into a new cluster. Therefore, the following two cases are considered.

case1: The hesitant fuzzy sets Ai(i=1,2,3,4,5) are divided into the following four clusters: {A1},{A2,A3},{A4} and {A5}. Calculate the distances between each cluster and the other three clusters; d({A2,A3},A5) is the shortest distance. Merging {A2,A3} and {A5} into a new cluster, the hesitant fuzzy sets are divided into three clusters: {A2,A3,A5},{A1} and {A4}. Calculate the new cluster and the distances between each cluster and the other clusters; d(A1,A4) is the shortest distance. Therefore, the hesitant fuzzy sets are divided into the following two clusters: {A2,A3,A5} and {A1,A4}. In the end, the two clusters are merged into one cluster: {A1,A2,A3,A4,A5}.

case2: The hesitant fuzzy sets Ai(i=1,2,3,4,5) are divided into the following four clusters: {A1},{A2},{A3,A4} and {A5}. Calculate the distances between each cluster and the other three clusters; d(A2,A5) is the shortest distance. Merging {A2} and {A5} into a new cluster, the hesitant fuzzy sets are divided into three clusters: {A1},{A3,A4} and {A2,A5}. Calculate the new cluster and the distances between each cluster and the other clusters; d({A3,A4},{A2,A5}) is the shortest distance. Therefore, the hesitant fuzzy sets are divided into two clusters: {A1} and {A2,A3,A4,A5}. In the end, the two clusters are merged into one cluster: {A1,A2,A3,A4,A5}.
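The tie that gives rise to the two cases can be detected mechanically. The sketch below, with the hypothetical helper `tied_minimum`, compares the pairwise distances of the method of Ref. [35] listed in this subsection with those of the proposed measure from the illustrative example; both sets of numbers are copied from the text.

```python
pairs = [("A1", "A2"), ("A1", "A3"), ("A1", "A4"), ("A1", "A5"), ("A2", "A3"),
         ("A2", "A4"), ("A2", "A5"), ("A3", "A4"), ("A3", "A5"), ("A4", "A5")]

# Pairwise distances as listed in the text, in the same order as `pairs`.
chen = dict(zip(pairs, [0.4194, 0.3139, 0.2806, 0.5972, 0.2222,
                        0.4500, 0.2444, 0.2222, 0.3000, 0.4000]))
proposed = dict(zip(pairs, [0.3326, 0.2473, 0.2256, 0.4590, 0.1797,
                            0.3444, 0.1955, 0.1845, 0.2293, 0.3052]))


def tied_minimum(table):
    """Return the minimum distance and all pairs that attain it."""
    smallest = min(table.values())
    return smallest, [pair for pair, d in table.items() if d == smallest]


print("Ref. [35]:", tied_minimum(chen))
print("proposed: ", tied_minimum(proposed))
```

With the listed values, two pairs share the minimum under the distance of Ref. [35], whereas the proposed distance has a single closest pair, so the first merge is determined without a case split.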

Obviously, the clustering results obtained in different cases are different. Next, we analyze the quality of the clustering results of the two cases. Generally, the average distance dρ is an indicator to measure the quality of clustering results. The smaller the dρ, the better the clustering result. The calculation process is as follows:

$$d\left(\left\{A_{2}, A_{3}\right\}, A_{2}\right)=0.1531,\ d\left(\left\{A_{2}, A_{3}\right\}, A_{3}\right)=0.0714 $$
$$d_{\rho}\left(\left\{A_{2}, A_{3}\right\}\right)=\frac{0.1531+0.0714}{2}=0.1123 $$
$$d\left(\left\{A_{3}, A_{4}\right\}, A_{3}\right)=0.1750,\ d\left(\left\{A_{3}, A_{4}\right\}, A_{4}\right)=0.1069 $$
$$d_{\rho}\left(\left\{A_{3}, A_{4}\right\}\right)=\frac{0.1750+0.1069}{2}=0.1410 $$

It can be seen that dρ({A2,A3}) is smaller than dρ({A3,A4}). Therefore, the clustering result of case1 is better than case2.
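The comparison of the two cases reduces to averaging the distances between each cluster center and its members. A short sketch of that arithmetic follows; the function name `average_distance` is a hypothetical label, and the inputs are the four distances listed above.

```python
def average_distance(distances_to_center):
    """Average distance d_rho between a cluster center and its members."""
    return sum(distances_to_center) / len(distances_to_center)


d_rho_case1 = average_distance([0.1531, 0.0714])  # cluster {A2, A3}
d_rho_case2 = average_distance([0.1750, 0.1069])  # cluster {A3, A4}

better = "case1" if d_rho_case1 < d_rho_case2 else "case2"
print(d_rho_case1, d_rho_case2, better)
```

The unrounded averages are approximately 0.11225 and 0.14095, consistent with the values 0.1123 and 0.1410 reported to four decimal places.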

Results and Discussion

According to the above analysis, the comparison result is shown in Table 4.

Table 4 Comparison of hierarchical clustering results

From Table 4, we can see that Chen’s method [35] yields two different clustering results, and it is difficult to decide which one to choose during the clustering process. Even if the better one can be selected, the selection increases the complexity of the algorithm. In contrast, the proposed method yields a unique clustering result, which is the same as the better of the two results obtained by Chen’s method. Therefore, the hierarchical hesitant fuzzy k-means clustering method based on the proposed distance measure is more reasonable and effective.

Conclusions

Existing hesitance degrees do not take into account both the degree of dispersion and the number of hesitant fuzzy values in a hesitant fuzzy element. To address this, a new hesitance degree is defined in this paper, which has better accuracy and applicability. We have elaborated the important role of the hesitance degree in hesitant fuzzy distance measures. Further, we proposed some hesitant fuzzy distance measures based on the new hesitance degree, which can overcome the shortcomings of the existing distance measures. Moreover, we applied the new hesitant fuzzy distance to the hierarchical hesitant fuzzy k-means clustering algorithm and presented an example to illustrate the effectiveness of the proposed method. In addition, we compared the result with that of the existing hierarchical hesitant fuzzy k-means clustering algorithm and found that the clustering algorithm based on the new distance measure is more reasonable. The proposed distance measure can avoid distortion of the original information and has higher resolution. Therefore, it can help decision-makers obtain a unique result in practical problems. In the future, we will apply the proposed distance measure to multi-attribute group decision-making, extend this approach to the interval-valued environment, and develop the knowledge measure [37] for hesitant fuzzy sets.

Availability of data and materials

All data generated or analysed during this study are included in this published article.

References

  1. Zadeh LA (1965) Fuzzy sets. Inf Control 8(3):338–353.

  2. Atanassov KT (1986) Intuitionistic fuzzy sets. Fuzzy Sets Syst 20(1):87–96.

  3. Atanassov K, Gargov G (1989) Interval valued intuitionistic fuzzy sets. Fuzzy Sets Syst 31(3):343–349.

  4. Zadeh LA (1975) The concept of a linguistic variable and its application to approximate reasoning. Inf Sci 8(3):199–249.

  5. Mizumoto M, Tanaka K (1976) Some properties of fuzzy sets of type 2. Inf Control 31(4):312–340.

  6. Yager RR (1986) On the theory of bags. Int J Gen Syst 13(1):23–37.

  7. Khan MJ, Kumam P, Deebani W, Kumam W, Shah Z (2020) Bi-parametric distance and similarity measures of picture fuzzy sets and their applications in medical diagnosis. Egypt Inf J 22(2):201–212.

  8. Torra V (2010) Hesitant fuzzy sets. Int J Intell Syst 25(6):529–539.

  9. Zhang Z (2013) Hesitant fuzzy power aggregation operators and their application to multiple attribute group decision making. Inf Sci 234:150–181.

  10. Zhu B, Xu Z (2013) Hesitant fuzzy Bonferroni means for multi-criteria decision making. J Oper Res Soc 64:1831–1840.

  11. Zhu B, Xu ZS (2012) Hesitant fuzzy geometric Bonferroni means. Inf Sci 205(1):72–85.

  12. Wei G (2012) Hesitant fuzzy prioritized operators and their application to multiple attribute decision making. Knowl-Based Syst 31:176–182.

  13. Xu Z, Zhang X (2013) Hesitant fuzzy multi-attribute decision making based on TOPSIS with incomplete weight information. Knowl-Based Syst 52:53–64.

  14. Liao H, Xu Z (2013) A VIKOR-based method for hesitant fuzzy multi-criteria decision making. Fuzzy Optim Decis Mak 12(4):373–392.

  15. Wang Y, Cuiping Q, Yixin L (2017) Hesitant fuzzy TOPSIS multi-attribute decision method based on prospect theory. Control Decis 32(5):864–870.

  16. Zhang X, Xu Z (2015) Hesitant fuzzy agglomerative hierarchical clustering algorithms. Int J Syst Sci 46(3):562–576.

  17. Liu X, Zhu J, Liu S (2014) Similarity measure of hesitant fuzzy sets based on symmetric cross entropy and its application in clustering analysis. Control Decis 29(10):1816–1822.

  18. Zhang X, Xu Z (2015) Novel distance and similarity measures on hesitant fuzzy sets with applications to clustering analysis. J Intell Fuzzy Syst 28(5):2279–2296.

  19. Chen N, Xu Z, Xia M (2014) Hierarchical hesitant fuzzy k-means clustering algorithm. Appl Math J Chin Univ 29(1):1–17.

  20. Akram M, Adeel A (2019) Novel TOPSIS method for group decision-making based on hesitant m-polar fuzzy model. J Intell Fuzzy Syst 37(6):8077–8096.

  21. Akram M, Adeel A, Al-Kenani AN, Alcantud JCR (2020) Hesitant fuzzy N-soft ELECTRE-II model: a new framework for decision-making. Neural Comput Appl 3:1–16.

  22. Deli I, Karaaslan F (2020) Generalized trapezoidal hesitant fuzzy numbers and their applications to multi criteria decision-making problems. Soft Comput 25(1):1017–1032.

  23. Karaaslan F, Özlü Ş (2019) Some distance measures for type-2 hesitant fuzzy sets and their applications to multi-criteria group decision making problems. Soft Comput 24(1):9965–9980.

  24. Su Z, Xu Z, Liu H, Liu S (2015) Distance and similarity measures for dual hesitant fuzzy sets and their applications in pattern recognition. J Intell Fuzzy Syst 29(2):731–745.

  25. Zeng W, Li D, Qian Y (2016) Distance and similarity measures between hesitant fuzzy sets and their application in pattern recognition 84:267–271.

  26. Zhang F, Chen S, Li J, Huang W (2018) New distance measures on hesitant fuzzy sets based on the cardinality theory and their application in pattern recognition. Soft Comput Fusion Found Methodologies Appl 22(4):1237–1245.

  27. Li C, Li D, Jin J (2019) Int J Patt Recogn Artif Intell 33(12):1–30.

  28. Xu Z, Xia M (2011) Distance and similarity measures for hesitant fuzzy sets. Inf Sci 181(11):2128–2138.

  29. Tong X, Yu L (2016) MADM based on distance and correlation coefficient measures with decision-maker preferences under a hesitant fuzzy environment. Soft Comput 20(11):4449–4461.

  30. Peng D, Gao C, Gao Z (2013) Generalized hesitant fuzzy synergetic weighted distance measures and their application to multiple criteria decision-making. Appl Math Modell 37(8):5837–5850.

  31. Tang X, Peng Z, Ding H, Cheng M, Yang S, Li C, de Oliveira José Valente (2018) Novel distance and similarity measures for hesitant fuzzy sets and their applications to multiple attribute decision making. J Intell Fuzzy Syst 34(6):3903–3916.

  32. Li D, Zeng W, Zhao Y (2015) Note on distance measure of hesitant fuzzy sets. Inf Sci 321:103–115.

  33. Xia M, Xu Z (2011) Hesitant fuzzy information aggregation in decision making. Int J Approx Reason 52(3):395–407.

  34. Hatzimichailidis AG, Papakostas GA, Kaburlasos VG (2012) A novel distance measure of intuitionistic fuzzy sets and its application to pattern recognition problems. Int J Intell Syst 27:396–409.

  35. Chen N, Xu Z, Xia M (2014) Hierarchical hesitant fuzzy k-means clustering algorithm. Appl Math J Chin Univ 29(1):1–17.

  36. Zhang X, Xu Z (2015) Hesitant fuzzy agglomerative hierarchical clustering algorithms. Int J Syst Sci 46(3):562–576.

  37. Khan MJ, Kumam P, Shutaywi M (2020) Knowledge measure for the q-rung orthopair fuzzy sets. Int J Intell Syst 36(2):628–655.

Acknowledgements

Not applicable.

Funding

The work is supported by the Key Research and Development Project of Hunan Province (No. 2019SK2331), the Natural Science Foundation of Hunan Province (Nos: 2018JJ3213, 2019JJ40100, 2019JJ40099), the Key scientific research projects of Hunan Education Department (Nos: 18A317, 19A202) and the Innovation Foundation for Postgraduate of Hunan Institute of Science and Technology (No. YCX2020A34).

Author information

Contributions

Fuping Liao was a major contributor in writing the manuscript. Wu Li conducted experimental analysis and comparison. Both of them were the main authors of the manuscript. Xiaoqiang Zhou and Gang Liu corrected the grammar of the paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Wu Li.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liao, F., Li, W., Zhou, X. et al. Novel distance measures of hesitant fuzzy sets and their applications in clustering analysis. J. Eng. Appl. Sci. 69, 115 (2022). https://doi.org/10.1186/s44147-022-00095-3

  • DOI: https://doi.org/10.1186/s44147-022-00095-3

Keywords