Uncertain Data Mining: A New Research Direction (foreign-language translation)

2025-01-19 00:34

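[Overview] Many research results have been published on handling data uncertainty in databases. We argue that when data mining is performed on uncertain data, the uncertainty has to be taken into account in order to obtain high-quality mining results. In this paper, we propose a framework for possible research directions in this area. [...] especially in applications that need to interact with the physical environment, such as mobile location services [15] and sensor [...]. Therefore, the change of each object's location is accompanied by [...]. To provide accurate query and mining results, these various sources of data uncertainty [...]. Taking the application of tracking moving objects again [...]. Unfortunately, the error between the inferred records and the true records may seriously affect the mining results. Figure 1 illustrates what happens when a clustering algorithm is applied to tracking [...]. In fuzzy clustering, a cluster is represented by a fuzzy subset of a set of objects [...]. Fuzzy C-means clustering is one of the most widely used fuzzy clustering [...]. Different fuzzy clustering methods have been applied to ordinary or fuzzy data to produce fuzzy [...]. Their work is based on a fuzzy data model, whereas ours is based on moving [...]. On the other hand, fuzzy clustering represents the clustering result as a "fuzzy" table. [...] each tuple and its associated uncertainty.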

  

[Main text]

. return C

Convergence can be defined based on different criteria. Some example convergence criteria include: (1) when the change in the sum of squared errors is smaller than a certain user-specified threshold, (2) when no objects are reassigned to a different cluster in an iteration, and (3) when the number of iterations has reached a predefined maximum number.

K-means Clustering for Uncertain Data

In order to take into account data uncertainty in the clustering process, we propose a clustering algorithm with the goal of minimizing the expected sum of squared errors E(SSE). Notice that a data object $x_i$ is specified by an uncertainty region with an uncertainty pdf $f(x_i)$. Given a set of clusters $C_j$, the expected SSE can be calculated as follows:

$$E\Bigl(\sum_{j=1}^{K} \sum_{i \in C_j} \lVert c_j - x_i \rVert^2\Bigr) = \sum_{j=1}^{K} \sum_{i \in C_j} E\bigl(\lVert c_j - x_i \rVert^2\bigr) = \sum_{j=1}^{K} \sum_{i \in C_j} \int \lVert c_j - x_i \rVert^2 f(x_i)\, dx_i \tag{4}$$

Cluster means are given by:

$$c_j = E\Bigl(\frac{1}{|C_j|} \sum_{i \in C_j} x_i\Bigr) = \frac{1}{|C_j|} \sum_{i \in C_j} E(x_i) = \frac{1}{|C_j|} \sum_{i \in C_j} \int x_i f(x_i)\, dx_i \tag{5}$$

We now propose a new K-means algorithm, called UK-means, for clustering uncertain data.

1. Assign initial values for cluster means $c_1$ to $c_K$
2. repeat
3.   for i = 1 to n do
4.     Assign each data point $x_i$ to cluster $C_j$ where $E(\lVert c_j - x_i \rVert)$ is the minimum.
5.   end for
6.   for j = 1 to K do
7.     Recalculate cluster mean $c_j$ of cluster $C_j$
8.   end for
9. until convergence
10. return C

The main difference between UK-means clustering and K-means clustering lies in the computation of distances and cluster means. In particular, UK-means computes the expected distances and cluster centroids based on the data uncertainty model. Again, convergence can be defined based on different criteria. Note that if convergence is based on squared error, E(SSE) as in Equation (4) should be used instead of SSE.

In Step 4, it is often difficult to determine $E(\lVert c_j - x_i \rVert)$ algebraically.
In particular, the variety of geometric shapes of uncertainty regions (e.g., line, circle) and the different uncertainty pdfs imply that numerical integration methods are necessary. In view of this, $E(\lVert c_j - x_i \rVert^2)$, which is easier to obtain, is used instead. This allows us to determine the cluster assignment (i.e., Step 4) using a simple algebraic expression.

5. A Case Study and Evaluation

Clustering Data with Line-moving Uncertainty

The UK-means algorithm presented in the last section is applicable to any uncertainty region and pdf. To demonstrate the feasibility of the approach, we describe how the proposed algorithm can be applied to uncertainty models specific to objects that are moving in a two-dimensional space. We also present the evaluation results of the algorithm.

The algorithm was applied to a model with unidirectional line-moving uncertainty, which requires that each object's location is uniformly distributed on a line segment along the line of movement in one direction. Suppose we have a centroid c = (p, q) and a data object x specified by a line uncertainty region with a uniform distribution. Let the end points of the line segment be (a, b) and (c, d). The line can be parametrized by (a + t(c - a), b + t(d - b)), where t lies in [0, 1]. Let the uncertainty pdf be f(t). Also, let the length of the line segment be

$$D = \sqrt{(c - a)^2 + (d - b)^2}$$

We have:

$$E\bigl(\lVert c - x \rVert^2\bigr) = \int_0^1 f(t)\,\bigl(D^2 t^2 + B t + C\bigr)\, dt \tag{6}$$

where

$$B = 2\,[(c - a)(a - p) + (d - b)(b - q)]$$

$$C = (p - a)^2 + (q - b)^2$$

If f(t) is uniform, then f(t) = 1, and the above becomes:

$$E\bigl(\text{distance of line uncertainty from centroid}^2\bigr) = \frac{D^2}{3} + \frac{B}{2} + C \tag{7}$$

We are thus able to compute the expected squared distance easily for line-moving uncertainty with a uniform distribution. These formulae can be readily used by the UK-means algorithm to decide the assignment of clusters. Nonetheless, the use of the uniform distribution is only a specific example here.
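
As an illustration only, the closed form in Equation (7) and the UK-means loop can be sketched in Python for uniform line-segment uncertainty. The function names `expected_sq_dist_uniform_line` and `uk_means_lines`, the random choice of initial means, and the "no reassignment" stopping rule are assumptions made for this sketch, not details given in the paper. The centroid update relies on Equation (5): for a uniform pdf on a segment, the expected location is the segment midpoint.

```python
import random


def expected_sq_dist_uniform_line(centroid, start, end):
    """Expected squared distance from a centroid (p, q) to a point that is
    uniformly distributed on the segment from start (a, b) to end (c, d)."""
    p, q = centroid
    a, b = start
    c, d = end
    d_squared = (c - a) ** 2 + (d - b) ** 2
    b_term = 2 * ((c - a) * (a - p) + (d - b) * (b - q))
    c_term = (p - a) ** 2 + (q - b) ** 2
    return d_squared / 3 + b_term / 2 + c_term


def uk_means_lines(segments, k, max_iter=100, seed=0):
    """Hypothetical UK-means sketch for uniform line-segment uncertainty.

    segments: list of ((a, b), (c, d)) end-point pairs.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Expected location of each object: the midpoint of its segment.
    midpoints = [((a + c) / 2, (b + d) / 2) for (a, b), (c, d) in segments]
    # Assumed initialisation: k distinct objects' expected locations.
    centroids = rng.sample(midpoints, k)
    labels = None
    for _ in range(max_iter):
        # Step 4: assign each object to the cluster with the smallest
        # expected squared distance.
        new_labels = [
            min(
                range(k),
                key=lambda j: expected_sq_dist_uniform_line(centroids[j], s, e),
            )
            for s, e in segments
        ]
        if new_labels == labels:  # no object was reassigned
            break
        labels = new_labels
        # Step 7: recompute each cluster mean from expected locations.
        for j in range(k):
            members = [midpoints[i] for i, lab in enumerate(labels) if lab == j]
            if members:  # keep the old mean if a cluster becomes empty
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


if __name__ == "__main__":
    demo = [
        ((0.0, 0.0), (1.0, 0.0)),
        ((0.0, 1.0), (1.0, 1.0)),
        ((50.0, 50.0), (51.0, 50.0)),
        ((50.0, 51.0), (51.0, 51.0)),
    ]
    print(uk_means_lines(demo, k=2))
```

Because the assignment step uses only the closed-form expression, no numerical integration is needed inside the loop; a different pdf would require replacing `expected_sq_dist_uniform_line` and the midpoint computation.
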
When the pdfs are not uniform (e.g., Gaussian), sampling techniques can be used to estimate $E(\lVert c_j - x_i \rVert)$.

Experiments

Experiments were conducted to evaluate the performance of UK-means. The goal is to study whether the inclusion of data uncertainty improves clustering quality. We simulate the following scenario: a system that tracks the locations of a set of moving objects has taken a snapshot of the whereabouts of the objects. This location data is stored in a set called recorded. Each object is assumed to follow an uncertainty model; let the set uncertainty capture this uncertainty information. We compare two clustering approaches: (1) apply K-means to recorded and (2) apply UK-means to recorded + uncertainty.

More specifically, we first generated a set of random data points in a 100 x 100 2D space as recorded. For each data point, we then randomly generated its uncertainty according to the unidirectional line-uncertainty model. The uncertainty specification (uncertainty) of an object contains the type of the uncertainty (bidirectional line), the maximum distance d that the object can move, and the direction in which the object can move.

The actual locations of the objects were then generated based on recorded and uncertainty, simulating the scenario in which the objects have moved away from their original locations as registered in recorded. Specifically, for each data point, we took its position in recorded and then generated a random number to decide the distance that the object should have moved. If it has free-moving (circle) uncertainty or bidirectional uncertainty, we generated another random number to decide in which direction the object should move. We [...]