[Main Text]
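... these groupings, you can explore the data and learn more about the relationships that exist, which may not be easy to derive logically through casual observation. For example, consider a group of people who live in the same neighborhood, drive the same kind of car, eat the same kind of food, and buy a similar version of a product. Another group may include people who go to the same restaurants, have similar salaries, and vacation twice a year outside their home region.

Naive Bayes

The Naive Bayes algorithm quickly builds mining models that can be used for classification and prediction. The probabilities used to generate the model are calculated and stored during the processing of the cube. The Naive Bayes algorithm produces a simple mining model that can be considered a starting point in the data mining process. This makes the model a good option for exploring the data and for discovering how various input attributes are distributed across the different states of the predicted attribute.

For example, you can use the time series algorithm to forecast sales and profits based on the historical data in a cube. You can have only one case series for each model. A case may contain a set of variables (for example, sales at different stores). For example, prior sales at one store may also be useful in predicting current sales at another store.

The algorithm treats each attribute/value pair (such as product/bicycle) as an item. The algorithm scans the dataset trying to find itemsets that tend to appear in many transactions. For example, a frequent itemset may contain (Gender = "Male", Marital Status = "Married", Age = "30-35"). In this case, the size is 3. If a nested table exists in the dataset, each nested key (such as a product in a purchases table) is considered an item. A rule in an association model looks like A, B => C (associated with a probability of occurring), where A, B, and C are all frequent itemsets. The "=>" means that C is predicted by A and B. These probabilities are also called "confidence" in the data mining literature. For example, you can use an association model to predict the products a user may want to purchase based on the items in their shopping basket.

Typically, the sequence attribute in the series holds a set of events with a specific order (such as a click path). The sequence clustering algorithm is a hybrid of sequence and clustering algorithms. A typical usage scenario for this algorithm is Web customer analysis for a portal site. Each Web customer is associated with a sequence of Web clicks across these domains. These groups can then be visualized, providing a detailed understanding of how customers use the site.

Similar to the Microsoft Decision Trees algorithm provider, given each state of the predictable attribute, the algorithm calculates probabilities for each possible state of the input attribute. The errors from the initial classification of the first iteration over the entire set of cases are fed back into the network and used to modify the network's performance for the next iteration, and so on. One of the main differences between this algorithm and the decision trees algorithm, however, is that its learning process optimizes the network parameters toward minimizing the error, whereas the decision trees algorithm splits rules in order to maximize information gain.

Linear Regression

The linear regression algorithm is a particular configuration of the decision trees algorithm, obtained by disabling splits (the whole regression formula is built in a single root node).

Logistic Regression

The logistic regression algorithm is a particular configuration of the neural network algorithm, obtained by eliminating the hidden layer.

Source text of the translation:

SQL Server Management Studio

SQL Server Management Studio is a collection of administrative and scripting tools for working with Microsoft SQL Server components. This workspace differs from Business Intelligence Development Studio in that you are working in a connected environment where actions are propagated to the server as soon as you save your work. After the data has been cleaned and prepared for data mining, most of the tasks associated with creating a data mining solution are performed within Business Intelligence Development Studio.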
Using the Business Intelligence Development Studio tools, you develop and test the data mining solution, using an iterative process to determine which models work best for a given situation. When the developer is satisfied with the solution, it is deployed to an Analysis Services server. From this point, the focus shifts from development to maintenance and use, and thus to SQL Server Management Studio. Using SQL Server Management Studio, you can administer your database and perform some of the same functions as in Business Intelligence Development Studio, such as viewing mining models and creating predictions from them.

Data Transformation Services

Data Transformation Services (DTS) comprises the Extract, Transform, and Load (ETL) tools in SQL Server 2005. These tools can be used to perform some of the most important tasks in data mining: cleaning and preparing the data for model creation. In data mining, you typically perform repetitive data transformations to clean the data before using the data to train a mining model. Using the tasks and transformations in DTS, you can combine data preparation and model creation into a single DTS package. DTS also provides DTS Designer to help you easily build and run packages containing all of the tasks and transformations. Using DTS Designer, you can deploy the packages to a server and run them on a regularly scheduled basis. This is useful if, for example, you collect data weekly and want to perform the same cleaning transformations each time in an automated fashion. You can work with a Data Transformation project and an Analysis Services project together as part of a business intelligence solution by adding each project to a solution in Business Intelligence Development Studio.

Mining Model Algorithms

Data mining algorithms are the foundation from which mining models are created. The variety of algorithms included in SQL Server 2005 allows you to perform many types of analysis.
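As one illustration of the kind of analysis these algorithms perform, the association rules discussed earlier in this document (rules of the form A, B => C with a probability, also called confidence) can be sketched in a few lines of Python. This is only a brute-force sketch for small data, not the Microsoft implementation; the function names `frequent_itemsets` and `confidence`, the sample transactions, and the `min_support` threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Count every itemset in every transaction (brute force) and keep
    those that appear in at least min_support transactions."""
    counts = {}
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {k: c for k, c in counts.items() if c >= min_support}


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from transaction counts."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= t]
    return len(with_both) / len(with_antecedent)


# Hypothetical data: each attribute/value pair is treated as one item.
transactions = [
    frozenset({"Gender=Male", "Marital Status=Married", "Product=Bicycle"}),
    frozenset({"Gender=Male", "Marital Status=Married", "Product=Helmet"}),
    frozenset({"Gender=Male", "Marital Status=Married", "Product=Bicycle"}),
    frozenset({"Gender=Female", "Marital Status=Single", "Product=Bicycle"}),
]

frequent = frequent_itemsets(transactions, min_support=3)

# Rule: Gender=Male, Marital Status=Married => Product=Bicycle
rule_confidence = confidence(
    transactions,
    {"Gender=Male", "Marital Status=Married"},
    {"Product=Bicycle"},
)
```

Enumerating every subset of every transaction is exponential in transaction size, so a production algorithm would prune candidates (for example, Apriori-style); the sketch only makes the definitions of itemset, support count, and confidence concrete.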
For more specific information about the algorithms and how they can be adjusted using parameters, see "Data Mining Algorithms" in SQL Server Books Online.

Microsoft Decision Trees

The Microsoft Decision Trees algorithm supports both classification and