Nonlinear Regression
• A polynomial regression model can be transformed into a linear regression model. For example,
  y = w0 + w1 x + w2 x^2 + w3 x^3
  becomes linear with the new variables x2 = x^2 and x3 = x^3:
  y = w0 + w1 x + w2 x2 + w3 x3
• Other functions, such as the power function, can also be transformed into a linear model
• Some models are intractably nonlinear (e.g., a sum of exponential terms)
• Least-squares estimates may still be obtained for such models through more complex calculations
Tuesday, January 4, 2022 · Data Mining: Concepts and Techniques · 73

Other Regression-Based Models 74
• Generalized linear model:
  • Foundation on which linear regression can be applied to modeling categorical response variables
  • Variance of y is a function of the mean value of y, not a constant
• Logistic regression: models the probability of some event occurring as a linear function of a set of predictor variables
• Poisson regression: models data that exhibit a Poisson distribution
• Log-linear models (for categorical data):
  • Approximate discrete multidimensional probability distributions
  • Also useful for data compression and smoothing
• Regression trees and model trees:
  • Trees to predict continuous values rather than class labels

Regression Trees and Model Trees 75
• Regression tree: proposed in the CART system (Breiman et al. 1984)
  • CART: Classification And Regression Trees
  • Each leaf stores a continuous-valued prediction
  • It is the average value of the predicted attribute for the training tuples that reach the leaf
• Model tree: proposed by Quinlan (1992)
  • Each leaf holds a regression model, a multivariate linear equation for the predicted attribute
  • A more general case than the regression tree
• Regression and model trees tend to be more accurate than linear regression when the data are not represented well by a simple linear model

Prediction: Numerical Data 76

Prediction: Categorical Data 77

Chapter 6. Classification: Basic Concepts
• Classification: Basic Concepts
• Decision Tree Induction
• Bayes Classification Methods
• Rule-Based Classification
• Model Evaluation and Selection
• Techniques to Improve Classification Accuracy: Ensemble Methods
• Summary

Summary (I)
• Classification is a form of data analysis that extracts models describing important data classes.
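The polynomial-to-linear transformation described above can be sketched in a few lines of Python. The data below are hypothetical, generated from a known cubic so that the recovered coefficients can be checked; introducing the new variables x2 = x^2 and x3 = x^3 makes the model linear in its features, so ordinary least squares applies directly:

```python
import numpy as np

# Hypothetical data: samples from a known cubic with a little noise.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.25 * x**3 + rng.normal(0.0, 0.05, x.size)

# New variables x2 = x^2, x3 = x^3: the cubic model
# y = w0 + w1*x + w2*x2 + w3*x3 is now linear in (1, x, x2, x3).
X = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Ordinary least-squares estimate of (w0, w1, w2, w3).
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(w, 2))  # close to the true coefficients 1, 2, -0.5, 0.25
```

The same trick works for any model that is linear in its parameters after a fixed feature transform, which is why polynomial regression is usually solved with linear-regression machinery.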
• Effective and scalable methods have been developed for decision tree induction, naive Bayesian classification, rule-based classification, and many other classification methods.
• Evaluation metrics include: accuracy, sensitivity, specificity, precision, recall, F measure, and Fβ measure.
• Stratified k-fold cross-validation is recommended for accuracy estimation. Bagging and boosting can be used to increase overall accuracy by learning and combining a series of individual models.
78

Summary (II)
• Significance tests and ROC curves are useful for model selection.
• There have been numerous comparisons of the different classification methods; the matter remains a research topic
• No single method has been found to be superior over all others for all data sets
• Issues such as accuracy, training time, robustness, scalability, and interpretability must be considered and can involve trade-offs, further complicating the quest for an overall superior method
79

References (1)
• C. Apte and S. Weiss. Data mining with decision trees and decision rules. Future Generation Computer Systems, 13, 1997
• C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, 1995
• L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth International Group, 1984
• C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2): 121-168, 1998
• P. K. Chan and S. J. Stolfo. Learning arbiter and combiner trees from partitioned data for scaling machine learning. KDD'95
• H. Cheng, X. Yan, J. Han, and C.-W. Hsu. Discriminative Frequent Pattern Analysis for Effective Classification. ICDE'07
• H. Cheng, X. Yan, J. Han, and P. S. Yu. Direct Discriminative Pattern Mining for Effective Classification. ICDE'08
• W. Cohen. Fast effective rule induction. ICML'95
• G. Cong, K.-L. Tan, A. K. H. Tung, and X. Xu. Mining top-k covering rule groups for gene expression data. SIGMOD'05
80

References (2)
• A. J.
Dobson. An Introduction to Generalized Linear Models. Chapman & Hall, 1990.
• G. Dong and J. Li. Efficient mining of emerging patterns: Discovering trends and differences. KDD'99.
• R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. John Wiley, 2001
• U. M. Fayyad. Branching on attribute values in decision tree generation. AAAI'94.
• Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Computer and System Sciences, 1997.
• J. Gehrke, R. Ramakrishnan, and V. Ganti. RainForest: A framework for fast decision tree construction of large datasets. VLDB'98.
• J. Gehrke, V. Ganti, R. Ramakrishnan, and W.-Y. Loh. BOAT: Optimistic Decision Tree Construction. SIGMOD'99.
• T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, 2001.
• D. Heckerman, D. Geiger, and D. M. Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 1995.
• W. Li, J. Han, and J. Pei. CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules. ICDM'01.
81

References (3)
• T.-S. Lim, W.-Y. Loh, and Y.-S. Shih. A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 2000.
• J. Magidson. The CHAID approach to segmentation modeling: Chi-squared automatic interaction detection. In R. P. Bagozzi, editor, Advanced Methods of Marketing Research, Blackwell Business, 1994.
• M. Mehta, R. Agrawal, and J. Rissanen. SLIQ: A fast scalable classifier for data mining. EDBT'96.
• T. M. Mitchell. Machine Learning. McGraw Hill, 1997.
• S. K. Murthy. Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey. Data Mining and Knowledge Discovery, 2(4): 345-389, 1998
• J. R. Quinl