Naïve Bayes Classifier
- A simplifying assumption: attributes are conditionally independent given the class:
  P(X|Ci) = P(x1|Ci) × P(x2|Ci) × … × P(xn|Ci)
- For example, the probability of observing two attribute values y1 and y2 together, given the current class C, is the product of the probabilities of each value taken separately, given the same class: P([y1, y2]|C) = P(y1|C) × P(y2|C)
- No dependence relation between attributes is modeled
- This greatly reduces the computation cost: only the class distributions need to be counted
- Once the probability P(X|Ci) is known, assign X to the class with the maximum P(X|Ci) × P(Ci)

Naïve Bayesian Classifier: Example
- Compute P(X|Ci) for each class:
  P(age="<=30" | buys_computer="yes") = 2/9 = 0.222
  P(age="<=30" | buys_computer="no") = 3/5 = 0.600
  P(income="medium" | buys_computer="yes") = 4/9 = 0.444
  P(income="medium" | buys_computer="no") = 2/5 = 0.400
  P(student="yes" | buys_computer="yes") = 6/9 = 0.667
  P(student="yes" | buys_computer="no") = 1/5 = 0.200
  P(credit_rating="fair" | buys_computer="yes") = 6/9 = 0.667
  P(credit_rating="fair" | buys_computer="no") = 2/5 = 0.400
- Unseen sample: X = (age="<=30", income="medium", student="yes", credit_rating="fair")
- P(X|Ci):
  P(X | buys_computer="yes") = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X | buys_computer="no") = 0.600 × 0.400 × 0.200 × 0.400 = 0.019
- P(X|Ci) × P(Ci):
  P(X | buys_computer="yes") × P(buys_computer="yes") = 0.044 × 9/14 = 0.028
  P(X | buys_computer="no") × P(buys_computer="no") = 0.019 × 5/14 = 0.007
- Therefore X belongs to class "buys_computer = yes"
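To make the arithmetic of this example concrete, here is a minimal Python sketch that reproduces the computation. The probability values are the ones listed above; the variable names (cond_prob, prior) are illustrative, not from the original slides.

```python
# Minimal sketch: naive Bayes posterior computation for the example above.
# Conditional probabilities and class priors are the values from the
# worked example (9 "yes" and 5 "no" tuples in the training data).

cond_prob = {
    "yes": {"age<=30": 2/9, "income=medium": 4/9,
            "student=yes": 6/9, "credit=fair": 6/9},
    "no":  {"age<=30": 3/5, "income=medium": 2/5,
            "student=yes": 1/5, "credit=fair": 2/5},
}
prior = {"yes": 9/14, "no": 5/14}

X = ["age<=30", "income=medium", "student=yes", "credit=fair"]

scores = {}
for c in prior:
    # Class-conditional independence: P(X|Ci) is a product of per-attribute terms.
    p_x_given_c = 1.0
    for attr in X:
        p_x_given_c *= cond_prob[c][attr]
    scores[c] = p_x_given_c * prior[c]   # P(X|Ci) * P(Ci)

print(scores)                       # {'yes': ~0.028, 'no': ~0.007}
print(max(scores, key=scores.get))  # 'yes' -> buys_computer = yes
```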
Naïve Bayesian Classifier: Comments
- Advantages:
  - Easy to implement
  - Good results obtained in most of the cases
- Disadvantages:
  - Assumption of class conditional independence, and therefore loss of accuracy
  - Practically, dependencies exist among variables
  - E.g., hospital patients: profile (age, family history, etc.), symptoms (fever, cough, etc.), disease (lung cancer, diabetes, etc.)
  - Dependencies among these cannot be modeled by a Naïve Bayesian Classifier
- How to deal with these dependencies? Bayesian Belief Networks

Bayesian Networks
- A Bayesian belief network allows a subset of the variables to be conditionally independent
- A graphical model of causal relationships:
  - Represents dependency among the variables
  - Gives a specification of the joint probability distribution
- Nodes: random variables; links: dependency
- [Figure: a small network over nodes X, Y, Z, P] X and Y are the parents of Z, and Y is the parent of P; there is no dependency between Z and P; the graph has no loops or cycles

Bayesian Belief Network: An Example
- [Figure: network with nodes FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay, Dyspnea]
- The conditional probability table (CPT) for the variable LungCancer has one row per value (LC, ~LC) and one column per combination of the values of its parents: (FH, S), (FH, ~S), (~FH, S), (~FH, ~S)
- The CPT shows the conditional probability for each possible combination of the values of its parents

Learning Bayesian Networks
- Several cases:
  - Given both the network structure and all variables observable: learn only the CPTs
  - Network structure known, some variables hidden: method of gradient descent, analogous to neural network learning
  - Network structure unknown, all variables observable: search through the model space to reconstruct the graph topology
  - Unknown structure, all variables hidden: no good algorithms known for this purpose
- Reference: D. Heckerman, Bayesian networks for data mining

Chapter 7: Classification and Prediction
- What is classification? What is prediction?
- Issues regarding classification and prediction
- Classification by decision tree induction
- Bayesian Classification
- Classification by Neural Networks
- Classification by Support Vector Machines (SVM)
- Classification based on concepts from association rule mining
- Other Classification Methods
- Prediction
- Classification accuracy
- Summary

Classification
- Classification predicts categorical class labels
- Typical applications:
  - {credit history, salary} → credit approval (Yes/No)
  - {Temp, Humidity} → Rain (Yes/No)
- Mathematically: learn a function f : X → Y that maps each attribute vector x ∈ X to a class label y ∈ Y

Linear Classification
- Binary classification problem
- [Figure: points of class 'x' above a red separating line, points of class 'o' below it]
- The data above the red line belongs to class 'x'; the data below the red line belongs to class 'o'
- Examples: SVM, Perceptron, Probabilistic Classifiers

Discriminative Classifiers
- Advantages:
  - Prediction accuracy is generally high (as compared to Bayesian methods in general)
  - Robust: works when training examples contain errors
  - Fast evaluation of the learned target function (Bayesian networks are normally slow)
- Criticism:
  - Long training time
  - Difficult to understand the learned function (weights), whereas Bayesian networks can be used easily for pattern discovery
  - Not easy to incorporate domain knowledge (easy in Bayesian methods, in the form of priors on the data or distributions)

Neural Networks
- Analogy to biological systems (indeed a great example of a good learning system)
- Massive parallelism, allowing for computational efficiency
- The first learning algorithm came in 1959 (Rosenblatt), who suggested that if a target output value is provided for a single neuron with fixed inputs, one can incrementally change the weights to learn to produce this output, using the perceptron learning rule (see the training sketch after the neuron example below)

A Neuron
- The n-dimensional input vector x is mapped into variable y by means of the scalar product and a nonlinear function mapping:
  y = f(Σi wi xi − μk)
- [Figure: inputs x0, x1, …, xn with weights w0, w1, …, wn feed a weighted sum; the bias μk is subtracted and the activation function f produces the output y]
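A minimal sketch of the neuron computation just described. The slide leaves the activation function f generic; sign is used here as one common, illustrative choice.

```python
import numpy as np

def neuron(x, w, mu_k, f=np.sign):
    """Map input vector x to output y = f(w . x - mu_k).

    x    : n-dimensional input vector
    w    : weight vector of the same length
    mu_k : bias (threshold) term
    f    : nonlinear activation function (sign used here as an example)
    """
    return f(np.dot(w, x) - mu_k)

# Example: a neuron with fixed weights applied to a 3-dimensional input.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.8, 0.2, 0.4])
print(neuron(x, w, mu_k=0.5))  # sign(1.0 - 0.5) = 1.0
```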
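The perceptron learning rule mentioned in the Neural Networks slide can be sketched as follows. This is an illustrative implementation under common conventions (targets in {-1, +1}, mistake-driven updates), not the notation of the original slides: whenever the neuron's output disagrees with the target, the weights are nudged toward producing the target.

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=100):
    """Perceptron learning rule: on a mistake, w += lr * (target - output) * x.

    X : (m, n) array of training inputs
    y : length-m array of targets in {-1, +1}
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, target in zip(X, y):
            output = 1.0 if np.dot(w, xi) + b > 0 else -1.0
            if output != target:
                # Update only on mistakes, shifting the separating line.
                w += lr * (target - output) * xi
                b += lr * (target - output)
                errors += 1
        if errors == 0:  # converged: all training points correctly classified
            break
    return w, b

# Example: linearly separable data, class 'x' (+1) above the line, 'o' (-1) below.
X = np.array([[1.0, 2.0], [2.0, 3.0], [1.0, -1.0], [2.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
print(w, b)
```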