

8-1 Data Warehousing and Data Mining

  

…ay
• If there are more leaf nodes than fit in memory, merge existing clusters that are close to each other
• At the end of the first pass we get a large number of clusters at the leaves of the R-tree
  • Merge clusters to reduce the number of clusters

©Silberschatz, Korth and Sudarshan, Database System Concepts, 6th Edition

Finding Support
• Determine support of itemsets via a single pass on the set of transactions
  • Large itemsets: sets with a high count at the end of the pass
• If memory is not enough to hold all counts for all itemsets, use multiple passes, considering only some itemsets in each pass.
• Optimization: once an itemset is eliminated because its count (support) is too small, none of its supersets needs to be considered.
• The a priori technique to find large itemsets:
  • Pass 1: count support of all sets with just 1 item. Eliminate those items with low support
  • Pass i: candidates: every set of i items such that all its (i-1)-item subsets are large
    • Count support of all candidates
    • Stop if there are no candidates

… the population consists of a set of instances
• E.g., each transaction (sale) at a shop is an instance, and the set of all transactions is the population

Naïve Bayesian Classifiers
• Bayesian classifiers require
  • computation of p(d | cj)
  • precomputation of p(cj)
  • p(d) can be ignored since it is the same for all classes
• To simplify the task, naïve …

… Use best split found (across all attributes) to partition S into S1, S2, …, Sr
for i = 1, 2, …, r
    Partition(Si);

Finding Best Splits
• Categorical attributes (with no meaningful order):
  • Multiway split, one child for each value
  • Binary split: try all possible breakups of values into two sets, and pick the best
• Continuous-valued attributes (can be sorted in a meaningful order)
  • Binary split:
    • Sort values, try each as a split point
      • E.g., if values are 1, 10, 15, 25, split at ≤ 1, ≤ 10, ≤ 15
    • Pick the value that gives the best split
  • Multiway split:
    • A series of binary splits on the same attribute has roughly equivalent effect

Construction of Decision Trees
• Training set: a data sample in which the classification is already known.
• Greedy top-down generation of decision trees.
  • Each internal node of the tree partitions the data into groups based on a partitioning attribute, and a partitioning condition for the node
  • Leaf node:
    • all (or most) of the items at the node belong to the same class, or
    • all attributes have been considered, and no further partitioning is possible.

Data Mining
• Data mining is the process of semi-automatically analyzing large databases to find useful patterns
• Prediction based on past history
  • Predict if a credit card applicant poses a good credit risk, based on some attributes (income, job type, age, ...) and past history
  • Predict if a pattern of phone calling card usage is likely to be fraudulent
• Some examples of prediction mechanisms:
  • Classification
    • Given a new item whose class is unknown, predict to which class it belongs
  • Regression formulae
    • Given a set of mappings for an unknown function, predict the function result for a new parameter value

Design Issues
• When and how to gather data
  • Source-driven architecture: data sources transmit new information to the warehouse, either continuously or periodically (e.g., at night)
  • Destination-driven architecture: the warehouse periodically requests new information from data sources
  • Keeping the warehouse exactly synchronized with data sources (e.g., using two-phase commit …
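
The a priori passes listed under Finding Support can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the slides: the function name apriori, the min_count threshold (an absolute count standing in for "low support"), and the sample shop transactions are all assumptions, and every transaction is rescanned once per pass rather than using any memory optimization.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every large itemset.

    Hypothetical helper: an itemset is treated as "large" when its
    count is at least min_count (an assumed absolute threshold).
    Items must be sortable (e.g. strings).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count support of all sets with just 1 item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    # Eliminate those items with low support.
    large = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(large)

    i = 2
    while large:
        # Pass i: candidates are sets of i items whose (i-1)-item
        # subsets are all large.
        items = sorted({item for s in large for item in s})
        candidates = [
            frozenset(c)
            for c in combinations(items, i)
            if all(frozenset(sub) in large for sub in combinations(c, i - 1))
        ]
        if not candidates:
            break  # Stop if there are no candidates.

        # Count support of all candidates in one pass over the data.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate is a subset of the transaction
                    counts[c] += 1
        large = {s: c for s, c in counts.items() if c >= min_count}
        result.update(large)
        i += 1

    return result


# Made-up sample data: each set is one sale at a shop.
sales = [
    {"bread", "milk"},
    {"bread", "milk", "cereal"},
    {"bread", "cereal"},
    {"milk"},
]
for itemset, count in apriori(sales, min_count=2).items():
    print(sorted(itemset), count)
```

With this sample data, {cereal, milk} is eliminated in pass 2, so the three-item set {bread, cereal, milk} never becomes a candidate. That is the superset-pruning optimization from the list above.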