

8-1 Data Warehousing and Data Mining - wenkub

2023-01-30 18:10:28
 

…ting causation
  - E.g., association between exposure to chemical X and cancer
- Clusters
  - E.g., typhoid cases were clustered in an area surrounding a contaminated well
  - Detection of clusters remains important in detecting epidemics

© Silberschatz, Korth and Sudarshan, Database System Concepts, 6th Edition

More Warehouse Design Issues
- Data cleansing
  - E.g., correct mistakes in addresses (misspellings, zip code errors)
  - Merge address lists from different sources and purge duplicates
- How to propagate updates
  - Warehouse schema may be a (materialized) view of schema from data sources
- What data to summarize
  - Raw data may be too large to store online
  - Aggregate values (totals/subtotals) often suffice
  - Queries on raw data can often be transformed by the query optimizer to use aggregate values

Decision-Support Systems: Overview
- Data analysis tasks are simplified by specialized tools and SQL extensions
  - Example tasks
    - For each product category and each region, what were the total sales in the last quarter, and how do they compare with the same quarter last year?
    - As above, for each product category and each customer category
- Statistical analysis packages (e.g., S++) can be interfaced with databases
  - Statistical analysis is a large field, but not covered here
- Data mining seeks to discover knowledge automatically in the form of statistical rules and patterns from large databases.
- A data warehouse archives information gathered from multiple sources, and stores it under a unified schema, at a single site.
  - Important for large businesses that generate data from multiple divisions, possibly at multiple sites
  - Data may also be purchased externally

Chapter 20: Data Analysis

Data Warehousing
- Data sources often store only current data, not historical data
- Corporate decision making requires a unified view of all organizational data, including historical data
- A data warehouse is a repository (archive) of information gathered from multiple sources, stored under a unified schema, at a single site
  - Greatly simplifies querying, permits study of historical trends
  - Shifts decision support query load away from transaction processing systems

Warehouse Schemas
- Dimension values are usually encoded using small integers and mapped to full values via dimension tables
- Resultant schema is called a star schema
  - More complicated schema structures
    - Snowflake schema: multiple levels of dimension tables
    - Constellation: multiple fact tables

Classification Rules
- Classification rules help assign new objects to classes.
  - E.g., given a new automobile insurance applicant, should he or she be classified as low risk, medium risk or high risk?
- Classification rules for the above example could use a variety of data, such as educational level, salary, age, etc.
  - ∀ person P, P.degree = masters and P.income > 75,000 ⇒ P.credit = excellent
  - ∀ person P, P.degree = bachelors and (P.income ≥ 25,000 and P.income ≤ 75,000) ⇒ P.credit = good
- Rules are not necessarily exact: there may be some misclassifications
- Classification rules can be shown compactly as a decision tree.

Best Splits (Cont.)
- Another measure of purity is the entropy measure, which is defined as
  entropy(S) = $-\sum_{i=1}^{k} p_i \log_2 p_i$
  where $p_i$ is the fraction of instances in S that belong to class i, out of k classes
- When a set S is split into multiple sets $S_i$, i = 1, 2, …, r, we can measure the purity of the resultant set of s
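
As a rough illustration of the entropy measure named in the Best Splits slide, the sketch below computes entropy over a list of class labels. It assumes that entropy is taken over the fraction of instances in S belonging to each class and that logarithms are base 2; the function name `entropy` and the sample labels are hypothetical and do not come from the slides.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a collection of class labels, in bits.

    Assumption: p_i is the fraction of labels equal to class i,
    and the logarithm is base 2.
    """
    total = len(labels)
    if total == 0:
        return 0.0  # convention chosen for this sketch: an empty set has zero entropy
    counts = Counter(labels)
    # p * log2(1 / p) equals -p * log2(p); summed over the classes that occur
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical label sets: one pure, one evenly split between two classes
pure = ["good", "good", "good", "good"]
mixed = ["good", "good", "excellent", "excellent"]

print(entropy(pure))   # 0.0
print(entropy(mixed))  # 1.0
```

A set containing a single class has entropy 0 under this definition, while an even two-class split gives 1 bit, so lower values indicate purer sets.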