Lesson 2: Data Preprocessing Techniques

2024-10-11 13:44

[Preview] E.g., Age = "42", Birthday = "03/07/1997". E.g., was rating "1, 2, 3", now rating "A, B, C".

  

[Main text]

... improve classification accuracy

Entropy after partitioning S into S_1 and S_2 at boundary T:

$$E(S, T) = \frac{|S_1|}{|S|}\,\mathrm{Ent}(S_1) + \frac{|S_2|}{|S|}\,\mathrm{Ent}(S_2)$$

Criterion on the entropy reduction:

$$\mathrm{Ent}(S) - E(T, S) > \delta$$

Segmentation by Natural Partitioning
- A simple 3-4-5 rule can be used to segment numeric data into relatively uniform, "natural" intervals.
- If an interval covers 3, 6, 7, or 9 distinct values at the most significant digit, partition the range into 3 equi-width intervals.
- If it covers 2, 4, or 8 distinct values at the most significant digit, partition the range into 4 intervals.
- If it covers 1, 5, or 10 distinct values at the most significant digit, partition the range into 5 intervals.

Example of the 3-4-5 Rule
Step 1 (profit): Min = -$351, Low (i.e., 5th percentile) = -$159, High (i.e., 95th percentile) = $1,838, Max = $4,700
Step 2: msd = 1,000, Low = -$1,000, High = $2,000
Step 3: (-$1,000 to $2,000) is split into (-$1,000 to 0), (0 to $1,000), ($1,000 to $2,000)
Step 4: (-$400 to $5,000) is split into:
- (-$400 to 0): (-$400 to -$300), (-$300 to -$200), (-$200 to -$100), (-$100 to 0)
- (0 to $1,000): (0 to $200), ($200 to $400), ($400 to $600), ($600 to $800), ($800 to $1,000)
- ($1,000 to $2,000): ($1,000 to $1,200), ($1,200 to $1,400), ($1,400 to $1,600), ($1,600 to $1,800), ($1,800 to $2,000)
- ($2,000 to $5,000): ($2,000 to $3,000), ($3,000 to $4,000), ($4,000 to $5,000)

Concept Hierarchy Generation for Categorical Data
- Specification of a partial ordering of attributes explicitly at the schema level by users or experts
  - street < city < state < country
- Specification of a portion of a hierarchy by explicit data grouping
  - {Urbana, Champaign, Chicago} < Illinois
- Specification of a set of attributes
  - The system automatically generates the partial ordering by analyzing the number of distinct values
  - E.g., street < city < state < country
- Specification of only a partial set of attributes
  - E.g., only street < city, not others

Automatic Concept Hierarchy Generation
- Some concept hierarchies can be generated automatically based on an analysis of the number of distinct values per attribute in the given data set.
- The attribute with the most distinct values is placed at the lowest level of the hierarchy.
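The entropy-based split can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own implementation: it assumes E(S, T) is the size-weighted entropy of the two partitions S1 (values at or below T) and S2 (values above T), takes candidate boundaries at midpoints between adjacent distinct values, and uses a made-up toy sample. The function names `entropy`, `split_entropy`, and `best_boundary` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy Ent(S), in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_entropy(values, labels, t):
    """E(S, T): size-weighted entropy after splitting S at boundary t."""
    s1 = [y for v, y in zip(values, labels) if v <= t]
    s2 = [y for v, y in zip(values, labels) if v > t]
    n = len(labels)
    return len(s1) / n * entropy(s1) + len(s2) / n * entropy(s2)


def best_boundary(values, labels):
    """Return (t, E(S, t)) for the candidate boundary with minimal E(S, t)."""
    distinct = sorted(set(values))
    # Assumption: candidates are midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None
    scored = ((t, split_entropy(values, labels, t)) for t in candidates)
    return min(scored, key=lambda pair: pair[1])


# Toy data (illustrative only): an attribute and a binary class label.
ages = [23, 25, 31, 35, 42, 47, 51, 60]
classes = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]

boundary, e_split = best_boundary(ages, classes)
gain = entropy(classes) - e_split  # the quantity compared against delta
print(boundary, e_split, gain)
```

Applying `best_boundary` recursively to each resulting partition, and comparing `gain` against a chosen threshold delta at each step, would give a full discretization; the threshold value itself is a tuning choice that the text does not fix.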
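One level of the 3-4-5 rule can be sketched as follows. The sketch makes several simplifying assumptions that are not stated in the text: the most significant digit unit is taken from the larger endpoint magnitude, the endpoints are rounded outward to that unit, and every case (including 7 distinct values) is split into equal-width intervals as the rule is worded here. The names `msd_unit` and `partition_345` are hypothetical, and the inputs Low = -159 and High = 1838 are taken from the profit example.

```python
import math


def msd_unit(low, high):
    """Power of ten at the most significant digit of the larger endpoint."""
    largest = max(abs(low), abs(high))
    if largest == 0:
        raise ValueError("range must not be [0, 0]")
    return 10 ** math.floor(math.log10(largest))


def partition_345(low, high):
    """Split [low, high] into 3, 4, or 5 equal-width intervals (one level)."""
    unit = msd_unit(low, high)
    lo = math.floor(low / unit) * unit   # round low down to the msd unit
    hi = math.ceil(high / unit) * unit   # round high up to the msd unit
    distinct = round((hi - lo) / unit)   # distinct msd values covered
    if distinct in (3, 6, 7, 9):
        k = 3
    elif distinct in (2, 4, 8):
        k = 4
    elif distinct in (1, 5, 10):
        k = 5
    else:
        raise ValueError(f"rule does not cover {distinct} distinct values")
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]


# Top-level split for Low = -159, High = 1838 (msd unit 1,000).
top = partition_345(-159, 1838)
print(top)

# Each interval can be split again the same way.
for a, b in top:
    print((a, b), "->", partition_345(a, b))
```

The sketch does not include the adjustment for the actual Min and Max values, which in the example changes the first interval to start at -$400 and adds ($2,000 to $5,000); that step would need separate handling.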
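The distinct-value heuristic for automatic hierarchy generation amounts to a sort, as this minimal sketch shows. The helper names `distinct_counts` and `order_hierarchy` are hypothetical, and the counts are the ones quoted for the country, province_or_state, city, and street attributes.

```python
def distinct_counts(rows, attributes):
    """Count distinct values per attribute in a list of dict records."""
    return {a: len({row[a] for row in rows}) for a in attributes}


def order_hierarchy(counts):
    """Order attributes from the top level (fewest distinct values)
    down to the lowest level (most distinct values)."""
    return sorted(counts, key=counts.get)


# Distinct-value counts for the location attributes.
counts = {
    "street": 674339,
    "country": 15,
    "city": 3567,
    "province_or_state": 65,
}

hierarchy = order_hierarchy(counts)
print(" > ".join(hierarchy))
```

With real data, `distinct_counts(rows, attributes)` would supply the `counts` dictionary; a user or expert should still review the result, since the ordering is only a heuristic.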
Note: exceptions to the distinct-value heuristic include weekday, month, quarter, year.

Example hierarchy, from top to bottom:
- country: 15 distinct values
- province_or_state: 65 distinct values
- city: 3,567 distinct values
- street: 674,339 distinct values

VI. Summary
- Data preparation is a big issue for both warehousing and mining.
- Data preparation includes:
  - Data cleaning and data integration
  - Data reduction and feature selection
  - Discretization
- Many methods have been developed, but this is still an active area of research.

References
- E. Rahm and H. H. Do. Data Cleaning: Problems and Current Approaches. IEEE Bulletin of the Technical Committee on Data Engineering.
- D. P. Ballou and G. K. Tayi. Enhancing data quality in data warehouse environments. Communications of ACM, 42:73-78, 1999.
- H. V. Jagadish et al. Special Issue on Data Reduction Techniques. Bulletin of the Technical Committee on Data Engineering, 20(4), December 1997.
- A. Maydanchik. Challenges of Efficient Data Cleansing (DM Review Data Quality resource portal).
- D. Pyle. Data Preparation for Data Mining. Morgan Kaufmann, 1999.
- D. Quass. A Framework for Research in Data Cleaning. (Draft, 1999)
- V. Raman and J. Hellerstein. Potter's Wheel: An Interactive Framework for Data Cleaning and Transformation. VLDB 2001.
- T. Redman. Data Quality: Management and Technology. Bantam Books, New York, 1992.
- Y. Wand and R. Wang. Anchoring data quality dimensions in ontological foundations. Communications of ACM, 39:86-95, 1996.
- R. Wang, V. Storey, and C. Firth. A framework for analysis of data quality research. IEEE Trans. Knowledge and Data Engineering, 7:623-640, 1995.