andards
- IBM Intelligent Miner, SAS Enterprise Miner, SGI MineSet, Clementine, MS/SQLServer 2021, DBMiner, BlueMartini, MineIt, DigiMine, etc.
- A few data mining languages and standards (esp. MS OLEDB for Data Mining)
- Application achievements in many domains
  - Market analysis, trend analysis, fraud detection, outlier analysis, Web mining, etc.

Data Mining Costs
- Desktop tools: $500 and up (MSFT coming in at a low price point)
- Server / MF based: $20,000 to $700,000+
- Must also add the cost of extensive consulting for high-end tools
- Don't forget the long training and learning curve time
- Ongoing process, not task automation software

Outline
- Data warehouse concepts
- Data warehouse architecture and components
- Data warehouse design
- Data warehouse technology (differences from database technology)
- Data warehouse performance
- Data warehouse applications
- Overview of data mining applications
- Data mining techniques and trends
- Data mining application platform (Science and Technology Commission project application)

Data Mining Trends
- Historical review
- Confluence of multiple disciplines
- Classifying data mining from multiple perspectives
- Research progress in the last decade
- Trends in data mining
- Data mining and the standardization process

Historical Review
- 1989 IJCAI Workshop on Knowledge Discovery in Databases
  - Knowledge Discovery in Databases (G. Piatetsky-Shapiro and W. Frawley, 1991)
- 1991-1994 Workshops on Knowledge Discovery in Databases
  - Advances in Knowledge Discovery and Data Mining (U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, 1996)
- 1995-1998 International Conferences on Knowledge Discovery in Databases and Data Mining (KDD'95-98)
  - Journal of Data Mining and Knowledge Discovery (1997)
- 1998 ACM SIGKDD, SIGKDD'1999-2021 conferences, and SIGKDD Explorations
- More conferences on data mining
  - PAKDD, PKDD, SIAM-Data Mining, (IEEE) ICDM, DaWaK, SPIE-DM, etc.

Data Mining: Confluence of Multiple Disciplines
[Figure: Data Mining at the center, drawing on Database Technology, Statistics, Machine Learning (AI), Visualization, Information Science, and Other Disciplines]

A Multi-Dimensional View of Data Mining
- Databases to be mined
  - Relational, transactional, object-relational, active, spatial, time-series, text, multimedia, heterogeneous, legacy, WWW, etc.
- Knowledge to be mined
  - Characterization, discrimination, association, classification, clustering, trend, deviation and outlier analysis, etc.
- Techniques utilized
  - Database-oriented, data warehouse (OLAP), machine learning, statistics, visualization, neural network, etc.
- Applications adapted
  - Retail, telecommunication, banking, fraud analysis, DNA mining, stock market analysis, Web mining, Weblog analysis, etc.

Research Progress in the Last Decade
- Multi-dimensional data analysis: data warehouse and OLAP (online analytical processing)
- Association, correlation, and causality analysis
- Classification: scalability and new approaches
- Clustering and outlier analysis
- Sequential patterns and time-series analysis
- Similarity analysis: curves, trends, images, texts, etc.
- Text mining, Web mining and Weblog analysis
- Spatial, multimedia, scientific data analysis
- Data preprocessing and database compression
- Data visualization and visual data mining
- Many others, e.g., collaborative filtering

Research Directions [Han J. W., 2021]
- Web mining
- Towards integrated data mining environments and tools
- “Vertical” (or application-specific) data mining
- Invisible data mining
- Towards intelligent, efficient, and scalable data mining methods

Towards Integrated Data Mining Environments and Tools
- OLAP Mining: Integration of Data Warehousing and Data Mining
- Querying and Mining: An Integrated Information Analysis Environment
- Basic Mining Operations and Mining Query Optimization
- “Vertical” (or application-specific) data mining
- Invisible data mining

Querying and Mining: An Integrated Information Analysis Environment
- Data mining as a component of DBMS, data warehouse, or Web information system
  - Integrated information processing environment
    - MS/SQLServer2021 (Analysis service)
    - IBM IntelligentMiner on DB2
    - SAS EnterpriseMiner: data warehousing + mining
- Query-based mining
  - Querying database/DW/Web knowledge
  - Efficiency and flexibility: preprocessing, online processing, optimization, integration, etc.

“Vertical” Data Mining
- Generic data mining tools? Too simple to match domain-specific, sophisticated