may carry important information.

parallel processing
The coordinated use of multiple processors to perform computational tasks. Parallel processing can occur on a multiprocessor computer or on a network of workstations or PCs.

predictive model
A structure and process for predicting the values of specified variables in a dataset.

prospective data analysis
Data analysis that predicts future trends, behaviors, or events based on historical data.

RAID
Redundant Array of Inexpensive Disks. A technology for the efficient parallel storage of data for high-performance computer systems.

retrospective data analysis
Data analysis that provides insights into trends, behaviors, or events that have already occurred.

rule induction
The extraction of useful if-then rules from data based on statistical significance.

SMP
Symmetric multiprocessor. A type of multiprocessor computer in which memory is shared among the processors.

terabyte
One trillion bytes.

time series analysis
The analysis of a sequence of measurements made at specified time intervals. Time is usually the dominating dimension of the data.

for example, a multidimensional sales database might include the dimensions Product, Time, and City.

exploratory data analysis
The use of graphical and descriptive statistical techniques to learn about the structure of a dataset.

genetic algorithms
Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.

linear model
An analytical model that assumes linear relationships in the coefficients of the variables being studied.

linear regression
A statistical technique used to find the best-fitting linear relationship between a target (dependent) variable and its predictors (independent variables).

logistic regression
A linear regression that predicts the proportions of a categorical target variable, such as type of customer, in a population.

multidimensional database
A database designed for online analytical processing.
Structured as a multidimensional hypercube with one axis per dimension.

multiprocessor computer
A computer that includes multiple processors connected by a network. See parallel processing.

nearest neighbor
A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1).

1 META Group, Application Development Strategies, "Data Mining for Data Warehouses: Uncovering Hidden Patterns", 7/13/95.
2 Gartner Group, Advanced Technologies and Applications Research Note, 2/1/95.
3 Gartner Group, High Performance Computing Research Note, 1/31/95.

Bradstreet can yield a prioritized list of prospects by region.

A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product. Using a small test mailing, the attributes of customers with an affinity for the product can be identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted mailing campaigns over conventional approaches.

The ideal starting point is a data warehouse containing a combination of internal data tracking all customer contact coupled with external market data about competitor activity. Background information on potential customers also provides an excellent basis for prospecting. This warehouse can be implemented in a variety of relational database systems (Sybase, Oracle, Redbrick, and so on) and should be optimized for flexible and fast data access.

An OLAP (On-Line Analytical Processing) server enables a more sophisticated end-user business model to be applied when navigating the data warehouse. The multidimensional structures allow users to analyze the data as they want to view their business, summarizing by product line, region, and other key perspectives.
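The kind of summarizing by product line and region described above can be illustrated with a small roll-up over a fact table. This is a minimal sketch only: the `SALES` records, the dimension names, and the `roll_up` helper are hypothetical and are not taken from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (product_line, region, quarter, revenue).
SALES = [
    ("Cards", "East", "Q1", 120.0),
    ("Cards", "West", "Q1", 80.0),
    ("Loans", "East", "Q1", 200.0),
    ("Cards", "East", "Q2", 150.0),
    ("Loans", "West", "Q2", 50.0),
]

# Assumed dimension names, in the same order as the columns above.
DIMENSIONS = ("product_line", "region", "quarter")


def roll_up(facts, by):
    """Sum revenue, keeping only the dimensions named in `by`."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # revenue is the last column
    return dict(totals)


by_product = roll_up(SALES, ["product_line"])
by_product_region = roll_up(SALES, ["product_line", "region"])
grand_total = roll_up(SALES, [])
```

Each call collapses the dimensions that are not listed, so the same facts can be viewed by product line alone, by product line and region together, or as a single grand total. A real OLAP server would typically precompute or index such aggregates rather than scan the rows on every request.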
The Data Mining Server must be integrated with the data warehouse and the OLAP server to embed ROI-focused business analysis directly into this infrastructure. An advanced, process-centric metadata template defines the data mining objectives for specific business issues like campaign management, prospecting, and promotion optimization. Integration with the data warehouse enables operational decisions to be directly implemented and tracked. As the warehouse grows with new decisions and results, the organization can continually mine the best practices and apply them to future decisions.

This design represents a fundamental shift from conventional decision support systems. Rather than simply delivering data to the end user through query and reporting software, the Advanced Analysis Server applies users' business models directly to the warehouse and returns a proactive analysis of the most relevant information. These results enhance the metadata in the OLAP Server by providing a dynamic metadata layer that represents a distilled view of the data. Reporting, visualization, and other analysis tools can then be applied to plan future actions and confirm the impact of those plans.

Profitable Applications

A wide range of companies have deployed successful applications of data mining. While early adopters of this technology have tended to be in information-intensive industries such as financial services and direct mail marketing, the technology is applicable to any company looking to leverage a large data warehouse to better manage their customer relationships. Two crit