entionally using the former approach. Significant development resources are spent on quick-and-dirty integration solutions that cobble together different data management systems (databases, content management systems, enterprise application systems) and transform data from one format to another (structured, XML, byte streams). Revenue is lost when applications suffer from scalability and availability problems. New business opportunities are simply overlooked because the critical nuggets of information required to make a business decision are lost among the masses of data being generated.

Relational databases were born out of a need to store, manipulate, and manage the integrity of large volumes of data. In the 1960s, network and hierarchical systems such as CODASYL and IMS were the state-of-the-art technology for automated banking, accounting, and order processing systems enabled by the introduction of commercial mainframe computers. While these systems provided a good basis for the early systems, their basic architecture mixed the physical manipulation of data with its logical manipulation. When the physical location of data changed, such as from one area of a disk to another, applications had to be updated to reference the new location.

A revolutionary paper by Codd in 1970 and its commercial implementations changed all that. Codd's relational model introduced the notion of data independence, which separated the physical representation of data from the logical representation presented to applications. Data could be moved from one part of the disk to another or stored in a different format without causing applications to be rewritten. Application developers were freed from the tedious physical details of data manipulation, and could focus instead on the logical manipulation of data in the context of their specific application.

Not only did the relational model ease the burden of application developers, but it also caused a paradigm shift in the data management industry.
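
As a minimal sketch of data independence, the following uses Python's standard `sqlite3` module with a hypothetical `orders` table (the schema, data, and index name are assumptions made for illustration, not taken from the systems discussed here). The query text stays the same while the physical layout changes underneath it:

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10.0), ("bob", 25.5), ("alice", 4.5)],
)

# The logical request: WHAT data is wanted, with no mention of storage.
QUERY = "SELECT SUM(amount) FROM orders WHERE customer = ?"


def run(connection):
    """Execute the unchanged logical query and return its rows."""
    return connection.execute(QUERY, ("alice",)).fetchall()


def plan(connection):
    """Return the engine's access strategy (HOW) as a list of descriptions."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("alice",))
    return [row[3] for row in rows]  # the fourth column holds the plan text


before_rows, before_plan = run(conn), plan(conn)

# Change the physical design by adding an index. The query is not rewritten.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after_rows, after_plan = run(conn), plan(conn)

print(before_rows == after_rows)  # same logical answer
print(before_plan, after_plan)    # different physical access strategy
```

The application-facing query and its result are unaffected by the new index; only the access plan reported by the engine changes. The exact plan wording varies between SQLite versions.
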
The separation between what data is retrieved and how it is retrieved provided an architecture by which the new database vendors could improve and innovate their products. SQL became the standard language for describing what data should be retrieved. New storage schemes, access strategies, and indexing algorithms were developed to speed up how data was stored and retrieved from disk, and advances in concurrency control, logging, and recovery mechanisms further improved data integrity guarantees [GRAY] [LIND] [ARIES]. Cost-based optimization techniques [OPT] completed the transition from databases acting as an abstract data management layer to being high-perform