Big data = massive data + complex types of data

Massive transaction data: an enterprise's internal business transaction information consists mainly of online transaction data and online analytical data. It is structured, static, historical data that is managed and accessed through relational databases.

This is a dream that traditional enterprises have struggled to achieve even at great expense.

Beyond BI

• Ad hoc querying and reporting
• Data mining techniques
• Structured data, typical sources
• Small to midsize datasets

• Optimizations and predictive analytics
• Complex statistical analysis
• All types of data, and many sources
• Very large datasets
• More of a real-time

The value of big data analytics

• Big data is more real-time in nature than traditional DW applications
• Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps

The challenges of big data

• The bottleneck is in technology
  • New architecture, algorithms, techniques are needed
• Also in technical skills
  • Experts in using the new technology and dealing with big data

Using users' "behavioral fingerprints" to create new business opportunities

Every click, every comment, and every video-on-demand request a user makes online is a typical source of big data. It is estimated that by 2023 the world will hold a total of 35 ZB of data.

• In 2023, the cost for enterprises to create, capture, manage, and store information had fallen to one sixth of its 2023 level, while over the same period total enterprise investment in data has instead risen by 50% since 2023.

The evolution of data processing

• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture technology)

The sources of big data

Social media and networks (all of us are generating data)
Scientific instruments (collecting all sorts of data)
Mobile devices (tracking all objects all the time)
Sensor technology and networks (measuring all kinds of data)

• The progress and innovation is no longer hindered by the ability to collect data
• But, by the ability to manage, analyze,