Composition of global data

• The data is unstructured (large numbers of mobile devices, machine-generated data).
• Over the next ten years, data will grow 44-fold (35 zettabytes by 2023).
• Most of the growth comes from unstructured data (online archives, medical imaging, online video and storage, photos, etc.).
• Information brings a new round of opportunity to every industry (retail, finance, insurance, manufacturing, healthcare, …).
• Industries have already begun adopting big data technologies for information extraction.

Composition of Kaiser's data

• 90% of Kaiser's data is unstructured (80% EHR and imaging data).
• Over the next ten years, the data will grow 25-fold (one exabyte by 2023).
• Most of the growth comes from unstructured data (medical imaging, video, text, audio, etc.).
• Information makes real-time personalized healthcare possible. This requires contextual profiles (device, environment, spatial, demographic, social, and behavioral) in addition to medical information.
• Kaiser is evaluating big data technologies …

[Chart: composition of Kaiser's data, structured vs. unstructured; 90% unstructured data]

Source: Kaiser Master

• Integrate built/bought real-time predictive analytical solutions or processing logic

Trends in data platform computing: distributed computing

[Figure: storage and compute options with the cost markers shown on the slide: SAN/NAS SMP (5$); SAN/NAS SMP with disk caching and high-speed network (10$); SAN/NAS MPP (10$); SAN/NAS in-memory (50$); share-nothing distributed storage and compute on slave DAS ($). The figure also carries the label "Discontinuous Change".]

Share-nothing distributed storage and compute:
• Fault-tolerant master-slave architecture capable of withstanding partial system failures
• Data is distributed across the processing slave nodes
• Resources containing data are not shared
• The master manages data distribution, schedules jobs across the slave nodes, and aggregates the result sets

Kaiser is looking to exploit this capability…

Big data platform: requirements analysis

Transaction Data
• Structured, relational tabular data
• Interactive query support
• Real-time analytics
• SQL

ALL Data
• Unstructured, non-tabular data
• Rich ad hoc integration
• Real-time analytics
• UQL

Processing characteristics

• Intuition (Simulation, Optimization, Stochastic Optimization)
• Information (Standard Ad Hoc Reporting, Query, Alerts, Forecasting, Access)
• Interrogation (Clustering, Statistical, Quality, Semantics)
• Integration (Alignment, Semantics, Completeness, Quality)
• Ingestion (Data Model, Metadata, Reference Data, Store)

Information drives process optimizations with strategic impact. Modeling business intuition from the data deluge. Ability to model information and transition from multiple access methods to generating, sharing, collaborating, and acting on insights anytime, anywhere, on any device. Support current BI tools focused on structured information. Build/buy packaged unstructured data p