[...] data stored in two or more information sources is extracted to build one large database that contains the information from all of those sources (the database may be virtual).

Approaches to information integration
    Federated databases
    Data warehouses
    Mediators

Problems in information integration
A car company has 1000 dealers and wants to create an integrated database, but each dealer uses its own database schema.
    Dealer 1: Cars(serialNo, model, color, autotrans, cdPlayer, …)
    Dealer 2: Autos(serial, model, color), Options(serial, option)
Problems:
    Different data types
    Different value encodings
    Different semantics
    Missing data

Federated databases
[Figure: DB1, DB2, DB3, DB4]
Problem: with n sources, n(n-1) components must be written so that every source can translate its queries for every other source.

Federated databases: example
Dealer 1 asks dealer 2 whether it has the cars that dealer 1 needs. Dealer 1 keeps its requirements in NeededCars(model, color, autoTrans).

    for (each tuple (:m, :c, :a) in NeededCars) {
        if (:a = true) {
            /* automatic transmission wanted */
            select serial
            from Autos, Options
            where Autos.serial = Options.serial
              and Options.option = 'autoTrans'
              and Autos.model = :m
              and Autos.color = :c;
        } else {
            /* automatic transmission not wanted */
            select serial
            from Autos
            where Autos.model = :m
              and Autos.color = :c
              and not exists (
                  select * from Options
                  where serial = Autos.serial
                    and option = 'autoTrans');
        }
    }

Data warehouses
[Figure: query, result]
A data warehouse is a materialized view.

Data warehouse: example
    Warehouse: AutosWhse(serialNo, model, color, autotrans, dealer)
    dealer is the dealer that holds the car.

Loading dealer 1:

    insert into AutosWhse(serialNo, model, color, autotrans, dealer)
    select serialNo, model, color, autotrans, 'dealer1'
    from Cars

Loading dealer 2:

    insert into AutosWhse(serialNo, model, color, autotrans, dealer)
    select serial, model, color, 'no', 'dealer2'
    from Autos
    where not exists (
        select * from Options
        where Autos.serial = Options.serial
          and option = 'autoTrans')

This statement loads only dealer 2's cars without an automatic transmission. Cars that have one need a companion statement that joins Autos with Options on option = 'autoTrans' and inserts 'yes'.

Mediators
A mediator is a virtual view.

Mediator: example
    Mediator: AutosMed(serialNo, model, color, autotrans, dealer)

Query to the mediator about red cars:

    select serialNo, model
    from AutosMed
    where color = 'red'

Wrapper for dealer 1:

    select serialNo, model
    from Cars
    where color = 'red'

Wrapper for dealer 2:

    select serial as serialNo, model
    from Autos
    where color =
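        'red'

Mediator
[Figure: the mediator is asked whether a blue Gobi is available; it asks dealer 1 and dealer 2 the same question and returns the answer, yes or no.]

Wrappers
A wrapper accepts queries from the mediator, translates them into the terms of its data source, and passes the results back to the mediator.
How is a wrapper designed?
    Classify the queries the mediator may issue into templates.
    A template is a query with parameters that stand for constants.
    The mediator supplies the constants, and the wrapper executes the query with those constants filled in.
    T => S denotes that the wrapper turns query template T into source query S.

Wrapper generator
Similar to YACC. It stores each translated query template, together with its source query, in a table. The wrapper then works as follows:
    Accept a query from the mediator.
    Search the table for a template that matches the query.
    If one is found, pass in the parameters from the query and instantiate the template.
    If none is found, reject the mediator's query.
    Send the source query to the data source.
    Return the source's reply to the mediator.

Wrapper templates
    Mediator: AutosMed(serialNo, model, color, autotrans, dealer)
    Dealer 1: Cars(serialNo, model, color, autotrans, cdPlayer, …)

Template 1: cars of a given color

    select * from AutosMed where color = '$c'
    =>
    select serialNo, model, color, autotrans, 'dealer1'
    from Cars
    where color = '$c'

Template 2: cars of a given color and model

    select * from AutosMed where color = '$c' and model = '$m'
    =>
    select serialNo, model, color, autotrans, 'dealer1'
    from Cars
    where color = '$c' and model = '$m'

Filters
To avoid an explosion of query templates, the wrapper is given only a few. A template may return a superset of what the query needs, and the wrapper then filters the result that the data source supplied.

Query to the mediator about red 'BMW' cars:

    select serialNo, model
    from AutosMed
    where color = 'red' and model = 'BMW'

    Execute template 1 with $c = 'red'.
    Save the result in a temporary table TempAutos (in practice the result can be pipelined instead).
    Execute the query:

    select * from TempAutos where model = 'BMW'

Problem: how to determine that a mediator query asks for a subset of what some wrapper template returns.

Filters: a second example
Find the dealers and models for which the dealer has two red cars of the same model, one with an automatic transmission and one without.

Query to the mediator:

    select A1.model, A1.dealer
    from AutosMed A1, AutosMed A2
    where A1.model = A2.model
      and A1.dealer = A2.dealer
      and A1.color = 'red' and A2.color = 'red'
      and A1.autotrans = 'no' and A2.autotrans = 'yes'

    Execute template 1 with $c = 'red'.
    Save the result in a temporary table RedAutos.
    Then execute:

    select A1.model, A1.dealer
    from RedAutos A1, RedAutos A2
    where A1.model = A2.model
      and A1.dealer = A2.dealer
      and A1.autotrans = 'no' and A2.autotrans = 'yes'

Data analysis workflow
[Figure: Spread Sheet, Table, Extracting + Visualizing]

Computation vs. visualization
The relational system computes the data cube.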
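As a rough illustration of what computing the data cube involves, the sketch below sums a measure over every subset of the dimensions and writes ALL for each dimension that has been aggregated away. The function name `data_cube`, the `ALL` placeholder string, and the sample rows are assumptions made for this example and do not come from the slides.

```python
from collections import defaultdict
from itertools import combinations

ALL = "ALL"  # placeholder for a dimension that has been aggregated away


def data_cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims` (2**len(dims) groupings)."""
    cube = defaultdict(int)
    for r in range(len(dims) + 1):
        for kept in combinations(dims, r):
            for row in rows:
                # Keep the value of each retained dimension, ALL otherwise.
                key = tuple(row[d] if d in kept else ALL for d in dims)
                cube[key] += row[measure]
    return dict(cube)


# Hypothetical sales rows, invented for illustration only.
sales = [
    {"model": "Chevy", "year": 1994, "color": "black", "units": 50},
    {"model": "Chevy", "year": 1994, "color": "white", "units": 40},
    {"model": "Chevy", "year": 1995, "color": "black", "units": 85},
    {"model": "Ford", "year": 1994, "color": "black", "units": 50},
]

cube = data_cube(sales, ["model", "year", "color"], "units")
print(cube[("Chevy", 1994, "ALL")])    # roll up over color: 90
print(cube[("Chevy", "ALL", "ALL")])   # roll up over year and color: 175
print(cube[("ALL", "ALL", "ALL")])     # grand total: 225
```

Each cell with one or more ALL entries is a roll-up of finer cells, so a display layer can move between granularities by looking up keys instead of re-aggregating the base rows.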
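The visualization system displays the data cube.

Some analysis requirements
    Users want to use histograms.
    Users want to apply aggregate functions at different granularities: roll up & [...]
[Query residue: only the string literals 'Chevy', 'Units Sold', 'ALL Models', 'ALL Years', and 'ALL Colors' are recoverable.]

CUBE
    CREATE VIEW auto_cube(units, mode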