Sampling for Big Data

… of summarization. Sampling has been proven to be a flexible method to accomplish this.

Data Scale: Summarization and Sampling

Traffic Measurement in the ISP Network

[Figure: ISP network schematic — access routers, backbone, business datacenters and management centers; flow records from the routers feed the traffic matrices.]

Massive Dataset: Flow Records

• IP flow: a set of packets with a common key, observed close together in time
• Flow key: IP src/dst address, TCP/UDP ports, ToS, … [64 to 104+ bits]
• Flow records:
  – Protocol-level summaries of flows, compiled and exported by routers
  – Flow key, packet and byte counts, first/last packet time, some router state
  – Realizations: Cisco NetFlow, IETF standards
• Scale: 100's of terabytes of flow records are generated daily in a large ISP
• Used to manage the network over a range of timescales:
  – Capacity planning (months), …, detecting network attacks (seconds)
• Analysis tasks:
  – Easy: timeseries of predetermined aggregates (e.g. address prefixes)
  – Hard: fast queries over exploratory selectors, history, communications subgraphs

[Figure: timeline of interleaved packets grouped into flows 1–4.]

Flows, Flow Records and Sampling

• Two types of sampling are used in practice for internet traffic:
  1. Sampling the packet stream in the router prior to forming flow records
     □ Limits the rate of lookups of the packet key in the flow cache
     □ Realized as Packet Sampled NetFlow (more later…)
  2. Downstream sampling of flow records in the collection infrastructure
     □ Limits transmission bandwidth and storage requirements
     □ Realized in ISP measurement collection infrastructure (more later…)
• The two cases illustrate a general property:
  – Different underlying distributions require different sample designs
  – Statistical optimality is sometimes limited by implementation constraints
    □ Availability of router storage and processing cycles

Abstraction: Keyed Data Streams

• Data model: objects are keyed weights
  – Objects (x, k): weight x, key k
    □ Example 1: objects = packets, x = bytes, k = key (source/destination)
    □ Example 2: objects = flows, x = packets or bytes, k = key
    □ Example 3: objects = account updates, x = credit/debit, k = account ID
• Stream of keyed weights: {(xᵢ, kᵢ): i = 1, 2, …, n}
• Generic query: subset sums
  – X(S) = Σᵢ∈S xᵢ for S ⊆ {1, 2, …, n}: the total weight of the index subset S
  – Typically S = S(K) = {i : kᵢ ∈ K}: the objects with keys in K
    □ Examples 1, 2: X(S(K)) = total bytes to a given IP destination address / UDP port
    □ Example 3: X(S(K)) = total balance change over a set of accounts
• Aim: compute a fixed-size summary of the stream that can be used to estimate arbitrary subset sums with known error bounds

Inclusion Sampling and Estimation

• Horvitz–Thompson estimation:
  – An object of size xᵢ is sampled with probability pᵢ
  – Unbiased estimate x′ᵢ = xᵢ/pᵢ if sampled, 0 if not sampled: E[x′ᵢ] = xᵢ
• Linearity:
  – The estimate of a subset sum is the sum of the matching estimates
  – The subset sum X(S) = Σᵢ∈S xᵢ is estimated by X′(S) = Σᵢ∈S x′ᵢ
• Accuracy:
  – Exponential bounds: Pr[|X′(S) − X(S)| ≥ δX(S)] ≤ exp(−g(δ)X(S))
  – Confidence intervals: X(S) ∈ [X⁻(ε), X⁺(ε)] with probability 1 − ε
• Future-proof:
  – No need to know the queries at the time of sampling
    □ "When/where did that suspicious UDP port first become so active?"
    □ "Which is the most active IP address within that anomalous subnet?"
  – Retrospective estimate: a subset sum over the relevant keyset

Independent Stream Sampling

• Bernoulli sampling:
  – IID sampling of objects with some fixed probability p
  – A sampled weight x has HT estimate x/p
• Poisson sampling:
  – Weight xᵢ is sampled with probability pᵢ; HT estimate xᵢ/pᵢ
• When to use Poisson vs. Bernoulli sampling?
  – Elephants and mice: Poisson allows the probability to depend on the weight…
• What is the best choice of probabilities for a given stream {xᵢ}? (A sketch of both schemes follows below.)
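The Horvitz–Thompson machinery above is compact enough to sketch in code. The following minimal Python illustration is not taken from the tutorial itself: the function names, the toy stream, and the weight-dependent probability min(1, x/1000) are hypothetical choices. Bernoulli sampling falls out as the constant-probability special case.

```python
import random

def poisson_sample(stream, prob):
    """Keep each (weight, key) object independently with probability
    prob(weight); store the Horvitz-Thompson adjusted weight x/p so
    that subset sums can be estimated later without bias."""
    sample = []
    for x, k in stream:
        p = prob(x)
        if p > 0 and random.random() < p:
            sample.append((x / p, k))   # HT estimate x' = x/p
    return sample

def estimate_subset_sum(sample, key_pred):
    """Estimate X(S(K)): sum the adjusted weights of sampled objects
    whose key satisfies the selector key_pred."""
    return sum(x_adj for x_adj, k in sample if key_pred(k))

# Toy stream: byte counts keyed by destination port (hypothetical data).
stream = [(1500, 80), (40, 53), (9000, 443), (60, 53), (700, 80)]

# Bernoulli sampling would pass a constant probability; a weight-dependent
# prob such as min(1, x/1000) favours "elephants" over "mice".
sample = poisson_sample(stream, lambda x: min(1.0, x / 1000.0))
print(estimate_subset_sum(sample, lambda port: port == 80))  # estimates 2200
```

Because each adjusted weight xᵢ/pᵢ is individually unbiased, any subset sum queried after the fact inherits unbiasedness by linearity — which is what makes the summary "future-proof".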
Bernoulli Sampling

• The easiest possible case of sampling: all weights are 1
  – N objects, and we want to sample k of them uniformly
  – Each possible subset of size k should be equally likely
• Uniformly sample an index from N (without replacement), k times
  – Some subtleties: truly random numbers from [1…N] on a computer?
  – Assume that random number generators are good enough
• Common trick in databases: assign a random number to each item and sort
  – Costly if N is very big, but so is random access
• Interesting problem: take a single linear scan of the data to draw the sample
  – Streaming model of computation: see each element once
  – Application: IP flow sampling, with too many flows (for us) to store
  – (For a while) a common tech interview question

Reservoir Sampling

"Reservoir sampling" described by [Knuth 69, 81]; enhancements [Vitter 85].

• A fixed-size k uniform sample from an arbitrary-size N stream, in one pass:
  – No need to know the stream size in advance
  – Include the first k items with probability 1
  – Include item n > k with probability p(n) = k/n:
    □ Pick j uniformly from {1, 2, …, n}
    □ If j ≤ k, swap item n into location j in the reservoir, discarding the replaced item
• A neat proof shows the uniformity of the sampling method. Let Sₙ be the sample set after n arrivals:
  – New item: selection probability Pr[n ∈ Sₙ] = pₙ := k/n
  – Previously sampled item, by induction: m ∈ Sₙ₋₁ with probability pₙ₋₁, hence m ∈ Sₙ with probability pₙ₋₁ · (1 − pₙ/k) = pₙ

[Figure: a reservoir of k = 7 items as item n arrives.]

Reservoir Sampling: Skip Counting

• Simple approach: check each item in turn — O(1) per element […] (implemented in the sketch below)
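The "simple approach" that the truncated slide begins to describe tests each arriving item in turn, which is exactly the one-pass algorithm above. A minimal Python sketch, assuming 0-based list indexing where the slides use 1-based positions (the function name is my own):

```python
import random

def reservoir_sample(stream, k):
    """One-pass uniform sample of k items from a stream of unknown
    length (the algorithm attributed to Knuth and Vitter above)."""
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            reservoir.append(item)    # first k items enter with probability 1
        else:
            j = random.randrange(n)   # uniform over {0, 1, ..., n-1}
            if j < k:                 # happens with probability k/n
                reservoir[j] = item   # evict a uniformly chosen resident
    return reservoir

# Example: a 7-item uniform sample from a million-element stream.
print(reservoir_sample(range(1_000_000), 7))
```

Skip counting, which the slide title refers to, speeds this up: instead of drawing a random number for every arriving item, one draws how many items to skip before the next replacement.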
點(diǎn)擊復(fù)制文檔內(nèi)容
教學(xué)課件相關(guān)推薦
文庫(kù)吧 www.dybbs8.com
備案圖鄂ICP備17016276號(hào)-1