
Cloud Computing Platforms, Architecture, and Theory

2025-06-17 18:27
  

web pages and related information
- Use URLs as row keys
- Various aspects of a web page as column names
- Store the contents of web pages in the contents: column under the timestamps when they were fetched

Rows
- Name is an arbitrary string
  - Access to data in a row is atomic
  - Row creation is implicit upon storing data
- Rows are ordered lexicographically
  - Rows close together lexicographically are usually on one or a small number of machines

Rows (cont.)
Reads of short row ranges are efficient and typically require communication with a small number of machines.
- Can exploit this property by selecting row keys so they get good locality for data access.
- Example: , , , VS , , ,

Columns
- Columns have a two-level name structure:
  - family:optional_qualifier
- Column family
  - Unit of access control
  - Has associated type information
- Qualifier gives unbounded columns
  - Additional levels of indexing, if desired

Timestamps
- Used to store different versions of data in a cell
  - New writes default to the current time, but timestamps for writes can also be set explicitly by clients
- Lookup options:
  - “Return most recent K values”
  - “Return all values in timestamp range (or all values)”
- Column families can be marked with attributes:
  - “Only retain most recent K values in a cell”
  - “Keep values until they are older than K seconds”

Implementation: Three Major Components
- Library linked into every client
- One master server
  - Responsible for:
    - Assigning tablets to tablet servers
    - Detecting addition and expiration of tablet servers
    - Balancing tablet-server load
    - Garbage collection
- Many tablet servers
  - Each tablet server handles read and write requests to its tablets
  - Splits tablets that have grown too large

Implementation (cont.)
- Client data doesn’t move through the master server. Clients communicate directly with tablet servers for reads and writes.
- Most clients never communicate with the master server, leaving it lightly loaded in practice.

Tablets
- Large tables are broken into tablets at row boundaries
  - A tablet holds a contiguous range of rows
    - Clients can often choose row keys to achieve locality
  - Aim for ~100 MB to 200 MB of data per tablet
- Each serving machine is responsible for ~100 tablets
  - Fast recovery:
    - 100 machines each pick up 1 tablet for a failed machine
  - Fine-grained load balancing:
    - Migrate tablets away from an overloaded machine
    - Master makes load-balancing decisions

Tablet Location
- Since tablets move around from server to server, given a row, how do clients find the right machine?
  - Need to find the tablet whose row range covers the target row

Tablet Assignment
- Each tablet is assigned to one tablet server at a time.
- The master server keeps track of the set of live tablet servers and the current assignments of tablets to servers. It also keeps track of unassigned tablets.
- When a tablet is unassigned, the master assigns it to a tablet server with sufficient room.

API
- Metadata operations
  - Create/delete tables, column families, change metadata
- Writes (atomic)
  - Set(): write cells in a row
  - DeleteCells(): delete cells in a row
  - DeleteRow(): delete all cells in a row
- Reads
  - Scanner: read arbitrary cells in a bigtable
    - Each row read is atomic
    - Can restrict returned rows to a particular range
    - Can ask for just data from 1 row, all rows, etc.
    - Can ask for all columns, just certain column families, or specific columns

Refinements: Compression
- Many opportunities for compression
  - Similar values in the same row/column at different timestamps
  - Similar values in different columns
  - Similar values across adjacent rows
- Two-pass custom compression scheme
  - First pass: compress long common strings across a large window
  - Second pass: look for repetitions in a small window
- Speed emphasized, but good space reduction (10-to-1)

Refinements: Bloom Filters
- A read operation has to read from disk when the desired SSTable isn’t in memory
- Reduce the number of accesses by specifying a Bloom filter.
- Allows us to ask whether an SSTable might contain data for a specified row/column pair.
- A small amount of memory for Bloom filters drastically reduces the number of disk seeks for read operations
- Their use implies that most lookups for nonexistent rows or columns do not need to touch disk

Outline
- Cloud computing overview
- Google cloud computing technologies: GFS, Bigtable, and MapReduce
- Introduction to the open-source platform Hadoop
- Cloud computing theory
- Transaction processing theory
- Datalog theory

Hadoop Outline
- Architecture of Hadoop Distributed File System
- Hadoop usage at Facebook

Hadoop, Why?
- Need to process multi-petabyte datasets
- Expensive to build reliability into each application.
- Nodes fail every day
  - Failure is expected, rather than exceptional.
  - The number of nodes in a cluster is not constant.
- Need common infrastructure
  - Efficient, reliable, open source (Apache License)

Hadoop History
- Dec 2004: Google GFS paper published
- July 2005: Nutch uses MapReduce
- Feb 2006: Becomes Lucene subproject
- Apr 2007: Yahoo! on 1000-node cluster
- Jan 2008: An Apache Top Level Project
- Jul 2008: A 4000-node test cluster
- Sept 2008: Hive becomes a Hadoop subproject

Who uses Hadoop?
- Amazon/A9
- Facebook
- Google
- IBM
- Joost
- New York Times
- PowerSet
- Veoh
- Yahoo!

Commodity Hard
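
Returning to the Bloom filter refinement in the Bigtable part of these slides: the idea of asking whether an SSTable "might contain" a row/column pair before touching disk can be sketched in a few lines of Python. This is only an illustrative sketch, not Bigtable's implementation. The class names BloomFilter and ToySSTable, the bit-array size, the number of hash functions, the SHA-256-based hashing, and the row key "com.example.www" are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True only means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class ToySSTable:
    """Hypothetical stand-in for an on-disk SSTable guarded by a Bloom filter."""

    def __init__(self, cells):
        self._cells = dict(cells)  # pretend this dictionary lives on disk
        self.disk_reads = 0        # counts simulated disk accesses
        self.bloom = BloomFilter()
        for row, column in self._cells:
            self.bloom.add(self._key(row, column))

    @staticmethod
    def _key(row, column):
        # Join row and column with a separator unlikely to appear in either.
        return f"{row}\x00{column}"

    def get(self, row, column):
        if not self.bloom.might_contain(self._key(row, column)):
            return None  # the filter rules the pair out, so no disk access
        self.disk_reads += 1  # simulated disk read
        return self._cells.get((row, column))


table = ToySSTable({("com.example.www", "contents:"): "<html>...</html>"})
print(table.get("com.example.www", "contents:"))
print(table.get("com.example.missing", "contents:"))
```

A Bloom filter never reports a stored pair as absent, so a negative answer can safely skip the disk read. A positive answer may occasionally be a false positive, in which case the lookup falls through to the SSTable and simply finds nothing.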