• Facebook stores 150TB of data on 150 nodes

web
• used at Twitter, Rackspace, Mahalo, Reddit, Cloudkick, Cisco, Digg, SimpleGeo, Ooyala, OpenX, others

cap theorem
• consistency: all clients have the same view of the data
• availability: writeable in the face of node failure
• partition tolerance: processing can continue in the face of network failure (crashed router, broken network)

daniel abadi: pacelc

write consistency
Level                         Description
ZERO                          Good luck with that
ANY                           1 replica (hints count)
ONE                           1 replica; read repair in background
QUORUM (DCQ for RackAware)    (N / 2) + 1
ALL                           N = replication factor

read consistency
Level                         Description
ZERO                          Ummm…
ANY                           Try ONE instead
ONE                           1 replica
QUORUM (DCQ for RackAware)    Return most recent timestamp after (N / 2) + 1 replicas report
ALL                           N = replication factor

agenda
• context
• features
• data model
• api

cassandra properties
• tuneably consistent
• very fast writes
• highly available
• fault tolerant
• linear, elastic scalability
• decentralized/symmetric
• ~12 client languages (Thrift RPC API)
• ~automatic provisioning of new nodes
• O(1) DHT
• big data

write op

Staged Event-Driven Architecture
• A general-purpose framework for high concurrency and load conditioning
• Decomposes applications into stages separated by queues
• Adopts a structured approach to event-driven concurrency

instrumentation

data replication
• configurable replication factor
• replica placement strategy
  rack unaware: Simple Strategy
  rack aware: Old Network Topology Strategy
  data center shard: Network Topology Strategy

partitioner smackdown

Random Partitioner
• system will use MD5(key) to distribute data across nodes
• even distribution of keys from one CF across ranges/nodes

Order-Preserving Partitioner
• key distribution determined by token
• lexicographical ordering
• required for range queries (scan over rows like a cursor in an index)
• can specify the token for this node to use
• 'scrabble' distribution

agenda
• context
• features
• data model
• api

structure
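
The QUORUM rule from the consistency tables and the MD5-based placement from the partitioner slide can be illustrated with a short Python sketch. This is a simplified toy model, not Cassandra's actual implementation: the `Ring` class, the node names, and the evenly spaced tokens are all hypothetical, and replica selection here simply walks clockwise around the ring, roughly in the spirit of the rack-unaware Simple Strategy.

```python
import bisect
import hashlib
from typing import Dict, List


def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: (N / 2) + 1, using integer division."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def md5_token(key: str) -> int:
    """Map a row key to a 128-bit integer token derived from MD5(key)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    """Toy token ring (hypothetical): each node owns the range ending at its token."""

    def __init__(self, node_tokens: Dict[str, int]) -> None:
        # Keep (token, node) pairs sorted so lookups can use binary search.
        self._entries = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._entries]

    def replicas(self, key: str, replication_factor: int) -> List[str]:
        """Return the first node at or after the key's token, then the next nodes clockwise."""
        size = len(self._entries)
        # Keys hashing past the largest token wrap around to the first node.
        start = bisect.bisect_left(self._tokens, md5_token(key)) % size
        count = min(replication_factor, size)
        return [self._entries[(start + i) % size][1] for i in range(count)]


if __name__ == "__main__":
    # Hypothetical four-node cluster with evenly spaced tokens.
    ring = Ring({
        "node-a": 2**126,
        "node-b": 2**127,
        "node-c": 3 * 2**126,
        "node-d": 2**128 - 1,
    })
    n = 3  # replication factor
    print("replicas for 'user:42':", ring.replicas("user:42", n))
    print("QUORUM needs", quorum(n), "of", n, "replicas")
```

Because MD5 output is close to uniform, keys spread evenly across the token ranges, which matches the "even distribution" bullet; the cost is that neighbouring keys land on unrelated nodes, which is why range scans call for the order-preserving partitioner instead.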