Application programmers require "predictable" behavior from persistent storage. So how do we define predictable? Traditionally, this question has been addressed by the POSIX standard for file access. In the domain of data management, ACID has been the de facto standard for how a database is expected to persist updates and handle concurrent read-write and write-write behavior.
So why do we need alternatives? Applications have evolved from enterprise scale to internet scale. The semantics of POSIX and ACID consistency may sometimes not be worth the tradeoff in scale, performance, and availability. Essentially, there is no "one size fits all." For instance, when you are shopping on Amazon, browsing the catalog and adding items to the cart are not ACID operations, but the moment you hit checkout, the transaction is handled with ACID guarantees.
Defining the ingredients of predictable behavior -- the à la carte model: Instead of getting stuck with monolithic ACID/POSIX semantics, programmers can now "pick and choose." This is possible because data management systems are increasingly co-designed with the persistence layer; e.g., the Google File System (GFS) was developed to address very specific workload characteristics related to the persistence of URLs, indexes, etc. The following defines some of the key dimensions an application programmer would use to describe predictable behavior for storage.
- Atomicity guarantee: This is considered a prerequisite to ensure all-or-nothing (0 or 1) semantics for a data update. Most systems also support atomic test-and-set operations on a single table row or object.
- Ordering guarantee: This ensures that updates are visible in the order in which they are processed by the system, as defined by Leslie Lamport's happens-before causal relationship and logical clocks. On one extreme, there can be global ordering for all update operations across sites (e.g., Google's Spanner). On the other extreme, the ordering can be with respect to individual object updates (e.g., Amazon S3). Alvaro et al. propose an interesting taxonomy for the granularity of ordering: Object-, Dataflow-, and Language-level consistency.
- Data freshness guarantee + monotonicity guarantee: Given the inherent asynchrony in distributed systems, a write followed by a read may not always reflect the latest value of the data. This can be an artifact of two aspects: 1) update propagation to replicas and how the replicas are accessed during the read operation; 2) read-write mutual exclusion semantics enforced by the system, as defined by Lamport's registers and by memory consistency models and cache-coherence protocols such as sequential consistency and MESI.
- WW mutual exclusion guarantee: As the name suggests, this describes how concurrent writes are handled. On one extreme, the system can support Last Writer Wins (e.g., Cassandra); on the other extreme, it may enforce two-phase locking (2PL, with variants such as majority voting) or timestamp ordering.
- Transactions guarantee: This is typically in the context of multi-object operations. The key aspects are the atomicity of the updates and the isolation guarantee (linearizable, serializable, repeatable read, read committed, read uncommitted).
- Data integrity guarantee (unavailable rather than corrupt): Most scale-out systems continuously scrub data internally and compare the replicas to ensure integrity. When a mismatch cannot be repaired, such a system may make the data unavailable rather than serve corrupt data. Having this guarantee relieves the application of the role of verifying data integrity.
- Replica fidelity guarantee: For systems that allow applications to access the replicas directly, this guarantee defines what the application can expect. The strongest guarantee is byte-wise fidelity (e.g., Windows Azure). On the other extreme is at-least-once semantics with potentially different ordering within each replica (e.g., GFS, which requires the application to define record identifiers to access the updates across the replicas).
- Mutability model: While this is internal to the persistence layer implementation, the mutability model helps shape the crash consistency and transactional semantics implemented at the application layer. A few common mutability models are in-place, versioning, immutable, out-of-place (append-only, copy-on-write), and journaled.
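To make two of these dimensions concrete, here is a minimal Python sketch contrasting Last Writer Wins with an atomic test-and-set (compare-and-set) on a single object. It is an illustration under stated assumptions, not how Cassandra or any other system mentioned above is implemented: `ToyStore`, `Versioned`, `put_lww`, and `compare_and_set` are hypothetical names, the store is a single in-memory node, and timestamps are supplied by the caller.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: object
    version: int      # bumped on every accepted write
    timestamp: float  # writer-supplied; consulted only by Last Writer Wins


class ToyStore:
    """Hypothetical single-node, in-memory store (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()  # makes each operation atomic
        self._data = {}

    def read(self, key):
        with self._lock:
            return self._data.get(key)

    def put_lww(self, key, value, timestamp):
        """Last Writer Wins: the write with the highest timestamp survives."""
        with self._lock:
            current = self._data.get(key)
            if current is not None and current.timestamp >= timestamp:
                return False  # an equal-or-newer write already landed; drop this one
            version = 1 if current is None else current.version + 1
            self._data[key] = Versioned(value, version, timestamp)
            return True

    def compare_and_set(self, key, expected_version, value):
        """Atomic test-and-set: apply only if nobody wrote since we read."""
        with self._lock:
            current = self._data.get(key)
            current_version = 0 if current is None else current.version
            if current_version != expected_version:
                return False  # lost the race; caller must re-read and retry
            timestamp = 0.0 if current is None else current.timestamp
            self._data[key] = Versioned(value, current_version + 1, timestamp)
            return True


store = ToyStore()
store.put_lww("cart", ["book"], timestamp=2.0)
dropped = store.put_lww("cart", ["pen"], timestamp=1.0)  # older write is discarded

seen = store.read("cart")
ok_first = store.compare_and_set("cart", seen.version, ["book", "lamp"])
ok_second = store.compare_and_set("cart", seen.version, ["book", "mug"])  # stale version
```

With Last Writer Wins, the losing writer is never blocked and may not even notice that its update was discarded. With compare-and-set, the second writer gets an explicit failure and has to re-read before retrying. That difference is the kind of trade-off the WW mutual exclusion dimension is meant to capture.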