CAP theorem, NoSQL Data Architecture patterns M3,C2
CAP THEOREM
• CAP stands for Consistency, Availability, Partition tolerance.
• The CAP theorem is a result from theoretical computer science about distributed data stores. It states that, in the event of a network failure (a partition) on a distributed database, it is possible to provide either consistency or availability, but not both.
CAP Theorem
Consistency
• All the servers in the system will have the same data, so anyone using the system will get the same copy regardless of which server answers their request.
Availability
• The system will always respond to a request (even if it's not the latest data or consistent across the system, or just a message saying the system isn't working).
Partition Tolerance
• The system continues to operate as a whole even if individual servers fail or can't be reached.
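The trade-off can be sketched with a toy two-replica cluster. This is only an illustration under simplifying assumptions: the class names `Replica` and `TinyCluster`, the `mode` flag and the `partitioned` switch are all hypothetical and do not model any real database.

```python
class Replica:
    """One server holding its own copy of the data."""
    def __init__(self):
        self.data = {}


class TinyCluster:
    """Hypothetical two-replica cluster; mode is 'CP' or 'AP'."""
    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True when the replicas cannot reach each other

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the request (gives up availability).
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the other replica is now stale.
            replica.data[key] = value
            return
        # No partition: both replicas receive the update.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, replica, key):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return replica.data.get(key)


ap = TinyCluster("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap.write(ap.a, "x", 2)
# Both replicas still answer, but they now disagree on the value of "x".
stale, fresh = ap.read(ap.b, "x"), ap.read(ap.a, "x")
```

In the AP run both replicas keep responding but return different values, while a `TinyCluster("CP")` would raise an error for the same requests instead of returning inconsistent data.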
Schema-less Models
• The schema of a database system refers to the design of a structure for the datasets and of the data structures used to store them in the database.
• NoSQL data does not necessarily have a fixed table schema.
• A key-value store allows you to store any data you like under a key.
• A document database effectively does the same thing, since it places no restrictions on the structure of the documents you store.
• Column-family databases allow you to store any data under any column you like.
• Graph databases allow you to freely add new edges and freely add properties to nodes and edges as you wish.
Schema-less: Pros
• Flexibility
• Can be more tolerant of variable ACID and consistency models
• Ease of use and maintenance
Schema-less: Pros
Flexibility - Users can, in theory:
• Put any kind of data into the system
• Create new kinds of relationships between things (in a few products)
• Find data without worrying about the types of data involved
Can be more tolerant of variable ACID and consistency models
Ease of use and maintenance:
• No need to worry about data types
• No need for a DBA
• Applications will [probably] work when new data arrives
Schema-less: Cons
• Confusion
• Performance suffers
• Poor integrity
• Ambiguity
Increasing Flexibility for Data Manipulation
The additional data may not be structured or follow a fixed schema, as NoSQL is schema-less. In such a case the data store consists of additional data, such as documents, blogs, Facebook pages and tweets. NoSQL data stores possess the characteristic of increasing flexibility for data manipulation. Late binding of new attributes is allowed, and BASE is a flexible model for NoSQL data stores.
BASE Model
This model accommodates the flexibility offered by NoSQL and similar approaches to the management and curation of unstructured data. BASE consists of three principles:
Basic Availability:
• The NoSQL database approach focuses on the availability of data even in the presence of multiple failures.
• It achieves this by using a highly distributed approach to database management.
• Instead of maintaining a single large data store and focusing on the fault tolerance of that store, NoSQL databases spread data across many storage systems with a high degree of replication.
Soft State:
• BASE databases abandon the consistency requirements of the ACID model pretty much completely.
• One of the basic concepts behind BASE is that data consistency is the developer’s problem and should not be handled by the database.
Eventual Consistency:
• The only requirement that NoSQL databases have regarding consistency is that, at some point in the future, data will converge to a consistent state.
• No guarantees are made, however, about when this will occur.
• That is a complete departure from the immediate consistency requirement of ACID, which prohibits a transaction from executing until the prior transaction has completed and the database has converged to a consistent state.
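Eventual convergence can be sketched with two replicas that exchange state and keep the newest write per key. The last-write-wins rule, the `merge` function and the integer timestamps are assumptions chosen for illustration; real stores use a variety of reconciliation strategies.

```python
def merge(local, remote):
    """Last-write-wins merge of two replica states.

    Each state maps key -> (timestamp, value). For every key the pair with
    the larger timestamp is kept. Assumes timestamps for a key never tie.
    """
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# The replicas have diverged: each accepted writes the other has not seen.
replica_a = {"grade": (1, "B")}
replica_b = {"grade": (2, "A"), "course": (1, "DBMS")}

# One synchronisation round: each side merges the other's state.
new_a = merge(replica_a, replica_b)
new_b = merge(replica_b, replica_a)
converged = new_a == new_b
```

Before the exchange a read of `grade` could return either value depending on the replica contacted. After the exchange both replicas hold the same state, which is all that eventual consistency promises.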
Example: database for the students in various courses to demonstrate the concept of increasing flexibility
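A minimal sketch of such a student database, assuming a plain in-memory dictionary stands in for the data store. The student ids, names, course codes and the helper `with_attribute` are hypothetical.

```python
# Each student is a free-form document; no fixed schema is declared up front.
students = {
    "S1": {"name": "Asha", "course": "BDA"},
    "S2": {"name": "Ravi", "courses": ["BDA", "ML"], "hostel": "H3"},
}

# Late binding of a new attribute: only S1 gets a "project" field,
# and no existing record has to be altered or migrated.
students["S1"]["project"] = "NoSQL survey"


def with_attribute(store, attr):
    """Return the sorted ids of documents that carry the given attribute."""
    return sorted(sid for sid, doc in store.items() if attr in doc)


has_project = with_attribute(students, "project")
has_name = with_attribute(students, "name")
```

The two documents carry different attributes, and the query has to check whether an attribute is present at all. This is the flexibility of the schema-less model, and also the source of the ambiguity listed among its cons.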
NOSQL Data Architecture Patterns
NoSQL data stores categories
• Key-value store
• Document store
• Tabular data store: • Column-family store • Bigtable data store • RC file format • ORC file format • Parquet file format
• Object data store
• Graph data store
1. Key Value store
• Simplest way to implement a schema-less data store
• Enables fast data retrieval
• Key is a string which maps to a large data string or BLOB (Binary Large Object)
A key-value store uses a primary key to access the values. Therefore, the store can be easily scaled up for very large data. The concept is similar to a hash table, where a unique key points to a particular item (or items) of data.
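The access pattern can be sketched as a small wrapper around a Python dictionary. The class `KeyValueStore` and its `put`/`get`/`delete` methods are hypothetical names for illustration, not the API of any particular product.

```python
class KeyValueStore:
    """Toy in-memory key-value store: string keys, opaque byte values."""

    def __init__(self):
        self._items = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it is just a blob.
        self._items[key] = value

    def get(self, key: str, default=None):
        # All access goes through the primary key.
        return self._items.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        return self._items.pop(key, None) is not None


store = KeyValueStore()
store.put("image:42", b"\x89PNG...")         # image store
store.put("query:top-courses", b"BDA,ML")    # query cache
cached = store.get("query:top-courses")
missing = store.get("query:unknown")
```

Because every operation is addressed by key alone, the keys can be divided among many servers, which is one reason this model is considered easy to scale.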
Uses of key value store:
• Image store
• Document or file store
• Lookup table
• Query-cache
HASH TABLE
• Refers to a structure that stores associated key-value pairs.
• A pair is retrieved by using its hash key.
• The hash key is an index computed by applying a hash function to the key.
• The entries (values) are spread across an array of slots (also called buckets).
• Each bucket corresponds to the computed index of the keys stored in it.
• Viewed as a table, the keys form one column and the values sit in the associated rows of that column.
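The bucket mechanism can be sketched as follows. This is a minimal illustration that resolves collisions by chaining (several pairs kept in one bucket); the class `HashTable` and the default of 8 buckets are assumptions, and Python's built-in `hash` stands in for the hash function.

```python
class HashTable:
    """Toy hash table: an array of buckets, each a list of (key, value) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash key: an index computed from the key.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing pair
                return
        bucket.append((key, value))

    def get(self, key):
        # Only the one bucket selected by the hash key is searched.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = HashTable(num_buckets=2)  # few buckets, so collisions are likely
for student_id, name in [("S1", "Asha"), ("S2", "Ravi"), ("S3", "Meena")]:
    table.put(student_id, name)
table.put("S2", "Ravi K.")
looked_up = table.get("S2")
```

A lookup computes the index once and then scans a single bucket, which is why retrieval stays fast as long as the pairs are spread evenly across the buckets.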