Data base management systems
1. Define following terms.
A.Normalization:
To eliminate redundancy of data i.e. having same information stored at multiple places, which eventually be difficult to maintain and will also increase the size of our database.
With normalization we will have tables with fewer columns which will make data retrieval and insert, update and delete operations more efficient.
(Or)
Let Rbe a relation scheme with a setFof functional dependencies.Decide whether a relation scheme Ris in “good”form.In the case that a relation scheme Ris not in “good”form, decompose it into a set of relation scheme {R1, R2, ..., Rn} such that each relation scheme is in good form the decomposition is a lossless-join decomposition
Preferably, the decomposition should be dependency preserving.
B.Transitive dependency:
Sol: A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.
C. Partial Dependency:
Partial Functional Dependency - A non-key column is dependent on some, but not all the columns in a composite primary key.
In our above example Supplier Address was partially dependent on our composite key columns (Gadgets+Supplier).
D.Multivalued dependency
Sol: A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.
E.Non-prime attribute
Sol: A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a non-prime attribute in the "Employees' Skills" table.
2. Explain various types of Normal forms?
Sol: There are six normal forms.They are
First Normal Form
Second Normal Form
Third Normal Form
Boyce-Codd Normal Form
Fourth Normal Form
Fifth Normal Form
1 NF – No multivalued attributes or repeating groups.
Column values should be atomic, scalar or should be holding single value
No repetition of information or values in multiple columns.
2 NF – 1 NF plus no partial dependency
For second normal form our database should already be in first normal form and every non-key column must depend on entire primary key.
3 NF – 2 NF plus no transitive dependencies
It should already be in Second Normal Form.
There should be no transitive dependency, i.e. we shouldn’t have any non-key column depending on any other non-key column.
BCNF-Boyce Codd normal form-Every determinant should be a candidate key.
4NF- A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a super key—that is, X is either a candidate key or a superset thereof.
3. Explain the 4NF? Why it is useful? Explain with example?
Sol: 4NF is concerned with a more general type of dependency known as a multivalued dependency. A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X →→ Y, X is a super key—that is, X is either a candidate key or a superset thereof.
Multivalued dependencies
If the column headings in a relational database table are divided into three disjoint groupings X, Y, and Z, then, in the context of a particular row, we can refer to the data beneath each group of headings as x, y, and z respectively. A multivalued dependency X →→ Y signifies that if we choose any x actually occurring in the table (call this choice xc), and compile a list of all the xcyz combinations that occur in the table, we will find that xc is associated with the same y entries regardless of z.
A trivial multivalued dependency X →→ Y is one in which Y consists of all columns not belonging to X. That is, a subset of attributes in a table has a trivial multivalued dependency on the remaining subset of attributes.
A functional dependency is a special case of multivalued dependency. In a functional dependency X → Y, every x determines exactly one y, never more than one.
Example
Consider the following example:
Pizza Delivery Permutations
Restaurant Pizza Variety Delivery Area
A1 Pizza Thick Crust Springfield
A1 Pizza Thick Crust Shelbyville
A1 Pizza Thick Crust Capital City
A1 Pizza Stuffed Crust Springfield
A1 Pizza Stuffed Crust Shelbyville
A1 Pizza Stuffed Crust Capital City
Elite Pizza Thin Crust Capital City
Elite Pizza Stuffed Crust Capital City
Vincenzo's Pizza Thick Crust Springfield
Vincenzo's Pizza Thick Crust Shelbyville
Vincenzo's Pizza Thin Crust Springfield
Vincenzo's Pizza Thin Crust Shelbyville
Each row indicates that a given restaurant can deliver a given variety of pizza to a given area.
The table has no non-key attributes because its only key is {Restaurant, Pizza Variety, Delivery Area}. Therefore it meets all normal forms up to BCNF. It does not, however, meet 4NF. The problem is that the table features two non-trivial multivalued dependencies on the {Restaurant} attribute (which is not a superkey). The dependencies are:
{Restaurant} →→ {Pizza Variety}
{Restaurant} →→ {Delivery Area}
These non-trivial multivalued dependencies on a non-superkey reflect the fact that the varieties of pizza a restaurant offers are independent from the areas to which the restaurant delivers. This state of affairs leads to redundancy in the table: for example, we are told three times that A1 Pizza offers Stuffed Crust, and if A1 Pizza start producing Cheese Crust pizzas then we will need to add multiple rows, one for each of A1 Pizza's delivery areas. There is, moreover, nothing to prevent us from doing this incorrectly: we might add Cheese Crust rows for all but one of A1 Pizza's delivery areas, thereby failing to respect the multivalued dependency {Restaurant} →→ {Pizza Variety}.
To eliminate the possibility of these anomalies, we must place the facts about varieties offered into a different table from the facts about delivery areas, yielding two tables that are both in 4NF:
Varieties By Restaurant
Restaurant Pizza Variety
A1 Pizza Thick Crust
A1 Pizza Stuffed Crust
Elite Pizza Thin Crust
Elite Pizza Stuffed Crust
Vincenzo's Pizza Thick Crust
Vincenzo's Pizza Thin Crust
Delivery Areas By Restaurant
Restaurant Delivery Area
A1 Pizza Springfield
A1 Pizza Shelbyville
A1 Pizza Capital City
Elite Pizza Capital City
Vincenzo's Pizza Springfield
Vincenzo's Pizza Shelbyville
In contrast, if the pizza varieties offered by a restaurant sometimes did legitimately vary from one delivery area to another, the original three-column table would satisfy 4NF.
Ronald Fagin demonstrated that it is always possible to achieve 4NF
4. Explain the 3NF? Give one example?
Sol: Definition states that a table is in 3NF if and only if both of the following conditions hold: The relation R (table) is in second normal form (2NF)
Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every key of R.
A non-prime attribute of R is an attribute that does not belong to any candidate key of R. A transitive dependency is a functional dependency in which X → Z (X determines Z) indirectly, by virtue of X → Y and Y → Z (where it is not the case that Y → X).
This definition states that a table is in 3NF if and only if, for each of its functional dependencies X → A, at least one of the following conditions holds:
X contains A (that is, X → A is trivial functional dependency), or
X is a superkey, or
An example of a 2NF table that fails to meet the requirements of 3NF is:
Tournament Winners
Tournament Year Winner Date of Birth
Indiana Invitational 1998 Al Fredrickson 21 July 1975
Cleveland Open 1999 Bob Albertson 28 September 1968
Des Moines Masters 1999 Al Fredrickson 21 July 1975
Indiana Invitational 1999 Chip Masterson 14 March 1977
Because each row in the table needs to tell us who won a particular Tournament in a particular Year, the composite key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify a row. That is, {Tournament, Year} is a candidate key for the table.
The breach of 3NF occurs because the non-prime attribute Winner Date of Birth is transitively dependent on the candidate key {Tournament, Year} via the non-prime attribute Winner. The fact that Winner Date of Birth is functionally dependent on Winner makes the table vulnerable to logical inconsistencies, as there is nothing to stop the same person from being shown with different dates of birth on different records.
In order to express the same facts without violating 3NF, it is necessary to split the table into two:
Tournament Winners
Tournament Year Winner
Indiana Invitational 1998 Al Fredrickson
Cleveland Open 1999 Bob Albertson
Des Moines Masters 1999 Al Fredrickson
Indiana Invitational 1999 Chip Masterson
Player Dates of Birth
Player Date of Birth
Chip Masterson 14 March 1977
Al Fredrickson 21 July 1975
Bob Albertson 28 September 1968
Update anomalies cannot occur in these tables, which are both in 3NF.
5. Explain Atomicity, Consistency, Isolation and Durability?
Sol: Atomicity: Either all operations of the transaction are properly reflected in the database or none are.
Consistency: Execution of a transaction in isolation preserves the consistency of the database.
Isolation: Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions. That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj, finished execution before Ti started, or Tj started execution after Ti finished.
Durability: After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
6. Define Transaction? Explain transaction states with neat diagram?
Sol: A transaction is a unit of program execution that accesses and possibly updates various data items. To preserve the integrity of data the database system must ensure:
Transaction State
Active –the initial state; the transaction stays in this state while it is executing
Partially committed –after the final statement has been executed.
Failed --after the discovery that normal execution can no longer proceed.
Aborted –after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted:
restart the transaction; can be done only if no internal logical error
kill the transaction
Committed –after successful completion
7. Discuss about Conflict Serializability?
Sol: Conflict Serializability:Two schedules are conflict equivalent if:
_ Involve the same actions of the same transactions
_ Every pair of conflicting actions is ordered the same way
If a schedule Scan be transformed into a schedule S´by a series of swaps of non-conflicting instructions, we say that Sand S´are conflict equivalent.
We say that a schedule S is conflict serializableif it is conflict equivalent to a serial schedule
Conflicting Instructions
Instructions li and lj of transactions Ti and Tj respectively, conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.
1. li = read(Q), lj = read(Q). li and lj don’t conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict
4. li = write(Q), lj = write(Q). They conflict
Intuitively, a conflict between li and lj forces a (logical) temporal order between them. If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they had been interchanged in the schedule.
8.Discuss about View serializability?
Sol:View Serializability :
Let Sand S´be two schedules with the same set of transactions. Sand S´are view equivalentif the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S´, also read the initial value of Q.
2. For each data item Q if transaction Ti executes read(Q) in schedule S, and that value was produced by transaction Tj (if any), then transaction Ti must in schedule S´ also read the value of Q that was produced by transaction Tj .
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S´.
As can be seen, view equivalence is also based purely on reads and writes alone.
A schedule S is view serializable it is view equivalent to a serial schedule.
Every conflict serializable schedule is also view serializable.
9. Explain following terms
A. Dense Index:
Dense index—Index record appears for every search-key value in the file.
B.Sparse Index:
Sparse Index: contains index records for only some search-key values.
Applicable when records are sequentially ordered on search-key
To locate a record with search-key value Kwe:
Find index record with largest search-key value < K
Search file sequentially starting at the record to which the index record points
C.Primary Index
Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
The search key of a primary index is usually but not necessarily the primary key.
D. Clustered and non clustered:
Clustering index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file. Also called Primary index
Non-clustering index: an index whose search key specifies an order different from the sequential order of the file. Also called Secondary index.
10. How does the strict Two-phase locking protocol ensures serializability?
Sol: According to the two-phase locking protocol, a transaction handles its locks in two distinct, consecutive phases during the transaction's execution:
Phase 1 - Expanding phase: locks are acquired and no locks are released.
Phase 2 - Shrinking phase: locks are released and no locks are acquired.
Strict two-phase locking
The strict two-phase locking (S2PL) class of schedules is the intersection of the 2PL class with the class of schedules possessing the strictness property.
To comply with the S2PL protocol a transaction needs to comply with 2PL, and release its write (exclusive) locks only after it has ended, i.e., being either committed or aborted. On the other hand, read (shared) locks are released regularly during phase 2.
Typically end of phase 1 is safely determined when a transaction has entered its ready state in all its processes (processing has ended, and it is ready to be committed; no additional locking is possible). If several processes (two or more) are involved, then a synchronization point (similar to atomic commitment) among them is needed to determine end of phase 1 for all of them. This is usually too costly, and end of phase 1 is usually postponed to be merged with transaction end (atomic commitment protocol for a multiprocess transaction), which turns S2PL to SS2PL (see below).
S2PL is a special case of 2PL, i.e., the S2PL class is a proper subclass of 2PL.
Strong strict two-phase locking
To comply with the strong strict two-phase locking (SS2PL) protocol a transaction needs to comply with 2PL, and release both its write (exclusive) and read (shared) locks only after it has ended, i.e., being either committed or aborted. A transaction obeying SS2PL can be viewed as having phase 1 that lasts the transaction's entire execution duration, and no phase 2 (or a degenerate phase 2). Thus, only one phase is actually left, and "two-phase" in the name seems to be still utilized due to the historical development of the concept from 2PL. The SS2PL property of a schedule is also called rigorousness, and an SS2PL schedule is also called a rigorous schedule.
SS2PL is a special case of S2PL, i.e., the SS2PL class of schedules is a proper subclass of S2PL (every SS2PL schedule is also an S2PL schedule, but S2PL schedules exist that are not SS2PL).
11.Expalin BCNF? Give one example?
Sol: Boyce-Codd normal form (or BCNF) is a normal form used in database normalization. It is a slightly stronger version of the third normal form (3NF). A table is in Boyce-Codd normal form if and only if, for every one of its non-trivial functional dependencies X → Y, X is a super key—that is, X is either a candidate key or a superset thereof.
Achievability of BCNF
In some cases, a non-BCNF table cannot be decomposed into tables that satisfy BCNF and preserve the dependencies that held in the original table. Beeri and Bernstein showed in 1979 that, for example, a set of functional dependencies {AB → C, C → B} cannot be represented by a BCNF schema.[5] Thus, unlike the first three normal forms, BCNF is not always achievable.
Consider the following non-BCNF table whose functional dependencies follow the {AB → C, C → B} pattern:
Nearest Shops
Person Shop Type Nearest Shop
Davidson Optician Eagle Eye
Davidson Hairdresser Snippets
Wright Bookshop Merlin Books
Fuller Bakery Doughy's
Fuller Hairdresser Sweeney Todd's
Fuller Optician Eagle Eye
For each Person / Shop Type combination, the table tells us which shop of this type is geographically nearest to the person's home.
The candidate keys of the table are:
{Person, Shop Type}
{Person, Nearest Shop}
Because all three attributes are prime attributes (i.e. belong to candidate keys), the table is in 3NF. The table is not in BCNF, however, as the Shop Type attribute is functionally dependent on a non-superkey: Nearest Shop.
The violation of BCNF means that the table is subject to anomalies. For example, Eagle Eye might have its Shop Type changed to "Optometrist" on its "Fuller" record while retaining the Shop Type "Optician" on its "Davidson" record. This would imply contradictory answers to the question: "What is Eagle Eye's Shop Type?" Holding each shop's Shop Type only once would seem preferable, as doing so would prevent such anomalies from occurring:
Shop Near Person
Person Shop
Davidson Eagle Eye
Davidson Snippets
Wright Merlin Books
Fuller Doughy's
Fuller Sweeney Todd's
Fuller Eagle Eye
Shop
Shop Shop Type
Eagle Eye Optician
Snippets Hairdresser
Merlin Books Bookshop
Doughy's Bakery
Sweeney Todd's Hairdresser
In this revised design , the "Shop Near Person" table has a candidate key of {Person, Shop}, and the "Shop" table has a candidate key of {Shop}. Unfortunately, although this design adheres to BCNF, it is unacceptable on different grounds: it allows us to record multiple shops of the same type against the same person. In other words, its candidate keys do not guarantee that the functional dependency {Person, Shop Type} → {Shop} will be respected.
A design that eliminates all of these anomalies (but does not conform to BCNF) is possible.[6] This design consists of the original "Nearest Shops" table supplemented by the "Shop" table described above.
Nearest Shops
Person Shop Type Nearest Shop
Davidson Optician Eagle Eye
Davidson Hairdresser Snippets
Wright Bookshop Merlin Books
Fuller Bakery Doughy's
Fuller Hairdresser Sweeney Todd's
Fuller Optician Eagle Eye
Shop
Shop Shop Type
Eagle Eye Optician
Snippets Hairdresser
Merlin Books Bookshop
Doughy's Bakery
Sweeney Todd's Hairdresser
If a referential integrity constraint is defined to the effect that {Shop Type, Nearest Shop} from the first table must refer to a {Shop Type, Shop} from the second table, then the data anomalies described previously are prevented.
12.Define following terms.
i)Dead lock?
System is deadlocked if there is a set of transactions such that every transaction in the set is waiting for another transaction in the set.
ii) Shared Lock?
Shared(S) mode. Data item can only be read. S-lock is requested using lock-S instruction
iii) Shrinking Phase?
Shrinking Phase:
Transaction may release locks and transaction may not obtain locks.
iv) Growing Phase?
Growing Phase:
Transaction may obtain locks and Transaction may not release locks
No comments:
Post a Comment