Exercise 14: Choosing Indexes
Adapted from Ex 8.4.2. In this problem, we consider indexes
for the relation
Ships(name, class, launched)
from our running battleships exercise. Assume:
- name is the key.
- The relation Ships is stored over 50 pages.
- The relation is clustered on class so we expect that
only one disk access is needed to find the
ships of a given class.
- On average, there are 5 ships of a class, and 25 ships
launched in any given year.
- With probability p1 the operation on this relation is a
query of the form
SELECT * FROM Ships WHERE name = n.
- With probability p2 the operation on this relation is a
query of the form
SELECT * FROM Ships WHERE class = c.
- With probability p3 the operation on this relation is
a query of the form
SELECT * FROM Ships WHERE launched = l.
- With probability 1 − p1 − p2 − p3 the operation on
this relation is an insertion of a new tuple
into Ships.
You can also make assumptions about accessing indexes and
finding empty space for insertions that
were made in Example 8.14.
(a) If you can only create one index, how would you decide
what index to create ?
(b) If you can create any number of indexes, what are the
possible index combinations ?
(c) Consider the creation of indexes on name, class, and
launched. For each combination of
indexes, estimate the average cost of an operation. As a
function of p1, p2, and p3, what is the
best choice of indexes ?