Exercise 14: Choosing Indexes

Adapted from Ex 8.4.2. In this problem, we consider indexes for the relation

Ships(name, class, launched)

from our running battleships exercise. Assume:

  1. name is the key.
  2. The relation Ships is stored over 50 pages.
  3. The relation is clustered on class so we expect that only one disk access is needed to find the ships of a given class.
  4. On average, there are 5 ships of a class, and 25 ships launched in any given year.
  5. With probability p1 the operation on this relation is a query of the form SELECT * FROM Ships WHERE name = n.
  6. With probability p2 the operation on this relation is a query of the form SELECT * FROM Ships WHERE class = c.
  7. With probability p3 the operation on this relation is a query of the form SELECT * FROM Ships WHERE launched = l.
  8. With probability 1 − p1 − p2 − p3 the operation on this relation is an insertion of a new tuple into Ships.

You can also make assumptions about accessing indexes and finding empty space for insertions that were made in Example 8.14.

(a) If you can only create one index, how would you decide what index to create ?

(b) If you can create any number of indexes, what are the possible index combinations ?

(c) Consider the creation of indexes on name, class, and launched. For each combination of indexes, estimate the average cost of an operation. As a function of p1, p2, and p3, what is the best choice of indexes ?