Home‎ > ‎CIS 54‎ > ‎Lab 5 - Normalization‎ > ‎

Lab 5 Answers

Queen Anne Curiosity Shop Project Questions

A. State assumptions about FDs.

Invoices 
  1. Price -> (Tax, Total)
  2. Phone -> (FirstName, LastName), also the reverse 
Purchase Data 
  1. Vendor -> Phone, also the reverse 

B. Comment on the designs. 

  1. The key does not functionally determine any other attribute. 
  2. The key does not functionally determine the InvoiceDate, InvoiceItem, Price, Tax and Total attributes. 
  3. The key does not functionally determine the InvoiceDate, InvoiceItem, Price, Tax and Total attributes. 
  4. The key does not functionally determine the InvoiceItem, Price, Tax and Total attributes. Look at the rows for Katherine Goodyear.  
  5. The key does not functionally determine the InvoiceDate, Price, Tax and Total attributes (assuming that the same person can buy "Chair" on more than one day).
  6. CUSTOMER is okay (if you don't have two customers with the same name) but in SALE the key does not functionally determine any of the other attributes. 
  7. In CUSTOMER the key does not functionally determine InvoiceDate, in SALE the same as #6
  8. CUSTOMER is okay but in SALE the key does not functionally determine LastName, FirstName, Price, Tax or Total because the names of items are too vague (look for "Antique Chair" on 15-Jan-15). 

C. Modify the best design from B with surrogate keys.

I will improve #8 

CUSTOMER ( CustomerID, LastName, FirstName, Phone, Email )
SALE ( SaleID, InvoiceDate, InvoiceItem, CustomerID, Price, Tax, Total )

This resolves the issue of having two customers with the same name. It also identifies an item so you can sell the same person two "Antique Chairs" on the same day!

D. Improve C by breaking SALE into SALE and SALE_ITEM

CUSTOMER ( CustomeID, LastName, FirstName, Phone, Email )
SALE ( SaleID, InvoiceDate, CustomerID, ItemID, Price, Tax, Total )
SALE_ITEM ( ItemID, InvoiceItem )

The above separates the customer and item entities into their own tables. The SALE_ITEM table links the two adding a date and price. Notice that the price is fixed at the point of sale. That allows salespeople to make deals and bargain with buyers. However, a sale can only have one item. So a customer buying multiple items would have multiple sales. 

E. Comment on..

  1. The key doesn't determine anything, look at "Antique Desk" 
  2. PurchaseItem and PurchasePrice together are unique in this data, but logically if this is an FD that means you can't buy two "chairs" for the same price, which seems wrong. 
  3. The key doesn't functionally determine anything, look at "Antique Desk"
  4. The key doesn't functionally determine PurchasePrice or PurchaseDate
  5. PURCHASE. The key doesn't functionally determine anything. See "Antique Desk" on 7-Nov-14. VENDOR is okay. 
  6. Same as #5, adding Vendor doesn't help 
  7. Same as #6, Vendor as a foreign key does nothing.

F. Modify the best design from E with surrogate keys. 

PURCHASE ( ItemID, PurchaseItem, PurchaseDate, VendorID, PurchasePrice )
VENDOR ( VendorID, Vendor, Phone )

This design resolves the problems around purchasing vague things like "Chair" from different vendors on the same date or having the price of "Chair" from a single vendor change on different dates. 

G. Put all the tables together. 

I will start by simply putting together all of the tables from the previous questions: 

From D:
    CUSTOMER ( CustomeID, LastName, FirstName, Phone, Email )
    SALE ( SaleID, InvoiceDate, CustomerIDItemID, Price, Tax, Total )    
    SALE_ITEM ( ItemID, InvoiceItem )

From F:
    PURCHASE ( ItemID, PurchaseItem, PurchaseDate, VendorID, PurchasePrice )
    VENDOR ( VendorID, Vendor, Phone )

The problem with this schema is that there are two notions of an inventory item, which is a PURCHASE and a SALE_ITEM. So, my first change will be to consolidate those into a single table called SALE_ITEM. 

    SALE_ITEM ( ItemID, ItemName, PurchaseDate, VendorID, PurchasePrice ) 

Now, I'll adjust the rest of the tables to use the SALE_ITEM table. 

    VENDOR ( VendorID, Name, Phone )
    CUSTOMER ( CustomerID, Name, Phone, Email ) 
    SALE ( SaleID, CustomerID, ItemID, Date, Price, Tax, Total ) 

This schema is normalized in 4NF because:
  1. ItemID -> ( Name, Date, Vendor, PurchasePrice ) 
  2. VendorID -> ( Name, Phone ) 
  3. CustomerID -> ( Name, Phone, Email ) 
  4. SaleID -> ( Customer, Item, Date, SalePrice, Tax, Total ) 
In other words the keys of each relation functionally determine all of the other attributes. When you purchase an item it goes into inventory in the SALE_ITEM table. The inventory has how much it cost to buy. When a customer purchases an item the transaction is recorded by a SALE. The SALE table joins a customer to an item and has its own price, the price the sale price. This lets the shop owner mark up (or down) inventory. 

However, this schema can still be improved. There is a multivalued dependency in the data: 

    Sale ->-> SaleItem 

This says that a sale may have multiple sale items. But, the opposite is not true:

    FALSE: SaleItem ->-> Sale 

An item can't be sold to multiple people. So, to satisfy this MVD in 4NF, I've made the rule, "a sale can only have one item and when you buy more than one thing you must have more than one sale." This could and should be improved by repurposing the SALE_ITEM table. I'll start by creating an ITEM table that holds items of inventory:

    ITEM ( ItemID, ItemName, PurchaseDate, VendorID, PurchasePrice )

Now the new SALE_ITEM table follows the pattern used to solve an MVD except, we need to encode the purchase price, otherwise we'll loose that information when we take it out of SALE:

    SALE_ITEM ( SaleID, ItemID, SalePrice ) 

Now that I have the SALE_ITEM join the sale and the item there's no need to have ItemID in SALE anymore and the price is now the subtotal of all of the items in the sale:

    SALE ( SaleIDCustomerID, Date, Subtotal, Tax, Total ) 

So the final schema looks like this: 

    VENDOR ( VendorID, Name, Phone )
    CUSTOMER ( CustomerID, Name, Phone, Email ) 
    ITEM ( ItemID, ItemName, PurchaseDate, VendorID, PurchasePrice )
    SALE ( SaleIDCustomerID, Date, Subtotal, Tax, Total ) 
    SALE_ITEM ( SaleIDItemID, SalePrice ) 

This schema eliminates the "one item per sale" limitation by properly recognizing the MVD. 

SQL Queries: 

# There are 16 Starbucks 

select distinct 
    name, address 
from inspection 
where name like '%STARBUCKS%' ;

# Little Caesars! 

select 
    name, address 
from inspection 
where address like '%LIVE OAK%' ;

# List verified 
select * 
from inspection 
where verified = 'Yes';
Comments