Project: Using Public Data

California’s Open Data Portal is a place where you can get lots of interesting information about the State of California and government programs. In this project you will select a dataset based on your interests and prepare the data to be imported into a schema.

Select Your Dataset

You can use any publicly available dataset for this project. The Open Data Portal has good-quality datasets, so I recommend looking around these URLs for something that interests you:

This is a good place for statewide data on many subjects:

Individual state departments also have their own datasets:

Data Requirements

Your schema must be complex enough to meet the requirements below. Not all public datasets will meet them, so select one that does. You can join multiple datasets if you like. The requirements are:

  1. You must have at least three tables.

  2. You must have at least two foreign key constraints.
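
To make the two requirements concrete, here is a minimal sketch of a schema that would satisfy them, written with Python's built-in sqlite3 module. The tables (county, facility, inspection) and all of their columns are hypothetical and are not taken from any real dataset; your own schema will depend on the data you choose.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE county (
    county_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE facility (
    facility_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    county_id   INTEGER NOT NULL REFERENCES county(county_id)
);
CREATE TABLE inspection (
    inspection_id INTEGER PRIMARY KEY,
    facility_id   INTEGER NOT NULL REFERENCES facility(facility_id),
    inspected_on  TEXT NOT NULL,
    score         INTEGER
);
"""


def build_schema(conn):
    """Create the example tables with foreign key enforcement turned on."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


def count_tables_and_foreign_keys(conn):
    """Return (number of tables, number of foreign key columns)."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    fk_count = sum(
        len(conn.execute(f"PRAGMA foreign_key_list({table})").fetchall())
        for table in tables)
    return len(tables), fk_count


conn = sqlite3.connect(":memory:")
build_schema(conn)
print(count_tables_and_foreign_keys(conn))  # (3, 2)
```

Here facility refers to county and inspection refers to facility, which gives three tables and two foreign key constraints. A chain like this is only one possible shape; two tables that both refer to a third would meet the requirements as well.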


When you have decided on a dataset, write a report that introduces it. The report should explain what your data is and why you want to use it. The write-up must also include:

  1. An introduction to your project.

  2. A description of the CSV files that you submitted with the assignment.

  3. The entities and relationships in your data.

  4. The functional dependencies you expect to see. (You don’t have to validate the FDs at this stage)

  5. The tables you expect to create.
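
Validating functional dependencies is not required at this stage, but if you want a quick sanity check on one you expect to hold, something like the sketch below may help. It assumes your CSV has a header row. The helper name fd_violations, the column names, and the sample rows are all made up for illustration.

```python
import csv
import io


def fd_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value.

    An empty result means the dependency determinant -> dependent holds
    for the rows that were checked.
    """
    seen = {}
    violations = set()
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # Remember the first dependent value seen for each determinant value.
        if seen.setdefault(key, value) != value:
            violations.add(key)
    return violations


# Made-up sample data standing in for one of your CSV files.
SAMPLE_CSV = """facility_id,facility_name,county
1,Main Clinic,Santa Cruz
2,Harbor Clinic,Santa Cruz
1,Main Clinic,Santa Cruz
2,Harbor Clinic,Monterey
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(fd_violations(rows, ["facility_id"], ["facility_name"]))  # set()
print(fd_violations(rows, ["facility_id"], ["county"]))         # {('2',)}
```

In this sample, facility_id determines facility_name, but it does not determine county because facility 2 appears with two different counties. A result like that in your own data suggests either a data quality problem or a dependency you should not list.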

I’ve written an example based on the Cabrillo courses labs. Please look at the example and submit something similar.

Turn In

Submit your write-up and your data files.