Project Deliverables:


1)      A written project description, including:

a.       Description of the project

b.      Description of the data

c.       A description of the process by which the database schema was designed

d.      A description of the data extraction process

e.       A description of any data domains and data mappings, and their sources

f.        A description of at least 10 data quality rules associated with the data

g.       A description of the data quality validation process

h.       A description of the “extra” chosen

2)      Hard copy of:

a.       Database schema

b.      Data extraction code

c.       Data quality code

d.      “Extra” code

e.       Report answering the following questions from the data:

        i.      How many companies were involved in more than one filing?

        ii.     Which company appeared most frequently as the subject company in a filing?

        iii.    How many filings were made by non-companies? (That is, please distinguish between an entity that is a company and one that is not a company.)

        iv.     Provide a list of the records from the new data set that did not validate, and why each one failed.

        v.      Provide a histogram detailing the frequency of form types filed.

        vi.     Detail all companies that have changed their names more than 3 times, including the current name, all previous names, and the date of each name change.

3)      Demonstration
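
For orientation only, here is a minimal sketch of how report items i, iv and v might be approached. It uses Python's built-in sqlite3 module and assumes a single hypothetical table named filings with columns filing_id, company_name and form_type. The table, the column names, the sample rows, the placeholder form types and the one sample data quality rule are all illustrative assumptions, not part of the assignment. Your own schema and rules will differ.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'filings' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filings ("
        "filing_id INTEGER PRIMARY KEY, "
        "company_name TEXT, "
        "form_type TEXT)"
    )
    # Made-up sample rows; form types are placeholders, not real form names.
    rows = [
        (1, "Acme Corp", "FORM-A"),
        (2, "Acme Corp", "FORM-B"),
        (3, "Beta LLC", "FORM-A"),
        (4, "Gamma Inc", "FORM-B"),
        (5, "Gamma Inc", "FORM-B"),
        (6, None, "FORM-C"),
    ]
    conn.executemany("INSERT INTO filings VALUES (?, ?, ?)", rows)
    return conn


def companies_with_multiple_filings(conn):
    """Report item i: count companies that appear in more than one filing."""
    sql = (
        "SELECT COUNT(*) FROM ("
        "  SELECT company_name FROM filings"
        "  WHERE company_name IS NOT NULL"
        "  GROUP BY company_name"
        "  HAVING COUNT(*) > 1"
        ")"
    )
    return conn.execute(sql).fetchone()[0]


def rows_failing_rule(conn):
    """Report item iv: apply one sample rule (company_name must be present)
    and return each failing record together with the reason."""
    sql = (
        "SELECT filing_id, 'company_name is missing' FROM filings "
        "WHERE company_name IS NULL OR TRIM(company_name) = '' "
        "ORDER BY filing_id"
    )
    return conn.execute(sql).fetchall()


def form_type_frequencies(conn):
    """Report item v: frequency of each form type, most frequent first."""
    sql = (
        "SELECT form_type, COUNT(*) AS n FROM filings "
        "GROUP BY form_type ORDER BY n DESC, form_type"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    print("Companies with more than one filing:",
          companies_with_multiple_filings(conn))
    print("Records failing validation:", rows_failing_rule(conn))
    # A simple text histogram: one '#' per filing of that form type.
    for form_type, n in form_type_frequencies(conn):
        print(f"{form_type:<8} {'#' * n} ({n})")
```

Pairing each failing record with a reason string, as rows_failing_rule does, is one straightforward way to produce the "and why" part of item iv. With at least 10 rules, one such query per rule keeps the reasons easy to trace.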