NEW YORK UNIVERSITY
COMPUTER SCIENCES DEPARTMENT

MIDTERM EXAM - SOLUTIONS
G22.2433-001 (Database Systems) Spring 2000
Instructor: Viswanath Poosala
9, March, 2000.
Duration: 1hr 45min.

Total Marks: 100


  1. (5) State (briefly) three major advantages of using a DBMS instead of a file system. Are there any disadvantages? If so, state one.
    Answer:  Advantages: Flexibility in accessing data using high level query languages; Prevention of data corruption via integrity constraints; Ability to access data concurrently without problems; automated recovery in case of failure.
       Disadvantages: Costs more, takes more time due to the additional software layers, requires complex administration.
Grades: 3 marks for the advantages; 2 for the disadvantage.
  1. (E-R Model) Consider a university trying to move its operations to a DBMS. The university has several academic departments. Each department has a unique name and a street address (number, street name) associated with it. The departments offer courses with distinct names, each of which is taught by a single professor and one or more TAs. A TA can only be assigned to one course. The course is offered once a week at a fixed time. Students belong to a single department but can register for courses offered by any department. When a student takes a course, he/she is assigned to a TA for help (i.e., each TA of the course helps different set of students). Each person (professor, student, TA) has a name and a unique SSN. Professor and TA also earn a salary. Finally, the university also wants to keep track of the one or more languages that the TA can speak.
    1. (15) Design an E-R diagram for modeling the above data. State your assumptions.
    2. (10) Derive the relations corresponding to your E-R diagram. Try not to have any redundant tables.
    3. (5) Identify any two foreign key constraints for this relational database schema.
    Grading: The ER diagram should capture ALL the relationships in the question. Importantly, the TA-Student-Course relationship is best captured by a ternary relationship (-3 if not); the TA-languages relationship is best captured using a set attribute; it's also ok to use a relationship between the two entities TA and Language (-2 if neither is used).
  1. (Rel. DB Design) Consider the EMP_PROJ relation schema with the attributes SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC (project location). The following set of functional dependencies hold on this schema:
    Grades: Full marks for correct answers. Not much scope for partial marks, but I leave it to the TA's discretion.
Answer: SSN,PNUMBER->ENAME, PNAME, PLOC, HOURS.
     Candidate key is {SSN, PNUMBER}.
     It is not in BCNF because SSN->ENAME is not a superkey dependency.
     Therefore, all tuples containing the same SSN will also have the same ENAME, causing redundancy. This results in waste of space and problems with updating the data consistently.
     3NF decomposition: {SSN, ENAME},  {PNUMBER, PNAME, PLOC},  {SSN, PNUMBER, HOURS}
     BCNF decomposition: same.
  1. (SQL) Consider the following Supplier-Part-Project database.
  2. Write the following queries in SQL. State your assumptions if any.
    Grades: Full marks for correct query; -2 for getting the SELECT clause wrong.
     Answer:
      1. CREATE TABLE SPJ
            ( SNO INTEGER NOT NULL,
              PNO INTEGER NOT NULL,
              JNO INTEGER NOT NULL,
              QTY INTEGER NOT NULL,
              PRIMARY KEY (SNO, PNO, JNO),
              FOREIGN KEY (SNO) REFERENCES S,
              FOREIGN KEY (PNO) REFERENCES P,
              FOREIGN KEY (JNO) REFERENCES J);

        2. CREATE VIEW LONDON_SUPPLIERS AS (
            SELECT SNAME FROM S WHERE SCITY = 'London');

        3. select distinct P.COLOR
            from S,P,J,SPJ
            where (S.SCITY = 'London') and
                        (J.JCITY = 'Paris') and
                        (S.SNO = SPJ.SNO) and
                        (P.PNO = SPJ.PNO) and
                        (J.JNO = SPJ.JNO)
            order by P.COLOR;
        4. select JNO, count(distinct PNO)
            from SPJ
            where SNO = 'S1';

        5. select distinct JNO
            from SPJ spj1
            where not exists ((select SNO from S where SCITY = 'London) -
                                         (select SNO from SPJ spj2 where spj1.JNO = spj2.JNO));
        6. Pi_{JNO, SNO}(SPJ)  %  Pi_{SNO}(Sigma_{SCITY=London}(S));
 

  1. (9) (SQL) Given the database schema R(a,b,c), and a relation r on the schema R, write an SQL query to test whether the functional dependeny "a -> b" holds on relation r. Assume that there are no null values in r. Your answer should be "Yes" if the dependency holds and can be "No" or empty result if the dependency does not hold.
        Answer. select distinct 'Yes' from R t1
              where not exists (select * from R t2 where ((t1.a = t2.a) && (t1.b != t2.b));
        Grades: 9 for correct; 7 for getting the right predicate but missing the select clause.
 
  1. (1) (For a Million $$ :-)) What/who is Strawberry Fields:
        Answer: (b), (c).
        Grades: 1 for everyone.