CSCE 520 Final Exam
Thursday December 14, 2017

Do all problems, putting your answers on separate paper. All answers should be reasonably short. The exam is open book, open notes, but no electronic devices. You have two and a half hours; pace yourself accordingly.

When you submit your exam pages, staple them together with a single staple in the upper left-hand corner. A stapler will be provided. Make sure your name is on at least the first page. During the test, BE SURE NOT TO WRITE ANYTHING IN THE AREA THAT WILL BE UNDERNEATH THE STAPLE.

There are 140 points total. 120 points constitutes full credit, and undergraduates get a free 12-point boost. Any score in excess of full credit counts as extra credit. Note that questions are not necessarily given in order of presentation in class or of difficulty.

1. (30 points total) Let R(A, B, C, D, E) be a relation whose schema satisfies the following functional dependencies:

    B → E
    BE → A
    C → D
    CD → E

(a) (5 points) Compute the closures {A}⁺, {B}⁺, and {C}⁺.

(b) (5 points) List all keys of R.

(c) (5 points) List the FDs that hold in π_{A,B,C}(R). You can omit FDs that follow from those you list.

(d) (15 points) Decompose R completely into BCNF relations as efficiently as possible without losing any information (lossless join). Use the method described in the textbook or in class. (You may lose some FDs in the decomposition; that is OK.) There may be more than one correct answer.

2. (5 points) Let R and S be the following two tables, respectively:

     A   B        B   C
    ---+---      ---+---
     1   3        2   4
     4   2        3   2
     3   3        2   3
     3   1        1   2

Write a table for π_{A,C}(σ_{A>C}(R ⋈ S)), assuming bag operations. Tuple order does not matter.

3. (45 points total) Assume our usual relational database schema for students taking classes:

    Student(sid, name, status)
    Class(prefix, courseno, semester, instructor)
    Course(prefix, courseno, title)
    Takes(sid, prefix, courseno, semester, grade)

Where:
    The primary key for Student is (sid).
    The primary key for Class is (prefix, courseno, semester).
    The primary key for Course is (prefix, courseno).
    The primary key for Takes is (sid, prefix, courseno, semester).
    grade is of numerical type, between 0.0 and 4.0.
For parts (b,c,d) you may use subqueries as you see fit, provided they are reasonable.

(a) (10 points) Give a relational algebra expression for the course titles and instructors of all CSCE classes taught in Fall 2017 where the student with the highest grade is named Ahmed.

(b) (10 points) Give an SQL query for the above that is not gratuitously complex.

(c) (10 points) Give a data modification statement in SQL that boosts Ahmed's grade up to 2.0 in every MATH class taught by Eva that he has taken where he got a grade lower than 2.0. (Note that all students named Ahmed might benefit from this.)

(d) (15 points) Recall that the CREATE ASSERTION command in SQL has syntax

    CREATE ASSERTION <assertion-name> CHECK (<condition>);

and creates the global constraint that <condition> must hold at all times. Suppose we add the following table to the schema above:

    InMajor(prefix, courseno)

A tuple (p, n) in this relation indicates that the course with prefix p and course number n is a required course for all majors in the program (given by the prefix p). Give a CREATE ASSERTION command that enforces the requirement that any student taking a course in the CSCE major numbered 300 or above must have in previous semesters taken (with a grade of 2.0 or better) all courses in the CSCE major numbered less than 300. (The CSE department at USC had this requirement until recently.) For this, you can assume (not actually true) that semester values are ordered chronologically by <, <=, etc. E.g., semester1 < semester2 means that semester1 occurs before semester2.

4. (40 points total) Consider the following relational database schema:

    Airport( code, name, city, capacity )
    /* name: the name of the airport (e.g., 'Hartsfield-Jackson')
       code: a three-letter code (e.g., 'CLT')
       city: the name of the city served by the airport (e.g., 'Atlanta, GA')
       capacity: the maximum number of planes per day through the airport */

    Airline( name, headquarters )
    /* name: the name of the airline (e.g., 'Aeroflot')
       headquarters: the city (e.g., 'Moscow, RU') where the airline's headquarters is */

    DailyFlight( flightno, airlinename, departcode, arrivecode )
    /* flightno: the flight number (a code of up to 8 characters, e.g., 'DL1701')
       airlinename: the name of the airline that runs the flight
       departcode: the three-letter code of the airport from which the flight leaves
       arrivecode: the three-letter code of the airport at which the flight arrives */

Assume the following constraints on the data:

Primary Keys:
    An airport is uniquely identified by its code.
    An airline is uniquely identified by its name.
    A daily flight is uniquely identified by its flight number.

Value Constraints:
    All airports must have positive capacity.
    The name of an airport cannot be null.

Referential Integrity:
    An airlinename appearing in the DailyFlight table must also appear as a name in the Airline table.
    Both the depart and arrive codes for a daily flight must also appear as codes in the Airport table.

(a) (15 points) Give CREATE TABLE commands in SQL for the three relations above, giving attribute types that are reasonably appropriate and consistent. Also incorporate the given constraints.

(b) (10 points) Express the constraint (in the form R = ∅, where R is some expression in relational algebra) that if some flight arrives at an airport, then some flight run by the same airline must depart from the same airport.
(c) (15 points) Suppose the tables described above are made up of the following tuples:

Airport:
    name                code  city             capacity
    Hartsfield-Jackson  ATL   Atlanta, GA      1260
    Logan               BOS   Boston, MA       200
    Columbia            CAE   Columbia, SC     30
    De Gaulle           CDG   Paris, FR        800
    Charlotte-Douglas   CLT   Charlotte, NC    400
    Metro               DTW   Detroit, MI      250
    Newark              EWR   Newark, NJ       600
    Frankfurt           FRA   Frankfurt, GR    2000
    Kennedy             JFK   New York, NY     1659
    Los Angeles         LAX   Los Angeles, CA  1350
    LaGuardia           LGA   New York, NY     1700
    Gatwick             LGW   London, UK       1000
    Heathrow            LHR   London, UK       700
    Midway              MDW   Chicago, IL      50
    Narita              NRT   Tokyo, JP        800
    O'Hare              ORD   Chicago, IL      1490
    Orly                ORY   Paris, FR        650
    Sheremetyevo        SVO   Moscow, RU       900

Airline:
    name             headquarters
    Aeroflot         Moscow, RU
    Air France       Paris, FR
    American         Dallas, TX
    British Airways  London, UK
    Delta            Atlanta, GA
    JAL              Tokyo, JP
    Lufthansa        Frankfurt, GR
    Midway           Chicago, IL
    Northwest        Detroit, MI
    United           Chicago, IL
    Virgin Pacific   Seattle, WA

DailyFlight:
    flightno  airlinename      departcode  arrivecode
    L201      Lufthansa        FRA         CLT
    DL1287    Delta            ATL         JFK
    DL790     Delta            ORD         LAX
    AF016     Aeroflot         SVO         JFK
    UA310     United           LAX         NRT
    AF521     Air France       ORY         CLT
    NWA549    Northwest        FRA         DTW
    BA120     British Airways  BOS         LHR
    AF6223    Air France       LHR         ORY
    L202      Lufthansa        CLT         FRA

What is returned by the following SQL queries? (You can suppress duplicate entries if you want.)

(a) SELECT city
    FROM Airport
    WHERE capacity > 200;

(b) SELECT headquarters
    FROM Airline, DailyFlight
    WHERE name = airlinename
      AND arrivecode in ( 'CLT', 'LAX', 'JFK' );

(c) SELECT city, flightno
    FROM Airport, Airline, DailyFlight
    WHERE code = arrivecode
      AND name = airlinename
      AND headquarters = 'Chicago, IL';
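
Answers to the queries in part (c) can be cross-checked mechanically by loading the sample tuples into a database. The following is only a minimal sketch in Python using the standard sqlite3 module. It assumes TEXT column types (the problem leaves types open), loads just the Airline and DailyFlight tuples, and omits the Airport table, so the referential constraints on departcode and arrivecode are not declared. The helper names build_db and query_b are made up for this sketch.

```python
import sqlite3

# Column types are assumptions of this sketch; the exam leaves them open.
# The Airport table is omitted, so departcode/arrivecode carry no foreign keys.
SCHEMA = """
CREATE TABLE Airline (
    name         TEXT PRIMARY KEY,
    headquarters TEXT
);
CREATE TABLE DailyFlight (
    flightno    TEXT PRIMARY KEY,
    airlinename TEXT REFERENCES Airline(name),
    departcode  TEXT,
    arrivecode  TEXT
);
"""

# Sample tuples copied from the tables in part (c).
AIRLINES = [
    ("Aeroflot", "Moscow, RU"), ("Air France", "Paris, FR"),
    ("American", "Dallas, TX"), ("British Airways", "London, UK"),
    ("Delta", "Atlanta, GA"), ("JAL", "Tokyo, JP"),
    ("Lufthansa", "Frankfurt, GR"), ("Midway", "Chicago, IL"),
    ("Northwest", "Detroit, MI"), ("United", "Chicago, IL"),
    ("Virgin Pacific", "Seattle, WA"),
]
FLIGHTS = [
    ("L201", "Lufthansa", "FRA", "CLT"), ("DL1287", "Delta", "ATL", "JFK"),
    ("DL790", "Delta", "ORD", "LAX"), ("AF016", "Aeroflot", "SVO", "JFK"),
    ("UA310", "United", "LAX", "NRT"), ("AF521", "Air France", "ORY", "CLT"),
    ("NWA549", "Northwest", "FRA", "DTW"),
    ("BA120", "British Airways", "BOS", "LHR"),
    ("AF6223", "Air France", "LHR", "ORY"), ("L202", "Lufthansa", "CLT", "FRA"),
]


def build_db():
    """Create an in-memory database holding the two sample tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs by default
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Airline VALUES (?, ?)", AIRLINES)
    conn.executemany("INSERT INTO DailyFlight VALUES (?, ?, ?, ?)", FLIGHTS)
    return conn


def query_b(conn):
    """Run query (b) and return the headquarters values as a list (a bag)."""
    rows = conn.execute(
        "SELECT headquarters FROM Airline, DailyFlight "
        "WHERE name = airlinename AND arrivecode IN ('CLT', 'LAX', 'JFK')"
    ).fetchall()
    return [hq for (hq,) in rows]


if __name__ == "__main__":
    for hq in sorted(query_b(build_db())):
        print(hq)
```

SQLite returns one row per matching flight, so duplicates appear in the output; the exam allows them to be suppressed.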
5. (20 points total) This problem refers to an XML document with product data, stored locally with file name proddata.xml. The root element of this document is shown in Figures 12.5 and 12.5 on pages 526-527 of the textbook. That element is reproduced below with a slight correction (restoring a missing tag):

    <Products>
      <Maker name = "A">
        <PC model = "1001" price = "2114">
          <Speed>2.66</Speed>
          <HardDisk>250</HardDisk>
        </PC>
        <PC model = "1002" price = "995">
          <Speed>2.10</Speed>
          <RAM>512</RAM>
          <HardDisk>250</HardDisk>
        </PC>
        <Laptop model = "2004" price = "1150">
          <Speed>2.00</Speed>
          <RAM>512</RAM>
          <HardDisk>60</HardDisk>
          <Screen>13.3</Screen>
        </Laptop>
        <Laptop model = "2005" price = "2500">
          <Speed>2.16</Speed>
          <HardDisk>120</HardDisk>
          <Screen>17.0</Screen>
        </Laptop>
      </Maker>
      <Maker name = "E">
        <PC model = "1011" price = "959">
          <Speed>1.86</Speed>
          <RAM>2048</RAM>
          <HardDisk>160</HardDisk>
        </PC>
        <PC model = "1012" price = "649">
          <Speed>2.80</Speed>
          <HardDisk>160</HardDisk>
        </PC>
        <Laptop model = "2001" price = "3673">
          <Speed>2.00</Speed>
          <RAM>2048</RAM>
          <HardDisk>240</HardDisk>
          <Screen>20.1</Screen>
        </Laptop>
        <Printer model = "3002" price = "239">
          <Color>false</Color>
          <Type>laser</Type>
        </Printer>
      </Maker>
      <Maker name = "H">
        <Printer model = "3006" price = "100">
          <Color>true</Color>
          <Type>ink-jet</Type>
        </Printer>
        <Printer model = "3007" price = "200">
          <Color>true</Color>
          <Type>laser</Type>
        </Printer>
      </Maker>
    </Products>

(a) (10 points) What, specifically, is returned by the following XQuery expression, when run on the document above?

    let $prods := doc("proddata.xml")/products
    let $bigpcs := (
        for $maker in $prods/maker,
            $pc in $maker/pc
        where $pc/ram > 2000 or $pc/harddisk > 200
        return <BigPC model = {$pc/@model} maker = {$maker/@name} />
    )
    return <BigPCs>{$bigPCs}</BigPCs>

For readability's sake you may break lines and indent as appropriate.

(b) (10 points) Write an XQuery expression that finds the names of those makers that make both printers and PCs (i.e., where at least one printer and at least one PC are made by the same maker). The result should be a sequence of names. Your query should work in general, not just on the specific data given above.
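
The condition in part (b), that a maker has at least one printer child and at least one PC child, can be sanity-checked outside XQuery. The following is a rough Python sketch using the standard library's xml.etree.ElementTree. It is not an XQuery answer. The SAMPLE document is a reduced stand-in that keeps a few makers and models from the data above but drops the hardware details, and the function name makers_with_printer_and_pc is invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Reduced, well-formed stand-in for proddata.xml (an assumption of this sketch):
# same element and attribute names, hardware details omitted.
SAMPLE = """
<Products>
  <Maker name="A">
    <PC model="1001" price="2114"/>
    <Laptop model="2004" price="1150"/>
  </Maker>
  <Maker name="E">
    <PC model="1011" price="959"/>
    <Printer model="3002" price="239"/>
  </Maker>
  <Maker name="H">
    <Printer model="3006" price="100"/>
  </Maker>
</Products>
"""


def makers_with_printer_and_pc(xml_text):
    """Return names of makers having at least one PC and one Printer child."""
    root = ET.fromstring(xml_text)
    return [
        maker.get("name")
        for maker in root.findall("Maker")  # direct Maker children of the root
        if maker.find("PC") is not None and maker.find("Printer") is not None
    ]


if __name__ == "__main__":
    print(makers_with_printer_and_pc(SAMPLE))
```

Element names in ElementTree paths are case-sensitive, as they are in XQuery, so "Maker" and "maker" select different things.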