Introduction to Data Management CSE 344

Lecture 5: Aggregates in SQL
Daniel Halperin
CSE 344 - Winter 2014

Announcements
- Webquiz 2 posted this morning
- Homework 1 is due on Thursday (01/16)

(Random detour:) Who is this?
http://content.lib.washington.edu/cdm4/item_viewer.php?cisoroot=/portraits&cisoptr=117&cisobox=1&rec=5

Does this help?
http://content.lib.washington.edu/cdm4/item_viewer.php?cisoroot=/uwcampus&cisoptr=1649

Winlock W. Miller (of Miller Hall :)
- UW Regent (the Regents are the managers of the university) for 35 years between 1913 and 1953
- The board is usually full of executives from major institutions; the current Board of Regents Chair is a former Alaska Airlines CEO, etc.
- Winlock, WA is named after him
- Father was Gen'l William Winlock Miller (confusing, I know), first mayor of Olympia and land speculator. (I think)
WA has some interesting history!

Refresh your memory
> SELECT * FROM Purchase;
pid         product     price       quantity    month
----------  ----------  ----------  ----------  ----------
1           bagel       1.99        20          september
2           bagel       2.5         12          december
3           banana      0.99        9           september
4           banana      1.59        9           february
5           gizmo       99.99       5           february
6           gizmo       99.99       3           march
7           gizmo       49.99       3           april
8           gadget      89.99       3           january
9           gadget      89.99       3           february
10          gadget      49.99       3           march
11          orange      NULL        5           may
12          orange      1.29        34          january

Refresh your memory
How do we
- Compute the total number of sales?
- Compute the total number of products sold?
- Compute the total number of each product sold?
- Compute the gross $ spent on each product? (qty * price)
- Compute the average gross $ of each product? (2 ways)

Refresh your memory
How do we
- Compute the gross monthly sales in $? (units * price/unit)
- Sort the months from most sales to least?
- Find all the unique prices? (2 ways)
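
One possible set of answers to some of the refresher questions, sketched with Python's built-in `sqlite3` module so the SQL can be run directly. The column types in `CREATE TABLE` are assumptions (the slides only show the query output), and "total number of products sold" is read here as total units, which is only one interpretation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides only show the table contents.
conn.execute(
    "CREATE TABLE Purchase (pid INT PRIMARY KEY, product TEXT, "
    "price REAL, quantity INT, month TEXT)"
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
        (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
        (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
        (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
        (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
        (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
    ],
)

# Total number of sales: one row per sale.
total_sales = conn.execute("SELECT count(*) FROM Purchase").fetchone()[0]

# Total number of units sold across all products.
total_units = conn.execute("SELECT sum(quantity) FROM Purchase").fetchone()[0]

# Total number of each product sold.
units_per_product = dict(
    conn.execute("SELECT product, sum(quantity) FROM Purchase GROUP BY product")
)

# Months sorted from highest gross sales to lowest.
months_by_sales = [
    month
    for (month, gross) in conn.execute(
        "SELECT month, sum(price * quantity) AS gross "
        "FROM Purchase GROUP BY month ORDER BY gross DESC"
    )
]

# Unique prices, two ways: DISTINCT and GROUP BY.
prices_distinct = {p for (p,) in conn.execute("SELECT DISTINCT price FROM Purchase")}
prices_grouped = {p for (p,) in conn.execute("SELECT price FROM Purchase GROUP BY price")}

print(total_sales, total_units, units_per_product)
print(months_by_sales)
print(prices_distinct == prices_grouped)
```

If "products sold" is meant as the number of different products, `SELECT count(DISTINCT product) FROM Purchase` would be the query instead. Note that the NULL price makes the gross for may NULL, and SQLite sorts NULL last in descending order.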

HAVING Clause
Same query as earlier, except that we consider only products that had more than 30 sales.
    SELECT product, sum(price*quantity)
    FROM Purchase
    WHERE price > 1
    GROUP BY product
    HAVING sum(quantity) > 30
The HAVING clause contains conditions on aggregates.

WHERE vs HAVING
- The WHERE condition is applied to individual rows
  - The rows may or may not contribute to the aggregate
  - No aggregates allowed here
- The HAVING condition is applied to the entire group
  - Entire group is returned, or not at all
  - May use aggregate functions in the group
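
A small sketch of the difference, run against the Purchase data with Python's `sqlite3` module. The column types are assumptions, and `ORDER BY product` is added only to make the output order predictable. The last statement deliberately puts an aggregate in WHERE to show that SQLite rejects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides only show the table contents.
conn.execute(
    "CREATE TABLE Purchase (pid INT PRIMARY KEY, product TEXT, "
    "price REAL, quantity INT, month TEXT)"
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
        (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
        (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
        (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
        (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
        (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
    ],
)

# WHERE drops individual rows first (price <= 1 or NULL),
# then HAVING keeps or drops whole groups.
with_where = conn.execute(
    "SELECT product, sum(quantity) FROM Purchase "
    "WHERE price > 1 GROUP BY product "
    "HAVING sum(quantity) > 30 ORDER BY product"
).fetchall()

# Same HAVING condition without the row filter: the NULL-priced
# orange row now contributes to its group's total.
without_where = conn.execute(
    "SELECT product, sum(quantity) FROM Purchase "
    "GROUP BY product HAVING sum(quantity) > 30 ORDER BY product"
).fetchall()

# An aggregate inside WHERE is not allowed.
try:
    conn.execute(
        "SELECT product FROM Purchase WHERE sum(quantity) > 30 GROUP BY product"
    )
    where_rejected_aggregate = False
except sqlite3.Error:
    where_rejected_aggregate = True

print(with_where)
print(without_where)
print(where_rejected_aggregate)
```

The same two products survive in both queries, but the orange total differs (34 vs 39) because WHERE removed one row before the groups were formed.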

Aggregates and Joins
    create table Product (pid int primary key, pname varchar(15), manufacturer varchar(15));
    insert into product values(1,'bagel','sunshine Co.');
    insert into product values(2,'banana','busyhands');
    insert into product values(3,'gizmo','gizmoworks');
    insert into product values(4,'gadget','busyhands');
    insert into product values(5,'powergizmo','powerworks');

Aggregate + Join Example
    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer
What do these queries mean?
    SELECT x.manufacturer, y.month, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer, y.month
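
To see what the two queries return, here is a sketch that loads both tables into an in-memory SQLite database and runs them. The Purchase column types are assumptions; the Product definition follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Purchase column types are assumed; Product follows the slide's definition.
conn.execute(
    "CREATE TABLE Purchase (pid INT PRIMARY KEY, product TEXT, "
    "price REAL, quantity INT, month TEXT)"
)
conn.execute(
    "CREATE TABLE Product (pid INT PRIMARY KEY, pname VARCHAR(15), "
    "manufacturer VARCHAR(15))"
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
        (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
        (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
        (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
        (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
        (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
    ],
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        (1, "bagel", "sunshine Co."), (2, "banana", "busyhands"),
        (3, "gizmo", "gizmoworks"), (4, "gadget", "busyhands"),
        (5, "powergizmo", "powerworks"),
    ],
)

# Number of purchase rows per manufacturer.
per_manufacturer = dict(
    conn.execute(
        "SELECT x.manufacturer, count(*) FROM Product x, Purchase y "
        "WHERE x.pname = y.product GROUP BY x.manufacturer"
    )
)

# Number of purchase rows per (manufacturer, month) pair.
per_manufacturer_month = {
    (manufacturer, month): n
    for (manufacturer, month, n) in conn.execute(
        "SELECT x.manufacturer, y.month, count(*) FROM Product x, Purchase y "
        "WHERE x.pname = y.product GROUP BY x.manufacturer, y.month"
    )
}

print(per_manufacturer)
print(per_manufacturer_month)
```

The first query counts purchases per manufacturer; the second counts them per manufacturer and month. The orange purchases drop out because no Product row matches them, and powerworks does not appear because nothing it makes was purchased.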

General form of Grouping and Aggregation
    SELECT S
    FROM R1, ..., Rn
    WHERE C1
    GROUP BY a1, ..., ak
    HAVING C2
S = may contain attributes a1, ..., ak and/or any aggregates, but NO OTHER ATTRIBUTES (Why?)
C1 = is any condition on the attributes in R1, ..., Rn
C2 = is any condition on aggregate expressions and on attributes a1, ..., ak

Semantics of SQL With Group-By
    SELECT S
    FROM R1, ..., Rn
    WHERE C1
    GROUP BY a1, ..., ak
    HAVING C2
Evaluation steps:
1. Evaluate FROM-WHERE using Nested Loop Semantics
2. Group by the attributes a1, ..., ak
3. Apply condition C2 to each group (may have aggregates)
4. Compute aggregates in S and return the result
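
The four evaluation steps can be traced by hand in plain Python. This is only an illustration of the semantics, not how a database engine actually executes the query; it uses the HAVING query (`WHERE price > 1`, `GROUP BY product`, `HAVING sum(quantity) > 30`) on the Purchase rows. With a single table in FROM, the nested loop reduces to one loop.

```python
from collections import defaultdict

# (pid, product, price, quantity, month); None stands in for SQL NULL.
purchase = [
    (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
    (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
    (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
    (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
    (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
    (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
]

# Step 1: FROM-WHERE. Keep rows with price > 1; a NULL price never satisfies it.
selected = [row for row in purchase if row[2] is not None and row[2] > 1]

# Step 2: GROUP BY product.
groups = defaultdict(list)
for row in selected:
    groups[row[1]].append(row)

# Step 3: HAVING sum(quantity) > 30, applied to each whole group.
kept = {
    product: rows
    for product, rows in groups.items()
    if sum(r[3] for r in rows) > 30
}

# Step 4: compute the aggregate in SELECT, sum(price * quantity).
result = {
    product: sum(r[2] * r[3] for r in rows) for product, rows in kept.items()
}

print(result)
```

Only bagel and orange survive step 3; every group that reaches step 4 contains at least one row, which is why a grouped query never reports an empty group.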

Empty Groups
- In the result of a group-by query, there is one row per group
- No group can be empty!
- In particular, count(*) is never 0
    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer
What if there are no purchases for a manufacturer?

Empty Groups: Example
    SELECT product, count(*)
    FROM purchase
    GROUP BY product
5 groups in our example dataset
    SELECT product, count(*)
    FROM purchase
    WHERE price > 2.0
    GROUP BY product
3 groups in our example dataset
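
The group counts can be checked by running both queries; a minimal sketch with Python's `sqlite3` module follows (column types are assumptions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the slides only show the table contents.
conn.execute(
    "CREATE TABLE Purchase (pid INT PRIMARY KEY, product TEXT, "
    "price REAL, quantity INT, month TEXT)"
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
        (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
        (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
        (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
        (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
        (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
    ],
)

# Every product forms a group.
all_groups = dict(
    conn.execute("SELECT product, count(*) FROM Purchase GROUP BY product")
)

# Products whose rows are all filtered out by WHERE form no group at all.
filtered_groups = dict(
    conn.execute(
        "SELECT product, count(*) FROM Purchase WHERE price > 2.0 GROUP BY product"
    )
)

print(all_groups)
print(filtered_groups)
```

banana and orange have no row with price > 2.0, so they disappear from the second result instead of showing up with a count of 0.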

Empty Group Problem
    SELECT x.manufacturer, count(*)
    FROM Product x, Purchase y
    WHERE x.pname = y.product
    GROUP BY x.manufacturer
What if there are no purchases for a manufacturer?

Empty Group Solution: Outer Join
    SELECT x.manufacturer, count(y.pid)
    FROM Product x LEFT OUTER JOIN Purchase y
    ON x.pname = y.product
    GROUP BY x.manufacturer
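
A sketch comparing the inner join with the outer join on the example data (Purchase column types are assumptions). It also runs a `count(*)` variant of the outer join, which is not on the slide, to show why the solution counts `y.pid` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Purchase column types are assumed; Product follows the slide's definition.
conn.execute(
    "CREATE TABLE Purchase (pid INT PRIMARY KEY, product TEXT, "
    "price REAL, quantity INT, month TEXT)"
)
conn.execute(
    "CREATE TABLE Product (pid INT PRIMARY KEY, pname VARCHAR(15), "
    "manufacturer VARCHAR(15))"
)
conn.executemany(
    "INSERT INTO Purchase VALUES (?, ?, ?, ?, ?)",
    [
        (1, "bagel", 1.99, 20, "september"), (2, "bagel", 2.5, 12, "december"),
        (3, "banana", 0.99, 9, "september"), (4, "banana", 1.59, 9, "february"),
        (5, "gizmo", 99.99, 5, "february"), (6, "gizmo", 99.99, 3, "march"),
        (7, "gizmo", 49.99, 3, "april"), (8, "gadget", 89.99, 3, "january"),
        (9, "gadget", 89.99, 3, "february"), (10, "gadget", 49.99, 3, "march"),
        (11, "orange", None, 5, "may"), (12, "orange", 1.29, 34, "january"),
    ],
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [
        (1, "bagel", "sunshine Co."), (2, "banana", "busyhands"),
        (3, "gizmo", "gizmoworks"), (4, "gadget", "busyhands"),
        (5, "powergizmo", "powerworks"),
    ],
)

# Inner join: manufacturers without purchases are missing from the result.
inner = dict(
    conn.execute(
        "SELECT x.manufacturer, count(*) FROM Product x, Purchase y "
        "WHERE x.pname = y.product GROUP BY x.manufacturer"
    )
)

# Outer join, counting y.pid: the NULL-padded row contributes 0.
outer = dict(
    conn.execute(
        "SELECT x.manufacturer, count(y.pid) "
        "FROM Product x LEFT OUTER JOIN Purchase y ON x.pname = y.product "
        "GROUP BY x.manufacturer"
    )
)

# Outer join with count(*): the NULL-padded row is still a row, so it counts as 1.
outer_star = dict(
    conn.execute(
        "SELECT x.manufacturer, count(*) "
        "FROM Product x LEFT OUTER JOIN Purchase y ON x.pname = y.product "
        "GROUP BY x.manufacturer"
    )
)

print(inner)
print(outer)
print(outer_star)
```

powerworks is absent from the inner-join result, appears with 0 when counting `y.pid`, and would misleadingly appear with 1 under `count(*)`.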

1) List all manufacturers with more than 10 items sold. Return the manufacturer name and the number of items sold.