Asynchronous Query Execution

Asynchronous Query Execution
Alexander Rubin
April 12, 2015

About Me
Alexander Rubin, Principal Consultant, Percona
Working with MySQL for over 10 years
Started at MySQL AB, Sun Microsystems, Oracle (MySQL Consulting)
Worked at Hortonworks (Hadoop company)
Joined Percona in 2013

Problem Set
Problem 1: Pagination is slow
    SELECT ... ORDER BY ... LIMIT 10
    works very fast
    SELECT COUNT(1) ... ORDER BY ...
    is really slow

Problem Set
Problem 2: Reporting query takes too long
It uses only 1 CPU core!
It does not take advantage of my big server!

Problem Set
Problem 3: A query slows down page load
    INSERT INTO page_log VALUES (...)
Only used for internal logs
Makes all pages load slowly

Problem Set
Customers are unhappy

Answer
Async query execution

Agenda
Splitting 1 query into N threads in the code
Bash script example
PHP asynchronous code example

Problem 1: pagination query
    mysql> select FlightDate, Carrier, origin, dest, ActualElapsedTime
        -> from ontime
        -> where origin = 'SFO'
        -> order by FlightDate desc limit 10;
    +------------+---------+--------+------+-------------------+
    | FlightDate | Carrier | origin | dest | ActualElapsedTime |
    +------------+---------+--------+------+-------------------+
    | 2013-10-31 | B6      | SFO    | FLL  |               316 |
    | 2013-10-31 | B6      | SFO    | FLL  |               307 |
    | 2013-10-31 | B6      | SFO    | JFK  |               298 |
    | 2013-10-31 | B6      | SFO    | AUS  |               201 |
    | 2013-10-31 | B6      | SFO    | LGB  |                84 |
    | 2013-10-31 | B6      | SFO    | LGB  |                78 |
    | 2013-10-31 | B6      | SFO    | BOS  |               313 |
    | 2013-10-31 | B6      | SFO    | BOS  |               315 |
    | 2013-10-31 | B6      | SFO    | BOS  |               336 |
    | 2013-10-31 | B6      | SFO    | JFK  |               343 |
    +------------+---------+--------+------+-------------------+
    10 rows in set (0.00 sec)

Problem 1: pagination query
    mysql> select count(*)
        -> from ontime
        -> where origin = 'SFO';
    +----------+
    | count(*) |
    +----------+
    |  3433692 |
    +----------+
    1 row in set (1.52 sec)

Problem 1: pagination query
    mysql> explain select count(*) from ontime where origin = 'SFO'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: ontime
             type: ref
    possible_keys: airport_date
              key: airport_date
          key_len: 6
              ref: const
             rows: 6735366
            Extra: Using where; Using index
    1 row in set (0.00 sec)

Problem 1: pagination query
    mysql> explain select FlightDate, Carrier, origin, dest, ActualElapsedTime
        -> from ontime where origin = 'SFO'
        -> order by FlightDate desc limit 10\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: ontime
             type: ref
    possible_keys: airport_date
              key: airport_date
          key_len: 6
              ref: const
             rows: 6735366
            Extra: Using where

Problem 1: pagination query
    mysql> select SQL_CALC_FOUND_ROWS FlightDate, Carrier, origin, dest, ActualElapsedTime
        -> from ontime where origin = 'SFO'
        -> order by FlightDate desc limit 10;
    +------------+---------+--------+------+-------------------+
    | FlightDate | Carrier | origin | dest | ActualElapsedTime |
    +------------+---------+--------+------+-------------------+
    | 2013-10-31 | B6      | SFO    | FLL  |               316 |
    | 2013-10-31 | B6      | SFO    | FLL  |               307 |
    | 2013-10-31 | B6      | SFO    | JFK  |               298 |
    | 2013-10-31 | B6      | SFO    | AUS  |               201 |
    | 2013-10-31 | B6      | SFO    | LGB  |                84 |
    | 2013-10-31 | B6      | SFO    | LGB  |                78 |
    | 2013-10-31 | B6      | SFO    | BOS  |               313 |
    | 2013-10-31 | B6      | SFO    | BOS  |               315 |
    | 2013-10-31 | B6      | SFO    | BOS  |               336 |
    | 2013-10-31 | B6      | SFO    | JFK  |               343 |
    +------------+---------+--------+------+-------------------+
    10 rows in set (23.06 sec)

    mysql> select found_rows();
    +--------------+
    | found_rows() |
    +--------------+
    |      3433692 |
    +--------------+

Problem 1: Solution
Run the main query (LIMIT 10, 0.00 sec) first
Run the second query (COUNT, about 2 sec) afterwards, asynchronously
Update the COUNT on top of the report when it arrives
JavaScript comes in handy
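The ordering described in this solution can be sketched in a few lines of bash. This is only an illustration of the pattern, not the presenter's code: `fetch_page`, `count_rows`, and `render_report` are hypothetical names, and the two query functions are stand-ins that print fixed text instead of calling the mysql client (the count value is the one shown on the earlier slide). In a web page the same idea would be done with a JavaScript request rather than a shell background job.

```shell
#!/bin/bash
# Stand-ins for the two queries (assumption: a real report would call mysql here).
fetch_page() { echo "page: 10 rows"; }        # fast "LIMIT 10" query
count_rows() { sleep 1; echo "3433692"; }     # slow "COUNT(*)" query

render_report() {
  local count_file count_pid
  count_file=$(mktemp)
  count_rows > "$count_file" &    # start the slow count in the background
  count_pid=$!
  fetch_page                      # show the first page without waiting for the count
  wait "$count_pid"               # collect the count once it is ready
  echo "total: $(cat "$count_file")"
  rm -f "$count_file"
}

render_report
```

The page rows are printed immediately; the total appears only after the background job finishes, so the slow count never delays the first page.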

Problem 2: reporting query
Which airlines have the most delays on flights within the continental US on business days from 1988 to 2009?

Problem 2: reporting query
    SELECT min(yeard), max(yeard), Carrier, count(*) as cnt,
           sum(arrdelayminutes>30) as flights_delayed,
           round(sum(arrdelayminutes>30)/count(*),2) as rate
    FROM ontime
    WHERE DayOfWeek not in (6,7)
      and OriginState not in ('AK', 'HI', 'PR', 'VI')
      and DestState not in ('AK', 'HI', 'PR', 'VI')
      and flightdate < '2010-01-01'
    GROUP by carrier
    HAVING cnt > 100000 and max(yeard) > 1990
    ORDER by rate DESC

Problem 2: reporting query
    +------------+------------+---------+----------+-----------------+------+
    | min(yeard) | max(yeard) | Carrier | cnt      | flights_delayed | rate |
    +------------+------------+---------+----------+-----------------+------+
    |       2003 |       2009 | EV      |  1454777 |          237698 | 0.16 |
    |       2006 |       2009 | XE      |  1016010 |          152431 | 0.15 |
    |       2006 |       2009 | YV      |   740608 |          110389 | 0.15 |
    |       2003 |       2009 | B6      |   683874 |          103677 | 0.15 |
    |       2003 |       2009 | FL      |  1082489 |          158748 | 0.15 |
    ...
    +------------+------------+---------+----------+-----------------+------+
    24 rows in set (15 min 56.40 sec)

Problem 2: potential solution
    #!/bin/bash
    # Run one query per carrier in parallel and collect all rows in one file.
    fn="./res$$.txt"
    for c in '9E' 'AA' 'AL' 'AQ' 'AS' 'B6' 'CO' 'DH' 'DL' 'EA' 'EV' 'F9' 'FL' \
             'HA' 'HP' 'ML' 'MQ' 'NW' 'OH' 'OO' 'PA' 'PI' 'PS' 'RU' 'TW' 'TZ' \
             'UA' 'US' 'WN' 'XE' 'YV'
    do
      sql="select min(yeard), max(yeard), Carrier, count(*) as cnt,
                  sum(arrdelayminutes>30) as flights_delayed,
                  round(sum(arrdelayminutes>30)/count(*),2) as rate
           FROM ontime
           WHERE DayOfWeek not in (6,7)
             and OriginState not in ('AK', 'HI', 'PR', 'VI')
             and DestState not in ('AK', 'HI', 'PR', 'VI')
             and flightdate < '2010-01-01'
             and carrier = '$c'"
      mysql -uroot ontime -e "$sql" >> "$fn" &   # each query runs in the background
    done
    wait                    # block until every background query has finished
    sort -n "$fn" | uniq    # sort the rows and collapse the repeated header lines

Problem 2: potential solution
    $ time ./airline_par.sh > /airline_par_res.txt

    real    8m13.323s
    user    0m0.064s
    sys     0m0.068s
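As a quick check of what the split bought, the two wall-clock times reported in the slides (15 min 56.40 sec for the single query, 8m13.323s for the parallel script) can be compared directly. The helper names `to_seconds` and `speedup` below are made up for this sketch.

```shell
#!/bin/bash
# Convert "minutes seconds" to seconds, then divide serial time by parallel time.
to_seconds() { awk -v m="$1" -v s="$2" 'BEGIN { printf "%.3f\n", m * 60 + s }'; }
speedup()    { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'; }

serial=$(to_seconds 15 56.40)     # single GROUP BY query
parallel=$(to_seconds 8 13.323)   # one query per carrier, run concurrently
speedup "$serial" "$parallel"     # prints 1.94
```

So the parallel run finishes in roughly half the time, which is a real gain but well short of one core's worth of speedup per concurrent query.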

Problem 2: potential solution
    $ head airline_par_res.txt
    min(yeard)  max(yeard)  Carrier  cnt      flights_delayed  rate
    1988        1988        AL       265654   26291            0.10
    1988        1988        PS       32052    1367             0.04
    1988        1989        PI       551858   56122            0.10
    1988        1990        EA       579546   55616            0.10
    1988        1991        PA       206841   19465            0.09
    1988        2001        TW       2659963  280741           0.11
    1988        2005        HP       2607603  235675           0.09
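The per-carrier output is sorted by its first column and includes every carrier, whereas the original report query kept only rows with `cnt > 100000` and `max(yeard) > 1990` and ordered them by `rate` descending. One way to restore that filtering and ordering after the parallel run is a small post-processing step. This is a sketch under assumptions: the function name `filter_and_rank` is hypothetical, and it expects whitespace-separated rows in the column order shown by `head`.

```shell
#!/bin/bash
# Columns: min(yeard) max(yeard) Carrier cnt flights_delayed rate
filter_and_rank() {
  # "+0" forces numeric comparison, so header lines (non-numeric fields) are dropped.
  awk '$4 + 0 > 100000 && $2 + 0 > 1990' |
    LC_ALL=C sort -k6,6nr     # highest delay rate first
}

# Example with three rows taken from the output above plus the header line.
printf '%s\n' \
  "min(yeard) max(yeard) Carrier cnt flights_delayed rate" \
  "1988 1988 AL 265654 26291 0.10" \
  "1988 1991 PA 206841 19465 0.09" \
  "1988 2001 TW 2659963 280741 0.11" | filter_and_rank
```

In this example AL is dropped because its last year is 1988, and TW is listed before PA because its rate is higher.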

    Cpu0 : 14.1%us, 1.7%sy, 0.0%ni, 79.5%id, 4.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu1 : 11.9%us, 3.9%sy, 0.0%ni, 82.1%id, 2.1%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu2 : 14.7%us, 1.3%sy, 0.0%ni, 80.0%id, 4.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu3 : 15.5%us, 1.7%sy, 0.0%ni, 79.1%id, 3.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu4 : 14.9%us, 1.3%sy, 0.0%ni, 81.5%id, 2.3%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu5 : 17.4%us, 2.0%sy, 0.0%ni, 76.6%id, 4.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu6 : 13.2%us, 1.3%sy, 0.0%ni, 84.1%id, 1.3%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu7 : 11.7%us, 1.3%sy, 0.0%ni, 84.3%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu8 : 16.8%us, 1.7%sy, 0.0%ni, 78.8%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu9 : 16.1%us, 2.0%sy, 0.0%ni, 79.5%id, 2.3%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu10 : 15.9%us, 1.7%sy, 0.0%ni, 79.7%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu11 : 18.3%us, 2.0%sy, 0.0%ni, 77.0%id, 2.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu12 : 8.3%us, 1.7%sy, 0.0%ni, 89.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu13 : 7.6%us, 1.3%sy, 0.0%ni, 90.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu14 : 6.6%us, 0.3%sy, 0.0%ni, 92.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu15 : 8.6%us, 1.3%sy, 0.0%ni, 89.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu16 : 15.0%us, 0.3%sy, 0.0%ni, 84.3%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu17 : 2.7%us, 0.3%sy, 0.0%ni, 95.7%id, 1.3%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu18 : 24.1%us, 1.3%sy, 0.0%ni, 74.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu19 : 11.4%us, 0.3%sy, 0.0%ni, 87.3%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu20 : 7.4%us, 1.0%sy, 0.0%ni, 90.6%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu21 : 13.3%us, 0.7%sy, 0.0%ni, 85.4%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu22 : 6.3%us, 0.7%sy, 0.0%ni, 92.1%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st
    Cpu23 : 6.3%us, 1.7%sy, 0.0%ni, 91.0%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st

Problem 3: Asynchronous PHP
    # Run queries in parallel:
    foreach ($all_links as $linkid => $link) {
        $link->query("select something FROM tablen WHERE ...", MYSQLI_ASYNC);
    }
    $processed = 0;
    do {
        $links = $errors = $reject = array();
        foreach ($all_links as $link) {
            $links[] = $errors[] = $reject[] = $link;
        }
        # loop to wait on results
        if (!mysqli_poll($links, $errors, $reject, 60)) {
            continue;
        }
        foreach ($links as $k => $link) {
            if ($result = $link->reap_async_query()) {
                $res = $result->fetch_row();
                # Handle returned result
                mysqli_free_result($result);
            } else {
                die(sprintf("mysqli Error: %s", mysqli_error($link)));
            }
            $processed++;
        }
    } while ($processed < count($all_links));

Problem 3: Asynchronous PHP
http:///blog/2013/03/06/accessing-xtradb-cluster-nodes-in-parallel-from-php-using-mysql-asynchronous-calls/

Questions?
Thank you!
Blog: http://www.arubin.org