Anomaly Detection in airlines schedules. Asmaa Fillatre Data Scientist, Amadeus

Transcription:

Anomaly Detection in airlines schedules Asmaa Fillatre Data Scientist, Amadeus

AMADEUS PRESENTATION
1. IT company that develops business solutions for the travel and tourism industry
2. Operates globally in the travel and technology market
Travel buyers: consumers / general public; corporate travel departments
Travel providers: 711 airlines (over 420 bookable); 24 insurance companies; 50+ cruise and ferry lines; 207 tour operators; 110,000+ hotel properties; 30 car rental companies; 95 railways
IT SOLUTIONS: including direct distribution technology
Common / overlapping platforms & applications; common data centre; common customers; common sales & marketing infrastructure
DISTRIBUTION BUSINESS: provision of indirect distribution services
Travel agencies; travel management companies; business travel agencies; leisure travel agencies; online travel agencies; consolidators; single-site agency; travel search companies
Airline sales offices and airline websites connected to Amadeus direct sell technology

1 Airline Schedules

Airline schedules
Flight Connections & Network Analysis
Market Capacity Evolution and Trend
Airport Hub Analysis
Market Competition Analysis

Airline schedules data
110,000 daily flights; one year = 40,150,000 flights
Flight schedules: flight number, departure time, arrival time, aircraft type, departure airport, arrival airport, airline code
Airlines: airline code, airline country
Airports: airport code, airport name, airport location (longitude, latitude)
Aircraft: aircraft code, aircraft capacity

Motivations
1. Airline schedules contain many errors.
2. It is important to identify outliers prior to modelling and analysis.
3. Anomalies should be detected automatically.
4. The approach must overcome the lack of prior knowledge (no ground truth).

Anomalies examples (1)
Airlines use wrong IATA airport codes
Airlines missing
Merger between two companies
Flown distance much higher than the aircraft average
Elapsed time/distance not appropriate
New routes traffic
Sports event (Olympic Games, FIFA World Cup, etc.)
Sudden growth in monthly aircraft capacity for United Airlines
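
One of the examples above, "elapsed time/distance not appropriate", can be illustrated with a simple plausibility check. The sketch below is not the method used in the talk: it assumes airport coordinates are available (as in the airports table), computes the great-circle distance with the haversine formula, and flags a flight whose implied average speed falls outside hypothetical bounds (`min_kmh`, `max_kmh` are illustrative values, not taken from the slides).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_suspicious(distance_km, elapsed_minutes, min_kmh=150.0, max_kmh=1100.0):
    """Flag a flight whose implied average speed is implausible.

    The speed bounds are hypothetical and would need tuning by a domain expert.
    """
    if elapsed_minutes <= 0:
        return True
    speed_kmh = distance_km / (elapsed_minutes / 60.0)
    return not (min_kmh <= speed_kmh <= max_kmh)


# Example: Paris CDG to New York JFK (approximate coordinates).
cdg_jfk_km = great_circle_km(49.0097, 2.5479, 40.6413, -73.7781)
plausible = is_suspicious(cdg_jfk_km, elapsed_minutes=480)   # about 8 hours
implausible = is_suspicious(cdg_jfk_km, elapsed_minutes=60)  # 1 hour
```

A wrong IATA airport code typically shows up in the same check, because the distance computed from the wrong airport no longer matches the scheduled elapsed time.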

Anomalies examples (2)

4 Unsupervised Anomaly detection Goal: Process unlabelled data and detect anomalies

Machine learning
Supervised: labelled data (normal/abnormal); direct feedback; predict outcome/future
Unsupervised: no labels; no feedback; find hidden structure
Semi-supervised: some labelled data; supervised learning + additional unlabelled data, or unsupervised learning + additional labelled data

Residuals-based anomaly detection in three steps
Input data -> Low-rank approximation -> Residuals generation -> Anomaly detection
[Figure: plots for each step, with example panels labelled "No Anomaly" and "Anomaly"]
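
The first two steps can be sketched as follows. The slide does not say which low-rank approximation is used, so this example assumes a truncated SVD on synthetic data; the function name `low_rank_residuals` and all values are illustrative.

```python
import numpy as np


def low_rank_residuals(X, rank):
    """Return (approximation, residuals) from a truncated SVD of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_hat = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    return X_hat, X - X_hat


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)

# 20 synthetic series sharing one smooth pattern (rank-1 structure) plus noise.
scales = rng.uniform(0.5, 1.5, size=(20, 1))
X = scales * (12.0 + 6.0 * np.sin(2 * np.pi * t))
X = X + 0.05 * rng.standard_normal(X.shape)
X[3, 25] += 8.0  # inject a single anomaly

X_hat, R = low_rank_residuals(X, rank=1)

# The largest residual should point at the injected anomaly.
row, col = np.unravel_index(np.argmax(np.abs(R)), R.shape)
```

Because the approximation captures the shared structure, the anomaly is left almost entirely in the residuals, which is what makes the later thresholding step workable.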

Residual and Anomaly Detection
Residual: $R_i = \text{Input} - \text{Reconstruction}$
Residual normalization: $Z_i = (R_i - \mu) / \sigma$
Residual thresholding (three-sigma rule): any data sample outside the interval $[\mu - 3\sigma, \mu + 3\sigma]$ is considered to be a potential anomaly
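
A minimal sketch of the normalization and thresholding, assuming μ and σ are the mean and standard deviation of the residuals themselves (the slide does not say how they are estimated). The function name and the synthetic residuals are illustrative.

```python
import numpy as np


def three_sigma_flags(residuals, k=3.0):
    """Normalize residuals to z-scores and flag those with |z| > k."""
    r = np.asarray(residuals, dtype=float)
    mu, sigma = r.mean(), r.std()
    if sigma == 0.0:
        # All residuals identical: nothing stands out.
        return np.zeros_like(r), np.zeros(r.shape, dtype=bool)
    z = (r - mu) / sigma
    return z, np.abs(z) > k


rng = np.random.default_rng(42)
residuals = rng.normal(0.0, 1.0, size=500)
residuals[100] = 12.0  # one injected anomaly

z, flags = three_sigma_flags(residuals)
anomalous_indices = np.flatnonzero(flags)
```

Flagging `|Z_i| > 3` is equivalent to testing whether `R_i` lies outside the interval from μ - 3σ to μ + 3σ. Large anomalies inflate σ, so a robust estimate (for example median and MAD) may be preferable in practice.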

6 Deep learning: Stacked Autoencoder Goal: Learn the internal structure and features of the data itself

Autoencoder
One hidden layer
Minimize $\lVert X - \hat{X} \rVert$ w.r.t. all $W_e^{(l)}$, $W_d^{(l)}$ and $b_e^{(l)}$, $b_d^{(l)}$
Trained with backpropagation
Self-supervised technique
Learns a meaningful representation of the data in some other dimensionality
Two stages: encoding and decoding
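
A minimal NumPy sketch of a one-hidden-layer autoencoder trained with backpropagation. The slide's encoding and decoding equations are not available, so this assumes a sigmoid encoder, a linear decoder and a mean squared error loss; layer sizes, learning rate and data are illustrative.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def train_autoencoder(X, n_hidden, lr=0.1, epochs=500, seed=0):
    """Gradient descent on L = 1/(2n) * sum_i ||x_hat_i - x_i||^2."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_e = rng.normal(0.0, 0.1, (d, n_hidden))
    b_e = np.zeros(n_hidden)
    W_d = rng.normal(0.0, 0.1, (n_hidden, d))
    b_d = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W_e + b_e)   # encoding
        X_hat = H @ W_d + b_d        # decoding (linear)
        err = X_hat - X
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Backpropagation of the loss gradient.
        g_out = err / n
        g_Wd, g_bd = H.T @ g_out, g_out.sum(axis=0)
        g_a = (g_out @ W_d.T) * H * (1.0 - H)
        g_We, g_be = X.T @ g_a, g_a.sum(axis=0)
        W_e -= lr * g_We
        b_e -= lr * g_be
        W_d -= lr * g_Wd
        b_d -= lr * g_bd
    return (W_e, b_e, W_d, b_d), losses


def reconstruction_errors(X, params):
    """Per-sample squared reconstruction error."""
    W_e, b_e, W_d, b_d = params
    X_hat = sigmoid(X @ W_e + b_e) @ W_d + b_d
    return np.sum((X - X_hat) ** 2, axis=1)


# Synthetic 8-dimensional data generated from 2 latent factors.
rng = np.random.default_rng(1)
Z = rng.uniform(0.0, 1.0, size=(200, 2))
A = rng.normal(0.0, 1.0, size=(2, 8))
X = sigmoid(Z @ A)

params, losses = train_autoencoder(X, n_hidden=3)
errors = reconstruction_errors(X, params)
```

The per-sample reconstruction error returned by `reconstruction_errors` is the quantity that plays the role of the residual in the anomaly detection steps.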

PCA: Input -> (W, b) -> Output
Autoencoder (introduces non-linearity): Input -> (W, b) -> (W, b) -> ... -> Output

Regularization
Deep autoencoder or stacked autoencoder
Cost function: average sum of squared error + weight decay + sparsity penalty
Weight decay: regularization by $\lambda$
Sparsity penalty: constraint on the activation $\hat{\rho}$, which should be close to $\rho$
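
The slide names the three terms of the cost but its formula is not available. The sketch below assumes the usual sparse-autoencoder formulation: a squared-error term, an L2 weight decay weighted by λ, and a Kullback-Leibler sparsity penalty weighted by β that pulls the average hidden activation towards a target ρ. All function and parameter names are illustrative.

```python
import numpy as np


def kl_bernoulli(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))


def sparse_autoencoder_cost(X, X_hat, H, weights, lam, beta, rho):
    """Average squared error + weight decay + sparsity penalty.

    X, X_hat : (m, d) inputs and reconstructions
    H        : (m, h) hidden activations in (0, 1)
    weights  : list of weight matrices (biases are not decayed)
    """
    m = X.shape[0]
    squared_error = 0.5 * np.sum((X_hat - X) ** 2) / m
    weight_decay = 0.5 * lam * sum(np.sum(W ** 2) for W in weights)
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1.0 - 1e-8)  # avoid log(0)
    sparsity = beta * np.sum(kl_bernoulli(rho, rho_hat))
    return squared_error + weight_decay + sparsity


# Perfect reconstruction, zero weights, activations exactly at the target.
X = np.ones((4, 3))
zero_weights = [np.zeros((3, 2)), np.zeros((2, 3))]
cost_at_target = sparse_autoencoder_cost(
    X, X, np.full((4, 2), 0.05), zero_weights, lam=1e-3, beta=3.0, rho=0.05)

# Same setting, but hidden units are active half of the time.
cost_dense = sparse_autoencoder_cost(
    X, X, np.full((4, 2), 0.5), zero_weights, lam=1e-3, beta=3.0, rho=0.05)
```

With perfect reconstruction and zero weights, only the sparsity term can be non-zero, so the second call isolates the penalty paid for dense activations.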

Stacked Autoencoder training
Training one hidden layer at a time
Example with 2 hidden layers

Hello world of deep learning: anomaly detection on MNIST with an autoencoder
[Figure: input images, learned features and output images; examples with the lowest and the highest reconstruction error]

7 Autoencoder based Anomaly detection for airlines schedules

Raw data: multivariate time series (airlines, R2R, number of flights, weeks)
Preprocessing: for more natural representations of the data, so that the autoencoder can learn some patterns
[Figure: some region-to-region (R2R) time series; y-axis: normalized number of flights; label: 2012 week 10]

Autoencoder for time series anomaly detection
Data preparation: TS preprocessing, data normalization
Data transformation: training set, testing set
Autoencoder configuration: layer sizes 175-100-50-100-175; weights W, B; hyperparameters β, λ and ρ
Train autoencoder -> reconstruction error thresholding -> outlier detection
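
The numbers on this slide read as layer sizes 175-100-50-100-175. Assuming that reading, and assuming sigmoid units (a natural fit for inputs normalized to [0, 1], but not stated in the slides), the sketch below shows the shape of the network and how a per-sample reconstruction error would be computed on a testing set. The weights here are random and untrained, and the test data is a placeholder.

```python
import numpy as np

LAYER_SIZES = [175, 100, 50, 100, 175]  # assumed reading of the slide


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def init_params(sizes, seed=0):
    """One (W, B) pair per layer transition, small random weights."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(X, params):
    """Return the activations of every layer, input included."""
    activations = [X]
    for W, B in params:
        activations.append(sigmoid(activations[-1] @ W + B))
    return activations


def reconstruction_error(X, params):
    """Mean squared reconstruction error per sample."""
    X_hat = forward(X, params)[-1]
    return np.mean((X - X_hat) ** 2, axis=1)


rng = np.random.default_rng(1)
X_test = rng.uniform(0.0, 1.0, size=(8, 175))  # placeholder testing set
params = init_params(LAYER_SIZES)              # untrained weights
layer_widths = [a.shape[1] for a in forward(X_test, params)]
errors = reconstruction_error(X_test, params)
```

After training, the `errors` vector is what the thresholding step would operate on to produce the outlier list.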

8 United Airlines (UA) schedules data processing Goal: highlight how the autoencoder performs in practice

UA anomaly detection (1): world normalization of input data
World min-max normalization
Input from UA, 2010 to 2016; autoencoder output; residuals (R2Ri)

UA anomaly detection (2): regional normalization of input data
Min-max normalization per region
Input from UA, 2010 to 2016; autoencoder output; residuals (R2Ri)
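
The difference between the two normalizations can be sketched on a toy matrix with one row per region-to-region series and one column per week. The data and function names are illustrative; the slides do not give the exact implementation.

```python
import numpy as np


def min_max(x, lo, hi):
    """Scale x to [0, 1] given its bounds; constant series map to 0."""
    span = hi - lo
    return (x - lo) / np.where(span == 0, 1.0, span)


def world_normalize(X):
    """One minimum and maximum shared by all series."""
    return min_max(X, X.min(), X.max())


def regional_normalize(X):
    """A separate minimum and maximum for each series (row)."""
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return min_max(X, lo, hi)


# Two series: a high-volume one and a low-volume one with a spike.
X = np.array([[1000.0, 1200.0, 1100.0, 1300.0],
              [10.0, 12.0, 40.0, 11.0]])

X_world = world_normalize(X)
X_region = regional_normalize(X)
```

Under world normalization the low-volume series is squeezed near zero, so its spike is barely visible; under per-region normalization each series spans the full [0, 1] range and the spike stands out.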

8 Air France (AF) Goal: highlight how the autoencoder performs in practice

AF anomaly detection (1): world normalization of input data
World min-max normalization
Input from AF, 2010 to 2016; autoencoder output; residuals (R2Ri)

AF anomaly detection (2): regional normalization of input data
Min-max normalization per region
Input from AF, 2010 to 2016; autoencoder output; residuals (R2Ri)

Autoencoder pros and cons Pros Cons

Conclusion
Unsupervised machine learning (no ground truth)
o Well adapted to the absence of labels
o Hard to interpret: the review process of outliers relies on domain experts
Deep learning/feature engineering

Thanks for your attention