Modeling Airline Fares
Evidence from the U.S. Domestic Airline Sector
Domingo Acedo Gomez, Arturs Lukjanovics, Joris van den Berg
31 January 2014

Motivation and Main Findings
Which Factors Influence Fares?
- Distance
- Competition
- Seasonality
- Carrier
- Ticket class
- Economic situation
- Total passengers
- Hubs
Our Model Results
- 22 explanatory factors included
- The model explains about 50% of fare variation (R-squared 0.499)
- Adding Southwest raises this to about 55% (R-squared 0.546)
ENAC Modeling Airline Fares 31 January 2014 2 / 16

DB1B Origin & Destination Database
Itinerary ID 200911307517
Main Information
- Coupons
- Carrier
- Breaks
- Itinerary fare ($)
- Fare class

A Look Inside the Database

Reducing the Database
Which tickets do we keep for our study?
1. Round trip
2. Two or four coupons
3. Single carrier
4. No extreme fares
5. Economy class
6. Main majors & low cost carriers
7. Regular routes
8. Lower 48 states

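The eight selection rules above can be sketched as a single predicate. This is a minimal illustration only: the field names, the carrier list, and the fare bounds are hypothetical and are not the actual DB1B column names or the cut-offs used in the study.

```python
from dataclasses import dataclass

# All field names and thresholds here are illustrative assumptions,
# not the real DB1B schema or the study's actual cut-offs.
@dataclass
class Ticket:
    round_trip: bool
    coupons: int
    carriers: tuple      # carrier code of each coupon
    fare: float          # itinerary fare in USD
    fare_class: str      # e.g. "economy", "business"
    regular_route: bool
    lower_48: bool

KEPT_CARRIERS = {"AA", "DL", "UA", "US", "NW", "AS", "FL", "B6", "WN"}  # assumed list
MIN_FARE, MAX_FARE = 50.0, 2000.0  # assumed bounds for "no extreme fares"

def keep(t: Ticket) -> bool:
    """Return True if the ticket passes all eight selection rules."""
    return (
        t.round_trip                            # 1. round trip
        and t.coupons in (2, 4)                 # 2. two or four coupons
        and len(set(t.carriers)) == 1           # 3. single carrier
        and MIN_FARE <= t.fare <= MAX_FARE      # 4. no extreme fares
        and t.fare_class == "economy"           # 5. economy class
        and t.carriers[0] in KEPT_CARRIERS      # 6. majors and low cost carriers
        and t.regular_route                     # 7. regular routes
        and t.lower_48                          # 8. lower 48 states
    )
```

Keeping every rule in one function makes it easy to apply the same filter to each chunk of a large file.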
Data Flow

Final Dataset
128,192 Observations
- 65% Direct Flights
- 35% Indirect Flights

Descriptive Statistics

Descriptive Statistics

Model Results
Dependent Variable: ln(avgweightedfare)
Method: Least Squares
Sample (adjusted): 1 128191
Included observations: 128191 after adjustments

Variable                        Coefficient  Std. Error  t-statistic   Prob.
C                                  42.99172    0.338122     127.1484  0.0000
AVERAGEROUTEPOPULATION/1000000     0.007107    0.000342     20.79077  0.0000
TOTALPAX/1000                     -0.099927    0.001948    -51.29007  0.0000
(TOTALPAX/1000)^2                  0.008593    0.000337     25.49255  0.0000
DISTANCE/1000                      0.610302    0.005283     115.5298  0.0000
(DISTANCE/1000)^2                 -0.107413    0.002108    -50.94444  0.0000
H-INDEX/0.1                        0.012098    0.000351     34.46632  0.0000
CLASSRATIO                         0.752086    0.033692     22.32214  0.0000
CLASSRATIO^2                      -1.165174    0.023031    -50.59143  0.0000
FREQAIRPORT=1                      0.096143    0.002240     42.92069  0.0000
DIRECT=1                          -0.019687    0.002006    -9.815180  0.0000
CARRIER=AS                        -0.271279    0.005801    -46.76208  0.0000
CARRIER=FL                        -0.331333    0.004582    -72.30864  0.0000
CARRIER=AA                        -0.044150    0.003071    -14.37402  0.0000
CARRIER=DL                        -0.026154    0.003019    -8.664515  0.0000
CARRIER=NW                         0.115003    0.003405     33.77637  0.0000
CARRIER=UA                         0.026226    0.003207     8.178843  0.0000
CARRIER=US                        -0.063643    0.003238    -19.65542  0.0000
CARRIER=9E                        -0.216606    0.270274    -0.801433  0.4229
CARRIER=B6                        -0.543833    0.048076    -11.31193  0.0000
CARRIER=WN                        -0.168659    0.005076    -33.22809  0.0000
CARRIER=EV                         0.016828    0.035035     0.480301  0.6310
YEAR                              -0.018556    0.000169    -109.6910  0.0000

R-squared             0.499430    Mean dependent var       6.056327
Adjusted R-squared    0.499344    S.D. dependent var       0.381934
S.E. of regression    0.270246    Akaike info criterion    0.221208
Sum squared resid     9360.448    Schwarz criterion        0.222959
Log likelihood       -14155.42    Hannan-Quinn criter.     0.221733
F-statistic           5812.541    Durbin-Watson stat       1.858848
Prob(F-statistic)     0.000000

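Distance and total passengers each enter the model with a linear and a squared term, so their effect on ln(fare) changes sign at x* = -b1 / (2 * b2). The sketch below computes that turning point from the coefficients in the table above. The helper name is hypothetical, and the results are in the dataset's own units, which the slides do not state.

```python
def turning_point(b_linear: float, b_squared: float, scale: float = 1000.0) -> float:
    """Value of x where b_linear*(x/scale) + b_squared*(x/scale)**2 stops
    rising (or falling), i.e. where its derivative is zero."""
    return -b_linear / (2.0 * b_squared) * scale

# Coefficients copied from the Model Results table.
distance_peak = turning_point(0.610302, -0.107413)   # DISTANCE/1000 terms
pax_trough = turning_point(-0.099927, 0.008593)      # TOTALPAX/1000 terms

print(f"ln(fare) stops rising with distance at about {distance_peak:.0f}")
print(f"ln(fare) stops falling with passengers at about {pax_trough:.0f}")
```

With these coefficients the distance effect peaks at roughly 2,840 distance units, and the passenger effect bottoms out at roughly 5,800 passengers.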
Model Performance
Predicted fare 1993-2010

Including Southwest
Dependent Variable: ln(avgweightedfare)
Method: Least Squares
Sample (adjusted): 1 148125
Included observations: 148054 after adjustments

Variable                        Coefficient  Std. Error  t-statistic   Prob.
C                                  35.42660    0.318410     111.2608  0.0000
AVERAGEROUTEPOPULATION/1000000     0.009141    0.000317     28.81292  0.0000
TOTALPAX/1000                     -0.105992    0.001664    -63.69786  0.0000
(TOTALPAX/1000)^2                  0.007886    0.000262     30.04379  0.0000
DISTANCE/1000                      0.675759    0.004947     136.6076  0.0000
(DISTANCE/1000)^2                 -0.132959    0.001977    -67.25279  0.0000
H-INDEX/0.1                        0.015500    0.000325     47.64406  0.0000
CLASSRATIO                         0.865604    0.030551     28.33299  0.0000
CLASSRATIO^2                      -1.108471    0.021028    -52.71497  0.0000
FREQAIRPORT=1                      0.109104    0.002021     53.98739  0.0000
DIRECT                            -0.037399    0.001863    -20.07095  0.0000
CARRIER=AS                        -0.250315    0.005753    -43.51145  0.0000
CARRIER=FL                        -0.329620    0.004549    -72.45734  0.0000
CARRIER=AA                        -0.036666    0.003045    -12.04244  0.0000
CARRIER=DL                        -0.023884    0.002990    -7.986586  0.0000
CARRIER=NW                         0.116872    0.003378     34.59792  0.0000
CARRIER=UA                         0.031257    0.003181     9.827048  0.0000
CARRIER=US                        -0.034285    0.003193    -10.73724  0.0000
CARRIER=9E                        -0.186904    0.268389    -0.696391  0.4862
CARRIER=B6                        -0.457841    0.047695    -9.599321  0.0000
CARRIER=WN                        -0.438353    0.003345    -131.0549  0.0000
CARRIER=EV                         0.070191    0.034784     2.017924  0.0436
YEAR                              -0.014885    0.000159    -93.48576  0.0000

R-squared             0.545766    Mean dependent var       5.999173
Adjusted R-squared    0.545698    S.D. dependent var       0.398153
S.E. of regression    0.268363    Akaike info criterion    0.207202
Sum squared resid     10660.99    Schwarz criterion        0.208741
Log likelihood       -15315.57    Hannan-Quinn criter.     0.207661
F-statistic           8084.567    Durbin-Watson stat       1.835690
Prob(F-statistic)     0.000000

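Because the dependent variable is ln(fare), a dummy coefficient b corresponds to a fare change of 100 * (exp(b) - 1) percent relative to the omitted baseline category, holding the other regressors fixed. The sketch below applies this standard conversion to two coefficients from the table above. The helper name is hypothetical, and the slides do not say which carrier is the baseline.

```python
import math

def dummy_effect_pct(coef: float) -> float:
    """Percentage change in fare implied by a dummy coefficient
    in a model whose dependent variable is ln(fare)."""
    return 100.0 * (math.exp(coef) - 1.0)

# Coefficients copied from the "Including Southwest" table.
wn_effect = dummy_effect_pct(-0.438353)    # CARRIER=WN
freq_effect = dummy_effect_pct(0.109104)   # FREQAIRPORT=1

print(f"Southwest vs. baseline carrier: {wn_effect:.1f}%")
print(f"Frequent airport vs. other:     {freq_effect:.1f}%")
```

Under this reading, Southwest fares are roughly 35% below the baseline carrier, and routes touching a frequent airport are roughly 11 to 12% more expensive.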
Model Performance with Southwest
Predicted fare 1993-2010

Forecasting
Southwest enters a new route!
LAS - ORD (Las Vegas - Chicago)
- Population: 4,296,645
- B737-700: 140 PAX x 2 per week x 13 weeks = 3,640 (364)
- Currently 4 carriers
- Distance: 1,600 NM
- 90% restricted class tickets
- Direct flight
- LAS is a frequent airport

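The route inputs above can be plugged into the "Including Southwest" regression to get a point forecast. The following is a rough sketch, not the authors' procedure. The coefficients come from that table, but several inputs are assumptions the slides do not give: the H-index value, the forecast year, the reading of 3,640 as TOTALPAX, the reading of the 90% restricted share as CLASSRATIO, and the use of 1,600 as the model's distance unit. Other carrier dummies are left at zero.

```python
import math

# Coefficients copied from the "Including Southwest" regression table.
COEF = {
    "const": 35.42660, "pop_m": 0.009141,
    "pax_k": -0.105992, "pax_k_sq": 0.007886,
    "dist_k": 0.675759, "dist_k_sq": -0.132959,
    "h_tenths": 0.015500,
    "class_ratio": 0.865604, "class_ratio_sq": -1.108471,
    "freq_airport": 0.109104, "direct": -0.037399,
    "carrier_wn": -0.438353, "year": -0.014885,
}

def predict_fare(population, total_pax, distance, h_index, class_ratio,
                 freq_airport, direct, southwest, year):
    """Point forecast of the fare: exp of the fitted ln(fare)."""
    pop_m = population / 1_000_000
    pax_k = total_pax / 1000
    dist_k = distance / 1000
    ln_fare = (
        COEF["const"]
        + COEF["pop_m"] * pop_m
        + COEF["pax_k"] * pax_k + COEF["pax_k_sq"] * pax_k ** 2
        + COEF["dist_k"] * dist_k + COEF["dist_k_sq"] * dist_k ** 2
        + COEF["h_tenths"] * (h_index / 0.1)
        + COEF["class_ratio"] * class_ratio
        + COEF["class_ratio_sq"] * class_ratio ** 2
        + COEF["freq_airport"] * freq_airport
        + COEF["direct"] * direct
        + COEF["carrier_wn"] * southwest
        + COEF["year"] * year
    )
    return math.exp(ln_fare)

fare = predict_fare(
    population=4_296_645,
    total_pax=3_640,     # assumption: slide figure used as TOTALPAX
    distance=1_600,      # assumption: same unit as the model's DISTANCE
    h_index=0.25,        # assumption: four equal-share carriers
    class_ratio=0.90,    # assumption: restricted share is CLASSRATIO
    freq_airport=1, direct=1, southwest=1,
    year=2014,           # assumption: forecast year not given on the slide
)
print(f"Predicted fare: ${fare:.0f}")
```

Taking exp of the fitted log is a simple back-transformation. It ignores the retransformation bias of log-linear models, so the figure is indicative only.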
A Reality Check (Booking LAS - ORD)

Conclusion
We...
- Processed 122 GB of DB1B data with Python
- Constructed an econometric model with 22 variables
- Were able to capture 55% of the observed fare variation