Cross-Language Question Answering by Multiple Automatic Translations

Sabatino Larosa, Stefano Rovetta
Dip. Informatica e Scienze dell'Informazione, Università di Genova, Italy
2000s036@educ.disi.unige.it, ste@disi.unige.it

Paolo Rosso
Dpto. Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Spain
prosso@dsic.upv.es

Manuel Montes-y-Gomez
Laboratorio de Tecnologías de Lenguaje, Inst. Nac. de Astrofísica, Óptica y Electrónica, Mexico
mmontesg@inaoep.mx

Multilingual QA Systems (MQAS) allow the user to get an answer by searching documents written in a language different from the one used in the query, in order to exploit the redundancy of documents on the Web. An important step for an MQAS is the translation of the question from the source language to the destination language. At the moment, the majority of QA systems use online translators. The quality of these translators is often not very good, and this has a negative impact on the efficiency of the QA system.

Objectives

We focus on the problem of selecting the best translation when more than one translator is used. The two methods we propose (Word-Count and Double Translation) are entirely statistical and therefore language independent. We concentrate on translation from Italian to Spanish, because more documents on the Web are written in Spanish than in Italian. The two methods were each implemented with two formulas: DICE and COSINE.

Word-Count with Dice formula

This method exploits the redundancy of terms across all the translations. The translation with the highest number of words in common with the others is chosen. To find the number of common words, the intersection of the Spanish translations is taken into account.

Example of a question translated with four different translators:

Original: Che cosa significa la sigla CEE? (What does the abbreviation EEC mean?)

1. Qué significa la sigla CEE?
2. Qué cosa significa siglas el EEC?
3. Qué significa la CEE de la abreviación?
4. Qué cosa significa la pone la sigla CEE?

The Dice formula is used to establish the degree of similarity among the translations and to create a hierarchy that exploits the information they have in common:

\[ \mathrm{Sim}(t_i, t_j) = \frac{2 \cdot \mathrm{len}(t_i \cap t_j)}{\mathrm{len}(t_i) + \mathrm{len}(t_j)} \]

where $t_i$ and $t_j$ are the translations under consideration, $\mathrm{len}(t_i \cap t_j)$ is the size of their intersection (the number of words in common), and $\mathrm{len}(t_i)$ and $\mathrm{len}(t_j)$ are the number of words in each translation.

For instance, the similarity grade of the first translation is $\mathrm{Sim}_{12} + \mathrm{Sim}_{13} + \mathrm{Sim}_{14}$.
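The Word-Count selection with the Dice formula can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `tokenize` helper, the lowercasing and punctuation stripping, and counting distinct words only are assumptions made here, and all function names are hypothetical.

```python
def tokenize(text):
    """Lowercase and strip surrounding punctuation (a simplifying assumption)."""
    words = (w.strip("¿?¡!.,;:()").lower() for w in text.split())
    return [w for w in words if w]


def dice(tokens_a, tokens_b):
    """Dice coefficient computed on the sets of distinct words (assumption)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def word_count_scores(translations):
    """Score each translation by its summed Dice similarity to all the others."""
    tokens = [tokenize(t) for t in translations]
    return [
        sum(dice(tokens[i], tokens[j]) for j in range(len(tokens)) if j != i)
        for i in range(len(tokens))
    ]


# The four Spanish translations from the example slide.
translations = [
    "Qué significa la sigla CEE?",
    "Qué cosa significa siglas el EEC?",
    "Qué significa la CEE de la abreviación?",
    "Qué cosa significa la pone la sigla CEE?",
]
scores = word_count_scores(translations)
best = max(range(len(scores)), key=scores.__getitem__)  # index of the chosen translation
print(best + 1, [round(s, 3) for s in scores])
```

Under these tokenization assumptions the first translation obtains the highest summed similarity, narrowly ahead of the fourth.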

To increase the accuracy in the choice of the best translation, N-Grams are used, up to 3-Grams.

Example of the 2-Grams of the phrase "Qué significa la sigla CEE?" (What does the abbreviation EEC mean?):

Qué significa
significa la
la sigla
sigla CEE

N-Grams are very useful when translations are formed by the same words in a different order.
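A small sketch of N-Gram extraction, and of why it helps when two translations contain the same words in a different order. The reordered sentence is invented for illustration, and how the slides combine 1-, 2- and 3-Grams into one score is not stated, so this only compares the Dice value per N-Gram size.

```python
def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def dice(items_a, items_b):
    """Dice coefficient on the sets of distinct items (words or n-grams)."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


original = "qué significa la sigla cee".split()
reordered = "la sigla cee significa qué".split()  # made-up reordering of the same words

bigrams = ngrams(original, 2)
unigram_sim = dice(original, reordered)                        # identical word sets
bigram_sim = dice(ngrams(original, 2), ngrams(reordered, 2))   # word order now matters
print(bigrams, unigram_sim, bigram_sim)
```

With single words the two sentences are indistinguishable, while with 2-Grams only "la sigla" and "sigla cee" are shared, so the similarity drops.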

Word-Count with Cosine formula

The cosine formula is used to calculate the degree of similarity. The translations are represented as vectors in a t-dimensional space, and the keyword weights are calculated with the TermFrequency-InverseDocumentFrequency (tf-idf) scheme.

Example:

Original: Qual è la capitale della Repubblica del Sud Africa? (What is the capital of the Republic of South Africa?)

1. Cuál es la capital de la República de la Sur África?
2. Cuál es entendido ellos de la república de la África del sur?
3. Cuál es la capital de la República del Sur una Africa?
4. Cuál es el capital de la república del sur Africa?

All words that occur in the translations are considered keywords ($k_i$), each taken only once, without repetition.

List of keywords: cuál, es, la, capital, de, república, sur, áfrica, entendido, ellos, del, una, africa, el

To calculate the weights for every translation, the following formula is used:

\[ f(i,j) \cdot \log\left(1 + \frac{n_i}{N}\right) \]

where $N$ is the total number of translations, $n_i$ is the number of documents (translations) that contain $k_i$, and

\[ f(i,j) = \frac{\mathrm{freq}(i,j)}{\max_{l} \mathrm{freq}(l,j)} \]

is the frequency of keyword $k_i$ in translation $j$, normalized with respect to the maximum frequency over all the keywords of that translation.
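The weighting step can be sketched as below. This assumes the weight reads f(i,j) * log(1 + n_i / N), with n_i over N as the flattened formula appears to be ordered, and a natural logarithm, since the base is not stated. The tokenizer and function names are hypothetical, and the sketch is not expected to reproduce the numeric vector shown on the slides.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and drop the question mark (a simplifying assumption)."""
    return text.lower().replace("?", "").split()


def keyword_weights(translations):
    """Return (keywords, one weight vector per translation)."""
    token_lists = [tokenize(t) for t in translations]
    total = len(token_lists)  # N: total number of translations

    # Keyword list: every word once, in order of first appearance.
    keywords = []
    for tokens in token_lists:
        for word in tokens:
            if word not in keywords:
                keywords.append(word)

    # n_i: number of translations that contain keyword k_i.
    doc_freq = {k: sum(k in tokens for tokens in token_lists) for k in keywords}

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        max_freq = max(counts.values(), default=1)
        vectors.append([
            (counts[k] / max_freq) * math.log(1 + doc_freq[k] / total)
            for k in keywords
        ])
    return keywords, vectors


translations = [
    "Cuál es la capital de la República de la Sur África?",
    "Cuál es entendido ellos de la república de la África del sur?",
    "Cuál es la capital de la República del Sur una Africa?",
    "Cuál es el capital de la república del sur Africa?",
]
keywords, vectors = keyword_weights(translations)
print(keywords)
```

With this tokenization the keyword list comes out in the same order as the list given on the slide.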

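For every translation, a vector containing the weight associated with each keyword is obtained:

T1: [1.33, 4.0, 0.62, 1.33, 0.35, 0.93, 0.50, ..., 0.30]

Once the vectors have been found, the next step is to calculate the degree of similarity among the translations with the following formula:

\[ \mathrm{Sim}(t_j, t_q) = \frac{\sum_{i} t_{ij} \cdot t_{iq}}{\sqrt{\sum_{i} t_{ij}^{2}} \cdot \sqrt{\sum_{i} t_{iq}^{2}}} \]

where $t_{ij}$ is the weight of keyword $k_i$ in translation $t_j$.

The final calculation is performed in this way:

Tran1 = Sim(t_1, t_2) + Sim(t_1, t_3) + Sim(t_1, t_4)
Tran2 = Sim(t_2, t_1) + Sim(t_2, t_3) + Sim(t_2, t_4)
Tran3 = Sim(t_3, t_1) + Sim(t_3, t_2) + Sim(t_3, t_4)
Tran4 = Sim(t_4, t_1) + Sim(t_4, t_2) + Sim(t_4, t_3)

The translation with the highest value is chosen.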
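The cosine step and the final selection can be sketched as follows. The three toy vectors are made up for illustration and are not weights from the slides; the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two weight vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def select_translation(vectors):
    """Sum each vector's similarity to all the others; return (best index, totals)."""
    totals = [
        sum(cosine(vectors[i], vectors[j]) for j in range(len(vectors)) if j != i)
        for i in range(len(vectors))
    ]
    best = max(range(len(totals)), key=totals.__getitem__)
    return best, totals


# Toy weight vectors (illustrative only).
vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
best, totals = select_translation(vectors)
print(best, [round(t, 3) for t in totals])
```

Here the second vector is chosen because it overlaps with both of the others, which mirrors the idea of rewarding the translation that shares the most with the rest.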

Double Translation Method

Every question in Italian is translated into Spanish and then retranslated back into Italian. Four translators are used, and the translation whose retranslation is most similar to the original question is chosen.

With the Dice formula, the algorithm differs from the previous one in the intersection step: instead of intersecting the translations with one another, we intersect the original question with the retranslated question.

With the Cosine formula, the difference from the previous method is that the list of keywords also includes the original question. For the original question we use this formula:

\[ \bigl(0.5 + 0.5 \cdot f(i,j)\bigr) \cdot \log\left(1 + \frac{n_i}{N}\right) \]
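A minimal sketch of the Double Translation selection with the Dice formula, plus the weight used for the original question in the Cosine variant. The back-translations below are invented for illustration, the helper names are hypothetical, and the weight assumes the formula reads (0.5 + 0.5 * f) * log(1 + n / N) with a natural logarithm.

```python
import math


def tokenize(text):
    """Lowercase and strip surrounding punctuation (a simplifying assumption)."""
    words = (w.strip("¿?¡!.,;:()").lower() for w in text.split())
    return [w for w in words if w]


def dice(tokens_a, tokens_b):
    """Dice coefficient on the sets of distinct words."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def select_by_double_translation(original, back_translations):
    """Pick the translator whose back-translation is closest to the original."""
    reference = tokenize(original)
    scores = [dice(reference, tokenize(b)) for b in back_translations]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


def query_weight(f, n, total):
    """Weight of a keyword in the original question (assumed reading of the formula)."""
    return (0.5 + 0.5 * f) * math.log(1 + n / total)


original = "Che cosa significa la sigla CEE?"
# Hypothetical Italian back-translations, one per translator.
back_translations = [
    "Cosa significa la sigla CEE?",
    "Che cosa significa sigle il EEC?",
    "Che significa la CEE della abbreviazione?",
]
best, scores = select_by_double_translation(original, back_translations)
print(best, [round(s, 3) for s in scores])
```

The index returned identifies the translator; its Spanish translation would then be the one passed on to the QA system.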

Results

We translated 450 factual questions from the CLEF 2003 competition. These questions were translated with 4 different translators.

           WC with Dice   DT with Dice   WC with Cos   DT with Cos
1-Gram     51.33%         46.66%         48.66%        45.77%
2-Grams    51.11%         49.11%         49.33%        48.44%
3-Grams    51.55%         50.22%         50.00%        49.11%

(WC = Word-Count, DT = Double Translation)

The table shows the percentage of success obtained with the different translators when applying the techniques explained previously.

Question categories: Date, Person, Organization, Location, Measure

WcDice1-G: 46%, 59%, 58%
WcDice2-G: 58%
DtDice2-G: 61%
DtDice3-G: 61%, 64%
DtCos3-G: 61%
Baseline: 70% (Date), 64% (Person), 42% (Organization), 72% (Location), 40% (Measure)

The table shows the percentage of success for each category of question.

Conclusions and Further Work

A prototype of the translation module for a Multilingual QA System was proposed.

We have observed that some translators produce bad translations, probably because two of them need an intermediate translation into English to obtain the final Spanish translation. There are some cases where bad redundancy penalizes the selection of the best translation.

The machine translator that obtained the best results is PowerTranslatorPro (55.33%). This baseline was better than our best result (51.55%), which was obtained with the Word-Count method.

The preliminary results seem promising. An optimal combination of the Word-Count and Double Translation methods could increase the percentage of success. We estimate that an increase of up to approximately 20% should be possible, because the choices made by the two methods are not the same.

Further experiments are needed to improve the quality of the translations. The use of other translators is foreseen. We also need further experiments with other sets of factual questions, to compare them with the preliminary results we obtained.