Research
Publications
Refereed Papers
37) Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation
Raphael Tang, Xinyu Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture
arXiv (June 12, 2024).
[pdf]
36) "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time
Scott Rome, Tianwen Chen, Raphael Tang, Luwei Zhou, Ferhan Ture
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
[pdf]
35) Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models
Raphael Tang, Crystina Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture
In Proc. of Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).
[pdf]
34) What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
Raphael Tang, Xinyu Zhang, Jimmy Lin, Ferhan Ture
arXiv (Nov 30, 2023).
[pdf]
33) What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, Ferhan Ture
In Proc. of Association for Computational Linguistics (ACL 2023).
Best Paper Award
[pdf]
32) Simulating Humans at Scale to Evaluate Voice Interfaces for TVs: the Round-Trip System at Comcast
Breck Baldwin, Lauren Reese, Liming Zhang, Jan Neumann, Taylor Cassidy, Michael Pereira, G Craig Murray, Kishorekumar Sundararajan, Yidnekachew Endale, Pramod Kadagattor, Paul Wolfe, Brian Aiken, Tony Braskich, Donte Jiggetts, Adam Sloan, Esther Vaturi, Crystal Pender, and Ferhan Ture
In Proc. of Conference on Web Search and Data Mining (WSDM 2023).
31) Learning to Rank Instant Search Results with Multiple Indices: A Case Study in Search Aggregation for Entertainment
Scott Rome, Sardar Hamidian, Richard Walsh, Kevin Foley, and Ferhan Ture
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022).
[pdf]
30) Auto-annotation for Voice-enabled Entertainment Systems
Wenyan Li and Ferhan Ture
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020).
[pdf] [presentation]
29) Challenges and Opportunities in Understanding Spoken Queries Directed at Modern Entertainment Platforms.
Ferhan Ture, Jinfeng Rao, Raphael Tang, and Jimmy Lin
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019).
[pdf]
28) Yelling at Your TV: An Analysis of Speech Recognition Errors and Subsequent User Behavior on Entertainment Systems.
Raphael Tang, Ferhan Ture, and Jimmy Lin
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019).
[pdf]
27) Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks.
Raphael Tang, Gefei Yang, Hong Wei, Yajie Mao, Ferhan Ture, and Jimmy Lin
arXiv (Submitted 12/19/2018)
[pdf]
26) Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search.
Jinfeng Rao, Wei Yang, Yuhao Zhang, Ferhan Ture, and Jimmy Lin
In Proc. of Association for the Advancement of Artificial Intelligence (AAAI 2019).
[pdf]
25) Multi-Task Learning with Neural Networks for Voice Query Understanding on an Entertainment Platform.
Jinfeng Rao, Ferhan Ture, and Jimmy Lin
In Proc. of International Conference on Knowledge Discovery & Data Mining (KDD 2018).
[pdf]
24) What Do Viewers Say to Their TVs? An Analysis of Voice Queries to Entertainment Systems.
Jinfeng Rao, Ferhan Ture, and Jimmy Lin
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018).
[pdf]
23) Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks.
Jinfeng Rao, Ferhan Ture, Hua He, Oliver Jojic, and Jimmy Lin
In Proc. of International Conference on Information and Knowledge Management (CIKM 2017).
[pdf]
22) No Need to Pay Attention: Simple Recurrent Neural Networks Work! (for Answering "Simple" Questions).
Ferhan Ture and Oliver Jojic
In Proc. of Empirical Methods in NLP (EMNLP 2017).
[pdf]
21) Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams.
Jinfeng Rao, Hua He, Haotian Zhang, Ferhan Ture, Royal Sequiera, Salman Mohammed, and Jimmy Lin
In Proc. of SIGIR 2017 Workshop on Neural Information Retrieval (Neu-IR 2017).
[pdf]
20) Mining Temporal Statistics of Query Terms for Searching Social Media Posts.
Jinfeng Rao, Ferhan Ture, Xing Niu and Jimmy Lin
In Proc. of International Conference on the Theory of Information Retrieval (ICTIR 2017).
[pdf]
19) Learning to Translate for Multilingual Question Answering.
Ferhan Ture and Elizabeth Boschee
In Proc. of Conference on Empirical Methods in Natural Language Processing (EMNLP 2016).
[pdf]
18) Ask Your TV: Real-Time Question Answering with Recurrent Neural Networks.
Ferhan Ture and Oliver Jojic
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016) - Industry Track.
[pdf] [presentation]
17) Structured TV Shows --- "You have been Chopped".
Ferhan Ture, Jonghyun Choi, Hongcheng Wang and Vamsi Potluru
In ICML Workshop on Multi-View Representation Learning (MVRL 2016).
16) Exploiting Representations from Statistical Machine Translation for Cross-Language Information Retrieval.
Ferhan Ture and Jimmy Lin
In ACM Transactions on Information Systems (TOIS). Volume 32, Issue 4, 2014.
[pdf]
15) Learning to Translate: A Query-Specific Combination Approach for Cross-Lingual Information Retrieval.
Ferhan Ture and Elizabeth Boschee
In Proc. of Conference on Empirical Methods in Natural Language Processing (EMNLP 2014).
[pdf]
14) Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation.
Vladimir Eidelman, Ke Wu, Ferhan Ture, Philip Resnik and Jimmy Lin
In Proc. of Workshop on Statistical Machine Translation (WMT 2013).
13) Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce.
Vladimir Eidelman, Ke Wu, Ferhan Ture, Philip Resnik and Jimmy Lin
In Proc. of Annual Meeting of the Association for Computational Linguistics (ACL 2013).
12) Flat vs. Hierarchical Translation Models for Cross-Language Information Retrieval.
Ferhan Ture and Jimmy Lin
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2013).
11) Combining Statistical Translation Techniques for Cross-Language Information Retrieval.
Ferhan Ture, Jimmy Lin, and Douglas W. Oard
In Proc. of International Conference on Computational Linguistics (COLING 2012).
[pdf]
10) Looking Inside the Box: Context-Sensitive Translation for Cross-Language Information Retrieval.
Ferhan Ture, Jimmy Lin and Douglas W. Oard
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012).
9) Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling.
Ferhan Ture and Jimmy Lin
In Proc. of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012).
[pdf] [presentation] [code] [data]
8) Encouraging Consistent Translation Choices.
Ferhan Ture, Douglas W. Oard and Philip Resnik
In Proc. of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2012).
[pdf] [presentation]
7) No Free Lunch: Brute Force vs Locality-Sensitive Hashing for Cross-Lingual Pairwise Similarity.
Ferhan Ture, Tamer Elsayed and Jimmy Lin
In Proc. of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2011).
[pdf] [presentation] [code]
6) cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models.
Chris Dyer, Adam Lopez, Juri Ganitkevitch, Jonathan Weese, Ferhan Ture, Phil Blunsom, Hendra Setiawan, Vladimir Eidelman and Philip Resnik
In Proc. of Association for Computational Linguistics (ACL 2010 - Demonstration Track).
[software]
5) HAPLO-ASP: Haplotype Inference using Answer Set Programming.
Esra Erdem, Ozan Erdem, and Ferhan Ture
In Proc. of Logic Programming and Nonmonotonic Reasoning (LPNMR 2009).
4) Comparing ASP, CP, ILP on two Challenging Applications: Wire Routing and Haplotype Inference.
Elvin Coban, Esra Erdem and Ferhan Ture (alphabetical order)
In Proc. of Logic and Search (LaSh 2008).
3) Efficient Haplotype Inference with Answer Set Programming.
Esra Erdem and Ferhan Ture (alphabetical order)
In Proc. of Association for the Advancement of Artificial Intelligence (AAAI 2008).
[abstract] [presentation] [pdf]
2) Solving challenging grid puzzles with answer set programming.
Merve Cayli, Ayse Gul Karatop, Emrah Kavlak, Hakan Kaynar, Ferhan Ture and Esra Erdem
In Proc. of Answer Set Programming (ASP 2007).
1) Learning Morphological Disambiguation Rules for Turkish.
Deniz Yuret and Ferhan Ture
In Proc. of North American Chapter of the Association for Computational Linguistics (NAACL 2006).
[abstract] [presentation] [pdf]
Technical Reports
1) Brute-Force Approaches to Batch Retrieval: Scalable Indexing with MapReduce, or Why Bother?.
Tamer Elsayed, Ferhan Ture, and Jimmy Lin
Technical Report HCIL-2010-23, University of Maryland, College Park, October 2010.
[pdf]
Theses
Searching to Translate, and Translating to Search: When Information Retrieval Meets Machine Translation.
Ferhan Ture
Doctoral Dissertation, University of Maryland, College Park. May 2013.
[presentation] [pdf]
A Hybrid Machine Translation System from Turkish to English.
Ferhan Ture
Masters Thesis, Sabanci University, Turkey. July 2008.
[abstract] [presentation] [pdf]
Projects
Current Research Projects
Past Research Projects
2011-2013
Using Translation Models to Improve CLIR
Joint work with Doug Oard and Jimmy Lin.
Translation models can provide better translations for the task of CLIR by using larger translation units (e.g., phrases) and wider context. We explore ways that a translation grammar and decoder can improve the effectiveness and efficiency of CLIR systems.
2010-2012
Encouraging Translation Consistency
Joint work with Doug Oard and Philip Resnik.
We revisit the one-sense-per-discourse heuristic in the context of translation, and argue that the translation of a token should be consistent throughout a discourse. We’ve implemented variants of this heuristic as a feature in the translation model and have shown significant BLEU improvements on both Arabic-English and Chinese-English translation.
2009-2013
Cross-lingual Pairwise Similarity Computation in Large Collections of Documents
Joint work with Jimmy Lin.
Pairwise similarity is the task of finding similar pairs of documents in a large collection efficiently. We can extend this to cross-lingual domains such as Wikipedia, to detect similar documents written in different languages. We explore various approaches to implement this idea and propose using it in the application of bilingual parallel text collection.
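As a rough illustration of the pairwise similarity task described above, here is a minimal brute-force sketch. It assumes the documents have already been mapped into a shared vocabulary (for the cross-lingual case, for example by translating terms), and it uses cosine similarity over raw term counts. The function names and the threshold are hypothetical and chosen for illustration only; they are not taken from the project.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_pairs(docs, threshold=0.5):
    """Compare every pair of documents and keep those at or above the threshold.

    docs maps a document id to its text. This is the quadratic brute-force
    baseline; it does not attempt any pruning or hashing.
    """
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: -p[2])
```

Because every pair is compared, the cost grows quadratically with the collection size, which is why efficiency is the central concern for large collections.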
2009-2010
Parallel Conditional Random Field (CRF) Training for Machine Translation Systems
Joint work with Chris Dyer, Jimmy Lin, and Philip Resnik.
We parallelize CRF training, a supervised learning method that is a good combination of discriminative and generative learning approaches. The feature set one can use in a CRF model is very flexible and it can be trained using an EM-like approach. Scalability of CRF models is necessary in MT applications, which is the motivation to parallelize the process with MapReduce.
2009-2010
Learning a Sentiment Lexicon from the Web
Joint work with Jimmy Lin.
We exploit a large amount of data (50 million English web pages from the ClueWeb09 collection) to learn a sentiment lexicon in an unsupervised manner. Using emoticons as annotations, we propose an approach to determine subjectivity by calculating various term statistics.
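The idea of using emoticons as annotations can be sketched as follows. This is a minimal illustration, not the project's method: the emoticon sets, the function name, and the choice of a smoothed log-odds score as the term statistic are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical emoticon sets used as noisy sentiment labels.
POSITIVE = {":)", ":-)"}
NEGATIVE = {":(", ":-("}


def learn_lexicon(sentences, smoothing=1.0):
    """Score each word by the smoothed log-odds of appearing in
    positively versus negatively labeled sentences."""
    pos_counts, neg_counts = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        has_pos = any(t in POSITIVE for t in tokens)
        has_neg = any(t in NEGATIVE for t in tokens)
        if has_pos == has_neg:
            continue  # no emoticon, or conflicting emoticons: skip
        words = [t for t in tokens if t not in POSITIVE and t not in NEGATIVE]
        (pos_counts if has_pos else neg_counts).update(words)

    vocab = set(pos_counts) | set(neg_counts)
    pos_total = sum(pos_counts.values()) + smoothing * len(vocab)
    neg_total = sum(neg_counts.values()) + smoothing * len(vocab)
    return {
        word: math.log((pos_counts[word] + smoothing) / pos_total)
        - math.log((neg_counts[word] + smoothing) / neg_total)
        for word in vocab
    }
```

Words with a score above zero lean positive and words below zero lean negative; sentences without an emoticon contribute nothing, so no manual labeling is required.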
2009
Learning Decision Lists in Parallel for Morphological Disambiguation in Turkish
Joint work with Jimmy Lin.
The goal is to apply “cloud computing” to parallelize the process of learning decision lists. Our approach was intended to scale our previous morphological disambiguator (joint work with Deniz Yuret) to much larger data sets, and possibly perform bootstrapping to gain from unannotated data.
2007-2008
A Hybrid Machine Translation (MT) System from Turkish to English
Supervised by Prof. Kemal Oflazer.
We have created a hybrid Turkish-to-English MT system, which maps Turkish text to all possible English translations, and builds an English language model that selects the most probable one. Mapping is done at the sentence level via a parallel grammar implemented using the Avenue transfer engine, and the SRILM Toolkit is used to create a language model.
2006-2009
Formal Approaches to Haplotype Inference
Supervised by Dr. Esra Erdem.
In this project, we develop new formal approaches to solving the Haplotype Inference problem, by means of various declarative programming paradigms, such as Answer Set Programming, Constraint Programming and Integer Linear Programming.
2006-2008
AI Planning for Genome Rearrangement
Supervised by Dr. Esra Erdem.
We view the genome rearrangement problem as the problem of planning rearrangement events that transform one genome to the other, represent it as a planning problem, and use TLPlan to solve it.
Fall 2006
Automated Reasoning about Challenging Grid Puzzles
Supervised by Dr. Esra Erdem. Joint work with Merve Cayli, Ayse Gul Karatop, Emrah Kavlak, and Hakan Kaynar.
In this project, we study challenging grid puzzles (of complexity NP) that are interesting for answer set programming from the viewpoints of representation and computation.
Spring 2006
Perturbation Theory and WKB Approximation Methods
Supervised by Prof. Ali Mostafazadeh.
In this work, we analyzed the method of asymptotic approximations, and applied the WKB Approximation Method to solve the Schrödinger equation.
2005-2006
Morphological Analysis and Disambiguation of Turkish Language
Supervised by Dr. Deniz Yuret.
We developed a learning-based morphological disambiguator for Turkish. The stand-alone disambiguator can be downloaded here.
2005-2006
Prediction of Lagrangian Trajectories in the Ocean
Supervised by Dr. Mine Caglar.
We approximate the trajectory of an object lost in the ocean, using a mathematical model that applies interpolation and regression techniques to discrete time-position data.