
Commit 5507cc2

committed
Working on the report
1 parent 45b067a commit 5507cc2

4 files changed

Lines changed: 101 additions & 13 deletions


Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+pseudo-relevance

report/main.bib

Lines changed: 29 additions & 0 deletions
@@ -58,3 +58,32 @@ @article{nogueira2019multi
   journal={arXiv preprint arXiv:1910.14424},
   year={2019}
 }
+
+
+@inproceedings{elgohary2019can,
+  title={Can you unpack that? Learning to rewrite questions-in-context},
+  author={Elgohary, Ahmed and Peskov, Denis and Boyd-Graber, Jordan},
+  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
+  year={2019}
+}
+
+
+@article{anantha2020open,
+  title={Open-domain question answering goes conversational via question rewriting},
+  author={Anantha, Raviteja and Vakulenko, Svitlana and Tu, Zhucheng and Longpre, Shayne and Pulman, Stephen and Chappidi, Srinivas},
+  journal={arXiv preprint arXiv:2010.04898},
+  year={2020}
+}
+
+
+@article{zhao2020sparta,
+  title={{SPARTA}: Efficient open-domain question answering via sparse transformer matching retrieval},
+  author={Zhao, Tiancheng and Lu, Xiaopeng and Lee, Kyusong},
+  journal={arXiv preprint arXiv:2009.13013},
+  year={2020}
+}
+
+
+@article{thakur2021beir,
+  title={{BEIR}: A heterogenous benchmark for zero-shot evaluation of information retrieval models},
+  author={Thakur, Nandan and Reimers, Nils and R{\"u}ckl{\'e}, Andreas and Srivastava, Abhishek and Gurevych, Iryna},
+  journal={arXiv preprint arXiv:2104.08663},
+  year={2021}
+}

report/main.pdf

12.1 KB
Binary file not shown.

report/main.tex

Lines changed: 71 additions & 13 deletions
@@ -11,7 +11,7 @@
 
 \begin{document}
 
-\title{Evaluation of Conversational Search Engines}
+\title{Comparative Study of Conversational Search Engine Retrieval Pipelines}
 \subtitle{Group ID: \#5}
 
 % AUTHORS:
@@ -73,34 +73,44 @@ \section{Problem Statement}\label{sec:problem}
 \section{Related Work}\label{sec:related}
 In this section, we review research relevant to conversational search engines and the broader field of information retrieval. While some of the highlighted studies do not directly target conversational search engines or explicit information retrieval, their techniques remain valuable at various stages of the conversational retrieval process.
 
-\subsection*{\texttt{RM3} Pseudo-Relevance Feedback Query Expansion}
+\subsection*{Pseudo-Relevance Feedback by Query Expansion}\label{sec:prf}
+Pseudo-relevance feedback assumes that the top-ranked documents of an initial retrieval round are relevant, and expands the query with salient terms drawn from them. \texttt{RM3} is a widely used instance of this idea: it estimates a relevance model from the feedback documents and interpolates it with the original query model.
 
 \subsection*{Text-to-Text Transfer Transformer}\label{sec:t5}
 Natural language processing (NLP) revolves around the understanding of natural language, whether presented as text or speech. NLP aims to equip computers with the capability to grasp the depth of human language and to harness this understanding for a range of tasks, such as text summarization, machine translation, and question answering. Given how diverse these tasks are in their inputs, outputs, and underlying challenges, developing a unified model proficient across the entire spectrum is a significant challenge.
 
 The Text-to-Text Transfer Transformer (\texttt{T5}) by Raffel et al. \cite{raffel2020exploring} applies transfer learning to NLP, casting every task into a unified text-to-text format to obtain a versatile model that can be adapted to any NLP problem. T5 models first learn general language patterns through large-scale pre-training and are then fine-tuned for particular tasks using targeted data. Models trained in this manner are readily available for many specific NLP problems.
 
-\subsection*{\texttt{doc2query}}
+\subsection*{\texttt{doc2query}}\label{sec:doc2query}
 Traditional retrieval techniques, such as \texttt{BM25}, rely primarily on term occurrences in both queries and documents. However, they often overlook the semantics of the content. As a result, documents that are semantically relevant to a query may be scored as non-relevant due to differences in syntax or terminology. Dense retrieval methods, which emphasize semantic similarities between texts, can address this problem but are computationally taxing during retrieval.
 
 A notable solution is the \texttt{doc\-2query} method proposed by Nogueira et al. \cite{nogueira2019document}. It employs a text-to-text transformer to convert documents into queries. By generating and appending a few of these queries to the original document, classical retrieval methods show significantly improved performance, because the generated queries often capture semantic nuances similar to those of the actual query \cite{nogueira2019document,nogueira2019doc2query,pradeep2021expando}. Importantly, \texttt{doc\-2query} shifts the computational load to the indexing phase, ensuring minimal performance lag during retrieval. By leveraging the \texttt{T5} model, the authors further enhanced the query generation quality, leading to the variant known as \texttt{doc\-TTTTTquery}, \texttt{doc\--T5query}, or \texttt{doc\-2query\--T5} \cite{nogueira2019doc2query}.
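To make the interplay concrete, here is a toy sketch (not the authors' implementation; hypothetical two-document corpus, whitespace tokenization, no stemming) of how a doc2query-style generated question, appended at indexing time, lets a BM25-style scorer match a document it previously missed:

```python
import math
from collections import Counter

# Toy BM25 (whitespace tokens, no stemming); the corpus argument supplies
# the document-frequency statistics.
def bm25_score(query, doc, corpus, k1=1.2, b=0.75):
    n = len(corpus)
    avgdl = sum(len(d.split()) for d in corpus) / n
    tf = Counter(doc.split())
    dl = len(doc.split())
    score = 0.0
    for term in set(query.split()):
        df = sum(1 for d in corpus if term in d.split())
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * dl / avgdl))
    return score

doc = "the hubble telescope orbits earth"
other = "bm25 ranks documents by term overlap"
# Hypothetical doc2query-style generated question, appended at indexing time:
expanded = doc + " what takes pictures of space"
query = "what takes pictures of space"

no_expansion = bm25_score(query, doc, [doc, other])              # 0.0: no shared terms
with_expansion = bm25_score(query, expanded, [expanded, other])  # > 0: generated terms match
```

The expansion happens once at indexing time, which is exactly why the retrieval-time cost stays that of a plain lexical lookup.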
 
-\subsection*{\texttt{monoT5} \& \texttt{duoT5} Rerankers}
-\texttt{monoT5} and \texttt{duoT5} are neural re-rankers, also developed by Nogueira et al., which attempt to inject semantic understanding into the retrieval process \cite{nogueira2020document,nogueira2019multi}. Using the \texttt{T5} model, they re-rank a list of documents based on their semantic relevance to a given query. Specifically, \texttt{monoT5} processes a query and a single document, outputting a relevance score. In contrast, \texttt{duoT5} considers a query and two documents, determining which document is more relevant. Although \texttt{duoT5} offers a more nuanced ranking, its pairwise comparison method makes it computationally heavier. Hence, a staged re-ranking approach is proposed: first using \texttt{monoT5} for the top $k$ documents and subsequently applying \texttt{duoT5} to a smaller subset, the top $l$, where $l \ll k$ \cite{nogueira2019multi,pradeep2021expando}.
+\subsection*{\texttt{SPARTA}}\label{sec:sparta}
+\texttt{SPARTA}, introduced by Zhao et al. \cite{zhao2020sparta}, takes a nuanced approach to sparse retrieval. At its core, it encodes documents into sparse representations during the indexing phase. These representations capture not only the document's actual content but also terms that are semantically related, even if they do not appear in the document. This underlying principle echoes the rationale of approaches like \texttt{doc2query} and dense retrieval models.
+
+Where \texttt{SPARTA} differentiates itself, however, is in its retrieval phase: unlike dense retrieval models, it retrieves relevant documents using straightforward index lookups, mirroring lexical retrieval strategies like \texttt{BM25} \cite{zhao2020sparta}.
 
-\subsection*{\texttt{SPARTA} Reranker}
+In practical applications, though, \texttt{SPARTA} faces challenges: several other models, including \texttt{BM25} and \texttt{doc2query-T5}, surpass it in ranking quality, and its index is substantially larger than that of alternatives like \texttt{doc2query-T5} \cite{thakur2021beir}.
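The indexing/lookup split can be sketched as follows. This is a toy illustration, not the actual model: the hand-assigned term weights stand in for scores that SPARTA learns from term and document embeddings.

```python
# Toy SPARTA-style index: each document stored as a sparse term->weight map.
# ASSUMPTION: weights are hand-picked here; the real model learns them and
# may assign weight to terms that never appear in the document text.
index = {
    "doc1": {"telescope": 1.2, "space": 0.8, "astronomy": 0.5},  # "astronomy" not in the text
    "doc2": {"bm25": 1.5, "ranking": 0.9},
}

def score(query_terms, doc_weights):
    # Retrieval is a plain lookup-and-sum, as in lexical inverted indexes.
    return sum(doc_weights.get(t, 0.0) for t in query_terms)

def search(query_terms):
    return sorted(index, key=lambda d: score(query_terms, index[d]), reverse=True)

ranking = search(["astronomy", "space"])  # doc1 ranks first despite no literal match
```

All the expensive model inference happens when the sparse vectors are built, which is why query-time cost resembles that of an ordinary inverted index.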
 
+\subsection*{\texttt{monoT5} \& \texttt{duoT5} Rerankers}\label{sec:rerankers}
+\texttt{monoT5} and \texttt{duoT5} are neural re-rankers, also developed by Nogueira et al., which inject semantic understanding into the retrieval process \cite{nogueira2020document,nogueira2019multi}. Using the \texttt{T5} model, they re-rank a list of documents based on their semantic relevance to a given query. Specifically, \texttt{monoT5} processes a query and a single document, outputting a relevance score. In contrast, \texttt{duoT5} considers a query and two documents, determining which of the two is more relevant. Although \texttt{duoT5} offers a more nuanced ranking, its pairwise comparisons make it computationally heavier. Hence, a staged re-ranking approach is proposed: \texttt{monoT5} first re-ranks the top $k$ documents, and \texttt{duoT5} is subsequently applied to a smaller subset, the top $l$, where $l \ll k$ \cite{nogueira2019multi,pradeep2021expando}.
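The staged approach can be sketched with stub scorers in place of the actual T5 models (the `mono_score` and `duo_prefer` functions below are hypothetical term-overlap stand-ins, not the neural rerankers):

```python
# Staged re-ranking sketch; mono_score and duo_prefer are toy stand-ins
# for the monoT5 relevance score and the duoT5 pairwise preference.
def staged_rerank(query, docs, mono_score, duo_prefer, k, l):
    # Stage 1 (pointwise, k model calls): re-rank the top-k candidates.
    head = sorted(docs[:k], key=lambda d: mono_score(query, d), reverse=True)
    # Stage 2 (pairwise, l*(l-1) calls, l << k): re-order the top-l by wins.
    top = head[:l]
    wins = {d: sum(duo_prefer(query, d, o) for o in top if o != d) for d in top}
    return sorted(top, key=lambda d: wins[d], reverse=True) + head[l:]

# Toy scorers based on term overlap.
mono = lambda q, d: len(set(q.split()) & set(d.split()))
duo = lambda q, a, b: mono(q, a) > mono(q, b)

ranked = staged_rerank("a b c", ["x", "a b", "a b c"], mono, duo, k=3, l=2)
```

The call-count comments show why the pairwise stage is restricted to a small $l$: its cost grows quadratically while the pointwise stage grows linearly.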
 
-\subsection*{Expando-Mono-Duo Design Pattern}
+
+\subsection*{Expando-Mono-Duo Design Pattern}\label{sec:expando}
 The same research team introduced a strategic pattern for integrating the above tools into retrieval pipelines, termed the Expando-Mono-Duo design pattern \cite{pradeep2021expando}. During indexing, \texttt{doc2query-T5} is employed to enrich document representations and improve the initial retrieval results of methods like \texttt{BM25}. The retrieved results are then re-ranked with \texttt{monoT5}, and a selected top tier of this list undergoes another re-ranking using \texttt{duoT5}. Trials show that this composite approach leads to marked improvements in result quality across multiple evaluation metrics \cite{pradeep2021expando}.
 
-\subsection*{\texttt{T5} Conversational Query Rewriting}
+\subsection*{Conversational Query Rewriting}\label{sec:cqr}
+
+Conversational search engines distinguish themselves from standard search engines by determining document relevance based on the entirety of a conversation, not just the immediate query. In conversational settings, follow-up questions often build on prior interactions, so previous questions and answers must be factored in when fetching relevant documents. At the same time, a system must handle topic shifts, where the immediate query does not relate to the preceding exchanges; blindly incorporating the entire conversational history in such cases can harm retrieval accuracy.
 
+Elgohary et al. address this challenge by reformulating the current query based on the overarching conversation \cite{elgohary2019can}. The rewritten query is designed to stand on its own within conventional retrieval pipelines. In essence, this technique extends standard search engines to conversational question answering by introducing a preceding query rewriting stage.
+
+Text-to-text transformers such as \texttt{T5} are well suited to this rewriting task: they are fine-tuned to reformulate the immediate query given the conversational context. Studies validate the efficacy of this approach, highlighting its capacity to improve the retrieval accuracy of traditional search engines in conversational settings \cite{elgohary2019can,anantha2020open,Lajewska:2023:ECIR}.
 
 
 
 \section{Baseline Method}\label{sec:baseline}
-Our baseline method is inspired by the baseline method presented by Łajewska et al. \cite{Lajewska:2023:ECIR}. Our baseline method is structured in the following sequence:
+Our baseline method is inspired by the baseline presented by Łajewska et al. \cite{Lajewska:2023:ECIR}. It is structured in the following sequence:
 \begin{enumerate}
 \item \texttt{T5} Query Rewriting
 \item \texttt{BM25} Retrieval
@@ -111,7 +121,7 @@ \section{Baseline Method}\label{sec:baseline}
 \end{enumerate}
 \end{enumerate}
 
-\subsection{T5 Query Rewriting}
+\subsection{\texttt{T5} Query Rewriting}
 In conversational search engines, query rewriting is the crucial component that incorporates the semantics of the conversation history into the currently asked query, producing a single rewritten query that can be fed into the retrieval pipeline.
 
 For this purpose, we include all the previously rewritten queries $q'_0 \dots q'_{n-1}$ of our conversation, as well as the response $r_{n-1}$ of the CSE to the previous rewritten query $q'_{n-1}$, in the current query $q_n$. This is done by concatenating the previous rewritten queries and the response into a single string:
@@ -124,14 +134,62 @@ \subsection{T5 Query Rewriting}
 
 Driven by these insights, we have turned our attention to other query rewriting techniques. \texttt{Pyterrier} provides the \texttt{Sequential\-Dependence} query rewriting method\footnote{URL: \url{https://pyterrier.readthedocs.io/en/latest/rewrite.html\#sequentialdependence}}. We have found, however, that this rewriter also does not produce the desired results.
 
-Subsequent exploration led us to the \texttt{T5} neural query rewriter trained for conversational question rewriting\footnote{URL: \url{https://huggingface.co/castorini/t5-base-canard}}. With this method, $q'_n$ closely mirrored $q_n$, subtly infusing it with the conversation's context, particularly when no drastic topic alterations were identified. A valuable by-product was the concise nature of the rewritten query, a departure from the growing length observed previously.
+Subsequent exploration led us to the \texttt{T5} neural query rewriter trained for conversational question rewriting (see Section \ref{sec:cqr}). With this method, $q'_n$ closely mirrored $q_n$, subtly infusing it with the conversation's context, particularly when no drastic topic shifts were identified. A valuable by-product was the concise nature of the rewritten query, a departure from the growing length observed previously. Since retrieval latency is a critical factor for us, we utilized a smaller \texttt{T5} model: \texttt{castorini\-/t5\--base\--canard}\footnote{URL: \url{https://huggingface.co/castorini/t5-base-canard}}.
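Such rewriters take the flattened conversation as a single input string. A sketch of how that input might be assembled follows; the `" ||| "` turn separator is an assumption in the style of CANARD-trained rewriters, not something this report specifies:

```python
# Sketch: flatten the conversation into one rewriter input string.
# ASSUMPTION: the " ||| " turn separator follows CANARD-style rewriters
# and is illustrative only.
def build_rewriter_input(rewritten_queries, last_response, current_query):
    # q'_0 ... q'_{n-1}, then the response r_{n-1}, then the current query q_n.
    turns = list(rewritten_queries) + [last_response, current_query]
    return " ||| ".join(turns)

example = build_rewriter_input(
    ["Who wrote Dune?"],          # previously rewritten queries
    "Frank Herbert wrote Dune.",  # last CSE response
    "When was it published?",     # current query q_n
)
# example == "Who wrote Dune? ||| Frank Herbert wrote Dune. ||| When was it published?"
```

The rewriter model then consumes this string and emits a self-contained query such as "When was Dune published?".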
 
-\subsection{BM25 Retrieval}
+\subsection{\texttt{BM25} Retrieval}
 We settled on the \texttt{BM25} retrieval method, a commonly used ranking formula in the realm of information retrieval, for its simplicity and for its deployment in the reference system, which allows for direct comparisons.
 
 \subsection{Re-ranking}
-The re-ranking stage of our baseline system consists of two stages: First, the top 1000 documents retrieved by the \texttt{BM25} retrieval method are re-ranked using the \texttt{monoT5} reranker. Afterwards, the top 50 documents of the previous re-ranking stage are rearranged using the \texttt{duoT5} reranker. The precise count of documents subject to reranking at each stage is a hyperparameter of our system, allowing to balance computational cost and result quality. These rerankers were implemented in the \texttt{pyterrier\_t5} library.\footnote{URL: \url{https://github.com/terrierteam/pyterrier_t5}}
+Our baseline system re-ranks in two stages: First, the top 1000 documents retrieved by \texttt{BM25} are re-ranked using the \texttt{monoT5} reranker. Afterwards, the top 50 documents of that stage are re-ordered using the \texttt{duoT5} reranker (see Section \ref{sec:rerankers}). The precise number of documents re-ranked at each stage is a hyperparameter of our system, allowing us to balance computational cost against result quality. Both rerankers are implemented in the \texttt{pyterrier\_t5} library.\footnote{URL: \url{https://github.com/terrierteam/pyterrier_t5}} Again, since low retrieval latency is crucial to us, we utilized smaller \texttt{T5} models: \texttt{castorini\-/monot5\--base\--msmarco}\footnote{URL: \url{https://huggingface.co/castorini/monot5-base-msmarco}} for \texttt{monoT5} and \texttt{castorini\-/duot5\--base\--msmarco}\footnote{URL: \url{https://huggingface.co/castorini/duot5-base-msmarco}} for \texttt{duoT5}.
+
+\section{Incorporating Pseudo-Relevance Feedback into Our Baseline}\label{sec:baseline+rm3}
+
+Recognizing the substantial performance enhancements associated with pseudo-relevance feedback, we integrate a query expansion mechanism into our baseline retrieval method (see Section \ref{sec:baseline}). Our choice fell upon the \texttt{RM3} query expansion technique, well established for its robustness and acceptance within the information retrieval community; for a deeper dive into its mechanics and principles, readers are directed to Section \ref{sec:prf}.
+
+In the \texttt{Pyterrier} framework, any query expansion must follow an initial retrieval phase. This initial retrieval fetches the top $p$ documents, which form the basis for the subsequent query expansion using \texttt{RM3}. The expanded query is then passed into a second retrieval phase that produces the final document set for the end-user, which we again re-rank using both \texttt{monoT5} and \texttt{duoT5}.
+
+Henceforth, we label this integrated retrieval approach ``baseline + \texttt{RM3}'', which is structured as follows:
+\begin{enumerate}
+\item \texttt{T5} Query Rewriting
+\item \texttt{BM25} Retrieval
+\item \texttt{RM3} Pseudo-Relevance Feedback Query Expansion
+\item \texttt{BM25} Retrieval
+\item Re-ranking
+\begin{enumerate}
+\item Re-ranking using \texttt{monoT5}
+\item Top-document re-ranking using \texttt{duoT5}
+\end{enumerate}
+\end{enumerate}
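The expansion step in the pipeline above can be sketched as follows. This is a toy illustration, not Pyterrier's RM3: it uses uniform feedback-document weights and raw term counts where real RM3 weights documents by retrieval score and smooths the language models.

```python
from collections import Counter

# Toy RM3-style pseudo-relevance feedback (ASSUMPTIONS: uniform
# feedback-document weights, raw counts, no smoothing).
def rm3_expand(query, feedback_docs, n_terms=3, orig_weight=0.5):
    # Term distribution over the top-p pseudo-relevant documents.
    counts = Counter(t for d in feedback_docs for t in d.split())
    total = sum(counts.values())
    expansion = {t: c / total for t, c in counts.most_common(n_terms)}
    # Interpolate the original query terms with the expansion terms.
    weights = {t: orig_weight / len(query.split()) for t in query.split()}
    for t, p in expansion.items():
        weights[t] = weights.get(t, 0.0) + (1 - orig_weight) * p
    return weights

weights = rm3_expand("space telescope", ["hubble hubble orbit", "hubble images"])
# "hubble" enters the expanded query even though the user never typed it.
```

The expanded, weighted query is then handed to the second retrieval phase, which is why a second `BM25` step appears in the pipeline.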
 
+\section{Document Expansion Method}
+JUST IDEA
+\begin{enumerate}
+\setcounter{enumi}{-1}
+\item \texttt{doc2query-T5} Document Expansion
+\item \texttt{T5} Query Rewriting
+\item \texttt{BM25} Retrieval
+\item Re-ranking
+\begin{enumerate}
+\item Re-ranking using \texttt{monoT5}
+\item Top-document re-ranking using \texttt{duoT5}
+\end{enumerate}
+\end{enumerate}
+
+\section{Extending the Document Expansion Method with Pseudo-Relevance Feedback}
+JUST IDEA
+\begin{enumerate}
+\setcounter{enumi}{-1}
+\item \texttt{doc2query-T5} Document Expansion
+\item \texttt{T5} Query Rewriting
+\item \texttt{BM25} Retrieval
+\item \texttt{RM3} Pseudo-Relevance Feedback Query Expansion
+\item \texttt{BM25} Retrieval
+\item Re-ranking
+\begin{enumerate}
+\item Re-ranking using \texttt{monoT5}
+\item Top-document re-ranking using \texttt{duoT5}
+\end{enumerate}
+\end{enumerate}
 
 \section{Advanced Method}\label{sec:advanced}
 Explain what you are taking as your advanced method(s), as well as why this is a promising attempt to outperform the baseline method, and why you are making specific implementation choices.
