Search Engines:
11-442 / 11-642 / 11-742
HW1: Boolean Retrieval Design Guide
HW1 requires you to understand and make several extensions to the
QryEval search engine architecture. It can be a little difficult to
know where to start. We recommend the following path.
Your Unranked Boolean retrieval model supports the #OR
operator. Extend it to cover the #AND operator. There are four main
steps.
- Create the QrySopAnd class. Copy the QrySopOr class and change
the logic of the matching criteria (docIteratorHasMatch) from "match
any argument" to "match all arguments".
- Consider whether you need to adjust how scores are calculated.
- Modify the QryParser to recognize #AND operators in queries.
- Change the default query operator from #OR to #AND.
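The change from "match any argument" to "match all arguments" can be illustrated outside of QryEval. The following is a minimal sketch, assuming each query argument is represented as a sorted list of document ids without duplicates; `match_any` and `match_all` are hypothetical helper names and are not part of the QryEval API, which uses document iterators instead of plain lists.

```python
from typing import List


def match_any(doc_lists: List[List[int]]) -> List[int]:
    """#OR-style matching: a document matches if ANY argument contains it."""
    matched = set()
    for docs in doc_lists:
        matched.update(docs)
    return sorted(matched)


def match_all(doc_lists: List[List[int]]) -> List[int]:
    """#AND-style matching: a document matches only if ALL arguments contain it.

    Assumes every list is sorted in ascending order (hypothetical input format).
    """
    if not doc_lists:
        return []
    positions = [0] * len(doc_lists)  # one cursor per argument
    result = []
    while all(p < len(d) for p, d in zip(positions, doc_lists)):
        current = [d[p] for p, d in zip(positions, doc_lists)]
        target = max(current)
        if all(c == target for c in current):
            # Every cursor points at the same document: it is a match.
            result.append(target)
            positions = [p + 1 for p in positions]
        else:
            # Advance every cursor that is behind the largest current doc id.
            for i, docs in enumerate(doc_lists):
                while positions[i] < len(docs) and docs[positions[i]] < target:
                    positions[i] += 1
    return result
```

For the three lists `[1, 3, 5]`, `[3, 4, 5]`, and `[5, 6]`, `match_any` returns `[1, 3, 4, 5, 6]` and `match_all` returns `[5]`. The loop in `match_all` stops as soon as any argument is exhausted, because no later document can match all arguments.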
Implement the Ranked Boolean retrieval model. There are four main steps.
- Create a RetrievalModelRankedBoolean class. Copy the
RetrievalModelUnrankedBoolean class and make changes as
necessary.
- Extend the Ranker class to use the RetrievalModelRankedBoolean
class.
- Modify the #SCORE operator to generate scores for the Ranked
Boolean retrieval model. #SCORE needs to produce a score for the
current matching document. A #SCORE operator always has one argument
(._args[0]) that is a QryIopXxx operator. Look at the QryIop class
to see which functions every QryIopXxx query operator supports. One of
them gives you access to the term frequency (tf) and term locations.
- Depending on how you implemented #OR and #AND score
calculations, you may need to make adjustments to QrySopOr and
QrySopAnd to support the RankedBoolean retrieval model.
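The difference between Unranked and Ranked Boolean scoring can be sketched as follows. This guide does not state the scoring rules, so the sketch assumes one common Ranked Boolean convention: a term's score is its tf, #OR takes the maximum of its arguments' scores, and #AND takes the minimum. Check the HW1 specification for the rules you are actually required to implement. The `Postings` type and the `score_*` functions are hypothetical and are not QryEval classes or methods.

```python
from typing import Dict, List

# Hypothetical stand-in for an inverted list: doc id -> term frequency (tf).
Postings = Dict[int, int]


def score_term(postings: Postings, doc_id: int, ranked: bool) -> float:
    """Score one term for one document (the role played by #SCORE)."""
    if doc_id not in postings:
        return 0.0  # the document does not match this term
    # Assumption: Ranked Boolean uses tf; Unranked Boolean uses a constant.
    return float(postings[doc_id]) if ranked else 1.0


def score_or(args: List[Postings], doc_id: int, ranked: bool) -> float:
    """Assumed #OR rule: maximum of the argument scores."""
    return max(score_term(p, doc_id, ranked) for p in args)


def score_and(args: List[Postings], doc_id: int, ranked: bool) -> float:
    """Assumed #AND rule: minimum of the argument scores (0.0 if any term is missing)."""
    return min(score_term(p, doc_id, ranked) for p in args)
```

With `apple = {1: 3, 2: 1}` and `pie = {1: 2, 3: 4}`, document 1 scores 3.0 under `score_or` and 2.0 under `score_and` when `ranked` is true, and 1.0 under both when `ranked` is false. Under these assumptions the same MAX and MIN logic serves both retrieval models, which is one reason the adjustment to QrySopOr and QrySopAnd may turn out to be small.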
Implement the #NEAR operator. There are three main steps.
- Use the QryIopSyn class to guide your implementation of a
QryIopNear class.
- The evaluate method is the heart of each QryIopXxx operator. It
uses the arguments (which are always QryIopXxx operators that have
inverted lists) to produce a new inverted list.
- Modify the QryParser to recognize #NEAR operators in queries.
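The core of producing a new inverted list for #NEAR is a position-matching loop that runs once per document that contains all of the arguments. The following is a minimal sketch for a single document, assuming ordered matching (each argument must occur after the previous one, at most `n` positions later) and assuming that each location is used in at most one match. These semantics are assumptions; the HW1 specification defines the exact rule. `near_positions` is a hypothetical helper and is not part of QryEval.

```python
from typing import List


def near_positions(positions: List[List[int]], n: int) -> List[int]:
    """Return the end location of each #NEAR/n match within one document.

    positions[i] is the sorted list of locations of argument i in the document
    (hypothetical input format).
    """
    if not positions:
        return []
    ptrs = [0] * len(positions)  # one cursor per argument
    matches: List[int] = []
    while all(ptr < len(p) for ptr, p in zip(ptrs, positions)):
        for i in range(1, len(positions)):
            prev = positions[i - 1][ptrs[i - 1]]
            # Argument i must occur strictly after argument i-1.
            while ptrs[i] < len(positions[i]) and positions[i][ptrs[i]] <= prev:
                ptrs[i] += 1
            if ptrs[i] == len(positions[i]):
                return matches  # argument i is exhausted; no more matches
            if positions[i][ptrs[i]] - prev > n:
                # Too far apart: only a later location of argument i-1 can help.
                ptrs[i - 1] += 1
                break
        else:
            # Every adjacent pair is in order and within distance n.
            matches.append(positions[-1][ptrs[-1]])
            ptrs = [ptr + 1 for ptr in ptrs]
    return matches
```

For example, with locations `[1, 10, 20]` and `[2, 14, 21]` and `n = 2`, the matches end at locations 2 and 21. In an inverted-list setting, the number of matches would serve as the tf of the new posting and the returned locations as its position list.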
That's about it.
FAQ
If you have questions not answered here, see the HW1 FAQ.
Copyright 2024, Carnegie Mellon University.
Updated on January 26, 2024
Jamie Callan