Carnegie Mellon University
Search Engines:
11-442 / 11-642
Spring 2026

HW1: Lexical Retrieval Design Guide

Development Environment
Architecture Overview
Parameter Files
New Capabilities
Implementation Sequence
Testing
Important Advice
FAQ

QryEval documentation

 

Development Environment

The homework assignments are done in Python using an Anaconda development environment with a standard set of libraries and tools. If you do not already have Anaconda on your computer, download and install it.

Next, define an Anaconda environment that contains the same libraries and tools that will be used to test and grade your software.

conda env create -f 11x42-26S-a.yml

Initialize the software development directory. This can be done in one of two ways.

  1. The simplest method is to download init_dir.zip or init_dir.tgz (4.4G compressed, 7.0G uncompressed); unpacking it creates and populates the following directory structure.

  2. Or, you can create it manually, using the files in the Source column below.

    Directory       Contents                          Source
    QryEval/        QryEval software                  .zip or .tgz
      INPUT_DIR/    Data files, other software        Document index (.zip or .tgz);
                                                      relevance judgments (.qrel)
      LIB_DIR/      Java jar files                    .zip or .tgz
      OUTPUT_DIR/   Output from your software         Initially empty
      TEST_DIR/     Public ("training") test cases    .zip or .tgz

    The QryEval software, the homework parameter files, and the homework testing system expect this directory structure. When your software is tested, the current working directory will be the QryEval directory.

Finally, download the trec_eval executable and put it in the QryEval/INPUT_DIR directory. There are different versions for Mac (see the FAQ), Linux, and Windows.

At this point, you should be able to run one of the HW1 public ("training") cases.

     conda activate 11x42-26S-a
     python QryEval.py TEST_DIR/HW1-Train-0.param

The output should be the following.

     -- Ranker: UnrankedBoolean --
     22: #or( rick warren )
     ==> #OR(#SCORE(rick.body ) #SCORE(warren.body ) )
     27: #or( starbucks )
     ==> #SCORE(starbuck.body )
     Time: 4.2 secs

There should also be a file HW1-Train-0.teIn in the OUTPUT_DIR directory.

 

Architecture Overview

This assignment extends QryEval, a search engine written on top of Lucene, a widely-used open-source library that is the foundation of ElasticSearch, Solr, Pyserini, and other search engines and toolkits. Lucene is written in Java, so your search engine is a combination of Python and Java software; however, you will interact only with the Python code.

QryEval provides Python classes that implement a document-at-a-time architecture and access to Lucene indexes. See the QryEval documentation for details about classes and methods.

QryEval is designed to be highly configurable, which enables you to construct ranking pipelines of varying length and complexity. Every experiment is configured by a .param file that describes the queries, the document collection, and a sequence of tasks that your software must perform.

QryEval:main begins by reading the .param file, doing some initialization, and creating a batch. A batch is a dict that describes a set of queries and the work that has been done on them so far. Initially, the batch just consists of query ids and query strings.

        { "47": { "qstring": "Who is John Galt?" }, ... }

Next, QryEval:main enters a loop over the tasks described in the .param file. On each pass through the loop, a task is executed.

There are several main types of task (for example, rankers, rerankers, agents, and output tasks). Think of a task as a class that may have subclasses.

Each task reads the batch as input, does some work, and possibly updates the batch. For example, a ranker task will add a ranking for each query.

        { "47": { "qstring": "Who is John Galt?", "ranking": [ (docid, score), ... ] }, ... }

Rerankers change scores and reorder a ranking. Agents use a ranking to produce some other result, for example, an answer.

        { "47": { "qstring": "Who is John Galt?", "ranking": [ (docid, score), ... ], "answer": "John Galt is ..." }, ... }

Output tasks write results to a file.
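
The sketch below illustrates this control flow on a toy example. It is not QryEval's actual code: the task functions, the query, and the placeholder scores are invented purely to show how each task reads the batch, does some work, and may add to it.

    # A toy illustration (not QryEval's actual code) of the batch/task pattern:
    # each task reads the batch, does some work, and may add to it.

    def ranker_task(batch):
        # A ranker adds a "ranking" (a list of (docid, score) pairs) to each query.
        for qid, entry in batch.items():
            entry["ranking"] = [("doc-a", 2.0), ("doc-b", 1.0)]   # placeholder scores
        return batch

    def output_task(batch):
        # An output task writes results somewhere; here it just prints them.
        for qid, entry in batch.items():
            for docid, score in entry["ranking"]:
                print(qid, docid, score)
        return batch

    # Initially the batch contains only query ids and query strings.
    batch = {"47": {"qstring": "Who is John Galt?"}}

    # QryEval:main loops over the tasks named in the .param file; here the
    # pipeline is simply hard-coded as two functions.
    for task in (ranker_task, output_task):
        batch = task(batch)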

 

Parameter Files

A .param file is a json file that is read into a (hierarchical) Python dict.

The .param file describes the queries, the document collection, and a pipeline of tasks that your software must perform. Thus, it has two distinct sections: a few key/value pairs that identify the queries and the document index, and a set of entries that define the pipeline of tasks.

Example

The .param file below provides paths for the query file and the document index. Every .param file has these two key/value pairs. Then it defines a pipeline of two tasks. The first task uses the BM25 retrieval algorithm to produce a ranking for the query. The second task writes the ranking to a file.

    {
        "indexPath": "INPUT_DIR/index-cw09",
        "queryFilePath": "TEST_DIR/HW1-Train-2.qry",
        "task_1:ranker": {
            "type": "BM25",
            "outputLength": "1000",
            "BM25:k_1": "0.75",
            "BM25:b": "0.3"
        },
        "task_2:output": {
            "type": "trec_eval",
            "outputPath": "OUTPUT_DIR/HW1-Train-2.teIn",
            "outputLength": "1000"
        }
    }

Notice that there are two outputLength parameters. The Ranker outputLength sets the maximum number of documents returned by the ranker. The Output outputLength sets the maximum number of documents written to the trec_eval file. These may seem redundant. The need for different output lengths is more apparent in later homework, when the ranking pipeline becomes longer.

Names

Each task has a unique name (because a dict cannot have two keys with the same name). Task names have the following format:

      task_<id>:<task_type>

For example: "task_1:ranker" identifies a ranker task. The id is just a unique integer; it has no other purpose, and it is not necessarily a sequence number.

Each task is configured by a dict of parameters. Parameters that are specific to a particular type (subclass or implementation) of a task have a prefix. Parameters that apply to all types (subclasses or implementations) of a task have no prefix. For example, a ranker might have the following two parameters.
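
      "outputLength": "1000"      (no prefix; applies to every type of ranker)
      "BM25:k_1": "0.75"          ("BM25:" prefix; applies only to the BM25 ranker)

Both of these appear in the example parameter file above.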

Later assignments will have many parameters. This naming scheme is intended to help you keep track of which task, and which type of task, each parameter applies to.

HW1 Parameters

Your software must support the following parameters.

See the HW1 Testing page for example parameter files.

 

New Capabilities

Your search engine must support the following capabilities. In the descriptions below, you must implement everything that is not already provided in QryEval.

  1. Retrieval models:

    1. Unranked Boolean retrieval: Every document that matches the query gets a score of 1. (Provided in QryEval.)
    2. Ranked Boolean retrieval: The score for matching a query term is its term frequency (tf) in the document.
    3. BM25 retrieval: The score for matching a query term is a tf.idf score.
  2. Query operators:

    1. OR: return a document if at least one of the query arguments occurs in the document. Use the MAX function to combine the scores from the query arguments. (Provided in QryEval.) Used by the Unranked Boolean and Ranked Boolean retrieval models.
    2. AND: return a document if all of the query arguments occur in the document. Use the MIN function to combine the scores from the query arguments. Used by the Unranked Boolean and Ranked Boolean retrieval models.
    3. NEAR/n: return a document if all of the query arguments occur in the document, in order, with no more than n-1 terms separating two adjacent terms. For example, #NEAR/2(a b c) matches "a b c", "a x b c", "a b x c", and "a x b x c", but not "a x x b c". The tf for a matching document is either 1 (Unranked Boolean) or the number of matches (Ranked Boolean, BM25). Used by all retrieval models.
    4. WINDOW/n: return a document if all of the query arguments occur in the document in any order within a window of n terms. For example, #WINDOW/5(a b c) matches "c b a", "b x a c", "a b x c", and "b x c x a", but not "a x x b b" or "a x b x x c". The tf for a matching document is either 1 (Unranked Boolean) or the number of matches (Ranked Boolean, BM25). Used by all retrieval models.
    5. SYN: return a document if it matches any of the query arguments. The score for a matching document is either 1 (Unranked Boolean) or the number of matches (Ranked Boolean). (Provided in QryEval.) Used by all retrieval models.
    6. SUM: return a document if at least one of the query arguments occurs in the document. Use summation to combine the scores from the query arguments. Used by BM25.
    7. WSUM: return a document if at least one of the query arguments occurs in the document. Use weighted summation to combine the scores from the query arguments. Used by BM25.

  3. Default query operator: Unstructured queries (e.g., "apple pie recipes") must use a default query operator. #AND is the default operator for Unranked and Ranked Boolean retrieval models. #SUM is the default operator for the BM25 retrieval model.

    After you implement the #AND operator, you will need to change the default operator for the UnrankedBoolean retrieval model from #OR to #AND.

  4. Document fields: Your software must support the five fields provided by the index. (Provided in QryEval.)

    1. url: Terms extracted from the document url.
    2. keywords: Terms that the author provided in the <meta name="Keywords" content="..."> HTML tag, if any.
    3. inlink: Terms that other documents use in hyperlinks to this document, if any.
    4. title: Terms that the author provided in the <title> HTML element, if any.
    5. body: Terms extracted from the document body. This is the main document content.

    The field name (if any) is specified in the query using a suffix-based syntax of the form 'term.field', as in 'apple.title'. If the query does not specify a field (e.g., if the user types 'apple'), your software should default to using the 'body' field.

  5. trec_eval output: QryEval writes rankings in a format that enables trec_eval to generate standard metrics. You will need to make some improvements.

    Search results must be sorted by their scores (primary key, descending order) and their external document ids (secondary key, alphabetic order; used for tie scores). Do the sort before you pass results to the TeIn object.

    When a query retrieves no documents, write a line with a non-existent document id (e.g., 'dummy' or 'Nonexistent_Docid') and a score of 0.

    The last field of each line is a run identifier. It may be anything. You may find this convenient for labeling different experiments.
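
    A minimal sketch of these two output requirements, assuming (as in the Architecture Overview) that a ranking is a list of (external docid, score) pairs. The helper below is hypothetical; the TeIn class itself is part of QryEval, and this only shows the ordering and the empty-result placeholder.

        # Hypothetical helper: sort a ranking as required, and insert a
        # placeholder line when the query retrieved no documents.

        def prepare_for_output(ranking):
            # Primary key: score, descending.  Secondary key: external document
            # id, ascending alphabetic order (breaks ties deterministically).
            ranking = sorted(ranking, key=lambda pair: (-pair[1], pair[0]))

            # If the query retrieved nothing, emit one placeholder line with a
            # non-existent docid and a score of 0 so trec_eval still sees the query.
            if not ranking:
                ranking = [("dummy", 0.0)]
            return ranking

        print(prepare_for_output([("doc-b", 2.5), ("doc-a", 2.5), ("doc-c", 7.0)]))
        # -> [('doc-c', 7.0), ('doc-a', 2.5), ('doc-b', 2.5)]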

 

Implementation Sequence

HW1 requires you to understand and make several extensions to the QryEval search engine architecture. It can be a little difficult to know where to start. We recommend the following path.

Your Unranked Boolean retrieval model supports the #OR operator. Extend it to cover the #AND operator. There are four main steps.

Unranked Boolean is now finished.

Implement the Ranked Boolean retrieval model. There are four main steps.

Ranked Boolean is now finished.

Implement BM25. There are five main steps, but the work is similar to what you did for Ranked Boolean.

BM25 is now finished.

Implement the #NEAR operator. There are three main steps.

If you have done it correctly, #NEAR works for all three retrieval models.

Finally, implement the #WINDOW operator. There are three main steps.

If you have done it correctly, #WINDOW works for all three retrieval models.
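
The trickiest part of #NEAR and #WINDOW is the positional matching inside a single document. The sketch below shows one possible way, not necessarily how the provided or reference code is structured, to compute the #NEAR/n tf from plain Python position lists, following the definition in the New Capabilities section; each matched position is consumed by at most one match. #WINDOW/n needs a similar loop, except that the arguments may appear in any order within a window of n terms.

    # One possible way to compute the tf of #NEAR/n within a single document,
    # given a sorted list of positions for each query argument.  Each matched
    # position is used by at most one match.

    def near_tf(position_lists, n):
        """position_lists[i]: sorted positions of the i-th argument in the doc."""
        ptrs = [0] * len(position_lists)
        tf = 0
        while ptrs[0] < len(position_lists[0]):
            prev = position_lists[0][ptrs[0]]
            matched = True
            for i in range(1, len(position_lists)):
                positions = position_lists[i]
                # Advance this argument's pointer past the previous argument's position.
                while ptrs[i] < len(positions) and positions[ptrs[i]] <= prev:
                    ptrs[i] += 1
                if ptrs[i] == len(positions):
                    return tf                      # this argument is exhausted
                if positions[ptrs[i]] - prev > n:  # too far apart; try the next
                    matched = False                # position of the first argument
                    break
                prev = positions[ptrs[i]]
            if matched:
                tf += 1
                ptrs = [p + 1 for p in ptrs]       # consume the matched positions
            else:
                ptrs[0] += 1
        return tf

    # "#NEAR/2(a b)" over the text "a x b a b": a occurs at 0 and 3, b at 2 and 4.
    print(near_tf([[0, 3], [2, 4]], 2))            # -> 2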

 

Testing

There are three ways to test your software.

  1. Run the tests on your computer and check results locally, using the files that were downloaded above (in the TEST_DIR directory). This is the fastest option.

    Mac, Linux, and Windows trec_eval executables are available. You can also download the trec_eval source from GitHub and compile your own. Run the command as shown below.

         trec_eval-9.0.4 -m num_q -m num_ret -m num_rel -m num_rel_ret -m map -m recip_rank -m P -m ndcg -m ndcg_cut -m recall.100,500,1000 cw09a.adhoc.1-200.qrel.indexed <test case>.teIn

    Store the output in a .teOut file; a scripted way to do this is sketched after this list. The reference results are in TEST_DIR; your own results are probably written to OUTPUT_DIR.

  2. Run the tests on your computer using the files that were downloaded above (in the TEST_DIR directory), save the results to a file in trec_eval format, and upload the file to the trec_eval web service. This is reasonably quick if you use a script to upload results to the service.

  3. Package your source code in a .zip file, and upload the file to the homework testing service. This is the slowest option, but we use this service to assess your software when we grade your homework, so we strongly suggest that you use it at least a couple of times before making your submission for grading.
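
The helper below is one way to script option 1: it runs the trec_eval command shown above and stores the output in a .teOut file. The executable name and the path to the .qrel file are assumptions (they depend on which version you downloaded and where you unpacked the relevance judgments); adjust them to match your INPUT_DIR.

    # One way to script option 1: run trec_eval and save its output as .teOut.
    # The executable and .qrel paths are assumptions; adjust them for your setup.

    import subprocess

    def run_trec_eval(tein_path, teout_path,
                      trec_eval="INPUT_DIR/trec_eval-9.0.4",
                      qrels="INPUT_DIR/cw09a.adhoc.1-200.qrel.indexed"):
        cmd = [trec_eval,
               "-m", "num_q", "-m", "num_ret", "-m", "num_rel", "-m", "num_rel_ret",
               "-m", "map", "-m", "recip_rank", "-m", "P",
               "-m", "ndcg", "-m", "ndcg_cut", "-m", "recall.100,500,1000",
               qrels, tein_path]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        with open(teout_path, "w") as f:
            f.write(result.stdout)

    run_trec_eval("OUTPUT_DIR/HW1-Train-0.teIn", "OUTPUT_DIR/HW1-Train-0.teOut")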

These web services are accessed via the HW1 testing web page.

 

Important Advice

Parameter files: You must write parameter files for your experiments. We strongly recommend that you write a script to generate .param files automatically, e.g., CSV file → set of .param files (each row is a parameter, each column is an experiment): create a Python dict for each column, convert it to json, and write it to a file. This is entirely optional, but you will be generating these files for five homework assignments, and doing it manually gets tedious and error prone.
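
A minimal sketch of the idea, using the BM25 parameter file from the Parameter Files section as a template; the (k_1, b) values being swept and the output locations are only examples.

    # A minimal sketch of generating .param files from Python, using the BM25
    # example from the Parameter Files section as a template.  The (k_1, b)
    # values and file locations below are only examples.

    import json

    def write_param(path, k_1, b):
        params = {
            "indexPath": "INPUT_DIR/index-cw09",
            "queryFilePath": "TEST_DIR/HW1-Train-2.qry",
            "task_1:ranker": {
                "type": "BM25",
                "outputLength": "1000",
                "BM25:k_1": str(k_1),
                "BM25:b": str(b),
            },
            "task_2:output": {
                "type": "trec_eval",
                "outputPath": path.replace(".param", ".teIn"),
                "outputLength": "1000",
            },
        }
        with open(path, "w") as f:
            json.dump(params, f, indent=4)

    # One .param file per parameter setting you want to test.
    for k_1, b in [(0.75, 0.3), (1.2, 0.5)]:
        write_param(f"OUTPUT_DIR/bm25-k{k_1}-b{b}.param", k_1, b)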

Producing tables: Each experiment produces several .teOut files. Manually copying results from multiple .teOut files to create tables for your reports is tedious and error prone. It is faster to use a script to merge .teOut files into a .csv file. A spreadsheet makes it easier for you to quickly recognize trends and create tables for your report. te2csv.py is simple software for this task; or, you may write your own software.

Automatic backups: Every semester, someone's laptop dies or is stolen, and they lose everything. Don't be that person. Best practice is to use an automatic backup service such as CrashPlan (used by CMU faculty and staff), Idrive, Backblaze, Arq, Acronis, or Carbonite. If something bad happens, your machine can be rebuilt in a few hours. If you don't have this already, now is a good time to start. It is a basic professional skill for Computer Scientists.

 

FAQ

If you have questions not answered here, see the HW1 FAQ.