- builtins.object
- InvList
class InvList(builtins.object)
InvList(fieldString, termString=None)
Create, access, and manipulate inverted lists. This inverted
list datatype is intended to be simpler to use than Lucene's.
Attributes:
ctf:
Collection term frequency. The total number of term occurrences
(positions) in the inverted list.
df:
Document frequency. The number of documents in the inverted list.
postings:
A list of DocPosting objects (see below).
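As a rough illustration of how these three attributes relate, the sketch below computes df and ctf from a small hand-made postings list. It uses plain (docid, positions) tuples in place of DocPosting objects. The data and the helper name `summarize` are made up for illustration and are not part of InvList.

```python
# Hypothetical stand-in data: each posting is (docid, positions).
postings = [
    (3, [1, 7]),
    (8, [4]),
    (15, [2, 9, 30]),
]


def summarize(postings):
    """Return (df, ctf) for a list of (docid, positions) postings."""
    # df: one posting per document that contains the term.
    df = len(postings)
    # ctf: total number of term occurrences across all documents.
    ctf = sum(len(positions) for _, positions in postings)
    return df, ctf


df, ctf = summarize(postings)
print(df, ctf)
```

Here df counts postings and ctf counts positions, so ctf is never smaller than df when every posting holds at least one position.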
Methods defined here:
- __init__(self, fieldString, termString=None)
- If no termString is provided, create an empty inverted list.
Otherwise, fetch the inverted list for that term from the index.
fieldString: The name of a document field.
termString: A lexically-processed term that may be in the corpus.
- __str__(self)
- Render the inverted list as a string, using the old InspectIndex format.
- appendPosting(self, docid, positions)
- Append a posting to the posting list. Postings must be appended
in docid order; otherwise this method fails.
docid: An internal document id (an integer).
positions: A list of document locations where the term occurs.
Returns True if the posting was added, otherwise False.
- getDocid(self, n)
- Get the n'th document id from the inverted list.
n: An integer from 0 to df-1 that indicates which document
in the inverted list.
Returns the internal docid of the n'th document.
- getTf(self, n)
- Get the term frequency in the n'th document of the inverted list.
n: An integer from 0 to df-1 that indicates which document
in the inverted list.
Returns the term frequency in the n'th document.
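The methods above might be exercised as in the following minimal sketch. `SketchInvList` is a hypothetical in-memory stand-in written only for this example, not the real InvList implementation. It assumes that "docid order" means strictly increasing docids and that a failed append simply returns False.

```python
class SketchInvList:
    """Hypothetical in-memory stand-in; NOT the real InvList class."""

    def __init__(self):
        self.ctf = 0
        self.df = 0
        self.postings = []  # list of (docid, positions) tuples

    def appendPosting(self, docid, positions):
        # Assumption: docids must be strictly increasing.
        if self.postings and docid <= self.postings[-1][0]:
            return False
        self.postings.append((docid, list(positions)))
        self.df += 1
        self.ctf += len(positions)
        return True

    def getDocid(self, n):
        # n ranges from 0 to df-1.
        return self.postings[n][0]

    def getTf(self, n):
        # Term frequency is the number of positions in the n-th posting.
        return len(self.postings[n][1])


inv = SketchInvList()
inv.appendPosting(2, [5, 11])
inv.appendPosting(9, [1])
rejected = inv.appendPosting(4, [3])  # docid 4 arrives after 9, so it is refused

# Walk the list by index, n = 0 .. df-1.
pairs = [(inv.getDocid(n), inv.getTf(n)) for n in range(inv.df)]
print(rejected, pairs)
```

Checking the return value of appendPosting is worthwhile in this sketch, since a refused posting leaves df and ctf unchanged.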
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)
Data and other attributes defined here:
- DocPosting = <class 'InvList.InvList.DocPosting'>
- Utility class that makes it easier to construct postings.
docid: An internal document id (an integer).
locations: A list of document locations.
Attributes:
docid:
An internal document id.
tf:
Frequency of the term in the document; equal to the length
of the positions list.
positions:
A list of locations in the document where the term occurs.
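A posting along these lines could be sketched as follows. `SketchDocPosting` is a hypothetical stand-in, not the real InvList.DocPosting; it only mirrors the documented relationship between the constructor arguments (docid, locations) and the attributes (docid, tf, positions).

```python
class SketchDocPosting:
    """Hypothetical stand-in; NOT the real InvList.DocPosting."""

    def __init__(self, docid, locations):
        self.docid = docid
        # Copy the caller's list so later changes to it do not alter the posting.
        self.positions = list(locations)
        # tf is derived from the positions list rather than passed in.
        self.tf = len(self.positions)


p = SketchDocPosting(12, [3, 18, 40])
print(p.docid, p.tf, p.positions)
```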