collection.tsv.1: first half of the original MS MARCO passage ranking collection
collection.tsv.2: second half of the original MS MARCO passage ranking collection
(The collection is split into two halves for parallel computing. You can use more splits.)
collection_jsonl.zip: the same MS MARCO passage ranking collection, in JSON format
myalltrain.relevant.docterm_recall: training file
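
If more than two splits are wanted, something like the following sketch could be used to regenerate them. It assumes the unsplit collection is available as a single file with one passage per line; the function name split_collection and the "<path>.<i>" output naming are illustrative choices that mirror the file names above, not the script that produced the original halves.

```python
def split_collection(path, num_splits):
    """Split a line-oriented file into contiguous parts path.1, path.2, ...

    Hypothetical helper: assumes one passage per line, terminated by "\n".
    Returns the list of output paths in order.
    """
    # First pass: count lines so the parts come out roughly equal in size.
    with open(path, encoding="utf-8", newline="\n") as f:
        total = sum(1 for _ in f)
    per_split = -(-total // num_splits)  # ceiling division

    out_paths = []
    # Second pass: stream lines into each part without loading the file in memory.
    with open(path, encoding="utf-8", newline="\n") as src:
        for i in range(1, num_splits + 1):
            out_path = f"{path}.{i}"
            with open(out_path, "w", encoding="utf-8", newline="\n") as dst:
                for _ in range(per_split):
                    line = src.readline()
                    if not line:  # source exhausted; later parts may be empty
                        break
                    dst.write(line)
            out_paths.append(out_path)
    return out_paths
```

For example, split_collection("collection.tsv", 4) would write collection.tsv.1 through collection.tsv.4, each of which can then be processed by a separate worker.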