Information Retrieval Quiz: Trivia!

10 Questions | Total Attempts: 2421

Questions and Answers
  • 1. 
    A large repository of documents in IR is called:
    • A. 

      Corpus

    • B. 

      Database

    • C. 

      Dictionary

    • D. 

      Collection

  • 2. 
    A postings list should be sorted by:
    • A. 

      Document Frequency

    • B. 

      DocID

    • C. 

      TermID

    • D. 

      Term frequency

  • 3. 
    To optimize a query that intersects two postings lists, we should:
    • A. 

      Process in the order of increasing document frequency

    • B. 

      Process in any order

    • C. 

      Process in the order of decreasing document frequency

    • D. 

      None of the above

  • 4. 
    A term-document incidence matrix is:
    • A. 

      Dense

    • B. 

      Sparse

    • C. 

      Depends upon the data

    • D. 

      Cannot predict

  • 5. 
    Lemmatization is a technique for:
    • A. 

      Ranking documents

    • B. 

      Case folding

    • C. 

      Normalization

    • D. 

      Tokenization

  • 6. 
    A model of information retrieval in which we can pose any query whose search terms are combined with the operators AND, OR, and NOT is called:
    • A. 

      Ad hoc retrieval model

    • B. 

      Ranked retrieval model

    • C. 

      Boolean retrieval model

    • D. 

      Proximity query model

  • 7. 
    The number of times that a word or term occurs in a document is called:
    • A. 

      Document frequency

    • B. 

      Collection frequency

    • C. 

      Term frequency

    • D. 

      Indexing granularity

  • 8. 
    Stemming increases the size of the vocabulary.
    • A. 

      True

    • B. 

      False

  • 9. 
    A crude heuristic process that chops off the ends of words to reduce their inflectional forms and shrink the vocabulary is called:
    • A. 

      Lemmatization

    • B. 

      Case folding

    • C. 

      True casing

    • D. 

      Stemming

  • 10. 
    In information retrieval, extremely common words that appear to be of little value in helping select documents, and that are excluded from the index vocabulary, are called:
    • A. 

      Stop words

    • B. 

      Tokens

    • C. 

      Lemmatized words

    • D. 

      Stemmed terms
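
For readers who want to see the vocabulary of this quiz in action, the following is a minimal sketch of a toy inverted index and a postings-list intersection for a Boolean AND query. The function names `build_index` and `intersect`, the sample documents, and the whitespace tokenizer with case folding are all assumptions made for illustration; they are not part of the quiz. The sketch also assumes postings are kept in ascending docID order, which is one common convention.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the list of docIDs that contain it.

    Assumption: tokens are lowercased, whitespace-separated words.
    DocIDs are visited in ascending order, so every postings list
    ends up sorted by docID.
    """
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # set() keeps one posting per (term, document) pair
        for term in set(docs[doc_id].lower().split()):
            index[term].append(doc_id)
    return dict(index)


def intersect(p1, p2):
    """Merge-style intersection of two postings lists sorted by docID."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical three-document collection
docs = {1: "new home sales", 2: "home sales rise", 3: "new home prices"}
index = build_index(docs)

# Document frequency of a term = length of its postings list
doc_freq = {term: len(postings) for term, postings in index.items()}

# Boolean query: new AND sales
hits = intersect(index["new"], index["sales"])
```

In this toy collection, `index["home"]` holds three docIDs while `index["new"]` and `index["sales"]` hold two each, and the query `new AND sales` matches only document 1.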