Data Mining Trivia Quiz

51 Questions | Total Attempts: 10241

Data Mining Quizzes & Trivia

In a world run and managed using soft technology, data has become an essential part of human existence. The quantity of data generated and stored every day makes it hard to retrieve when needed. How much do you know about this subject? Take the quiz below and see how well you do. All the best!


Questions and Answers
  • 1. 
    What is Data?
  • 2. 
    An _____ is a property or characteristic of an object. Examples: eye color of a person, temperature, etc.
  • 3. 
    A collection of attributes describes an object.
    • A. 

      True

    • B. 

      False

  • 4. 
    _______ values are numbers or symbols assigned to an attribute
  • 5. 
    The same attribute can be mapped to different attribute values.
    • A. 

      True

    • B. 

      False

  • 6. 
    Different attributes cannot be mapped to the same set of values.
    • A. 

      True

    • B. 

      False

  • 7. 
    What are the different types of attributes?
    • A. 

      Nominal

    • B. 

      Ordinal

    • C. 

      Spatial

    • D. 

      Temperatures

    • E. 

      Interval

    • F. 

      Cardinality

    • G. 

      Ratio

  • 8. 
    Examples of Nominal can be:
    • A. 

      ID Numbers, eye color, zip codes

    • B. 

      Rankings (e.g., taste of potato chips on a scale from 1 to 10), grades, height in {tall, medium, short}

    • C. 

      Calendar dates, temperatures in Celsius or Fahrenheit, phone numbers

    • D. 

      Temperature in Kelvin, length, time, counts

  • 9. 
    Examples of Ordinal can be:
    • A. 

      ID Numbers, eye color, zip codes

    • B. 

      Rankings (e.g., taste of potato chips on a scale from 1 to 10), grades, height in {tall, medium, short}

    • C. 

      Calendar dates, temperatures in Celsius or Fahrenheit, phone numbers

    • D. 

      Temperature in Kelvin, length, time, counts

  • 10. 
    Examples of Interval can be:
    • A. 

      ID Numbers, eye color, zip codes

    • B. 

      Rankings (e.g., taste of potato chips on a scale from 1 to 10), grades, height in {tall, medium, short}

    • C. 

      Calendar dates, temperatures in Celsius or Fahrenheit

    • D. 

      Temperature in Kelvin, length, time, counts

  • 11. 
    Examples of Ratio can be:
    • A. 

      ID Numbers, eye color, zip codes

    • B. 

      Rankings (e.g., taste of potato chips on a scale from 1 to 10), grades, height in {tall, medium, short}

    • C. 

      Calendar dates, temperatures in Celsius or Fahrenheit

    • D. 

      Temperature in Kelvin, length, time, counts

  • 12. 
    The type of a Nominal attribute depends on which of the following properties:
    • A. 

      Distinctness & order

    • B. 

      Distinctness, order & addition

    • C. 

      Distinctness

    • D. 

      All 4 properties

  • 13. 
    The type of an Ordinal attribute depends on which of the following properties:
    • A. 

      Distinctness & order

    • B. 

      Distinctness, order & addition

    • C. 

      Distinctness

    • D. 

      All 4 properties

  • 14. 
    The type of an Interval attribute depends on which of the following properties:
    • A. 

      Distinctness & order

    • B. 

      Distinctness, order & addition

    • C. 

      Distinctness

    • D. 

      All 4 properties

  • 15. 
    The type of a Ratio attribute depends on which of the following properties:
    • A. 

      Distinctness & order

    • B. 

      Distinctness, order & addition

    • C. 

      Distinctness

    • D. 

      All 4 properties

  • 16. 
    A _______ attribute has only a finite or countably infinite set of values, often represented as integer variables. Examples: zip codes, counts, or the set of words in a collection of documents.
  • 17. 
    A _________ attribute has real numbers as attribute values. In practice, real values can only be measured and represented using a finite number of digits. It is typically represented as a floating-point variable.
  • 18. 
    Types of data sets are: 
    • A. 

      Graph

    • B. 

      Categorical

    • C. 

      Gyroscope

    • D. 

      Record

    • E. 

      Counter

    • F. 

      Ordered

  • 19. 
    Record data set consists of: 
    • A. 

      World Wide Web, Molecular Structures

    • B. 

      Spatial Data, Temporal Data, Sequential Data, Genetic Sequence Data

    • C. 

      Generic Data, Inferential Data, Continuous Data

    • D. 

      Data Matrix, Document Data, Transaction Data

  • 20. 
    Graph data set consists of: 
    • A. 

      World Wide Web, Molecular Structures

    • B. 

      Spatial Data, Temporal Data, Sequential Data, Genetic Sequence Data

    • C. 

      Generic Data, Inferential Data, Continuous Data

    • D. 

      Data Matrix, Document Data, Transaction Data

  • 21. 
    Ordered data set consists of: 
    • A. 

      World Wide Web, Molecular Structures

    • B. 

      Spatial Data, Temporal Data, Sequential Data, Genetic Sequence Data

    • C. 

      Generic Data, Inferential Data, Continuous Data

    • D. 

      Data Matrix, Document Data, Transaction Data

  • 22. 
    Important Characteristics of Structured Data are:
    • A. 

      Generality

    • B. 

      Dimensionality

    • C. 

      Resolution

    • D. 

      Spatial

    • E. 

      Sparsity

  • 23. 
    _________ data is data that consists of a collection of records, each of which consists of a fixed set of attributes.
  • 24. 
    Data Matrix is:
    • A. 

      If data objects have the same fixed set of numeric attributes, then the data objects can be thought of as points in a multi-dimensional space, where each dimension represents a distinct attribute

    • B. 

      Such data set can be represented by an m by n matrix, where there are m rows, one for each object, and n columns, one for each attribute

    • C. 

      Neither A nor B

    • D. 

      Both A and B
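
To make the data matrix idea concrete, here is a minimal Python sketch. The values and the `euclidean` helper are hypothetical and chosen only for illustration; they show objects stored as rows, attributes as columns, and each row treated as a point in n-dimensional space.

```python
import math

# Hypothetical data: 3 objects (rows), each with the same 2 numeric attributes (columns).
data_matrix = [
    [1.0, 2.0],  # object 0
    [4.0, 6.0],  # object 1
    [1.0, 2.5],  # object 2
]

m = len(data_matrix)     # number of rows, one per object
n = len(data_matrix[0])  # number of columns, one per attribute


def euclidean(p, q):
    """Distance between two objects viewed as points in n-dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Because every object has the same numeric attributes, distances are well defined.
d01 = euclidean(data_matrix[0], data_matrix[1])  # sqrt(3^2 + 4^2) = 5.0
```

Because each row has the same fixed set of numeric attributes, any pair of objects can be compared with the same distance function.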

  • 25. 
    Each document becomes a ___________ vector
  • 26. 
    __________ data is a special type of record data, where each record involves a set of items
  • 27. 
    Generic graphs and HTML links are examples of _________ data
  • 28. 
    The data that helps to identify substructures is considered to be ___________ data
  • 29. 
    ____________ data is sequences of transactions or genomic sequence data
  • 30. 
    Average Monthly Temperature of land and ocean can be considered a(n) ________ data
  • 31. 
    What are some examples of data quality problems:
    • A. 

      Noise and outliers

    • B. 

      Genomic fields

    • C. 

      Missing values

    • D. 

      Duplicate data

    • E. 

      Strategic values

  • 32. 
    __________ is a systematic variation of measurements from the quantity being measured.
  • 33. 
    ________ is the closeness of measurements to the true value of the quantity being measured.
  • 34. 
    _________ is the closeness of repeated measurements (of the same quantity) to one another.
  • 35. 
    _________ refers to the modification of original values, such as distortion of a person's voice when talking on a poor phone connection or "snow" on a television screen.
  • 36. 
    _________ are data objects with characteristics that are considerably different than most of the other data objects in the data set.
  • 37. 
    _______ data set may include data objects that are duplicates, or almost duplicates of one another.
  • 38. 
    Which seven of these are part of data preprocessing?
    • A. 

      Aggregation

    • B. 

      Distortion

    • C. 

      Sequel Ordering

    • D. 

      Sampling

    • E. 

      Dimensionality Reduction

    • F. 

      Bias Resolution

    • G. 

      Feature subset selection

    • H. 

      Feature creation

    • I. 

      Rendering Objects

    • J. 

      Discretization and Binarization

    • K. 

      Pixelation

    • L. 

      Attribute Transformation

    • M. 

      Attribute Regression

  • 39. 
    _______ is combining two or more attributes (or objects) into a single attribute (or object).
  • 40. 
    What are the purposes of Aggregation:
    • A. 

      Data Reduction

    • B. 

      Resolution

    • C. 

      Change of scale

    • D. 

      More "Stable" data

    • E. 

      Image quarreling
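
As a rough illustration of aggregation, the sketch below combines daily records into monthly totals. The sales figures and the `aggregate_by_month` helper are hypothetical, not taken from the quiz.

```python
from collections import defaultdict

# Hypothetical daily sales records: (date as "YYYY-MM-DD", amount).
daily_sales = [
    ("2024-01-03", 120.0),
    ("2024-01-17", 80.0),
    ("2024-02-05", 200.0),
    ("2024-02-20", 50.0),
    ("2024-02-28", 25.0),
]


def aggregate_by_month(records):
    """Combine daily objects into one object per month by summing the amounts."""
    totals = defaultdict(float)
    for date, amount in records:
        month = date[:7]  # keep "YYYY-MM", dropping the day
        totals[month] += amount
    return dict(totals)


monthly_sales = aggregate_by_month(daily_sales)
# Five daily records become two monthly records: fewer objects, at a coarser scale.
```

Monthly totals also tend to fluctuate less than individual daily values, which is one reason aggregated data is often described as more stable.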

  • 41. 
    Which of the following statements about sampling are true?
    • A. 

      It is the main technique employed for data selection

    • B. 

      Statisticians sample because obtaining the entire set of data of interest is too expensive or time consuming.

    • C. 

      It is used in data mining because processing the entire set of data of interest is too expensive or time consuming.

    • D. 

      Because it is easier and viable to use.

  • 42. 
    The key principle for effective sampling is the following:
    • A. 

      Using a sample will work almost as well as using the entire data set, if the sample is representative

    • B. 

      A sample is representative if it has approximately the same property (of interest) as the original set of data

    • C. 

      If the mean is of interest, then the mean of the sample should be similar to the mean of the full data.

    • D. 

      All of the above
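
The idea of a representative sample can be checked with a small experiment. The sketch below assumes a synthetic population drawn from a normal distribution; the seed, sizes, and variable names are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical full data set: 10,000 values with mean near 50 and spread near 10.
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

# Simple random sample of 500 objects, drawn without replacement.
sample = rng.sample(population, 500)

full_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# If the sample is representative for the mean, this gap should be small
# relative to the spread of the data.
gap = abs(full_mean - sample_mean)
```

With a sample of this size, the gap would be expected to be a small fraction of the standard deviation, though the exact value depends on the random draw.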

  • 43. 
    What are the types of sampling:
    • A. 

      Random

    • B. 

      Without replacement

    • C. 

      With replacement

    • D. 

      Stratified

    • E. 

      Simplified
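
Three common sampling schemes can be sketched with Python's standard `random` module. The helper names and the two-group toy data are assumptions made for this example only.

```python
import random
from collections import defaultdict


def sample_without_replacement(items, k, rng):
    """Each object can be selected at most once."""
    return rng.sample(items, k)


def sample_with_replacement(items, k, rng):
    """The same object may be selected more than once."""
    return [rng.choice(items) for _ in range(k)]


def stratified_sample(items, key, per_group, rng):
    """Split the data into groups, then draw a random sample from each group."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    picked = []
    for members in groups.values():
        picked.extend(rng.sample(members, min(per_group, len(members))))
    return picked


rng = random.Random(42)
# Hypothetical imbalanced data: 95 objects of group "a", 5 objects of group "b".
records = [("a", i) for i in range(95)] + [("b", i) for i in range(5)]

no_repl = sample_without_replacement(records, 10, rng)
with_repl = sample_with_replacement(records, 10, rng)
strat = stratified_sample(records, key=lambda r: r[0], per_group=3, rng=rng)
```

In this toy data a plain random sample of 10 may easily miss group "b" entirely, whereas the stratified version always draws from every group.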

  • 44. 
    Under the curse of dimensionality, when dimensionality increases, data becomes increasingly sparse in the space that it occupies.
    • A. 

      True

    • B. 

      False

  • 45. 
    Under the curse of dimensionality, definitions of density and distance between points, which are critical for clustering and outlier detection, become less meaningful.
    • A. 

      True

    • B. 

      False
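
One way to see why distances lose meaning in high dimensions is to compare the nearest and farthest neighbors of a random point. The sketch below assumes points drawn uniformly from the unit hypercube; `relative_contrast` is a hypothetical helper, and the dimensions and point counts are arbitrary.

```python
import math
import random


def relative_contrast(dim, n_points, rng):
    """(max - min) / min over distances from one random query point to the others."""
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


rng = random.Random(1)
low = relative_contrast(2, 500, rng)     # 2 dimensions
high = relative_contrast(500, 500, rng)  # 500 dimensions

# In high dimensions the nearest and farthest points end up at nearly the
# same distance, so the contrast shrinks toward zero.
```

When the contrast is close to zero, "nearest" and "farthest" are barely distinguishable, which is the difficulty for clustering and outlier detection that the statement describes.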

  • 46. 
    Some purposes of dimensionality reduction include: -- Avoiding the curse of dimensionality -- Reducing the amount of time and memory required by data mining algorithms -- Allowing data to be more easily visualized -- Helping to eliminate irrelevant features or reduce noise
    • A. 

      True

    • B. 

      False

  • 47. 
    Some techniques for dimensionality reduction: -- Principal Component Analysis -- Singular Value Decomposition -- Others: supervised and non-linear techniques
    • A. 

      True

    • B. 

      False
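
Principal Component Analysis is normally computed with a linear algebra library. As a dependency-free sketch, the code below instead uses power iteration on the sample covariance matrix to approximate only the first principal component, then projects the data onto it. The function names and toy data are hypothetical, and the sketch assumes the data has nonzero variance and that the starting vector is not orthogonal to the leading direction.

```python
import math


def column_means(rows):
    n, d = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(d)]


def first_principal_component(rows, iterations=200):
    """Approximate the top eigenvector of the sample covariance by power iteration."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, direction):
    """Reduce each centered object to one coordinate along the given direction."""
    means = column_means(rows)
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
        for r in rows
    ]


# Hypothetical 2-attribute data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
pc1 = first_principal_component(data)  # direction proportional to (1, 2)
reduced = project(data, pc1)           # 2 attributes reduced to 1
```

Because the toy data lies on a single line, one component captures all of its variance; real data would generally need several components and a proper SVD or eigendecomposition routine.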

  • 48. 
    Feature Subset Selection consists of which of the following:
    • A. 

      Another way to reduce dimensionality of data

    • B. 

      Redundant features

    • C. 

      Irrelevant Features

    • D. 

      Techniques

    • E. 

      Systems Approach

    • F. 

      Logic View

  • 49. 
    What are some approaches to Feature Subset Selection?
    • A. 

      Dictionary Hack Approach

    • B. 

      Dynamic brute force approach

    • C. 

      Brute force approach

    • D. 

      Embedded approach

    • E. 

      Filter approach

    • F. 

      Wrapper approach

  • 50. 
    What are the methodologies of Feature Creation?
    • A. 

      Brute Force approach

    • B. 

      Feature Extraction

    • C. 

      Mapping Data to New Space

    • D. 

      Sparsity Feature

    • E. 

      Feature Construction

  • 51. 
    Proximity refers to a ____________ and ___________