Dt pH II Practice 1

By Arafatkazi (Community Contributor) | Questions: 95 | Attempts: 114
1. Data when processed becomes Information

Explanation

From Data Quality

About This Quiz
Dt pH II Practice 1 - Quiz

DT PH II Practice 1 is designed to assess understanding of data management principles, focusing on data quality, processing, and tools. It evaluates the transition of data to information and emphasizes best practices in data quality management.

2. DataStage is an ETL tool

Explanation

DataStage is indeed an ETL (Extract, Transform, Load) tool. ETL tools are used to extract data from various sources, transform it into a suitable format, and load it into a target system or database. DataStage is specifically designed for this purpose, allowing users to create data integration jobs that extract data from different sources, apply transformations, and load it into a target database or data warehouse. Therefore, the correct answer is true.
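
As a rough illustration of what extract, transform, and load mean, here is a minimal sketch in plain Python; it is not DataStage, and the column names and in-memory source are invented stand-ins for a real feed and warehouse.

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a source (an in-memory CSV stands in for a real feed).
source = io.StringIO("id,amount,discount\nA1,100.0,0.10\nA2,40.0,0.00\n")
rows = list(csv.DictReader(source))

# Transform: cast types and derive a net amount column.
for row in rows:
    row["amount"] = float(row["amount"])
    row["net_amount"] = row["amount"] * (1 - float(row["discount"]))

# Load: write the transformed rows into a target table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id TEXT, amount REAL, net_amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(r["id"], r["amount"], r["net_amount"]) for r in rows],
)
print(con.execute("SELECT * FROM orders").fetchall())
```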

3. If a primary key uses multiple columns to identify a record, it is known as a compound key

Explanation

A compound key is used when multiple columns are combined to uniquely identify a record in a database table. This is useful when a single column cannot uniquely identify a record. Therefore, if a primary key uses multiple columns, it is known as a compound key. Hence, the given statement is true.
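
A minimal way to see a compound (composite) key in practice, using SQLite purely for illustration; the table and column names are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Neither order_id nor line_no is unique on its own; the pair of columns is.
con.execute("""
    CREATE TABLE order_lines (
        order_id INTEGER,
        line_no  INTEGER,
        product  TEXT,
        PRIMARY KEY (order_id, line_no)   -- compound (composite) key
    )
""")
con.execute("INSERT INTO order_lines VALUES (1, 1, 'keyboard')")
con.execute("INSERT INTO order_lines VALUES (1, 2, 'mouse')")        # same order, new line: OK
try:
    con.execute("INSERT INTO order_lines VALUES (1, 1, 'monitor')")  # duplicate key pair
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```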

4. Tracing involves audit trails between deleted and surviving customers

Explanation

Tracing refers to the process of establishing connections or links between deleted customers and the ones that still exist. It involves creating an audit trail to track the activities and interactions of these customers. Therefore, the statement "Tracing involves audit trails between deleted and surviving customers" is true.

5. Data masking and mask pattern analysis are used in substituting string patterns

Explanation

Data masking and mask pattern analysis are indeed used in substituting string patterns. Data masking is a technique used to protect sensitive data by replacing it with fictitious but realistic data. It helps to ensure that the original data is not exposed to unauthorized individuals. Mask pattern analysis, on the other hand, involves identifying and analyzing patterns in the masked data to ensure that it follows the desired format and structure. Both of these techniques are commonly employed in data security and privacy measures.
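
The sketch below shows the general idea of pattern-based substitution and mask pattern analysis in plain Python; it is a toy illustration with an invented record, not any vendor's actual masking routine.

```python
import re

record = "Customer 4111-1111-1111-1111 reached support at +1-555-867-5309"

# Mask a card-number-like pattern but keep the last four digits visible.
masked = re.sub(r"\b(?:\d{4}-){3}(\d{4})\b", r"****-****-****-\1", record)

# Mask pattern analysis: reduce each token to a shape (digit -> 9, letter -> A)
# so masked and unmasked values can be compared structurally.
def shape(token: str) -> str:
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", token))

print(masked)
print([shape(t) for t in masked.split()])
```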

6. An expression combining two different fact columns in a table (e.g., sales - discount) can be set as a fact expression

Explanation

In a table, it is possible to combine two different fact columns, such as sales and discount, into a single fact expression. This can be done to calculate the net sales amount after applying the discount. Therefore, the statement is true.

7. Customer merging is matching the best attributes from duplicate records into the surviving record

Explanation

Customer merging is the process of consolidating duplicate customer records into a single, accurate record. It involves identifying the best attribute values from each duplicate record and carrying them into the surviving record. By doing so, businesses eliminate duplicate data, improve data quality, and keep customer information accurate and up to date. Therefore, the statement is true.
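
A simplified sketch of attribute-level survivorship: the "prefer the most recent non-empty value" rule and the sample records are assumptions made for the example, not a specific product's algorithm.

```python
from datetime import date

# Two duplicate records for the same customer, from different source systems.
duplicates = [
    {"name": "J. Smith", "email": "", "phone": "555-0100",
     "updated": date(2016, 3, 1)},
    {"name": "John Smith", "email": "j.smith@example.com", "phone": "",
     "updated": date(2016, 9, 15)},
]

def merge(records):
    """Build the surviving record by taking, per attribute,
    the non-empty value from the most recently updated source."""
    ordered = sorted(records, key=lambda r: r["updated"])
    survivor = {}
    for rec in ordered:                      # later records overwrite earlier ones
        for field, value in rec.items():
            if value not in ("", None):
                survivor[field] = value
    return survivor

print(merge(duplicates))
```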

8. Data quality (MDM) involves avoiding overheads while preparing the DW.

Explanation

Data quality (MDM) is indeed important in avoiding overheads while preparing the data warehouse (DW). Data quality refers to the accuracy, completeness, consistency, and reliability of data, and it plays a crucial role in ensuring that the data used in the DW is reliable and trustworthy. By implementing Master Data Management (MDM) practices, organizations can improve data quality by ensuring that master data is accurate, consistent, and up-to-date. This, in turn, helps to avoid unnecessary costs and inefficiencies associated with poor data quality, ultimately leading to a more effective and efficient data warehouse.

9. Tablespaces span across containers, and tables can span across tablespaces

Explanation

This statement is true because tablespaces in a database can span across multiple containers. A container is a physical storage unit that can be a file or a disk. By spanning across multiple containers, tablespaces can utilize the available storage space efficiently. Additionally, tables within a database can also span across multiple tablespaces. This allows for better management of data and enables partitioning and distribution of tables across different tablespaces based on specific requirements.

10. The default sort for an attribute can be set in the attribute definition itself.

Explanation

The given statement is true because when defining an attribute, it is possible to specify the default sorting order for that attribute. This allows for automatic sorting of data based on the attribute without the need for additional sorting instructions.

11. Data quality audit provides traceability between original and corrected values.

Explanation

Data quality audit is a process that ensures the accuracy and reliability of data. It involves examining data for errors, inconsistencies, and completeness. By conducting a data quality audit, organizations can trace the origin of data and compare it with the corrected values. This helps in identifying the source of errors and discrepancies, enabling organizations to make necessary corrections and improvements. Therefore, the statement that data quality audit provides traceability between original and corrected values is true.

12. Evaluate data quality before building a fully-fledged data warehouse

Explanation

From Data Quality

13. Customer matching is done with fuzzy and intelligent logic.

Explanation

Customer matching is done with fuzzy and intelligent logic, which means that it is not a straightforward and exact process. Fuzzy logic allows for a degree of uncertainty and imprecision in the matching process, taking into account similarities and patterns rather than strict criteria. Intelligent logic implies that the matching system is capable of learning and adapting over time, becoming more accurate and efficient in identifying the right customers for a particular product or service. Therefore, the statement "Customer matching is done with fuzzy and intelligent logic" is true.
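
As a rough illustration of fuzzy (approximate) matching, the standard-library difflib similarity score can stand in for the much richer rules a real matching engine uses; the sample names and the 0.8 threshold are arbitrary choices for this sketch.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

target = "John Smith"
candidates = ["Jon Smith", "Johnny Smithe", "Jane Doe"]

# Pairs above an (arbitrary) threshold are flagged as probable matches.
for name in candidates:
    score = similarity(target, name)
    print(f"{name:15} {score:.2f} {'match' if score > 0.8 else 'no match'}")
```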

14. MDM is maintained at organizational level

Explanation

The statement "MDM is maintained at organizational level" is true. Master Data Management (MDM) refers to the process of creating and managing a single, consistent, and accurate version of an organization's critical data. MDM is typically implemented and maintained at the organizational level to ensure that all departments and systems within the organization have access to and use the same reliable data. By centralizing the management of master data, organizations can improve data quality, reduce data inconsistencies, and enhance decision-making processes.

15. Hierarchies in MicroStrategy are

Explanation

In MicroStrategy, hierarchies are used to organize and structure data in a logical manner. The two types of hierarchies mentioned, System Hierarchy and User Hierarchy, are commonly used in MicroStrategy. System Hierarchy refers to the default hierarchy created by the system based on the attributes and their relationships in the data model. User Hierarchy, on the other hand, allows users to create their own custom hierarchies based on their specific needs and preferences. Therefore, the correct answer includes both System Hierarchy and User Hierarchy as the types of hierarchies in MicroStrategy.

16. Crosswalk allows metadata created by one user to be used by another

Explanation

Crosswalk allows metadata created by one user to be used by another. This means that if one user creates metadata for a specific purpose, another user can access and utilize that metadata for their own purposes. This allows for the sharing and reusability of metadata, promoting collaboration and efficiency among users.

17. Cache size can be changed in DS Administrator

Explanation

The given statement is true because in DS Administrator, the cache size can be modified or adjusted. The DS Administrator is a tool used for managing and configuring various aspects of a system, including the cache. By accessing the DS Administrator, users can change the cache size to optimize performance and storage capacity based on their specific needs and requirements.

18. Which tool extracts data from textual sources?

Explanation

Extraction is the correct answer because it refers to the process of retrieving or extracting data from textual sources. This tool is used to gather information from various text-based documents, such as websites, articles, reports, or social media posts. Extraction tools typically analyze the text and identify relevant data based on specific criteria or patterns. This extracted data can then be further processed, analyzed, or stored for various purposes such as data mining, business intelligence, or research.

19. Rule repository contains Databases or Flat Files

Explanation

The rule repository contains databases or flat files. This means that the repository is used to store and manage rules, which can be stored in either a database or a flat file format. This allows for easy access, retrieval, and management of the rules within the repository. Therefore, the statement is true.

20. Bad quality data affects concurrency and performance.

Explanation

Bad quality data refers to data that is inaccurate, incomplete, inconsistent, or outdated. When dealing with bad quality data, it can lead to issues with concurrency and performance. Concurrency refers to the ability of multiple users to access and manipulate data at the same time. If the data is of poor quality, it can cause conflicts and inconsistencies when multiple users try to access and modify it simultaneously. This can lead to data corruption and hinder the overall performance of the system. Therefore, it is true that bad quality data affects concurrency and performance.

21. Types of BI Metadata

Explanation

The correct answer is a, b, and c because these are all types of BI metadata. OLAP metadata refers to the metadata used in online analytical processing, which involves analyzing multidimensional data. Reporting metadata is used in generating reports and includes information about data sources, report layouts, and filters. Data mining metadata is used in the process of discovering patterns and relationships in large datasets. These three types of metadata are essential components of a business intelligence system, as they help in organizing and understanding data for analysis and reporting purposes.

22. Block indexes for multiple columns produce

Explanation

The correct answer is multidimensional Clusters. When block indexes are created for multiple columns, it allows for the creation of multidimensional clusters. This means that the data is organized and stored in a way that allows for efficient retrieval and analysis of data across multiple dimensions. This can be particularly useful in situations where data needs to be analyzed and compared across different attributes or variables. By using multidimensional clusters, it becomes easier to navigate and query the data, leading to improved performance and accuracy in data analysis.
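
For context, a hedged sketch of the DB2 DDL involved (invented table and column names, shown as text rather than executed): organizing a table by several dimensions creates a multidimensional clustering (MDC) table, and DB2 maintains block indexes on the dimension columns.

```python
# Illustrative DB2 DDL only (invented names); DB2 maintains a block index per
# dimension plus a composite block index when a table is organized by
# multiple dimensions, which is what produces multidimensional clusters.
mdc_ddl = """
CREATE TABLE sales_mdc (
    region   CHAR(3),
    sale_yr  SMALLINT,
    amount   DECIMAL(10,2)
)
ORGANIZE BY DIMENSIONS (region, sale_yr)
"""
print(mdc_ddl)
```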

23. Reports can run with only attributes on the template (and no metrics).

Explanation

Reports can run with only attributes on the template (and no metrics) because attributes provide the dimensions or categories by which data is organized, while metrics provide the quantitative measures or calculations based on those dimensions. By using attributes alone, the report can still display and analyze data based on different categories or dimensions without any specific calculations or quantitative measures. This allows for a more descriptive and categorical analysis of the data.

24. Household matching is for 

Explanation

Household matching refers to the process of matching customer data with household data to identify and group individuals who belong to the same household. This is done to gain a better understanding of customer behavior, preferences, and demographics, which can be valuable for businesses in targeting their marketing efforts and providing personalized experiences. Therefore, the correct answer is customer as household matching is primarily focused on identifying and analyzing customers within a household.

25. Different tablespaces have different page sizes

Explanation

In a database management system, a tablespace is a logical storage container that holds various database objects such as tables, indexes, and views. Each tablespace can have its own specific page size, which determines the size of the data blocks used to store data on disk. This allows for flexibility in optimizing storage and performance based on the specific needs of different database objects. Therefore, it is true that different tablespaces can have different page sizes.
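
A hedged sketch of the DB2 DDL involved (invented names, shown as text rather than executed): a tablespace declares its page size and is paired with a buffer pool of the same page size, so tablespaces in one database can use different page sizes.

```python
# Illustrative DB2 DDL only (invented names): the tablespace's page size must
# match the page size of the buffer pool it is assigned to; 4K is the default.
ddl = """
CREATE BUFFERPOOL bp8k IMMEDIATE SIZE 1000 PAGESIZE 8K;
CREATE TABLESPACE ts_sales PAGESIZE 8K MANAGED BY AUTOMATIC STORAGE BUFFERPOOL bp8k;
CREATE TABLE sales (sale_id INTEGER, amount DECIMAL(10,2)) IN ts_sales;
"""
print(ddl)
```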

26. In which of the following states can a job not be run?

Explanation

A job cannot be run while it is in the aborted state: an aborted job has been forcefully terminated and must be reset before it can be executed again. The other listed states do not by themselves prevent a job from running; the apparent repetition of "Compiled" in the options suggests a typo in the question.

27. Trillium server process requires 

Explanation

The Trillium server process requires an Input Structure (DLL file), an Output Structure (DLL file), and a Parameter file (PAR file). These files are necessary for the Trillium server process to function properly. The Input Structure (DLL file) contains the necessary data and instructions for the server process to process the input data. The Output Structure (DLL file) defines the format and structure of the output data generated by the server process. The Parameter file (PAR file) contains the configuration settings and parameters that govern the behavior of the server process.

28. Types of actions in hierarchy display

Explanation

The correct answer is "All the options" because the question is asking about the types of actions in a hierarchy display. The options listed - Locked, Limited, Entry point, and Filtered - are all valid types of actions that can be present in a hierarchy display. Therefore, the answer is that all of the options listed are types of actions in a hierarchy display.

29. Data quality does not refer to

Explanation

Data quality refers to the accuracy, consistency, and integrity of the data. It ensures that the data is reliable, complete, and free from errors or inconsistencies. However, volume does not fall under the category of data quality. Volume refers to the amount or quantity of data, and while it is important to manage and analyze large volumes of data effectively, it is not directly related to the quality of the data itself.

30. The rules of cleansing are embedded in Trillium's 

Explanation

The correct answer is the Parameter file (PAR). The explanation for this is that the rules of cleansing are embedded in the Parameter file (PAR). This means that the Parameter file contains the specific instructions and guidelines for how data should be cleansed. It likely includes information on what types of data should be removed or corrected, as well as any specific algorithms or processes that should be followed. The Output structure (DLL file) and Input structure (DLL file) are not directly related to the rules of cleansing, so they are not the correct answer.

31. Reason for poor quality of data

Explanation

The poor quality of data can be attributed to several factors. One reason is careless or inaccurate data entry, where individuals responsible for inputting data may make mistakes or not pay attention to detail. Another factor is the absence of stringent rules or processes to validate the data entry, which allows for errors to go unnoticed. Additionally, the lack of a Master Data Management strategy contributes to poor data quality as there is no systematic approach to ensure data accuracy, consistency, and integrity.

32. A filter qualification can combine

Explanation

A filter qualification can combine attribute qualification, metric qualification, report as filter, and relationship in any combination. This means that a filter can be created using one or more attributes, metrics, reports, and relationships. It allows for flexibility in filtering data based on specific attributes, metrics, reports, and relationships, enabling more precise and customized data analysis.

33. Default page size in DB2?

Explanation

The default page size in DB2 is 4 KB. This means that the data in DB2 is stored in pages, and each page has a size of 4 KB. This page size is commonly used because it strikes a balance between efficient storage and efficient retrieval of data. Smaller page sizes would result in more pages and potentially slower performance, while larger page sizes would result in wasted space if the data does not fill up the entire page. Therefore, 4 KB is a commonly used default page size in DB2.

34. The number of CPUs that can be used in DB2 Enterprise Edition

Explanation

The given answer "No Limit" suggests that there is no maximum or set limit on the number of CPUs that can be used in DB2 Enterprise edition. This means that users can utilize as many CPUs as they require, based on their specific needs and system capabilities.

35. Updating schema is required, when we do changes in __________________

Explanation

Updating the schema is required when we make changes in attributes, facts, and hierarchies. This is because these components are essential for defining the structure and organization of a database. When any modifications are made to these elements, the schema needs to be updated to reflect these changes accurately. Therefore, updating the schema is necessary when changes are made to any of these options.

36. Survivorship is a concept used in 

Explanation

Survivorship is a concept used in data de-duplication. Data de-duplication is the process of identifying and removing duplicate data entries from a dataset. Survivorship refers to the process of selecting the most accurate and reliable data entry among the duplicates to be retained in the dataset, while discarding the rest. This ensures that only the most relevant and correct information is retained, improving data quality and reducing storage space requirements.

37. Not a DB2 license method

Explanation

DB2 licensing is typically based on named users or on processor (CPU) capacity, for example through authorized-user or processor-based licenses. Memory is not used as a licensing metric for DB2, which is why Memory is the correct answer: it is not a DB2 license method.

38. Are multiple selections possible in DataStage?

Explanation

Multiple selections are possible in DataStage. This means that users can select and process multiple data elements or records simultaneously. This allows for efficient and streamlined data processing, as it eliminates the need for repetitive manual selection and processing of individual data elements. Users can select multiple data elements based on specific criteria or conditions, and perform actions such as transformation, filtering, or integration on the selected data elements as a group.

39. During which of the operations data is not modified

Explanation

During data profiling, the focus is on analyzing and understanding the data, rather than modifying it. Data profiling involves examining the quality, structure, and content of the data to gain insights and identify any issues or anomalies. This process helps in understanding the data's characteristics, such as its completeness, accuracy, and consistency. Unlike data cleansing and data enrichment, data profiling does not involve making changes or additions to the data. Instead, it aims to provide a comprehensive overview of the data, enabling better decision-making and data management.
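
A minimal profiling pass in plain Python (invented rows and column names) makes the point concrete: it only reads and summarizes the data, it never changes a value.

```python
from collections import Counter

rows = [
    {"country": "IN", "age": "34"},
    {"country": "in", "age": ""},
    {"country": "US", "age": "41"},
    {"country": "US", "age": "forty"},
]

def profile(rows, column):
    """Summarize one column: counts, missing values, distinct values, top values."""
    values = [r[column] for r in rows]
    return {
        "count": len(values),
        "missing": sum(1 for v in values if v == ""),
        "distinct": len(set(values)),
        "top_values": Counter(values).most_common(3),
    }

for col in ("country", "age"):
    print(col, profile(rows, col))
# The source rows are left untouched; only summary statistics are produced.
```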

40. Metadata should be maintained even when 

Explanation

Metadata should be maintained even when the base resource changes because the metadata provides important information about the resource, such as its origin, format, and any restrictions or permissions associated with it. This ensures that the metadata remains accurate and up-to-date, allowing users to effectively search, retrieve, and use the resource. Similarly, when two sources merge together, it is important to maintain the metadata from both sources to preserve the integrity and completeness of the merged data. Lastly, even if the base source is deleted, the metadata should still be retained to provide historical context and reference for any data or resources that were derived from or linked to the base source.

41. Which phase is not affected by clean-up?

Explanation

The question is asking which phase will not be affected by the clean-up. Clean-up is a process of removing unnecessary or unwanted elements. In the context of the given options, acquisition refers to the phase of obtaining or acquiring something. Clean-up is not related to the acquisition phase, as it focuses on organizing and removing unnecessary elements rather than obtaining something new. Therefore, the clean-up will not affect the acquisition phase.

42. The maximum number of attributes that can be set as parent to another attribute is

Explanation

There is no limit to the number of attributes that can be set as a parent to another attribute. This means that an attribute can have any number of parent attributes.

43. Data cleansing and standardization will be taken care of by

Explanation

Data quality tools are specifically designed to identify and correct errors, inconsistencies, and inaccuracies in data. They help in cleansing and standardizing the data by removing duplicate entries, validating data against predefined rules, and ensuring data integrity. These tools can also perform various data enrichment techniques to enhance the overall quality of the data. Therefore, it is logical to conclude that data quality tools will be responsible for data cleansing and standardization.
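
A toy standardization pass of the kind a data quality tool automates; the rules here (trim whitespace, title-case names, strip phone punctuation, map country synonyms) and the sample records are invented for the example.

```python
import re

raw_customers = [
    {"name": "  alice  JONES ", "phone": "(555) 010-0199", "country": "u.s.a"},
    {"name": "BOB mair",        "phone": "555.010.0200",   "country": "USA"},
]

COUNTRY_SYNONYMS = {"u.s.a": "US", "usa": "US", "united states": "US"}

def standardize(rec):
    return {
        "name": " ".join(rec["name"].split()).title(),     # trim and normalize case
        "phone": re.sub(r"\D", "", rec["phone"]),          # keep digits only
        "country": COUNTRY_SYNONYMS.get(rec["country"].lower(), rec["country"]),
    }

print([standardize(r) for r in raw_customers])
```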

44. The best practice in data quality is 

Explanation

From Data Quality

45. During the de-duplication process

Explanation

During the de-duplication process, the original values are kept in trail tables. This means that instead of deleting the original values or placing new values in new tables, the original values are preserved. This allows for a record of the original values to be maintained while still removing any duplicate entries. Keeping the original values in trail tables can be useful for auditing purposes or for historical reference.

46. Order of execution in DataStage is

Explanation

The correct answer is "Stage variable then Constraints then Derivations." In Datastage, the order of execution is important for proper data processing. Stage variables are evaluated first, as they are used to store intermediate values during the data transformation process. Constraints are then applied to filter the data based on certain conditions. Finally, derivations are performed to calculate new values or modify existing ones. This sequence ensures that the stage variables are available for use in constraints and derivations, allowing for accurate data manipulation.

47. In two-tier architecture, how many ODBC connections are there?

Explanation

In two-tier architecture there are two ODBC connections. The client application connects directly to both the metadata repository and the data warehouse: one ODBC connection is used to read the project metadata, and the other is used to run the report SQL against the warehouse database. Hence two separate ODBC connections are required.

48. Fetching data from hard disk to buffer pool is known as pre-fetching

Explanation

The statement is true because pre-fetching refers to the process of fetching data from the hard disk to the buffer pool in advance, anticipating that it will be needed in the near future. This helps to improve the overall performance of the system by reducing the time required to access the data when it is actually needed.

49. Steps to avoid poor quality data

Explanation

The answer suggests three steps to avoid poor quality data. The first step is to set stringent rules in the validation process, and if not possible, then in the ETL (Extract, Transform, Load) process. This ensures that data is thoroughly checked and validated before being used. The second step is de-duplication, which involves removing any duplicate or redundant data entries. This helps in maintaining data integrity and accuracy. The third step is to provide feedback about the quality of data to the source and request them to correct and resend the data. This ensures that the source takes responsibility for the quality of the data they provide.

50. A container is not a

Explanation

A container is not memory. In DB2, a container is the physical storage assigned to a tablespace: it can be a file, a directory, or a raw device. Memory, on the other hand, refers to the physical or virtual storage space a computer uses to hold data and instructions for the processor, so memory is the correct answer.

51. Which is not a data quality tool?

Explanation

DataStage is an ETL tool from IBM, not a data quality tool, which is why it is the correct answer here.

52. The frequency of data count is obtained in

Explanation

Data profiling involves analyzing and examining the data to understand its structure, content, and quality. By conducting data profiling, the frequency of data count can be obtained. This process helps in identifying the patterns, inconsistencies, and anomalies within the data, allowing organizations to gain insights and make informed decisions. Data cleansing, on the other hand, focuses on removing or correcting errors, duplicates, and inconsistencies in the data. Data management refers to the overall process of collecting, storing, organizing, and maintaining data.

53. If a user would want to list the top 10 revenue values by region what type of filter is to be used?

Explanation

To list the top 10 revenue values by region, a metric qualification filter should be used. This type of filter allows the user to filter and sort data based on numerical values, such as revenue. By applying a metric qualification filter, the user can specify the criteria for selecting the top revenue values and display them in the desired order. This filter is specifically designed for filtering and analyzing numerical metrics in a dataset.
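
The effect of such a rank-based qualification can be mimicked in plain Python; the data and field names are invented, and the top 2 is shown instead of the top 10 only to keep the sample small.

```python
from collections import defaultdict

sales = [
    ("North", 120.0), ("South", 95.0), ("North", 40.0),
    ("East", 210.0), ("West", 75.0), ("South", 30.0),
]

# Aggregate revenue by region.
revenue_by_region = defaultdict(float)
for region, amount in sales:
    revenue_by_region[region] += amount

# Rank regions by total revenue and keep the top N (N = 10 in the question, 2 here).
top = sorted(revenue_by_region.items(), key=lambda kv: kv[1], reverse=True)[:2]
print(top)
```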

54. Database partition is known as 

Explanation

A database partition is known as a node. In a distributed database system, a node refers to a single server or computer that stores a portion of the database. Each node is responsible for managing a specific partition of the data, allowing for efficient storage and retrieval of information. Nodes can communicate and coordinate with each other to ensure data consistency and availability across the entire database system.

55. Grant permission can be given by

Explanation

The correct answer is DataStage Administrator because the DataStage Administrator is responsible for managing and administering the DataStage environment. They have the authority to grant permissions to users and control access to various DataStage components such as DataStage Manager, DataStage Director, and DataStage Designer.

56. What is the language used in a data quality tool?

Explanation

The correct answer is C. The C programming language is commonly used in data quality tools. C is a powerful and efficient language that allows for low-level programming and direct memory manipulation, making it well-suited for tasks such as data processing and analysis. Many data quality tools are written in C or have components written in C to optimize performance and ensure accurate and reliable data management.

57. What is the default connection timeout in DataStage?

Explanation

The default connection timeout in DataStage is 86400 seconds (24 hours). If a connection remains idle for longer than that, it is terminated.

58. A fact can have different expressions based on the table against which it is evaluated.  

Explanation

This statement is true. In MicroStrategy, a fact can be defined with more than one fact expression, each mapped to columns in a different source table. When a report runs, the SQL Engine uses the expression that corresponds to the table being evaluated, so the same fact can take different expressions depending on the table.

59. Which of the following is not an IBM product?

Explanation

The Analysis stage is not an IBM product. The other options, Meta stage, Quality Stage, and Profile Stage, are all IBM products used in data integration and data quality management. However, the Analysis stage does not correspond to any known IBM product in this context.

60. In which type of filter the SQL is not changed?

Explanation

A view filter is a type of filter in SQL where the SQL query is not changed. It is used to filter the data displayed in a view, which is a virtual table created by a query. The view filter allows you to specify conditions or criteria that restrict the data shown in the view, without modifying the underlying SQL query. This means that the original SQL query remains the same, but the view filter modifies the result set returned by the query based on the specified filter conditions.

61. Schema objects are

Explanation

The given answer is correct because schema objects include various components such as facts, tables, transformations, partition mappings, attributes, hierarchies, and functions/operators. These objects are essential for organizing and representing data in a database or data warehouse. Facts represent the numerical data or metrics, tables store the structured data, transformations modify or manipulate the data, partition mappings define how data is distributed across multiple storage devices, attributes describe the characteristics of the data, hierarchies represent the relationships between data elements, and functions/operators perform calculations or operations on the data.

62. Metadata creation tools

Explanation

The given options are all different types of metadata creation tools. Templates are pre-designed structures or forms that help in organizing and standardizing metadata. Mark-Up tools are used to add metadata tags or labels to content. Extraction tools are used to extract metadata from various sources or documents. Conversion tools are used to convert metadata from one format to another.

63. During which of the operations data is not modified

Explanation

Data profiling is the process of analyzing and understanding the structure, content, and quality of data. It involves examining the data to identify patterns, inconsistencies, and anomalies. During data profiling, the data itself is not modified or changed in any way. Instead, it focuses on gathering information about the data, such as its type, format, and distribution. This analysis helps in understanding the data better and making informed decisions about data cleansing or enrichment processes.

64. Trillium source 

Explanation

The correct answer is "Flat files, fixed width". This means that the Trillium source can be obtained from flat files that have a fixed width format. This format is used to store data where each field has a specific length, and the data is aligned accordingly. The fixed width format is commonly used when the data needs to be imported or exported into systems that require a specific layout.

65. Which of the following databases is used in the DS repository?

Explanation

UniVerse is the correct answer. The DataStage server engine is built on the UniVerse database, and the DataStage (DS) repository, which holds jobs, table definitions, and other components, is stored in it. That is why UniVerse is the database used for the DS repository.

66. Formula automatically gets updated in fact column by which metrics

Explanation

Smart Metrics automatically update the formula in the fact column. Unlike Nested Metrics, Compound Metrics, and Derived Metrics, Smart Metrics have the ability to dynamically adjust their calculations based on changes in the underlying data. This ensures that the metrics always reflect the most up-to-date information without requiring manual updates to the formula.

67. Metadata storage formats

Explanation

The given answer is correct because XML is a format that can be easily read and understood by humans. It uses tags to define elements and attributes to provide additional information about those elements. On the other hand, binary formats are not designed to be read by humans as they consist of binary data that is encoded in a way that is efficient for computers to process. Therefore, XML is a human-readable format, while binary formats are non-human readable.

68. Report view mode in MicroStrategy

Explanation

The report view mode in MicroStrategy allows users to choose between different display formats for their reports. The available options include Grid, Graph, SQL, and Grid graph mode. Grid mode presents the data in a tabular format, Graph mode displays the data in graphical charts and visualizations, SQL mode allows users to directly view and manipulate the underlying SQL query, and Grid graph mode combines both grid and graph views for a comprehensive analysis of the data.

69. Basic Functionalities of Trillium

Explanation

Trillium offers several basic functionalities, including data profiling, data quality, and data enrichment. Data profiling involves analyzing and understanding the characteristics and quality of data. Data quality refers to the accuracy, completeness, consistency, and reliability of data. Data enrichment involves enhancing the existing data with additional information to provide more insights and value. These functionalities are essential for organizations to ensure that their data is reliable, accurate, and useful for decision-making purposes.

70. When we import a job, the job will be in which state?

Explanation

When a job is imported, it will be in the "Not compiled" state. This means that the job has not been compiled or validated yet. In order to run the job successfully, it needs to be compiled first to check for any errors or issues. Therefore, when a job is imported, it is initially in the "Not compiled" state until it is compiled and validated.

71. Which option in the metric editor allows the user to calculate a metric 6 months prior to the supplied month value?

Explanation

The Transformation Metric option in the metric editor allows the user to calculate a metric 6 months prior to the supplied month value. This option provides the functionality to apply transformations or calculations on the metric data, enabling the user to manipulate the data and derive the desired result. By utilizing this option, the user can easily calculate the metric value for a specific time period in the past, in this case, 6 months prior to the supplied month value.

72. How to make a percentage calculation (a/b) metric automatically calculate average in place of sum for sub-totals when the sub-totals specified at the report level is sum.

Explanation

Smart Metrics are a type of metric that can automatically calculate averages instead of sums for sub-totals when the sub-totals specified at the report level are sums. This means that when using a percentage calculation (a/b) metric, the Smart Metrics feature will intelligently adjust the calculation to provide average values for sub-totals instead of sum values. This can be useful in scenarios where average values are more meaningful or relevant for analyzing data.
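
A tiny worked example with invented figures shows why a plain sum is the wrong sub-total for a percentage metric, which is the behaviour the smart-metric setting is meant to avoid; whether a tool then shows the row average or recomputes the ratio from the sub-totalled components depends on its configuration, the point being that the raw sum is not meaningful.

```python
rows = [
    {"sold": 30, "stock": 60},   # 50 % sell-through
    {"sold": 10, "stock": 100},  # 10 % sell-through
]

ratios = [r["sold"] / r["stock"] for r in rows]

subtotal_sum = sum(ratios)                       # 0.60 -> a misleading "60 %"
subtotal_avg = sum(ratios) / len(ratios)         # 0.30 -> simple average of the rows
subtotal_recomputed = (sum(r["sold"] for r in rows)
                       / sum(r["stock"] for r in rows))   # 40 / 160 = 0.25

print(subtotal_sum, subtotal_avg, subtotal_recomputed)
```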

73. If the user wants particular attribute to be displayed in the report output, include the attribute in 

Explanation

The correct answer is "Report display form" because the report display form is the specific form or section where the user can select and specify which attributes they want to be displayed in the report output. This form allows the user to customize the content of the report by including specific attributes that are relevant to their needs.

74. Types of DW Metadata

Explanation

The given answer lists the different types of metadata in a data warehouse. The "Back Room" refers to the metadata that is stored in the back-end of the data warehouse system, such as data transformation rules and data lineage. The "Front Room" refers to the metadata that is exposed to the end users, such as data definitions and business glossaries. "Source System" metadata includes information about the data sources used in the data warehouse. "Data Staging" metadata pertains to the process of loading and transforming data from source systems to the data warehouse. "RDBMS" metadata refers to the metadata associated with the relational database management system used in the data warehouse.

75. Which symbol is used in defining the parameter in DataStage? 

Explanation

The symbol "$" is used in defining the parameter in DataStage. This symbol is commonly used to represent a parameter or variable in many programming languages. In DataStage, parameters are often used to pass values between different stages or jobs, allowing for greater flexibility and reusability of code. By using the "$" symbol, DataStage recognizes that the value following it is a parameter and should be treated as such.

76. MicroStrategy is a 

Explanation

MicroStrategy is a ROLAP (Relational Online Analytical Processing) tool. ROLAP tools are designed to analyze data stored in relational databases. They allow users to perform complex queries and analysis on large datasets by leveraging the power of SQL and relational database management systems. MicroStrategy, as a ROLAP tool, provides functionalities such as drill-down, slice and dice, and data aggregation, enabling users to gain insights and make data-driven decisions.

77. Data cleansing and standardization will be taken care of by

Explanation

Data quality tools are responsible for ensuring the accuracy, completeness, and consistency of data. They perform various tasks such as data cleansing and standardization, which involve identifying and correcting errors, inconsistencies, and duplicates in the data. These tools also validate and verify data against predefined rules and standards to ensure its quality. Therefore, data quality tools are the most suitable option for handling data cleansing and standardization tasks.

78. Effects of locking 

Explanation

Locking is a mechanism used in concurrent programming to ensure that multiple threads or processes can access shared resources in a synchronized manner. When locking is implemented effectively, it can improve concurrency by allowing multiple threads to access shared resources simultaneously without conflicts. However, the process of acquiring and releasing locks can introduce overhead and potentially slow down the overall performance of the system. Therefore, while locking can enhance concurrency, it can also degrade performance.

79. Which client tool is used to create or move projects in DataStage?

Explanation

The DataStage Administrator client tool is used to create or move projects in DataStage. It provides the necessary features and functionalities to manage and administer DataStage projects. This tool allows users to perform tasks such as creating and configuring projects, managing project resources, scheduling and monitoring jobs, and controlling access and security settings. With DataStage Administrator, users have the ability to efficiently manage and organize their DataStage projects, ensuring smooth execution and optimal performance.

80. What are the standards defined for metadata

Explanation

ISO/IEC 11179 and ANSI X3.285 are the standards defined for metadata. ISO/IEC 11179 provides guidelines and specifications for managing and registering metadata in a standardized manner. It defines various aspects of metadata, including its structure, content, and representation. ANSI X3.285, on the other hand, focuses on the syntax and semantics of metadata for the interchange of information. These standards ensure consistency and interoperability in the management and exchange of metadata across different systems and organizations.

81. Update schema updates the information stored in

Explanation

The correct answer is the Metadata database because updating the schema refers to modifying the structure and organization of the database. This includes changes to tables, columns, relationships, and other metadata information. The Metadata database stores this metadata, which describes the structure, content, and other characteristics of the data stored in the database. By updating the schema, the information stored in the Metadata database is modified to reflect the changes made to the database structure.

82. Which client tool is used to schedule, run and validate the job?

Explanation

DataStage Director is the correct answer because it is a client tool used in IBM InfoSphere DataStage to schedule, run, and validate jobs. It provides a graphical interface that allows users to manage and monitor DataStage jobs, view job logs, and troubleshoot any issues that may arise during job execution. DataStage Director also allows users to schedule jobs to run at specific times or intervals, ensuring that data integration processes are executed in a timely and efficient manner.

83. Which client tool is used to import and export components? 

Explanation

DS Manager is the correct answer because it is the client tool used specifically for managing and administering DataStage components. It allows users to import and export components, as well as perform other administrative tasks such as scheduling and monitoring jobs. DataStage Director is used for job monitoring and execution, DataStage Designer is used for designing and developing DataStage jobs, and DataStage Administrator is used for managing and configuring DataStage projects and resources.

84. Where can you see the CPU utilization in DataStage?

Explanation

DataStage Director is the correct answer because it is a tool in DataStage that provides a graphical interface for managing and monitoring DataStage jobs. It allows users to view and analyze CPU utilization, as well as other performance metrics, such as memory usage and job status. The DataStage Director provides a comprehensive overview of the system's performance and allows users to make informed decisions based on the information provided.

85. The transformation information is stored in a table as part of the warehouse in

Explanation

The correct answer is "Table based transformation" because in this type of transformation, the transformation information is stored in a table as part of the warehouse. This means that the transformation logic and rules are defined and maintained within a table structure, allowing for easy access, modification, and management of the transformation information. This approach is often used when there are complex or frequent transformations that need to be applied to the data.

86. Output of a hash file is sorted

Explanation

The output of a hash file is not sorted because a hash file uses a hashing algorithm to store and retrieve data in an unordered manner. The purpose of a hash file is to provide quick access to data based on a key, rather than maintaining a specific order. Therefore, the statement "Output of hash file sorted" is false.
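
A toy Python sketch of why hash-organized storage comes back unordered (this is not DataStage's hashed-file format, and the keys are invented): records land in buckets according to a hash of the key, so reading bucket by bucket does not reproduce key order.

```python
# Place keys into buckets by hash value, then read the buckets back in order.
def bucket_of(key, n_buckets=4):
    return hash(key) % n_buckets

keys = ["delta", "alpha", "charlie", "bravo"]
buckets = {i: [] for i in range(4)}
for key in keys:
    buckets[bucket_of(key)].append(key)

read_back = [key for i in range(4) for key in buckets[i]]
print(read_back)        # order depends on the hash values, not on the keys
print(sorted(keys))     # what a sorted output would look like
```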

87. Default administrator in UNIX is 

Explanation

In UNIX, the default administrator is "Root". The root user has complete control over the system and can perform any task, including modifying system files, installing software, and managing user accounts. The root user is also known as the superuser and has unrestricted access to all commands and directories. This is why it is crucial to exercise caution when using the root account, as any mistakes or malicious actions can have severe consequences on the system.

88. Fast export is used to export all the data 

Explanation

Fast export is not used to export all the data. It is a utility provided by Teradata that allows for high-speed data extraction from a Teradata database to an external file. It is designed for exporting a subset of data based on specific criteria and is particularly useful for large-scale data migration or backup purposes. Therefore, the correct answer is False.

89. Metadata can be classified based on 

Explanation

Metadata can be classified based on various factors, including content, mutability, and logical function. Content refers to the type of information that the metadata describes, such as the title, author, or date of a document. Mutability refers to whether the metadata can be modified or not. Logical function refers to the purpose or role of the metadata within a system or application. These classifications help to organize and manage metadata effectively, allowing for easier retrieval and analysis of information.

90. Where do log files exist?

Explanation

The correct answer is DataStage Director. In DataStage, log files exist in the DataStage Director. DataStage Director is a graphical interface that allows users to manage and monitor DataStage jobs. It provides a centralized location for viewing job status, logs, and statistics. Log files are important for troubleshooting and analyzing job execution, and they can be accessed and reviewed within the DataStage Director interface.

91. When the attribute forms are included in the following category, the attribute elements corresponding to the selected form will be displayed in the data explorer or in the element list selection while creation of filters.

Explanation

When the attribute forms are included in the Browse Display Forms category, the attribute elements corresponding to the selected form will be displayed in the data explorer or in the element list selection while creating filters.

92. Block indexes are preferable for high cardinality columns.

Explanation

Block indexes are not preferable for high cardinality columns. Block indexes are more efficient for low cardinality columns, where there are a limited number of distinct values. For high cardinality columns with a large number of distinct values, a different indexing method, such as B-tree or bitmap indexes, would be more suitable.

Submit
93. We can use a fact directly in the report.

Explanation

In MicroStrategy, a fact is a schema object and cannot be placed directly on a report template. A metric must first be defined on the fact, and it is the metric that is added to the report. Therefore, the correct answer is False.

Submit
94. Match the following 
95. Match the following

Explanation

These are popular Data Quality tools and their Vendors.
