The AWS Certified Machine Learning - Specialty certification is intended for individuals who perform a development or data science role. It validates a candidate's ability to design, implement, deploy, and maintain machine learning (ML) solutions for given business problems.
K-means clustering
Random Cut Forest (RCF)
XGBoost
BlazingText
Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Use an S3 ACL to open read privileges to the everyone group
Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Copy the JSON dataset from Amazon S3 into the ML storage volume on the SageMaker notebook instance and work against the local dataset.
Launch the SageMaker notebook instance within the VPC, create an S3 VPC endpoint for the notebook to access the data, and define a custom S3 bucket policy that only allows requests from your VPC to access the S3 bucket.
Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Generate an S3 pre-signed URL for access to data in the bucket.
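The S3 VPC endpoint option above pairs the endpoint with a restrictive bucket policy. A minimal boto3 sketch of such a policy, assuming a placeholder bucket name and VPC endpoint ID:

```python
# Minimal sketch (assumed bucket name and VPC endpoint ID) of a bucket policy
# that denies all S3 access except requests arriving through a specific
# S3 VPC endpoint, applied with boto3.
import json
import boto3

bucket = "example-training-data-bucket"      # assumption: placeholder bucket name
vpce_id = "vpce-0123456789abcdef0"           # assumption: placeholder endpoint ID

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessFromOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{bucket}",
                f"arn:aws:s3:::{bucket}/*",
            ],
            "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
        }
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```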
Face landmark filters set to maximum sharpness
Bounding box and confidence score for face comparison threshold tolerances set to max values
Confidence threshold tolerance set to the default
Face collection contents
Use the Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm on the single time series consisting of the full year of data with a predictor_type of regressor.
Use Amazon SageMaker Random Cut Forest (RCF) on the single time series consisting of the full year of data.
Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of data with a predictor_type of regressor.
Use the Amazon SageMaker Linear Learner algorithm on the single time series consisting of the full year of data with a predictor_type of classifier.
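The Linear Learner options differ only in the predictor_type setting. A minimal sketch of where that setting appears in the SageMaker Python SDK, with role, instance settings, and data as placeholders:

```python
# Minimal sketch of the SageMaker Linear Learner estimator with predictor_type
# set explicitly. Role, instance settings, and data are placeholders.
from sagemaker import LinearLearner

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # assumption

linear = LinearLearner(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    # The classification options map to "binary_classifier" or "multiclass_classifier"
    predictor_type="regressor",
)

# train_features / train_labels would be float32 NumPy arrays built from the time series:
# linear.fit(linear.record_set(train_features, labels=train_labels))
```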
Increase the learning rate. Keep the batch size the same.
Reduce the batch size. Decrease the learning rate.
Keep the batch size the same. Decrease the learning rate.
Do not change the learning rate. Increase the batch size.
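All four options adjust the same two training knobs. A minimal Keras sketch (placeholder model and data) showing where the learning rate and batch size are set:

```python
# Minimal Keras sketch showing the two knobs these options adjust:
# the optimizer's learning rate and the fit() batch size (placeholder data/model).
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 20).astype("float32")   # placeholder features
y = np.random.randint(0, 2, size=(1000,))        # placeholder labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Lowering learning_rate (e.g. 0.01 -> 0.001) while keeping batch_size unchanged
# corresponds to the "keep the batch size the same, decrease the learning rate" option.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss="binary_crossentropy")

model.fit(X, y, batch_size=32, epochs=5)
```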
Amazon SageMaker notebook instances are based on EC2 instances within the customer account, but they run outside of VPCs.
Amazon SageMaker notebook instances are based on the Amazon ECS service within customer accounts.
Amazon SageMaker notebook instances are based on EC2 instances running within AWS service accounts.
Amazon SageMaker notebook instances are based on AWS ECS instances running within AWS service accounts.
Use Amazon S3 Standard for all raw data. Use Amazon S3 Glacier Deep Archive for all processed data.
Use Amazon S3 Standard for the processed data that is within one year of processing. After one year, use Amazon S3 Glacier for the processed data. Use Amazon S3 Glacier Deep Archive for all raw data.
Use Amazon Elastic File System (Amazon EFS) for processed data that is within one year of processing. After one year, use Amazon S3 Standard for the processed data. Use Amazon S3 Glacier Deep Archive for all raw data.
Use Amazon S3 Standard for both the raw and processed data. After one year, use Amazon S3 Glacier Deep Archive for the raw data.
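The S3-based options above rely on lifecycle transitions between storage classes. A minimal boto3 sketch, assuming a placeholder bucket and a "raw/" prefix, that moves raw data to Glacier Deep Archive after one year:

```python
# Minimal sketch of an S3 lifecycle rule that keeps objects in S3 Standard and
# transitions them to Glacier Deep Archive after one year.
# Bucket name and "raw/" prefix are assumptions.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-data-lake-bucket",        # assumption: placeholder bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "raw-to-deep-archive-after-1-year",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
                ],
            }
        ]
    },
)
```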
Store datasets as files in Amazon S3.
Store datasets as files in an Amazon EBS volume attached to an Amazon EC2 instance.
Store datasets as tables in a multi-node Amazon Redshift cluster.
Store datasets as global tables in Amazon DynamoDB.
Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
Use AWS Glue to catalogue the data and Amazon Athena to run queries.
Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run queries.
Binarization
One-hot encoding
Tokenization
Normalization transformation
Increase the instance size for training
Increase the batch size in the model
Change the input mode to Pipe
Create an Amazon EBS volume with the data on it and attach it to the training job
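The Pipe-mode option above is a one-line change in the SageMaker Python SDK. A minimal sketch, with the image URI, role, and S3 paths as placeholders:

```python
# Minimal sketch of switching a SageMaker training job to Pipe input mode so data
# is streamed from S3 rather than copied to the training volume first.
# Image URI, role, and S3 paths are placeholders.
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",  # assumption
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",                       # assumption
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    input_mode="Pipe",   # stream the dataset instead of downloading it up front
)

estimator.fit({
    "train": TrainingInput(
        "s3://example-bucket/train/",                      # assumption
        content_type="application/x-recordio-protobuf",
    )
})
```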
Remove the rows containing the missing values
Remove the columns containing the missing values
Fill the missing values with zeros
Impute the missing values using regression
Add regularization to the model
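Several of these options describe imputation strategies. A minimal scikit-learn sketch contrasting zero/mean fills with regression-based imputation on a placeholder array:

```python
# Minimal scikit-learn sketch contrasting simple fills with regression-based
# imputation of missing values (data here is a placeholder array).
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

X = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, np.nan], [7.0, 8.0]])

# Fill with zeros or with the column mean
zeros = SimpleImputer(strategy="constant", fill_value=0).fit_transform(X)
means = SimpleImputer(strategy="mean").fit_transform(X)

# Regression-style imputation: each feature with missing values is modeled
# from the other features.
regressed = IterativeImputer(random_state=0).fit_transform(X)
```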
Ask the social media handling team to review each post using Amazon SageMaker Ground Truth and provide the label
Use the sentiment analysis natural language processing library to determine whether a post requires a response
Use Amazon Mechanical Turk to publish Human Intelligence Tasks that ask Turk workers to label the posts
Use the a priori probability distribution of the two classes. Then, use Monte Carlo simulation to generate the labels
Use K-Means to cluster posts into various groups, and pick the most frequent word in each group as its label
Launch the notebook instances in a public subnet and access the data through the public S3 endpoint
Launch the notebook instances in a private subnet and access the data through a NAT gateway
Launch the notebook instances in a public subnet and access the data through a NAT gateway
Launch the notebook instances in a private subnet and access the data through an S3 VPC endpoint.
Oversampling using bootstrapping
Undersampling
Oversampling using SMOTE
Class weight adjustment
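Two of the listed techniques are easy to show concretely: SMOTE oversampling (via the imbalanced-learn library) and class-weight adjustment (scikit-learn). A minimal sketch with synthetic, imbalanced placeholder data:

```python
# Minimal sketch of two of the listed techniques: SMOTE oversampling
# (imbalanced-learn) and class-weight adjustment (scikit-learn).
# X, y are placeholder data with an imbalanced positive class.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)]

# Oversampling the minority class with synthetic examples
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

# Alternatively, keep the data as-is and reweight the classes in the loss
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```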
Use the scikit-learn python library to build a sentiment analysis service to provide insight data to the marketing team’s internal application platform. Build a dashboard into the application platform using React or Angular.
Use the DetectSentiment Amazon Comprehend API as a service to provide insight data to the marketing team’s internal application platform. Build a dashboard into the application platform using React or Angular.
Use the Amazon Lex API as a service to provide insight data to the marketing team’s internal application platform. Build a dashboard into the application platform using React or Angular.
Use Amazon Translate, Amazon Comprehend, Amazon Kinesis, Amazon Athena, and Amazon QuickSight to build a natural-language-processing (NLP)-powered social media dashboard
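The DetectSentiment option maps directly to a single Amazon Comprehend API call. A minimal boto3 sketch, with region and text as placeholders:

```python
# Minimal boto3 sketch of the Amazon Comprehend DetectSentiment API referenced
# in the options above (region and text are placeholders).
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")
response = comprehend.detect_sentiment(
    Text="The new release is fantastic, setup took five minutes!",
    LanguageCode="en",
)
print(response["Sentiment"], response["SentimentScore"])
```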
Amazon Comprehend syntax analysis and entity detection
Amazon SageMaker BlazingText allow mode
Natural Language Toolkit (NLTK) stemming and stop word removal
Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers
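Two of the listed preprocessing approaches can be sketched in a few lines: NLTK stemming with stop word removal, and a scikit-learn TF-IDF vectorizer (sample sentences are placeholders):

```python
# Minimal sketch of two of the listed preprocessing techniques: NLTK stemming
# with stop word removal, and a scikit-learn TF-IDF vectorizer.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("stopwords")  # one-time download of the stop word list

docs = ["The cats are sitting on the mat", "A dog was chasing the cats"]

stemmer = PorterStemmer()
stops = set(stopwords.words("english"))
cleaned = [
    " ".join(stemmer.stem(w) for w in doc.lower().split() if w not in stops)
    for doc in docs
]

tfidf = TfidfVectorizer()
matrix = tfidf.fit_transform(cleaned)   # sparse document-term matrix
```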
Use the SageMaker batch transform feature to transform the training data into a DataFrame
Use AWS Glue to compress the data into the Apache Parquet format
Transform the dataset into the RecordIO protobuf format
Use the SageMaker hyperparameter optimization feature to automatically optimize the data
Convert the individual sentences into sequences of words. Use those as the input.
Convert the individual sentences into numerical sequences starting from the number 1 for each word in a sentence. Use the sentences as the input.
Vectorize the sentences. Transform them into numerical sequences. Use the sentences as the input.
Vectorize the sentences. Transform them into numerical sequences with a padding. Use the sentences as the input.
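The last two options describe tokenizing sentences into integer sequences and padding them to a fixed length. A minimal Keras sketch with placeholder sentences:

```python
# Minimal Keras sketch of turning sentences into integer sequences and padding
# them to a fixed length, as described in the options above.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

sentences = ["the service was great", "terrible support and slow shipping"]

tokenizer = Tokenizer()                       # word index starts at 1
tokenizer.fit_on_texts(sentences)
sequences = tokenizer.texts_to_sequences(sentences)

padded = pad_sequences(sequences, maxlen=10, padding="post")  # fixed-length input
```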
Bernoulli Distribution
Normal Distribution
Poisson Distribution
Binomial Distribution
Use a random forest by building multiple randomized decision trees and averaging their outputs to get the predictions of the housing prices.
Gather additional training data that gives a more diverse representation of the housing price data.
Use the “dropout” technique to penalize large weights and prevent overfitting.
Use feature selection to eliminate irrelevant features and iteratively train your model until you eliminate the overfitting.
AWS CloudTrail
AWS Health
AWS Trusted Advisor
Amazon CloudWatch
AWS Config
Write the raw data to Amazon S3. Schedule an AWS Lambda function to submit a Spark step to a persistent Amazon EMR cluster based on the existing schedule. Use the existing PySpark logic to run the ETL job on the EMR cluster. Output the results to a “processed” location in Amazon S3 that is accessible for downstream use.
Write the raw data to Amazon S3. Create an AWS Glue ETL job to perform the ETL processing against the input data. Write the ETL job in PySpark to leverage the existing logic. Create a new AWS Glue trigger to trigger the ETL job based on the existing schedule. Configure the output target of the ETL job to write to a “processed” location in Amazon S3 that is accessible for downstream use.
Write the raw data to Amazon S3. Schedule an AWS Lambda function to run on the existing schedule and process the input data from Amazon S3. Write the Lambda logic in Python and implement the existing PySpark logic to perform the ETL process. Have the Lambda function output the results to a “processed” location in Amazon S3 that is accessible for downstream use.
Use Amazon Kinesis Data Analytics to stream the input data and perform real-time SQL queries against the stream to carry out the required transformations within the stream. Deliver the output results to a “processed” location in Amazon S3 that is accessible for downstream use.
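The AWS Glue option above combines a PySpark ETL job with a scheduled trigger. A minimal boto3 sketch; the job name, script location, role, and cron expression are assumptions:

```python
# Minimal boto3 sketch of wiring a PySpark AWS Glue job to a scheduled trigger,
# matching the Glue-based option above. Job name, script location, role, and
# cron expression are assumptions/placeholders.
import boto3

glue = boto3.client("glue")

glue.create_job(
    Name="nightly-etl",                                              # assumption
    Role="arn:aws:iam::123456789012:role/GlueServiceRole",           # assumption
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://example-bucket/scripts/etl_job.py",  # existing PySpark logic
    },
    GlueVersion="3.0",
)

glue.create_trigger(
    Name="nightly-etl-schedule",
    Type="SCHEDULED",
    Schedule="cron(0 2 * * ? *)",   # assumption: run daily at 02:00 UTC
    Actions=[{"JobName": "nightly-etl"}],
    StartOnCreation=True,
)
```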
Dimensionality reduction
Data normalization
Model regularization
Data augmentation for the minority class
Replace each missing value by the mean or median across non-missing values in the same row.
Delete observations that contain missing values because these represent less than 5% of the data
Replace each missing value by the mean or median across non-missing values in the same column.
For each feature, approximate the missing values using supervised learning based on other features.
ROC Curve
Precision
Recall
PR Curve
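All four of these evaluation tools are available in scikit-learn. A minimal sketch computing them from placeholder labels and scores:

```python
# Minimal scikit-learn sketch of the listed evaluation tools: precision, recall,
# ROC curve, and precision-recall (PR) curve, computed from placeholder scores.
import numpy as np
from sklearn.metrics import (precision_score, recall_score,
                             roc_curve, precision_recall_curve, auc)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.3])
y_pred = (y_score >= 0.5).astype(int)

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

fpr, tpr, _ = roc_curve(y_true, y_score)                  # ROC curve points
roc_auc = auc(fpr, tpr)

prec, rec, _ = precision_recall_curve(y_true, y_score)    # PR curve points
```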
Grid Search
Random Search
Breadth First Search
Bayesian optimization
Depth First Search
AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for real-time data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations.
Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for near-real-time data insights; Amazon Kinesis Data Firehose for clickstream analytics; AWS Glue to generate personalized product recommendations.
AWS Glue as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon Kinesis Data Firehose for delivery to Amazon ES for clickstream analytics; Amazon EMR to generate personalized product recommendations.
Amazon Athena as the data catalog; Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics for historical data insights; Amazon DynamoDB streams for clickstream analytics; AWS Glue to generate personalized product recommendations.
Drop the test samples with missing full review text fields, and then run through the test set.
Copy the summary text fields and use them to fill in the missing full review text fields, and then run through the test set.
Use an algorithm that handles missing data better than decision trees.
Generate synthetic data to fill in the fields that are missing data, and then run through the test set.
The factorization machines (FM) algorithm
The Latent Dirichlet Allocation (LDA) algorithm
The principal component analysis (PCA) algorithm
The k-means algorithm
The Random Cut Forest (RCF) algorithm
Write a direct connection to the SQL database within the notebook and pull data in.
Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data Pipeline and provide the S3 location within the notebook.
Move the data to Amazon DynamoDB and set up a connection to DynamoDB within the notebook to pull data in.
Move the data to Amazon ElastiCache using AWS DMS and set up a connection within the notebook to pull data in for fast access.
HyperparameterTunerJob()
HyperparameterTuner()
HyperparameterTuningJobs()
Hyperparameter()
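HyperparameterTuner() is the class name used by the SageMaker Python SDK. A minimal sketch of how it is constructed; the image URI, role, objective metric, and parameter ranges are assumptions (XGBoost-style names):

```python
# Minimal sketch of the SageMaker Python SDK HyperparameterTuner class named in
# the options above. Image URI, role, metric, and ranges are assumptions.
from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",  # assumption
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",             # assumption
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:rmse",          # assumption: XGBoost-style metric
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
    },
    objective_type="Minimize",
    strategy="Bayesian",          # default search strategy; "Random" is also supported
    max_jobs=20,
    max_parallel_jobs=2,
)

# tuner.fit({"train": train_input, "validation": validation_input})
```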
SageMaker Data Wrangler
SageMaker Model Monitor
SageMaker Multi-Model Endpoints
SageMaker Distributed Training
As variable 1 increases, variable 5 increases
As variable 1 increases, variable 5 decreases
Variable 1 does not have any influence on variable 5
The data is not sufficient to make a well-informed interpretation
Logistic regression
Random Cut Forest (RCF)
Principal component analysis (PCA)
Linear regression
Derive a dictionary of tokens from claims in the entire dataset. Apply one-hot encoding to tokens found in each claim of the training set. Send the derived feature space as input to an Amazon SageMaker built-in supervised learning algorithm.
Apply Amazon SageMaker BlazingText in Word2Vec mode to claims in the training set. Send the derived feature space as input for the downstream supervised task.
Apply Amazon SageMaker BlazingText in classification mode to labeled claims in the training set to derive features for the claims that correspond to the compliant and non-compliant labels, respectively.
Apply Amazon SageMaker Object2Vec to claims in the training set. Send the derived feature space as input for the downstream supervised task.
Kinesis Data Streams
Kinesis Data Firehose
Kinesis Data Analytics
Amazon Kinesis Video Streams
Regression
Classification
Recommender system
Reinforcement learning
Regression
Classification
Recommender system
Reinforcement learning
Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training code is executing and the model parameters seem reasonable. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe input mode.
Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the S3 bucket to the instance. Train on a small amount of the data to verify the training code and hyperparameters. Go back to Amazon SageMaker and train using the full dataset.
Use AWS Glue to train a model using a small subset of the data to confirm that the data will be compatible with Amazon SageMaker. Initiate a SageMaker training job using the full dataset from the S3 bucket using Pipe input mode.
Load a smaller subset of the data into the SageMaker notebook and train locally. Confirm that the training code is executing and the model parameters seem reasonable. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the S3 bucket to train the full dataset.
AWS Glue
Amazon Comprehend
Amazon SageMaker
Amazon Lex
Amazon SageMaker DeepAR
Scikit-learn regression
Convolutional neural network (CNN)
Scikit-learn random forest
Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.
Configure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook’s KMS role.
Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.
Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
TN = 91, FP = 9, FN = 22, TP = 78
TN = 99, FP = 1, FN = 21, TP = 79
TN = 96, FP = 4, FN = 10, TP = 90
TN = 98, FP = 2, FN = 18, TP = 82
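Since each option is a confusion matrix, a quick way to compare them is to compute recall and precision for each (options labelled A to D in the order listed):

```python
# Worked check of the four confusion matrices listed above, computing recall
# (TP / (TP + FN)) and precision (TP / (TP + FP)) for each option.
options = {
    "A": {"TN": 91, "FP": 9, "FN": 22, "TP": 78},
    "B": {"TN": 99, "FP": 1, "FN": 21, "TP": 79},
    "C": {"TN": 96, "FP": 4, "FN": 10, "TP": 90},
    "D": {"TN": 98, "FP": 2, "FN": 18, "TP": 82},
}

for name, m in options.items():
    recall = m["TP"] / (m["TP"] + m["FN"])
    precision = m["TP"] / (m["TP"] + m["FP"])
    print(f"{name}: recall={recall:.2f}, precision={precision:.2f}")
```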
Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.
Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Batch job, and an external Apache Hive metastore to search and discover metadata.
Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Batch job, and an AWS Glue Data Catalog to search and discover metadata.
Use an AWS Glue crawler to crawl S3 data, an Amazon CloudWatch alarm to trigger an AWS Glue ETL job, and an external Apache Hive metastore to search and discover metadata.
Download the AWS SDK for the Spark environment.
Install the SageMaker Spark library in the Spark environment.
Use the appropriate estimator from the SageMaker Spark Library to train a model.
Compress the training data into a ZIP file and upload it to a pre-defined Amazon S3 bucket.
Use the SageMaker Model transform method to get inferences from the model hosted in SageMaker.
Convert the DataFrame object to a CSV file, and use the CSV file as input for obtaining inferences from SageMaker.
Ingest the data using Amazon Kinesis Data Firehose, and use Amazon Kinesis Data Analytics Random Cut Forest (RCF) for anomaly detection. Then use Kinesis Data Firehose to stream the results to Amazon S3.
Ingest the data into Apache Spark Streaming using Amazon EMR, and use Spark MLlib with k-means to perform anomaly detection. Then store the results in an Apache Hadoop Distributed File System (HDFS) using Amazon EMR with a replication factor of three as the data lake.
Ingest the data and store it in Amazon S3. Use AWS Batch along with the AWS Deep Learning AMIs to train a k-means model using TensorFlow on the data in Amazon S3.
Ingest the data and store it in Amazon S3. Have an AWS Glue job that is triggered on demand transform the new data. Then use the built-in Random Cut Forest (RCF) model within Amazon SageMaker to detect anomalies in the data.
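The last option uses the built-in SageMaker Random Cut Forest (RCF) algorithm. A minimal sketch of the first-party estimator in the SageMaker Python SDK, with role, instance settings, and training data as placeholders:

```python
# Minimal sketch of the built-in SageMaker Random Cut Forest estimator from the
# SageMaker Python SDK. Role, instance settings, and training data are placeholders.
from sagemaker import RandomCutForest

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"   # assumption

rcf = RandomCutForest(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    num_trees=100,
    num_samples_per_tree=256,
)

# train_data would be a 2-D float32 NumPy array of the transformed records:
# rcf.fit(rcf.record_set(train_data))
```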
Regression
Classification
Natural language processing (NLP)
A rule-based solution should be used instead of ML