Machine Learning MCQs with Answers

Discover important Machine Learning MCQs with answers to boost your knowledge and prepare effectively for exams and interviews in the field of Machine Learning.

Regression trees are often used to model _____ data.

A. linear
B. nonlinear
C. categorical
D. symmetrical

A Support Vector Machine is a

A. logical model
B. probabilistic model
C. geometric model
D. none of the above

For the given weather data, calculate the probability of not playing.

A. 0.4
B. 0.64
C. 0.36
D. 0.5

In PCA, the number of input dimensions is equal to the number of principal components.

A. TRUE
B. FALSE

What are the two methods used for the calibration in Supervised Learning?

A. Platt Calibration and Isotonic Regression
B. Statistics and Information Retrieval
C. Both A and B
D. None of these

Which of the following is true about Manhattan distance?

A. it can be used for continuous variables
B. it can be used for categorical variables
C. it can be used for categorical as well as continuous
D. it can be used for constants

In terms of bias and variance, which of the following is true when you fit a degree 2 polynomial?

A. bias will be high, variance will be high
B. bias will be low, variance will be high
C. bias will be high, variance will be low
D. bias will be low, variance will be low

Supervised learning and unsupervised clustering both require at least one

A. hidden attribute.
B. output attribute.
C. input attribute.
D. categorical attribute.

Even if there are no actual supervisors, _____ learning is also based on feedback provided by the environment.

A. Supervised
B. Reinforcement
C. Unsupervised
D. None of the above

It is necessary to allow the model to develop a generalization ability and avoid a common problem called _____

A. Overfitting
B. Overlearning
C. Classification
D. Regression

Which of the following options is true about the k-NN algorithm?

A. it can be used for classification
B. it can be used for regression
C. it can be used in both classification and regression
D. it is not a useful ML algorithm
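
To make the k-NN question above concrete, here is a minimal sketch in plain Python. The helper names (`k_nearest`, `knn_classify`, `knn_regress`) and the toy data are assumptions made for illustration, not part of any library. The same neighbour search feeds either a majority vote over labels or an average over numeric targets.

```python
from collections import Counter


def k_nearest(train, query, k):
    """Return the targets of the k training points closest to query.

    train is a list of (features, target) pairs; squared Euclidean
    distance is used, which ranks neighbours the same as Euclidean.
    """
    by_distance = sorted(
        train,
        key=lambda pair: sum((x - q) ** 2 for x, q in zip(pair[0], query)),
    )
    return [target for _, target in by_distance[:k]]


def knn_classify(train, query, k=3):
    """Majority vote over the labels of the k nearest neighbours."""
    return Counter(k_nearest(train, query, k)).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Average of the numeric targets of the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return sum(neighbours) / len(neighbours)


# Hypothetical toy data: class labels for one task, numeric targets for the other.
labelled = [((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
            ((8.0, 8.0), "b"), ((9.0, 8.0), "b")]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

predicted_label = knn_classify(labelled, (1.5, 1.5), k=3)
predicted_value = knn_regress(numeric, (2.0,), k=3)
```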

A feature can be used as a

A. binary split
B. predictor
C. both a and b
D. none of the above

The average positive difference between computed and desired outcome values.

A. root mean squared error
B. mean squared error
C. mean absolute error
D. mean positive error
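
The quantity described in the question above, the average of the absolute (positive) differences between computed and desired values, can be sketched as follows. The function name `average_absolute_difference` and the sample numbers are illustrative assumptions.

```python
def average_absolute_difference(desired, computed):
    """Average of |desired - computed| over paired outcome values."""
    if len(desired) != len(computed) or not desired:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(d - c) for d, c in zip(desired, computed)) / len(desired)


# Hypothetical outcomes: the absolute differences are 1, 0 and 2.
desired = [3.0, 5.0, 2.0]
computed = [2.0, 5.0, 4.0]
error = average_absolute_difference(desired, computed)
```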

The most general form of distance is

A. Manhattan
B. Euclidean
C. mean
D. Minkowski
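
As background for the distance question above, the Minkowski distance of order p reduces to the Manhattan distance at p = 1 and to the Euclidean distance at p = 2. A minimal sketch, where the function name `minkowski_distance` and the sample points are assumptions:

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    if p < 1:
        raise ValueError("p must be at least 1 for a valid metric")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


a, b = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski_distance(a, b, 1)  # |3| + |4|
euclidean = minkowski_distance(a, b, 2)  # sqrt(3^2 + 4^2)
```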

scikit-learn offers the class ______, which is responsible for filling the holes using a strategy based on the mean, median, or frequency

A. LabelEncoder
B. LabelBinarizer
C. DictVectorizer
D. Imputer
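
The strategy-based filling described in the question above can be sketched in plain Python. This sketch does not use scikit-learn. The function name `impute` is hypothetical, and missing entries are assumed to be marked with `None`. (In recent scikit-learn releases the corresponding class is `SimpleImputer` in `sklearn.impute`.)

```python
from collections import Counter
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries using the mean, median, or most frequent value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "most_frequent":
        fill = Counter(observed).most_common(1)[0][0]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Hypothetical column with two holes.
column = [1.0, None, 3.0, None, 8.0]
filled_mean = impute(column, "mean")
filled_median = impute(column, "median")
```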

Which of the following functions provides unsupervised prediction?

A. cl_forecast
B. cl_nowcast
C. cl_precast
D. none of the mentioned

In the last decade, many researchers started training bigger and bigger models built with several different layers, which is why this approach is called _____

A. Deep learning
B. Machine learning
C. Reinforcement learning
D. Unsupervised learning

Which of the following is a good test dataset characteristic?

A. large enough to yield meaningful results
B. is representative of the dataset as a whole
C. both A and B
D. none of the above

In the Naive Bayes classifier, the features being classified are ___ of each other.

A. independent
B. dependent
C. partially dependent
D. none

In the syntax of the linear model lm(formula, data, ...), data refers to ______

A. Matrix
B. Vector
C. Array
D. List

It is possible to use a different placeholder through the parameter ______

A. regression
B. classification
C. random_state
D. missing_values

Which of the following is a categorical data?

A. branch of bank
B. expenditure in rupees
C. price of house
D. weight of a person

Linear Regression is a supervised machine learning algorithm.

A. TRUE
B. FALSE

Which of the following is the difference between stacking and blending?

A. stacking has less stable cv
B. in blending, you create out of fold prediction
C. stacking is simpler than blending
D. none of these

Logistic regression is a _____ regression technique that is used to model data having a _____ outcome.

A. linear, numeric
B. linear, binary
C. nonlinear, numeric
D. nonlinear, binary
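
For the logistic regression question above, a minimal sketch of how such a model turns a weighted sum of features into a probability for a two-class outcome. The function name `predict_probability` and the weights are illustrative assumptions, not fitted values.

```python
import math


def predict_probability(weights, bias, features):
    """Sigmoid of a weighted sum: an estimate of P(outcome = 1 | features)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical weights; a score of zero maps to a probability of 0.5.
weights, bias = [2.0, -1.0], 0.0
p_boundary = predict_probability(weights, bias, [1.0, 2.0])  # score = 0
p_positive = predict_probability(weights, bias, [3.0, 1.0])  # score = 5
predicted_class = 1 if p_positive >= 0.5 else 0
```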

You are given reviews of a few Netflix series marked as positive, negative, and neutral. Classifying reviews of a new Netflix series is an example of

A. supervised learning
B. unsupervised learning
C. semi supervised learning
D. reinforcement learning

In which of the following feature selection methods do we start with an empty feature set?

A. forward feature selection
B. backward feature selection
C. both a and b
D. none of the above

Neural networks are complex _____ with many parameters.

A. linear functions
B. nonlinear functions
C. discrete functions
D. exponential functions

If a linear regression model fits perfectly, i.e., train error is zero, then _____

A. Test error is also always zero
B. Test error is non zero
C. Cannot comment on test error
D. Test error is equal to Train error

When the number of classes is large, the Gini index is not a good choice.

A. TRUE
B. FALSE

Data used to build a data mining model.

A. training data
B. validation data
C. test data
D. hidden data

This technique associates a conditional probability value with each data instance.

A. linear regression
B. logistic regression
C. simple regression
D. multiple linear regression

What would you do in PCA to get the same projection as SVD?

A. transform data to zero mean
B. transform data to zero median
C. not possible
D. none of these

The _____ of the hyperplane depends upon the number of features.

A. dimension
B. classification
C. reduction
D. none of the above

What is the approach of the basic algorithm for decision tree induction?

A. greedy
B. top down
C. procedural
D. step by step

Can we extract knowledge without applying feature selection?

A. Yes
B. No

Computers are best at learning

A. facts.
B. concepts.
C. procedures.
D. principles.

KDD represents extraction of

A. data
B. knowledge
C. rules
D. model

Linear Regression is a _____ machine learning algorithm.

A. supervised
B. unsupervised
C. semi-supervised
D. can't say

Which of the following can only be used when training data are linearly separable?

A. linear hard-margin svm
B. linear logistic regression
C. linear soft margin svm
D. the centroid method

The average squared difference between classifier predicted output and actual output.

A. mean squared error
B. root mean squared error
C. mean absolute error
D. mean relative error
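
The squared-error measures named in the options above can be compared with a short sketch. The function names and sample values are assumptions. The root variant is simply the square root of the averaged squared differences, which brings the error back to the units of the output.

```python
import math


def average_squared_difference(actual, predicted):
    """Average of (actual - predicted)^2 over paired outputs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_of_average_squared_difference(actual, predicted):
    """Square root of the average squared difference."""
    return math.sqrt(average_squared_difference(actual, predicted))


# Hypothetical outputs: the squared differences are 0, 4 and 4.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 4.0, 5.0]
squared = average_squared_difference(actual, predicted)
rooted = root_of_average_squared_difference(actual, predicted)
```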

Which of the following methods do we use to find the best fit line for data in Linear Regression?

A. Least Square Error
B. Maximum Likelihood
C. Logarithmic Loss
D. Both A and B

Which of the following are descriptive models?

A. clustering
B. classification
C. association rule
D. both a and c

What are common feature selection methods in a regression task?

A. correlation coefficient
B. greedy algorithms
C. all of the above
D. none of these

Which of the following is a characteristic of the best machine learning method?

A. fast
B. accuracy
C. scalable
D. all of the above

Some people are using the term _______ instead of prediction only to avoid the weird idea that machine learning is a sort of modern magic.

A. Inference
B. Interference
C. Accuracy
D. None of above

What characterizes unlabelled examples in machine learning?

A. there is no prior knowledge
B. there is no confusing knowledge
C. there is prior knowledge
D. there is plenty of confusing knowledge

What are the different algorithm techniques in machine learning?

A. supervised learning and semi-supervised learning
B. unsupervised learning
C. both A & B
D. none of the mentioned

Which of the following is not Machine Learning?

A. artificial intelligence
B. rule based inference
C. both a and b
D. none of the mentioned

If the output of a machine learning model does not involve a target variable, then the model is called a

A. descriptive model
B. predictive model
C. reinforcement learning
D. all of the above

Which are two techniques of machine learning?

A. Genetic Programming and Inductive Learning
B. Speech recognition and Regression
C. Both A and B
D. None of the Mentioned

In simple terms, machine learning is

A. training based on historical data
B. prediction to answer a query
C. both A and B
D. automation of complex tasks

Which of the following characterizes the best machine learning method?

A. scalable
B. accuracy
C. fast
D. all of the above

The output of the training process in machine learning is

A. machine learning model
B. machine learning algorithm
C. null
D. accuracy

Application of machine learning methods to large databases is called

A. data mining.
B. artificial intelligence
C. big data computing
D. internet of things

If the output of a machine learning model involves a target variable, then the model is called a

A. descriptive model
B. predictive model
C. reinforcement learning
D. all of the above
