Which metric is useful for imbalanced datasets and represents the harmonic mean of precision and recall?
A. Accuracy
B. Precision
C. Recall
D. F1 Score
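As a quick reference for this question: the F1 score (option D) is the harmonic mean of precision and recall, which penalizes a large gap between the two. A minimal sketch, assuming precision and recall are already computed:

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.5, 0.5))  # 0.5
print(f1_score(0.9, 0.1))  # far below the arithmetic mean of 0.5
```

Note how the harmonic mean drags the score toward the weaker of the two components, which is exactly why F1 is preferred over accuracy on imbalanced data.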
What does the term "overfitting" refer to in the context of model evaluation and validation?
A. The model fits the training data perfectly
B. The model has too few parameters
C. The model generalizes well to new data
D. The model has low bias and high variance
In model evaluation, what is the primary goal of feature scaling or normalization?
A. To make the model's predictions more accurate
B. To improve the model's interpretability
C. To reduce the number of features in the dataset
D. To increase the model's complexity
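For context on feature scaling: the goal is to put features on a comparable numeric range so that no single feature dominates distance- or gradient-based learning. A minimal min-max scaling sketch (one of several common scaling choices):

```python
def min_max_scale(values):
    # Rescale a list of numbers into [0, 1]; constant columns map to 0.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_scale([10, 20, 30]))  # [0.0, 0.5, 1.0]
```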
Which method involves splitting the dataset into three parts: training, validation, and test sets, ensuring that the model's performance is assessed on unseen data?
A. Holdout Validation
B. Cross-Validation
C. Stratified Sampling
D. Leave-One-Out Cross-Validation (LOOCV)
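The three-way split in this question can be sketched as follows (fractions 60/20/20 are an illustrative assumption, not a fixed rule; data should typically be shuffled first):

```python
def three_way_split(data, train_frac=0.6, val_frac=0.2):
    # Split data into train / validation / test by index; test gets the remainder.
    n = len(data)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

train, val, test = three_way_split(list(range(10)))
print(len(train), len(val), len(test))  # 6 2 2
```

The model is fit on `train`, tuned against `val`, and the final performance estimate comes from `test`, which stays untouched until the very end.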
What is the primary advantage of using stratified sampling in model evaluation?
A. It ensures that each class is represented fairly
B. It reduces the risk of overfitting
C. It simplifies the model's architecture
D. It requires less computational resources
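The core idea behind stratified sampling (option A) is to draw the same fraction from each class so minority classes are not lost from a split. A minimal sketch, assuming integer class labels and that the data is already shuffled:

```python
from collections import defaultdict

def stratified_split(labels, test_frac=0.25):
    # Group indices by class, then take the same fraction of each class for the test set.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    test_idx = []
    for idxs in by_class.values():
        k = max(1, int(len(idxs) * test_frac))
        test_idx.extend(idxs[:k])
    test_set = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in test_set]
    return train_idx, test_idx
```

With a 10:2 class imbalance, a plain random 25% split can easily contain zero minority examples; the per-class loop above guarantees at least one.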
What is the primary purpose of a validation dataset in machine learning?
A. To train the model
B. To evaluate the model on unseen data
C. To test the model's performance on training data
D. To visualize data relationships
Which metric is commonly used to evaluate classification models and represents the ratio of correctly predicted positive instances to all positive instances?
A. Accuracy
B. Precision
C. Recall
D. F1 Score
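The metric described here, correctly predicted positives over all actual positives, is recall (option C). A minimal sketch over binary labels:

```python
def recall(y_true, y_pred):
    # Recall = TP / (TP + FN): the fraction of actual positives the model found.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

# 2 of the 3 actual positives were found -> recall = 2/3.
print(recall([1, 1, 1, 0], [1, 0, 1, 0]))
```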
In k-fold cross-validation, if you choose a higher value of k (e.g., k = 10), what effect does it have on the model evaluation process?
A. It reduces the risk of overfitting
B. It reduces the number of folds used in training
C. It increases the model's complexity
D. It decreases the training time
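To make the k-fold mechanics concrete, here is a minimal index-generating sketch: each of the k folds serves once as the held-out set while the remaining k−1 folds train the model, so a higher k means more training runs on larger training subsets:

```python
def k_fold_indices(n, k):
    # Yield (train_indices, test_indices) for each of k folds over n samples.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

for train, test in k_fold_indices(10, 5):
    print(len(train), "train /", len(test), "test")
```

With k = 10 instead of 5, each test fold shrinks and each training set grows, giving a lower-variance estimate at the cost of more fitting passes.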
Which technique is used to prevent data leakage in cross-validation, ensuring that information from the test set doesn't influence model training?
A. Leave-One-Out Cross-Validation (LOOCV)
B. Stratified Sampling
C. Holdout Validation
D. Feature Scaling
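A common concrete form of the leakage this question describes is fitting preprocessing statistics on the full dataset. The sketch below (an illustrative standardizer, not a specific library API) computes mean and standard deviation on the training split only, then applies them to the test split:

```python
def fit_standardizer(train):
    # Compute mean/std from the TRAINING split only, so no test-set
    # information leaks into preprocessing.
    mean = sum(train) / len(train)
    var = sum((x - mean) ** 2 for x in train) / len(train)
    std = var ** 0.5 or 1.0
    return lambda xs: [(x - mean) / std for x in xs]

train, test = [1.0, 2.0, 3.0], [4.0]
scale = fit_standardizer(train)
scaled_test = scale(test)  # standardized with train statistics only
```

Inside cross-validation, the same rule applies per fold: refit the scaler on each training fold before transforming that fold's test data.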
What is the purpose of the Receiver Operating Characteristic (ROC) curve in model evaluation?
A. To compare different machine learning algorithms
B. To visualize the model's decision boundary
C. To measure the model's prediction accuracy
D. To evaluate the trade-off between true positive rate and false positive rate
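The ROC curve in this question is built by sweeping the classification threshold and recording the (false positive rate, true positive rate) pair at each step. A minimal sketch, assuming binary labels and real-valued scores:

```python
def roc_points(y_true, scores):
    # Sweep thresholds from high to low; at each, compute (FPR, TPR).
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for thresh in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thresh)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= thresh)
        points.append((fp / neg, tp / pos))
    return points

print(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Each point trades off catching more true positives against admitting more false positives; the curve always ends at (1, 1) once the threshold is below every score.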