MACHINE LEARNING – CHAPTER 1 Open Questions

Information
Category: nCreator TI-Nspire
Author: SPITZER2001
Type: Classeur 3.0.1
Page(s): 1
Size: 6.66 KB
Uploaded: 05/05/2025 - 17:05:01
Uploader: SPITZER2001 (Profile)
Downloads: 9
Visibility: Public archive
Shortlink: https://tipla.net/a4620184
Description
Nspire file generated on TI-Planet.org.
Compatible with OS 3.0 and later.
<<
MACHINE LEARNING CHAPTER 1
Open Questions with Detailed Model Answers (Plain Copy Format)

1. Explain the difference between supervised, unsupervised, and reinforcement learning. Give one example for each.

Model Answer: Supervised learning uses labeled data, where each input is associated with a corresponding output. The model learns to map inputs to outputs and is evaluated on its generalization to unseen data. Example: spam detection using logistic regression.
Unsupervised learning uses unlabeled data to identify patterns or structures. It is commonly used for clustering or dimensionality reduction. Example: customer segmentation with k-means.
Reinforcement learning involves an agent interacting with an environment and receiving rewards or penalties. The goal is to learn a policy that maximizes cumulative reward over time. Example: a robot learning to walk.

2. Define "overfitting" in machine learning. Why does it occur, and how can it be prevented?

Model Answer: Overfitting occurs when a model learns the training data too closely, including its noise and anomalies, resulting in poor generalization to new data. It often happens when the model is too complex or the training set is too small. It can be prevented through regularization (e.g., L1 or L2), early stopping, reducing model complexity, or cross-validation.

3. Describe the steps involved in the machine learning data pipeline. Why is data preprocessing important?

Model Answer: Key steps in the pipeline include:
- Data collection: gathering raw data from various sources.
- Data cleaning: handling missing values, outliers, and errors.
- Data transformation: normalizing numerical features and encoding categorical variables.
- Data splitting: dividing the data into training, validation, and test sets.
- Feature selection or dimensionality reduction: reducing the number of input variables.
Preprocessing is important to ensure that the data is structured, clean, and optimized for model learning and performance.

4.
What is gradient descent and how is it used to train machine learning models? Mention one limitation and how it can be addressed.

Model Answer: Gradient descent is an optimization algorithm that minimizes a loss function by updating the model parameters in the direction of steepest descent. At each iteration, the parameters are adjusted to reduce the prediction error. One limitation is that it can get stuck in local minima or saddle points of non-convex loss functions. This can be addressed with techniques such as stochastic gradient descent (SGD), momentum, or adaptive optimizers such as Adam.

5. Compare classification and regression tasks. How does the nature of the output variable affect the choice of algorithm and loss function?

Model Answer: Classification predicts discrete labels (e.g., spam or not spam), while regression predicts continuous values (e.g., house price). Classification uses algorithms such as logistic regression and SVMs with loss functions such as cross-entropy; regression uses models such as linear regression or random forests with loss functions such as mean squared error (MSE). The type of output therefore determines the appropriate algorithm and metric.

6. What is the role of a loss function in machine learning? Name one loss function used for regression and one for classification.

Model Answer: The loss function quantifies how well the model's predictions match the true labels, and it guides the optimization process during training. In regression, a common loss is mean squared error (MSE). In classification, cross-entropy is widely used to compare predicted class probabilities to the actual labels.

7. What is the difference between training, validation, and test sets in supervised learning?

Model Answer: The training set is used to fit the model parameters. The validation set is used to tune hyperparameters and monitor for overfitting. The test set evaluates final model performance on unseen data. Proper separation of these sets ensures an unbiased performance assessment.

8.
What is underfitting? How does it differ from overfitting?

Model Answer: Underfitting happens when the model is too simple to learn the underlying patterns in the data, resulting in poor performance on both the training and test sets. Overfitting occurs when the model is too complex and memorizes the training data, performing poorly on new data. A good model balances the two.

9. Explain the concept of generalization in ML. Why is it important?

Model Answer: Generalization is the ability of a model to perform well on new, unseen data. It is essential because the goal of machine learning is to apply patterns learned during training to real-world cases. A model that generalizes well avoids both underfitting and overfitting.

10. Describe how PCA reduces dimensionality. What is one benefit of using PCA before training a model?

Model Answer: Principal Component Analysis (PCA) transforms data into a new coordinate system whose axes (the principal components) maximize variance. It reduces the number of features by keeping the most informative com
[...]
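The gradient-descent update described in question 4 can be illustrated with a short sketch. This is a hypothetical minimal example, not part of the original sheet: plain-NumPy batch gradient descent on a least-squares linear regression, with made-up names (`true_w`, `lr`) and synthetic data.

```python
import numpy as np

# Sketch of batch gradient descent for least-squares linear regression.
# Loss: L(w) = mean((X @ w - y)**2); gradient: (2/n) * X.T @ (X @ w - y).

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # 100 samples, 2 features
true_w = np.array([3.0, -2.0])                 # ground-truth weights (illustrative)
y = X @ true_w + 0.01 * rng.normal(size=100)   # targets with small noise

w = np.zeros(2)       # start from the zero vector
lr = 0.1              # learning rate (step size)
for _ in range(500):
    grad = (2.0 / len(y)) * X.T @ (X @ w - y)  # gradient of the MSE loss
    w -= lr * grad                             # step along steepest descent

print(w)  # converges close to true_w, i.e. roughly [3.0, -2.0]
```

As the answer to question 4 notes, practical training would more likely use mini-batch SGD, momentum, or an adaptive optimizer such as Adam, but the shape of the update rule is the same.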
>>