Databricks-Machine-Learning-Associate free practice questions: "Databricks Certified Machine Learning Associate"
A data scientist is using Spark SQL to import their data into a machine learning pipeline. Once the data is imported, the data scientist performs machine learning tasks using Spark ML.
Which of the following compute tools is best suited for this use case?
Correct answer: C
A data scientist is using MLflow to track their machine learning experiment. As a part of each of their MLflow runs, they are performing hyperparameter tuning. The data scientist would like to have one parent run for the tuning process with a child run for each unique combination of hyperparameter values. All parent and child runs are being manually started with mlflow.start_run.
Which of the following approaches can the data scientist use to accomplish this MLflow run organization?
Correct answer: C
A data scientist wants to efficiently tune the hyperparameters of a scikit-learn model in parallel. They elect to use the Hyperopt library to facilitate this process.
Which of the following Hyperopt tools provides the ability to optimize hyperparameters in parallel?
Correct answer: D
A data scientist has been given an incomplete notebook from the data engineering team. The notebook uses a Spark DataFrame spark_df on which the data scientist needs to perform further feature engineering. Unfortunately, the data scientist has not yet learned the PySpark DataFrame API.
Which of the following blocks of code can the data scientist run to be able to use the pandas API on Spark?
Correct answer: B
A data scientist wants to tune a set of hyperparameters for a machine learning model. They have wrapped a Spark ML model in the objective function objective_function and they have defined the search space search_space.
As a result, they have the following code block:
Which of the following changes do they need to make to the above code block in order to accomplish the task?
Correct answer: D
A data scientist has replaced missing values in their feature set with each respective feature variable's median value. A colleague suggests that the data scientist is throwing away valuable information by doing this.
Which of the following approaches can they take to include as much information as possible in the feature set?
Correct answer: E
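One common way to keep the information that median imputation discards is to add a binary indicator column for each imputed feature, recording which rows were originally missing. The sketch below uses pandas on a small made-up feature set; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical feature set with missing values.
features = pd.DataFrame(
    {
        "age": [25.0, None, 40.0, 31.0],
        "income": [50.0, 62.0, None, None],
    }
)

imputed = features.copy()
for col in features.columns:
    # Record which rows were missing before they are filled in.
    imputed[f"{col}_was_missing"] = features[col].isna().astype(int)
    # Replace missing values with the feature's median (NaN is skipped).
    imputed[col] = features[col].fillna(features[col].median())
```

A downstream model can then learn from the fact that a value was missing in addition to the imputed value itself.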