DEA-7TT2 Free Practice Questions: "EMC Associate - Data Science and Big Data Analytics v2"

Your company has 3 different sales teams. Each team's sales manager has developed incentive offers to increase the size of each sales transaction.
Any sales manager whose incentive program can be shown to increase the size of the average sales transaction will receive a bonus. Data are available for the number and average sale amount for transactions offering one of the incentives as well as transactions offering no incentive.
The VP of Sales has asked you to determine analytically if any of the incentive programs has resulted in a demonstrable increase in the average sale amount.
Which analytical technique would be appropriate in this situation?
Response:

A time series of monthly sales indicates a periodic pattern, and the value reaches the peak every December. Which statistic needs to be subtracted from the actual value to adjust for this seasonal effect?
Response:
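
One common way to adjust for a fixed yearly pattern is to subtract from each observation the average of all observations that fall in the same calendar month. The sketch below illustrates the idea on made-up numbers. The sales figures and the helper name `seasonal_means` are assumptions for illustration and do not come from the question.

```python
from collections import defaultdict

def seasonal_means(values, period=12):
    """Return the mean of the observations at each position in the cycle."""
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return [sum(buckets[k]) / len(buckets[k]) for k in range(period)]

# Two years of hypothetical monthly sales (January first), peaking in December.
sales = [10, 11, 10, 12, 11, 10, 12, 11, 10, 12, 13, 30,
         12, 13, 12, 14, 13, 12, 14, 13, 12, 14, 15, 34]

means = seasonal_means(sales)

# Subtract the mean of the matching month from each actual value.
adjusted = [v - means[i % 12] for i, v in enumerate(sales)]
```

After the adjustment the December values no longer stand out: what remains is each month's deviation from its own typical level.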

Refer to the exhibit.

In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Pink represents borrowers known not to have defaulted on their loan, and blue represents borrowers known to have defaulted.
Which analytical method could produce the probabilities needed to build this exhibit?
Response:

In a Logistic Regression, the coefficient for "age" equals -3. What is the correct interpretation of the Logistic Regression coefficient, holding all other variables constant?
Response:
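
In a logistic regression the linear predictor is the log-odds of the outcome, so a coefficient of -3 corresponds to multiplying the odds by e^(-3) for each one-unit increase in the variable, other variables held fixed. The short sketch below works through that arithmetic. The baseline odds value is an arbitrary assumption used only for illustration.

```python
import math

coef_age = -3.0                      # coefficient given in the question
odds_ratio = math.exp(coef_age)      # multiplicative change in odds per one-unit increase in age

baseline_odds = 2.0                  # hypothetical odds at some starting age (illustrative only)
new_odds = baseline_odds * odds_ratio  # odds after age increases by one unit

def odds_to_prob(odds):
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

print(f"odds ratio per unit of age: {odds_ratio:.4f}")
print(f"probability before: {odds_to_prob(baseline_odds):.4f}, after: {odds_to_prob(new_odds):.4f}")
```

Note that the coefficient acts on the odds, not directly on the probability; the change in probability depends on the starting point.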

What are the characteristics of Big Data?
Response:

You fit a Logistic Regression model to your training data and notice that the variable X has an infinite magnitude coefficient. What does this indicate?
Response:

When would you use the GROUP BY ROLLUP clause in your OLAP query?
Response:

After which phase of the data analytics lifecycle should you determine if the model is robust enough?
Response:

Consider the following SQL statement:
SELECT employee_id, year, salary, avg(salary)
OVER
(PARTITION BY employee_id ORDER BY year ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS result_1
FROM employee
ORDER BY employee_id, year

For each employee_id, what is returned as result_1?
Response:
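
To see what the window frame in this statement does, the query can be run against a small in-memory table. The sketch below uses Python's built-in `sqlite3` module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employee_id INTEGER, year INTEGER, salary REAL)")

# Hypothetical salary history for two employees.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, 2019, 100), (1, 2020, 200), (1, 2021, 300), (1, 2022, 400),
     (2, 2021, 50), (2, 2022, 70)],
)

query = """
SELECT employee_id, year, salary,
       avg(salary) OVER (
           PARTITION BY employee_id
           ORDER BY year
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS result_1
FROM employee
ORDER BY employee_id, year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

For each row, the frame covers the current year and up to two earlier years of the same employee, so `result_1` averages at most three salaries and restarts at every new `employee_id`.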

You are assigned the task of creating customer profiles for your company. In your database, you have 25 key input variables that come together to define 2,500 customers. You decide to run a K-means cluster analysis on the 25 input variables based on k=4 to build your profiles.
Your analysis resulted in four cluster populations:
Cluster A=1,000 customers
Cluster B=560 customers
Cluster C=925 customers
Cluster D=15 customers
What should be attempted first to more evenly distribute the customer population across clusters?
Response:

What is a motivation for using a data analytics lifecycle?
Response:

When would you use a Wilcoxon Rank Sum test?
Response:

Which activity is performed in the Operationalize phase of the Data Analytics Lifecycle?
Response:

You are building a logistic regression model to predict whether a tax filer will be audited within the next two years. Your training set population is 1000 filers. The audit rate in your training data is 4.2%.
What is the sum of the probabilities that the model assigns to all the filers in your training set that have been audited?
Response:
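
A property often invoked for this kind of question: when a logistic regression with an intercept is fitted by maximum likelihood, the fitted probabilities summed over every record in the training set equal the number of positive outcomes in that set (here 4.2% of 1000 = 42 audited filers). The sketch below checks this numerically on a tiny invented data set, using a hand-written Newton's method with one predictor. The data and the function name `fit_logistic` are assumptions for illustration, not part of the question.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Fit p = sigmoid(b0 + b1*x) by Newton's method (maximum likelihood)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradient of the log-likelihood with respect to b0 and b1.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for x, y, p in zip(xs, ys, ps))
        # Entries of the negated Hessian (a symmetric 2x2 matrix).
        ws = [p * (1.0 - p) for p in ps]
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system for the update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical training data: one predictor, binary outcome, not perfectly separable.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]

print("sum of fitted probabilities:", round(sum(fitted), 6))
print("number of positive outcomes:", sum(ys))
```

The equality follows from the intercept's score equation, which forces the residuals `y - p` to sum to zero at the maximum-likelihood solution.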

In time series analysis, what is an indication of a stationary sequence?
Response:
