DP-203無料問題集「Microsoft Data Engineering on Microsoft Azure」

質問 1

You are designing an Azure Synapse Analytics dedicated SQL pool.
You need to ensure that you can audit access to Personally Identifiable information (PII).
What should you include in the solution?

（A）column-level security

（B）sensitivity classifications

（C）dynamic data masking

（D）row-level security (RLS)

正解：B 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

質問 2

You have an Azure Databricks workspace named workspace1 in the Standard pricing tier.
You need to configure workspace1 to support autoscaling all-purpose clusters. The solution must meet the following requirements:
Automatically scale down workers when the cluster is underutilized for three minutes.
Minimize the time it takes to scale to the maximum number of workers.
Minimize costs.
What should you do first?

（A）Set Cluster Mode to High Concurrency.

（B）Upgrade workspace1 to the Premium pricing tier.

（C）Enable container services for workspace1.

（D）Create a cluster policy in workspace1.

正解：B 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

質問 3

You have an Azure Data Lake Storage Gen2 account named account1 that contains a container named Container"1. Container1 contains two folders named FolderA and FolderB.
You need to configure access control lists (ACLs) to meet the following requirements:
* Group1 must be able to list and read the contents and subfolders of FolderA.
* Group2 must be able to list and read the contents of FolderA and FolderB.
* Group2 must be prevented from reading any other folders at the root of Container1.
How should you configure the ACL permissions for each group? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

正解：

Explanation:

質問 4

You have an Azure Storage account and a data warehouse in Azure Synapse Analytics in the UK South region.
You need to copy blob data from the storage account to the data warehouse by using Azure Data Factory. The solution must meet the following requirements:
Ensure that the data remains in the UK South region at all times.
Minimize administrative effort.
Which type of integration runtime should you use?

（A）Self-hosted integration runtime

（B）Azure-SSIS integration runtime

（C）Azure integration runtime

正解：C 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

質問 5

You have an Azure Data Factory pipeline that has the activities shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.
NOTE: Each correct selection is worth one point.

正解：

Explanation:
Box 1: succeed
Box 2: failed
Example:
Now let's say we have a pipeline with 3 activities, where Activity1 has a success path to Activity2 and a failure path to Activity3. If Activity1 fails and Activity3 succeeds, the pipeline will fail. The presence of the success path alongside the failure path changes the outcome reported by the pipeline, even though the activity executions from the pipeline are the same as the previous scenario.

Activity1 fails, Activity2 is skipped, and Activity3 succeeds. The pipeline reports failure.
Reference:
https://datasavvy.me/2021/02/18/azure-data-factory-activity-failures-and-pipeline-outcomes/

質問 6

You have an Azure data factory that connects to a Microsoft Purview account. The data factory is registered in Microsoft Purview.
You update a Data Factory pipeline.
You need to ensure that the updated lineage is available in Microsoft Purview.
What You have an Azure subscription that contains an Azure SQL database named DB1 and a storage account named storage1. The storage1 account contains a file named File1.txt. File1.txt contains the names of selected tables in DB1.
You need to use an Azure Synapse pipeline to copy data from the selected tables in DB1 to the files in storage1. The solution must meet the following requirements:
* The Copy activity in the pipeline must be parameterized to use the data in File1.txt to identify the source and destination of the copy.
* Copy activities must occur in parallel as often as possible.
Which two pipeline activities should you include in the pipeline? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

（A）ForEach

（B）Lookup

（C）If Condition

（D）Get Metadata

正解：C、D 解答を投票する

質問 7

You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:
* The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.
* Line total sales amount and line total tax amount will be aggregated in Databricks.
* Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.
You need to recommend an output mode for the dataset that will be processed by using Structured Streaming.
The solution must minimize duplicate data.
What should you recommend?

（A）Update

（B）Complete

（C）Append

正解：A 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

質問 8

You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.
You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.
What should you include in the solution?

（A）Partition by DateTime fields.

（B）Use a JSON format for physical data storage.

（C）Include a watermark column.

（D）Sink to Azure Queue storage.

正解：A 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

質問 9

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1.
You have the queries shown in the following table.

You are evaluating whether to enable result set caching for Pool1. Which query results will be cached if result set caching is enabled?

（A）Query1 and Query3 only

（B）Query1 only

（C）Query 1, Query2, and Query3 only

（D）Query 1 and Query2 only

（E）Query2 only

正解：D 解答を投票する

質問 10

You have an Azure Synapse Analytics workspace named WS1.
You have an Azure Data Lake Storage Gen2 container that contains JSON-formatted files in the following format.

You need to use the serverless SQL pool in WS1 to read the files.
How should you complete the Transact-SQL statement? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.

正解：

Explanation:

Box 1: openrowset
The easiest way to see to the content of your CSV file is to provide file URL to OPENROWSET function, specify csv FORMAT.
Example:
SELECT *
FROM OPENROWSET(
BULK 'csv/population/population.csv',
DATA_SOURCE = 'SqlOnDemandDemo',
FORMAT = 'CSV', PARSER_VERSION = '2.0',
FIELDTERMINATOR =',',
ROWTERMINATOR = '\n'
Box 2: openjson
You can access your JSON files from the Azure File Storage share by using the mapped drive, as shown in the following example:
SELECT book.* FROM
OPENROWSET(BULK N't:\books\books.json', SINGLE_CLOB) AS json
CROSS APPLY OPENJSON(BulkColumn)
WITH( id nvarchar(100), name nvarchar(100), price float,
pages_i int, author nvarchar(100)) AS book
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/query-single-csv-file
https://docs.microsoft.com/en-us/sql/relational-databases/json/import-json-documents-into-sql-server

質問 11

You plan to develop a dataset named Purchases by using Azure databricks Purchases will contain the following columns:
* ProductID
* ItemPrice
* lineTotal
* Quantity
* StorelD
* Minute
* Month
* Hour
* Year
* Day
You need to store the data to support hourly incremental load pipelines that will vary for each StoreID. the solution must minimize storage costs. How should you complete the rode? To answer, select the appropriate options In the answer area.
NOTE: Each correct selection is worth one point.

正解：

Explanation:

Box 1: partitionBy
We should overwrite at the partition level.
Example:
df.write.partitionBy("y","m","d")
mode(SaveMode.Append)
parquet("/data/hive/warehouse/db_name.db/" + tableName)
Box 2: ("StoreID", "Year", "Month", "Day", "Hour", "StoreID")
Box 3: parquet("/Purchases")
Reference:
https://intellipaat.com/community/11744/how-to-partition-and-write-dataframe-in-spark-without-deleting- partitions-with-no-new-data

質問 12

A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.
The company must be able to monitor the devices in real-time.
You need to design the solution.
What should you recommend?

（A）Azure Stream Analytics cloud job using Azure PowerShell

（B）Azure Data Factory instance using Azure Portal

（C）Azure Analysis Services using Azure Portal

（D）Azure Analysis Services using Azure PowerShell

正解：A 解答を投票する

解説: (JPNTest メンバーにのみ表示されます)

DP-203 無料問題集「Microsoft Data Engineering on Microsoft Azure」

弊社を連絡する

関連リンク

トップ試験