Databricks-Certified-Professional-Data-Engineer Free Practice Questions: "Databricks Certified Professional Data Engineer"

The business intelligence team has a dashboard configured to track various summary metrics for retail stores. This includes total sales for the previous day alongside totals and averages for a variety of time periods. The fields required to populate this dashboard have the following schema:

For demand forecasting, the Lakehouse contains a validated table of all itemized sales, updated incrementally in near real-time. This table, named products_per_order, includes the following fields:

Because long-term sales trends are less volatile, analysts using the new dashboard only require data to be refreshed once daily. Because the dashboard will be queried interactively by many users throughout a normal business day, it should return results quickly and minimize the total compute associated with each materialization.
Which solution meets the expectations of the end users while controlling and limiting possible costs?
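
One pattern that fits these requirements is to precompute the dashboard's metrics in a gold table that a scheduled job overwrites once per day, so interactive queries read a small precomputed result instead of re-aggregating the raw sales data. A minimal sketch follows; `products_per_order` comes from the question, but the column names (`order_timestamp`, `store_id`, `price`) and the target table name are assumptions, since the schemas above are not shown.

```python
# `spark` is the SparkSession Databricks provides in notebooks.
from pyspark.sql import functions as F

# Aggregate itemized sales into the daily summary the dashboard needs.
daily_sales = (
    spark.table("products_per_order")
    .groupBy(F.to_date("order_timestamp").alias("order_date"), "store_id")
    .agg(F.sum("price").alias("total_sales"))
)

# Overwriting a small summary table once per day keeps the compute for each
# materialization low while giving many concurrent readers fast responses.
daily_sales.write.mode("overwrite").saveAsTable("gold.daily_store_sales")
```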

Which statement describes the default execution mode for Databricks Auto Loader?
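
For reference, Auto Loader's default behavior is to run as an incremental streaming source: it discovers newly arrived files (by default via directory listing rather than file notifications) and processes each file exactly once. A minimal sketch, with all paths and table names assumed:

```python
# `spark` is the SparkSession Databricks provides in notebooks.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/tmp/schemas/raw_events")  # assumed path
    .load("/mnt/landing/raw_events")                                 # assumed path
)

(stream.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/raw_events")     # assumed path
    .toTable("bronze.raw_events"))                                   # assumed table name
```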

A data ingestion task requires a 1 TB JSON dataset to be written out to Parquet with a target part-file size of 512 MB. Because Parquet is being used instead of Delta Lake, built-in file-sizing features such as Auto-Optimize and Auto-Compaction cannot be used.
Which strategy will yield the best performance without shuffling data?
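
A common way to hit a target part-file size without shuffling is to control the input split size at read time: with `spark.sql.files.maxPartitionBytes` set to 512 MB, the 1 TB source is read into roughly 2,048 tasks, and each task writes one part file near the target size. A minimal sketch, with the paths assumed:

```python
# `spark` is the SparkSession Databricks provides in notebooks.
# Cap each input partition at 512 MB so read tasks (and therefore output
# part files) land near the target size, with no repartition/shuffle step.
spark.conf.set("spark.sql.files.maxPartitionBytes", str(512 * 1024 * 1024))

(spark.read.json("/mnt/raw/events_json")      # assumed source path
    .write.mode("overwrite")
    .parquet("/mnt/curated/events_parquet"))  # assumed target path
```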

Which statement is true regarding the retention of job run history?

Which statement describes Delta Lake Auto Compaction?
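
For context, Auto Compaction runs after a write to a Delta table commits and synchronously compacts small files in the partitions that were just written, targeting files of roughly 128 MB. A minimal sketch of enabling it (the table name is assumed):

```python
# `spark` is the SparkSession Databricks provides in notebooks.
# Enable Auto Compaction for one table via a Delta table property.
spark.sql("""
    ALTER TABLE bronze.events
    SET TBLPROPERTIES (delta.autoOptimize.autoCompact = true)
""")

# Alternatively, enable it for all Delta writes in the current session.
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")
```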

The data architect has mandated that all tables in the Lakehouse should be configured as external (also known as "unmanaged") Delta Lake tables.
Which approach will ensure that this requirement is met?
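
Declaring an explicit storage `LOCATION` at creation time is what makes a table external (unmanaged): the metastore tracks only metadata, and dropping the table leaves the data files in place. A minimal sketch, with the table name and storage path assumed:

```python
# `spark` is the SparkSession Databricks provides in notebooks.
# An explicit LOCATION makes this an external (unmanaged) Delta table.
spark.sql("""
    CREATE TABLE sales.transactions
    USING DELTA
    LOCATION 'abfss://lake@storageacct.dfs.core.windows.net/sales/transactions'
""")

# Equivalent DataFrame API form (a "path" option has the same effect):
# df.write.format("delta").option("path", "<storage path>").saveAsTable("sales.transactions")
```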

Which statement characterizes the general programming model used by Spark Structured Streaming?
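
Structured Streaming's model treats a live data stream as an unbounded input table: newly arriving records are rows appended to that table, and an ordinary DataFrame query over it is executed incrementally, continuously updating a result table. A minimal sketch, with table names and the checkpoint path assumed:

```python
# `spark` is the SparkSession Databricks provides in notebooks.
# A batch-style aggregation, applied to a stream: Spark runs it
# incrementally as new rows "append" to the unbounded input table.
counts = (
    spark.readStream.table("bronze.raw_events")      # assumed source table
    .groupBy("event_type")
    .count()
)

(counts.writeStream
    .outputMode("complete")                          # emit the full updated result table
    .option("checkpointLocation", "/tmp/chk/counts") # assumed path
    .toTable("silver.event_counts"))                 # assumed target table
```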

In order to prevent accidental commits to production data, a senior data engineer has instituted a policy that all development work will reference clones of Delta Lake tables. After testing both deep and shallow clones, the development tables were created using shallow clone.
A few weeks after initial table creation, the cloned versions of several tables implemented as Type 1 Slowly Changing Dimension (SCD) stop working. The transaction logs for the source tables show that VACUUM was run the day before.
Why are the cloned tables no longer working?
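
The failure mode here follows from how shallow clones work: they copy only the Delta transaction log and keep referencing the source table's data files rather than copying them. A minimal sketch, with table names assumed:

```python
# `spark` is the SparkSession Databricks provides in notebooks.
# A shallow clone copies only the transaction log; its log entries
# point at prod.customers' data files.
spark.sql("CREATE TABLE dev.customers_clone SHALLOW CLONE prod.customers")

# Type 1 SCD updates rewrite the source's files, leaving the clone
# referencing files no longer part of the source's current version.
# VACUUM then physically deletes those unreferenced files:
spark.sql("VACUUM prod.customers RETAIN 168 HOURS")  # clone reads now fail
```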

The DevOps team has configured a production workload as a collection of notebooks scheduled to run daily using the Jobs UI. A new data engineering hire is onboarding to the team and has requested access to one of these notebooks to review the production logic.
What are the maximum notebook permissions that can be granted to the user without allowing accidental changes to production code or data?
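
The most restrictive useful level is read-only notebook access, which lets the new hire view the logic but not run or edit it. As one illustration, a minimal sketch using the Databricks REST Permissions API (the host, token, notebook ID, and user e-mail are placeholders):

```python
import requests

host = "https://<workspace-host>"    # placeholder
token = "<personal-access-token>"    # placeholder
notebook_id = "<notebook-id>"        # placeholder

# Grant "Can Read" (CAN_READ): view-only access to the notebook.
resp = requests.patch(
    f"{host}/api/2.0/permissions/notebooks/{notebook_id}",
    headers={"Authorization": f"Bearer {token}"},
    json={"access_control_list": [
        {"user_name": "new.hire@example.com", "permission_level": "CAN_READ"}
    ]},
)
resp.raise_for_status()
```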

