Professional-Data-Engineer Free Practice Questions: Google Certified Professional Data Engineer

You have a Standard Tier Memorystore for Redis instance deployed in a production environment. You need to simulate a Redis instance failover as accurately as possible for disaster recovery purposes, and ensure that the failover has no impact on production data. What should you do?
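One way to trigger such a failover is a manual failover through the google-cloud-redis Python client; running it against a separate test instance keeps production data untouched. A minimal sketch (project, region, and instance names are hypothetical):

```python
# Minimal sketch, assuming `pip install google-cloud-redis` and a hypothetical
# non-production instance so production data is unaffected.
from google.cloud import redis_v1

client = redis_v1.CloudRedisClient()

# FORCE_DATA_LOSS most closely simulates a hard failure; LIMITED_DATA_LOSS
# waits for replication to catch up before failing over.
request = redis_v1.FailoverInstanceRequest(
    name="projects/my-project/locations/europe-west4/instances/dr-test-instance",
    data_protection_mode=redis_v1.FailoverInstanceRequest.DataProtectionMode.FORCE_DATA_LOSS,
)
operation = client.failover_instance(request=request)
operation.result()  # block until the failover completes
```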

You work for a bank. You have a labeled dataset that contains information on already granted loan applications and whether these applications have defaulted. You have been asked to train a model to predict default rates for credit applicants.
What should you do?
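One common approach for a labeled tabular dataset like this is a logistic regression trained with BigQuery ML; a minimal sketch, assuming a hypothetical table my_dataset.loan_applications with a defaulted label column:

```python
# Minimal BigQuery ML sketch; table, dataset, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()
query = """
CREATE OR REPLACE MODEL `my_dataset.loan_default_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['defaulted']) AS
SELECT * FROM `my_dataset.loan_applications`
"""
client.query(query).result()  # wait for model training to finish
```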

Which of these sources can you not load data into BigQuery from?
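For reference, Cloud Storage is one of the supported load sources; a sketch of loading CSV files with the google-cloud-bigquery client (bucket, dataset, and table names are hypothetical):

```python
# Sketch of a supported load path: CSV files in Cloud Storage into BigQuery.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # skip the header row
    autodetect=True,      # infer the schema from the data
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/clicks/*.csv",
    "my-project.my_dataset.click_stream",
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete
```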

You are building a data pipeline on Google Cloud. You need to prepare data using a casual method for a machine-learning process. You want to support a logistic regression model. You also need to monitor and adjust for null values, which must remain real-valued and cannot be removed. What should you do?
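If nulls must remain real-valued rather than being removed, one option is imputing a default value inside the pipeline; a hedged Apache Beam sketch (the income field and default value are hypothetical):

```python
# Hedged sketch: keep null numeric features real-valued by imputing a default.
import apache_beam as beam

def impute_nulls(row, default=0.0):
    # Replace a missing real-valued feature instead of dropping the row.
    row = dict(row)
    if row.get("income") is None:
        row["income"] = default
    return row

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Create" >> beam.Create([{"income": None}, {"income": 52000.0}])
        | "ImputeNulls" >> beam.Map(impute_nulls)
        | "Print" >> beam.Map(print)
    )
```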

The Dataflow SDKs have been recently transitioned into which Apache service?
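For context, the Dataflow SDKs were donated to the Apache Software Foundation and live on as Apache Beam; a minimal Beam pipeline that can also target the Dataflow runner:

```python
# Minimal Apache Beam pipeline; switch the runner to "DataflowRunner"
# (plus project/region options) to run on Cloud Dataflow.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(runner="DirectRunner")
with beam.Pipeline(options=options) as pipeline:
    pipeline | beam.Create(["hello", "beam"]) | beam.Map(print)
```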

Which is not a valid reason for poor Cloud Bigtable performance?

You are running a pipeline in Cloud Dataflow that receives messages from a Cloud Pub/Sub topic and writes the results to a BigQuery dataset in the EU. Currently, your pipeline is located in europe-west4 and has a maximum of 3 workers, instance type n1-standard-1. You notice that during peak periods, your pipeline is struggling to process records in a timely fashion, when all 3 workers are at maximum CPU utilization. Which two actions can you take to increase performance of your pipeline? (Choose two.)
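Both levers correspond to pipeline options: raising the worker cap and choosing a larger machine type. An illustrative sketch (project and values are hypothetical):

```python
# Illustrative scale-up options for the struggling streaming pipeline.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",          # hypothetical project id
    region="europe-west4",
    streaming=True,
    max_num_workers=10,            # raise the autoscaling cap above 3
    machine_type="n1-standard-4",  # larger workers than n1-standard-1
)
pipeline = beam.Pipeline(options=options)
```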

Correct answers: B, E
To give a user read permission for only the first three columns of a table, which access control method would you use?
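A common pattern for column-level read access in BigQuery is an authorized view that selects only the permitted columns; a sketch with the Python client (project, dataset, table, and column names are hypothetical):

```python
# Sketch: expose only three columns through a view in a separate dataset,
# then authorize that view against the source dataset.
from google.cloud import bigquery

client = bigquery.Client()

view = bigquery.Table("my-project.shared_views.orders_first3")
view.view_query = """
SELECT col1, col2, col3
FROM `my-project.private_data.orders`
"""
view = client.create_table(view)

# Grant the view itself read access to the source dataset.
source = client.get_dataset("my-project.private_data")
entries = list(source.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source.access_entries = entries
client.update_dataset(source, ["access_entries"])
# Finally, grant the user read access to the shared_views dataset only.
```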

You have spent a few days loading data from comma-separated values (CSV) files into the Google BigQuery table CLICK_STREAM. The column DT stores the epoch time of click events. For convenience, you chose a simple schema where every field is treated as the STRING type. Now, you want to compute web session durations of users who visit your site, and you want to change its data type to TIMESTAMP. You want to minimize the migration effort without making future queries computationally expensive. What should you do?
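One low-effort way to pay the conversion cost once, so later queries read a native TIMESTAMP instead of casting on every scan, is to rewrite the table; a hedged sketch, assuming DT holds epoch seconds (dataset and table names are hypothetical):

```python
# Hedged sketch: one-time rewrite so future queries avoid repeated casts.
from google.cloud import bigquery

client = bigquery.Client()
query = """
CREATE OR REPLACE TABLE `my_dataset.CLICK_STREAM_TS` AS
SELECT
  * EXCEPT (DT),
  TIMESTAMP_SECONDS(SAFE_CAST(DT AS INT64)) AS DT  -- assumes epoch seconds
FROM `my_dataset.CLICK_STREAM`
"""
client.query(query).result()
```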

What are the minimum permissions needed for a service account used with Google Dataproc?

You need to modernize your existing on-premises data strategy. Your organization currently uses:
* Apache Hadoop clusters for processing multiple large data sets, including on-premises Hadoop Distributed File System (HDFS) for data replication.
* Apache Airflow to orchestrate hundreds of ETL pipelines with thousands of job steps.
You need to set up a new architecture in Google Cloud that can handle your Hadoop workloads and requires minimal changes to your existing orchestration processes. What should you do?
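Running the Hadoop workloads on Dataproc and the orchestration on Cloud Composer (managed Airflow) keeps existing DAGs largely intact; a sketch of one migrated job step (cluster, project, jar, and class names are hypothetical):

```python
# Sketch of an existing Airflow DAG step retargeted at Dataproc.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocSubmitJobOperator,
)

with DAG("hadoop_etl", start_date=datetime(2024, 1, 1), schedule_interval=None) as dag:
    run_spark_step = DataprocSubmitJobOperator(
        task_id="run_spark_step",
        project_id="my-project",
        region="europe-west4",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "spark_job": {
                "main_class": "com.example.EtlJob",        # hypothetical job class
                "jar_file_uris": ["gs://my-bucket/etl.jar"],
            },
        },
    )
```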

You're training a model to predict housing prices based on an available dataset with real estate properties. Your plan is to train a fully connected neural net, and you've discovered that the dataset contains the latitude and longitude of each property. Real estate professionals have told you that the location of the property is highly influential on price, so you'd like to engineer a feature that incorporates this physical dependency.
What should you do?
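A common technique here is a feature cross of binned latitude and binned longitude, which turns raw coordinates into a single categorical location bucket; an illustrative sketch (bin ranges and counts are hypothetical):

```python
# Illustrative feature cross: bin latitude and longitude, then combine the
# bin indices into one bucket id that a neural net can consume one-hot.
import numpy as np

def latlon_cross(lat, lon, lat_bins, lon_bins):
    """Map a coordinate pair to a single categorical bucket id."""
    lat_idx = np.digitize(lat, lat_bins)
    lon_idx = np.digitize(lon, lon_bins)
    return lat_idx * (len(lon_bins) + 1) + lon_idx  # unique id per cell

lat_bins = np.linspace(37.0, 38.0, num=20)    # hypothetical metro extent
lon_bins = np.linspace(-123.0, -122.0, num=20)
bucket = latlon_cross(37.77, -122.42, lat_bins, lon_bins)
# One-hot encode `bucket` before feeding it to the fully connected net.
```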

You need to compose visualization for operations teams with the following requirements:
Telemetry must include data from all 50,000 installations for the most recent 6 weeks (sampling once every minute).
The report must not be more than 3 hours delayed from live data.
The actionable report should only show suboptimal links.
Most suboptimal links should be sorted to the top.
Suboptimal links can be grouped and filtered by regional geography.
User response time to load the report must be <5 seconds.
You create a data source to store the last 6 weeks of data, and create visualizations that allow viewers to see multiple date ranges, distinct geographic regions, and unique installation types. You always show the latest data without any changes to your visualizations. You want to avoid creating and updating new visualizations each month. What should you do?

You work for a large ecommerce company. You store your customers' order data in Bigtable. You have a garbage collection policy set to delete the data after 30 days, and the number of versions is set to 1. When the data analysts run a query to report total customer spending, the analysts sometimes see customer data that is older than 30 days. You need to ensure that the analysts do not see customer data older than 30 days while minimizing cost and overhead. What should you do?
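Because Bigtable garbage collection is lazy and best-effort, expired cells can still show up in reads; one low-overhead remedy is a timestamp range filter at query time, sketched here (project, instance, and table ids are hypothetical):

```python
# Sketch: filter reads to the last 30 days so lazily garbage-collected
# cells never reach the analysts.
import datetime

from google.cloud import bigtable
from google.cloud.bigtable.row_filters import TimestampRange, TimestampRangeFilter

client = bigtable.Client(project="my-project")
table = client.instance("orders-instance").table("orders")

cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=30)
rows = table.read_rows(filter_=TimestampRangeFilter(TimestampRange(start=cutoff)))
for row in rows:
    ...  # aggregate customer spending from recent cells only
```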

You store historic data in Cloud Storage. You need to perform analytics on the historic data. You want to use a solution that detects invalid data entries and performs data transformations without requiring programming or knowledge of SQL.
What should you do?
