Professional-Data-Engineer: Professional Data Engineer on Google Cloud Platform

Question 1

You are choosing a NoSQL database to handle telemetry data submitted from millions of Internet-of-Things (IoT) devices. The volume of data is growing at 100 TB per year, and each data entry has about 100 attributes. The data processing pipeline does not require atomicity, consistency, isolation, and durability (ACID). However, high availability and low latency are required.

You need to analyze the data by querying against individual fields. Which three databases meet your requirements? (Choose three.)

  • Redis

  • HBase

  • MySQL

  • MongoDB

  • Cassandra

  • HDFS with Hive
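To put the growth figure in Question 1 into perspective, the stated 100 TB per year can be converted into a rough sustained ingest rate. This is only a back-of-envelope sketch: it assumes decimal units (1 TB = 10**12 bytes), a 365-day year, and perfectly uniform ingest, none of which the question specifies. The function name `ingest_rates` is hypothetical.

```python
# Back-of-envelope ingest rate from the growth figure in Question 1.
# Assumptions (not stated in the question): decimal units, a 365-day
# year, and uniform ingest with no bursts.
TB = 10**12
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def ingest_rates(tb_per_year: float) -> dict:
    """Convert yearly growth in TB into average daily and per-second rates."""
    bytes_per_day = tb_per_year * TB / DAYS_PER_YEAR
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return {
        "gb_per_day": bytes_per_day / 10**9,
        "mb_per_second": bytes_per_second / 10**6,
    }


rates = ingest_rates(100)
print(f"{rates['gb_per_day']:.0f} GB/day, {rates['mb_per_second']:.2f} MB/s")
# prints "274 GB/day, 3.17 MB/s"
```

Real device traffic is rarely uniform, so peak write rates are likely to be well above this average.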

Question 2

You have decided to use Cloud Datastore to ingest vehicle telemetry data in real time. You want to build a storage system that accounts for long-term data growth while keeping costs low. You also want to create periodic snapshots of the data so that you can perform a point-in-time (PIT) recovery or clone a copy of the data into Cloud Datastore in a different environment. You want to archive these snapshots for a long time. Which two methods can accomplish this? (Choose two.)

  • Use managed export, and store the data in a Cloud Storage bucket using Nearline or Coldline class.

  • Use managed export, and then import to Cloud Datastore in a separate project under a unique namespace reserved for that export.

  • Use managed export, and then import the data into a BigQuery table created just for that export, and delete temporary export files.

  • Write an application that uses Cloud Datastore client libraries to read all the entities. Treat each entity as a BigQuery table row via BigQuery streaming insert. Assign an export timestamp for each export, and attach it as an extra column for each row. Make sure that the BigQuery table is partitioned using the export timestamp column.

  • Write an application that uses Cloud Datastore client libraries to read all the entities. Format the exported data into a JSON file. Apply compression before storing the data in Cloud Source Repositories.

Question 3

You are training a spam classifier. You notice that you are overfitting the training data. Which three actions can you take to resolve this problem? (Choose three.)

  • Get more training examples

  • Reduce the number of training examples

  • Use a smaller set of features

  • Use a larger set of features

  • Increase the regularization parameters

  • Decrease the regularization parameters
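As background for the term "regularization parameter" used in the options above, the sketch below shows what such a parameter does in the simplest possible setting: a one-feature linear model fitted by L2-regularized least squares. This is an illustrative toy, not a spam classifier; the data points and the function name `ridge_weight` are made up for the example.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form L2-regularized least-squares weight for y ~ w * x (no intercept).

    Minimizes sum((y - w * x) ** 2) + lam * w ** 2, which gives
    w = sum(x * y) / (sum(x * x) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical toy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# A larger regularization parameter pulls the fitted weight toward zero.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(f"lam={lam:>5}: w={ridge_weight(xs, ys, lam):.3f}")
```

With `lam=0` the fit is ordinary least squares; as `lam` grows, the penalty term dominates and the model becomes less flexible.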