DP-203 Data Engineering on Microsoft Azure

Question 16 (New Update)

You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1.

New files are uploaded daily to storage1.

You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:

• Incrementally process new files as they are uploaded to storage1.

• Minimize implementation and maintenance effort.

• Minimize the cost of processing millions of files.

• Support schema inference and schema drift.

Which should you include in the recommendation?

  • Auto Loader

  • Apache Spark FileStreamSource

  • COPY INTO

  • Azure Data Factory
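
For reference, the first option, Auto Loader, is exposed in Azure Databricks as the `cloudFiles` structured streaming source. The following is a minimal sketch of what such a stream might look like, not a definitive implementation. The helper names (`auto_loader_options`, `start_ingest`), the default `json` format, and every path and table name passed in are assumptions made for illustration; only the storage1 account name comes from the question. File notification mode (`cloudFiles.useNotifications`) is assumed to be permitted in the environment.

```python
def auto_loader_options(schema_location: str, file_format: str = "json") -> dict:
    """Build reader options for the Auto Loader (cloudFiles) source.

    The option keys are Auto Loader options; the values chosen here are
    illustrative assumptions, not settings taken from the question.
    """
    return {
        # Format of the incoming files (assumed to be JSON in this sketch).
        "cloudFiles.format": file_format,
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Infer column types instead of reading every column as a string.
        "cloudFiles.inferColumnTypes": "true",
        # Add newly seen columns to the schema when the input drifts.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        # Discover new files through notifications rather than by
        # repeatedly listing the directory.
        "cloudFiles.useNotifications": "true",
    }


def start_ingest(spark, source_path: str, checkpoint_path: str, target_table: str):
    """Start an incremental ingest from source_path into target_table.

    `spark` is an existing SparkSession on a Databricks cluster; this
    function only runs there, because cloudFiles is a Databricks source.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**auto_loader_options(checkpoint_path + "/schema"))
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)  # tracks processed files
        .option("mergeSchema", "true")                  # let the table accept new columns
        .trigger(availableNow=True)                     # process the backlog, then stop
        .toTable(target_table)
    )
```

On a cluster, a call might look like `start_ingest(spark, "abfss://<container>@storage1.dfs.core.windows.net/<path>", "<checkpoint path>", "<table name>")`, where the container, path, checkpoint location, and table name are placeholders to be supplied.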

Question 17 (New Update)

You are designing a financial transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will have a clustered columnstore index and will include the following columns:

You have the following query requirements:

You need to recommend a partition strategy for the table to minimize query times.

On which column should you recommend partitioning the table?

  • CustomerSegment

  • AccountType

  • TransactionType

  • TransactionMonth

Question 18 (Mixed Questions)

You have the following Azure Data Factory pipelines:

• Ingest Data from System1

• Ingest Data from System2

• Populate Dimensions

• Populate Facts

Ingest Data from System1 and Ingest Data from System2 have no dependencies. Populate Dimensions must execute after Ingest Data from System1 and Ingest Data from System2. Populate Facts must execute after Populate Dimensions. All the pipelines must execute every eight hours.

What should you do to schedule the pipelines for execution?

  • Add an event trigger to all four pipelines.

  • Add a schedule trigger to all four pipelines.

  • Create a parent pipeline that contains the four pipelines and use a schedule trigger.

  • Create a parent pipeline that contains the four pipelines and use an event trigger.
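
The dependency rules stated in Question 18 can be sketched as a small graph to see which pipelines may run together and which must wait. This is only an illustration in plain Python using the standard-library `graphlib` module. It is not Azure Data Factory code, and the `DEPENDENCIES` mapping and `execution_stages` helper are names assumed for this sketch.

```python
from graphlib import TopologicalSorter

# Each pipeline maps to the set of pipelines that must finish before it starts,
# as described in the question text.
DEPENDENCIES = {
    "Ingest Data from System1": set(),
    "Ingest Data from System2": set(),
    "Populate Dimensions": {"Ingest Data from System1", "Ingest Data from System2"},
    "Populate Facts": {"Populate Dimensions"},
}


def execution_stages(dependencies):
    """Group pipelines into stages; pipelines in one stage can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        # Everything returned here has all of its prerequisites completed.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for number, stage in enumerate(execution_stages(DEPENDENCIES), start=1):
    print(number, stage)
# 1 ['Ingest Data from System1', 'Ingest Data from System2']
# 2 ['Populate Dimensions']
# 3 ['Populate Facts']
```

The two ingest pipelines land in the same stage because neither depends on the other, while each later stage starts only after the previous one completes.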