Databricks-Certified-Professional-Data-Engineer Databricks Certified Data Engineer Professional

Showing 4–6 of 15 questions

Question 4

What is the main difference between the silver layer and gold layer in medallion architecture?

  • Silver is optimized to perform ETL, Gold is optimized for query performance

  • Gold is optimized to perform ETL, Silver is optimized for query performance

  • Silver is a copy of Bronze, Gold is a copy of Silver

  • Silver is stored in Delta Lake, Gold is stored in memory

  • Silver may contain aggregated data, Gold may preserve the granularity of the original data

Question 5

A team member is setting up a Delta Live Tables pipeline and wants to add a second gold table to the same pipeline, with aggregated metrics based on an existing Delta Live table called sales_orders_cleaned. The pipeline fails to start, stating that it cannot find the table sales_orders_cleaned. You are asked to identify and fix the problem.

CREATE LIVE TABLE sales_order_in_chicago
AS
SELECT order_date, city, sum(price) AS sales
FROM sales_orders_cleaned
WHERE city = 'Chicago'
GROUP BY order_date, city

  • Use STREAMING LIVE instead of LIVE table

  • Delta live table can be used in a group by clause

  • A Delta Live Tables pipeline can only have one table

  • The sales_orders_cleaned table is missing the schema name LIVE

  • The pipeline needs to be deployed so the first table is created before we add a second table

Question 6

Which of the following Structured Streaming queries performs a hop from a Bronze table to a Silver table?

  • (spark.table("sales").groupBy("store")
    .agg(sum("sales")).writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .table("aggregatedSales"))

  • (spark.table("sales").agg(sum("sales"), sum("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("complete")
    .table("aggregatedSales"))

  • (spark.table("sales")
    .withColumn("avgPrice", col("sales") / col("units"))
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("append")
    .table("cleanedSales"))

  • (spark.readStream.load(rawSalesLocation)
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("append")
    .table("uncleanedSales"))

  • (spark.read.load(rawSalesLocation)
    .writeStream
    .option("checkpointLocation", checkpointPath)
    .outputMode("append")
    .table("uncleanedSales"))