Databricks-Certified-Professional-Data-Scientist Databricks Certified Professional Data Scientist Exam

Showing 1–3 of 10 questions

Question 1

Which of the following are true with regard to the K-Means clustering algorithm?

Select all that apply.

  • Labels are not pre-assigned to the objects in a cluster.

  • Labels are pre-assigned to the objects in a cluster.

  • It classifies the data based on the labels.

  • It discovers the center of each cluster.

  • It finds which cluster each object falls into.
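
For readers who want to see the mechanics behind these options, here is a minimal pure-Python sketch of the standard K-Means iteration (Lloyd's algorithm). The function names `squared_distance` and `kmeans`, the sample points, and the choice to pass the starting centers explicitly are all assumptions made for illustration; they are not part of the exam material.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, iterations=10):
    """Illustrative K-Means: returns (centers, assignments).

    The input carries no labels. Cluster indices are produced by the
    algorithm itself, together with the center of each cluster.
    """
    centers = [tuple(c) for c in initial_centers]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, assignments


# Hypothetical toy data: two well-separated groups of unlabeled points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centers, assignments = kmeans(points, initial_centers=[points[0], points[3]])
print(centers)
print(assignments)
```

Starting centers are passed in explicitly here only to keep the sketch deterministic; practical implementations usually pick them randomly or with a seeding heuristic.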


Question 2

What are the advantages of hashing features?

Select all that apply.

  • Requires less memory

  • Fewer passes through the training data

  • Easy to reverse engineer vectors to determine which original feature mapped to a vector location
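
As background for this question, the following is a small sketch of the hashing trick using only the Python standard library. The function name `hash_features`, the bucket count of 8, and the sample tokens are assumptions chosen for illustration; production libraries use their own hash functions and options.

```python
import hashlib


def hash_features(tokens, n_buckets=8):
    """Map a list of string tokens to a fixed-length count vector.

    No vocabulary dictionary is built, so the vector size is fixed up front
    and the data can be encoded in a single pass.
    """
    vec = [0] * n_buckets
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        # Use the first 8 bytes of the digest as an integer bucket selector.
        index = int.from_bytes(digest[:8], "big") % n_buckets
        vec[index] += 1
    return vec


# Hypothetical tokens, used only to exercise the function.
vec = hash_features(["spark", "delta", "spark", "mlflow"])
print(vec)
```

Because different tokens can land in the same bucket, a vector position on its own does not say which original feature produced it.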


Question 3

Suppose that the probability that a pedestrian will be hit by a car while crossing the road at a pedestrian crossing, without paying attention to the traffic light, is to be computed. Let H be a discrete random variable taking one value from {Hit, Not Hit}. Let L be a discrete random variable taking one value from {Red, Yellow, Green}.

Realistically, H will be dependent on L. That is, P(H = Hit) and P(H = Not Hit) will take different values depending on whether L is red, yellow or green. A person is, for example, far more likely to be hit by a car when trying to cross while the lights for cross traffic are green than if they are red. In other words, for any given possible pair of values for H and L, one must consider the joint probability distribution of H and L to find the probability of that pair of events occurring together if the pedestrian ignores the state of the light.

Here is a table showing the conditional probabilities of being hit, depending on the state of the lights. (Note that the columns in this table must add up to 1 because the probability of being hit or not hit is 1 regardless of the state of the light.)

Select all that apply.

  • The marginal probability P(H=Hit) is the sum along the H=Hit row of this joint distribution table, as this is the probability of being hit when the lights are red OR yellow OR green.

  • The marginal probability P(H=Not Hit) is the sum of the H=Not Hit row.

  • The marginal probability P(H=Not Hit) is the sum of the H=Hit row.
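
The row-sum idea in these options can be checked with a short sketch. The table referred to in the question is not reproduced on this page, so the joint probabilities below are made-up illustrative values, and the names `joint` and `marginal` are assumptions as well.

```python
# Hypothetical joint distribution P(H, L); these numbers are illustrative
# only and are not taken from the exam's table.
joint = {
    "Hit":     {"Red": 0.002, "Yellow": 0.010, "Green": 0.560},
    "Not Hit": {"Red": 0.198, "Yellow": 0.090, "Green": 0.140},
}


def marginal(joint, h):
    """Marginal P(H = h): sum the joint probabilities along the row for h."""
    return sum(joint[h].values())


p_hit = marginal(joint, "Hit")
p_not_hit = marginal(joint, "Not Hit")
print(round(p_hit, 3), round(p_not_hit, 3))
```

Summing a row adds up the probability of that outcome over every light state, and the two marginals together account for all of the probability mass.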