Databricks-Certified-Professional-Data-Scientist – Databricks Certified Professional Data Scientist Exam

Question 4

Select the correct statements that apply to K-Nearest Neighbors.

Select all that apply, then click Submit answer.

○ No assumption about the data
○ Computationally expensive
○ Requires less memory
○ Works with numeric values

Question 5

Suppose you are given two random variables X and Y whose joint distribution is already known. The marginal distribution of X is simply the probability distribution of X averaging over information about Y; it is the probability distribution of X when the value of Y is not known. How do you calculate the marginal distribution of X?

Select all that apply, then click Submit answer.

○ This is typically calculated by summing the joint probability distribution over Y.
○ This is typically calculated by integrating the joint probability distribution over Y
○ This is typically calculated by summing (In case of discrete variable) the joint probability distribution over Y
○ This is typically calculated by integrating (In case of continuous variable) the joint probability distribution over Y.

Reference / correct answer:

This is typically calculated by summing the joint probability distribution over Y.

This is typically calculated by integrating the joint probability distribution over Y

This is typically calculated by summing (In case of discrete variable) the joint probability distribution over Y

This is typically calculated by integrating (In case of continuous variable) the joint probability distribution over Y.

Explanation: Given two random variables X and Y whose joint distribution is known, the marginal distribution of X is simply the probability distribution of X averaging over information about Y. It is the probability distribution of X when the value of Y is not known.

This is typically calculated by summing or integrating the joint probability distribution over Y.

For discrete random variables, the marginal probability mass function can be written as Pr(X = x). This is

$$\Pr(X = x) = \sum_{y} \Pr(X = x, Y = y) = \sum_{y} \Pr(X = x \mid Y = y)\,\Pr(Y = y),$$

where Pr(X = x, Y = y) is the joint distribution of X and Y, while Pr(X = x | Y = y) is the conditional distribution of X given Y. In this case, the variable Y has been marginalized out. Bivariate marginal and joint probabilities for discrete random variables are often displayed as two-way tables.
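The discrete case can be sketched in a few lines of Python. This is only an illustration: the two-way table `joint`, its labels, and its probabilities are made up for the example, and the helper names are hypothetical. It sums the joint distribution over Y, then recomputes the same marginal from the conditional distribution of X given Y.

```python
# Hypothetical joint distribution Pr(X = x, Y = y) as a two-way table.
# Keys are (x, y) pairs; the probabilities are invented and sum to 1.
joint = {
    ("sunny", "hot"): 0.30,
    ("sunny", "cold"): 0.10,
    ("rainy", "hot"): 0.15,
    ("rainy", "cold"): 0.45,
}


def marginal_x(table):
    """Marginalize Y out: Pr(X = x) = sum over y of Pr(X = x, Y = y)."""
    result = {}
    for (x, _y), p in table.items():
        result[x] = result.get(x, 0.0) + p
    return result


def marginal_y(table):
    """Marginalize X out: Pr(Y = y) = sum over x of Pr(X = x, Y = y)."""
    result = {}
    for (_x, y), p in table.items():
        result[y] = result.get(y, 0.0) + p
    return result


def marginal_x_via_conditional(table):
    """Same marginal as sum over y of Pr(X = x | Y = y) * Pr(Y = y)."""
    p_y = marginal_y(table)
    result = {}
    for (x, y), p in table.items():
        conditional = p / p_y[y]  # Pr(X = x | Y = y)
        result[x] = result.get(x, 0.0) + conditional * p_y[y]
    return result


p_x = marginal_x(joint)
p_x_alt = marginal_x_via_conditional(joint)
print(p_x)
print(p_x_alt)
```

Both routes should agree up to floating-point rounding, which is the point of the two equal sums in the formula above.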

Similarly for continuous random variables, the marginal probability density function can be written as $p_X(x)$. This is

$$p_X(x) = \int_{y} p_{X,Y}(x, y)\,dy = \int_{y} p_{X \mid Y}(x \mid y)\,p_Y(y)\,dy,$$

where $p_{X,Y}(x, y)$ gives the joint distribution of X and Y, while $p_{X \mid Y}(x \mid y)$ gives the conditional distribution for X given Y. Again, the variable Y has been marginalized out.
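For the continuous case, the integral over Y can be approximated numerically. The sketch below assumes a made-up joint density, p(x, y) = x + y on the unit square (it integrates to 1), and uses a simple midpoint rule; the function names and the step count are illustrative choices, not part of the original explanation.

```python
def joint_density(x, y):
    """Hypothetical joint density p(x, y) = x + y on the unit square."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_density_x(x, steps=1000):
    """Approximate p_X(x) = integral of p(x, y) dy using the midpoint rule."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        y_mid = (i + 0.5) * width  # midpoint of the i-th slice of [0, 1]
        total += joint_density(x, y_mid) * width
    return total


# Analytically, integrating x + y over y in [0, 1] gives x + 1/2.
approx = marginal_density_x(0.3)
exact = 0.3 + 0.5
print(approx, exact)
```

Because this density is linear in y, the midpoint rule matches the analytic marginal up to rounding; for a general density the approximation improves as `steps` grows.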

Note that a marginal probability can always be written as an expected value:

$$p_X(x) = \int_{y} p_{X \mid Y}(x \mid y)\,p_Y(y)\,dy = \mathbb{E}_Y\left[p_{X \mid Y}(x \mid y)\right]$$

Intuitively, the marginal probability of X is computed by examining the conditional probability of X given a particular value of Y, and then averaging this conditional probability over the distribution of all values of Y. This follows from the definition of expected value, i.e. in general

$$\mathbb{E}_Y\left[f(Y)\right] = \int_{y} f(y)\,p_Y(y)\,dy$$

Question 6

What is the considerable difference between L1 and L2 regularization?

Select an option, then click Submit answer.

○ L1 regularization has more accuracy of the resulting model
○ Size of the model can be much smaller in L1 regularization than that produced by L2 regularization
○ L2 regularization can be of vital importance when the application is deployed in resource-tight environments such as cell phones.
○ All of the above are correct

Reference / correct answer:

Size of the model can be much smaller in L1 regularization than that produced by L2 regularization

Explanation: The two most common regularization methods are called L1 and L2 regularization. L1 regularization penalizes the weight vector for its L1-norm (i.e. the sum of the absolute values of the weights), whereas L2 regularization uses its L2-norm. There is usually not a considerable difference between the two methods in terms of the accuracy of the resulting model (Gao et al., 2007), but L1 regularization has a significant advantage in practice. Because many of the feature weights become zero as a result of L1-regularized training, the size of the model can be much smaller than that produced by L2 regularization. Compact models require less space in memory and storage, and enable the application to start up quickly. These merits can be of vital importance when the application is deployed in resource-tight environments such as cell phones.
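The sparsity effect can be illustrated with a deliberately simplified sketch. It assumes each weight can be shrunk independently (as with orthonormal features), in which case the L1-penalized solution is soft-thresholding and the L2-penalized solution is a uniform rescaling. The starting weights and the penalty strength are made up, and this is not the training procedure used in the cited work.

```python
def l1_shrink(weight, strength):
    """Soft-thresholding: minimizes 0.5*(w - weight)**2 + strength*abs(w)."""
    if weight > strength:
        return weight - strength
    if weight < -strength:
        return weight + strength
    return 0.0  # small weights are set exactly to zero


def l2_shrink(weight, strength):
    """Minimizes 0.5*(w - weight)**2 + 0.5*strength*w**2."""
    return weight / (1.0 + strength)  # shrinks, but never reaches zero


# Hypothetical unregularized weights and penalty strength.
unregularized = [2.5, -0.3, 0.05, 1.2, -0.08, 0.4]
strength = 0.5

l1_weights = [l1_shrink(w, strength) for w in unregularized]
l2_weights = [l2_shrink(w, strength) for w in unregularized]

# Only non-zero weights need to be stored in a compact model.
l1_nonzero = sum(1 for w in l1_weights if w != 0.0)
l2_nonzero = sum(1 for w in l2_weights if w != 0.0)
print("L1 non-zero weights:", l1_nonzero)
print("L2 non-zero weights:", l2_nonzero)
```

Under these assumptions the L1 version zeroes out every weight whose magnitude is at most the penalty strength, while the L2 version keeps all of them non-zero, which is why the L1 model can be stored more compactly.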

Regularization works by adding a penalty associated with the coefficient values to the error of the hypothesis. This way, an accurate hypothesis with unlikely coefficients is penalized, while a somewhat less accurate but more conservative hypothesis with low coefficients is not penalized as much.
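As a rough illustration of that trade-off, the sketch below adds a penalty to the error of two hypothetical hypotheses. The error values, coefficients, and penalty strength are invented, the function name is illustrative, and the L2 penalty uses the squared norm, which is a common convention assumed here rather than something stated above.

```python
def penalized_error(error, coefficients, strength, norm="l2"):
    """Return error + strength * penalty for the chosen norm."""
    if norm == "l1":
        penalty = sum(abs(c) for c in coefficients)
    elif norm == "l2":
        # Assumption: squared L2 norm, as is common in practice.
        penalty = sum(c * c for c in coefficients)
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    return error + strength * penalty


# Hypothetical hypotheses: one fits better but has extreme coefficients.
accurate_but_extreme = {"error": 0.10, "coefficients": [12.0, -9.5, 7.0]}
conservative = {"error": 0.25, "coefficients": [0.8, -0.5, 0.3]}
strength = 0.01

score_extreme = penalized_error(
    accurate_but_extreme["error"], accurate_but_extreme["coefficients"], strength
)
score_conservative = penalized_error(
    conservative["error"], conservative["coefficients"], strength
)
print("extreme:", score_extreme)
print("conservative:", score_conservative)
```

With these made-up numbers the conservative hypothesis ends up with the lower penalized error even though its raw error is higher; a smaller `strength` would tilt the comparison back toward the more accurate hypothesis.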
