Unit 5 - Subjective Questions

MTH265 • Practice Questions with Detailed Answers

Define a Discrete Random Variable. Give an example to illustrate your definition.

Discrete Random Variable:

A random variable is a function that maps outcomes of a random experiment to real numbers. It is called discrete if it takes on a finite or countably infinite number of distinct values.

Example:
Consider the experiment of tossing two fair coins.

Let $X$ be the random variable representing the number of heads obtained.
The sample space of possible outcomes is $S = \{HH, HT, TH, TT\}$.
The random variable maps these outcomes to real numbers:
- $X(HH) = 2$
- $X(HT) = 1$
- $X(TH) = 1$
- $X(TT) = 0$
Here, $X$ takes only the values in the finite set $\{0, 1, 2\}$. Thus, $X$ is a discrete random variable.
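The mapping above can be reproduced by enumerating the sample space. The following is a minimal Python sketch; the names `sample_space`, `X`, and `pmf` are illustrative choices, not part of the original notes, and the coins are assumed fair so that all four outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses: HH, HT, TH, TT.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=2)]

def X(outcome: str) -> int:
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

# Fair coins: each outcome has probability 1/4, so
# P(X = x) = (number of outcomes mapped to x) / 4.
counts = Counter(X(outcome) for outcome in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
```

The resulting distribution assigns probability 1/4 to 0 heads, 1/2 to 1 head, and 1/4 to 2 heads.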

What is a Probability Mass Function (PMF)? State its fundamental properties.

Probability Mass Function (PMF):

The PMF is a function that gives the probability that a discrete random variable is exactly equal to some value. If $X$ is a discrete random variable, its PMF is denoted by $p(x) = P(X = x)$.

Fundamental Properties:
For any discrete random variable $X$ with possible values $x_1, x_2, x_3, \dots$ , the PMF must satisfy two main properties:

Non-negativity: The probability of any specific outcome is always greater than or equal to zero.
$p(x_i) \ge 0 \quad \text{for all } i$
Summation to 1: The probabilities of all possible values of $X$ must sum to exactly 1.
$\sum_{i} p(x_i) = 1$
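Both properties are easy to check mechanically. The sketch below uses a hypothetical helper `is_valid_pmf` and stores the PMF as a dictionary of exact fractions. With floating-point probabilities, an exact comparison to 1 would need to be replaced by a tolerance check such as `math.isclose`.

```python
from fractions import Fraction

def is_valid_pmf(pmf: dict) -> bool:
    """Check non-negativity and that the probabilities sum to exactly 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
incomplete = {0: Fraction(1, 2), 1: Fraction(1, 3)}  # sums to 5/6, not 1

print(is_valid_pmf(fair_die))    # True
print(is_valid_pmf(incomplete))  # False
```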

Explain the concept of Expected Value (Mathematical Expectation) for a discrete random variable.

Expected Value (Mathematical Expectation):

The expected value of a discrete random variable $X$ is a weighted average of all possible values that the variable can take on, where each value is weighted by its respective probability. It represents the "long-run average" value of the variable if the random experiment were repeated many times.

Formula:
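If $X$ takes values $x_1, x_2, \dots, x_n$ with probabilities $p(x_1), p(x_2), \dots, p(x_n)$, the expected value $E[X]$ (or $\mu$) is defined as:
$E[X] = \sum_{i=1}^{n} x_i \, p(x_i) = \sum_{x} x \cdot P(X = x)$
If $X$ takes countably infinitely many values, the same sum runs over all of them, and $E[X]$ is defined only when the series converges absolutely.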

Key Points:

It is a measure of central tendency.
$E[X]$ does not have to be a value that $X$ can actually take (e.g., the expected value of a fair die roll is $3.5$).
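As a quick illustration of the weighted-average formula, the sketch below computes $E[X]$ for two PMFs: the number of heads in two fair coin tosses and a fair die roll. The helper name `expected_value` is an assumption made for this example.

```python
from fractions import Fraction

def expected_value(pmf: dict) -> Fraction:
    """E[X] = sum over x of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())

two_coin_heads = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(two_coin_heads))  # 1
print(expected_value(fair_die))        # 7/2
```

The die result, 7/2 = 3.5, is not a face of the die, which matches the second key point above.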

State and explain the Linearity of Expectation. Provide its mathematical formula.

Linearity of Expectation:

The linearity of expectation is a fundamental property stating that the expected value of the sum of random variables is equal to the sum of their individual expected values, regardless of whether the random variables are independent or not.

Mathematical Formulas:

Scaling and Shifting: For any random variable $X$ and constants $a$ and $b$ :
$E[aX + b] = aE[X] + b$
Sum of Variables: For any random variables $X$ and $Y$ :
$E[X + Y] = E[X] + E[Y]$
General Form: For a sequence of random variables $X_1, X_2, \dots, X_n$ and constants $c_1, c_2, \dots, c_n$ :
$E\left[\sum_{i=1}^{n} c_i X_i\right] = \sum_{i=1}^{n} c_i E[X_i]$

Significance:
It dramatically simplifies complex probability calculations, especially when dealing with the sum of indicator random variables.
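To see that independence is not required, consider a small example chosen for illustration: let $X$ be a fair die roll and $Y = 7 - X$ (the opposite face), so $Y$ is completely determined by $X$. The sketch below checks that $E[X + Y] = E[X] + E[Y]$ still holds, while the product rule $E[XY] = E[X]E[Y]$ fails for this dependent pair.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # fair die

def E(g):
    """Expectation of g(X) for a fair die roll X."""
    return sum(g(x) * p for x in faces)

e_x = E(lambda x: x)               # E[X]
e_y = E(lambda x: 7 - x)           # E[Y] with Y = 7 - X
e_sum = E(lambda x: x + (7 - x))   # E[X + Y]
e_prod = E(lambda x: x * (7 - x))  # E[XY]

print(e_sum, e_x + e_y)    # 7 7
print(e_prod, e_x * e_y)   # 28/3 49/4
```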

Derive the formula for Variance $V(X) = E[X^2] - (E[X])^2$ from the standard definition of variance.

Derivation of Variance Formula:

The standard definition of variance for a random variable $X$ with expected value $\mu = E[X]$ is the expected value of the squared deviation from the mean:
$V(X) = E[(X - \mu)^2]$

Steps:

Expand the square inside the expectation:
$V(X) = E[X^2 - 2\mu X + \mu^2]$
Apply the linearity of expectation term by term (using $E[X + Y] = E[X] + E[Y]$ and $E[aX + b] = aE[X] + b$):
$V(X) = E[X^2] - E[2\mu X] + E[\mu^2]$
Since $\mu$ is a constant, $E[2\mu X] = 2\mu E[X]$ and $E[\mu^2] = \mu^2$ :
$V(X) = E[X^2] - 2\mu E[X] + \mu^2$
Substitute $\mu = E[X]$ back into the equation:
$V(X) = E[X^2] - 2(E[X])(E[X]) + (E[X])^2$
$V(X) = E[X^2] - 2(E[X])^2 + (E[X])^2$
Simplify the expression:
$V(X) = E[X^2] - (E[X])^2$

This is often called the computational formula for variance.
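A numerical check can make the identity concrete. The sketch below evaluates both forms of the variance on one PMF (the number of heads in two fair coin tosses, chosen here purely for illustration) using exact fractions.

```python
from fractions import Fraction

# PMF of the number of heads in two fair coin tosses (illustrative choice).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(x * p for x, p in pmf.items())                          # E[X]
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]
e_x2 = sum(x**2 * p for x, p in pmf.items())                     # E[X^2]
var_shortcut = e_x2 - mu**2                                      # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)  # 1/2 1/2
```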

Define a Bernoulli trial. How does it relate to the Binomial Distribution?

Bernoulli Trial:

A Bernoulli trial is a random experiment that has exactly two mutually exclusive possible outcomes, typically labeled as "success" and "failure".

Let the probability of success be $p$ .
Let the probability of failure be $q = 1 - p$ .

Relation to Binomial Distribution:

A Binomial Distribution represents the number of successes in a fixed number of independent Bernoulli trials.
If we repeat a Bernoulli trial $n$ times independently, and we define a random variable $X$ as the total number of successes in these $n$ trials, then $X$ follows a Binomial Distribution, denoted as $X \sim B(n, p)$ .
The probability of getting exactly $k$ successes is given by the PMF:
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
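The PMF translates directly into code. This is a minimal sketch using `math.comb` from the standard library; the function name `binomial_pmf` and the parameters $n = 4$, $p = 0.5$ are assumptions made for the example.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p): choose which k trials succeed."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.5  # four fair coin tosses, success = heads
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0
```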

Derive the Expected Value of a Binomial Distribution.

Derivation of $E[X]$ for Binomial Distribution:

Let $X \sim B(n, p)$ . The PMF is $P(X = k) = \binom{n}{k} p^k q^{n-k}$ where $q = 1-p$ .

Method 1: Using Linearity of Expectation (Simpler)

Express $X$ as the sum of $n$ independent Bernoulli trials: $X = I_1 + I_2 + \dots + I_n$ , where $I_j = 1$ if the $j$ -th trial is a success, and $0$ otherwise.
The expected value of a single Bernoulli trial is $E[I_j] = 1 \cdot p + 0 \cdot q = p$ .
By linearity of expectation:
$E[X] = E\left[\sum_{j=1}^n I_j\right] = \sum_{j=1}^n E[I_j] = \sum_{j=1}^n p = np$

Method 2: Using the Definition
$E[X] = \sum_{k=0}^n k \binom{n}{k} p^k q^{n-k}$
Since the term is 0 when $k=0$ , we start from $k=1$ :
$E[X] = \sum_{k=1}^n k \frac{n!}{k!(n-k)!} p^k q^{n-k}$
Cancel $k$ with the $k!$ in the denominator:
$E[X] = \sum_{k=1}^n \frac{n!}{(k-1)!(n-k)!} p^k q^{n-k}$
Factor out $n$ and $p$ :
$E[X] = np \sum_{k=1}^n \frac{(n-1)!}{(k-1)!(n-k)!} p^{k-1} q^{n-k}$
Let $j = k-1$ and $m = n-1$ . As $k$ goes from $1$ to $n$ , $j$ goes from $0$ to $m$ :
$E[X] = np \sum_{j=0}^m \binom{m}{j} p^j q^{m-j}$
The sum is the binomial expansion of $(p+q)^m = 1^m = 1$ .
Thus, $E[X] = np$ .
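The result can be sanity-checked by evaluating the defining sum from Method 2 directly. The sketch below does this with exact fractions for one arbitrarily chosen pair, $n = 10$ and $p = 3/10$; it illustrates the formula for a specific case and is not a proof.

```python
from fractions import Fraction
from math import comb

def binomial_mean(n: int, p: Fraction) -> Fraction:
    """Evaluate sum_{k=0}^{n} k * C(n, k) * p^k * q^(n-k) exactly."""
    q = 1 - p
    return sum(k * comb(n, k) * p**k * q ** (n - k) for k in range(n + 1))

n, p = 10, Fraction(3, 10)
print(binomial_mean(n, p), n * p)  # 3 3
```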

Explain the Poisson Distribution. Under what conditions is it an appropriate model?

Poisson Distribution:

The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, provided these events occur with a known constant mean rate and independently of the time since the last event.

Probability Mass Function:
If $X$ is a Poisson random variable with mean rate $\lambda > 0$ , its PMF is:
$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
where $k = 0, 1, 2, \dots$

Appropriate Conditions:

Independence: The occurrence of one event does not affect the probability of a second event.
Constant Rate: The average rate ( $\lambda$ ) at which events occur is constant over the interval.
Singularity: Two events cannot occur at exactly the same instant.
Limiting Case: It is often used as an approximation to the Binomial distribution when the number of trials $n$ is very large and the probability of success $p$ is very small, with $\lambda = np$ of moderate size (formally, $n \to \infty$ and $p \to 0$ while $np = \lambda$ is held fixed).
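The limiting-case condition can be explored numerically. The sketch below compares the Binomial and Poisson PMFs for one assumed setting, $n = 1000$ and $p = 0.002$ (so $\lambda = np = 2$), and reports the largest gap over $k = 0, \dots, 10$. The parameter values and function names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

n, p = 1000, 0.002  # many trials, rare success
lam = n * p         # matching Poisson rate

max_gap = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11)
)
print(f"largest |binomial - poisson| for k = 0..10: {max_gap:.6f}")
```

For this setting the two PMFs agree to within about one part in a thousand at every $k$ checked.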

Describe the Geometric Distribution and define its PMF.

Geometric Distribution:

The geometric distribution models the number of Bernoulli trials needed to get the first success. For example, flipping a coin repeatedly until a "Heads" appears.

Assumptions:

The phenomenon being modeled consists of a sequence of independent trials.
Each trial has only two possible outcomes (Success or Failure).
The probability of success, $p$ , is the same for every trial.

Probability Mass Function (PMF):
Let $X$ be the number of trials needed to achieve the first success. The probability that the first success occurs on the $k$ -th trial requires exactly $k-1$ failures followed by $1$ success.
$P(X = k) = (1 - p)^{k-1} p$
where $k = 1, 2, 3, \dots$ and $p$ is the probability of success in a single trial.
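A short sketch of this PMF for the coin-flipping example, assuming a fair coin ($p = 1/2$) and using exact fractions. The function name `geometric_pmf` is hypothetical.

```python
from fractions import Fraction

def geometric_pmf(k: int, p: Fraction) -> Fraction:
    """P(X = k): k - 1 failures followed by one success, for k = 1, 2, 3, ..."""
    if k < 1:
        return Fraction(0)
    return (1 - p) ** (k - 1) * p

p = Fraction(1, 2)  # fair coin, success = heads
first_three = [geometric_pmf(k, p) for k in (1, 2, 3)]
partial_sum = sum(geometric_pmf(k, p) for k in range(1, 11))

print(first_three)  # [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
print(partial_sum)  # 1023/1024
```

The partial sum over the first ten trials is already close to 1, consistent with the probabilities summing to 1 over all $k \ge 1$.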

What is the memoryless property of the Geometric Distribution? Explain it mathematically.

Memoryless Property:

The memoryless property states that the probability of achieving the first success after a certain number of additional trials is independent of how many failures have already occurred. In simple terms, the "system" forgets its past.

Mathematical Explanation:
For a geometric random variable $X$ , the memoryless property is expressed as:
$P(X > s + t \mid X > t) = P(X > s)$
for any integers $s, t \ge 0$ .

Proof Outline:

The probability of needing more than $k$ trials is the probability of $k$ consecutive failures: $P(X > k) = (1-p)^k$ .
Using conditional probability:
$P(X > s + t \mid X > t) = \frac{P(X > s + t \text{ and } X > t)}{P(X > t)}$
Since $s + t \ge t$, the event $X > s + t$ is a subset of $X > t$, so the intersection is just $X > s + t$:
$= \frac{P(X > s + t)}{P(X > t)} = \frac{(1-p)^{s+t}}{(1-p)^t}$
Simplifying gives:
$= (1-p)^s = P(X > s)$

This proves the geometric distribution is memoryless.
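The identity can also be verified numerically without assuming the closed form for $P(X > k)$. In the sketch below, the tail probability is computed from the PMF by summation, and the conditional probability is compared with $P(X > s)$ for a small grid of $s$ and $t$. The value $p = 1/6$ and the helper name `survival` are assumptions for illustration.

```python
from fractions import Fraction

def survival(k: int, p: Fraction) -> Fraction:
    """P(X > k), computed as 1 minus the PMF summed over trials 1..k."""
    return Fraction(1) - sum((1 - p) ** (j - 1) * p for j in range(1, k + 1))

p = Fraction(1, 6)  # e.g. waiting for a six on a fair die
checks = []
for s in range(5):
    for t in range(5):
        conditional = survival(s + t, p) / survival(t, p)  # P(X > s+t | X > t)
        checks.append(conditional == survival(s, p))

print(all(checks))  # True
```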

Compare and contrast the Binomial and Poisson distributions.

Comparison of Binomial and Poisson Distributions:

1. Definition:

Binomial: Models the number of successes in a fixed number of independent trials.
Poisson: Models the number of events occurring in a continuous interval of time or space.

2. Parameters:

Binomial: Defined by two parameters: $n$ (number of trials) and $p$ (probability of success).
Poisson: Defined by a single parameter: $\lambda$ (average rate of occurrence).

3. Possible Values:

Binomial: Random variable $X$ takes finite integer values from $0$ to $n$ .
Poisson: Random variable $X$ takes countably infinitely many values: every non-negative integer $0, 1, 2, \dots$ (there is no upper limit).

4. Mean and Variance:

Binomial: Mean $= np$, Variance $= np(1-p)$. Whenever $p > 0$, Mean $>$ Variance, because $1 - p < 1$.
Poisson: Mean $= \lambda$ , Variance $= \lambda$ . Here, Mean $=$ Variance.

5. Relationship:
Poisson is a limiting case of the Binomial distribution where $n \to \infty$ , $p \to 0$ , and $np = \lambda$ remains constant.

State Markov's Inequality. Provide a brief explanation of its utility.

Markov's Inequality:

If $X$ is any non-negative random variable ( $X \ge 0$ ) and $a > 0$ is any positive real number, then the probability that $X$ is greater than or equal to $a$ is bounded by the expected value of $X$ divided by $a$ .

Formula:
$P(X \ge a) \le \frac{E[X]}{a}$

Utility and Explanation:

Upper Bounds: It provides a loose but guaranteed upper bound on the tail probability of a non-negative random variable using only its expected value.
Minimal Assumptions: It requires almost no knowledge of the underlying probability distribution: only that the variable is non-negative and its mean exists.
Foundation: It serves as a foundational theorem used to prove tighter bounds, such as Chebyshev's inequality.
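To illustrate how loose the bound can be, the sketch below compares the exact tail probability $P(X \ge a)$ of a fair die roll with the Markov bound $E[X]/a$ for $a = 1, \dots, 6$. The die is an assumed example; any non-negative random variable would work.

```python
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())  # E[X] = 7/2

rows = []
for a in range(1, 7):
    tail = sum(p for x, p in pmf.items() if x >= a)  # exact P(X >= a)
    bound = mean / a                                 # Markov bound E[X] / a
    rows.append((a, tail, bound))
    print(f"a = {a}: P(X >= a) = {tail}, Markov bound = {bound}")
```

In every row the exact probability is below the bound, often by a wide margin, and for small $a$ the bound exceeds 1 and carries no information.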

State and prove Chebyshev's Inequality.

Chebyshev's Inequality:
For any random variable $X$ with expected value $\mu$ and finite variance $\sigma^2$ , and for any real number $k > 0$ :
$P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}$

Proof:
We use Markov's inequality to prove Chebyshev's inequality.

Define a new random variable $Y = (X - \mu)^2$ . Since $Y$ is a square, it is non-negative ( $Y \ge 0$ ).
The expected value of $Y$ is the definition of variance:
$E[Y] = E[(X - \mu)^2] = \sigma^2$
Apply Markov's inequality to $Y$ with the constant $a = k^2$ :
$P(Y \ge k^2) \le \frac{E[Y]}{k^2}$
Substitute back $Y = (X - \mu)^2$ and $E[Y] = \sigma^2$ :
$P((X - \mu)^2 \ge k^2) \le \frac{\sigma^2}{k^2}$
The event $(X - \mu)^2 \ge k^2$ is mathematically equivalent to $|X - \mu| \ge k$ . Therefore:
$P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}$

This proves Chebyshev's Inequality, showing that the probability of a random variable deviating from its mean by at least $k$ is bounded above by its variance divided by $k^2$.
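As a concrete check, the sketch below evaluates both sides of the inequality for a fair die roll (an assumed example) at a few values of $k$. The helper name `tail_and_bound` is hypothetical.

```python
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())               # mean, 7/2
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # variance, 35/12

def tail_and_bound(k):
    """Return (exact P(|X - mu| >= k), Chebyshev bound var / k^2)."""
    tail = sum(p for x, p in pmf.items() if abs(x - mu) >= k)
    return tail, var / k**2

for k in (Fraction(3, 2), 2, Fraction(5, 2)):
    tail, bound = tail_and_bound(k)
    print(f"k = {k}: P(|X - mu| >= k) = {tail} <= {bound}")
```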

What does it mean for two discrete random variables to be independent? State the condition.

Independence of Random Variables:

Two discrete random variables $X$ and $Y$ are said to be independent if the realization of one does not affect the probability distribution of the other. In other words, knowing the value of $X$ provides no information about the value of $Y$ , and vice versa.

Mathematical Condition:
$X$ and $Y$ are independent if and only if their joint probability mass function is equal to the product of their marginal probability mass functions for all possible values $x$ and $y$ :
$P(X = x \text{ and } Y = y) = P(X = x) \cdot P(Y = y)$

Properties of Independent Variables:

$E[XY] = E[X]E[Y]$
$V(X + Y) = V(X) + V(Y)$ (The variance of the sum is the sum of the variances, as the covariance is zero).
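The factorization condition can be tested directly on a joint PMF. The sketch below uses a hypothetical helper `are_independent` that computes both marginals and compares every cell of the joint table with their product. The two joint distributions are illustrative: two separate fair coins (independent), and the first coin paired with the total number of heads (dependent).

```python
from collections import defaultdict
from fractions import Fraction

def are_independent(joint: dict) -> bool:
    """True if P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair (x, y)."""
    px, py = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        px[x] += p  # marginal of X
        py[y] += p  # marginal of Y
    return all(
        joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py
    )

quarter = Fraction(1, 4)
# X = first coin (1 = heads), Y = second coin.
two_coins = {(x, y): quarter for x in (0, 1) for y in (0, 1)}
# X = first coin, Y = total number of heads in both tosses.
coin_and_total = {(0, 0): quarter, (0, 1): quarter, (1, 1): quarter, (1, 2): quarter}

print(are_independent(two_coins))       # True
print(are_independent(coin_and_total))  # False
```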

Derive the Expected Value of a Geometric Distribution.

Derivation of $E[X]$ for a Geometric Distribution:

Let $X \sim \text{Geometric}(p)$ . The PMF is $P(X = k) = q^{k-1}p$ , where $q = 1-p$ and $k \ge 1$ .

The expected value is:
$E[X] = \sum_{k=1}^{\infty} k P(X = k) = \sum_{k=1}^{\infty} k q^{k-1} p$

Steps:

Factor out $p$ :
$E[X] = p \sum_{k=1}^{\infty} k q^{k-1}$
Recognize that $k q^{k-1}$ is the derivative of $q^k$ with respect to $q$ :
$E[X] = p \sum_{k=1}^{\infty} \frac{d}{dq} (q^k)$
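Since $0 \le q < 1$ lies inside the radius of convergence of the power series $\sum_{k} q^k$, the series can be differentiated term by term, so we can swap the summation and the derivative: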
$E[X] = p \frac{d}{dq} \left( \sum_{k=1}^{\infty} q^k \right)$
The infinite geometric series $\sum_{k=1}^{\infty} q^k$ sums to $\frac{q}{1-q}$ :
$E[X] = p \frac{d}{dq} \left( \frac{q}{1-q} \right)$
Apply the quotient rule to differentiate $\frac{q}{1-q}$ :
$\frac{d}{dq} \left( \frac{q}{1-q} \right) = \frac{(1)(1-q) - (q)(-1)}{(1-q)^2} = \frac{1-q+q}{(1-q)^2} = \frac{1}{(1-q)^2}$
Substitute this back into the expectation equation:
$E[X] = p \cdot \frac{1}{(1-q)^2}$
Since $1-q = p$ , we have:
$E[X] = p \cdot \frac{1}{p^2} = \frac{1}{p}$

Thus, the expected value of a geometric random variable is $1/p$ .
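The closed form can be checked against partial sums of the defining series. The sketch below assumes $p = 0.25$ (so $1/p = 4$) and a hypothetical helper `partial_mean`; the partial sums approach 4 as more terms are included.

```python
def partial_mean(p: float, n_terms: int) -> float:
    """Sum of k * q^(k-1) * p for k = 1..n_terms, where q = 1 - p."""
    q = 1 - p
    return sum(k * q ** (k - 1) * p for k in range(1, n_terms + 1))

p = 0.25
for n_terms in (5, 20, 200):
    print(n_terms, partial_mean(p, n_terms))

print("1/p =", 1 / p)
```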

A fair 6-sided die is rolled. Let X be the outcome of the roll. Calculate the expected value E[X] and the variance V(X).

Calculation of E[X] and V(X):

The random variable $X$ can take values $\{1, 2, 3, 4, 5, 6\}$ , each with probability $P(X=x) = \frac{1}{6}$ .

1. Expected Value $E[X]$ :
$E[X] = \sum_{x=1}^{6} x P(X=x)$
$E[X] = 1(\frac{1}{6}) + 2(\frac{1}{6}) + 3(\frac{1}{6}) + 4(\frac{1}{6}) + 5(\frac{1}{6}) + 6(\frac{1}{6})$
$E[X] = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5$

2. Variance $V(X)$ :
First, calculate $E[X^2]$ :
$E[X^2] = \sum_{x=1}^{6} x^2 P(X=x)$
$E[X^2] = 1^2(\frac{1}{6}) + 2^2(\frac{1}{6}) + 3^2(\frac{1}{6}) + 4^2(\frac{1}{6}) + 5^2(\frac{1}{6}) + 6^2(\frac{1}{6})$
$E[X^2] = \frac{1 + 4 + 9 + 16 + 25 + 36}{6} = \frac{91}{6} \approx 15.167$

Now, use the formula $V(X) = E[X^2] - (E[X])^2$ :
$V(X) = \frac{91}{6} - (3.5)^2 = \frac{91}{6} - 12.25 \approx 15.1667 - 12.25 = 2.9167$
Exact fraction: $V(X) = \frac{91}{6} - \frac{49}{4} = \frac{182 - 147}{12} = \frac{35}{12}$ .
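These values can be cross-checked by simulation, which also illustrates the "long-run average" reading of expectation. The sketch below is a rough Monte Carlo check under assumed settings (100,000 simulated rolls, a fixed seed for repeatability); the sample statistics should land near 3.5 and $35/12 \approx 2.9167$ but will not match them exactly.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(100_000)]

sample_mean = sum(rolls) / len(rolls)
# Population-style variance of the sample: average squared deviation.
sample_var = sum((r - sample_mean) ** 2 for r in rolls) / len(rolls)

print(f"sample mean     = {sample_mean:.4f}  (theory: 3.5)")
print(f"sample variance = {sample_var:.4f}  (theory: {35 / 12:.4f})")
```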

Distinguish between a Probability Mass Function (PMF) and a Cumulative Distribution Function (CDF).

Distinction between PMF and CDF:

Probability Mass Function (PMF):

Definition: Gives the probability that a discrete random variable is exactly equal to a specific value.
Notation: $p(x) = P(X = x)$ .
Applicability: Only applies to discrete random variables.
Properties: $0 \le p(x) \le 1$ and $\sum p(x) = 1$ .

Cumulative Distribution Function (CDF):

Definition: Gives the probability that a random variable evaluates to a value less than or equal to a specific value $x$ .
Notation: $F(x) = P(X \le x)$ .
Applicability: Applies to both discrete and continuous random variables.
Properties: It is a non-decreasing function. $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$ .

Relationship:
For a discrete random variable, the CDF is the running sum of the PMF:
$F(x) = \sum_{k \le x} p(k)$
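The running-sum relationship maps naturally onto `itertools.accumulate`. The sketch below builds the CDF of a fair die roll (an assumed example) at the support points and also defines `cdf(x)` for any real $x$; the names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

values = list(range(1, 7))
pmf = [Fraction(1, 6)] * 6

# Running sum of the PMF gives F(1), F(2), ..., F(6).
cdf_at_values = list(accumulate(pmf))

def cdf(x) -> Fraction:
    """F(x) = P(X <= x) for any real x: add p(k) over all k <= x."""
    return sum((p for v, p in zip(values, pmf) if v <= x), Fraction(0))

print(cdf_at_values[-1])  # 1
print(cdf(0))             # 0
print(cdf(3.5))           # 1/2
```

Between support points the CDF stays flat, which is why `cdf(3.5)` equals `cdf(3)`.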

Explain the application of Bayes' Theorem in the context of discrete random variables.

Bayes' Theorem for Discrete Random Variables:

Bayes' Theorem provides a way to update the probabilities of hypotheses when given evidence. For discrete random variables $X$ and $Y$ , Bayes' Theorem allows us to find the conditional probability $P(X=x \mid Y=y)$ if we know $P(Y=y \mid X=x)$ and the marginal probabilities.

Formula:
$P(X=x \mid Y=y) = \frac{P(Y=y \mid X=x) P(X=x)}{P(Y=y)}$

Using the Law of Total Probability, the denominator $P(Y=y)$ can be expanded across all possible values of $X$ :
$P(X=x_i \mid Y=y) = \frac{P(Y=y \mid X=x_i) P(X=x_i)}{\sum_j P(Y=y \mid X=x_j) P(X=x_j)}$

Application:

Diagnostic Testing: Updating the probability that a patient has a disease ( $X$ ) given a positive test result ( $Y$ ).
Machine Learning: Naive Bayes classifiers use this principle to classify data points into discrete categories based on feature probabilities.
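The diagnostic-testing application can be sketched with the expanded formula. All numbers below are hypothetical and chosen only for illustration: a 1% prior probability of disease, a 95% chance of a positive test given disease, and a 5% chance of a positive test given no disease.

```python
from fractions import Fraction

# Hypothetical inputs (not from the notes): P(X = x) and P(Y = positive | X = x).
prior = {"disease": Fraction(1, 100), "healthy": Fraction(99, 100)}
likelihood_positive = {"disease": Fraction(95, 100), "healthy": Fraction(5, 100)}

# Law of Total Probability: P(Y = positive).
evidence = sum(likelihood_positive[h] * prior[h] for h in prior)

# Bayes' Theorem: P(X = x | Y = positive) for each x.
posterior = {h: likelihood_positive[h] * prior[h] / evidence for h in prior}

print(posterior["disease"])         # 19/118
print(float(posterior["disease"]))  # about 0.161
```

Under these assumed numbers, a positive result raises the probability of disease from 1% to roughly 16%, because false positives from the much larger healthy group dominate.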

A factory produces lightbulbs. The number of defective bulbs in a batch follows a Poisson distribution with a mean of 2. What is the probability of finding exactly 0, 1, or 2 defective bulbs in a batch?

Solution using Poisson Distribution:

Let $X$ be the number of defective bulbs. We are given that $X \sim \text{Poisson}(\lambda = 2)$ .
The PMF for a Poisson distribution is:
$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{2^k e^{-2}}{k!}$

We need to find the probability of $X=0$ , $X=1$ , and $X=2$ , and then sum them up for the total probability $P(X \le 2)$ .

Probability of exactly 0 defectives ( $k=0$ ):
$P(X = 0) = \frac{2^0 e^{-2}}{0!} = \frac{1 \cdot e^{-2}}{1} = e^{-2} \approx 0.1353$
Probability of exactly 1 defective ( $k=1$ ):
$P(X = 1) = \frac{2^1 e^{-2}}{1!} = 2e^{-2} \approx 0.2707$
Probability of exactly 2 defectives ( $k=2$ ):
$P(X = 2) = \frac{2^2 e^{-2}}{2!} = \frac{4e^{-2}}{2} = 2e^{-2} \approx 0.2707$

Total Probability ( $P(X \le 2)$ ):
$P(X \le 2) = P(X=0) + P(X=1) + P(X=2) = e^{-2} + 2e^{-2} + 2e^{-2} = 5e^{-2}$
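$5e^{-2} \approx 5(0.135335) \approx 0.6767$

Answer: The individual probabilities are approximately $0.1353$, $0.2707$, and $0.2707$, and the probability of finding 0, 1, or 2 defective bulbs in total is approximately $0.6767$ (or $67.67\%$).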
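The arithmetic can be reproduced with the standard library. This minimal sketch evaluates the Poisson PMF for $\lambda = 2$ at $k = 0, 1, 2$ and sums the results using the unrounded value of $e^{-2}$, which gives a total of about 0.6767. The function name `poisson_pmf` is an assumption.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

lam = 2
probs = {k: poisson_pmf(k, lam) for k in range(3)}
total = sum(probs.values())  # P(X <= 2) = 5 * e^(-2)

for k, pk in probs.items():
    print(f"P(X = {k}) = {pk:.4f}")
print(f"P(X <= 2) = {total:.4f}")
```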

Using Chebyshev's Inequality, determine the minimum probability that a random variable X lies within 3 standard deviations of its mean.

Application of Chebyshev's Inequality:

Chebyshev's Inequality states that for any random variable $X$ with mean $\mu$ and variance $\sigma^2$ , the probability that $X$ deviates from its mean by $k$ or more is:
$P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2}$

We want to find the probability that $X$ lies within 3 standard deviations of the mean. Let $k = 3\sigma$ .
First, we find the probability that $X$ is outside this range:
$P(|X - \mu| \ge 3\sigma) \le \frac{\sigma^2}{(3\sigma)^2}$
$P(|X - \mu| \ge 3\sigma) \le \frac{\sigma^2}{9\sigma^2} = \frac{1}{9}$

To find the probability that $X$ lies within 3 standard deviations, we take the complement:
$P(|X - \mu| < 3\sigma) = 1 - P(|X - \mu| \ge 3\sigma)$
Since $P(|X - \mu| \ge 3\sigma) \le \frac{1}{9}$ , the complement must be greater than or equal to $1 - \frac{1}{9}$ :
$P(|X - \mu| < 3\sigma) \ge 1 - \frac{1}{9} = \frac{8}{9}$

Conclusion:
The minimum probability that $X$ lies within 3 standard deviations of its mean is $\frac{8}{9}$ (or approximately $88.89\%$ ). This holds true regardless of the shape of the underlying probability distribution.
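The same argument with $k\sigma$ in place of $3\sigma$ gives the general bound $P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2}$. The sketch below tabulates it for a few values of $k$; the function name `chebyshev_lower_bound` is hypothetical.

```python
from fractions import Fraction

def chebyshev_lower_bound(k: int) -> Fraction:
    """Lower bound on P(|X - mu| < k * sigma), for an integer k >= 1."""
    return 1 - Fraction(1, k**2)

bounds = {k: chebyshev_lower_bound(k) for k in (2, 3, 4)}
for k, b in bounds.items():
    print(f"within {k} standard deviations: at least {b} ({float(b):.2%})")
```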