Stochastic Processes Unit 1 – Probability Theory Basics
Probability theory basics form the foundation for understanding random events and processes. This unit covers key concepts like probability measures, sample spaces, and random variables, as well as fundamental rules and axioms that govern probabilistic reasoning.
The unit delves into various probability distributions, expected values, and variance. It also explores conditional probability, independence, and joint distributions, providing essential tools for analyzing complex stochastic systems and real-world applications.
Probability measures the likelihood of an event occurring and ranges from 0 (impossible) to 1 (certain)
Sample space (Ω) represents the set of all possible outcomes of an experiment or random process
Event (A) is a subset of the sample space and represents a collection of outcomes of interest
Random variable (X) assigns a numerical value to each outcome in the sample space and can be discrete or continuous
Probability distribution describes the probabilities of different outcomes for a random variable
Probability mass function (PMF) defines the probability distribution for a discrete random variable
Probability density function (PDF) defines the probability distribution for a continuous random variable
Cumulative distribution function (CDF) gives the probability that a random variable is less than or equal to a specific value
Expectation (E[X]) represents the probability-weighted average of a random variable over all its possible values
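To make these definitions concrete, here is a minimal Python sketch (the names pmf and cdf are illustrative, not from the unit) that builds the PMF, CDF, and expectation of a fair six-sided die by hand.

```python
# Fair six-sided die: sample space Ω = {1, ..., 6}, each outcome with probability 1/6
pmf = {x: 1 / 6 for x in range(1, 7)}             # probability mass function p_X(x)

def cdf(t):
    """F_X(t) = P(X <= t): sum the PMF over all values <= t."""
    return sum(p for x, p in pmf.items() if x <= t)

expectation = sum(x * p for x, p in pmf.items())  # E[X] = ∑_x x·p_X(x)

print(cdf(3))        # P(X <= 3) ≈ 0.5
print(expectation)   # E[X] ≈ 3.5
```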
Probability Axioms and Rules
Axiom 1: Non-negativity - The probability of any event A is greater than or equal to zero, P(A)≥0
Axiom 2: Normalization - The probability of the entire sample space is equal to one, P(Ω)=1
Axiom 3: Countable Additivity - For any countable sequence of disjoint events A_1, A_2, ..., the probability of their union is equal to the sum of their individual probabilities, P(⋃_{i=1}^∞ A_i) = ∑_{i=1}^∞ P(A_i)
Complement Rule: The probability of an event A not occurring (complement) is equal to one minus the probability of A, P(A^c) = 1 − P(A)
Addition Rule: For any two events A and B, the probability of their union is equal to the sum of their individual probabilities minus the probability of their intersection, P(A∪B)=P(A)+P(B)−P(A∩B)
Multiplication Rule: For any two events A and B, the probability of their intersection is equal to the probability of A multiplied by the conditional probability of B given A, P(A∩B)=P(A)⋅P(B∣A)
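As a quick sanity check, the sketch below enumerates the sample space of two fair dice and verifies the complement, addition, and multiplication rules exactly; the events A and B are illustrative choices, not from the unit.

```python
from fractions import Fraction
from itertools import product

# Sample space Ω: all 36 ordered outcomes of rolling two fair dice
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event (a subset of Ω) under the uniform measure."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] % 2 == 0}   # event A: first die shows an even number
B = {w for w in omega if sum(w) >= 9}     # event B: the total is at least 9

p_B_given_A = Fraction(len(A & B), len(A))               # P(B∣A), computed within the reduced space A
assert prob(omega - A) == 1 - prob(A)                    # complement rule
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)    # addition rule
assert prob(A & B) == prob(A) * p_B_given_A              # multiplication rule
print(prob(A | B))                                       # 11/18
```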
Random Variables and Distributions
Discrete random variables take on a countable number of distinct values (integers, finite sets)
Examples include the number of heads in a fixed number of coin flips or the number of defective items in a batch
Continuous random variables can take on any value within a specified range or interval (real numbers)
Examples include the time until a machine fails or the weight of a randomly selected object
Probability mass function (PMF) for a discrete random variable X is denoted as p_X(x) and gives the probability that X takes on a specific value x
Probability density function (PDF) for a continuous random variable X is denoted as f_X(x) and represents the relative likelihood of X taking on a value near x
The probability of X falling within an interval [a, b] is given by the integral of the PDF over that interval, P(a ≤ X ≤ b) = ∫_a^b f_X(x) dx
Cumulative distribution function (CDF) for a random variable X is denoted as F_X(x) and gives the probability that X is less than or equal to a specific value x, F_X(x) = P(X ≤ x)
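The sketch below evaluates these functions with scipy.stats (assumed to be installed; the distributions and parameters are illustrative) and checks that integrating a PDF over [a, b] matches the difference of CDF values.

```python
# Assumes numpy/scipy are available; the distributions and parameters are illustrative.
from scipy import stats
from scipy.integrate import quad

X = stats.norm(loc=0, scale=1)        # continuous RV with PDF f_X and CDF F_X

a, b = -1.0, 2.0
area, _ = quad(X.pdf, a, b)           # P(a ≤ X ≤ b) by integrating the PDF
print(area, X.cdf(b) - X.cdf(a))      # both ≈ 0.8186

Y = stats.binom(n=10, p=0.3)          # discrete RV with PMF p_Y and CDF F_Y
print(Y.pmf(2), Y.cdf(2))             # P(Y = 2) ≈ 0.2335 and P(Y ≤ 2) ≈ 0.3828
```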
Expected Value and Variance
Expected value (mean) of a discrete random variable X is denoted as E[X] and is calculated by summing the product of each possible value and its probability, E[X] = ∑_x x·p_X(x)
Expected value of a continuous random variable X is calculated by integrating the product of each value and its PDF over the entire range, E[X] = ∫_{−∞}^{∞} x·f_X(x) dx
Variance of a random variable X measures the spread of its distribution and is denoted as Var(X) or σ_X^2
For a discrete random variable, Var(X) = ∑_x (x − E[X])^2·p_X(x)
For a continuous random variable, Var(X) = ∫_{−∞}^{∞} (x − E[X])^2·f_X(x) dx
Standard deviation (σ_X) is the square root of the variance and has the same units as the random variable
Linearity of expectation states that for any two random variables X and Y, E[X+Y]=E[X]+E[Y], even if X and Y are not independent
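A minimal numpy sketch (the probabilities are chosen purely for illustration) computes E[X], Var(X), and σ_X from a PMF, then checks linearity of expectation on dependent variables by simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

values = np.array([1, 2, 3, 4, 5, 6])                 # support of a loaded die (illustrative)
probs  = np.array([0.1, 0.1, 0.2, 0.2, 0.2, 0.2])     # its PMF p_X(x)

mean = np.sum(values * probs)                         # E[X] = ∑_x x·p_X(x)
var  = np.sum((values - mean) ** 2 * probs)           # Var(X) = ∑_x (x − E[X])^2·p_X(x)
print(mean, var, np.sqrt(var))                        # expectation, variance, standard deviation σ_X

# Linearity of expectation holds even for dependent variables: here Y is a function of X
X = rng.choice(values, p=probs, size=200_000)
Y = X ** 2
print(np.mean(X + Y), np.mean(X) + np.mean(Y))        # the two estimates coincide
```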
Conditional Probability and Bayes' Theorem
Conditional probability of an event A given event B is denoted as P(A∣B) and represents the probability of A occurring given that B has occurred, P(A∣B) = P(A∩B) / P(B)
Bayes' Theorem relates conditional probabilities and is useful for updating probabilities based on new information, P(A∣B) = P(B∣A)·P(A) / P(B)
P(A) is the prior probability of event A before considering event B
P(B∣A) is the likelihood of observing event B given that event A has occurred
P(B) is the marginal probability of event B, which acts as a normalizing constant
Law of Total Probability states that for a partition of the sample space {B_1, B_2, ...}, the probability of an event A can be calculated as P(A) = ∑_i P(A∣B_i)·P(B_i)
This law is useful when the probability of A is difficult to calculate directly but can be found by conditioning on a partition of the sample space
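The numbers below are illustrative (a standard diagnostic-test example, not from the unit); they show the Law of Total Probability supplying P(B) and Bayes' Theorem turning a prior into a posterior.

```python
p_disease = 0.01                  # prior P(A): the person has the condition
p_pos_given_disease = 0.95        # likelihood P(B∣A): test is positive given the condition
p_pos_given_healthy = 0.05        # P(B∣A^c): false-positive rate

# Law of Total Probability with the partition {A, A^c}:
# P(B) = P(B∣A)·P(A) + P(B∣A^c)·P(A^c)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' Theorem: P(A∣B) = P(B∣A)·P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)        # ≈ 0.161, despite the test's 95% sensitivity
```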
Independence and Joint Distributions
Two events A and B are independent if the occurrence of one does not affect the probability of the other, P(A∩B)=P(A)⋅P(B)
Two random variables X and Y are independent if their joint probability distribution can be expressed as the product of their individual (marginal) distributions, f_{X,Y}(x,y) = f_X(x)·f_Y(y)
Joint probability distribution describes the probabilities of all possible combinations of values for two or more random variables
Joint probability mass function (JPMF) is used for discrete random variables, p_{X,Y}(x,y)
Joint probability density function (JPDF) is used for continuous random variables, f_{X,Y}(x,y)
Marginal distribution of a random variable can be obtained from the joint distribution by summing (discrete) or integrating (continuous) over the other variable(s)
For discrete random variables, p_X(x) = ∑_y p_{X,Y}(x,y)
For continuous random variables, f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x,y) dy
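A small numpy sketch (the joint table is an illustrative example) computes both marginals from a joint PMF and tests the independence factorization.

```python
import numpy as np

# Illustrative joint PMF p_{X,Y}(x, y); rows index x, columns index y
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])
assert np.isclose(joint.sum(), 1.0)

p_x = joint.sum(axis=1)    # marginal p_X(x) = ∑_y p_{X,Y}(x, y)
p_y = joint.sum(axis=0)    # marginal p_Y(y) = ∑_x p_{X,Y}(x, y)

# X and Y are independent only if every cell factors as p_X(x)·p_Y(y)
print(p_x, p_y, np.allclose(joint, np.outer(p_x, p_y)))   # here the check fails: X and Y are dependent
```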
Common Probability Distributions
Bernoulli distribution models a single trial with two possible outcomes (success or failure) with probability p for success and 1-p for failure
Binomial distribution models the number of successes in a fixed number of independent Bernoulli trials with the same success probability p
PMF: p_X(x) = (n choose x) p^x (1−p)^(n−x), where (n choose x) is the binomial coefficient
Poisson distribution models the number of events occurring in a fixed interval of time or space, given an average rate λ
PMF: p_X(x) = e^(−λ) λ^x / x!
Uniform distribution models a random variable with equal probability over a specified range [a, b]
PDF: f_X(x) = 1 / (b − a) for a ≤ x ≤ b
Normal (Gaussian) distribution is a continuous probability distribution characterized by its mean μ and standard deviation σ
PDF: f_X(x) = (1 / (σ√(2π))) e^(−(x−μ)^2 / (2σ^2))
Standard normal distribution has μ = 0 and σ = 1
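Each of these distributions is available in scipy.stats (assumed installed; all parameter values below are illustrative), which makes it easy to evaluate their PMFs and PDFs directly.

```python
from scipy import stats

print(stats.bernoulli(p=0.3).pmf(1))          # Bernoulli: P(success) with p = 0.3
print(stats.binom(n=10, p=0.3).pmf(4))        # Binomial: P(4 successes in 10 trials)
print(stats.poisson(mu=2.5).pmf(3))           # Poisson: P(3 events) with rate λ = 2.5
print(stats.uniform(loc=2, scale=3).pdf(4))   # Uniform on [2, 5]: density 1/(b − a) = 1/3
print(stats.norm(loc=0, scale=1).pdf(0))      # Standard normal density at x = 0
```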
Applications in Stochastic Processes
Markov chains model systems that transition between states, where the probability of moving to a particular state depends only on the current state (memoryless property)
Transition probabilities p_ij represent the probability of moving from state i to state j in one step
Steady-state probabilities π_j represent the long-term proportion of time spent in each state j
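A two-state example (the transition matrix is illustrative) shows how repeatedly applying the transition matrix to any starting distribution converges to the steady-state probabilities.

```python
import numpy as np

# Transition matrix P with entries P[i, j] = p_ij = P(next state = j ∣ current state = i)
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

pi = np.array([1.0, 0.0])     # start in state 0 with certainty
for _ in range(200):          # iterate π ← πP until it settles
    pi = pi @ P
print(pi)                     # ≈ [0.833, 0.167], the steady-state distribution
```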
Poisson processes model the occurrence of events over time, where the time between events follows an exponential distribution with rate λ
The number of events in a fixed time interval follows a Poisson distribution with mean λt
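A simulation sketch (the rate and window length are illustrative) generates exponential inter-event times and confirms that the average number of events in a window of length t is close to λt.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t = 4.0, 2.0                              # rate λ and observation window t

def count_events():
    """Count events in [0, t] when inter-event times are Exponential(λ)."""
    elapsed, n = 0.0, 0
    while True:
        elapsed += rng.exponential(1 / lam)    # time until the next event
        if elapsed > t:
            return n
        n += 1

counts = [count_events() for _ in range(20_000)]
print(np.mean(counts), lam * t)                # empirical mean ≈ λt = 8
```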
Brownian motion (Wiener process) models the random motion of particles suspended in a fluid, with independent, normally distributed increments
Applications include modeling stock prices, diffusion processes, and noise in electrical systems
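A standard way to simulate a Wiener process (the step count and step size here are illustrative) is to cumulatively sum independent Gaussian increments whose variance equals the time step.

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, dt = 1_000, 0.01

# Independent increments: mean 0, variance dt, so W(t) has variance t
increments = rng.normal(loc=0.0, scale=np.sqrt(dt), size=n_steps)
W = np.concatenate([[0.0], np.cumsum(increments)])   # W(0) = 0

print(W[-1], np.var(increments))   # endpoint of one sample path; increment variance ≈ dt
```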
Queueing theory applies probability distributions to analyze waiting lines and service systems, such as customer service centers or computer networks
Arrivals are often modeled using a Poisson process
Service times may follow various distributions (exponential, normal, etc.)
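As one illustrative case, the sketch below simulates a single-server queue with Poisson arrivals and exponential service (an M/M/1 queue; the rates are made up) and estimates the average time customers wait before service begins.

```python
import numpy as np

rng = np.random.default_rng(3)
lam, mu, n = 0.8, 1.0, 50_000                         # arrival rate λ, service rate μ, customers

arrivals = np.cumsum(rng.exponential(1 / lam, n))     # Poisson arrival process
services = rng.exponential(1 / mu, n)                 # exponential service times

finish_prev, waits = 0.0, np.empty(n)
for i in range(n):
    start = max(arrivals[i], finish_prev)             # wait if the server is still busy
    waits[i] = start - arrivals[i]
    finish_prev = start + services[i]

print(waits.mean())                                   # ≈ λ / (μ(μ − λ)) = 4 for these rates
```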
Reliability theory uses probability distributions to model the lifetime and failure rates of components and systems
Exponential distribution often used to model constant failure rates (memoryless property)
Weibull distribution can model increasing or decreasing failure rates over time
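The sketch below (rates and shape parameters are illustrative, using scipy.stats) contrasts the exponential model's constant hazard with a Weibull model whose failure rate increases over time.

```python
import numpy as np
from scipy import stats

t = np.array([0.5, 1.0, 2.0, 4.0])

# Exponential lifetime with rate λ = 0.4: constant hazard, memoryless
exp_life = stats.expon(scale=1 / 0.4)
print(exp_life.pdf(t) / exp_life.sf(t))     # hazard h(t) = f(t)/S(t) equals 0.4 at every t

# Weibull lifetime with shape c = 2: hazard grows with age (wear-out)
weib_life = stats.weibull_min(c=2.0, scale=5.0)
print(weib_life.pdf(t) / weib_life.sf(t))   # increasing hazard rate
```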