Notation for Machine Learning
Numerical Object
Difference between Scalar, Vector, Matrix and Tensor.
In the context of mathematics and machine learning, scalars, vectors, matrices, and tensors are different types of mathematical objects.
- A scalar is a single number. For example: $7$ is a scalar.
- A vector is a 1-D array of numbers. For example: $\mathbf{v} = [1, 2, 3]$ is a vector.
- A matrix is a 2-D array of numbers. For example: $\mathbf{M} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ is a $2 \times 2$ matrix.
- A tensor is generally an array with more than 2 dimensions. For example: an array of shape $2 \times 3 \times 4$ is a tensor with 3 dimensions.
Notation
- $x$: a scalar
- $\mathbf{x}$: a vector
- $\mathbf{X}$: a matrix
- $\mathsf{X}$ or $\mathbf{X}$: a general tensor
- $\mathbf{I}$: the identity matrix (of some given dimension), i.e., a square matrix with 1 on all diagonal entries and 0 on all off-diagonals
- $x_i$: the $i$-th element of vector $\mathbf{x}$
- $x_{ij}$: the element of matrix $\mathbf{X}$ at row $i$ and column $j$
Set theory
- $\mathcal{A}$: a set
- $\mathbb{Z}$: the set of integers
- $\mathbb{Z}^+$: the set of positive integers
- $\mathbb{R}$: the set of real numbers
- $\mathbb{R}^n$: the set of $n$-dimensional vectors of real numbers
- $\mathbb{R}^{a \times b}$: the set of matrices of real numbers with $a$ rows and $b$ columns
- $|\mathcal{A}|$: cardinality (number of elements) of set $\mathcal{A}$. For example: $\mathcal{A} = \{1, 2, 3\}$ with $|\mathcal{A}| = 3$
- $\mathcal{A} \cup \mathcal{B}$: union of sets $\mathcal{A}$ and $\mathcal{B}$
- $\mathcal{A} \cap \mathcal{B}$: intersection of sets $\mathcal{A}$ and $\mathcal{B}$
- $\mathcal{A} \setminus \mathcal{B}$: set subtraction of $\mathcal{B}$ from $\mathcal{A}$; contains only those elements of $\mathcal{A}$ that do not belong to $\mathcal{B}$
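These set operations map directly onto Python's built-in `set` type; a minimal sketch (the sets `A` and `B` are arbitrary examples):

```python
# Two example sets
A = {1, 2, 3}
B = {3, 4, 5}

print(len(A))    # cardinality |A|: 3
print(A | B)     # union: {1, 2, 3, 4, 5}
print(A & B)     # intersection: {3}
print(A - B)     # set subtraction A \ B: {1, 2}
```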
Functions and Operators
- $f(\cdot)$: a function. For example: $f(x) = x^2$ maps each $x$ to its square.
- $\log(\cdot)$: the natural logarithm (base $e$). For example: $\log(e) = 1$.
- $\exp(\cdot)$: the exponential function. For example: $\exp(0) = 1$.
- $\mathbf{1}(\cdot)$: the indicator function; evaluates to 1 if the boolean argument is true, and 0 otherwise.
- $\mathbf{1}_{\mathcal{X}}(z)$: the set-membership indicator function; evaluates to 1 if the element $z$ belongs to the set $\mathcal{X}$ and 0 otherwise.
- Definition: Let $\Omega$ be a sample space and $E \subseteq \Omega$ be an event. The indicator function of $E$, denoted by $\mathbf{1}_E$, is a random variable defined as: $\mathbf{1}_E(\omega) = 1$ if $\omega \in E$, and $\mathbf{1}_E(\omega) = 0$ otherwise.
- $\mathbf{X}^\top$: transpose of a vector or a matrix.
- For example: $\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}^\top = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.
- Note: $(\mathbf{A}^\top)^\top = \mathbf{A}$ and $(\mathbf{A}\mathbf{B})^\top = \mathbf{B}^\top \mathbf{A}^\top$.
- $\mathbf{X}^{-1}$: inverse of matrix $\mathbf{X}$.
- For example: with $\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $\mathbf{A}^{-1} = \begin{bmatrix} -2 & 1 \\ 1.5 & -0.5 \end{bmatrix}$.
- Note: $\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$; only square matrices with a non-zero determinant have an inverse.
Read more: Matrix, Identity Matrix, Inverse Matrix
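As a quick numerical check of the inverse identity, NumPy's `np.linalg.inv` computes the inverse, and multiplying back recovers the identity matrix (a small sketch; the matrix `A` is an arbitrary example):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
A_inv = np.linalg.inv(A)
print(A_inv)
# [[-2.   1. ]
#  [ 1.5 -0.5]]

# A @ A_inv recovers the identity matrix (up to floating-point error)
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```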
- $\mathbf{A} \odot \mathbf{B}$: Hadamard (elementwise) product. For example: $\mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 3 \\ 5 & 2 \end{bmatrix}$ and $\mathbf{B} = \begin{bmatrix} 3 & 1 \\ 2 & 4 \\ 1 & 6 \end{bmatrix}$. So $\mathbf{A} \odot \mathbf{B} = \begin{bmatrix} 6 & 4 \\ 2 & 12 \\ 5 & 12 \end{bmatrix}$.
Read more: Hadamard Product: A Complete Guide to Element-Wise Matrix Multiplication
- $[\mathbf{a}, \mathbf{b}]$: concatenation of $\mathbf{a}$ and $\mathbf{b}$
- $\|\cdot\|_p$: $L_p$ norm
- $\|\cdot\|$: $L_2$ norm
- $\langle \mathbf{x}, \mathbf{y} \rangle$: inner (dot) product of vectors $\mathbf{x}$ and $\mathbf{y}$
- $\sum$: summation over a collection of elements
- $\prod$: product over a collection of elements
- $\stackrel{\mathrm{def}}{=}$: an equality asserted as a definition of the symbol on the left-hand side
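A few of the operators above, sketched with NumPy (the vectors `x` and `y` are arbitrary examples):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

print(np.log(np.e))            # natural logarithm: 1.0
print(np.exp(0))               # exponential: 1.0
print(np.dot(x, y))            # inner product: 1*4 + 2*5 + 3*6 = 32.0
print(np.sum(x))               # summation: 6.0
print(np.prod(x))              # product: 6.0
print(np.concatenate([x, y]))  # concatenation: [1. 2. 3. 4. 5. 6.]
```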
Vector norm
A norm is a way to measure the size of a vector, a matrix, or a tensor. In other words, norms are a class of functions that enable us to quantify the magnitude of a vector. For instance, the $L_2$ norm of a vector is a measure of its length from the origin.
Given an $n$-dimensional vector $\mathbf{x} = [x_1, x_2, \ldots, x_n]$,
a general vector norm $|\mathbf{x}|$, sometimes written with a double bar as $\|\mathbf{x}\|$, is a non-negative function defined such that:
- $\|\mathbf{x}\| \ge 0$, and $\|\mathbf{x}\| = 0$ iff $\mathbf{x} = \mathbf{0}$.
- $\|c\mathbf{x}\| = |c| \, \|\mathbf{x}\|$ for any scalar $c$.
- $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$ (Triangle Inequality).
Types of Vector Norms
The $L_1$ norm can be mathematically written as:
$$\|\mathbf{x}\|_1 = \sum_{i=1}^{n} |x_i|$$
The $L_1$ norm is used in situations when it is helpful to distinguish between zero and non-zero values. The $L_1$ norm increases proportionally with the components of the vector. It is used in Lasso (Least Absolute Shrinkage and Selection Operator) regression, which involves adding the $L_1$ norm of the coefficient vector as a penalty term to the loss function.
The $L_2$ norm can be mathematically written as:
$$\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$
The $L_2$ norm is also known as the Euclidean norm. It is the most commonly used norm and measures the shortest distance from the origin. Since it entails squaring each component of the vector, it is not robust to outliers: large components dominate the sum, while very small components become even smaller (e.g., $0.1^2 = 0.01$). It is used in Ridge regression, which involves adding the $L_2$ norm of the coefficient vector as a penalty term to the loss function.
The $L_\infty$ norm can be mathematically written as:
$$\|\mathbf{x}\|_\infty = \max_i |x_i|$$
The $L_\infty$ norm is the absolute value of the largest-magnitude component of the vector. Therefore, it is also called the max norm.
The General $L_p$ Norm
- We can generalize to the idea of what is known as the **$p$-norm**, mathematically written as:
$$\|\mathbf{x}\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$$
All other norms can be derived from the $p$-norm by varying the value of $p$. That is to say, if you substitute $p = 1$, $p = 2$, and $p = \infty$ in the formula above, you'll obtain the $L_1$, $L_2$, and $L_\infty$ norms respectively.
Vector norms are important in machine learning because they are used to calculate distances between data points, define loss functions, apply regularization techniques, and are involved in algorithms like Support Vector Machine (SVM).
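The norms above can be computed with NumPy's `np.linalg.norm` by passing the order `p`; a minimal sketch (the vector `x` is an arbitrary example):

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])

l1 = np.linalg.norm(x, ord=1)         # |3| + |-4| + |1| = 8.0
l2 = np.linalg.norm(x, ord=2)         # sqrt(9 + 16 + 1) = sqrt(26)
linf = np.linalg.norm(x, ord=np.inf)  # max(|x_i|) = 4.0
print(l1, l2, linf)

# General p-norm computed directly from the formula, e.g. p = 3
p = 3
lp = np.sum(np.abs(x) ** p) ** (1 / p)  # (27 + 64 + 1)^(1/3)
print(lp)
```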
Probability and Information Theory
- $X$: a random variable
- $P$: a probability distribution
- $X \sim P$: the random variable $X$ follows distribution $P$
- $P(X = x)$: the probability assigned to the event where random variable $X$ takes value $x$
- $P(X \mid Y)$: the conditional probability distribution of $X$ given $Y$
- $p(\cdot)$: a probability density function (PDF) associated with distribution $P$
- $\mathbb{E}[X]$: expectation of random variable $X$
- $X \perp Y$: random variables $X$ and $Y$ are independent
- $X \perp Y \mid Z$: random variables $X$ and $Y$ are conditionally independent given $Z$
- $\sigma_X$: standard deviation of random variable $X$
- $\mathrm{Var}(X)$: variance of random variable $X$, equal to $\sigma_X^2$
- $\mathrm{Cov}(X, Y)$: covariance of random variables $X$ and $Y$
- $\rho(X, Y)$: the Pearson correlation coefficient between $X$ and $Y$, equals $\frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
- $H(X)$: entropy of random variable $X$
- $D_{\mathrm{KL}}(P \| Q)$: the KL-divergence (or relative entropy) of distribution $Q$ from distribution $P$
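Empirical versions of several of these quantities (expectation, variance, standard deviation, covariance, correlation) can be sketched with NumPy; the sample data below is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=100_000)  # samples of X ~ N(2, 1.5^2)
y = 3.0 * x + rng.normal(size=100_000)            # Y correlated with X

print(np.mean(x))               # empirical E[X], close to 2.0
print(np.var(x))                # empirical Var(X), close to 1.5**2 = 2.25
print(np.std(x))                # empirical sigma_X, close to 1.5
print(np.cov(x, y)[0, 1])       # empirical Cov(X, Y)
print(np.corrcoef(x, y)[0, 1])  # empirical Pearson correlation rho(X, Y)
```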
Below is some code to represent a scalar, vector, matrix, and tensor using NumPy and TensorFlow.
Create a row vector (one-dimensional)
# Using numpy
import numpy as np
# Create a row vector (one-dimensional)
v1 = np.array([1, 2, 3])
print(v1)
print(v1.shape)
# Output:
# [1 2 3]
# (3,)
Create a matrix
# Create a matrix
m1 = np.array([[1, 2, 3]])
print(m1)
print(m1.shape)
print(m1.T)
print(m1.T.shape)
# Output:
# [[1 2 3]]
# (1, 3)
# [[1]
# [2]
# [3]]
# (3, 1)
# Vector of all zeros
v3 = np.zeros(3) # [0. 0. 0.]
# Vector of all ones
v4 = np.ones(3) # [1. 1. 1.]
# Vector of integers from 0 to 9
v5 = np.arange(10) # [0 1 2 3 4 5 6 7 8 9]
# Vector of 5 evenly spaced elements from 0 to 1
v6 = np.linspace(0, 1, 5) # [0. 0.25 0.5 0.75 1. ]
M = np.arange(-10, 10).reshape(10, 2)
print(M)
# Identity matrix I
I = np.eye(3)
print(I)
import numpy as np
# Define matrices
A = np.array([[2, 4], [1, 3], [5, 2]])
B = np.array([[3, 1], [2, 4], [1, 6]])
# Compute Hadamard product
C = A*B
print(C)
# Output: [[ 6 4]
# [ 2 12]
# [ 5 12]]
# Alternative explicit notation
C_explicit = np.multiply(A, B)
print(C_explicit)
# Output: [[ 6 4]
# [ 2 12]
# [ 5 12]]