Random numbers in Matlab and Python
Analyzing digital images involves a lot of signal processing. Before Python became popular, Matlab was the environment of choice because of its excellent support for signal processing. For this reason, many implementations were originally developed in Matlab and can still be found on lab websites and on GitHub. I found these implementations very helpful, especially for replicating previous research results and comparing new methods to established ones.
When running experiments on a large dataset, I prefer to offload the longer lasting computations to another machine, or a shared compute cluster. The problem is that the workhorses there do not have a Matlab license. While Matlab replacements like Octave could be used, I often find it simpler to re-implement the original method in Python. Luckily, libraries such as NumPy and SciPy provide identical or similar methods to Matlab.
After finishing the Python re-implementation, it is important to verify that the Python implementations yields the same output as the original Matlab code. Some Matlab codes use random numbers, which makes a numerical comparison difficult.
This post describes how to match random numbers in Matlab and Python, such that the outputs of both programs can be compared numerically. Some Matlab random number methods have an identical NumPy counter-part. For other methods, where Matlab’s number generation algorithm is not known, one solution is to implement a drop-in replacement which can be reproduced in Python.
All code snippets were tested using Matlab R2023a and NumPy 1.26.1.
-
Drawing random floats from a uniform distribution.
Luckily, Matlab and NumPy provide the same algorithm for drawing from a uniform distribution in the interval
[0, 1)
.We first initialize a Marsenne Twister generator, which is the default in Matlab and used to be the default in NumPy up until version 1.16. Although NumPy offers a newer Marsenne Twister, the newer version uses a different seeding algorithm. Therefore, we use the legacy random number generator.
stream = RandStream('mt19937ar', 'Seed', seed); rand(stream, size)
rng = np.random.RandomState(seed) rng.random(size)
If you prefer to work with the global RNG in Matlab, you can skip the stream:
rng(seed, 'twister'); rand(size)
-
Drawing random integers from a uniform distribution.
Because the algorithm behind Matlab’s
randi
is not known (at least to me), it is difficult to find an equivalent NumPy implementation. Instead, we can implement a drop-in replacement in Matlab based onrand
and write a matching Python variant.function [numbers] = randi2(stream, low, high, size) % Uniformly distributed random integers % High is exclusive numbers = floor(rand(stream, size) * (high - low) + low); end
def randi2(rng, low, high, size): """ Draw random integers from a uniform distribution. Equivalent to the custom randi2 Matlab method. :param rng: random number generator :param low: minimum integer value to generate :param high: one above the maximum integer value to generate :param size: size of the array to generate :return: """ numbers = rng.random(size) * (high - low) + low return np.floor(numbers).astype(int)
-
Drawing random numbers from a normal distribution.
Matlab has a built-in method
randn
, but its exact implementation is not known. We use the inverse of the cumulative density function to transform a uniformly distributed random number as if it was drawn from a standard normal distribution.function [numbers] = randn2(stream, size) % Draw random numbers from a standard normal distribution numbers = norminv(rand(stream, size), 0, 1); end
from scipy.stats import norm def randn2(rng, size): """ Draw random numbers from a standard normal distribution. :param rng: random number generator :param size: shape of the array to be generated :return: ndarray with random numbers from a standard normal distribution """ return norm.ppf(rng.random(*size))
-
Drawing random permutations.
Again, we write a drop-in replacement for Matlab’s
randperm
, and reproduce the replacement in Python. Note that the Matlab outputs are 1 higher than the Python outputs because Matlab starts indexing at 1.function p = randperm2(stream, n) % Generate a random permutation of length n random_numbers = rand(stream, 1, n); [~, p] = sort(random_numbers); end
def randperm2(rng, n): """ Generate a random permutation of length n :param rng: random number generator :param n: size of the permutation :return: random permutation of length n """ random_numbers = rng.random(n) return np.argsort(random_numbers)
If you need to compare a Matlab and Python implementations only once, an alternative to these custom implementations is to draw the random numbers in one program, store them to a file, and load the numbers into the other program.
Apart from random numbers, Matlab and NumPy also differ in other aspects. One of these aspects is the direction of rounding. Matlab’s built-in method rounds |x| = 0.5
away from zero, whereas NumPy truncates |x| = 0.5
towards zero.
Futhermore, numbers very close to 0.5
may be represented as 0.49999999
in Matlab and 0.50000001
in Python, or vice versa. To ensure that these cases are also rounded consistently, we can implement a custom round method in both Matlab and Python. To retain the rounding behavior of Matlab, both implementations round numbers close to 0.5
away from zero.
function [x_rounded] = round2(x)
% Rounds values close to 0.5 away from zero
% Preserve the input sign
x_sign = sign(x);
% Convert to absolute value
x_abs = abs(x);
% Round towards zero
x_abs_floor = floor(x_abs);
% Round away from zero
x_abs_ceil = ceil(x_abs);
% Compute difference between value and floored value
abs_floor_diff = x_abs - x_abs_floor;
% Condition for rounding away from zero
mask_ceil = abs_floor_diff >= 0.5 | isclose(abs_floor_diff, 0.5);
% Ceil or floor
x_rounded = x_abs_floor;
x_rounded(mask_ceil) = x_abs_ceil(mask_ceil);
% Restore sign
x_rounded = x_rounded .* x_sign;
end
function [close] = isclose(a, b, rtol, atol)
% Re-implementation of NumPy's isclose
if nargin < 4
atol = 1e-8;
end
if nargin < 3
rtol = 1e-5;
end
if ~isscalar(a) && ~isscalar(b)
% a and b are both matrices
assert(isequal(size(a), size(b)));
end
abs_diff = abs(a - b);
close = abs_diff <= atol + rtol * abs(b);
end
def round2(x):
"""
Helper method to mimic Matlab rounding
|x| = 0.5 => round away from zero
:param x: ndarray
:return: rounded ndarray
"""
# Preserve the input sign
x_sign = np.sign(x)
# Convert to absolute value
x_abs = np.abs(x)
# Round towards zero
x_abs_floor = np.floor(x_abs)
# Round away from zero
x_abs_ceil = np.ceil(x_abs)
# Compute difference between value and floored value
abs_floor_diff = x_abs - x_abs_floor
# Condition for rounding away from zero
mask_ceil = (abs_floor_diff >= 0.5) | np.isclose(abs_floor_diff, 0.5)
# Ceil or floor
x_rounded = np.where(mask_ceil, x_abs_ceil, x_abs_floor)
# Restore sign
x_rounded *= x_sign
return x_rounded
Another difference is the layout of multi-dimensional arrays. If you generate 2-dimensional arrays of random numbers, note that Matlab fills the columns of the matrix first (column-major layout as used by Fortran), while NumPy fills the rows first (row-major layout as used in C/C++). To match the two arrays, one needs to be transposed.
This concludes the collection of hints for comparing random numbers across Matlab and Python implementations.