mvpa2.measures.rsa.cdist

mvpa2.measures.rsa.cdist(XA, XB, metric='euclidean', p=2, V=None, VI=None, w=None)
Computes the distance between each pair of rows from the two collections of inputs.
The following are common calling conventions:
Y = cdist(XA, XB, 'euclidean')
Computes the distance between the points using the Euclidean distance (2-norm) as the distance metric. The points are arranged as n-dimensional row vectors: m_A of them in the matrix XA and m_B of them in the matrix XB.
Y = cdist(XA, XB, 'minkowski', p)
Computes the distances using the Minkowski distance \|u-v\|_p (p-norm) where p \geq 1.
Y = cdist(XA, XB, 'cityblock')
Computes the city block or Manhattan distance between the points.
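The relationship between these three metrics can be checked numerically. The sketch below assumes that this function behaves like scipy.spatial.distance.cdist (the Examples on this page import it from SciPy); the arrays are made-up illustration data. It shows that the Minkowski distance with p=1 matches the city block distance and with p=2 matches the Euclidean distance.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

# Illustrative data: 2 points in XA, 3 points in XB, all 2-dimensional
XA = np.array([[0.0, 0.0], [1.0, 2.0]])
XB = np.array([[3.0, 4.0], [1.0, 0.0], [2.0, 2.0]])

d_city = cdist(XA, XB, 'cityblock')
d_eucl = cdist(XA, XB, 'euclidean')
d_mink1 = cdist(XA, XB, 'minkowski', p=1)
d_mink2 = cdist(XA, XB, 'minkowski', p=2)

# p=1 reproduces the city block distance, p=2 the Euclidean distance
same_p1 = np.allclose(d_mink1, d_city)
same_p2 = np.allclose(d_mink2, d_eucl)
print(same_p1, same_p2)
```

The result has one row per point in XA and one column per point in XB, so here each matrix has shape (2, 3).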
Y = cdist(XA, XB, 'seuclidean', V=None)
Computes the standardized Euclidean distance. The standardized Euclidean distance between two n-vectors u and v is
\sqrt{\sum_i {(u_i - v_i)^2 / V[i]}}.
V is the variance vector; V[i] is the variance computed over all the i’th components of the points. If not passed, it is automatically computed.
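As a rough check of the formula, the sketch below passes an explicit variance vector V and compares the result with a direct NumPy evaluation. It assumes the function matches scipy.spatial.distance.cdist; the data and the values in V are invented for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

XA = np.array([[1.0, 2.0, 3.0],
               [0.0, 0.0, 0.0]])
XB = np.array([[2.0, 0.0, 3.5]])

# Hypothetical per-component variances, supplied explicitly
V = np.array([1.0, 4.0, 0.25])

d = cdist(XA, XB, 'seuclidean', V=V)

# Direct evaluation of sqrt(sum_i (u_i - v_i)^2 / V[i]) via broadcasting
diff = XA[:, None, :] - XB[None, :, :]
manual = np.sqrt((diff ** 2 / V).sum(axis=-1))
print(np.allclose(d, manual))
```

Passing V explicitly keeps the result independent of how the default variance is estimated.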
Y = cdist(XA, XB, 'sqeuclidean')
Computes the squared Euclidean distance \|u-v\|_2^2 between the vectors.
Y = cdist(XA, XB, 'cosine')
Computes the cosine distance between vectors u and v,
1 - \frac{u \cdot v} {\|u\|_2 \|v\|_2}
where \|*\|_2 is the 2-norm of its argument *, and u \cdot v is the dot product of u and v.
Y = cdist(XA, XB, 'correlation')
Computes the correlation distance between vectors u and v. This is
1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})} {\|(u - \bar{u})\|_2 \|(v - \bar{v})\|_2}
where \bar{v} is the mean of the elements of vector v, and x \cdot y is the dot product of x and y.
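The two formulas differ only in the mean-centering step, so the correlation distance should equal the cosine distance of the centered rows. The sketch below checks this, assuming the function matches scipy.spatial.distance.cdist; the random data are for illustration only.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

rng = np.random.default_rng(0)
XA = rng.normal(size=(4, 6))
XB = rng.normal(size=(3, 6))

d_corr = cdist(XA, XB, 'correlation')

# Subtract each row's own mean, then take the cosine distance
XA_c = XA - XA.mean(axis=1, keepdims=True)
XB_c = XB - XB.mean(axis=1, keepdims=True)
d_cos_centered = cdist(XA_c, XB_c, 'cosine')

# Direct evaluation of 1 - (u_c . v_c) / (||u_c|| ||v_c||)
norms = np.linalg.norm(XA_c, axis=1)[:, None] * np.linalg.norm(XB_c, axis=1)[None, :]
manual = 1.0 - (XA_c @ XB_c.T) / norms

print(np.allclose(d_corr, d_cos_centered), np.allclose(d_corr, manual))
```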
Y = cdist(XA, XB, 'hamming')
Computes the normalized Hamming distance, or the proportion of those vector elements between two n-vectors u and v which disagree. To save memory, the matrices XA and XB can be of type boolean.
Y = cdist(XA, XB, 'jaccard')
Computes the Jaccard distance between the points. Given two vectors, u and v, the Jaccard distance is the proportion of those elements u[i] and v[i] that disagree where at least one of them is non-zero.
Y = cdist(XA, XB, 'chebyshev')
Computes the Chebyshev distance between the points. The Chebyshev distance between two n-vectors u and v is the maximum norm-1 distance between their respective elements. More precisely, the distance is given by
d(u,v) = \max_i {|u_i - v_i|}.
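The Chebyshev definition can be reproduced with a short broadcasting expression. This is a minimal sketch that assumes the function matches scipy.spatial.distance.cdist; the input arrays are invented for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

XA = np.array([[0.0, 0.0, 0.0],
               [1.0, 5.0, 2.0]])
XB = np.array([[1.0, -3.0, 2.0],
               [4.0, 4.0, 4.0]])

d = cdist(XA, XB, 'chebyshev')

# max_i |u_i - v_i| for every pair (XA[i], XB[j])
manual = np.abs(XA[:, None, :] - XB[None, :, :]).max(axis=-1)
print(d)
```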
Y = cdist(XA, XB, 'canberra')
Computes the Canberra distance between the points. The Canberra distance between two points u and v is
d(u,v) = \sum_i \frac{|u_i - v_i|} {|u_i| + |v_i|}.
Y = cdist(XA, XB, 'braycurtis')
Computes the Bray-Curtis distance between the points. The Bray-Curtis distance between two points u and v is
d(u,v) = \frac{\sum_i |u_i - v_i|} {\sum_i |u_i + v_i|}
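Both the Canberra and Bray-Curtis distances are ratios of absolute differences to absolute sums, and the sketch below evaluates them by hand for comparison. It assumes the function matches scipy.spatial.distance.cdist and that the formulas use absolute values as in SciPy's definitions; the strictly positive data are chosen for illustration so that no denominator is zero.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

XA = np.array([[1.0, 2.0, 3.0]])
XB = np.array([[2.0, 2.0, 1.0],
               [4.0, 1.0, 3.0]])

d_canberra = cdist(XA, XB, 'canberra')
d_bray = cdist(XA, XB, 'braycurtis')

u = XA[:, None, :]  # shape (1, 1, 3)
v = XB[None, :, :]  # shape (1, 2, 3)

# Canberra: sum_i |u_i - v_i| / (|u_i| + |v_i|)
manual_canberra = (np.abs(u - v) / (np.abs(u) + np.abs(v))).sum(axis=-1)

# Bray-Curtis: sum_i |u_i - v_i| / sum_i |u_i + v_i|
manual_bray = np.abs(u - v).sum(axis=-1) / np.abs(u + v).sum(axis=-1)

print(np.allclose(d_canberra, manual_canberra), np.allclose(d_bray, manual_bray))
```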
Y = cdist(XA, XB, 'mahalanobis', VI=None)
Computes the Mahalanobis distance between the points. The Mahalanobis distance between two points u and v is \sqrt{(u-v)(1/V)(u-v)^T} where (1/V) (the VI variable) is the inverse covariance. If VI is not None, VI will be used as the inverse covariance matrix.
Y = cdist(XA, XB, 'yule')
Computes the Yule distance between the boolean vectors. (see yule function documentation)
Y = cdist(XA, XB, 'matching')
Synonym for ‘hamming’.
Y = cdist(XA, XB, 'dice')
Computes the Dice distance between the boolean vectors. (see dice function documentation)
Y = cdist(XA, XB, 'kulsinski')
Computes the Kulsinski distance between the boolean vectors. (see kulsinski function documentation)
Y = cdist(XA, XB, 'rogerstanimoto')
Computes the Rogers-Tanimoto distance between the boolean vectors. (see rogerstanimoto function documentation)
Y = cdist(XA, XB, 'russellrao')
Computes the Russell-Rao distance between the boolean vectors. (see russellrao function documentation)
Y = cdist(XA, XB, 'sokalmichener')
Computes the Sokal-Michener distance between the boolean vectors. (see sokalmichener function documentation)
Y = cdist(XA, XB, 'sokalsneath')
Computes the Sokal-Sneath distance between the vectors. (see sokalsneath function documentation)
Y = cdist(XA, XB, 'wminkowski')
Computes the weighted Minkowski distance between the vectors. (see wminkowski function documentation)
Y = cdist(XA, XB, f)
Computes the distance between all pairs of vectors, one taken from XA and one from XB, using the user-supplied 2-arity function f. For example, Euclidean distance between the vectors could be computed as follows:
dm = cdist(XA, XB, lambda u, v: np.sqrt(((u-v)**2).sum()))
Note that you should avoid passing a reference to one of the distance functions defined in this library. For example:
dm = cdist(XA, XB, sokalsneath)
would calculate the pairwise distances between the vectors using the Python function sokalsneath. This would result in sokalsneath being called once per pair, m_A m_B times in total, which is inefficient. Instead, the optimized C version is more efficient, and we call it using the following syntax:
dm = cdist(XA, XB, 'sokalsneath')
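To illustrate the cost of a Python callable, the sketch below counts how often a user-supplied function is invoked and confirms that it gives the same result as the string form. It assumes the function matches scipy.spatial.distance.cdist; the helper euclid and the counter are hypothetical names introduced only for this example.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

XA = np.array([[0.0, 0.0], [1.0, 1.0]])              # m_A = 2
XB = np.array([[3.0, 4.0], [1.0, 0.0], [2.0, 2.0]])  # m_B = 3

calls = {'n': 0}  # hypothetical counter, for illustration only

def euclid(u, v):
    """Euclidean distance between two 1-D vectors; also counts invocations."""
    calls['n'] += 1
    return np.sqrt(((u - v) ** 2).sum())

dm_callable = cdist(XA, XB, euclid)
dm_string = cdist(XA, XB, 'euclidean')

print(np.allclose(dm_callable, dm_string))
print(calls['n'])  # one Python-level call per (XA[i], XB[j]) pair
```

The string form avoids the per-pair Python call overhead, which is why it is preferred whenever a built-in metric exists.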
Parameters:
XA : ndarray
An m_A by n array of m_A original observations in an n-dimensional space. Inputs are converted to float type.
XB : ndarray
An m_B by n array of m_B original observations in an n-dimensional space. Inputs are converted to float type.
metric : str or callable, optional
The distance metric to use. If a string, the distance function can be ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.
w : ndarray, optional
The weight vector (for weighted Minkowski).
p : scalar, optional
The p-norm to apply (for Minkowski, weighted and unweighted).
V : ndarray, optional
The variance vector (for standardized Euclidean).
VI : ndarray, optional
The inverse of the covariance matrix (for Mahalanobis).
Returns:
Y : ndarray
An m_A by m_B distance matrix is returned. For each i and j, the metric dist(u=XA[i], v=XB[j]) is computed and stored in the ij-th entry.
Raises:
ValueError
An exception is thrown if XA and XB do not have the same number of columns.
Examples
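A short sketch of the shape contract: the output has one row per observation in XA and one column per observation in XB, and mismatched column counts raise ValueError. It assumes the function matches scipy.spatial.distance.cdist; the zero-filled arrays are placeholders for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

# Matching column counts: 3 and 2 observations in a 4-dimensional space
Y = cdist(np.zeros((3, 4)), np.ones((2, 4)), 'euclidean')
print(Y.shape)  # (m_A, m_B)

# Mismatched column counts (4 versus 5) are rejected
try:
    cdist(np.zeros((3, 4)), np.zeros((2, 5)), 'euclidean')
    raised = False
except ValueError:
    raised = True
print(raised)
```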
Find the Euclidean distances between four 2D coordinates:
>>> from scipy.spatial import distance
>>> coords = [(35.0456, 85.2672),
...           (35.1174, 89.9711),
...           (35.9728, 83.9422),
...           (36.1667, 86.7833)]
>>> distance.cdist(coords, coords, 'euclidean')
array([[ 0.    ,  4.7044,  1.6172,  1.8856],
       [ 4.7044,  0.    ,  6.0893,  3.3561],
       [ 1.6172,  6.0893,  0.    ,  2.8477],
       [ 1.8856,  3.3561,  2.8477,  0.    ]])
Find the Manhattan distance from a 3D point to the corners of the unit cube:
>>> import numpy as np
>>> a = np.array([[0, 0, 0],
...               [0, 0, 1],
...               [0, 1, 0],
...               [0, 1, 1],
...               [1, 0, 0],
...               [1, 0, 1],
...               [1, 1, 0],
...               [1, 1, 1]])
>>> b = np.array([[ 0.1,  0.2,  0.4]])
>>> distance.cdist(a, b, 'cityblock')
array([[ 0.7],
       [ 0.9],
       [ 1.3],
       [ 1.5],
       [ 1.5],
       [ 1.7],
       [ 2.1],
       [ 2.3]])
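One more sketch covers the Mahalanobis metric with an explicitly supplied inverse covariance matrix VI. It assumes the function matches scipy.spatial.distance.cdist; estimating the covariance from the stacked rows of XA and XB is a choice made here for illustration, not something this page prescribes, and the data are random.

```python
import numpy as np
from scipy.spatial.distance import cdist  # assumed equivalent to mvpa2.measures.rsa.cdist

rng = np.random.default_rng(1)
XA = rng.normal(size=(5, 3))
XB = rng.normal(size=(4, 3))

# Illustrative choice: covariance estimated from all 9 observations together
cov = np.cov(np.vstack([XA, XB]).T)
VI = np.linalg.inv(cov)

d = cdist(XA, XB, 'mahalanobis', VI=VI)

# Direct evaluation of sqrt((u - v) VI (u - v)^T) for every pair
diff = XA[:, None, :] - XB[None, :, :]
manual = np.sqrt(np.einsum('ijk,kl,ijl->ij', diff, VI, diff))

print(np.allclose(d, manual))
```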