Using PDDs
Calculating PDDs
The PDD (pointwise distance distribution) of a crystal is given by amd.PDD()
.
It accepts a crystal and an integer k, returning the \(\text{PDD}_k\) as a matrix
with k+1 columns, the weights of each row being in the first column.
If you have a .cif file, use amd.CifReader
to read the crystals. If
csd-python-api
is installed and you have CSD refcodes, use amd.CSDReader
.
You can also give the coordinates of motif points and unit cell as a tuple of numpy
arrays, in Cartesian form. Examples:
# get PDDs of crystals in a .cif
crystals = list(amd.CifReader('file.cif'))
pdds = [amd.PDD(crystal, 100) for crystal in crystals]
# get PDDs of crystals in DEBXIT family
pdds = [amd.PDD(crystal, 100) for crystal in amd.CSDReader('DEBXIT', families=True)]
# PDD of 3D cubic lattice
motif = np.array([[0,0,0]])
cell = np.identity(3)
cubic_pdd = amd.PDD((motif, cell), 100)
Each PDD returned by amd.PDD(c, k)
is a matrix with k+1
columns.
Calculation options
amd.PDD
accepts a few optional arguments (not relevant to amd.AMD
):
amd.PDD(periodic_set, k, order=True, collapse=True, collapse_tol=1e-4)
order
lexicograpgically orders the rows of the PDD, and collapse
merges rows
if all the elements are within collapse_tol
. The technical definition of PDD
requires doing both in order for PDD to satisfy invariance, but sometimes it’s useful to
disable the behaviours, particularly so that the PDD rows are in the same order as the
passed motif points. Without ordering and collapsing, isometric inputs could give different
PDDs; but the Earth mover’s distance between the PDDs would still be 0.
Comparing by PDD
The Earth mover’s distance is
an appropriate metric to compare PDDs, and the amd.compare
module has functions
for these comparisons.
Most useful are compare.PDD_pdist()
and compare.PDD_cdist()
, which mimic
the interface of scipy’s functions pdist
and cdist
. pdist
takes one set and
compares all elements pairwise, whereas cdist
takes two sets and compares elements in
one with the other. cdist
returns a 2D distance matrix, but pdist
returns a
condensed distance matrix (see scipy’s pdist function).
The default metric for AMD comparisons is l-infinity, but it can be changed to any metric
accepted by scipy’s pdist/cdist.
# compare crystals in file1.cif with those in file2.cif by PDD, k=100
pdds1 = [amd.PDD(crystal, 100) for crystal in amd.CifReader('file1.cif')]
pdds2 = [amd.PDD(crystal, 100) for crystal in amd.CifReader('file2.cif')]
distance_matrix = amd.PDD_cdist(pdds1, pdds2)
# compare everything in file1.cif with each other
condensed_dm = amd.PDD_pdist(pdds1)
You can compare one PDD with another with compare.emd()
:
# compare DEBXIT01 and DEBXIT02 by PDD, k=100
pdds = [amd.PDD(crystal, 100) for crystal in amd.CSDReader(['DEBXIT01', 'DEBXIT02'])]
distance = amd.emd(pdds[0], pdds[1])
compare.emd()
, compare.PDD_pdist()
and compare.PDD_cdist()
all accept
an optional argument metric
, which can be anything accepted by scipy’s pdist/cdist functions.
The metric used to compare PDD matrices is always Earth mover’s distance, but this still requires another metric
between the rows of PDDs (so there’s a different Earth mover’s distance for each choice of metric).
Comparison options
amd.PDD_cdist
and amd.PDD_pdist
share the following optional arguments:
metric
chooses the metric used for comparison of PDD rows, as explained above. See scipy’s cdist/pdist for a list of accepted metrics.k
will truncate the passed PDDs to lengthk
before comaparing (sok
must not be larger than the passed PDDs). Useful if comparing for severalk
values.verbose
(defaultFalse
) prints an ETA to the terminal.