Questions and Answers on the fingerprint tools

This page collects answers to selected questions from users of the fingerprint tools. For the method itself, see the describing article.
fingerprint.f90 calculates the eigenvalues of an overlap matrix of atom-centered Gaussian-type orbitals (GTOs) and returns them as a structural fingerprint vector. fpdriver.f90 is a main program for testing the subroutine: it reads an arbitrary number of structures from a single xyz-formatted file and outputs their fingerprints and pairwise distances. Subroutines with additional functionality (arbitrary number of GTOs, derivatives of the fingerprints with respect to atomic positions, etc.) are also available upon request.
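As a rough illustration of the idea (not the code in fingerprint.f90), the following Python sketch builds the overlap matrix of one normalized s-type GTO per atom, takes its eigenvalues as the fingerprint, and compares two fingerprints by the Euclidean distance between the sorted eigenvalue vectors. The single Gaussian exponent `alpha`, all function names, and the small Jacobi eigensolver are assumptions made for this sketch; the Fortran routines may choose the Gaussian widths, the eigensolver and the distance normalization differently.

```python
import math

def overlap_matrix(positions, alpha=0.5):
    """Overlap of normalized s-type Gaussians exp(-alpha*r^2), one per atom.

    For equal exponents the overlap reduces to exp(-alpha/2 * r_ij^2).
    The value of alpha is an assumption of this sketch.
    """
    n = len(positions)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            S[i][j] = math.exp(-0.5 * alpha * r2)
    return S

def symmetric_eigenvalues(matrix, tol=1e-24, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations.

    A stand-in for a library eigensolver, adequate for small matrices only.
    """
    n = len(matrix)
    A = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(A[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                if A[p][p] == A[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * A[p][q] / (A[q][q] - A[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return [A[i][i] for i in range(n)]

def fingerprint(positions, alpha=0.5):
    """Sorted eigenvalues of the s-type overlap matrix (length n)."""
    return sorted(symmetric_eigenvalues(overlap_matrix(positions, alpha)), reverse=True)

def fingerprint_distance(f1, f2):
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
```

Because the overlap matrix depends only on interatomic distances and the eigenvalues are sorted, `fingerprint` returns the same vector when the atoms are reindexed or the structure is translated or rotated, so `fingerprint_distance` needs no prior alignment.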

Q&A:

You recommend using fingerprints of length 3n or 4n. But I do not understand how an n-by-n matrix of inter-atomic distances can have more than n eigenvalues.
An n-by-n matrix has, of course, only n eigenvalues. Instead, you can use larger matrices such as those proposed in subsections III.A, III.B and III.C of the describing article. For instance, the overlap matrix constructed from s- and p-type GTOs is 4n-by-4n. The required tools are freely available on our website.
A practical comment: As already mentioned in the paper, even n-by-n matrices might be good enough for some applications. If your molecule and/or dataset is too big, you can start with the small n-by-n matrices built from s-type GTOs only. This reduces the computational cost, but you then have to accept the risk of violating the coincidence axiom, i.e. two distinct structures may end up at zero fingerprint distance.
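The size trade-off can be made concrete with a little arithmetic. The helper names below are hypothetical, and the cubic cost estimate assumes a dense symmetric eigensolver, which is an assumption about the implementation rather than a statement from the article.

```python
# Basis functions per atom: one s function, or s plus px, py, pz.
ORBITALS_PER_ATOM = {"s": 1, "sp": 4}

def fingerprint_length(n_atoms, basis="s"):
    """Dimension of the overlap matrix, i.e. the number of eigenvalues."""
    return ORBITALS_PER_ATOM[basis] * n_atoms

def relative_eigensolver_cost(n_atoms, basis="sp", reference="s"):
    """Rough cost ratio, assuming the eigensolver scales as (matrix size)^3."""
    big = fingerprint_length(n_atoms, basis)
    small = fingerprint_length(n_atoms, reference)
    return (big / small) ** 3
```

Under this assumption, going from s-only to s- and p-type GTOs makes the fingerprint four times longer and the diagonalization roughly 64 times more expensive, independent of the number of atoms.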

What was the biomolecule you used? I cannot find any description in the paper other than "C₂₂H₂₄N₂O₃".
A snapshot of the molecule is shown in Fig. 5 of the describing article. The caption of that figure also gives the name of the molecule: 6-benzyl-1-benzyloxymethyl-5-isopropyl uracil.
A practical comment: As is common practice for biomolecules, you might want to exclude the hydrogen atoms when constructing the required matrices.
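Dropping the hydrogen atoms can be done as a preprocessing step on the xyz input. The sketch below assumes the usual xyz layout (an atom-count line, a comment line, then one `element x y z` line per atom, with frames simply concatenated); the function names are hypothetical and not part of the distributed tools.

```python
def read_xyz_frames(text):
    """Parse concatenated xyz frames into lists of (element, x, y, z) tuples."""
    lines = text.splitlines()
    frames, i = [], 0
    while i < len(lines):
        if not lines[i].strip():  # tolerate blank lines between frames
            i += 1
            continue
        n_atoms = int(lines[i].split()[0])
        atoms = []
        # Skip the count line and the comment line, then read n_atoms records.
        for line in lines[i + 2 : i + 2 + n_atoms]:
            element, x, y, z = line.split()[:4]
            atoms.append((element, float(x), float(y), float(z)))
        frames.append(atoms)
        i += 2 + n_atoms
    return frames

def drop_hydrogens(atoms):
    """Keep only the heavy (non-hydrogen) atoms of one frame."""
    return [atom for atom in atoms if atom[0].upper() != "H"]
```

Applying `drop_hydrogens` to every frame before building the matrices shrinks n to the number of heavy atoms, which also reduces the cost of the diagonalization.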

In your paper, you compare various metrics in configurational space. What if I optimize the pairwise alignment of atoms and the rotation for one of the proposed metrics (e.g. OM) and then start an RMSD optimization from that alignment? Would such an approach help to find the global-minimum RMSD, or would it be a trap leading to a local minimum?
The advantage of the proposed metrics, e.g. the OM-based one, is that they require neither structural superposition nor atomic reindexing. Consequently, they do not provide any alignment that could serve as an initial guess for an RMSD minimization.

As far as I understood, your method adds an alignment of the principal axes of inertia. In Table 1, you compare MC searches for global minima with searches for local minima. How does your search for global minima perform compared to a search for global minima that does not use the principal axes of inertia?
The Hungarian algorithm-based phase can be excluded, and the search for the global-minimum RMSD can be started directly from the MC phase. If the configurations are truly distinct, the Hungarian phase does not help at all; however, because it is computationally very cheap compared to the MC phase, there is little to gain by excluding it. The main use of the Hungarian algorithm is to detect identical configurations, which takes only a few iterations in all of our benchmark sets.
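The role of the assignment step, recognizing that two configurations are the same up to atom reindexing, can be illustrated with the sketch below. It assumes that the two structures are already superimposed (for example after aligning their principal axes of inertia) and contain a single atomic species. It also replaces the Hungarian algorithm by a brute-force search over all permutations, which gives the same optimal assignment but is feasible only for a handful of atoms. All function names are hypothetical.

```python
import itertools
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered point lists."""
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))

def best_assignment(a, b):
    """Return (rmsd, permutation) minimizing the RMSD between a and reindexed b.

    Brute force over all permutations; a stand-in for the Hungarian algorithm.
    """
    best = (float("inf"), None)
    for perm in itertools.permutations(range(len(b))):
        value = rmsd(a, [b[k] for k in perm])
        if value < best[0]:
            best = (value, perm)
    return best

def are_identical(a, b, tol=1e-6):
    """True if b matches a, up to reindexing, within the tolerance tol."""
    return len(a) == len(b) and best_assignment(a, b)[0] < tol
```

For truly distinct configurations the minimum over permutations stays well above the tolerance, and a further search over rotations, such as the MC phase, is still needed to find the global-minimum RMSD.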

Comments/Questions

You can send your comments or questions by email to ali DOT sadeghi AT unibas DOT ch.