Molecule
- class qcdb.Molecule(molinit=None, dtype=None, geom=None, elea=None, elez=None, elem=None, mass=None, real=None, elbl=None, name=None, units='Angstrom', input_units_to_au=None, fix_com=None, fix_orientation=None, fix_symmetry=None, fragment_separators=None, fragment_charges=None, fragment_multiplicities=None, molecular_charge=None, molecular_multiplicity=None, comment=None, provenance=None, connectivity=None, enable_qm=True, enable_efp=True, missing_enabled_return_qm='none', missing_enabled_return_efp='none', missing_enabled_return='error', tooclose=0.1, zero_ghost_fragments=False, nonphysical=False, mtol=0.001, verbose=1)
Bases: LibmintsMolecule
Class to store the elements, coordinates, fragmentation pattern, charge, and multiplicity of a molecule. Largely replicates psi4’s libmints Molecule class, developed by Justin M. Turney and Andy M. Simmonett with incremental improvements by other psi4 developers.
This class extends LibmintsMolecule and occasionally psi4.core.Molecule itself.
Methods Summary
B787(ref_mol, *, fix_mode[, do_plot, ...]) – Finds shift, rotation, and atom reordering of concern_mol that best aligns with ref_mol.
BFS([seed_atoms, bond_threshold, ...]) – Detect fragments among real atoms through a breadth-first search (BFS) algorithm.
axis_representation([zero]) – Molecule vs. laboratory frame representation (e.g., IR or IIIL).
center_of_charge() – Computes center of charge of molecule (does not translate molecule).
format_basis_for_cfour(puream) – Function to print the BASIS=SPECIAL block for Cfour according to the active atoms in Molecule.
format_basis_for_nwchem(basopt) – Function to print NWChem-style basis sets into [basis block] according to the active atoms in Molecule.
format_basis_for_nwchem_puream(puream) – Function to recognize puream for NWChem.
format_molecule_for_mol() – Returns a string of Molecule formatted for mol2.
Returns string of molecule definition block.
from_arrays([geom, elea, elez, elem, mass, ...]) – Construct Molecule from unvalidated arrays and variables.
from_dict(molrec[, verbose])
from_schema(molschema[, return_dict, verbose]) – Construct Molecule from non-Psi4 schema.
from_string(molstr[, dtype, name, fix_com, ...])
inertia_tensor([masswt, zero]) – Compute inertia tensor.
inertia_tensor_partial(part[, masswt, zero]) – Compute inertia tensor based on atoms in part.
inertial_system([masswt, zero]) – Solve inertial system.
inertial_system_partial(part[, masswt, zero]) – Solve inertial system based on atoms in part.
init_with_mol2(xyzfilename[, no_com, ...]) – Pull information from a MOL2 file.
init_with_xyz(xyzfilename[, no_com, ...]) – Pull information from an XYZ file.
Moves molecule to center of charge.
print_ring_planes(entity1, entity2[, ...]) – (reals only, 1-indexed)
rotational_symmetry_number() – Number of unique orientations of the rigid molecule that only interchange identical atoms.
rotor_type([tol]) – Returns the rotor type.
run_dftd3([func, dashlvl, dashparam, ...]) – Compute dispersion correction via Grimme's DFTD3 program.
save_string_xyz([save_ghosts, save_natom]) – Save a string for an XYZ-style file.
save_xyz(filename[, save_ghosts, save_natom]) – Save an XYZ file.
scramble(*[, do_shift, do_rotate, ...]) – Generate a Molecule with random or directed translation, rotation, and atom shuffling.
set_fragment_pattern(frl, frt, frc, frm) – Set fragment member data through public method analogous to psi4.core.Molecule.
to_arrays([dummy, ghost_as_dummy]) – Exports coordinate info into NumPy arrays.
to_dict([force_c1, force_units, np_out]) – Serializes instance into Molecule dictionary.
to_schema(dtype[, units]) – Serializes instance into dictionary according to schema dtype.
to_string(dtype[, units, atom_format, ...]) – Format a string representation of QM molecule.
Methods Documentation
- B787(ref_mol, *, fix_mode, do_plot=False, verbose=1, atoms_map=False, run_resorting=False, mols_align=False, run_to_completion=False, uno_cutoff=0.001, run_mirror=False)
Finds shift, rotation, and atom reordering of concern_mol that best aligns with ref_mol.
Wraps qcelemental.molutil.align.B787() for qcdb.Molecule or psi4.core.Molecule. Employs the Kabsch, Hungarian, and Uno algorithms to exhaustively locate the best alignment for non-oriented, non-ordered structures.
- Parameters:
concern_mol (Molecule) – Molecule of concern, to be shifted, rotated, and reordered into best coincidence with ref_mol.
ref_mol (Molecule) – Molecule to match.
fix_mode (str) – {“copy”, “true”} Relevant to the fix_com, fix_orientation, and geometry state of the returned Molecule. fix_mode="copy" uses the fix_com and fix_orientation (hereafter fix_) attributes of concern_mol (self) to create the returned molecule. The perhaps unexpected implication if fix_=F is that the resultant molecule will be in standard orientation (pretty) and NOT ALIGNED TO REF_MOL. Nevertheless, this is sometimes useful when imitating the original construction of a molecule. fix_mode="true" sets fix_com=True and fix_orientation=True to create the returned molecule. The perhaps unexpected implication if fix_=F is that the resultant molecule will DIFFER FROM CONCERN_MOL BY MORE THAN GEOMETRY. Nevertheless, this is the common usage so that the returned molecule actually has the aligned geometry regardless of concern_mol (self) fix_. Note that a possible compromise of returned molecule always having the aligned geometry and the input fix_ is technically possible but contrary to the Molecule design.
atoms_map (bool) – Whether atom1 of ref_mol corresponds to atom1 of concern_mol, etc. If true, specifying True can save much time.
mols_align (bool) – Whether ref_mol and concern_mol have identical geometries by eye (barring orientation or atom mapping) and expected final RMSD = 0. If True, procedure is truncated when RMSD condition met, saving time.
do_plot (bool) – Pops up a matplotlib plot showing before, after, and ref geometries.
run_to_completion (bool) – Run reorderings to completion (past RMSD = 0) even if unnecessary because mols_align=True. Used to test worst-case timings.
run_resorting (bool) – Run the resorting machinery even if unnecessary because atoms_map=True.
uno_cutoff (float) – TODO
run_mirror (bool) – Run alternate geometries potentially allowing best match to ref_mol from mirror image of concern_mol. Only run if system confirmed to be nonsuperimposable upon mirror reflection.
verbose (int) –
- Returns:
First item is RMSD [A] between ref_mol and the optimally aligned geometry computed. Second item is an AlignmentMill namedtuple with fields (shift, rotation, atommap, mirror) that prescribe the transformation from concern_mol to the optimally aligned geometry. Third item is a crude charge-, multiplicity-, fragment-less Molecule at optimally aligned (and atom-ordered) geometry. Return type determined by concern_mol type.
- Return type:
- BFS(seed_atoms=None, bond_threshold=1.2, return_arrays=False, return_molecules=False, return_molecule=False)
Detect fragments among real atoms through a breadth-first search (BFS) algorithm.
- Parameters:
seed_atoms (Optional[List]) – List of lists of atoms (0-indexed) belonging to independent fragments. Useful to prompt algorithm or to define intramolecular fragments through border atoms. Example: [[1, 0], [2]]
bond_threshold (float) – Factor beyond average of covalent radii to determine bond cutoff.
return_arrays (bool) – If True, also return fragments as list of arrays.
return_molecules (bool) – If True, also return fragments as list of Molecules.
return_molecule (bool) – If True, also return one big Molecule with fragmentation encoded.
- Returns:
bfs_map (list of lists) – Array of atom indices (0-indexed) of detected fragments.
bfs_arrays (tuple of lists of ndarray, optional) – geom, mass, elem info per-fragment. Only provided if return_arrays is True.
bfs_molecules (list of qcdb.Molecule or psi4.core.Molecule, optional) – List of molecules, each built from one fragment. Center and orientation of fragments is fixed so orientation info from self is not lost. Loses chgmult and ghost/dummy info from self and contains default chgmult. Only provided if return_molecules is True. Returned are of same type as self.
bfs_molecule (qcdb.Molecule or psi4.core.Molecule, optional) – Single molecule with same number of real atoms as self with atoms reordered into adjacent fragments and fragment markers inserted. Loses ghost/dummy info from self; keeps total charge but not total mult. Only provided if return_molecule is True. Returned is of same type as self.
Authors
Original code from Michael S. Marshall, linear-scaling algorithm from Trent M. Parker, revamped by Lori A. Burns.
Notes
Relies upon van der Waals radii and so faulty for close (especially hydrogen-bonded) fragments. See seed_atoms.
Any existing fragmentation info/chgmult encoded in self is lost.
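The fragment detection described above can be illustrated with a minimal, standard-library sketch. It is not the qcdb implementation: the radii table, the helper name `bfs_fragments`, and the cutoff rule (bond_threshold times the sum of two covalent radii) are assumptions made for illustration, and seed_atoms handling is omitted. The geometry is the water dimer shown in the save_string_xyz example.

```python
from collections import deque
from itertools import combinations
from math import dist

# Illustrative covalent radii in Angstrom (assumed values, not qcdb's table).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bfs_fragments(elem, geom, bond_threshold=1.2):
    """Group atoms into fragments by breadth-first search over inferred bonds."""
    nat = len(elem)
    # Assumed bonding rule: two atoms are bonded when their separation is
    # below bond_threshold times the sum of their covalent radii.
    neighbors = {i: set() for i in range(nat)}
    for i, j in combinations(range(nat), 2):
        cutoff = bond_threshold * (COVALENT_RADII[elem[i]] + COVALENT_RADII[elem[j]])
        if dist(geom[i], geom[j]) < cutoff:
            neighbors[i].add(j)
            neighbors[j].add(i)

    fragments, seen = [], set()
    for start in range(nat):
        if start in seen:
            continue
        # Standard BFS: visit every atom reachable from `start` through bonds.
        queue, fragment = deque([start]), []
        seen.add(start)
        while queue:
            atom = queue.popleft()
            fragment.append(atom)
            for nbr in sorted(neighbors[atom] - seen):
                seen.add(nbr)
                queue.append(nbr)
        fragments.append(sorted(fragment))
    return fragments


# Water dimer geometry [Angstrom], atoms 0-indexed.
elem = ["O", "H", "H", "O", "H", "H"]
geom = [
    (-1.551007, -0.114520, 0.000000),
    (-1.934259, 0.762503, 0.000000),
    (-0.599677, 0.040712, 0.000000),
    (1.350625, 0.111469, 0.000000),
    (1.680398, -0.373741, -0.758561),
    (1.680398, -0.373741, 0.758561),
]
print(bfs_fragments(elem, geom))  # [[0, 1, 2], [3, 4, 5]]
```

With these assumed radii the hydrogen bond (about 1.95 Angstrom) stays above the O-H cutoff, so the two waters separate cleanly; closer contacts are where seed_atoms becomes useful.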
- axis_representation(zero=1e-08)
Molecule vs. laboratory frame representation (e.g., IR or IIIL).
- Parameters:
zero (float) – Screen for inertial tensor elements.
- Returns:
Representation code IR, IIR, IIIR, IL, IIL, IIIL. When molecule not in inertial frame, string is prefixed by “~”.
- Return type:
str
Notes
Not carefully handling degenerate inertial elements.
- center_of_charge()
Computes center of charge of molecule (does not translate molecule).
>>> H2OH2O.center_of_charge()
[-0.073339893272065401, 0.002959783555632145, 0.0]
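As a rough illustration of the quantity involved, the sketch below computes a charge-weighted mean position from plain lists. It assumes the weights are the atomic numbers; the standalone function and its arguments are hypothetical and are not part of qcdb.

```python
def center_of_charge(elez, geom):
    """Return the nuclear-charge-weighted mean position of the atoms."""
    total_charge = sum(elez)
    return [
        sum(z * xyz[axis] for z, xyz in zip(elez, geom)) / total_charge
        for axis in range(3)
    ]


# Two-atom check: charges 8 and 1 placed at x = 0 and x = 9.
coc = center_of_charge([8, 1], [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)])
print(coc)  # [1.0, 0.0, 0.0]
```

Like the method documented above, the sketch only reports the point and leaves the coordinates untouched.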
- format_basis_for_cfour(puream)
Function to print the BASIS=SPECIAL block for Cfour according to the active atoms in Molecule. Special short basis names are used by Psi4 libmints GENBAS-writer in accordance with Cfour constraints.
- format_basis_for_nwchem(basopt)
Function to print NWChem-style basis sets into [basis block] according to the active atoms in Molecule. Basis sets are loaded from Psi4 basis sets library.
- format_molecule_for_mol()
Returns a string of Molecule formatted for mol2.
Written by Trent M. Parker 9 Jun 2014
- static from_arrays(geom=None, elea=None, elez=None, elem=None, mass=None, real=None, elbl=None, name=None, units='Angstrom', input_units_to_au=None, fix_com=False, fix_orientation=False, fix_symmetry=None, fragment_separators=None, fragment_charges=None, fragment_multiplicities=None, molecular_charge=None, molecular_multiplicity=None, comment=None, provenance=None, connectivity=None, missing_enabled_return='error', tooclose=0.1, zero_ghost_fragments=False, nonphysical=False, mtol=0.001, verbose=1, return_dict=False)
Construct Molecule from unvalidated arrays and variables.
Light wrapper around from_arrays() that is a full-featured constructor to dictionary representation of Molecule. This follows one step further to return Molecule instance.
- Parameters:
See from_arrays().
return_dict (bool, optional) – Additionally return Molecule dictionary intermediate.
- Returns:
mol (Molecule)
molrec (dict, optional) – Dictionary representation of instance. Only provided if return_dict is True.
- static from_schema(molschema, return_dict=False, verbose=1)
Construct Molecule from non-Psi4 schema.
Light wrapper around from_arrays().
- Parameters:
- Returns:
mol (Molecule)
molrec (dict, optional) – Dictionary representation of instance. Only provided if return_dict is True.
- static from_string(molstr, dtype=None, name=None, fix_com=None, fix_orientation=None, fix_symmetry=None, return_dict=False, enable_qm=True, enable_efp=True, missing_enabled_return_qm='none', missing_enabled_return_efp='none', verbose=1)
- inertia_tensor(masswt=True, zero=1e-14)
Compute inertia tensor.
>>> print(H2OH2O.inertia_tensor())
[[8.704574864178731, -8.828375721817082, 0.0], [-8.828375721817082, 280.82861714077666, 0.0], [0.0, 0.0, 281.249500988553]]
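For readers unfamiliar with the quantity, here is a minimal sketch of a mass-weighted inertia tensor for point masses using the textbook definition. It assumes the tensor is taken about the coordinate origin and that zero is a screening threshold below which elements are set to exactly 0.0; the standalone function is hypothetical, not qcdb's code.

```python
def inertia_tensor(mass, geom, zero=1e-14):
    """Inertia tensor about the origin for a set of point masses."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, (x, y, z) in zip(mass, geom):
        # Diagonal elements: mass times squared distance from each axis.
        tensor[0][0] += m * (y * y + z * z)
        tensor[1][1] += m * (x * x + z * z)
        tensor[2][2] += m * (x * x + y * y)
        # Off-diagonal elements (products of inertia) carry a minus sign.
        tensor[0][1] -= m * x * y
        tensor[0][2] -= m * x * z
        tensor[1][2] -= m * y * z
    # The tensor is symmetric, so mirror the upper triangle.
    tensor[1][0] = tensor[0][1]
    tensor[2][0] = tensor[0][2]
    tensor[2][1] = tensor[1][2]
    # Screen numerically negligible elements to exactly zero.
    return [[0.0 if abs(v) < zero else v for v in row] for row in tensor]


# Two unit masses on the x-axis at x = +1 and x = -1.
itensor = inertia_tensor([1.0, 1.0], [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
print(itensor)  # [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
```

The negative off-diagonal element in the H2OH2O output above is consistent with this sign convention.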
- inertia_tensor_partial(part, masswt=True, zero=1e-14)
Compute inertia tensor based on atoms in part.
- inertial_system_partial(part, masswt=True, zero=1e-14)
Solve inertial system based on atoms in part.
- classmethod init_with_mol2(xyzfilename, no_com=False, no_reorient=False, contentsNotFilename=False)
Pull information from a MOL2 file. No fragment info detected. Bohr/Angstrom pulled from first line if available. Charge, multiplicity, tagline pulled from second line if available. Body accepts atom symbol or atom charge in first column. Arguments no_com and no_reorient can be used to turn off shift and rotation. If xyzfilename is a string of the contents of an XYZ file, rather than the name of a file, set contentsNotFilename to True.
NOTE: chg/mult not yet implemented.
>>> H2O = qcdb.Molecule.init_with_mol2('h2o.mol2')
- classmethod init_with_xyz(xyzfilename, no_com=False, no_reorient=False, contentsNotFilename=False)
Pull information from an XYZ file. No fragment info detected. Bohr/Angstrom pulled from first line if available. Charge, multiplicity, tagline pulled from second line if available. Body accepts atom symbol or atom charge in first column. Arguments no_com and no_reorient can be used to turn off shift and rotation. If xyzfilename is a string of the contents of an XYZ file, rather than the name of a file, set contentsNotFilename to True.
>>> H2O = qcdb.Molecule.init_with_xyz('h2o.xyz')
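The line layout described above (count and optional units on the first line, optional charge, multiplicity, and tagline on the second, then one atom per line) can be sketched with a small standalone parser. This is an illustration under assumptions, not qcdb's parser: the function name `parse_xyz`, the returned dictionary keys, the "Angstrom" default, and the sample geometry are all hypothetical.

```python
def parse_xyz(text):
    """Parse XYZ-style text into a plain dictionary (illustrative only)."""
    lines = text.strip().splitlines()

    # First line: atom count, optionally followed by a units keyword.
    header = lines[0].split()
    natom = int(header[0])
    units = header[1] if len(header) > 1 else "Angstrom"

    # Second line: optional charge and multiplicity, then a free-form tagline.
    fields = lines[1].split()
    try:
        charge, multiplicity = int(fields[0]), int(fields[1])
        tagline = " ".join(fields[2:])
    except (IndexError, ValueError):
        charge = multiplicity = None
        tagline = lines[1].strip()

    # Body: label (symbol or atomic number) followed by x, y, z.
    atoms = []
    for line in lines[2 : 2 + natom]:
        label, x, y, z = line.split()[:4]
        atoms.append((label, float(x), float(y), float(z)))

    return {
        "units": units,
        "charge": charge,
        "multiplicity": multiplicity,
        "tagline": tagline,
        "atoms": atoms,
    }


# Made-up water geometry, used only to exercise the parser.
xyz_text = """3
0 1 water
O 0.000 0.000 0.117
H 0.000 0.757 -0.470
H 0.000 -0.757 -0.470
"""
mol = parse_xyz(xyz_text)
print(mol["charge"], mol["multiplicity"], mol["tagline"], len(mol["atoms"]))
```

Passing file contents as a string here mirrors the contentsNotFilename=True usage; reading from a filename would only add a file open step.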
- rotational_symmetry_number()
Number of unique orientations of the rigid molecule that only interchange identical atoms.
Notes
Source http://cccbdb.nist.gov/thermo.asp (search “symmetry number”)
- run_dftd3(func=None, dashlvl=None, dashparam=None, dertype=None, verbose=1)
Compute dispersion correction via Grimme’s DFTD3 program.
- Parameters:
func (Optional[str]) – Name of functional (func only, func & disp, or disp only) for which to compute dispersion (e.g., blyp, BLYP-D2, blyp-d3bj, blyp-d3(bj), hf+d). Any or all parameters initialized from dashcoeff[dashlvl][func] can be overwritten via dashparam.
dashlvl (Optional[str]) – Name of dispersion correction to be applied (e.g., d, D2, d3(bj), das2010). Must be key in dashcoeff or “alias” or “formal” to one.
dashparam (Optional[Dict]) – Values for the same keys as dashcoeff[dashlvl][‘default’] used to override any or all values initialized by func. Extra parameters will error.
dertype (Union[int, str, None]) – Maximum derivative level at which to run DFTD3. For large molecules, energy-only calculations can be significantly more efficient. Influences return values, see below.
verbose (int) – Amount of printing.
- Returns:
energy (float) – When dertype=0, energy [Eh].
gradient (ndarray) – When dertype=1, (nat, 3) gradient [Eh/a0].
(energy, gradient) (tuple of float and ndarray) – When dertype=None, both energy [Eh] and (nat, 3) gradient [Eh/a0].
- save_string_xyz(save_ghosts=True, save_natom=False)
Save a string for an XYZ-style file.
>>> H2OH2O.save_string_xyz()
6
-2 3 water_dimer
O -1.551007000000 -0.114520000000 0.000000000000
H -1.934259000000 0.762503000000 0.000000000000
H -0.599677000000 0.040712000000 0.000000000000
O 1.350625000000 0.111469000000 0.000000000000
H 1.680398000000 -0.373741000000 -0.758561000000
H 1.680398000000 -0.373741000000 0.758561000000
- save_xyz(filename, save_ghosts=True, save_natom=True)
Save an XYZ file.
>>> H2OH2O.save_xyz('h2o.xyz')
- scramble(*, do_shift=True, do_rotate=True, do_resort=True, deflection=1.0, do_mirror=False, do_plot=False, do_test=True, run_to_completion=False, run_resorting=False, fix_mode, verbose=1)
Generate a Molecule with random or directed translation, rotation, and atom shuffling. Optionally, check that the aligner returns the opposite transformation.
- Parameters:
ref_mol (Molecule) – Molecule to perturb.
do_shift (Union[bool, ndarray, List]) – Whether to generate a random atom shift on interval [-3, 3) in each dimension (True) or leave at current origin. To shift by a specified vector, supply a 3-element list.
do_rotate (Union[bool, ndarray, List[List]]) – Whether to generate a random 3D rotation according to algorithm of Arvo. To rotate by a specified matrix, supply a 9-element list of lists.
do_resort (Union[bool, List]) – Whether to shuffle atoms (True) or leave 1st atom 1st, etc. (False). To specify shuffle, supply a nat-element list of indices.
deflection (float) – If do_rotate, how random a rotation: 0.0 is no change, 0.1 is small perturbation, 1.0 is completely random.
do_mirror (bool) – Whether to construct the mirror image structure by inverting y-axis.
do_plot (bool) – Pops up a matplotlib plot showing before, after, and ref geometries.
do_test (bool) – Additionally, run the aligner on the returned Molecule and check that opposite transformations obtained.
run_to_completion (bool) – By construction, scrambled systems are fully alignable (final RMSD=0). Even so, True turns off the mechanism to stop when RMSD reaches zero and instead proceed to worst possible time.
run_resorting (bool) – Even if atoms not shuffled, test the resorting machinery.
fix_mode (str) – {“copy”, “true”} Relevant to the fix_com, fix_orientation, and geometry state of the returned Molecule. fix_mode="copy" uses the fix_com and fix_orientation (hereafter fix_) attributes of concern_mol (self) to create the returned molecule. The perhaps unexpected implication if fix_=F is that the resultant molecule will be in standard orientation (pretty) and NOT ALIGNED TO REF_MOL. Nevertheless, this is sometimes useful when imitating the original construction of a molecule. fix_mode="true" sets fix_com=True and fix_orientation=True to create the returned molecule. The perhaps unexpected implication if fix_=F is that the resultant molecule will DIFFER FROM CONCERN_MOL BY MORE THAN GEOMETRY. Nevertheless, this is the common usage so that the returned molecule actually has the aligned geometry regardless of concern_mol (self) fix_. Note that a possible compromise of returned molecule always having the aligned geometry and the input fix_ is technically possible but contrary to the Molecule design.
verbose (int) – Print level.
- Returns:
mol (Molecule)
data (Dict[key, Any]) – Molecule is scrambled copy of ref_mol (self). data[‘rmsd’] is RMSD [A] between ref_mol and the scrambled geometry. data[‘mill’] is an AlignmentMill with fields (shift, rotation, atommap, mirror) that prescribe the transformation from ref_mol to the returned geometry.
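The idea of scrambling a geometry and then checking that the recorded transformation can be undone can be sketched in plain Python. This simplified illustration covers only the shift and atom shuffle (rotation and mirroring are left out), and the helper names, the seeded generator, and the atommap convention used here are assumptions rather than qcdb or QCElemental behavior.

```python
import random


def scramble_geometry(geom, rng):
    """Apply a random shift and atom shuffle; return the result and the recipe."""
    nat = len(geom)
    # Random shift drawn per dimension from roughly the interval [-3, 3).
    shift = [rng.uniform(-3.0, 3.0) for _ in range(3)]
    # atommap[new] = old: position `new` of the result holds original atom `old`.
    atommap = list(range(nat))
    rng.shuffle(atommap)
    scrambled = [tuple(c + s for c, s in zip(geom[old], shift)) for old in atommap]
    return scrambled, shift, atommap


def unscramble_geometry(scrambled, shift, atommap):
    """Invert scramble_geometry using the recorded shift and atom map."""
    restored = [None] * len(scrambled)
    for new, old in enumerate(atommap):
        restored[old] = tuple(c - s for c, s in zip(scrambled[new], shift))
    return restored


rng = random.Random(0)  # seeded so the sketch is reproducible
geom = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
scrambled, shift, atommap = scramble_geometry(geom, rng)
restored = unscramble_geometry(scrambled, shift, atommap)

# The round trip should reproduce the input up to floating-point noise.
max_err = max(abs(a - b) for p, q in zip(geom, restored) for a, b in zip(p, q))
print(max_err < 1e-12)
```

The round-trip check plays the role that do_test=True is described as playing above: confirming that the transformation reported alongside the scrambled structure really maps one geometry onto the other.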
- set_fragment_pattern(frl, frt, frc, frm)
Set fragment member data through public method analogous to psi4.core.Molecule
- to_arrays(dummy=False, ghost_as_dummy=False)
Exports coordinate info into NumPy arrays.
- Parameters:
- Returns:
geom, mass, elem, elez, uniq (ndarray, ndarray, ndarray, ndarray, ndarray) – (nat, 3) geometry [a0]. (nat,) mass [u]. (nat,) element symbol. (nat,) atomic number. (nat,) hash of element symbol and mass. Note that coordinate, orientation, and element information is preserved but fragmentation, chgmult, and dummy/ghost is lost.
Usage
geom, mass, elem, elez, uniq = molinstance.to_arrays()
- to_dict(force_c1=False, force_units=False, np_out=True)
Serializes instance into Molecule dictionary.