biopandas version: 0.4.1
PandasMol2
PandasMol2()
Object for working with Tripos Mol2 structure files.
Attributes
-
df
: pandas.DataFrameDataFrame of a Mol2's ATOM section
-
mol2_text
: strMol2 file contents in string format
-
code
: strID, code, or name of the molecule stored
-
pdb_path
: strLocation of the MOL2 file that was read in via
read_mol2
Methods
distance(xyz=(0.0, 0.0, 0.0))
Computes Euclidean distance between atoms in self.df and a 3D point.
Parameters
-
xyz
: tuple (0.00, 0.00, 0.00)X, Y, and Z coordinate of the reference center for the distance computation
Returns
-
pandas.Series
: Pandas Series object containing the Euclideandistance between the atoms in the atom section and
xyz
.
distance_df(df, xyz=(0.0, 0.0, 0.0))
Computes Euclidean distance between atoms and a 3D point.
Parameters
-
df
: DataFrameDataFrame containing entries similar to the PandasMol2.df format for the the distance computation to the
xyz
reference coordinates. -
xyz
: tuple (0.00, 0.00, 0.00)X, Y, and Z coordinate of the reference center for the distance computation
Returns
-
pandas.Series
: Pandas Series object containing the Euclideandistance between the atoms in the atom section and
xyz
.
read_mol2(path, columns=None)
Reads Mol2 files (unzipped or gzipped) from local drive
Note that if your mol2 file contains more than one molecule,
only the first molecule is loaded into the DataFrame
Attributes
-
path
: strPath to the Mol2 file in .mol2 format or gzipped format (.mol2.gz)
-
columns
: dict or None (default: None)If None, this methods expects a 9-column ATOM section that contains the following columns:
{0:('atom_id', int), 1:('atom_name', str), 2:('x', float), 3:('y', float), 4:('z', float), 5:('atom_type', str), 6:('subst_id', int), 7:('subst_name', str), 8:('charge', float)}
If your Mol2 files are formatted differently, you can provide your own column_mapping dictionary in a format similar to the one above. However, note that not all assert_raise_message methods may be supported then.
Returns
self
read_mol2_from_list(mol2_lines, mol2_code, columns=None)
Reads Mol2 file from a list into DataFrames
Attributes
-
mol2_lines
: listA list of lines containing the mol2 file contents. For example, ['@
MOLECULE\n', 'ZINC38611810\n', ' 65 68 0 0 0\n', 'SMALL\n', 'NO_CHARGES\n', '\n', '@ ATOM\n', ' 1 C1 -1.1786 2.7011 -4.0323 C.3 1 <0> -0.1537\n', ' 2 C2 -1.2950 1.2442 -3.5798 C.3 1 <0> -0.1156\n', ...] -
mol2_code
: str or NoneName or ID of the molecule.
-
columns
: dict or None (default: None)If None, this methods expects a 9-column ATOM section that contains the following columns: {0:('atom_id', int), 1:('atom_name', str), 2:('x', float), 3:('y', float), 4:('z', float), 5:('atom_type', str), 6:('subst_id', int), 7:('subst_name', str), 8:('charge', float)} If your Mol2 files are formatted differently, you can provide your own column_mapping dictionary in a format similar to the one above. However, note that not all assert_raise_message methods may be supported then.
Returns
self
rmsd(df1, df2, heavy_only=True)
Compute the Root Mean Square Deviation between molecules
Parameters
-
df1
: pandas.DataFrameDataFrame with HETATM, ATOM, and/or ANISOU entries
-
df2
: pandas.DataFrameSecond DataFrame for RMSD computation against df1. Must have the same number of entries as df1
-
heavy_only
: bool (default: True)Which atoms to compare to compute the RMSD. If
True
(default), computes the RMSD between non-hydrogen atoms only.
Returns
-
rmsd
: floatRoot Mean Square Deviation between df1 and df2
Properties
df
Acccesses the pandas DataFrame
split_multimol2
split_multimol2(mol2_path)
Generator function that splits a multi-mol2 file into individual Mol2 file contents.
Parameters
-
mol2_path
: strPath to the multi-mol2 file. Parses gzip files if the filepath ends on .gz.
Returns
A generator object for lists for every extracted mol2-file. Lists contain
the molecule ID and the mol2 file contents.
e.g., ['ID1234', ['@