This is a read-only mirror of pymolwiki.org

Difference between revisions of "Cealign"

From PyMOL Wiki
 
m (2 revisions)
 
(34 intermediate revisions by 9 users not shown)
== Introduction ==
+
[[Image:cealign_ex1.png|300px|thumb|right|cealign superposition of 1c0mB and 1bco]]
  
'''Go directly to [[Cealign#Version_0.8-RBS|DOWNLOAD]]'''
+
[[cealign]] aligns two proteins using the CE algorithm. It is very robust for proteins with little to no sequence similarity (twilight zone). For proteins with decent structural similarity, the [[super]] command is preferred; for proteins with decent sequence similarity, the [[align]] command is preferred. Both commands are much faster than [[cealign]].
  
This page is the home page of the open-source CEAlign PyMOL plugin. The CE algorithm is a fast and accurate protein structure alignment algorithm, pioneered by Drs. Shindyalov and Bourne (See
+
''This command is new in PyMOL 1.3; see the [[cealign plugin]] for manual installation.''
References).  There are a few changes from the original CE publication (See Notes).
 
  
The source code is implemented in C with the rotations finally done by Numpy in Python.  Because the computationally complex portion of the code is written in C, it's quick.  That is, on my machines --- relatively fast 64-bit machines --- I can align two 400+ amino acid structures in about 0.300 s with the C++ implementation.
+
== Usage ==
  
This plugs into PyMol very easily. See [[Cealign#The_Code|the code]] and [[Cealign#Examples|examples]] for installation and usage.
+
  cealign target, mobile [, target_state [, mobile_state
 +
    [, quiet [, guide [, d0 [, d1 [, window [, gap_max
 +
    [, transform [, object ]]]]]]]]]]
  
== Comparison to PyMol ==
+
== Arguments ==
'''Why should you use this?'''
 
  
PyMOL's structure alignment algorithm is fast and robust.  However, its first step is to perform a sequence alignment of the two selections.  Thus, proteins in the '''twilight zone''' or those having a low sequence identity, may not align well.  Because CE is a structure-based alignment, this is not a problem.  Consider the following example.  The image at LEFT was the result of CE-aligning two proteins (1C0M chain B to 1BCO).  The result is '''152''' aligned (alpha carbons) residues (not atoms) at '''4.96 Angstroms'''.  The image on the RIGHT shows the results from PyMol's align command: an alignment of '''221 atoms''' (not residues) at an RMSD of '''15.7 Angstroms'''.
+
'''Note''': The '''mobile''' and '''target''' arguments are swapped with respect to the [[align]] and [[super]] commands.
  
<gallery>
+
* '''target''' = string: atom selection of target object
Image:cealign_ex1.png|Cealign's results (152 aligned; 4.96 Ang.)
+
* '''mobile''' = string: atom selection of mobile object
Image:pymol_align.png|PyMol's results (763 atoms; 18.4 Ang. )
+
* '''target_state''' = int: object state of target selection {default: 1}
</gallery>
+
* '''mobile_state''' = int: object state of mobile selection {default: 1}
 +
* '''quiet''' = 0/1: suppress output {default: 0 in command mode, 1 in API}
 +
* '''guide''' = 0/1: only use "guide" atoms (CA, C4') {default: 1}
 +
* '''d0, d1, window, gap_max''': CE algorithm parameters
 +
* '''transform''' = 0/1: do superposition {default: 1}
 +
* '''object''' = string: name of alignment object to create {default: (no alignment object)}
  
== Examples ==
+
== Example ==
=== Usage ===
 
==== Syntax ====
 
  
CEAlign has the semantic and syntactic formalism of
+
<syntaxhighlight lang="python">
<source lang="python">
+
fetch 1c0mB 1bco, async=0
cealign MASTER, TARGET
+
as ribbon
</source>
+
cealign 1bco, 1c0mB, object=aln
where a post-condition of the algorithm is that the coordinates of the '''MASTER''' protein are unchanged.  This allows for easier multi-protein alignments.  For example,
+
</syntaxhighlight>
<source lang="python">
 
cealign 1AUE, 1BZ4
 
cealign 1AUE, 1B68
 
cealign 1AUE, 1A7V
 
cealign 1AUE, 1CPR
 
</source>
 
will superimpose all the TARGETS onto the MASTER.
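
The MASTER-first pattern above can be generated for any list of targets. The following is a minimal sketch, not part of the CEAlign package; the helper name <tt>cealign_all</tt> is hypothetical, and it only builds the command text rather than talking to PyMOL.

```python
def cealign_all(master, targets):
    """Return one 'cealign MASTER, TARGET' command line per target.

    Hypothetical helper for illustration: the MASTER is always the first
    argument, so its coordinates are the ones left unchanged.
    """
    return ["cealign %s, %s" % (master, target) for target in targets]


# Reproduce the command list from the example above.
for line in cealign_all("1AUE", ["1BZ4", "1B68", "1A7V", "1CPR"]):
    print(line)
```

Each resulting line can be pasted at the PyMOL prompt or, assuming a running PyMOL session, passed to <tt>cmd.do</tt>.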
 
  
=====Examples=====
+
== See Also ==
<source lang="python">
 
cealign 1cll and i. 42-55, 1ggz and c. A
 
cealign 1kao, 1ctq
 
cealign 1fao, 1eaz
 
</source>
 
  
=====Multiple Structure Alignments=====
+
* [[super]]
Use the '''alignto''' command, now provided with cealign.  Just type,
+
* [[align]]
<source lang="python">
+
* [[cealign plugin]]
alignto PROT
 
</source>
 
to align all your proteins in PyMOL to the one called, '''PROT'''.
 
  
=== Results ===
+
[[Category:Commands|Align]]
See '''Changes''' for updates.  But, overall, the results here are great.
+
[[Category:Structure_Alignment|Align]]
 
 
<gallery>
 
Image:v7_1fao_1eaz.png|EASY: 1FAO vs. 1EAZ; 96 residues, 1.28 Ang
 
Image:v7_1cbs_1hmt.png|EASY: 1CBS vs. 1HMT; 128 residues, 2.01 Ang
 
Image:v7_1a15_1b50.png|MODERATE: 1A15 vs 1B50; 56 residues, 2.54 Ang.
 
Image:v7_1oan_1s6n.png|EASY: 1OAN vs. 1S6N (state 1); 96 residues aligned to 3.83 Ang. RMSD.
 
Image:v7_1rlw_1byn.png|HARD: 1RLW to 1BYN; 104 residues; 2.21 Ang.
 
Image:v7_1ten_3hhr.png|HARD: 1TEN vs. 3HHR; 80 residues, 2.91 Ang.
 
Image:v7_2sim_1nsb.png|HARD: 2SIM vs. 1NSB; 272 residues, 4.93 Ang.
 
Image:v7_1cew_1mol.png|HARD: 1CEW vs. 1MOL; 80 residues, 4.03 Ang.
 
</gallery>
 
 
 
== Installation ==
 
 
 
===Mac OS X (10.5)===
 
[[Image:Cealign mac os x.png|300px|thumb|center|CEAlign running on Mac OS X (10.5)]]
 
* Install PyMOL under fink.
 
* Install Numpy for fink:
 
<source lang="bash">
 
/sw/bin/fink install scipy-core-25
 
</source>
 
* Install cealign
 
<source lang="bash">
 
sudo /sw/bin/python setup.py install
 
</source>
 
* In PyMOL, run the two scripts needed for cealign: "cealign.py" and "qkabsch.py".
 
* Voila!
 
 
 
===Windows systems===
 
This is a quick and dirty method to get it working on Win32 right now; more details are coming soon.
 
====Requirements====
 
* Latest PyMol, installed on your system
 
* Numpy for python 2.4 -- quick download of just what's needed: http://users.umassmed.edu/Shivender.Shandilya/pymol/numpy.zip
 
* Pre-compiled ccealign.pyd python module: http://users.umassmed.edu/Shivender.Shandilya/pymol/ccealign.zip
 
* Modified pymolrc: http://users.umassmed.edu/Shivender.Shandilya/pymol/pymolrc
 
* cealign.py and qkabsch.py from the Cealign-0.8-RBS package: download below
 
 
 
====Directions====
 
# Unzip the numpy.zip file, which will give you a folder named '''numpy'''
 
# Move this entire folder to: C:\Program Files\DeLano Scientific\PyMOL\modules\  (or the corresponding location on your system)
 
# Unzip ccealign.zip, which will give you a file called  '''ccealign.pyd'''
 
# Move this pyd file to: C:\Program Files\DeLano Scientific\PyMOL\py24\DLLs\  (or the corresponding location on your system)
 
# Copy the downloaded '''pymolrc''' file to: C:\Program Files\DeLano Scientific\PyMOL\  (or the corresponding location on your system)
 
# Extract and copy the files cealign.py and qkabsch.py from the Cealign-0.8-RBS package to: C:\Program Files\DeLano Scientific\PyMOL\py24\Lib\  (or the corresponding location on your system)
 
# Run PyMol and load some molecules
 
# Run this command in Pymol: '''cealign molecule1, molecule2'''
 
# Enjoy!
 
 
 
===*nix systems===
 
====Requirements====
 
* C compiler
 
* Python 2.4+ with distutils
 
* Numpy
 
** for User-compiled PyMOL: <source lang="python">python setup.py install</source>
 
** for the precompiled version of PyMOL <source lang="python">python setup.py install --prefix "" --root /DIR_TO/pymol/ext/</source>
 
 
 
====Directions====
 
# uncompress the distribution file '''cealign-VERSION.tgz'''
 
# cd cealign-VERSION
 
# sudo python setup.py install  # if you installed PyMOL by hand
 
## python setup.py install --prefix "" --root /DIR/TO/pymol/ext/  # if you are using the precompiled binary download
 
# insert "run DIR_TO_CEALIGN/cealign.py" and "run DIR_TO_CEALIGN/qkabsch.py" into your '''.pymolrc''' file, or just run the two Python scripts by hand.
 
# load some molecules
 
# run, '''cealign molecule1, molecule2'''
 
# enjoy
 
 
 
=====Pre-compiled Hackish Install=====
 
For those people that prefer to use the pre-compiled version of PyMOL, here are the basics for your install.  '''This is a poor method of installing Cealign.  I suggest users compile and install their own PyMOL.'''  The final goal is to get
 
# '''ccealign.so''' module into '''PYMOL/ext/lib/python2.4/site-packages'''
 
# numpy installed (get the numpy directory into, or linked into, '''PYMOL/ext/lib/python2.4/site-packages''')
 
# and be able to run cealign.py and qkabsch.py from PyMOL.
 
If you can do the above three steps, '''cealign''' should run from the pre-compiled PyMOL.
 
 
 
In more detail, on a completely fictitious machine --- that is, I created the following commands from a fake machine and I don't expect a copy/paste of this to work '''anywhere''', but the commands should be helpful enough to those who need it:
 
<source lang="python">
 
# NOTES:
 
# This is fake code: don't copy/paste it.
 
#
 
# PYMOL='dir to precompiled PyMOL install'
 
# CEALIGN='dir where you will unpack cealign'
 
# replace lib with lib64 for x86-64
 
# install numpy
 
apt-get install numpy
 
 
 
# link numpy to PyMOL
 
ln -s /usr/local/lib/python2.4/site-packages/numpy PYMOL/ext/lib/python2.4/site-packages
 
 
 
# download and install Cealign
 
wget http://www.pymolwiki.org/images/e/ed/Cealign-0.6.tar.bz2
 
tar -jxvf Cealign-0.6.tar.bz2
 
cd cealign-0.6
 
sudo python setup.py build
 
cp build/lib-XYZ-linux/ccealign.so PYMOL/ext/lib/python2.4/site-packages
 
 
 
# run pymol and try it out
 
pymol
 
run CEALIGN/cealign.py
 
run CEALIGN/qkabsch.py
 
fetch 1cew 1mol, async=0
 
cealign 1c, 1m
 
</source>
 
 
 
== The Code ==
 
Please unpack and read the documentation.  All comments/questions should be directed to Jason Vertrees (javertre _at_ utmb ...dot... edu). 
 
 
 
'''LATEST IS v0.8-RBS'''.  (Dedicated to Bryan Sutton for allowing me to use his computer for testing.)
 
 
 
=== Version 0.8-RBS ===
 
* '''Download: [[Media:Cealign-0.8-RBS.tar.bz2|CE Align v0.8-RBS]] (bz2)'''
 
* '''Download: [[Media:Cealign-0.8-RBS.zip|CE Align v0.8-RBS]] (zip)'''
 
 
 
=== Beta Version 0.9 ===
 
Use at your own peril.  Please report any problems or inconsistent alignments to this discussion page, or to me directly; my email address is all over this page.
 
 
 
'''Improvements/Changes''':
 
* All C++
 
** So, faster
 
** comes with the dependencies built in
 
* No numpy
 
 
 
''' Download: [[Media:Cealign-0.9.zip|CE Align v0.9]] (zip)'''
 
 
 
== Coming Soon ==
 
* Windows binary
 
* Linux Binaries (32bit, x86-64)
 
* Better instructions for precompiled distributions
 
* Optimization
 
 
 
== Updates ==
 
 
 
=== 2008-03-25 ===
 
Pure C++ code released.  See the beta version above.
 
 
 
=== 2007-04-14 ===
 
v0.8-RBS source updated.  Found the bug that had been plaguing 32-bit machines.  This should be the last release for a little while.
 
 
 
Also, I provide the option of aligning based solely upon RMSD or upon the better CE-Score.  See the '''References''' for information on the '''CE Score'''.
 
 
 
== Troubleshooting ==
 
 
 
Post your problems/solutions here.
 
 
 
=== Unicode Issues in Python/Numpy ===
 
'''Problem''': Running/Installing cealign gives
 
<source lang="python">
 
Traceback (most recent call last):
 
  File "/home/byron/software/pymol_1.00b17/pymol/modules/pymol/parser.py",
 
line 308, in parse
 
  File "/home/byron/software/pymol_1.00b17/pymol/modules/pymol/parsing.py",
 
line 410, in run_file
 
  File "qkabsch.py", line 86, in ?
 
    import numpy
 
  File "/usr/lib/python2.4/site-packages/numpy/__init__.py", line 36, in ?
 
    import core
 
  File "/usr/lib/python2.4/site-packages/numpy/core/__init__.py", line 5, in ?
 
    import multiarray
 
ImportError: /home/byron/software/pymol/ext/lib/python2.4/site-packages/numpy/core/multiarray.so:
 
undefined symbol: _PyUnicodeUCS4_IsWhitespace
 
</source>
 
where the important line is
 
<source lang="python">
 
undefined symbol: _PyUnicodeUCS4_IsWhitespace
 
</source>
 
 
 
This problem indicates that your Numpy build uses a different byte size for Unicode characters than the Python distribution your PyMOL is running from.  For example, this can happen if you use the pre-built PyMOL and some other pre-built Numpy package.
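
One way to check for such a mismatch is to compare <tt>sys.maxunicode</tt> in the two interpreters: a narrow (UCS2) Python 2 build reports 65535, a wide (UCS4) build reports 1114111.  The sketch below is only a diagnostic aid; the function name <tt>unicode_build</tt> is made up for this example.

```python
import sys


def unicode_build(maxunicode=None):
    """Classify an interpreter as 'UCS2' or 'UCS4' from sys.maxunicode.

    Hypothetical helper: pass a value explicitly to classify a number
    copied from another interpreter.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow builds stop at 0xFFFF; wide builds go up to 0x10FFFF.
    if maxunicode > 0xFFFF:
        return "UCS4"
    return "UCS2"


print(unicode_build())
```

Run it once with the Python that PyMOL uses and once with the Python that Numpy was built for.  If the two answers differ, the Numpy build does not match that PyMOL.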
 
 
 
 
 
 
 
'''Solution''': Hand-install Numpy.
 
 
 
 
 
=== LinAlg Module Not Found ===
 
'''Problem''': Running CE Align gives the following error message:
 
<source lang="python">
 
run qkabsch.py
 
Traceback (most recent call last):
 
File "/usr/lib/python2.4/site-packages/pymol/parser.py", line 285, in parse
 
parsing.run_file(exp_path(args[nest][0]),pymol_names,pymol_names)
 
File "/usr/lib/python2.4/site-packages/pymol/parsing.py", line 407, in run_file
 
execfile(file,global_ns,local_ns)
 
File "qkabsch.py", line 86, in ?
 
import numpy
 
File "/usr/lib/python2.4/site-packages/numpy/__init__.py", line 40, in ?
 
import linalg
 
ImportError: No module named linalg
 
</source>
 
 
 
 
 
 
 
'''Solution''': The linear algebra module is not installed on your machine (or Python can't find it).  One workaround is to install [http://www.scipy.org/ Scientific Python] (on Debian/Ubuntu: <tt>sudo apt-get install python-scipy</tt>).  Another is to reinstall the Numpy package from source, ensuring that you have the necessary requirements for the linear algebra module (linpack, lapack, fft, etc.).
 
 
 
=== CCEAlign & NumPy Modules Not Found ===
 
'''Problem''': Running CE Align gives the following error message:
 
<source lang="python">
 
PyMOL>run cealign.py
 
Traceback (most recent call last):
 
  File "/home/local/warren/MacPyMOL060530/build/Deployment/MacPyMOL.app/pymol/modules/pymol/parser.py", line 297, in parse
 
  File "/home/local/warren/MacPyMOL060530/build/Deployment/MacPyMOL.app/pymol/modules/pymol/parsing.py", line 408, in run_file
 
  File "/usr/local/pymol/scripts/cealign-0.1/cealign.py", line 59, in ?
 
    from ccealign import ccealign
 
ImportError: No module named ccealign
 
run qkabsch.py
 
Traceback (most recent call last):
 
File "/home/local/warren/MacPyMOL060530/build/Deployment/MacPyMOL.app/pymol/modules/pymol/parser.py", line 297, in parse
 
File "/home/local/warren/MacPyMOL060530/build/Deployment/MacPyMOL.app/pymol/modules/pymol/parsing.py", line 408, in run_file
 
File "qkabsch.py", line 86, in ?
 
import numpy
 
ImportError: No module named numpy
 
</source>
 
 
 
 
 
 
 
'''Solution''': This problem occurs under [http://www.apple.com/macosx Apple Mac OS X] if (a) Apple's python executable on your machine (/usr/bin/python, currently version 2.3.5) is superseded by [http://fink.sourceforge.net/ Fink]'s python executable (/sw/bin/python, currently version 2.5) and (b) you are using [http://delsci.com/rel/099/#MacOSX precompiled versions of PyMOL] (MacPyMOL, PyMOLX11Hybrid or PyMOL for Mac OS X/X11). These executables ignore Fink's python and instead use Apple's, so in order to run CE Align, one must install NumPy (as well as CE Align itself) using Apple's python. To do so, first download the [http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103 Numpy source code archive] (currently version 1.0.1), unpack it, change directory to numpy-1.0.1 and specify the full path to Apple's python executable during installation: <tt>sudo /usr/bin/python setup.py install | tee install.log</tt>. Then, download the [http://www.pymolwiki.org/index.php/Cealign#The_Code CE Align source code archive] (currently version 0.2), unpack it, change directory to cealign-0.2 and finally install CE Align as follows: <tt>sudo /usr/bin/python setup.py install | tee install.log</tt>.
 
[[User:Lucajovine|Luca Jovine]] 05:11, 25 January 2007 (CST).
 
 
 
=== The Function SimpAlign() is not found ===
 
'''Problem''': Running CE Align gives the following error message:
 
<source lang="python">
 
PyMOL>cealign 1CLL,1GGZ
 
Traceback (most recent call last):
 
  File "C:\Program Files (x86)\DeLano Scientific\PyMOL/modules\pymol\parser.py", line 203, in parse
 
    result=apply(kw[nest][0],args[nest],kw_args[nest])
 
  File "py24/Lib/cealign.py", line 177, in cealign
 
    curScore = simpAlign( matA, matB, mol1, mol2, stored.mol1, stored.mol2, align=0, L=len(matA) )
 
NameError: global name 'simpAlign' is not defined
 
</source>
 
I am running PyMOL v. 0.99rc6 on Win XP Professional x64 edition version 2003 sp2 and have followed the windows install procedure as described above.
 
 
 
 
 
=== Short Alignments Don't Work ===
 
If you are trying to align fewer than 16 residues then use [[align]], [[super]], or [[optAlign]].  CE uses a window size of 8; and to build a path of more than one window, you need 2*8=16 residues.  I will insert some code to re-route small alignments to one of the aforementioned alignment algorithms.
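
Until such re-routing exists, the rule above can be applied by hand.  The following is a rough sketch of the check, not code from the CEAlign package: the function names are hypothetical, and choosing [[super]] as the fallback is an assumption (any of the three commands listed above would do).

```python
CE_WINDOW = 8  # window size stated in the text above


def min_ce_residues(window=CE_WINDOW):
    """Smallest selection that can hold a path of two windows."""
    return 2 * window


def pick_aligner(n_target, n_mobile, window=CE_WINDOW, fallback="super"):
    """Return the name of the command to use for two selections.

    n_target and n_mobile are residue counts.  The fallback is an
    assumption made for this sketch.
    """
    if min(n_target, n_mobile) < min_ce_residues(window):
        return fallback
    return "cealign"


print(pick_aligner(14, 148))   # too short for CE, falls back
print(pick_aligner(148, 152))  # long enough for CE
```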
 
 
 
=== It Worked A Second Ago! ===
 
[[Image:Rewind.png|thumb|right|Showing the rewind button to rewind to state 1.]]
 
 
 
If you were using cealign (or alignto) and now the commands don't work -- that is, they return an RMSD, but don't actually superimpose the objects, then you have a simple problem dealing with states.  Most likely the cause of this oddness was (1) when you issued "cealign prot1, prot2" one of them was actually an ensemble of states or (2) you are trying to align two proteins with only one state, but are not looking at state one (because the last protein you were considering had more than one state and you quit editing that protein on a state that's not state 1).  To fix this, use the rewind button to get the proteins back into state 1 & reissue the cealign/alignto command.
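
The same fix can be scripted.  The sketch below assumes the built-in <tt>cealign</tt> command of PyMOL 1.3 and later, reached through <tt>pymol.cmd</tt>, rather than the plugin scripts described on this page.  The function name <tt>realign_from_state_one</tt> is hypothetical, and the small recording class is only a stand-in so that the example runs outside PyMOL.

```python
def realign_from_state_one(cmd, target, mobile):
    """Rewind to the first frame, then align state 1 onto state 1.

    'cmd' is expected to behave like pymol.cmd (an assumption of this
    sketch); it is passed in so the function can be tried without PyMOL.
    """
    cmd.rewind()
    return cmd.cealign(target, mobile, target_state=1, mobile_state=1)


class RecordingCmd(object):
    """Stand-in for pymol.cmd that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def rewind(self):
        self.calls.append(("rewind",))

    def cealign(self, target, mobile, **options):
        self.calls.append(("cealign", target, mobile, options))


demo = RecordingCmd()
realign_from_state_one(demo, "prot1", "prot2")
print(demo.calls)
```

Inside PyMOL, pass the real module instead: <tt>from pymol import cmd</tt>, then <tt>realign_from_state_one(cmd, "prot1", "prot2")</tt>.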
 
 
 
== References ==
 
Text taken from PubMed and formatted for the wiki.  The first reference is the most important for this code.
 
 
 
#  Shindyalov IN, Bourne PE. '''Protein structure alignment by incremental combinatorial extension (CE) of the optimal path.'''  ''Protein Eng.'' 1998 Sep;11(9):739-47.  PMID: 9796821 [PubMed - indexed for MEDLINE]
 
# Jia Y, Dewey TG, Shindyalov IN, Bourne PE. '''A new scoring function and associated statistical significance for structure alignment by CE.'''  ''J Comput Biol.'' 2004;11(5):787-99. PMID: 15700402 [PubMed - indexed for MEDLINE]
 
#  Pekurovsky D, Shindyalov IN, Bourne PE. '''A case study of high-throughput biological data processing on parallel platforms.'''  ''Bioinformatics.'' 2004 Aug 12;20(12):1940-7. Epub 2004 Mar 25.  PMID: 15044237 [PubMed - indexed for MEDLINE]
 
#  Shindyalov IN, Bourne PE. '''An alternative view of protein fold space.'''  ''Proteins.'' 2000 Feb 15;38(3):247-60.  PMID: 10713986 [PubMed - indexed for MEDLINE]
 
 
 
== License ==
 
CEAlign and all of its subprograms that I wrote are released under the open-source FreeBSD License (BSDL).
 
 
 
 
 
[[Category:Script_Library]]
 
[[Category:Structure_Alignment|Cealign]]
 

Latest revision as of 15:32, 20 October 2014

cealign superposition of 1c0mB and 1bco

cealign aligns two proteins using the CE algorithm. It is very robust for proteins with little to no sequence similarity (twilight zone). For proteins with decent structural similarity, the super command is preferred; for proteins with decent sequence similarity, the align command is preferred. Both commands are much faster than cealign.

This command is new in PyMOL 1.3; see the cealign plugin for manual installation.

Usage

cealign target, mobile [, target_state [, mobile_state
    [, quiet [, guide [, d0 [, d1 [, window [, gap_max
    [, transform [, object ]]]]]]]]]]

Arguments

Note: The mobile and target arguments are swapped with respect to the align and super commands.

  • target = string: atom selection of target object
  • mobile = string: atom selection of mobile object
  • target_state = int: object state of target selection {default: 1}
  • mobile_state = int: object state of mobile selection {default: 1}
  • quiet = 0/1: suppress output {default: 0 in command mode, 1 in API}
  • guide = 0/1: only use "guide" atoms (CA, C4') {default: 1}
  • d0, d1, window, gap_max: CE algorithm parameters
  • transform = 0/1: do superposition {default: 1}
  • object = string: name of alignment object to create {default: (no alignment object)}
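
Because the target comes first here (the reverse of align and super), it can help to build the command line from named arguments. The following is a minimal sketch and not part of PyMOL: the helper name cealign_command is hypothetical, the option names are taken from the usage line above, and the function only returns text.

```python
# Option names, in the order given by the usage line above.
CEALIGN_OPTIONS = (
    "target_state", "mobile_state", "quiet", "guide",
    "d0", "d1", "window", "gap_max", "transform", "object",
)


def cealign_command(target, mobile, **options):
    """Build a cealign command line with the target selection first.

    Hypothetical helper: it validates option names against the usage
    line and returns a string; it does not call PyMOL.
    """
    unknown = sorted(set(options) - set(CEALIGN_OPTIONS))
    if unknown:
        raise ValueError("unknown cealign option(s): " + ", ".join(unknown))
    parts = [target, mobile]
    # Emit keyword options in signature order so the output is stable.
    for name in CEALIGN_OPTIONS:
        if name in options:
            parts.append("%s=%s" % (name, options[name]))
    return "cealign " + ", ".join(parts)


print(cealign_command("1bco", "1c0mB", object="aln"))
```

The call shown builds the same command as the one used in the Example section.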

Example

fetch 1c0mB 1bco, async=0
as ribbon
cealign 1bco, 1c0mB, object=aln

See Also