Presentation Graphics of Protein and DNA Structures
This is a little how-to guide for creating high quality graphics of structures for papers, posters, lab meetings and the web.
Programs
Dave and Jane Richardson were among the first people to attempt to make simplified representations of protein structures. Very often these representations were made by hand. John Priestle's program RIBBON was the first program to attempt to automate the process, although it was not an interactive program. These days there is a whole range of programs that do analogous things.
Programs specializing in producing images: Per Kraulis's program Molscript is a partially interactive program that can generate sophisticated molecular representations. It can also output data that can be interpreted by other programs. It's the program I'll spend most of my time dealing with here. Mike Carson's program Ribbons is a similar type of program with a different interface, one perhaps giving even more flexibility in the style of representation, but with a more limited output interface. Ribbons is probably the best option for DNA/RNA.
Programs drawing representations on screen: Grasp is still the pre-eminent molecular graphics package for the SGI, notably for its surface capabilities and electrostatics. However, for reasons that still amaze me, Grasp has not been ported to Linux or OSX. Pymol is a rapidly evolving graphics program that can do much of what Grasp can do, although you need to use external programs (e.g. APBS) to generate surfaces and electrostatics, and it is the best option for the other platforms. Dino is another program along the same lines, as is VMD. Rasmol was the first simple program for molecular graphics, and has simplified representations but they are not of particularly high quality. Swiss-PDBViewer (aka Deep View) is an extension of that concept with more bells and whistles. There are lots of analogous programs to all of these, but the ones above are the most popular.
Graphics programs like O and XtalView are also capable of protein representation, but they're usually not quite of the quality that one is aiming for.
Rules for a Good Picture
- Simple
- Uncluttered
- Four colors or less
A Simple Image in Molscript
Molscript is a molecular graphics program driven by a script file that tells the program what to do. You can get very creative with Molscript, but we'll leave that for a little later. If you want to try all of this on your own computer, you'll need to find out where Molscript is located (if you've got labcshrc sourced in your .cshrc it might already be in place) and you'll want this PDB file. It's the p53 core domain complexed with DNA. You can get your own by searching for PDB ID 1TSR at the RCSB site.
You'll need a PDB file in standard format, with an END statement at the end of the file or Molscript will whine about it. Here's a simple script file to render a backbone trace of the file "mystructure.pdb":
    plot
      read mol "mystructure.pdb";
      transform atom * by centre position atom *;
      trace from A94 to A289;
      trace from B96 to B289;
      trace from C95 to C289;
    end_plot
Download the file trace.in.
To run it, try:
    molscript -opengl < trace.in
and move it around with the mouse.
The syntax is fairly clear: the plot is enclosed within a plot ... end_plot pair, you read the PDB file in, you transform it by centering it based on all atoms, and you do a simple chain trace of the three polypeptide chains specified.
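The END requirement mentioned above is easy to check before you hand a file to Molscript. Here is a minimal Python sketch of that check. It is not part of Molscript; the function name ensure_end_record is made up for illustration, and it assumes the file is small enough to read into memory in one go.

```python
def ensure_end_record(pdb_text):
    """Return pdb_text with a final END record appended if it lacks one."""
    lines = pdb_text.splitlines()
    # Ignore trailing blank lines when looking for the last record.
    while lines and not lines[-1].strip():
        lines.pop()
    # The last record must be exactly END (ENDMDL does not count here).
    if not lines or lines[-1].split()[0] != "END":
        lines.append("END")
    return "\n".join(lines) + "\n"


print(ensure_end_record("REMARK demo file with no END record\n"), end="")
```

To use it on a real file you would read mystructure.pdb, pass the text through the function and write the result back out under a new name.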
The output format that Molscript chooses depends on the flags you give it at runtime, and to a certain extent on what that version was compiled to support. The usual flags are:
No flag | PostScript output |
-ps or -postscript | PostScript output |
-r or -raster3d | Raster3D format output |
-gl or -opengl | Output to graphics screen |
-vrml | VRML output |
-h | Print syntax to the screen |
Note that in OpenGL mode you can rotate the molecule on the screen with the left mouse button depressed and get the output view matrix with Right-Mouse (or CMD-Left-Mouse on OSX).
Molecular Cartoons
OK, that trace command isn't very pretty. You can quickly get something a
lot prettier using Molauto (also part of the Molscript package):
    molauto mystructure.pdb > molauto.in
    molscript -opengl < molauto.in
Download the file molauto.in.
Molauto produced rendering commands for all the secondary structure, CPK atoms for the Zincs, and a simple trace for DNA. Take a look at the actual command file in a text editor. Notice that all the components of the original trace.in are there (plot, end_plot, read, transform) but a whole new set of graphics commands are present:
    coil from A94 to A102;
    strand from A102 to A104;
    turn from A104 to A105;
    helix from A105 to A108;
    turn from A108 to A109;
    cpk in residue A1;
    double-helix nucleotides;
which define the secondary structure for the residues. Notice that (perhaps counterintuitively) adjacent elements start/end at the same residue - the Calpha atom is the boundary. Graphics commands helix and strand are obvious, but turn and coil are ways of specifying some general polypeptide - in the case of trace the plot runs through the Calpha atoms, but with coil it has more freedom and often looks better. The command cpk draws a space filling representation, in this case of the Zinc atom. The command double-helix gives a trivial representation of the DNA double strand. DNA/RNA are not Molscript's strong points.
Another gain here is the use of color:
    set planecolour hsb 0.6667 1 1;
here defined as HSB (Hue/Saturation/Brightness) but you're more likely to use:
    set planecolour red;
for primary colors, or
    set planecolour grey 0.5;
for various shades of gray, or
    set planecolour rgb 1.0 0.8 0.2;
when you get picky enough to quibble over the representation of yellow that Molscript uses. You can place color commands at any point in the script - Molscript uses the colors that have been defined up to that point. However please bear in mind that putting 10 different colors in one figure is pretty much guaranteed to confuse the viewer, no matter how sophisticated the coloring scheme.
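If you want to see which RGB color an HSB triplet corresponds to before replacing it, the conversion is a one-liner in Python. This is a small sketch that assumes Molscript's hsb values follow the usual HSV model with all three numbers on a 0 to 1 scale; I have not checked that against the Molscript source, so treat the result as approximate. The helper name hsb_to_rgb_statement is made up for illustration.

```python
import colorsys


def hsb_to_rgb_statement(hue, saturation, brightness):
    """Rewrite an hsb colour as the equivalent 'set planecolour rgb' line."""
    # colorsys implements the standard HSV model (assumed to match Molscript's hsb).
    red, green, blue = colorsys.hsv_to_rgb(hue, saturation, brightness)
    return "set planecolour rgb %.2f %.2f %.2f;" % (red, green, blue)


# The hsb value used in the example above comes out as pure blue.
print(hsb_to_rgb_statement(0.6667, 1, 1))
```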
Molauto tries to come up with a definition for the secondary structure to use in Molscript, but you are more than welcome to use your own - often as not I'll use the one given by the program DSSP, but you can use things like Procheck or just your own eyes. Often you have to tweak the boundaries of the secondary structure elements to make it look good. I haven't done such tweaking in these examples, and if you look closely there are a few areas where the definition could be improved for the purpose of esthetics.
We can alter molauto.in to suit our coloring schemes and get rid of the
HSB lines. Here's a version that just draws the B chain, DNA and one zinc
to show the so-called "consensus complex" in a single color.
    molscript -opengl < view1.in
Download the file view1.in.
But this is still in a "random orientation" and we want a specific view. So rotate the molecule with the mouse until you've got a view you like, then use the Right Mouse (or CMD-Left Mouse) to pull down the menu and select "Output View". This is the transformation matrix from the original viewpoint to the new viewpoint. You paste this in the transform section in the Molscript command file like so:
    transform atom *
      by centre position atom *
      by rotation
        -0.422066 0.826046 -0.373509
        -0.866381 -0.488837 -0.10209
        -0.266916 0.280513 0.921992
      ;
i.e. between the last of the transform commands and the terminating semi-colon. In fact it's a good idea to stick the ; on a line of its own. And then just to finesse it slightly I moved the molecule up a few Angstroms using a translation within transform.
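If you ever edit a view matrix by hand, or retype one from a printout, it is worth checking that it is still a pure rotation, since a mistyped digit will distort or mirror the molecule. The following Python sketch does that check for the matrix quoted above. The function name is_proper_rotation and the tolerance are my own choices, not anything Molscript provides.

```python
def is_proper_rotation(matrix, tol=1e-4):
    """matrix is a list of three rows of three floats."""
    # Rows must have unit length and be mutually perpendicular.
    for i in range(3):
        for j in range(3):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    # A determinant of +1 rules out a mirror image.
    m = matrix
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return abs(det - 1.0) <= tol


view = [[-0.422066, 0.826046, -0.373509],
        [-0.866381, -0.488837, -0.10209],
        [-0.266916, 0.280513, 0.921992]]
print(is_proper_rotation(view))
```

The tolerance is loose because Molscript prints the matrix to six decimal places.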
Using this modified transformation, the new view looks like:
    molscript -opengl < view2.in
Download the file view2.in.
It's not quite publication quality, but it's not too bad. However that DNA representation looks pretty bad. Because Molscript has rather brain-dead DNA representations, probably the best way to do it is to show all the atoms in ball-and-stick mode: replace "double-helix" with "ball-and-stick".
Here lies one of the subtleties of Molscript: double-helix is a residue graphics command and ball-and-stick is an atom graphics command and they use different selection specifiers. The selection nucleotides is a residue selection, for example. We can switch from one mode to the other using the word "in". So double-helix nucleotides (both command and selection are residue type) becomes ball-and-stick in nucleotides (command is atom type, selection is residue type).
Using this new representation, the new view looks like:
    molscript -opengl < view3.in
Download the file view3.in. Molscript is using the default atom colors for DNA, 1/4 of the default atom radii for ball-and-stick, but it's using the bond color from the parameter "planecolour grey 0.8" which is a fairly light gray - white tends to overwhelm the scene. Molscript lets you change all the colors, and the radii of the balls and sticks if you want. The defaults are sensible but there's considerable flexibility within the program.
And that's already good enough to use for a figure on a poster or a website or perhaps a lab talk. You'd want some labels (put on using Photoshop) for something flashier. If you want another viewpoint, for example looking down the DNA axis, you can simply re-orient the view with the mouse and output a new view matrix, or perhaps it's just easier to put a:
    by rotation x 90
line in the transform section. This flips the model 90 degrees around the horizontal axis. You'll still need to tweak it a little visually, but to a decent approximation you're already looking down the DNA axis. The tweaks include a new rotation matrix and also a vertical translation...
And now you have a top-down view instead:
    molscript -opengl < view4.in
Download the file view4.in. Notice how we have a cumulative set of transform commands, with each command acting on the current viewpoint. This makes transformations in Molscript fairly predictable. The X axis is horizontal, the Y axis is vertical and the Z axis points out of the screen at you. You should experiment with the sign of the angle, because I can never remember which direction is positive.
How I Made Those JPEGs
By now you might be wondering how I made the JPEGs I inlined above. One of the properties of Molscript is that it can output to more than just OpenGL. For example I can elect to output in "render" format and get the render program in Raster3D to draw the image for me. Raster3D may or may not be defined in /labcshrc. If in doubt, ask the system manager for the location of render. I used the following script:
    molscript -raster3d < view1.in > view1.dat
    render -tiff view1.tiff < view1.dat
    convert -resize 200 view1.tiff view1.jpg
which first uses Molscript to create an output file in render format that I call view1.dat. Then I use the render program from Raster3D to take that data and create a TIFF format file from it. Finally I use the convert program from the ImageMagick package to convert from TIFF format to JPEG format, while resizing to 200 pixels on each edge (the default size out of render is a little too large).
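If you have a whole series of views to convert, the same three steps can be wrapped in a short Python script. This is only a sketch. It assumes molscript, render and convert are all on your PATH, it uses exactly the flags shown above, and the function names image_commands and make_jpeg are made up for illustration.

```python
import subprocess


def image_commands(stem, width=200):
    """Return (argv, stdin file, stdout file) for each step of stem.in -> stem.jpg."""
    return [
        (["molscript", "-raster3d"], stem + ".in", stem + ".dat"),
        (["render", "-tiff", stem + ".tiff"], stem + ".dat", None),
        (["convert", "-resize", str(width), stem + ".tiff", stem + ".jpg"], None, None),
    ]


def make_jpeg(stem, width=200):
    """Run the three steps in order, stopping at the first failure."""
    for argv, stdin_name, stdout_name in image_commands(stem, width):
        stdin = open(stdin_name, "rb") if stdin_name else None
        stdout = open(stdout_name, "wb") if stdout_name else None
        try:
            subprocess.run(argv, stdin=stdin, stdout=stdout, check=True)
        finally:
            for handle in (stdin, stdout):
                if handle:
                    handle.close()
```

Calling make_jpeg("view1") in a directory containing view1.in should then leave view1.dat, view1.tiff and view1.jpg behind, just as the hand-typed commands do.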
Showing Molecular Interactions
So much for the cartoons - usually you want to show a little more detail in your structure than just an overall view displayed distantly. Let's say we want to show the four ligands complexing the Zinc atom. To do this we need to carefully control the viewpoint, translation, zoom and slab within Molscript as well as the regular graphics controls. It isn't really all that hard, but requires a little more attention to detail.
By default, Molscript sets the parameters window and slab to cover the molecule you've drawn with a little to spare. It reports these values when you run the program:
    setting window to 87.96
    setting slab to 77.06
for molauto.in. You can put these values explicitly in the script file before the read command. If you reduce the window, you zoom in, and if you increase the window you zoom out (the parameter defines the size of the square window in Angstroms). The slab command defines the thickness of the slab that clips the graphics elements front and rear.
You'll also want to center on a specific feature. I'll build an image of the Zn site in p53, so we might as well center on the zinc itself. The coordinates for the Zn are (60.110 17.988 75.931) from the PDB file, so we change the transform command from:
    transform atom * by centre position atom *
to
    transform atom * by translation -60.1 -18.0 -75.9
Note that we use the negative of the Zn coordinates so that it ends up at (0,0,0). Now the viewpoint rotates around this Zn. I'll also decrease the window to 30.
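Rather than copying the coordinates by eye, you can pull them out of the PDB file and negate them with a few lines of Python. This is a sketch that relies on the standard fixed-column PDB layout. The function name centre_on_atom is made up, and the demo record is synthetic: it is built from the coordinates quoted above, with the chain and residue number (B1) assumed from the zinc selections used elsewhere in this guide.

```python
def centre_on_atom(pdb_lines, atom_name, chain, resseq):
    """Return a 'by translation' clause that moves the chosen atom to the origin."""
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed PDB columns (1-based): name 13-16, chain 22, residue number 23-26,
        # x 31-38, y 39-46, z 47-54.
        if (line[12:16].strip() == atom_name and line[21] == chain
                and int(line[22:26]) == resseq):
            x, y, z = float(line[30:38]), float(line[38:46]), float(line[46:54])
            return "by translation %.1f %.1f %.1f" % (-x, -y, -z)
    raise ValueError("atom not found")


# Synthetic HETATM record; the serial number 1 is a placeholder.
demo = "HETATM{:5d} {:<4s} {:>3s} {:1s}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, "ZN", "ZN", "B", 1, 60.110, 17.988, 75.931)
print(centre_on_atom([demo], "ZN", "B", 1))
```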
This revised viewpoint looks like this:
    molscript -opengl < view5.in
Download the file view5.in. But this figure has very obvious deficiencies - we need to zoom in some more, we should get rid of the DNA, and we'll want to change the viewpoint so we can actually see the Zinc!
In addition, we want to draw the Zn ligands. Just like with the DNA, we can do this using the ball-and-stick commands. However in this case we are just interested in a few select residues, so we could use:
    ball-and-stick in residue B242;
but this will draw the whole residue including the peptide backbone. A far neater way is to draw just the Calpha atom and the side chain. This means we need to use a couple of Molscript's control structures, require and either, and we'll nest them within each other. The syntax for require is:
    require item1, item2 and item3
where all three items must be present. There can be two or more items; you just keep appending more of them after item1, comma-separated. The either statement looks similar:
    either item1, item2 or item3
Again with two or more items, but in this case if any of the items are selected the statement is true. To select everything except the main-chain we can use the negative of the either statement:
    not either atom O, atom C or atom N
(these are atom names, not element types). But in order to get both this selection and a specific residue we need to put the either command within a require command:
    ball-and-stick require not either atom O, atom C or atom N
                       and in residue B242 ;
where I've used a little indenting to make the structure clearer. Ball-and-stick is an atom command, so for atom selections (e.g. atom atom_name) we don't need an in statement, but in the case of residue selections (e.g. residue residue_name) you do need the in to flip between atom and residue mode.
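The comma/and/or pattern gets tedious to type once you have several residues, so here is a small Python sketch that assembles the nested statement as text. It only glues strings together following the pattern described above and does not validate anything against Molscript. The helper names join_items, either, require and sidechain_sticks are all made up for illustration.

```python
def join_items(items, last_word):
    """Join selections as 'a, b <last_word> c', the pattern used by require/either."""
    if len(items) < 2:
        raise ValueError("require/either need at least two items")
    return ", ".join(items[:-1]) + " " + last_word + " " + items[-1]


def either(items):
    return "either " + join_items(items, "or")


def require(items):
    return "require " + join_items(items, "and")


def sidechain_sticks(residue):
    """Ball-and-stick for one residue, leaving out the main-chain O, C and N."""
    backbone = either(["atom O", "atom C", "atom N"])
    return "ball-and-stick " + require(["not " + backbone, "in residue " + residue]) + ";"


print(sidechain_sticks("B242"))
```

Looping sidechain_sticks over a list of residue names gives one statement per residue, which you can paste into the script file.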
In this latest figure I've added the four Zinc ligands, zoomed in
still more, got rid of the DNA and changed the viewpoint to make it
reveal the Zinc more clearly:
    molscript -opengl < view6.in
Download the file view6.in. Again, the atoms have the default atom colors but we've defined the stick color (planecolour) to be light gray (grey 0.8). Things are improving but the Zn (rendered using the cpk command) looks a little too big - we could make it as small as the other atoms by rendering it ball-and-stick but an alternative way is to set the radius directly. By default it is 1.7 so we'll use about half of it:
    set atomradius atom ZN* 0.8;
Also, if you look at the Calpha atom of the Cys side-chain at the bottom, you'll see that the backbone coil of this residue does not pass through the Calpha. This is because the coil rendering command does quite a lot of smoothing. The alternative, the turn command is guaranteed to go through the Calpha atoms so we change coil from B237 to B239; into turn from B237 to B239;. If we have problems with helices or strands, we don't have alternatives, however. Turns often don't look quite as pretty as coils, so use sparingly.
Finally, I'll change two more things: I'll set plane2colour to give the inner surface of the helices a darker cyan appearance (set plane2colour rgb 0. 0.3 0.3;). Then we should put in some bonds between the Zinc ligands and the Zinc itself. It's possible to do this with the command:
    ball-and-stick require not either atom O, atom C or atom N
                       and in either residue B176, residue B179, residue B238,
                                     residue B242 or residue B1 ;
but this requires us to redefine bonddistance to a larger number to get the bonds to draw (set bonddistance 2.2), and typically causes problems with spurious bonds being drawn in (e.g.) Histidine rings. An alternative method is to specify explicit lines to be drawn between atoms. These lines are controlled by the parameters linewidth, linecolour etc. so some tinkering might be needed to make them look right. For example:
    set linecolour grey 0.8, linewidth 2.5;
    line position require atom SG and in residue B176
      to position require atom ZN and in residue B1;
The "best" value of the linewidth parameter varies a lot from figure to figure.
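You need one line statement per ligand, and they all follow the same template, so they can be generated. The Python sketch below does this. Only the B176 SG entry comes from the example above; the other atom names in the demo list (SG for B238 and B242, ND1 for B179) are my assumptions about which atoms contact the zinc, so check them against your own PDB file. The function name ligand_lines is made up for illustration.

```python
def ligand_lines(ligands, metal_atom="ZN", metal_residue="B1"):
    """ligands: list of (residue, atom_name) pairs to connect to the metal."""
    template = ("line position require atom {atom} and in residue {res} "
                "to position require atom {metal} and in residue {mres};")
    return [template.format(atom=atom, res=res, metal=metal_atom, mres=metal_residue)
            for res, atom in ligands]


# Atom names other than B176 SG are assumptions, not taken from the text.
demo_ligands = [("B176", "SG"), ("B179", "ND1"), ("B238", "SG"), ("B242", "SG")]
for statement in ligand_lines(demo_ligands):
    print(statement)
```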
So with as many optimizations as we can be bothered to add, we have:
    molscript -opengl < view7.in
Download the file view7.in. If you add the linedash parameter and a different color, the line command is excellent for drawing hydrogen bonds (perhaps in an eye-catching red in this figure).
A Recap of Drawing Commands
The plot is encapsulated in plot ... end_plot statements.
The window and slab statements control window size/thickness.
Use transform, by rotation, by translation to change viewpoint.
Use coil, turn, helix, strand to draw secondary structure.
Use ball-and-stick and cpk for atom-by-atom drawings.
Use line for explicit bonds and hydrogen bonds.
Use planecolour, plane2colour, atomcolour to change atom/residue colors.
Drawing Superimposed Molecules in Molscript
You might sometimes want to draw superimpositions of different molecules within Molscript. The superimpositions have to be done with other programs (e.g. LSQMAN) - Molscript won't do these for you. However in order to get the secondary structure to draw correctly, you need to be able to refer to the molecules separately.
One cheap-and-nasty way to do this is to give the different molecules different chain labels or numbering and have them in the same PDB file. This certainly works, and I've done that often enough. However a better way to do this is to read multiple PDB files into different molecules in the header, and then refer to these different molecules throughout the drawing process explicitly using require molecule mol and require molecule mol2 constructions.
E.g.:
    plot
      read mol "mystructure.pdb";
      read mol2 "mysuperimposedstructure.pdb";
      transform atom * by translation -60.1 -18.0 -75.9;
      set planecolour cyan, plane2colour rgb 0. 0.3 0.3;
      coil require molecule mol and from B96 to B102;
      strand require molecule mol and from B102 to B104;
      turn require molecule mol and from B104 to B105;
      helix require molecule mol and from B105 to B108;
      turn require molecule mol and from B108 to B109;
      strand require molecule mol and from B109 to B113;
      coil require molecule mol and from B113 to B123;
      strand require molecule mol and from B123 to B126;
      set planecolour red, plane2colour yellow;
      coil require molecule mol2 and from A96 to A102;
      strand require molecule mol2 and from A102 to A104;
      turn require molecule mol2 and from A104 to A105;
      helix require molecule mol2 and from A105 to A108;
      turn require molecule mol2 and from A108 to A109;
      strand require molecule mol2 and from A109 to A113;
      coil require molecule mol2 and from A113 to A123;
      strand require molecule mol2 and from A123 to A126;
      set atomradius atom ZN* 0.8;
      cpk in require molecule mol and residue B1;
    end_plot
I haven't made an example of this yet, but it should be pretty obvious what I mean by all this, if you've made it this far in the tutorial.
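Since the secondary structure statements for the two molecules differ only in the molecule name and chain letter, they are a natural thing to generate rather than type. The Python sketch below writes them out from one list of elements. It assumes both molecules share the same residue numbering, which is only true if your superimposed structures are numbered alike. The function name draw_elements is made up for illustration.

```python
def draw_elements(molecule, chain, elements):
    """elements: list of (command, first, last), e.g. ("helix", 105, 108)."""
    return ["{cmd} require molecule {mol} and from {c}{a} to {c}{b};".format(
                cmd=cmd, mol=molecule, c=chain, a=first, b=last)
            for cmd, first, last in elements]


# The first few elements from the script above.
elements = [("coil", 96, 102), ("strand", 102, 104),
            ("turn", 104, 105), ("helix", 105, 108)]
for statement in draw_elements("mol", "B", elements) + draw_elements("mol2", "A", elements):
    print(statement)
```

You would still put the set planecolour lines between the two groups by hand, so that each molecule gets its own color.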
Notes on use of Photoshop
Photoshop 5 used to be really tedious to use when adding labels, but the most recent versions (7, 8=CS, 9=CS2) put each label in a new layer, allowing you to easily delete unwanted or misplaced labels. Just remember to flatten the image once you have the final version (using Layer->Flatten Image), although you can save intermediate versions with the layers intact in Photoshop format.
To change the "brightness" on an image with a white background, use Image->Adjustments->Brightness and Contrast and change the Brightness. To change the brightness on an image with a black background use the Contrast adjustment, because otherwise you will make the black background gray. For more advanced modifications, some knowledge of Photoshop is required, although Image->Adjustments->Levels is pretty versatile. You can also modify existing colors in an image using Image->Adjustments->Replace Color, although you can get yourself into plenty of trouble when using this tool. I'm moderately experienced with Photoshop, if you have questions (I use it a lot for my own digital photography hobby).
Most of the time, if you want to replace colors, I recommend just remaking the image.
Variations Upon a Theme: Molscript Alternatives
Molscript v2.1.2 was last issued on 12th January 1999. There's no indication that there will be another version. However there have been a variety of Molscript-derived variants created since then.
POVScript+ adds the capability of drawing the following items:
- electron density maps and other volumetric data
- GRASP surface files
- isotropic and anisotropic temperature factors
- ability to render files using POVray
BobScript is a variant of an older version of Molscript, v1.4, that implements the following things in addition to traditional Molscript items:
- electron density maps and masks
Created: June 2005 by Phil Jeffrey.
Revised: June 2006 by Phil Jeffrey.