File formats
Revision as of 19:26, 9 April 2009
(An attempt to compile information on file format specifications. It's not complete yet.)
Jmol example/test data files in all formats accepted.
Chemical file formats on Wikipedia.
File formats on Open Babel.
Nice description of 3D Geometry & Modeling Formats by Wolfram Mathematica.
Coordinates of molecule
MOL and SD (Symyx MDL)
MOL = MDL molfile
SD = SDF = Structure Data Format
Jmol reads MOL and SD files (and can write MOL files under some circumstances). Originally from Molecular Design Limited, then Elsevier MDL, now Symyx Technologies; widely adopted by many other programs. Contains atom coordinates and bonds. Limited to fewer than 1000 atoms.
MOL header lines:
- The first line is reserved for the molecule name; Jmol shows it in the pop-up menu.
- The second line is in principle reserved for information on the originating program, user, etc. (Jmol ignores this line).
- The third line is for comments, and may contain an inline script introduced by the jmolscript: tag.
SD files share the MOL format but may contain several structures (separated by lines with $$$$), which will be read by Jmol as multiple models or frames.
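As a rough illustration of the layout described above, the following sketch splits the text of an SD file at its $$$$ lines and picks out the three header lines of each record. The function names splitSdRecords and readMolHeader are hypothetical, and this is not how Jmol itself reads these files.

```javascript
// Sketch only: split SD text into MOL records at lines containing "$$$$".
function splitSdRecords(text) {
  const records = [];
  let current = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === "$$$$") {
      records.push(current);
      current = [];
    } else {
      current.push(line);
    }
  }
  // A plain MOL file has no closing "$$$$" line, so keep any leftover lines.
  if (current.some((l) => l.trim() !== "")) records.push(current);
  return records.map(readMolHeader);
}

// The first three lines of a record are name, program/user info and comment.
function readMolHeader(lines) {
  const [name = "", program = "", comment = ""] = lines;
  return {
    name: name.trim(),
    program: program.trim(),
    comment: comment.trim(),
    body: lines.slice(3), // counts line, atom block, bond block, etc.
  };
}
```

Each returned object corresponds to what Jmol would load as one model or frame.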
Official document (PDF): http://www.mdl.com/downloads/public/ctfile/ctfile.pdf, copied here.
Some extra information on SD files at US EPA DSSTox.
The newer format, V3000 (extended molfile or extended connection table), applies to both MOL and SDF, does not have the 1000-atom limit, and is also supported by Jmol.
MOL2 (Sybyl, Tripos)
Jmol reads MOL2 files. Originally from Tripos. Contains atom coordinates, bonds, partial charges and substructure information.
Official document: http://www.tripos.com/data/support/mol2.pdf
PDB
Jmol reads PDB files. Contains atom coordinates and information on biomolecular residues, sequence, chains, hydrogen and disulfide bonds, secondary structure, biologically relevant sites, cofactors. Can also contain temperature factor, formal charge, element symbol, alternate locations.
(Official Protein Data Bank document) Atomic Coordinate Entry Format. Description: http://www.wwpdb.org/documentation/format23/v2.3.html
XYZ
Jmol reads XYZ files. Originally from XMol package, but has been widely adopted by many other programs. Contains only atom coordinates (no bonds) and, optionally, charges and vectors (e.g. for atom vibration). Supports multi-model data (multi-frame, animations).
XYZ header lines:
- The first line is reserved for the number of atoms.
- The second line is for comments, and may contain an inline script introduced by the jmolscript: tag (see "Script inline within a molecular coordinates file").
Example by Paul Bourke.
Description:
- XYZ datafiles specify molecular geometries using a Cartesian coordinate system. This simple, stripped-down, ASCII-readable format is intended to serve as a "transition" format for the XMol series of applications. For example, suppose a molecular datafile was in a format not supported by XMol. In order to read the data into XMol, it would be possible to modify the datafile, perhaps by creating a shell script, so that it fit the relatively lenient requirements of the XYZ format specification. Once data is in XYZ format, it may be examined by XMol, or converted to yet another format.
- The XYZ format supports multi-step datasets. Each step is represented by a two-line "header," followed by one line for each atom. The first line of a step's header is the number of atoms in that step. This integer may be preceded by whitespace; anything on the line after the integer is ignored. The second line of the header leaves room for a descriptive string. This line may be blank, or it may contain some information pertinent to that particular step, but it must exist, and it must be just one line long. Each line of text describing a single atom must contain at least four fields of information, separated by whitespace: the atom's type (a short string of alphanumeric characters), and its x-, y-, and z-positions. Optionally, extra fields may be used to specify a charge for the atom, and/or a vector associated with the atom. If an input line contains five or eight fields, the fifth field is interpreted as the atom's charge; otherwise, a charge of zero is assumed. If an input line contains seven or eight fields, the last three fields are interpreted as the components of a vector. These components should be specified in angstroms.
- Note that the XYZ format doesn't contain connectivity information. This intentional omission allows for greater flexibility: to create an XYZ file, you don't need to know where a molecule's bonds are; you just need to know where its atoms are. Connectivity information is generated automatically for XYZ files as they are read into XMol-related applications. Briefly, if the distance between two atoms is less than the sum of their covalent radii, they are considered bonded.
- Source: man page for XYZ (part of XMol), quoted at http://www.ccl.net/chemistry/resources/messages/1996/10/21.005-dir/index.html
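The rules quoted from the XMol man page can be sketched as a small parser. This is only an illustration of the quoted description, not code from XMol or Jmol; the names parseXyzSteps and parseXmolAtomLine are hypothetical, and skipping blank lines between steps is an added assumption.

```javascript
// Sketch only: read multi-step XYZ text following the XMol description.
function parseXyzSteps(text) {
  const lines = text.split(/\r?\n/);
  const steps = [];
  let i = 0;
  while (i < lines.length) {
    if (lines[i].trim() === "") { i++; continue; } // assumption: skip blank lines between steps
    // parseInt skips leading whitespace and ignores anything after the integer.
    const count = parseInt(lines[i], 10);
    if (Number.isNaN(count) || count < 0) throw new Error("bad atom count at line " + (i + 1));
    const comment = lines[i + 1] !== undefined ? lines[i + 1] : "";
    const atoms = [];
    for (let k = 0; k < count; k++) atoms.push(parseXmolAtomLine(lines[i + 2 + k]));
    steps.push({ comment: comment, atoms: atoms });
    i += 2 + count;
  }
  return steps;
}

// 5 or 8 fields: the fifth is the charge. 7 or 8 fields: the last three are a vector.
function parseXmolAtomLine(line) {
  if (line === undefined) throw new Error("file ends before all atoms were read");
  const f = line.trim().split(/\s+/);
  if (f.length < 4) throw new Error("atom line needs at least four fields");
  const atom = { type: f[0], x: Number(f[1]), y: Number(f[2]), z: Number(f[3]), charge: 0, vector: null };
  if (f.length === 5 || f.length === 8) atom.charge = Number(f[4]);
  if (f.length === 7 || f.length === 8) atom.vector = f.slice(-3).map(Number);
  return atom;
}
```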
The XYZ reader in Jmol reads any of the following (updated for Jmol v. 11.4.5 and 11.5.41):
Sym x y z
Sym x y z vibX vibY vibZ
Sym x y z FormalCharge(integer)
Sym x y z FormalCharge(integer) vibX vibY vibZ
Sym x y z PartialCharge(decimal)
Sym x y z PartialCharge(decimal) vibX vibY vibZ
- where Sym is either an element symbol (C, Fe, Si) or an element symbol preceded by an isotope number (2H, 13C, etc.)
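To make the six accepted line layouts concrete, here is a minimal sketch of how such a line could be interpreted. It assumes that a charge written without a decimal point is a formal charge and any other number is a partial charge; the actual Jmol reader may decide this differently. The function name parseJmolXyzLine is hypothetical.

```javascript
// Sketch only: interpret one atom line in any of the six layouts listed above.
function parseJmolXyzLine(line) {
  const f = line.trim().split(/\s+/);
  if (![4, 5, 7, 8].includes(f.length)) return null;
  // Optional isotope number followed by an element symbol, e.g. "13C" or "Fe".
  const sym = /^(\d+)?([A-Z][a-z]?)$/.exec(f[0]);
  if (!sym) return null;
  const atom = {
    element: sym[2],
    isotope: sym[1] ? Number(sym[1]) : null,
    xyz: f.slice(1, 4).map(Number),
    formalCharge: null,
    partialCharge: null,
    vib: null,
  };
  if (f.length === 5 || f.length === 8) {
    // Assumption: integer text means formal charge, decimal text means partial charge.
    if (/^[+-]?\d+$/.test(f[4])) atom.formalCharge = parseInt(f[4], 10);
    else atom.partialCharge = parseFloat(f[4]);
  }
  if (f.length >= 7) atom.vib = f.slice(-3).map(Number);
  return atom;
}
```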
CIF
Jmol reads CIF files. Crystallographic Information File, the official format from the International Union of Crystallography. Original documentation, Acta Cryst. (1991). A47, 655-685, and 2003 update.
CIF files may contain, on any line, an inline script introduced by the jmolscript: tag.
mmCIF
Jmol reads mmCIF files. Macromolecular Crystallographic Information File, an expanded format to cope with macromolecules. Official documentation.
Alchemy (Tripos)
Jmol has only partial support for Alchemy and Alchemy2000 files: a simple Alchemy reader is implemented starting with Jmol 11.7.18. Alchemy example and Alchemy2000 description by Paul Bourke.
A complete specification of these formats is needed to fully implement the reader. If you have those details, please contact the development team.
GAMESS
Jmol reads GAMESS files (General Atomic and Molecular Electronic Structure System, by Gordon research group at Iowa State University).
Gaussian
Jmol reads only the output format. There are example files of Gaussian input, output and log.
Recent versions of Jmol application can also export to files in Gaussian input format.
Cube (Gaussian)
Jmol reads Cube files, which originate from the Gaussian software (Gaussian website).
Description of Cube Input and Cube Output formats: http://www.nersc.gov/nusers/resources/software/apps/chemistry/gaussian/g98/00000430.htm
Description by Paul Bourke.
GROMACS
This is not read by Jmol, but might be supported in the future.
File format is called gro or Gromos87. Usual extension is .gro
Description of the format.
You can convert from gro to pdb using the "editconf" program, which is a part of the GROMACS package that can be run from the command line:
editconf -f whatever.gro -o whatever.pdb
HIV (Hyperchem)
Jmol reads HIV files, the native format of Hyperchem, software sold by Hypercube Inc.
Example by Paul Bourke, and other example files.
MOPAC
Jmol reads mopout output files from MOPAC and the new graphf output from MOPAC2007 (.mgf files), which contains coordinates, charges, and molecular orbitals.
openMOPAC, Molecular Orbital PACkage, public domain.
PQR
Jmol (11.1.30 or later) reads pqr files.
PQR is a format based on pdb, where the occupancy is replaced with the atomic charge and the temperature (or B factor) is replaced with the atomic radius (however, the column positions in many pqr files do not match those of pdb files). This gives the acronym: P for pdb, Q for charge, R for radius. Jmol interprets the charge values (property partialcharge) and the radii (property vanderwaals), and can hence use them e.g. in color atoms partialCharge and spacefill.
The PQR format has somewhat uncertain origins, but is used by several computational biology packages, including MEAD, AutoDock and APBS, for which it is the primary input format.
PQR format description within APBS documentation. Note that APBS reads PQR loosely, based only on white space delimiters, but Jmol may be more strict about column positions.
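The loose, whitespace-delimited reading mentioned above can be sketched as follows. The sketch assumes the field order given in the APBS documentation (record name, atom number, atom name, residue name, optional chain ID, residue number, x, y, z, charge, radius) and reads the numeric fields from the end of the line, so that it works with or without a chain ID. The function name parsePqrLine is hypothetical, and this is not Jmol's reader.

```javascript
// Sketch only: whitespace-based reading of one PQR atom record.
function parsePqrLine(line) {
  const f = line.trim().split(/\s+/);
  if (f[0] !== "ATOM" && f[0] !== "HETATM") return null;
  if (f.length < 10) return null;
  // The last five fields are x, y, z, charge (Q) and radius (R).
  const [x, y, z, charge, radius] = f.slice(-5).map(Number);
  return {
    serial: Number(f[1]),
    atomName: f[2],
    residueName: f[3],
    x: x, y: y, z: z,
    charge: charge,
    radius: radius,
  };
}
```

Reading from the end of the line avoids having to decide whether the fifth field is a chain ID or a residue number.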
PDB files can be converted to PQR by the PDB2PQR software, which adds missing hydrogen atoms and calculates the charge and radius parameters from a variety of force fields.
Amber
Jmol (11.7 or later) reads molecular dynamics output files from Amber. The fileset must have a structure like:
1 (topology file) + n (coordinate files)
The filter option of the load command can be used, as well as a new option to allow selective "first,last,step" loading of coordinate trajectories.
(This is preliminary and needs testing)
Images
Saving images from Jmol application
Images (snapshots) of Jmol's viewport, including the model in the current rendering, can be saved by using the application's top menu bar: File > Export > Export Image. A dialog will open to allow choosing the location and filename, as well as the format in the Image Type drop-down list; choose among JPEG, PNG, PPM or GIF formats. The menu also has File > Export > Render in POV-Ray, which produces images that can be displayed and edited in this raytracing program.
- Note: GIF format is only available in recent versions of Jmol.
Information on these image formats and POV-Ray on Wikipedia.
Saving images from Jmol application and signed applet
In these cases (not in the normal, unsigned applet), the pop-up menu also allows saving a snapshot of the model in JPEG, PNG or POV-Ray (raytracing) formats. Open the pop-up menu (right-click, or Ctrl+click, or click on the Jmol frank) and choose Save. A dialog will open to choose the location and filename.
Saving images from Jmol applet
These methods work for both the signed and the unsigned applet, but for the signed applet the method above is more convenient.
Basic method
Note: This only works for certain browsers, so it is not a general solution for the general user. In particular, it does not work in MSIE. In Firefox, it may fail for large applet sizes.
The command from the console or via JavaScript is getProperty image.
The result is a base-64 encoded JPEG image that looks like this:
/9j/4AAQSkZJRgABAAAAAQABAAD//gBBSl...
Pop that into an <img> tag using Firefox or Opera (not MSIE), and you have an image that you can copy into your clipboard and do anything you would do with a JPG. The JavaScript required to create the tag looks like this:
var myImage = jmolGetPropertyAsString("image");
document.getElementById("someDiv").innerHTML = '<img src="data:image/jpeg;base64,' + myImage + '">';
Example: Bob Hanson's examples page (Just under the applet, click on the word "image".)
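Since this method fails in some browsers, it may help to check the returned string before inserting it into the page. The following sketch builds the tag only when the string looks like a base64-encoded JPEG (the JPEG signature bytes FF D8 FF encode to "/9j/", as in the sample above). The helper name imageTagFromBase64 is hypothetical and is not part of Jmol.js.

```javascript
// Sketch only: build the <img> markup if the string looks like a base64 JPEG.
function imageTagFromBase64(base64) {
  // Remove any whitespace or line breaks the encoder may have inserted.
  const data = String(base64).replace(/\s+/g, "");
  // A JPEG starts with bytes FF D8 FF, which base64-encode to "/9j/".
  if (!data.startsWith("/9j/")) return null;
  // Reject characters outside the base64 alphabet.
  if (!/^[A-Za-z0-9+/]+=*$/.test(data)) return null;
  return '<img src="data:image/jpeg;base64,' + data + '">';
}

// Possible use in a page that has the Jmol applet and Jmol.js loaded:
//   var tag = imageTagFromBase64(jmolGetPropertyAsString("image"));
//   if (tag) document.getElementById("someDiv").innerHTML = tag;
```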
General method, using Perl
This works in any browser. The author must have access to the web server and be able to use a Perl script there.
It sends the "base64" encoded image (described above) to a Perl script on the server. This script uses the MIME::Base64 package to decode the image (MIME::Base64::decode($data)) and sends it back to the browser, embedded in an HTML page.
Example code:
HTML file and Javascript (client side):
<html>
<head>
<script type="text/javascript">
function get_snapshot() {
  var BI = document.getElementById("bounce_image");
  var BI_D = document.getElementById("IMAGE_DATA");
  var BASE64 = jmolGetPropertyAsString("image");
  BI_D.value = BASE64;
  BI.submit();
}
</script>
</head>
<body>
<script type="text/javascript">
  jmolInitialize("jmol/");
  jmolApplet(350, "load something.pdb");
</script>
<input type="button" id="snapshot" value="snapshot" onclick='get_snapshot()'>
<form id="bounce_image" action="http://MY.SERVER.COM/cgi-bin/decode_snapshot.pl" method="post" target="_blank">
  <input type="hidden" id="IMAGE_DATA" name="IMAGE_DATA" value="empty">
</form>
</body>
</html>
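Perl script (server side) file 'decode_snapshot.pl':
#!/usr/bin/perl
use MIME::Base64;

# Send binary data unchanged (matters on servers that translate line ends).
binmode(STDOUT);
print "Content-type: image/jpeg\n\n";

# Read the POSTed form data and split it into name=value pairs.
%postFields = ();
read( STDIN, $tmpStr, $ENV{ "CONTENT_LENGTH" } );
@parts = split( /\&/, $tmpStr );
foreach (@parts) {
  # Split before URL-decoding, and only at the first "=", so that the
  # "=" padding of the base64 data is not mistaken for a separator.
  ( $name, $value ) = split( /\=/, $_, 2 );
  $value =~ s/%([0-9A-Fa-f][0-9A-Fa-f])/pack("C",hex($1))/ge;
  $postFields{ "$name" } = $value;
}

# Decode the image, keep a copy on the server, and send it to the browser.
$decoded = decode_base64( $postFields{"IMAGE_DATA"} );
open (MYFILE, '>path_to_file/jmol_snapshot.jpg');
binmode(MYFILE);
print MYFILE $decoded;
close (MYFILE);
print $decoded;
exit;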
It is also possible to add additional information to the page containing the generated image (like in the example below).
Example: http://www.fli-leibniz.de/cgi-bin/3d_mapping.pl?CODE=1deh (click on the "snapshot" button in the "Graphics Window" section).
General method, using just HTML, and PHP if necessary
This works in any browser. The server must support PHP, but no configuration is needed.
- No access to server configuration is needed (i.e., no need to install server-side scripts).
- If the browser is MS Internet Explorer, it needs online access to a PHP-enabled server that will return the image using a PHP page which is part of this package.
- With other browsers, inline base64-encoded images are used first (understood by Firefox and Opera, at least). This method works both online and off-line (e.g., from hard disk, USB disk or CDROM).
The image is opened in a new window that fits the image size. It can then be copied to the clipboard using the browser's pop-up menu on it. Some browsers also allow saving the image to disk from the same pop-up menu.
Example and downloadable kit: http://biomodel.uah.es/Jmol/export-image/
Surfaces
JVXL (Jmol Voxel)
Jmol reads and writes JVXL files.
This format, unique to Jmol, stores isosurface data in a highly compressed form.
Documented at http://www.stolaf.edu/academics/chemapps/jmol/docs/misc/JVXL-format.pdf
Pmesh
Jmol reads pmesh files for rendering pmesh surfaces, using not the load command but the pmesh command.
Description.
Cube (Gaussian)
See above.
Open DX
Jmol (11.1.18 or later) reads DX files for rendering isosurfaces and color mapping. DX contains three-dimensional scalar data; most frequently, isosurface and color by molecular electrostatic potential (MEP).
(Not to be confused with JCAMP-DX format, used for spectral vibrational data, which can be shown using JSpecView Applet and MDL Chime.)
Open DX files are produced, among others, by APBS (Adaptive Poisson-Boltzmann Solver). APBS exists in standalone and web-server-based versions, and as a plug-in for PyMOL. There is also an APBS web service integrated into Gemstone, which is a front-end GUI that facilitates the access to computational services run at dedicated web servers (follow the Gemstone tutorial).
Technical:
- DX, like Cube, defines a three-dimensional grid of points in space. At each point is a number (a "scalar value"). This set of point values is then used by Jmol to define an "isosurface": the surface separating points having values greater than a given cutoff from those having values less than that cutoff. A typical application is molecular orbitals. The phase parameter of the isosurface command allows bicolor rendering: one color for "points greater than x" and another color for "points less than -x".
- Jmol can read DX files and re-export them to much smaller JVXL files.
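The cutoff idea can be illustrated with a few lines of code. This is a conceptual sketch only, not Jmol's isosurface algorithm; the function names classifyGridValues and crossingFraction are hypothetical, and the crossing position assumes simple linear interpolation between two neighbouring grid points.

```javascript
// Sketch only: label each grid value relative to the cutoff.
// With phase enabled, values below -cutoff form a second, differently colored lobe.
function classifyGridValues(values, cutoff, phase) {
  return values.map(function (v) {
    if (v > cutoff) return "positive";
    if (phase && v < -cutoff) return "negative";
    return "none";
  });
}

// Fraction of the way from grid point A to neighbouring grid point B at which
// the linearly interpolated value equals the cutoff; null if the two values
// are not strictly on opposite sides of the cutoff.
function crossingFraction(valueA, valueB, cutoff) {
  if ((valueA - cutoff) * (valueB - cutoff) >= 0) return null;
  return (cutoff - valueA) / (valueB - valueA);
}
```

The surface passes between two neighbouring points whenever crossingFraction returns a number rather than null.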
Generally a MEP data set is not used for the isosurface itself. Rather, it is used to map color onto another isosurface, usually some representation of the "molecular surface".
The bottom line is that molecular electrostatic potential data generated in PyMOL (requires the APBS plugin for PyMOL), in APBS or in Gemstone can now be used to color a surface generated in Jmol.
Documentation:
- Jmol interpretation of DX files and re-exporting into much smaller JVXL files
- DX generation of MEP data by APBS
- OpenDX specification and software package
Efvet
Jmol (version 11.7.12 or later; support is preliminary) reads efvet files for rendering isosurfaces and color mapping. Efvet is an XML file format used in eF-site, a database of molecular surfaces colored by electrostatic potential that covers the whole PDB.
The efvet file contains geometric information and coloring attributes of the molecular surface in the form of a set of polygons. Electrostatic potentials and hydrophobic properties are described together by the surface color: red --> blue colors correspond to negative --> positive electrostatic potentials, and yellow color indicates the surface of the hydrophobic residues.
3D objects
Recent versions of Jmol application and signed applet can export models to several formats that specify three-dimensional objects and can be read by specialized software, either raytracing or 3D-world.
Documentation and plans for Jmol support of 3D object formats: VRML, X3D, U3D.
POV-Ray
Jmol can export, with limited features, the current view of a model into POV-Ray format.
VRML
Jmol can export, with limited features, the model into Virtual Reality Modeling Language, VRML.
- Jmol exports only atoms (as spheres), bonds (as cylinders) and isosurfaces (as IndexedFaceSets, single color). Translucency will be supported soon. No dots, labels, cartoons...
- The atom size and color, bond thickness and color, are preserved. The orientation and zoom are not always kept.
- Depending on your VRML viewer or plug-in, you may need to change the file to a .wrl extension.
The vrml file can then be opened using several programs and browser plugins, and manipulated in 3D using the mouse.
Maya
Jmol can export, with limited features, the current view of a model into Maya format. Wikipedia info on Maya
OBJ
Jmol (starting v. 11.7.28) can read files in the obj file format. This format is generated by Wavefront, Java3D and PyMOL. The described objects are handled as isosurfaces in Jmol, so they can be saved in the more compact JVXL format, if desired, and mapped with other data.
- Description of obj file format at EG-Models.de, an archive of electronic geometry models.
- Description of obj file format at FileFormat.info: www.fileformat.info/format/wavefrontobj/egff.htm
- Description of obj file format at The Graphics File Formats Page by Martin Reddy.
- Export of obj file format from PyMOL (molecular surfaces).
Scripting
Script input
Jmol reads script files, using not the load command but the script command. These are plain-text files containing commands in the Jmol scripting language (in part common with Rasmol and Chime) that modify the way the molecular model is shown. The file can have any extension.
For details on the scripting language, visit the Interactive Scripting Documentation.
Script output
Jmol application (not the applet) can write a script file that will restore the current appearance and state of the model. This functionality is still under improvement.
Use the application's top menu bar: File > Export > Export Image or Script and, in the Image Type drop-down list, choose SPT, write a filename and click on Save.
Inline formats
Molecular data are usually contained in an external file and loaded into Jmol using the load command, but they can also be contained within the webpage (or fed into it using JavaScript or PHP, e.g. from a database).
In turn, script commands can also be contained in the molecular file.
To allow for these "inline" formats, several methods are implemented:
Please note that these are advanced procedures. For normal needs, they can be avoided in favour of normal scripting practices.
Molecular coordinates inline within a webpage
This can be done using direct instructions for the applet or, more easily, using functions in the Jmol.js library: jmolAppletInline, jmolLoadInline, jmolLoadInlineScript.
Molecular coordinates inline within a script or script file
This can be done using the data "model" command (Jmol 11 only).
Script inline within a molecular coordinates file
Scripts can be included, in a single line, after a jmolscript: tag (case-sensitive; the final colon is needed). This must be taken as a comment by the molecular file parser, so its location depends on the file format:
- In a PDB file, use REMARK jmolscript: in any line, followed by the script commands in the same line.
- In a MOL file, use jmolscript: in the third line, followed by the script commands in the same line.
- In an XYZ file, use jmolscript: in the second line, followed by the script commands in the same line.
- In a CIF file, use jmolscript: in any line, followed by the script commands in the same line.
In all cases, the script will be applied after the whole molecule has loaded and after whatever script commands may have been set using set defaultLoadScript.
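The placement rules for the four formats can be summarized in a short sketch that looks for the tag only where each format allows it. This is an illustration of the rules as stated here, not Jmol's own code; the function name findInlineScript and the lowercase format labels are hypothetical.

```javascript
// Sketch only: find an inline script according to the per-format rules above.
const JMOLSCRIPT_TAG = "jmolscript:";

function findInlineScript(text, format) {
  const lines = text.split(/\r?\n/);
  let candidates;
  switch (format) {
    case "mol": candidates = lines.slice(2, 3); break; // third line only
    case "xyz": candidates = lines.slice(1, 2); break; // second line only
    case "pdb": candidates = lines.filter((l) => l.startsWith("REMARK")); break;
    case "cif": candidates = lines; break;             // any line
    default: throw new Error("unsupported format: " + format);
  }
  for (const line of candidates) {
    const pos = line.indexOf(JMOLSCRIPT_TAG); // indexOf is case-sensitive
    if (pos >= 0) return line.slice(pos + JMOLSCRIPT_TAG.length).trim();
  }
  return null;
}
```

For example, an XYZ file whose second line is "jmolscript: spin on" would yield the script "spin on".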
Contributors
AngelHerraez, NicolasVervelle, Mkubasik, Geoffr, Pimpim, Dandin1