From zottola@chemvax.chem.duke.edu Thu Jan 7 20:36:10 1993
Date: Wed, 06 Jan 1993 13:58:58 EST
From: zottola@chemvax.chem.duke.edu
To: CHEMISTRY@osc.edu
Subject: Cartesian to PDB Conversion - A Summary

Dear Netters,

Below is a list of the solutions I have received in my quest for a routine to convert cartesian coordinates into SYBYL readable PDB format. This list does not include Jan's posting of the fortran executables he has made available via anonymous ftp. This list also does not include the many generous offers of help in trying to solve this problem.

Again thanks to one and all who responded to my query!

-mark z.

1)

Mark,

The way I have been doing this type of transformation is as follows:

-With the XYZ file, do a run with Mopac using the 0SCF keyword. If your version of Mopac (I assume you have one) is 6.0, the output (*.out file) should give you the system's coordinates in both XYZ and INTERNAL forms.

-From the output, copy the block that contains the system in internal coordinates (last block) into a file that must contain the upper part (everything but the coordinates) of an *.arc file from Mopac. This new file must be named *.sta to be read by Sybyl. Also, in the just created *.sta file, the 3rd non-blank line (the line containing the total number of atoms of every type) must be modified accordingly. Obviously, the starting *.arc file (if you don't have one) can be easily created by doing a Mopac run (on the order of seconds) of a simple system.

NOTES:

-In my version of sybyl I also have to change the second non-blank line (version line) of the just created *.sta file to read "VERSION 5.00", and I have to edit the coordinate columns to look exactly like the example below. These last operations can be readily carried out with a simple shell script (perhaps in "nawk" UNIX shell language).
-------------------------------CUT HERE---------------------------------
 SUMMARY OF AM1 CALCULATION
 VERSION 5.00
 AM1 BONDS EF
 /tmp_mnt/home/eby/rafael/sybyl/pbztPolymopac2.dat
 DEFINING A DUMMY ATOM (WHICH COINCIDES WITH THE Tv)
 C    0.0000000 0    0.000000 0     0.000000 0   0   0   0
 C    1.4023182 1    0.000000 0     0.000000 0   1   0   0
 C    1.3925309 1  120.491775 1     0.000000 0   2   1   0
 C    1.4018573 1  120.013080 1     0.011448 1   3   2   1
 C    1.4022088 1  119.501337 1    -0.006766 1   4   3   2
 C    1.4017758 1  119.481924 1    -0.008443 1   1   2   3
 H    1.1032095 1  120.192216 1  -179.994083 1   2   1   3
 H    1.1019895 1  119.576861 1  -179.990156 1   3   2   1
 H    1.1031646 1  120.204522 1  -179.998951 1   5   4   3
 H    1.1019215 1  120.416823 1  -179.997841 1   6   1   2
 C    1.4599093 1  121.035896 1   179.993477 1   4   3   2
 N    1.3231421 1  125.404264 1     0.033392 1  11   4   3
 S    1.7494256 1  119.181316 1  -179.962314 1  11   4   3
 C    1.4068053 1  109.848965 1  -179.998738 1  12  11   4
 C    1.4014290 1  125.023003 1  -179.995792 1  14  12  11
 C    1.3872028 1  118.164477 1  -179.999537 1  15  14  12
 C    1.4432826 1  121.214175 1     0.004160 1  16  15  14
 C    1.4014243 1  120.626432 1    -0.003837 1  17  16  15
 C    1.4432322 1  114.366623 1     0.004121 1  14  12  11
 H    1.1009352 1  120.485763 1     0.002677 1  15  14  12
 H    1.1008893 1  120.515321 1  -179.999542 1  18  17  16
 S    1.6921053 1  129.287470 1  -179.997306 1  16  15  14
 N    1.4067819 1  114.372176 1   179.996517 1  17  16  15
 C    1.3231197 1  109.830044 1     0.002209 1  23  17  16
 XX   1.5088405 1  125.461003 1   179.998250 1  24  23  17
 Tv  12.5597124 1    0.000000 0     0.000000 0   1  25  22
 0    0.0000000 0    0.000000 0     0.000000 0   0   0   0
-------------------------------CUT HERE------------------------------------

Hope this helps !!

Con Saludos,

Rafael G. Ramirez
------------------------------------------------------------------
email : rafael@eby.polymer.uakron.edu   phone: (216) 972-5810
usmail: Institute of Polymer Science    FAX  : (216) 972-5290
        The University of Akron, Akron, OH 44325-3909
        U. S. A.

2)

#! /bin/sh
awk '{printf "%s %6s %s %-3s %2s %5s %11.3f %7.3f %7.3f %s %s\n", "ATOM", NR, "", $1, "RES", "1", $2, $3, $4, " 1.00", " 0.00" }' "$1"

The above shell script should convert simple cartesians into pdb format. The first $1 refers to column 1 of the input file, which should contain the atom name. $2, $3, $4 refer to the cartesian coordinates, in this case in columns 2, 3 and 4 of the input file. These can of course be changed if the atom name and coordinates are in different columns. The second $1 refers to the input file. Just put the above into a file, call it "con", make it executable and type

con input_file > output_file

This should give a format readable as "pdb" format.

Cheers

Nick Tomkinson
chs1nt@surrey.ac.uk

3)

C
C   GAUPDB.FOR  This program transforms gaussian cartesian format
C               into PDB files. Using PDB format, you can read
C               your coordinates into SYBYL. Program was written
C               and compiled on VAX under VMS. Input files should
C               be trimmed out of your Gaussian output and given
C               the extension .XYZ. Output files will have the extension
C               .PDB. See comments below!! Questions about the code:
C
C               Dr. Rick Gussio
C               NCI-Biomedical Supercomputing Center
C               P.O. Box B, Bldg. 430
C               Frederick, MD 21202
C
C               Email: gussio@ncifcrf.gov
C
      CHARACTER FILENAME*35,OUTFILE*35,TITLE*80,WLINE*80,ATOM*4
      CHARACTER TYPE*4,RES*3,TRSH1*1,TRSH2*2,TRSH3*3,TRSH4*4,SEGID*4
      CHARACTER FI*35
      REAL X,Y,Z,W,PAF
      INTEGER ATOMNO,TYPENO,RESNO,TOTATO,I
C
C   line headings
C
      ATOM='ATOM'
      PAF=0
C
C   formats for total line read
C
10    FORMAT(A4)
15    FORMAT(I5)
20    FORMAT(A80)
C
C   a few formats
C
30    FORMAT(1X,I4,7X,I4,7X,3F12.6)
40    FORMAT(A4,I7,2X,A3,1X,A3,3X,I3,4X,3F8.4)
C
C   prompt user for filename (read from unit 5, the standard input)
C
      WRITE(6,*)' '
      WRITE(6,*)' Please ENTER the Filename WITHOUT the Extension : '
      READ(5,43) FILENAME
      WRITE(6,43) FILENAME
43    FORMAT(A35)
C
C   remove trailing spaces
C
      I=LAST(FILENAME,35)
C
C   Find cartesian coordinates in the gaussian output file: eg.
C
C     1          8          -3.080796   -0.357418   -0.065404
C     2          1          -4.063259   -0.115795   -0.258498
C     3          1          -2.940725   -0.680416    0.902559
C     4          7          -0.308698    0.805764   -0.011805
C     5          8           0.966751    1.594701    0.016611
C     6          7           1.058242   -1.403593   -0.026561
C     7          8           2.333691   -0.614656    0.001856
C
C   Create a file, the file name should have
C   the extension .XYZ
C
      OPEN(UNIT=1,FILE=FILENAME(1:I)//'.XYZ',STATUS='OLD')
C
C   output file will have .PDB extension
C
      OUTFILE=FILENAME(1:I)//'.PDB'
      OPEN(UNIT=2,FILE=OUTFILE,STATUS='NEW',FORM='FORMATTED',
     +     ACCESS='SEQUENTIAL',CARRIAGECONTROL='LIST')
C
C
C   read title lines
C   file filter
C
80    CONTINUE
C
      READ(1,20,END=1000) WLINE
      READ(WLINE,15) ATOMNO
      READ(WLINE,30) ATOMNO,TYPENO,X,Y,Z
      WRITE(6,30) ATOMNO,TYPENO,X,Y,Z
      RESNO = 1
      TRSH1 = 'G'
      RES = 'GAU'
C
C   map atomic number to atom name; elements not listed get 'X'
C
      TYPE= 'X   '
      IF( TYPENO .EQ. 1 ) TYPE= 'H   '
      IF( TYPENO .EQ. 6 ) TYPE= 'C   '
      IF( TYPENO .EQ. 7 ) TYPE= 'N   '
      IF( TYPENO .EQ. 8 ) TYPE= 'O   '
      IF( TYPENO .EQ. 15) TYPE= 'P   '
      IF( TYPENO .EQ. 16) TYPE= 'S   '
      IF( TYPENO .EQ. 17) TYPE= 'CL  '
      IF( TYPENO .LE. 0 ) TYPE= 'X   '
      TRSH3= 'G'
      TRSH4='G'
      W=0.000
C
C   write pdb file
C
      WRITE(2,40) ATOM, ATOMNO,TYPE,RES,RESNO,X,Y,Z
      GO TO 80
C
C   close files
C
99    FORMAT(A3)
1000  WRITE(2,99) 'TER'
      CLOSE(UNIT=2)
      CLOSE(UNIT=1)
      STOP
      END
C
C   Appends extensions to filenames:
C   this function finds the last non blank character
C
      FUNCTION LAST(TEXT,N)
      CHARACTER TEXT*(*)
      DO 1 I=N,1,-1
1     IF(TEXT(I:I) .NE.' ') GO TO 2
      I=1
2     LAST=I
      RETURN
      END

4)

SYBYL has an interface that was originally written for the GAUSSIAN 86 program, but I believe it will also write and read files for newer versions of GAUSSIAN. From the command line, use the SYBYL command:

SYBYL> GAUSS86 RETRIEVE GEOMETRY

This command assumes a copy of the molecule exists in the molecule area you entered and will update the x,y,z coordinates of the molecule using the coordinates in the GAUSSIAN output file.
Thus if you can somehow get a copy of your molecule (with any geometry, it doesn't matter how poor, it's only important that the atom numbering be the same as that used in the GAUSSIAN calculation) into SYBYL, this may be a way for you to read the GAUSSIAN structure into SYBYL. Note that the GAUSS86 RETRIEVE command will make no changes to connectivities or atom types. It simply modifies the x,y,z coordinates of the atoms.

One quick and dirty way to make a starting structure in SYBYL of your molecule that you could use with the GAUSS86 RETRIEVE command would be to use the SYBYL command:

SYBYL> ADD RAWATOM M1 0 0 0

for each atom in the molecule. This will place all the atoms on top of each other at 0,0,0. This is OK though, because when you then use the GAUSS86 M1 RETRIEVE command, it will place the atoms at their correct x,y,z positions. To quickly generate the bonds, use the SYBYL command

SYBYL> CRYSIN M1 CONNECT * * NO_SYMMETRY_SEARCH BOND_LENGTH_TABLE

after you have used the GAUSS86 command. This will automatically create bonds based on distances between atoms.

I hope this helps you with your problem. If not, let me know and we can probably put together a little SPL script that will help you.

Regards,

Vic Lewchenko
Tripos Associates, Inc.
St.Louis, MO
victor@tripos.com

5)

I had the same need for a conversion program to Sybyl before we got G92. If the newzmat utility doesn't work out for you (I haven't tried it), I'm sending you a simple fortran program written for the vax that worked for me. It converts cart coord to a pseudo-pdb format that SYBYL will read. Two things you may have to change around: the fortran format definitions to suit your needs, and atom types once the molecule is in SYBYL (no big deal).

Along with the short program, I'm sending an example .com file you can use to mimic the format as well as the output you can try in SYBYL. Let me know if you have any questions or if you don't receive all the files.
Happy Holidays

the conversion program pdbfor.for:

C Program to Convert Cartesian Coords to Sybyl readable PDB format
      DIMENSION X1(5000), Y(5000), Z(5000)
      INTEGER I,J,N,X
      REAL X1,Y,Z
      CHARACTER*80 NAME(5000), FNAME, JUNK(5000)
      I = 1
      X = 0
C     read up to 5000 lines; stop at end of file or on a read error
      DO 40 N = 1, 5000
      READ(5,25,END=45,ERR=45) JUNK(N)
25    FORMAT(A80)
C     skip lines with nothing in columns 20-30 (no coordinates)
      IF (JUNK(N)(20:30) .EQ. ' ') GOTO 40
C     store each atom at the next free slot so skipped lines leave no gaps
      READ(JUNK(N),30,ERR=45) NAME(X+1), X1(X+1), Y(X+1), Z(X+1)
30    FORMAT(A12,3F9.6)
      X = X + 1
40    CONTINUE
C     write one pseudo-PDB ATOM record per atom read
45    DO 50 N=1, X
      WRITE(6,99) ' ATOM',N,NAME(N)(1:4),'R01','1',X1(N),Y(N),Z(N)
50    CONTINUE
99    FORMAT(A5,4X,I3,1X,A4,1X,A3,5X,A1,5X,F7.3,1X,F7.3,1X,F7.3)
      END

a sample .com file:

$ mat
$ assign bac10.out sys$output
$ run pdbfor
C1            9.69488  2.44667 -0.33565
C10           9.75925 -0.54929 -2.39149
C11          10.42360  0.69546 -1.92134
C12          11.67762  0.68335 -1.44625
C13          12.20292  1.88993 -0.73423
C14          11.11370  2.53232  0.13327
C15           9.59960  1.99327 -1.83249
C16           8.15503  1.88341 -2.24958
C17          10.22275  3.12909 -2.72714
C18          12.62008 -0.46922 -1.61531
C19           6.79800 -0.80531  0.14685
C2            8.81938  1.51381  0.56641
C20           8.11897 -0.10148  3.12819
C21           8.17820  3.44656  3.62179
C22           9.35497  3.25012  4.31283
C23           9.55583  3.82641  5.46785
C24           8.56960  4.52187  6.11817
C25           7.36193  4.72855  5.49377
C26           7.15077  4.18950  4.20918
C27           7.89495  2.85724  2.32609
C28          11.62870 -0.59770  2.41494
C29          12.81577  0.02700  3.11955
C3            9.20562 -0.01024  0.67253
C30           9.86225 -1.37695 -4.62256
C31           9.40133 -1.04179 -6.04783
C4            9.28030 -0.45526  2.11508
C5            8.95327 -1.94486  2.55438
C6            8.41510 -2.90193  1.53510
C7            8.68032 -2.47646  0.14314
C8            8.31210 -0.96358 -0.16042
C9            8.45115 -0.91890 -1.70539
H10          10.40815 -1.27361 -1.85717
H13          12.52995  2.71200 -1.46352
H14          11.34803  3.49404  0.20608
H141         10.96950  2.09754  0.97116
H16           8.13443  1.40767 -3.08006
H161          7.61942  1.29595 -1.53756
H162          7.90267  2.91589 -2.40136
H17          11.34545  3.30505 -2.50872
H171          9.38845  3.38884 -2.90360
H172         10.34120  2.57887 -3.74272
H18          13.56767  0.01583 -1.94725
H181         11.84242 -0.72804 -2.31745
H182         13.15568 -0.46736 -0.86133
H191          6.26498 -1.47843 -0.50224
H192          6.47098 -0.03538 -0.28012
H2            7.93357  1.63204  0.29739
H20           7.53187  0.33144  2.62719
H201          8.25287  0.53626  3.71311
H22          10.11202  2.91775  3.70447
H25           6.57912  5.31601  5.95529
H26           6.26240  4.24443  3.63907
H27          10.34377 -2.16737  0.52075
H3           10.02448 -0.01676  0.29616
H5            9.62792 -2.31260  3.15657
H6            7.59368 -2.88796  1.83619
H61           8.62367 -3.69793  1.95959
H7            7.97478 -3.01923 -0.31714
O1            9.04597  3.72214 -0.17646
O10           9.44510 -0.39009 -3.78344
O100         10.45965 -2.33960 -4.29185
O13          13.29215  1.56781  0.10489
O2            8.93267  2.10406  1.85964
O20           6.85465  2.98106  1.71403
O4           10.51373  0.03259  2.73084
O40          11.70337 -1.53429  1.66220
O5            7.92843 -1.46819  3.47865
O7           10.04508 -2.73621 -0.22829
O9            7.55763 -1.35740 -2.35447

the output from the sample:

 ATOM      1 C1   R01     1       9.695   2.447  -0.336
 ATOM      2 C10  R01     1       9.759  -0.549  -2.391
 ATOM      3 C11  R01     1      10.424   0.695  -1.921
 ATOM      4 C12  R01     1      11.678   0.683  -1.446
 ATOM      5 C13  R01     1      12.203   1.890  -0.734
 ATOM      6 C14  R01     1      11.114   2.532   0.133
 ATOM      7 C15  R01     1       9.600   1.993  -1.832
 ATOM      8 C16  R01     1       8.155   1.883  -2.250
 ATOM      9 C17  R01     1      10.223   3.129  -2.727
 ATOM     10 C18  R01     1      12.620  -0.469  -1.615
 ATOM     11 C19  R01     1       6.798  -0.805   0.147
 ATOM     12 C2   R01     1       8.819   1.514   0.566
 ATOM     13 C20  R01     1       8.119  -0.101   3.128
 ATOM     14 C21  R01     1       8.178   3.447   3.622
 ATOM     15 C22  R01     1       9.355   3.250   4.313
 ATOM     16 C23  R01     1       9.556   3.826   5.468
 ATOM     17 C24  R01     1       8.570   4.522   6.118
 ATOM     18 C25  R01     1       7.362   4.729   5.494
 ATOM     19 C26  R01     1       7.151   4.189   4.209
 ATOM     20 C27  R01     1       7.895   2.857   2.326
 ATOM     21 C28  R01     1      11.629  -0.598   2.415
 ATOM     22 C29  R01     1      12.816   0.027   3.119
 ATOM     23 C3   R01     1       9.206  -0.010   0.673
 ATOM     24 C30  R01     1       9.862  -1.377  -4.622
 ATOM     25 C31  R01     1       9.401  -1.042  -6.048
 ATOM     26 C4   R01     1       9.280  -0.455   2.115
 ATOM     27 C5   R01     1       8.953  -1.945   2.554
 ATOM     28 C6   R01     1       8.415  -2.902   1.535
 ATOM     29 C7   R01     1       8.680  -2.476   0.143
 ATOM     30 C8   R01     1       8.312  -0.964  -0.160
 ATOM     31 C9   R01     1       8.451  -0.919  -1.705
 ATOM     32 H10  R01     1      10.408  -1.274  -1.857
 ATOM     33 H13  R01     1      12.530   2.712  -1.464
 ATOM     34 H14  R01     1      11.348   3.494   0.206
 ATOM     35 H141 R01     1      10.969   2.098   0.971
 ATOM     36 H16  R01     1       8.134   1.408  -3.080
 ATOM     37 H161 R01     1       7.619   1.296  -1.538
 ATOM     38 H162 R01     1       7.903   2.916  -2.401
 ATOM     39 H17  R01     1      11.345   3.305  -2.509
 ATOM     40 H171 R01     1       9.388   3.389  -2.904
 ATOM     41 H172 R01     1      10.341   2.579  -3.743
 ATOM     42 H18  R01     1      13.568   0.016  -1.947
 ATOM     43 H181 R01     1      11.842  -0.728  -2.317
 ATOM     44 H182 R01     1      13.156  -0.467  -0.861
 ATOM     45 H191 R01     1       6.265  -1.478  -0.502
 ATOM     46 H192 R01     1       6.471  -0.035  -0.280
 ATOM     47 H2   R01     1       7.934   1.632   0.297
 ATOM     48 H20  R01     1       7.532   0.331   2.627
 ATOM     49 H201 R01     1       8.253   0.536   3.713
 ATOM     50 H22  R01     1      10.112   2.918   3.704
 ATOM     51 H25  R01     1       6.579   5.316   5.955
 ATOM     52 H26  R01     1       6.262   4.244   3.639
 ATOM     53 H27  R01     1      10.344  -2.167   0.521
 ATOM     54 H3   R01     1      10.024  -0.017   0.296
 ATOM     55 H5   R01     1       9.628  -2.313   3.157
 ATOM     56 H6   R01     1       7.594  -2.888   1.836
 ATOM     57 H61  R01     1       8.624  -3.698   1.959
 ATOM     58 H7   R01     1       7.975  -3.019  -0.317
 ATOM     59 O1   R01     1       9.046   3.722  -0.176
 ATOM     60 O10  R01     1       9.445  -0.390  -3.783
 ATOM     61 O100 R01     1      10.460  -2.340  -4.292
 ATOM     62 O13  R01     1      13.292   1.568   0.105
 ATOM     63 O2   R01     1       8.933   2.104   1.860
 ATOM     64 O20  R01     1       6.855   2.981   1.714
 ATOM     65 O4   R01     1      10.514   0.033   2.731
 ATOM     66 O40  R01     1      11.703  -1.534   1.662
 ATOM     67 O5   R01     1       7.928  -1.468   3.479
 ATOM     68 O7   R01     1      10.045  -2.736  -0.228
 ATOM     69 O9   R01     1       7.558  -1.357  -2.354

Hope this helps!

Jeanne Bundens
Bryn Mawr College
jbundens@cc.brynmawr.edu

---
Administrivia: This message is automatically appended by the mail exploder:
CHEMISTRY@osc.edu --- everyone
CHEMISTRY-REQUEST@osc.edu --- coordinator
OSCPOST@osc.edu send help from chemistry
Anon. ftp kekule.osc.edu
CHEMISTRY-SEARCH@osc.edu --- search the archives, read help.search file first
---
---------------------------------------------------------------------------

From m10!frisch@uunet.UU.NET Thu Jan 7 21:14:36 1993
Date: Tue, 22 Dec 92 17:41:59 EST
From: Michael Frisch
To: chemistry@osc.edu
Subject: Re: cartesian to pdb format

I have a small problem and was wondering if anyone out here can help me.
I have been examining a series of compounds with gaussian92. I thus have these compounds expressed as their cartesian coordinates. I would like to use Sybyl to visualize these structures. Does anyone know if a program exists to convert these cartesian coordinate files into Sybyl-readable pdb files? I was hoping to obtain this program through anonymous ftp rather than my research director's research grant.

Thanks!

-mark zottola
zottola@chemvax.chem.duke.edu

The NewZMat program distributed with Gaussian 92 can write PDB files which can be read by most graphics programs, although I haven't tested Sybyl in particular. In recent revisions of G92, you can use either the -opdb or -obkv switch to NewZMat (they are synonymous). In earlier versions, and in G90, you had to say -obkv (the clearer synonym wasn't available).

Mike Frisch
-------
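
For readers without awk or a Fortran compiler at hand, the conversion that the scripts in this summary perform (atom name plus x, y, z in, one ATOM record per atom out, closed by TER) can be sketched in Python. This is only a minimal sketch and not any poster's method. The names xyz_to_pdb and format_atom are hypothetical, the residue name "RES" is borrowed from the awk script above, and the column positions follow the standard fixed-width PDB ATOM layout (serial in columns 7-11, atom name in 13-16, residue name in 18-20, residue number in 23-26, coordinates in 31-54 as three 8.3 fields). It has not been tested against SYBYL.

```python
def format_atom(serial, name, x, y, z, res_name="RES", res_seq=1):
    """Build one fixed-column PDB ATOM record (assumed layout, see lead-in)."""
    # Names shorter than four characters conventionally start in column 14.
    name_field = name[:4] if len(name) >= 4 else " " + name.ljust(3)
    return "ATOM  %5d %s %-3s  %4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name_field, res_name[:3], res_seq, x, y, z, 1.0, 0.0)


def xyz_to_pdb(lines, res_name="RES", res_seq=1):
    """Convert whitespace-separated 'name x y z' lines into PDB records."""
    records = []
    serial = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank line or something that is not a coordinate line
        try:
            x, y, z = (float(value) for value in fields[1:4])
        except ValueError:
            continue  # title or comment line, skip it
        serial += 1
        records.append(format_atom(serial, fields[0], x, y, z,
                                   res_name, res_seq))
    records.append("TER")
    return records


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python xyz2pdb.py input_file > output_file
    with open(sys.argv[1]) as handle:
        print("\n".join(xyz_to_pdb(handle)))
```

Lines that do not look like "name x y z" are skipped, so title lines can be left in the input. As with the Fortran programs above, atom types may still need adjusting once the molecule is in SYBYL.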